Jürgen Schmidhuber

Birthday January 17, 1963

Birthplace Munich, West Germany

Nationality German

1963

Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist noted for his work in the field of artificial intelligence, specifically artificial neural networks.

He is a scientific director of the Dalle Molle Institute for Artificial Intelligence Research in Switzerland.

He is also director of the Artificial Intelligence Initiative and professor of the Computer Science program in the Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE) division at the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.

1980

In the 1980s, backpropagation did not work well for deep learning in artificial neural networks with long credit assignment paths: the error signal degrades as it is propagated backward through many steps.
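The failure mode can be illustrated with a one-line calculation (a toy model, not a real network): each backward step through a recurrent net scales the error signal by a Jacobian factor, and when those factors are below 1 the product shrinks exponentially with the length of the credit assignment path.

```python
def backpropagated_scale(factor: float, steps: int) -> float:
    """Error magnitude after `steps` backward steps, each scaling by `factor`."""
    scale = 1.0
    for _ in range(steps):
        scale *= factor
    return scale

print(backpropagated_scale(0.9, 10))    # ~0.35: still a usable signal
print(backpropagated_scale(0.9, 100))   # ~2.7e-5: effectively zero
```

This exponential decay over long paths is the "vanishing gradient" problem that the work described below set out to solve.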

1987

Schmidhuber completed his undergraduate (1987) and PhD (1991) studies at the Technical University of Munich, Germany.

His PhD advisors were Wilfried Brauer and Klaus Schulten.

1991

To overcome this problem, Schmidhuber (1991) proposed a hierarchy of recurrent neural networks (RNNs) pre-trained one level at a time by self-supervised learning.

It uses predictive coding to learn internal representations at multiple self-organizing time scales.

This can substantially facilitate downstream deep learning.

The RNN hierarchy can be collapsed into a single RNN, by distilling a higher level chunker network into a lower level automatizer network.
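The collapse step works like knowledge distillation. Here is a minimal hypothetical sketch (not the original implementation): a fixed stand-in "chunker" supplies targets, and a simple linear "automatizer" learns by gradient descent to imitate it.

```python
import random
random.seed(0)

def chunker(x):
    # Stand-in teacher: pretend this is a slowly-computed higher-level mapping.
    return 2.0 * x + 1.0

# Automatizer (student): y = w*x + b, trained on the teacher's outputs.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    x = random.uniform(-1.0, 1.0)
    target = chunker(x)              # knowledge flows teacher -> student
    err = (w * x + b) - target
    w -= lr * err * x                # gradient step on squared error
    b -= lr * err

print(round(w, 2), round(b, 2))      # approaches the teacher's 2.0 and 1.0
```

After training, the student reproduces the teacher's behavior on its own, so the teacher is no longer needed at run time.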

In 1991, Schmidhuber proposed adversarial neural networks that contest with each other in the form of a zero-sum game, in which one network's gain is the other network's loss.

The first network is a generative model that models a probability distribution over output patterns.

The second network learns by gradient descent to predict the reactions of the environment to these patterns.

This was called "artificial curiosity."
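A toy, hypothetical sketch of the zero-sum dynamic (illustrative names and numbers, not the original code): a generator proposes patterns, a predictor learns the environment's reactions, and the generator's reward is the predictor's error.

```python
reactions = {0: 0.0, 1: 1.0, 2: 1.0, 3: 0.0}     # environment's fixed responses
predictor = {k: 0.5 for k in reactions}          # predictor's current guesses

def generator_pick():
    # Zero-sum incentive: propose the pattern the predictor currently predicts
    # worst, i.e. where its error (the generator's reward) is largest.
    return max(reactions, key=lambda k: abs(predictor[k] - reactions[k]))

for _ in range(100):
    x = generator_pick()
    err = reactions[x] - predictor[x]
    predictor[x] += 0.2 * err                    # predictor learns from its error

# As the predictor improves, the generator's curiosity reward dries up.
worst = max(abs(predictor[k] - reactions[k]) for k in reactions)
print(worst)
```

The generator is thus driven toward whatever the predictor does not yet understand, which is why the principle models curiosity.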

Schmidhuber supervised the 1991 diploma thesis of his student Sepp Hochreiter and called it "one of the most important documents in the history of machine learning".

It not only tested the neural history compressor, but also analyzed and overcame the vanishing gradient problem.

This led to the deep learning method called long short-term memory (LSTM), a type of recurrent neural network.
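The cell's key trick is an additive state update that lets errors flow over many steps without vanishing. A scalar sketch of one LSTM step follows (including the forget gate of the later standard architecture; the weights are placeholders, not trained values).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W):
    """One step of a scalar LSTM cell; W maps gate name -> (w_x, w_h, b)."""
    def gate(name, squash):
        w_x, w_h, b = W[name]
        return squash(w_x * x + w_h * h_prev + b)
    f = gate("forget", sigmoid)   # how much of the old cell state to keep
    i = gate("input", sigmoid)    # how much new content to write
    g = gate("cell", math.tanh)   # candidate content
    o = gate("output", sigmoid)   # how much of the state to expose
    c = f * c_prev + i * g        # additive update: gradients can pass
                                  # through c over long time spans
    h = o * math.tanh(c)
    return h, c

# Placeholder weights, just to run the cell over a short sequence:
W = {k: (0.5, 0.5, 0.0) for k in ("forget", "input", "cell", "output")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 1.0]:
    h, c = lstm_step(x, h, c, W)
```

Real implementations vectorize this over whole hidden-state vectors, but the gate structure is the same.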

1993

In 1993, a chunker network solved a deep learning task whose depth exceeded 1,000.

1995

He has served as a director of the Dalle Molle Institute for Artificial Intelligence Research (IDSIA), a Swiss AI lab, since 1995.

The name LSTM was introduced in a 1995 technical report.

1997

This led to the most cited LSTM publication (1997), co-authored by Hochreiter and Schmidhuber.

The standard LSTM architecture is used in almost all current applications.

2000

It was introduced in 2000 by Felix Gers, Schmidhuber, and Fred Cummins.

2004

He taught at the Technical University of Munich from 2004 until 2009.

2005

Today's "vanilla LSTM" using backpropagation through time was published with his student Alex Graves in 2005, and its connectionist temporal classification (CTC) training algorithm in 2006.

CTC enabled end-to-end speech recognition with LSTM.
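The decoding rule at the heart of CTC can be sketched in a few lines (greedy "best path" variant; the per-frame labels below are invented for illustration): the network emits one label or a blank per audio frame, and decoding collapses repeated labels and removes the blanks.

```python
BLANK = "-"   # CTC's special "no label" symbol

def ctc_greedy_decode(frames):
    """Collapse repeated frame labels, then drop blanks."""
    out, prev = [], None
    for lab in frames:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return "".join(out)

print(ctc_greedy_decode(["-", "h", "h", "-", "e", "l", "-", "l", "o", "-"]))
# -> "hello"
```

Note that the blank between the two "l" frames is what allows a genuinely doubled letter to survive the collapsing step.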

2009

From 2009 until 2021, he was a professor of artificial intelligence at the Università della Svizzera Italiana in Lugano, Switzerland.

2010

He is best known for his foundational and highly cited work on long short-term memory (LSTM), a neural network architecture that became the dominant technique for a variety of natural language processing tasks in the 2010s, including speech recognition and machine translation, and was widely deployed in commercial technologies such as Google Translate and Siri.

He also introduced principles of meta-learning, generative adversarial networks, and linear transformers, all of which are widespread in modern AI.

LSTM has become the most cited neural network of the 20th century.

LSTM was called "arguably the most commercial AI achievement."

2014

In 2014, Schmidhuber formed a company, Nnaisense, to work on commercial applications of artificial intelligence in fields such as finance, heavy industry and self-driving cars.

Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company.

In 2014, the artificial curiosity principle was used in a generative adversarial network (GAN), where the environmental reaction is 1 or 0 depending on whether the first network's output is in a given set.

This can be used to create realistic deepfakes.

2015

In 2015, Rupesh Kumar Srivastava, Klaus Greff, and Schmidhuber used LSTM principles to create the Highway network, a feedforward neural network with hundreds of layers, much deeper than previous networks.
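The gating idea can be shown with a scalar Highway layer (illustrative weights, not from the paper): a transform gate T(x) blends the transformed signal H(x) with the unchanged input, y = H(x)·T(x) + x·(1 − T(x)), so information can pass through hundreds of layers largely untouched.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def highway_layer(x, w_h, b_h, w_t, b_t):
    H = math.tanh(w_h * x + b_h)      # transformed signal
    T = sigmoid(w_t * x + b_t)        # transform gate in (0, 1)
    return H * T + x * (1.0 - T)      # carry the input through when T ~ 0

# With a strongly negative gate bias, 300 stacked layers stay near-identity:
x = 0.7
for _ in range(300):
    x = highway_layer(x, w_h=0.5, b_h=0.0, w_t=0.5, b_t=-10.0)
print(round(x, 3))
```

This is the same design choice as the LSTM forget gate, applied to feedforward depth instead of time.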

2016

Sales were under US$11 million in 2016; however, Schmidhuber states that the current emphasis is on research and not revenue.

2017

Nnaisense raised its first round of capital funding in January 2017.

Schmidhuber's overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks.