Tomlinson-Online - Elements Of Deep Learning Theory

While the field of Deep Learning has been advancing at a frightening speed over the last ten years, we do not yet have a plausible theoretical explanation of its success. One can safely say that Deep Learning works, but nobody really understands why. Nevertheless, starting around five years ago, a decent number of theoretical papers specifically concerning neural nets began to emerge at top machine learning venues. This makes us think that the core of Deep Learning theory has already started to crystallize.

The goal of the present book is to present these core concepts of Deep Learning theory so that readers can dive directly into recent papers in this area. For this purpose, each chapter first elaborates a simple model or a classical result in detail and then discusses possible generalizations and more recent developments of the same idea.

We have to warn the reader that the present book is not a mathematical manuscript. Not all results here are stated as formal theorems, and not all theorems are provided with complete and rigorous proofs. Many of the theorems in the book are proven up to some technical lemmas, while for some there is only a proof sketch. This conforms with the main idea of the book: to present and illustrate concepts rather than reproduce all the results.

The book, in its present form, covers the following topics: uniform generalization bounds, PAC-Bayesian generalization bounds, double descent phenomena, infinitely wide networks, implicit bias of gradient descent, loss landscape, gradient descent convergence guarantees, and initialization strategies.