There once was a baby raccoon who loved snuggling his mommy. His name was Racarslin.
Blog Posts
Notes for – Building makemore Part 3: Activations & Gradients, BatchNorm
Building makemore Part 3 Jupyter Notebook. Recurrent Neural Networks (RNNs) are not easily optimized with the first-order gradient techniques we have available to us, and the key to understanding why is to understand the activations and their gradients and how they behave during training. […]
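A minimal sketch of the kind of diagnostic that post is about, not code from the notebook itself (the layer sizes and the 0.97 saturation threshold are my own assumptions): check what fraction of tanh activations are saturated, since saturated units pass almost no gradient.

```python
# Sketch: inspect tanh activation saturation for one hidden layer (assumed sizes).
import torch

g = torch.Generator().manual_seed(42)
x = torch.randn(32, 200, generator=g)          # a fake batch of inputs
W = torch.randn(200, 100, generator=g) * 0.1   # small init keeps tanh out of saturation
h = torch.tanh(x @ W)                          # hidden activations

# Units stuck near +-1 are "saturated": d/dz tanh(z) = 1 - tanh(z)^2 is ~0 there,
# so almost no gradient flows back through them.
saturated = (h.abs() > 0.97).float().mean()
print(f"saturated fraction: {saturated.item():.3f}")
```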
Tesla Full Self Driving
Currently it costs ~ $684/mo in the USA to drive 1000 miles with FSD subscription, insurance and energy costs. This works out to $0.684 (about $0.69) per mile. How odd. Most cars sit around doing very little, most of the time. As you decrease that $684 figure, you broaden the available market. Even starting with a couple buying […]
Gimp’s default screen grab
This is totally useless. It immediately takes a picture of the active window, which is GIMP itself. The “Select a region to grab” option would be a better default.
Notes for – Building makemore Part 2: MLP
A table of normalized probability counts grows exponentially as you scale up the context, so it doesn't work beyond bigrams. You can use a Multi-Layer Perceptron (MLP) as a solution to maximize the log-likelihood of the training data. MLPs let you make predictions by embedding words close together in a space such that knowledge can transfer between interchangeable words with good confidence. With a vocabulary of 17000 […]
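A minimal sketch of that MLP idea, not the post's own code (vocabulary size 27, context of 3 characters, 2-dim embeddings and 100 hidden units are assumed numbers): embed each context character, concatenate the embeddings, and score the next character.

```python
# Sketch: character-level MLP language model (Bengio-style), assumed sizes.
import torch
import torch.nn.functional as F

vocab_size, block_size, emb_dim, hidden = 27, 3, 2, 100
g = torch.Generator().manual_seed(2147483647)

C  = torch.randn(vocab_size, emb_dim, generator=g)              # embedding table
W1 = torch.randn(block_size * emb_dim, hidden, generator=g)
b1 = torch.randn(hidden, generator=g)
W2 = torch.randn(hidden, vocab_size, generator=g)
b2 = torch.randn(vocab_size, generator=g)

X = torch.randint(0, vocab_size, (8, block_size), generator=g)  # fake context batch
Y = torch.randint(0, vocab_size, (8,), generator=g)             # fake next-char targets

emb    = C[X].view(X.shape[0], -1)          # look up and concatenate embeddings
h      = torch.tanh(emb @ W1 + b1)          # hidden layer
logits = h @ W2 + b2                        # scores for each possible next character
loss   = F.cross_entropy(logits, Y)         # negative log-likelihood to minimize
print(loss.item())
```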
Notes for — The spelled-out intro to language modeling: building makemore
Derived from: https://github.com/karpathy/nn-zero-to-hero/blob/master/lectures/makemore/makemore_part1_bigrams.ipynb Makemore’s purpose: to make more of the examples you give it. E.g. names: training makemore on names will make unique-sounding names. This dataset will be used to train a character-level language model, modelling sequences of characters and able to predict the next character in a sequence. makemore implements a series of language […]
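A minimal sketch of the counts-based bigram model that notebook builds, not a copy of it (the tiny hard-coded name list and the use of '.' as a combined start/end token are my assumptions): count character-to-character transitions, normalize into probabilities, and sample a new name.

```python
# Sketch: counts-based bigram character model with a stand-in dataset.
import torch

words = ["emma", "olivia", "ava"]            # stand-in for the real names dataset
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi["."] = 0                                # '.' marks both start and end of a name
itos = {i: s for s, i in stoi.items()}

# Count how often each character follows each other character.
N = torch.zeros(len(stoi), len(stoi), dtype=torch.int32)
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1], stoi[c2]] += 1

# Normalize counts into next-character probabilities (with +1 smoothing) and sample.
P = (N + 1).float()
P /= P.sum(1, keepdim=True)
g = torch.Generator().manual_seed(2147483647)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], 1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```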
Notes for – The spelled-out intro to neural networks and backpropagation: building micrograd
Summary of above: Backpropagation is just a recursive application of the chain rule backward through the graph, storing the computed derivative as a gradient (grad) variable within each node. Gotcha! We must be careful that when adding multiple copies of the same term we correctly derive with respect to all of them instead of just a single one! Ie: b […]
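A minimal sketch of that gotcha, modelled loosely on micrograd’s Value but written by me as an illustration: when a node feeds into an expression more than once, its gradient must be accumulated with +=, not overwritten, so that b = a + a gives db/da = 2 rather than 1.

```python
# Sketch: gradient accumulation when the same node is used twice (b = a + a).
class Value:
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(children)

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += 1.0 * out.grad    # accumulate, don't assign
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out

a = Value(3.0)
b = a + a          # 'a' feeds into the sum twice
b.grad = 1.0
b._backward()
print(a.grad)      # 2.0, not 1.0
```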
How ML (machine learning) Works
If you are here, you are probably wasting your time. Below is a summary of notes I have taken while working through this; I refer you to that resource because Andrej is my mentor wrt ML. If you want to know my specific thoughts and the things I have learned from his “Building makemore” class, […]