I had been looking forward to my visit to MIT for a long time, because computer science is one of the fields that interests me most.
Also, everyone has heard of MIT; it is one of the most famous universities and research institutions in the world. Still, I was a bit worried about the program because of my lack of programming knowledge.
With both excitement and concern, my journey at MIT began. The very first challenge arrived just as I had expected.
Deep learning requires advanced mathematics such as matrices and linear algebra, which were not part of my high school curriculum.
As a result, a lot of additional preparation was required before each lecture. On top of this, another challenge appeared unexpectedly: the large amount of reading required every day.
With little experience in scientific research, reading through lengthy papers and documents was new to me.
The very first task the professor gave me was to read two papers, each giving an introduction to the basics of convolutional neural networks.
It was not an easy task for me, from the unfamiliar technical vocabulary to the rather complicated mathematical equations.
After reading the papers over and over, I realized that their content was perhaps beyond my reach.
So I turned to my professor for help, and after discussing the issue, we decided to focus first on the mathematics behind the programming, as it is essential to the algorithm.
Thus, we had lectures together, which first introduced me to the world of matrices. A matrix is essentially a group of numbers organized in rows and columns. Each set of values in the matrix can be assigned a different meaning, so complicated information can be transformed into sets of numbers and processed by a computer with great ease.
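To illustrate the idea, here is a minimal Python sketch. It assumes a made-up 3x3 grayscale image in which each number stands for the brightness of one pixel; the names `image`, `transpose`, and `brighten` are hypothetical and do not come from the lectures.

```python
# A hypothetical 3x3 grayscale image: each number is a pixel brightness (0-255).
image = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]


def transpose(matrix):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def brighten(matrix, amount):
    """Add `amount` to every pixel, capping each value at 255."""
    return [[min(255, value + amount) for value in row] for row in matrix]


flipped = transpose(image)      # mirrors the picture along its diagonal
brighter = brighten(image, 50)  # makes every pixel lighter

for row in brighter:
    print(row)
```

Once the picture is a grid of numbers, operations such as flipping or brightening become simple arithmetic on that grid.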
After I had acquired this knowledge, which is the most important for deep learning, the problems only got more interesting.
Since I now knew how to transform questions into series of numbers, our focus turned to an optimization function that could enable visual identification using a simple computer setup.
My goal was to understand the equation and how to use it to achieve visual recognition. Although optimization might sound easy at first, it is not as easy as it seems.
Most of the time, humans and computers are only able to find the optimal point for very simple equations, say when the graph of the equation has only a few local minima and we can pick out the global minimum easily.
However, our case was a completely different story. Since the input to the function is acquired in real life by two sets of cameras, the graph is likely to have countless local minima, so the time the computer would need to find the global minimum would be too great.
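A toy example can make the difficulty concrete. The sketch below assumes a made-up one-variable function with just two valleys, not the camera-based function from the project, and it uses plain gradient descent as a stand-in search method rather than the approach from any particular paper. The names `f`, `df`, and `gradient_descent` are hypothetical.

```python
def f(x):
    """A made-up function with two valleys (one local, one global minimum)."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Derivative of f, used to decide which way is downhill."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=2000):
    """Repeatedly take a small step downhill from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * df(x)
    return x


# The valley that is found depends entirely on the starting point.
from_right = gradient_descent(2.0)   # settles in the shallower valley
from_left = gradient_descent(-2.0)   # settles in the deeper valley

print(from_right, f(from_right))
print(from_left, f(from_left))
```

Starting from the right, the search stops in a valley that is not the deepest one, and it has no way of knowing that a better point exists elsewhere. With only two valleys this is easy to check by hand; with countless valleys, trying enough starting points becomes very slow.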
To solve this mystery, my professor gave me a paper published in 2018 that tackles exactly the problem we faced in trying to solve the equation.
He also told me that we would try to follow the approach proposed in the paper and see whether we could write a function that resolves the problem.
After that, Professor H and I read through the paper and discussed it together.
Then he gave me the task of writing a program to accompany the solution given in the paper after I went back to China.