What is BERT? BERT is a deep learning model that has achieved state-of-the-art results on a wide variety of natural language processing tasks. It stands for Bidirectional Encoder Representations from Transformers. It has been pre-trained on Wikipedia and BooksCorpus and requires task-specific fine-tuning. What is the model architecture of BERT? BERT is a multi-layer bidirectional… Continue reading BERT Explained – A list of Frequently Asked Questions
How can Unsupervised Neural Machine Translation Work?
Neural Machine Translation has arguably reached human-level performance. However, effective training of these systems depends strongly on the availability of large amounts of parallel text, which is why supervised techniques have been less successful on low-resource language pairs. Unsupervised Machine Translation requires only monolingual corpora and is a viable alternative in… Continue reading How can Unsupervised Neural Machine Translation Work?
A Disciplined Approach to Neural Network Hyper-Parameters – Paper Dissected
Training a neural network requires carefully selecting hyper-parameters. The optimal hyper-parameters vary from one dataset to another. With so many things to tune, this can easily get out of control. Leslie N. Smith, in his paper A Disciplined Approach to Neural Network Hyper-Parameters: Part 1 - Learning Rate, Batch Size, Momentum, and Weight Decay, discusses several efficient… Continue reading A Disciplined Approach to Neural Network Hyper-Parameters – Paper Dissected
What makes the AWD-LSTM great?
The AWD-LSTM has been dominating state-of-the-art language modeling. All the top research papers on word-level models incorporate AWD-LSTMs, and it has shown great results on character-level models as well (Source). In this blog post, I go through the research paper Regularizing and Optimizing LSTM Language Models, which introduced the AWD-LSTM, and try to explain… Continue reading What makes the AWD-LSTM great?
A Walkthrough of InferSent – Supervised Learning of Sentence Embeddings
Universal embeddings of text data have been widely used in natural language processing. They encode words or sentences into fixed-length numeric vectors that are pre-trained on a large text corpus and can be used to improve the performance of other NLP tasks (such as classification and translation). While word embeddings have been massively popular and… Continue reading A Walkthrough of InferSent – Supervised Learning of Sentence Embeddings
Understanding the Working of Universal Language Model Fine Tuning (ULMFiT)
(Edit) A big thanks to Jeremy Howard for the shout-out 😊 https://twitter.com/jeremyphoward/status/1008156649788325889 Transfer learning in natural language processing is an area that had not been explored with great success. But last month (May 2018), Jeremy Howard and Sebastian Ruder published the paper Universal Language Model Fine-tuning for Text Classification, which explores the benefits… Continue reading Understanding the Working of Universal Language Model Fine Tuning (ULMFiT)