Dive into Deep Learning (Gluon): Related Resources

Title	Description
Dive into Deep Learning (动手学深度学习)	Official website
diveintodeeplearning/d2l-zh	GitHub
Dive into Deep Learning (动手学深度学习) PDF	PDF

Recurrent Neural Networks

Posted on 2018-11-29; updated on 2018-11-30; filed under Deep Learning, Recurrent Neural Networks

Recurrent Neural Networks

Introduction
- We introduce Recurrent Neural Networks and show how they take in a sequence and predict either a fixed target (categorical/numerical) or another sequence (sequence to sequence).
Implementing an RNN Model for Spam Prediction
- We create an RNN model to improve on our spam/ham SMS text predictions.
Implementing an LSTM Model for Text Generation
- We show how to implement an LSTM (Long Short-Term Memory) RNN for Shakespeare language generation. (Word level vocabulary)
Stacking Multiple LSTM Layers
- We stack multiple LSTM layers to improve on our Shakespeare language generation. (Character level vocabulary)
Creating a Sequence to Sequence Translation Model (Seq2Seq)
- We show how to use TensorFlow’s sequence-to-sequence models to train an English-German translation model.
Training a Siamese Similarity Measure
- Here, we implement a Siamese RNN to predict the similarity of addresses and use it for record matching. Using RNNs for record matching is very versatile, as we do not have a fixed set of target categories and can use the trained model to predict similarities across new addresses.
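
The idea from the introduction above, reading a sequence step by step and ending with a fixed-size summary that can feed a classifier, can be sketched as a plain-Python forward pass of a vanilla RNN. This is a minimal illustration with random, untrained weights, not the cookbook's TensorFlow code; `rnn_forward`, `w_xh`, `w_hh`, and `b_h` are hypothetical names chosen for this example.

```python
import math
import random

def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a vanilla RNN over a sequence and return the hidden state at every step."""
    hidden_size = len(b_h)
    h = [0.0] * hidden_size              # initial hidden state
    states = []
    for x in inputs:                     # one time step per input vector
        new_h = []
        for j in range(hidden_size):
            total = b_h[j]
            total += sum(w_xh[j][i] * x[i] for i in range(len(x)))
            total += sum(w_hh[j][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))
        h = new_h
        states.append(h)
    return states

random.seed(0)
input_size, hidden_size = 3, 4
# Random, untrained weights (assumption: for illustration only)
w_xh = [[random.uniform(-0.5, 0.5) for _ in range(input_size)] for _ in range(hidden_size)]
w_hh = [[random.uniform(-0.5, 0.5) for _ in range(hidden_size)] for _ in range(hidden_size)]
b_h = [0.0] * hidden_size

# A toy sequence of five one-hot "tokens"
sequence = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
states = rnn_forward(sequence, w_xh, w_hh, b_h)
final_state = states[-1]   # fixed-size summary of the whole sequence
```

For a fixed target such as spam/ham, only `final_state` would be passed to an output layer; for sequence-to-sequence tasks, every entry of `states` (or a second decoder network) would be used instead.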

Read more »

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Posted on 2018-11-27; updated on 2020-06-18; filed under Papers, Language Models

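This paper introduces BERT (Bidirectional Encoder Representations from Transformers), a new language representation model. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by conditioning on both left and right context in all layers. BERT is the first fine-tuning-based representation model to achieve state-of-the-art performance on a large suite of sentence-level and token-level tasks, outperforming many systems with task-specific architectures and setting new state-of-the-art results (as of 2018) on 11 NLP tasks.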
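
To make "left and right context" concrete, the sketch below contrasts which positions a token may look at in a left-to-right model versus a bidirectional one. It shows only the visibility pattern, not BERT's pre-training procedure; `causal_mask` and `bidirectional_mask` are hypothetical helper names for this illustration.

```python
def causal_mask(n):
    """mask[i][j] is 1 if position i may look at position j (left context only)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Every position may look at every other position (left and right context)."""
    return [[1] * n for _ in range(n)]

tokens = ["the", "cat", "sat", "down"]
left_only = causal_mask(len(tokens))
both_sides = bidirectional_mask(len(tokens))

# Context visible to "cat" (index 1) under each mask
visible_left = [t for t, m in zip(tokens, left_only[1]) if m]
visible_both = [t for t, m in zip(tokens, both_sides[1]) if m]
```

Under the left-to-right mask, "cat" sees only "the" and itself; under the bidirectional mask it sees the whole sentence.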

Read more »

To read: One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

Posted on 2018-11-27; updated on 2018-12-25; filed under Papers, Paper Reading

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, Tony Robinson
(Submitted on 11 Dec 2013 (v1), last revised 4 Mar 2014 (this version, v3))

Read more »

To read: Universal Transformers

Posted on 2018-11-27; updated on 2018-12-25; filed under Papers, Paper Reading

Tencent AI Lab proposes three optimizations for the Transformer translation model

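In 2017, Google released the Transformer, a machine learning model that far outperformed earlier algorithms on machine translation and other language understanding tasks. On 2018-08-16, Google released the latest version of the model, the Universal Transformer, which bridges the gap between practical sequence models that are competitive on large-scale language understanding tasks and computationally universal models; its BLEU score is 0.9 higher than that of the previous year's Transformer. The Universal Transformer generalizes markedly better on several difficult language understanding tasks, and it sets a new state of the art on the bAbI linguistic reasoning task and the challenging LAMBADA language modeling task.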

Read more »

Efficient Estimation of Word Representations in Vector Space

Posted on 2018-11-27; filed under Papers, Paper Reading

Efficient Estimation of Word Representations in Vector Space

Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean
(Submitted on 16 Jan 2013 (v1), last revised 7 Sep 2013 (this version, v3))

Read more »

TensorFlow Cookbook: Natural Language Processing

Posted on 2018-11-26; updated on 2019-01-07; filed under Deep Learning, Natural Language Processing

Natural Language Processing

Up to this point, we have only considered machine learning algorithms that mostly operate on numerical inputs. If we want to use text, we must find a way to convert the text into numbers. There are many ways to do this, and we will explore a few of the common ones.
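
One common way to turn text into numbers is a bag-of-words count vector. The following is a minimal sketch in plain Python, assuming whitespace tokenization and lowercasing; the helper names and the toy sentences are made up for illustration and are not taken from the cookbook.

```python
def build_vocabulary(texts):
    """Assign each distinct word an integer index in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def bag_of_words(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    vector = [0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:            # words outside the vocabulary are ignored
            vector[vocab[word]] += 1
    return vector

texts = ["free prize call now", "call me later", "Free free offer"]
vocab = build_vocabulary(texts)
vectors = [bag_of_words(text, vocab) for text in texts]
```

Each text becomes a fixed-length list of counts, which is a form that numerical learning algorithms can consume. Word order is lost, which is one reason sequence models such as RNNs are also used for text.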

Read more »

Neural Networks

Posted on 2018-11-24; updated on 2018-11-28; filed under Deep Learning, Neural Networks

Neural Networks

Neural Network
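An (artificial) neural network is a supervised machine learning model that dates back to the 1950s, when researchers conceived the idea of the "perceptron". Researchers in this field are often called "connectionists", because the model imitates the way the human brain functions. Neural network models are usually trained with gradient descent, with the gradients computed by the backpropagation algorithm. There are currently two main types of neural network: feedforward neural networks (chiefly convolutional neural networks, CNNs) and recurrent neural networks (RNNs), where RNNs in turn include subtypes such as long short-term memory (LSTM) and gated recurrent units (GRU). Deep learning is a set of techniques applied mainly to neural networks to help them achieve better results. Although neural networks are mainly used for supervised learning, there are also variants designed for unsupervised learning, such as autoencoders and generative adversarial networks (GANs).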
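
As a rough illustration of training by gradient descent, the sketch below fits a single sigmoid neuron (a close relative of the perceptron) to the logical AND function. With only one neuron, backpropagation reduces to a single chain-rule step; the dataset, learning rate, and iteration count are assumptions chosen for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Logical AND as a tiny supervised dataset: (inputs, target)
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

def predict(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def mean_loss(w, b):
    """Average cross-entropy loss over the dataset."""
    total = 0.0
    for x, y in data:
        p = predict(w, b, x)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(data)

w, b = [0.0, 0.0], 0.0
learning_rate = 0.5                      # assumed value for this toy problem
initial_loss = mean_loss(w, b)

for _ in range(5000):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for x, y in data:
        error = predict(w, b, x) - y     # dLoss/dz for sigmoid + cross-entropy
        grad_w[0] += error * x[0]        # chain rule: dLoss/dw = error * input
        grad_w[1] += error * x[1]
        grad_b += error
    # Gradient descent step on the averaged gradient
    w[0] -= learning_rate * grad_w[0] / len(data)
    w[1] -= learning_rate * grad_w[1] / len(data)
    b -= learning_rate * grad_b / len(data)

final_loss = mean_loss(w, b)
predictions = [round(predict(w, b, x)) for x, _ in data]
```

A multi-layer network follows the same loop, with the chain rule applied layer by layer from the output back to the input.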

Read more »

Linear Regression

Posted on 2018-11-24; filed under Machine Learning, Linear Regression

Linear Regression

Linear Regression (function) https://en.wikipedia.org/wiki/Linear_regression
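In the real world, there are many situations in which two variables, say X and Y, depend on each other to some degree: X partly determines the value of Y, but not exactly. The simplest and most intuitive example of such a dependence is height and weight: let X denote a person's height and Y that person's weight. As is well known, Y generally tends to be large when X is large, but X does not strictly determine Y. As another example, a city's residential electricity consumption Y is closely related to the temperature X. When it is very hot in summer or very cold in winter, the use of indoor air conditioners, refrigerators, and other household appliances can push consumption up; in spring and autumn, when temperatures are moderate, consumption may be lower. Even so, we cannot determine consumption Y exactly from temperature X. There are many similar examples. This kind of relationship between variables is called a "correlation", and regression models are a powerful tool for studying it.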
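
The height and weight example can be sketched as an ordinary least-squares fit of a straight line, weight ≈ slope × height + intercept. The data below are made-up numbers used purely for illustration, and `fit_line` is a hypothetical helper implementing the standard closed-form solution.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up sample: height X in cm, weight Y in kg
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 74, 87]

slope, intercept = fit_line(heights_cm, weights_kg)

# Residuals: the part of Y that X does not explain
residuals = [y - (slope * x + intercept) for x, y in zip(heights_cm, weights_kg)]
```

For this sample the fitted slope is about 0.86 kg per cm and the intercept about -78.2 kg. The residuals are not zero, which reflects the point made above: X tends to move Y, but does not determine it exactly.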

Read more »

Artificial Intelligence