Hierarchical Probabilistic Neural Network Language Model

  • 更新时间: 2017-04-19
  • 作者: Frederic Morin,Yoshua Bengio
  • 浏览数: 34
  • 发表评论

【摘 要】 In recent years, variants of a neural network ar- chitecture for statistical language modeling have been proposed and successfully applied, e.g. in the language modeling component of speech rec- ognizers. The main advantage of these architec- tures is that they learn an embedding for words (or other symbols) in a continuous space that helps to smooth the language model and pro- vide good generalization even when the number of training examples is insufficient. How- ever, these models are extremely slow in com- parison to the more commonly used n-gram models, both for training and recognition. As an alternative to an importance sampling method pro- posed to speed-up training, we introduce a hier- archical decomposition of the conditional proba- bilities that yields a speed-up of about 200 both during training and recognition. The hierarchical decomposition is a binary hierarchical cluster- ing constrained by the prior knowledge extracted from the WordNet semantic hierarchy.

【发布时间】 2005-01-01

【发布位置】 Aistats


我来评分 :6