Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT) Page: Parallel Viterbi (https://cwiki.apache.org/confluence/display/MAHOUT/Parallel+Viterbi)
Edited by Sergey Bartunov:

The Viterbi algorithm is an inference algorithm (synonyms: segmentation, decoding) for Hidden Markov Models \[1\] which finds the most likely sequence of hidden states given a sequence of observed states. Apache Mahout has both a [sequential|Hidden Markov Models] and a parallel (the subject of this page) implementation of the algorithm.

_This article is a work in progress._

h3. Parallelization

The strategy is quite straightforward and based on data parallelism. There are some studies on parallelizing Viterbi (and Belief Propagation, the inference algorithm for loop-free Markov Random Fields, which is quite similar to Viterbi), but at the time of writing none of them seem applicable to the MapReduce paradigm.

Input sequences of observed variables are divided into chunks of a length small enough that the O(N*K) data for a chunk fits in RAM. The algorithm processes the data chunk by chunk (that is, first all chunks number one, then all chunks number two, and so on), performing forward and backward Viterbi passes.

h3. Usage

Run "bin/mahout pviterbi" and see what it wants from you. That is:
* a serialized HmmModel (e.g. produced by the LossyHmmModelSerializer class)
* input data (observed sequences) in the format described above

*References*
# [Wikipedia article|http://en.wikipedia.org/wiki/Viterbi_algorithm]
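The chunk-by-chunk scheme described above can be sketched in plain Java. This is a minimal single-machine sketch, not Mahout's actual API; the class and method names are illustrative. Only the O(N) delta vector is carried between chunks during the forward pass, while each chunk keeps its own back-pointer table (in the MapReduce version, that table would be written out by the worker handling the chunk); the backward pass then walks the chunks in reverse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Minimal single-machine sketch of chunked Viterbi decoding.
 * Names are illustrative, not Mahout's actual API.
 * Log probabilities are used to avoid underflow on long sequences.
 */
public class ChunkedViterbi {
    // Toy model: 2 hidden states, 2 observable symbols.
    static final double[] logPi = logs(new double[]{0.6, 0.4});
    static final double[][] logA = {logs(new double[]{0.7, 0.3}), logs(new double[]{0.4, 0.6})};
    static final double[][] logB = {logs(new double[]{0.9, 0.1}), logs(new double[]{0.2, 0.8})};

    static double[] logs(double[] p) {
        double[] r = new double[p.length];
        for (int i = 0; i < p.length; i++) r[i] = Math.log(p[i]);
        return r;
    }

    /**
     * Forward pass over one chunk of observations: updates delta in
     * place and returns this chunk's back-pointer table.
     */
    static int[][] forwardChunk(double[] delta, int[] chunk) {
        int n = delta.length;
        int[][] psi = new int[chunk.length][n];
        for (int t = 0; t < chunk.length; t++) {
            double[] next = new double[n];
            for (int j = 0; j < n; j++) {
                int best = 0;
                double bestScore = Double.NEGATIVE_INFINITY;
                for (int i = 0; i < n; i++) {
                    double s = delta[i] + logA[i][j];
                    if (s > bestScore) { bestScore = s; best = i; }
                }
                next[j] = bestScore + logB[j][chunk[t]];
                psi[t][j] = best;  // best predecessor of state j at this step
            }
            System.arraycopy(next, 0, delta, 0, n);
        }
        return psi;
    }

    /** Full decode: forward over the chunks, then backward over them. */
    static int[] decode(int[] obs, int chunkLen) {
        int n = logPi.length;
        double[] delta = new double[n];
        for (int j = 0; j < n; j++) delta[j] = logPi[j] + logB[j][obs[0]];

        List<int[][]> psiChunks = new ArrayList<>();
        for (int start = 1; start < obs.length; start += chunkLen) {
            int end = Math.min(start + chunkLen, obs.length);
            psiChunks.add(forwardChunk(delta, Arrays.copyOfRange(obs, start, end)));
        }

        // Backward pass: pick the best final state, then follow the
        // back-pointers from the last chunk to the first.
        int[] path = new int[obs.length];
        int state = 0;
        for (int j = 1; j < n; j++) if (delta[j] > delta[state]) state = j;
        int t = obs.length - 1;
        path[t] = state;
        for (int c = psiChunks.size() - 1; c >= 0; c--) {
            int[][] psi = psiChunks.get(c);
            for (int k = psi.length - 1; k >= 0; k--) {
                state = psi[k][state];
                path[--t] = state;
            }
        }
        return path;
    }

    public static void main(String[] args) {
        // prints [0, 0, 1, 1]
        System.out.println(Arrays.toString(decode(new int[]{0, 0, 1, 1}, 2)));
    }
}
```

Note that the chunk length only affects memory layout, not the result: decoding with any chunk length yields the same most-likely path as the unchunked algorithm.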
