Given infinite memory, you should be able to duplicate the results of the Viterbi algorithm using Dijkstra’s algorithm. In many cases this should be more efficient. But it also isn’t subject to the same limitations as the Viterbi algorithm. In particular, it can handle insertion and deletion “errors” as well as “substitution errors”.
This also suggests you should be able to handle the longest-common-subsequence problem with Dijkstra’s algorithm as well. But I’m not clear that it is necessarily faster to do so. I haven’t tried either of these, and I don’t really understand much about this stuff, but here’s the idea. The Viterbi algorithm --------------------- The Viterbi algorithm calculates the most probable sequence of hidden states in a hidden Markov model to have produced some observed output symbols from the Markov process. It works by calculating, at each time step, the most probable path to reach each possible hidden state. It’s a simple dynamic programming algorithm: it examines all the possible state transitions from the previous state and for each one multiplies the a priori probability of that transition, transitions (specified by the HMM), the probabilities of the states they come from (calculated in the previous step), and the probability that that state produced the observed output symbol (specified by the HMM). Since the probabilities are only ever multiplied, this can be done by adding logarithms of probabilities instead of multiplying them directly, which allows the Viterbi algorithm to scale to long sequences. You can handle the problem of reading a text with substitution errors by treating the letters of the underlying text as the “hidden states”. Each letter has some probability of being seen as each letter; typically it is very likely to be seen as itself, and much less likely to be seen as other letters. Dijkstra’s algorithm -------------------- Dijkstra’s algorithm finds the shortest path from some origin node to all nodes in a weighted node-link graph by growing a spanning tree of that graph starting from the origin node, at each step adding the node closest to the origin node out of all the nodes not yet added to the tree. If you only care about the paths to one node or a few nodes, you can stop running the algorithm once the tree reaches them. The mapping ----------- The hidden states are the nodes of the tree; the edge weights are, I think, given by the sum of the log-probability of the transition in the underlying Markov model and of the observed output symbol appearing given that state. (Maybe I should say log-improbability: the log-probability itself is negative, and this is its absolute value.) This way, Dijkstra’s algorithm finds the same path that the Viterbi algorithm would. However, my intuition is that, since the edge weights differ greatly from each other, it will usually be able to find that path without considering most of the nodes in the graph. This should enable the use of underlying Markov models with larger sets of states. The cost, however, is that distances and paths to all the past hidden Markov states need to be retained (at least until all of their possible successor states have been explored), while the Viterbi algorithm can retain only the probability and path for each of the possible states for the previous timestep. Maybe more importantly, for domains where there’s no reliable framing (e.g. OCR of proportional fonts or speech recognition), it should allow the consideration of insertions and deletions, avoiding the Viterbi algorithm’s lockstep approach. An insertion error is simply a transition from a state at time T-2 to T, with the log-improbability of the observed symbol at T-1 being inserted added to the normal edge weight. I’m not sure exactly how to account for deletions: there’s an extra set of nodes associated with the deleted symbol, with transitions from the T-1 nodes to the deleted-symbol nodes and from them to the nodes at time T, but what probability should be assigned to those transitions? -- To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol