On Wed, 4 Jun 2008, John Melesky wrote:

> So you use those occurrence statistics to pick a feasible next word
> (let's choose "system", since it's the highest probability here -- in
> practice you'd probably choose one randomly based on a weighted
> likelihood). Then you look for all the word pairs which start with
> "system", and choose the next word in the same fashion. Repeat for as
> long as you want.
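That weighted pick could look like this (a minimal sketch only; the
names pickNext and Table and the count-list representation are my own
illustration, not anything from the thread):

  import qualified Data.Map as Map
  import System.Random (randomRIO)

  -- A first-level table: for each word, the words that followed it
  -- in the corpus, together with their occurrence counts.
  type Table = Map.Map String [(String, Int)]

  -- Pick a successor of 'w' at random, weighted by occurrence count.
  -- Returns Nothing if 'w' has no recorded successors.
  pickNext :: Table -> String -> IO (Maybe String)
  pickNext table w =
    case Map.lookup w table of
      Nothing      -> return Nothing
      Just []      -> return Nothing
      Just choices -> do
        let total = sum (map snd choices)
        r <- randomRIO (1, total)
        return (Just (select r choices))
    where
      -- Walk the list, subtracting counts until the draw lands.
      select r ((x, n) : rest)
        | r <= n    = x
        | otherwise = select (r - n) rest
      select _ [] = error "unreachable: counts sum to total"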
"Markov chain" means, that you have a sequence of random experiments, where the outcome of each experiment depends exclusively on a fixed number (the level) of experiments immediately before the current one. > Those word-pair statistics, when you have them for all the words in > your vocabulary, comprise the first-level Markov data for your corpus. > > When you extend it to word triplets, it's second-level Markov data > (and it will generate more reasonable fake text). You can build higher > and higher Markov levels if you'd like. If the level is too high, you will just reproduce the training text. _______________________________________________ Haskell-Cafe mailing list [email protected] http://www.haskell.org/mailman/listinfo/haskell-cafe
