Re: Markov Analysis Help

Andrew Lee Thu, 22 May 2008 09:51:20 -0700

dave wrote:

Hi Guys,
I've written a Markov analysis program and would like to get your comments on the code. As it stands now the final output comes out as a tuple, then list, then tuple. Something like ('the', 'water') ['us'] ('we', 'took')..etc...
I'm still learning so I don't know any advanced techniques or methods that may have made this easier.
here's the code:

def makelist(f):     #turn a document into a list
    fin = open(f)
    results = []
    for line in fin:
        line = line.replace('"', '')
        line = line.strip().split()
        for word in line:
            results.append(word)
    return results


What does your data look like?  Just straight text?

def markov(f, preflen=2): #f is the file to analyze, preflen is prefix length

    convert_file = makelist(f)
    mapdict = {}        #dict where the prefixes will map to suffixes
    start = 0
    end = preflen         #start/end set the slice size
    for words in convert_file:
        prefix = tuple(convert_file[start:end])     #tuple as mapdict key
        suffix = convert_file[end : end + 1]        #one-word list as suffix to key (works for any preflen)
        mapdict[prefix] = mapdict.get(prefix, []) + suffix #append suffixes
        start += 1
        end += 1
    return mapdict


What is convert_file??


import random

def randsent(f, amt=10):     #prints a random sentence
    analyze = markov(f)
    for i in range(amt):
        rkey = random.choice(analyze.keys())
        print rkey, analyze[rkey],


The book gave a hint saying to make the prefixes in the dict using:

def shift(prefix, word):
    return prefix[1:] + (word, )


That's not a very helpful hint.

It works if you call it with a tuple and a word --- it shifts off the front of the tuple ... so :


shift(('foo','bar'), "word")
becomes   ('bar', 'word')

Whoopty doo --- I'm not sure what that accomplishes!!
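One possible reading of the hint, offered only as a guess at what the book intends: shift() slides the prefix window forward by one word, so the dict can be built in a single pass without juggling start/end indexes. A minimal sketch follows. build_map and the sample sentence are made-up names for illustration, not from the book or from Dave's code, and the print call is written so it also runs on Python 3.

```python
def shift(prefix, word):
    """Drop the first word of the prefix tuple and append the new word."""
    return prefix[1:] + (word,)


def build_map(words, preflen=2):
    """Map each prefix tuple to the list of words seen right after it.

    build_map is a hypothetical helper name used only for this sketch.
    """
    mapdict = {}
    prefix = ()
    for word in words:
        if len(prefix) < preflen:
            # Still filling the very first prefix; nothing to record yet.
            prefix += (word,)
            continue
        # Record this word as a suffix of the current prefix ...
        mapdict.setdefault(prefix, []).append(word)
        # ... then slide the window forward by one word.
        prefix = shift(prefix, word)
    return mapdict


words = "the cat sat on the cat mat".split()
mapping = build_map(words)
print(mapping[("the", "cat")])  # ['sat', 'mat']
```

Note that a repeated prefix such as ('the', 'cat') collects every word that followed it, duplicates included.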

Unless the author means "pass a list and randomly pick a word from the list" in which case the return statement could be


(random.choice(prefix),) + (word, )

* shrug *

But -- that's not very Markov ... you'd want a weighted choice of words ... depending on how you define your Markov chain -- say a Markov chain based on part-of-speech or probability of occurrence from a given word-set.
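For the "probability of occurrence" flavor, one simple way to get a weighted choice is to keep duplicate suffixes in each list and pick with random.choice(), so a word that followed a prefix twice is twice as likely to be picked. A rough sketch under that assumption: generate, its parameters, and the tiny mapdict are invented here for illustration and are not part of the posted program.

```python
import random


def generate(mapdict, length=10, seed=None):
    """Walk a prefix -> suffix-list map and return up to `length` words.

    generate and its parameters are hypothetical names for this sketch.
    """
    rng = random.Random(seed)
    # Start from a random prefix; sorted() gives choice() a sequence.
    prefix = rng.choice(sorted(mapdict))
    out = list(prefix)
    while len(out) < length:
        suffixes = mapdict.get(prefix)
        if not suffixes:
            # Dead end: this prefix never had a following word.
            break
        # Duplicates in the list make frequent suffixes more likely.
        word = rng.choice(suffixes)
        out.append(word)
        prefix = prefix[1:] + (word,)
    return out


# Toy map: "the" followed ('we', 'took') twice, "a" only once.
mapdict = {
    ("we", "took"): ["the", "the", "a"],
    ("took", "the"): ["water"],
    ("took", "a"): ["boat"],
}
print(" ".join(generate(mapdict, length=4, seed=1)))
```

Each step here depends only on the current prefix, which is the Markov part; a part-of-speech chain would need a different map entirely.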


Can you give some more detail??

However I can't seem to wrap my head around incorporating that into the code above, if you know a method or could point me in the right direction (or think that I don't need to use it) please let me know.
Thanks for all your help,

Dave

--
http://mail.python.org/mailman/listinfo/python-list
