Hi Guys,

I've written a Markov analysis program and would like to get your comments on the code. As it stands now, the final output comes out as a tuple, then a list, then a tuple, and so on. Something like ('the', 'water') ['us'] ('we', 'took') etc.

I'm still learning so I don't know any advanced techniques or methods that may have made this easier.


Here's the code:

import random

def makelist(f):        #turn a document into a list of words
        fin = open(f)
        results = []
        for line in fin:
                line = line.replace('"', '')
                line = line.strip().split()
                for word in line:
                        results.append(word)
        fin.close()
        return results



def markov(f, preflen=2):       #f is the file to analyze, preflen is prefix length
        convert_file = makelist(f)
        mapdict = {}            #dict where the prefixes will map to suffixes
        start = 0
        end = preflen           #start/end set the slice size
        for i in range(len(convert_file) - preflen):    #stop while a suffix still exists
                prefix = tuple(convert_file[start:end])     #tuple as mapdict key
                suffix = convert_file[end : end + 1]        #one-word list as suffix to key
                mapdict[prefix] = mapdict.get(prefix, []) + suffix #append suffixes
                start += 1
                end += 1
        return mapdict



def randsent(f, amt=10):     #prints a random sentence
        analyze = markov(f)
        for i in range(amt):
                rkey = random.choice(analyze.keys())
                print rkey, analyze[rkey],
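
To make the prefix/suffix mapping concrete, here is the same idea on a short in-memory word list. This is only a simplified sketch, not the code above: build_map is a name made up for the example, the sample sentence is invented, and it skips the file reading entirely.

```python
def build_map(words, preflen=2):
    """Map each prefix tuple to the list of words that follow it."""
    mapping = {}
    # Stop preflen words before the end so every prefix has a suffix.
    for i in range(len(words) - preflen):
        prefix = tuple(words[i:i + preflen])
        mapping.setdefault(prefix, []).append(words[i + preflen])
    return mapping

words = 'the water took us and the water took them'.split()
mapping = build_map(words)
# mapping[('the', 'water')] is ['took', 'took']
# mapping[('water', 'took')] is ['us', 'them']
```

Each key is a tuple of preflen words, and each value is the list of every word seen right after that prefix, repeats included.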


The book gave a hint saying to make the prefixes in the dict using:

def shift(prefix, word):
        return prefix[1:] + (word, )
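
As far as I can tell, this is what shift does when called by hand. It is only a small sketch to check my understanding; the sample words are made up.

```python
def shift(prefix, word):
    # Drop the oldest word and append the new one; the length stays the same.
    return prefix[1:] + (word,)

prefix = ('the', 'water')
prefix = shift(prefix, 'took')   # now ('water', 'took')
prefix = shift(prefix, 'us')     # now ('took', 'us')
```

So each call seems to slide the prefix window forward by one word.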

However, I can't seem to wrap my head around incorporating that into the code above. If you know a method or could point me in the right direction (or think that I don't need to use it), please let me know.

Thanks for all your help,

Dave

--
http://mail.python.org/mailman/listinfo/python-list
