Re: [Tutor] Bigrams and nested dictionaries

Michael Broe Mon, 03 Apr 2006 21:39:49 -0700

Well, coming up with this has made me really love Python. I worked on
this with my online pythonpenpal Kyle, and here is what we came up
with. Thanks to all for the input so far.


My first idea was to use a C-style indexing for-loop to grab a
two-element sequence [i, i+1]:

# t holds the text to scan
dict = {}
for i in range(len(t) - 1):
        if not dict.has_key(t[i]):
                dict[t[i]] = {}
        if not dict[t[i]].has_key(t[i+1]):
                dict[t[i]][t[i+1]] = 1
        else:
                dict[t[i]][t[i+1]] += 1
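
To make the resulting structure concrete, here is a minimal sketch of the same indexing loop wrapped in a function and run on a tiny input. The function name bigram_counts_indexed and the sample string are illustrative assumptions, not part of our script, and the sketch is written in Python 3 syntax (so it uses "in" rather than has_key).

```python
def bigram_counts_indexed(t):
    """Count bigrams in t using index-based access (hypothetical helper)."""
    counts = {}
    for i in range(len(t) - 1):
        left, right = t[i], t[i + 1]
        if left not in counts:
            counts[left] = {}
        # dict.get returns 0 the first time a (left, right) pair is seen
        counts[left][right] = counts[left].get(right, 0) + 1
    return counts


sample = "banana"
print(bigram_counts_indexed(sample))
# {'b': {'a': 1}, 'a': {'n': 2}, 'n': {'a': 2}}
```

The outer key is the first character of each bigram, and the inner dictionary counts every character that follows it.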

The indexing loop works, but Kyle had an alternative take, with no
indexing. After we worked on this strategy it seemed very Pythonesque,
and it ran almost twice as fast.

----

#!/usr/local/bin/python

import sys
file = open(sys.argv[1], 'rb').read()

# We imagine a 2-byte 'window' moving over the text from left to right
#
#          +-------+
# L  o  n  | d   o |  n  .  M  i  c  h  a  e  l  m  a  s      t  e  r  m ...
#          +-------+
#
# At any given point, we call the leftmost byte visible in the window 'L',
# and the rightmost byte 'R'.
#
#          +-----------+
# L  o  n  | L=d   R=o |  n  .  M  i  c  h  a  e  l  m  a  s      t  e  r  m ...
#          +-----------+
#
# When the program begins, the first byte is preloaded into L, and we
# position R at the second byte of the file.
#

dict = {}

L = file[0]
for R in file[1:]:      # move right edge of window across the file
        if L not in dict:
                dict[L] = {}
                
        if R not in dict[L]:
                dict[L][R] = 1
        else:
                dict[L][R] += 1
                
        L = R           # move character in R over to L

# that's it. here's a printout strategy:
        
for entry in dict:
        print entry, ':', sum(dict[entry].values())
        print dict[entry]
        print

----
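
As a possible variation on the window idea (not what Kyle and I actually ran), the L/R pairing can be sketched in Python 3 with zip, which pairs each character with its successor, and with collections.defaultdict and collections.Counter from the standard library to take care of the "first time seen" checks. The function name bigram_counts_window and the sample text are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bigram_counts_window(text):
    """Count bigrams by pairing each character with its successor."""
    counts = defaultdict(Counter)
    # zip(text, text[1:]) yields one (L, R) pair per window position
    for left, right in zip(text, text[1:]):
        counts[left][right] += 1
    return counts


counts = bigram_counts_window("London. Michaelmas term")
for entry in counts:
    # total number of bigrams starting with this character
    print(entry, ":", sum(counts[entry].values()))
    print(dict(counts[entry]))
    print()
```

One small difference from the script above: on an empty input, zip simply yields nothing, whereas preloading L with file[0] raises an IndexError.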



