On May 4, 2:04 am, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> En Sun, 04 May 2008 02:17:07 -0300, dave <[EMAIL PROTECTED]> escribió:
>
>
>
> > Hello,
>
> > I made a function that takes a word list (one word per line, text file)
> > and searches for all the words in the list that are 'shifts' of
> > eachother. 'abc' shifted 1 is 'bcd'
>
> > Please take a look and tell me if this is a viable solution.
>
> > def shift(word, amt):
> > ans = ''
> > for letter in word:
> > ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
> > return ans
>
> > def fileshift(x):
> > fin = open(x)
> > d = {}
> > for line in fin:
> > d[line.strip()] = [1]
> > for i in range(1, 26):
> > ite = shift(line.strip(), i)
> > if ite in d:
> > print ite
>
> > Any tips/suggestions/critiques greatly appreciated.. I'm trying to
> > teach myself Python (and still beginning) and would love any helpful
> > info.
>
> First, looking at the code, you're evaluating line.strip() a lot of times;
> I'd avoid it.
> Looks like you're using a dictionary just for the keys - the set type is more
> adequate here (seehttp://docs.python.org/lib/types-set.html). In any case,
> I'd use a value like None instead of [1]
> But I'd use a different algorithm. Instead of generating all posible shifts
> for a given word, I'd substract the newly read word from each previous words
> (here, "substract two words" means substract the character code for
> corresponding positions, modulo 26). Shifted words, when substracted, have
> the same number on all positions.
A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.
Here's a straightforward implementation that generates all the lists
of equally-shifted words:
from collections import defaultdict
def iter_shifted(words):
key2shifted = defaultdict(list)
for word in words:
ords = map(ord,word)
key = tuple((ords[i]-ords[i-1]) % 26
for i in xrange(1,len(ords)))
key2shifted[key].append(word)
for shifted in key2shifted.itervalues():
if len(shifted) > 1:
yield shifted
if __name__ == '__main__':
words = 'abc bef jas cba cde zab azy hkl'.split()
for shifted in iter_shifted(words):
print shifted
# *** output ***
#['bef', 'hkl']
#['abc', 'cde', 'zab']
#['cba', 'azy']
--
http://mail.python.org/mailman/listinfo/python-list