On Sun, Oct 5, 2008 at 1:52 PM, Robert Berman <[EMAIL PROTECTED]> wrote:

> The database item consists of the key; the actual word, and the value, the
> size as a string. For example, the word 'myth' is represented as key=
> 'myth', value = '4'.  I think the slow part of the algorithm is the script
> retrieves a list of all database words of length (myth)=4. That's a lot of
> words. Each word is sorted  and then compared to the alpha sorted word. When
> there is a match, that word becomes a part of the anagram list. Even a very
> basic look at the word 'myth' shows there are no anagrams. So, to return an
> empty list, we have scanned all words in the database of size 4.

This is pretty inefficient, yes. You actually scan every word in the
database because you have to find the ones of length 4. You are really
using the database as just a list of words. A better data structure
would be to index by word length with the value being a list of words
of that length. Even better would be to index by the sorted letters in
the word, with the value being a list of all words with that sorting.

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to