Kent,
Thank you for two excellent suggestions. I will implement your
suggestion of indexing by the sorted letters in the word.
Robert
Kent Johnson wrote:
On Sun, Oct 5, 2008 at 1:52 PM, Robert Berman <[EMAIL PROTECTED]> wrote:
The database item consists of the key; the actual word, and the value, the
size as a string. For example, the word 'myth' is represented as key=
'myth', value = '4'. I think the slow part of the algorithm is the script
retrieves a list of all database words of length (myth)=4. That's a lot of
words. Each word is sorted and then compared to the alpha sorted word. When
there is a match, that word becomes a part of the anagram list. Even a very
basic look at the word 'myth' shows there are no anagrams. So, to return an
empty list, we have scanned all words in the database of size 4.
This is pretty inefficient, yes. You actually scan every word in the
database because you have to find the ones of length 4. You are really
using the database as just a list of words. A better data structure
would be to index by word length with the value being a list of words
of that length. Even better would be to index by the sorted letters in
the word, with the value being a list of all words with that sorting.
Kent
|
_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor