I think one way to solve the problem may be:

1) Create a little Python script that splits the original words into several files, each one containing only words of the same length. Each filename can contain the corresponding word length.

2) Write a little C program with two nested loops that scans all the pairs of words in a single file, looking for pairs that differ in a single char by comparing them char by char.

There are probably many tricks to speed up this search (like splitting the words into subgroups, or using some assembly-derived tricks, etc.), but maybe you don't need them. As the C data structure you can simply use a single block of memory: you know the length of a single word, so the files produced by Python can be written without spaces or newlines.
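A rough sketch of both steps in Python, just to make the idea concrete. Step 2 is meant to end up in C; here it is only a prototype. The function names, the words_N.txt file naming, and the assumption that the words are plain ASCII (one byte per char) are mine, not part of the original problem.

```python
import os
from array import array
from collections import defaultdict


def split_by_length(words, out_dir):
    """Step 1: write one file per word length, as a flat block with no separators.

    Returns a dict mapping word length -> path of the file written.
    Assumes ASCII words; the words_N.txt naming is only an illustration.
    """
    groups = defaultdict(list)
    for w in words:
        w = w.strip()
        if w:
            groups[len(w)].append(w)
    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for length, group in groups.items():
        path = os.path.join(out_dir, "words_%d.txt" % length)
        with open(path, "w", encoding="ascii") as f:
            f.write("".join(group))  # no spaces, no newlines
        paths[length] = path
    return paths


def differ_by_one(block, i, j, n):
    """True if the n-byte words at indices i and j differ in exactly one position."""
    a = i * n
    b = j * n
    diffs = 0
    for k in range(n):
        if block[a + k] != block[b + k]:
            diffs += 1
            if diffs > 1:
                return False  # stop early, this pair cannot qualify
    return diffs == 1


def one_char_pairs(data, n):
    """Step 2 prototype: two nested loops over a flat block of n-byte words."""
    block = array("B", data)  # unsigned bytes, initialised from a bytes object
    count = len(block) // n
    pairs = []
    for i in range(count):
        for j in range(i + 1, count):  # each unordered pair is tested once
            if differ_by_one(block, i, j, n):
                first = data[i * n:(i + 1) * n].decode("ascii")
                second = data[j * n:(j + 1) * n].decode("ascii")
                pairs.append((first, second))
    return pairs
```

To use it, read one of the files back in binary mode and pass its contents with the matching length, for example one_char_pairs(open(paths[3], "rb").read(), 3). The same indexing (word i starts at offset i * n) carries over directly to a single char block in C.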
I suggested C because if the words are all of the same length then you have 30000^2 = 900 000 000 ordered pairs to test (about half that if each unordered pair is compared only once). If you want, before writing the C program you can create a little Python+Psyco prototype that uses array.array of chars (because Psyco sometimes handles them quite quickly).

Bye,
bearophile