> Is there a clever way to see if two strings of the same length vary by
> only one character, and what the character is in both strings.
>
> E.g. str1=yaqtil str2=yaqtel
>
> they differ at str1[4] and the difference is ('i','e')
>
> But if there was str1=yiqtol and str2=yaqtel, I am not interested.
>
> can anyone suggest a simple way to do this?
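
For the equal-length case in the question, a plain position-by-position comparison may already be enough. The following is only a minimal sketch under that assumption; the function name `single_difference` is made up for illustration and is not from any library.

```python
def single_difference(s1, s2):
    """Return (index, (c1, c2)) if s1 and s2 differ at exactly one
    position, otherwise None. Hypothetical helper, for illustration only."""
    if len(s1) != len(s2):
        return None
    # Collect every position where the two strings disagree.
    diffs = [(i, (a, b)) for i, (a, b) in enumerate(zip(s1, s2)) if a != b]
    # Only a single mismatch is of interest.
    return diffs[0] if len(diffs) == 1 else None


print(single_difference("yaqtil", "yaqtel"))  # (4, ('i', 'e'))
print(single_difference("yiqtol", "yaqtel"))  # None
```

Because both strings have the same length, exactly one mismatching position is the same thing as an edit distance of 1, so no general edit-distance routine is needed for this particular check.
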
Use the Levenshtein distance.

http://en.wikisource.org/wiki/Levenshtein_distance

> My next problem is, I have a list of 300,000+ words and I want to find
> every pair of such strings. I thought I would first sort on length of
> string, but how do I iterate through the following:
>
> str1
> str2
> str3
> str4
> str5
>
> so that I compare str1 & str2, str1 & str3, str1 & str4, str1 & str5,
> str2 & str3, str3 & str4, str3 & str5, str4 & str5.

decorate-sort-undecorate is the idiom for this:

    # l is your list of strings
    l = [(len(w), w) for w in l]   # decorate each word with its length
    l.sort()                       # sort by length, then alphabetically
    l = [w for _, w in l]          # undecorate: keep only the words

Diez
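
Once the words are sorted by length, one possible way to visit each pair exactly once is `itertools.combinations`, applied per length group. This is a sketch only, assuming Python 3; the name `pairs_of_same_length` and the sample words `qatal` and `qatel` are invented for illustration.

```python
from itertools import combinations, groupby


def pairs_of_same_length(words):
    """Yield every unordered pair (a, b) of words that share a length.
    Hypothetical helper, for illustration only."""
    # sorted(key=len) gives the same length ordering as the
    # decorate-sort-undecorate idiom.
    ordered = sorted(words, key=len)
    for _, group in groupby(ordered, key=len):
        # combinations yields (w1, w2), (w1, w3), ..., (w2, w3), ...
        # so no pair is produced twice.
        yield from combinations(list(group), 2)


# The first three words are from the question; the last two are made up.
words = ["yaqtil", "yaqtel", "yiqtol", "qatal", "qatel"]
pairs = list(pairs_of_same_length(words))
for a, b in pairs:
    print(a, b)
```

With 300,000+ words the number of pairs inside a large length group grows quadratically, so this may be slow on the full list; it is meant to show the iteration order, not a tuned solution.
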