On Nov 17, 7:19 pm, Steven D'Aprano <[EMAIL PROTECTED]> wrote: [snip]
> You want to see "HIDEDCT1" match closer to "HIDESCT1" than "HIDEDST1":
>
> HIDEDCT1 -- John's "best match" target string
> HIDEDST1 -- difflib's "best match" target string
> HIDESCT1 -- source string
>
> John's best match matches in seven of eight positions, compared to six
> of eight for the difflib best match. Disregarding order, both have seven
> matching characters. That's a pretty slim difference between the two:
>
> >>> "".join(difflib.Differ().compare("HIDEDCT1", "HIDESCT1"))
> '  H  I  D  E- D+ S  C  T  1'
> >>> "".join(difflib.Differ().compare("HIDEDST1", "HIDESCT1"))
> '  H  I  D  E- D  S+ C  T  1'
>
> I honestly don't know how to interpret those results :(

Take 1, firing from the hip: The formatting of that output is
suboptimal. Better would be:

' H I D E -D +S C T 1'    interpreted as "delete D, insert S"
' H I D E -D S +C T 1'    interpreted as "delete D, insert C"

After reflection that the author of difflib is known not to be so
silly, take 2:

| >>> list(difflib.Differ().compare("HIDEDCT1", "HIDESCT1"))
['  H', '  I', '  D', '  E', '- D', '+ S', '  C', '  T', '  1']

And the docs shed some light on what is going on:

| >>> help(difflib.Differ().compare)
Help on method compare in module difflib:

compare(self, a, b) method of difflib.Differ instance
    Compare two sequences of lines; generate the resulting delta.

    Each sequence must contain individual single-line strings ending with
    newlines. Such sequences can be obtained from the `readlines()` method
    of file-like objects.  The delta generated also consists of newline-
    terminated strings, ready to be printed as-is via the writeline()
    method of a file-like object.

    Example:

    >>> print ''.join(Differ().compare('one\ntwo\nthree\n'.splitlines(1),
    ...                                'ore\ntree\nemu\n'.splitlines(1))),
    - one
    ?  ^
    + ore
    ?  ^
    - two
    - three
    ?  -
    + tree
    + emu

HTH,
John
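
P.S. A rough sketch of the point above, assuming Python 3 (the docstring example uses the Python 2 print statement). Differ.compare() wants sequences of lines, so handing it a bare string makes every character count as one "line", and each delta entry gets a two-character prefix. The helper names char_delta and similarity are made up for illustration. The ratio line is only there to show why the two candidates are so hard to tell apart; it is not a claim about how the best match was actually picked in the original code.

```python
import difflib

source = "HIDESCT1"
candidates = ["HIDEDCT1", "HIDEDST1"]


def char_delta(a, b):
    """Hypothetical helper: Differ's delta for two strings, one entry per character."""
    # A string is iterated character by character, so Differ treats each
    # character as a "line" and prefixes it with "  ", "- " or "+ ".
    return list(difflib.Differ().compare(a, b))


def similarity(a, b):
    """Hypothetical helper: SequenceMatcher's ratio, 2*M/T for M matched characters."""
    return difflib.SequenceMatcher(None, a, b).ratio()


deltas = {c: char_delta(c, source) for c in candidates}
ratios = {c: similarity(c, source) for c in candidates}

for c in candidates:
    print(c, ratios[c], deltas[c])
# HIDEDCT1 0.875 ['  H', '  I', '  D', '  E', '- D', '+ S', '  C', '  T', '  1']
# HIDEDST1 0.875 ['  H', '  I', '  D', '  E', '- D', '  S', '+ C', '  T', '  1']
```

Both candidates have seven matching characters out of eight against the source, so the ratio comes out the same (2*7/16) for each, which fits the "pretty slim difference" remark in the quoted text.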