In article <mailman.1002.1310591600.1164.python-l...@python.org> Vlastimil Brom <vlastimil.b...@gmail.com> wrote: >I'd like to ask about the availability of a text diff library, like >difflib, which would support the detection of moved text blocks.
If you allow arbitrary moves, the "minimal edit distance" problem (string-to-string edit) becomes substantially harder. If you only allow insert, delete, or in-place-substitute, you have what is called the "Levenshtein distance" case. If you allow transpositions you get "Damerau-Levenshtein". These are both solveable with a dynamic programming algorithm. Once you allow move operations, though, the problem becomes NP-complete. See http://pages.cs.brandeis.edu/~shapird/publications/JDAmoves.pdf for instance. (They give an algorithm that produces "usually acceptable" results in polynomial time.) -- In-Real-Life: Chris Torek, Wind River Systems Intel require I note that my opinions are not those of WRS or Intel Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603 email: gmail (figure it out) http://web.torek.net/torek/index.html
-- http://mail.python.org/mailman/listinfo/python-list