> On Oct 9, 5:41 pm, Stefan Behnel <stefan...@behnel.de> wrote:
> > "Number of characters" sounds like a rather useless measure here.
>
> What I meant by number of characters was the number of edits happened
> between the two versions..Levenshtein distance may be one way for
> this..but I was wondering if difflib could do this
> regards

As pointed out above, you also need to consider how the structure of the web page has changed. If you are only looking at plain text, the Levenshtein distance measures the minimum number of single-character edit operations (insertion, deletion or substitution) necessary to transform string A into string B.

Cheers,
Emm
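
For the plain-text case, here is a minimal sketch of both ideas in Python. The function names levenshtein and difflib_edit_count are made up for illustration. The first is the textbook dynamic-programming edit distance; the second just adds up the sizes of the non-equal opcodes from difflib.SequenceMatcher. As far as I know difflib does not try to produce a minimal edit script, so the second number should be read as an upper bound on the Levenshtein distance rather than the distance itself.

```python
import difflib


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def difflib_edit_count(a: str, b: str) -> int:
    """Rough edit count derived from difflib opcodes.

    Assumption: a 'replace' of m chars by n chars is counted as
    max(m, n) single-character edits. Not guaranteed to be minimal.
    """
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            total += max(i2 - i1, j2 - j1)
    return total
```

Calling levenshtein(old_text, new_text) and difflib_edit_count(old_text, new_text) on the two versions of a page's text gives two comparable numbers. Note that the pure-Python loop is O(len(a) * len(b)), so for whole pages it may be more practical to compare line by line or token by token than character by character.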