> Usually, Differ receives two sequences of lines, each line being a
> sequence of characters (a string). It uses a SequenceMatcher to compare
> lines; the linejunk argument is used to ignore certain lines. For each
> pair of similar lines, it uses another SequenceMatcher to compare
> characters inside the lines; the charjunk argument is used to ignore
> characters.
> As you are feeding Differ a single string (not a list of text lines),
> the "lines" it sees are just characters. To ignore whitespace and
> newlines in this case, one should use the linejunk argument:
>
>     import difflib
>
>     def ignore_ws_nl(c):
>         # Treat whitespace and newline characters as junk.
>         return c in " \t\n\r"
>
>     # d1 and d2 are the two strings being compared.
>     a = difflib.Differ(linejunk=ignore_ws_nl).compare(d1, d2)
>     dif = list(a)
>     print(''.join(dif))
>
> I n a d d i t i o n , t h e c o n s i d e
> r e
> d p r o b l e m d o e s n o t h a v e
> a m
> e a n i n g f u l t r a d i t i o n a l t y
> p e
> o f- +
> a d j o i n t-
> + p r o b l e m e v e n f o r t h e s i
> m p
> l e f o r m s o f t h e d i f f e r e n t
> i a
> l e q u a t i o n a n d t h e n o n l o
> c a l
> c o n d i t i o n s . D u e- +
> t o t h e s e f a c t s , s o m e s e r
> i o
> u s d i f f i c u l t i e s a r i s e i n
> t h
> e a p p l i c a t i o n o f t h e c l a
> s s i
> c a l m e t h o d s t o s u c h a- +
> p r o b l e m .+
>
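
To make the difference between the two modes described in the quoted text concrete, here is a minimal sketch (Python 3). The sample strings `text1` and `text2` are made up for illustration and are not taken from the documents in this thread; `ignore_ws_nl` is the helper from the quoted message.

```python
import difflib

def ignore_ws_nl(c):
    # Treat whitespace and newline characters as junk.
    return c in " \t\n\r"

# Hypothetical sample texts, not from the original thread.
text1 = "the quick brown fox\njumps over the dog\n"
text2 = "the quick brown fox\njumps over the lazy dog\n"

# Line mode: each element handed to Differ is a whole line.
line_diff = list(difflib.Differ().compare(
    text1.splitlines(keepends=True),
    text2.splitlines(keepends=True)))

# Character mode: passing plain strings makes every character a "line",
# so linejunk (not charjunk) is what filters individual characters.
char_diff = list(difflib.Differ(linejunk=ignore_ws_nl).compare(text1, text2))

print("".join(line_diff))
print("".join(char_diff))
```

In line mode Differ first pairs up similar lines and only then looks at the characters inside them, which is the usual way to call it; in character mode every entry of the result is a two-character prefix followed by a single character.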

Thanks! It works fine, but I was wondering why the result isn't consistent. I am comparing two huge documents with several paragraphs in each. Some parts of a paragraph are diffed perfectly, but others aren't. I am confused.

Thanks.
Jen
-- 
http://mail.python.org/mailman/listinfo/python-list