On Sat, 2011-01-01, Johan Corveleyn wrote: > [ Taking a privately-started discussion with danielsh to the list, in > case others have inspiration/insight about this. Question at hand: I'm > having trouble making diff3 work with prefix/suffix scanning of the > diff-optimizations-bytes branch. Any feedback is highly appreciated > :-). ] > > On Fri, Dec 31, 2010 at 2:38 PM, Daniel Shahaf <d...@daniel.shahaf.name> > wrote: > [...snip...] > > In diff3 with prefix enabled, I was seeing failures like this: > > > > EXPECTED: > > line1 > > <<< > > line2 > > === > > line2 modified > >>>> > > > > ACTUAL: > > <<< > > line1 > > line2 > > === > > line1 > > line2 modified > >>>> > > > > so I assumed that when the prefix code gobbled the shared prefix. How > > to fix it from there was purely guesswork on my part (and I haven't > > implemented it). > > > > These failures would go away if I prepended a "line0 left" and "line0 > > right" to the file. > > Yes, I know. That's indeed because the prefix-scanning code gobbled up > the identical prefix. That problem is also there in diff2. > > In diff2, I fixed this by simply pre-pending a piece of "common" diff > to the diff linked list, in the function diff.c#svn_diff__diff. > > >From r1037371: > [[[ > --- subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff.c > (original) > +++ subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff.c > Sun Nov 21 02:36:07 2010 > @@ -43,6 +43,22 @@ svn_diff__diff(svn_diff__lcs_t *lcs, > svn_diff_t *diff; > svn_diff_t **diff_ref = &diff; > > + if (want_common && (original_start > 1)) > + { > + /* we have a prefix to skip */ > + (*diff_ref) = apr_palloc(pool, sizeof(**diff_ref)); > + > + (*diff_ref)->type = svn_diff__type_common; > + (*diff_ref)->original_start = 0; > + (*diff_ref)->original_length = original_start - 1; > + (*diff_ref)->modified_start = 0; > + (*diff_ref)->modified_length = modified_start - 1; > + (*diff_ref)->latest_start = 0; > + (*diff_ref)->latest_length = 0; > + > + diff_ref = &(*diff_ref)->next; > + }
Hi Johan. I know this is work in progress but that whole "if" block of code is a good example of where a comment above the block would be *really* useful to tell the reader what it's for. :-) (Also, all that dereferencing is unnecessary in that particular block, but I looked at the rest of the function and saw that that style is used in two other blocks where it is not unnecessary. Personally I would find the following style more readable throughout the function: { svn_diff_t *new_diff = apr_palloc(pool, sizeof(*new_diff)); new_diff->type = ... ... *diff_ref = new_diff; diff_ref = &new_diff->next; } ) > while (1) > { > if (original_start < lcs->position[0]->offset > ]]] > > Naively, I wanted to do the same for diff3 (along with some other > tricks, to make the algorithm aware of the prefix_lines), but it > doesn't work. See the patch in attachment. > > I mean: it works partly: diff-diff3-test.exe passes with the attached > patch, but merge_reintegrate_tests.py (1, 2, 10, 13 and 14) and > merge_tests.py (61) don't. I haven't studied the test failures in > detail (I probably should, but I currently can't wrap my head around > those tests). > > I'm now pondering another approach, to pre-pend an "lcs" piece to the > lcs linked list, and let the rest of the algorithm do its work as > normal. Inspiration for this came when I was reading the code of > diff3.c#svn_diff__resolve_conflict, where a similar trick is used > around line 121: > [[[ > /* If there were matching symbols at the start of > * both sequences, record that fact. > */ > if (common_length > 0) > { > lcs = apr_palloc(subpool, sizeof(*lcs)); > lcs->next = NULL; > lcs->position[0] = start_position[0]; > lcs->position[1] = start_position[1]; > lcs->length = common_length; > > lcs_ref = &lcs->next; > } > ]]] > > Maybe I can even integrate this logic into lcs.c#svn_diff__lcs, where > I'm already passing the prefix_lines as a new parameter. If that would > work out, I could probably remove the special common-diff-injecting > code in diff.c#svn_diff__diff. > > Thoughts? Just from reading what you say here, that last solution sounds much better. - Julian > It will take me some time/experimentation to work out the details, but > I think that should work ... > > Cheers,