On Tue, 24 Apr 2018 16:23:28 -0700
Stefan Beller <[email protected]> wrote:
> >> s/missmatch/mismatch/
> >> Also, what is this used for?
> >
> > The mismatch should be used for (thanks for catching!)
> > checking if the remainder of a line is the same, although a boolean
> > may be not the correct choice. We know that the two strings
> > passed into compute_ws_delta come from a complete white space
> > agnostic comparison, so consider:
> >
> > + SP SP more TAB more
> > + SP SP text TAB text
> >
> > - SP more TAB more
> > - SP text TAB text
> >
> > which would mark this as a moved block. This is the feature
> > working as intended, but what about
> >
> > + SP SP more TAB more
> > + SP SP text TAB text
> >
> > - SP more SP more
> > - SP text TAB text
> >
> > Note how the length of the strings is the same, hence the current
> > code of
> >
> > compute_ws_delta(...) {
> > int d = longer->len - shorter->len;
> > out->string = xmemdupz(longer->line, d);
> > }
> >
> > would work fine and not notice the "change in the remainder".
> > That ought to be caught by the mismatch variable, that
> > is assigned, but not used.
> >
> > The compare_ws_delta(a, b) needs to be extended to
> >
> > !a->mismatch && !b->mismatch && existing_condition
> >
> > Ideally the example from above would be different depending
> > on whether the other white space flags are given or not.
Thanks - this gives me food for thought.
I'm starting to think that it is impossible to avoid creating our own
string comparison function that:
- seeks to the first non-whitespace character in both strings
- checks that both strings, from that first non-whitespace characters,
are equal for some definition of equal (whether through strcmp or
xdiff_compare_lines)
- walks backwards from that first non-whitespace characters to look for
the first non-matching whitespace character between the 2 strings
The existing diff whitespace modes (to be passed to xdiff_compare_lines)
do not give the exact result we want. For example, if
XDF_IGNORE_WHITESPACE is used (as is in this patch), lines like "a b"
and "ab " would match even though they shouldn't.
This comparison function can be used both in moved_entry_cmp() and when
constructing the ws_delta (in which case it should be made to output
whatever information is needed as out parameters or similar).
> >>> + if (diffopt->color_moved_ws_handling &
> >>> COLOR_MOVED_DELTA_WHITESPACES)
> >>> + /*
> >>> + * As there is not specific white space config given,
> >>> + * we'd need to check for a new block, so ignore all
> >>> + * white space. The setup of the white space
> >>> + * configuration for the next block is done else where
> >>> + */
> >>> + flags |= XDF_IGNORE_WHITESPACE;
> >>> +
> >>> return !xdiff_compare_lines(a->es->line, a->es->len,
> >>> b->es->line, b->es->len,
> >>> flags);
We can't add XDF_IGNORE_WHITESPACE here - as far as I can tell, this
means that more lines will be treated as moved than the user wants (if
the user did not set --color-moved-ignore-all-space).
> >> It seems like pmb and wsd are parallel arrays - could each wsd be
> >> embedded into the corresponding entry of pmb instead?
> >
> > I'll explore that. It sounds like a good idea for code hygiene.
> > Although if you do not intend to use this feature, then keeping it separate
> > would allow for a smaller footprint in memory.
If you're worried about memory, wsd can be embedded as a pointer.
> > The command is missing a --color-moved, as the
> > --color-moved-whitespace-settings
> > do not imply --color-moved, yet(?)
Maybe one should imply the other (or at least a warning).