On Thu, Oct 19, 2017 at 1:31 PM, Jeff King <[email protected]> wrote:
> For computing moved lines, we feed the characters of each
> line into a hash. When we've been asked to ignore
> whitespace, then we pick each character using next_byte(),
> which returns -1 on end-of-string, which it determines using
> the start/end pointers we feed it.
>
> However our check of its return value treats "0" the same as
> "-1", meaning we'd quit if the string has an embedded NUL.
I agree. The code looks correct.
> This is unlikely to ever come up in practice since our line
> boundaries generally come from calling strlen() in the first
> place.
get_string_hash is called from
prepare_entry, which in turn is called from
add_lines_to_move_detection or mark_color_as_moved
diff_flush_patch_all_file_pairs
that constructs the arguments in
diff_flush_patch
run_diff
run_diff_cmd
builtin_diff (part "/* Crazy xdl interfaces.. */")
xdi_diff_outf( fn_out_consume as arg!)
xdi_diff
xdl_diff
xdl_call_hunk_func
-> fn_out_consume(cb, line, len)
xdl_call_hunk_func however uses pointer arithmetic instead
of strlen. So I think this sentence is not a good idea to put in
the commit message.
It may not occur in practice, due to binary files detection using
NUL as a signal, but conceptually our move-colored(!) diffs
should be compatible with NULs with this patch now.
> But it was a bit surprising to me as a reader of the
> next_byte() code. And it's possible that we may one day feed
> this function with more exotic input, which otherwise works
> with arbitrary ptr/len pairs.
Good point.
>
> Signed-off-by: Jeff King <[email protected]>
> ---
> I noticed that we make an extra copy of each line here, just to feed it
> to memihash! I guess "-w" is not a critical-performance code path, but
> this could be fixed if we could do memhash() incrementally (e.g., by
> putting the FNV state into a struct and letting callers "add" to it
> incrementally). Maybe an interesting #leftoverbits, though I'd want to
> see timing tests that show it's worth doing.
>
I agree.
Thanks,
Stefan