On Thu, Nov 22, 2018 at 11:16:38AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> > + rm -rf .git &&
> > + git init &&
> > + echo "* diff=bin" > .gitattributes &&
> > + printf "a\0b" >data.bin &&
> > + git add data.bin &&
> > + git commit -m "message" &&
> > + git -c diff.bin.textconv=cat log -G a >actual &&
> > + git log >expected &&
> > + test_cmp actual expected
> > +'
> > +
> > test_done
>
> This patch seems like the wrong direction to me. In particular the
> assertion that "the concept of differences only makes sense for text
> files". That's just not true. This patch breaks this:
But "-G" is defined as "look for differences whose patch text contains
added/removed lines that match <regex>". We don't have patch text here,
let alone added/removed lines.
For binary files, "-Sfoo" is better defined. I think we _could_ define
"search for <pattern> in the added/removed bytes of a binary file". But
I don't think that's what the current code does (it really does a line
diff on a binary file, which is likely to put tons of unchanged crap
into the "added and removed" lines, because the line divisions aren't
meaningful in the first place).
> (
> rm -rf /tmp/g-test &&
> git init /tmp/g-test &&
> cd /tmp/g-test &&
> for i in {1..10}; do
> echo "Always matching thensome 5" >file &&
> printf "a thensome %d binary \0" $i >>file &&
> git add file &&
> git commit -m"Bump $i"
> done &&
> git log -Gthensome.*5
> )
>
> Right now this will emit 3/10 patches, and the right ones! I.e. "Bump
> [156]". The 1st one because it introduces the "Always matching thensome
> 5". Then 5/6 because the add/remove the string "a thensome 5 binary",
> respectively. Which matches /thensome.*5/.
Right, this will sometimes do the right thing. But it will also often do
the wrong thing. It's also very expensive (we specifically avoid feeding
large binary files to xdiff, but I think "-G" will happily do so -- I
didn't double check, though).
-Peff