tbo...@web.de writes:

> From: Junio C Hamano <gits...@pobox.com>
>
> git diff --quiet may take a short-cut to see if a file is changed
> in the working tree:
> Whenever the file size differs from what is recorded in the index,
> the file is assumed to be changed and git diff --quiet returns
> exit with code 1
>
> This shortcut must be suppressed whenever the line endings are converted
> or a filter is in use.
> The attributes say "* text=auto" and a file has
> "Hello\nWorld\n" in the index with a length of 12.
> The file in the working tree has "Hello\r\nWorld\r\n" with a length of 14.
> (Or even "Hello\r\nWorld\n").
> In this case "git add" will not do any changes to the index, and
> "git diff -quiet" should exit 0.

The thing I find the most disturbing is that at this point in the
flow, p->one->size and p->two->size are supposed to be the sizes of
the blob object, not the contents of the file on the working tree.
IOW, p->two->size being 14 in the above example sounds like pointing
at a different bug, if it is 14.  

The early return in diff_populate_filespec(), where it does

        s->size = xsize_t(st.st_size);
        ...
        if (size_only)
                return 0;

way before it runs convert_to_git(), may be the real culprit.

I am wondering if the real fix would be to do this, instead of the
two extra would_convert_to_git() call there in the patch you sent.
The result seems to still pass the new test in your patch.

Thanks for helping.

 diff.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 8c78fce49d..dc51dceb44 100644
--- a/diff.c
+++ b/diff.c
@@ -2792,8 +2792,25 @@ int diff_populate_filespec(struct diff_filespec *s, 
unsigned int flags)
                        s->should_free = 1;
                        return 0;
                }
-               if (size_only)
+
+               /*
+                * Even if the caller would be happy with getting
+                * only the size, we cannot return early at this
+                * point if the path requires us to run the content
+                * conversion.
+                */
+               if (!would_convert_to_git(s->path) && size_only)
                        return 0;
+
+               /*
+                * Note: this check uses xsize_t(st.st_size) that may
+                * not be the true size of the blob after it goes
+                * through convert_to_git().  This may not strictly be
+                * correct, but the whole point of big_file_threashold
+                * and is_binary check is that we want to avoid
+                * opening the file and inspecting the contents, so
+                * this is probably fine.
+                */
                if ((flags & CHECK_BINARY) &&
                    s->size > big_file_threshold && s->is_binary == -1) {
                        s->is_binary = 1;

Reply via email to