Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-22 Thread Junio C Hamano
Duy Nguyen  writes:

>>> If cached is false and ce_ita() is true and either CE_VALID or
>>> CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1.
>>> But I think we should grep_file() instead, at least for CE_VALID.
>>
>> Yes, that is the breakage I noticed in the patch under discussion
>> and that I wanted to fix in the "I wonder if a better change would
>> be..." version.
>
> Heh.. I did guess that. Since neither solution is complete, I'm in
> favor of Charles's and assume that i-t-a forces to ignore CE_SKIP and
> CE_SKIP_WORKTREE. I could wait for people to come back complaining,
> then we know there are real users in very obscure cases and will fix
> it then.

I said something that can be misunderstood.  I meant "I wonder if ..."
version is correct.  Charles's has the bugs you mentioned and I
wanted to fix them by sending the "I wonder if..." version out.

But you seem to have misread my statement as "A bug is in my version
and I want to fix that bug in my version".  That is not what I
meant.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-22 Thread Duy Nguyen
On Wed, Jun 22, 2016 at 8:00 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>>> So I wonder if a better change would be more like
>>>
>>> for (...) {
>>> if (!S_ISREG(ce->ce_mode))
>>> continue; /* not a regular file */
>>> if (!ce_path_match(ce, pathspec, NULL)
>>> continue; /* uninteresting */
>>> +   if (cached && ce_intent_to_add(ce))
>>> +   continue; /* path not yet in the index */
>>>
>>> if (cached || ...)
>>> UNCHANGED FROM THE ORIGINAL
>>>
>>> perhaps?
>>
>> I did wonder a bit about these cases. But, can i-t-a really be
>> combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is
>> automatically set and should not cover i-t-a entries imo (I didn't
>> check the implementation). CE_VALID is about real entries, yes you
>> could do "git update-index --assume-unchanged " but it does
>> not feel right to me.
>
> Yeah but we know people are stupid^W^Wdo unexpected things ;-)
>
>> If cached is false and ce_ita() is true and either CE_VALID or
>> CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1.
>> But I think we should grep_file() instead, at least for CE_VALID.
>
> Yes, that is the breakage I noticed in the patch under discussion
> and that I wanted to fix in the "I wonder if a better change would
> be..." version.

Heh.. I did guess that. Since neither solution is complete, I'm in
favor of Charles's and assume that i-t-a forces to ignore CE_SKIP and
CE_SKIP_WORKTREE. I could wait for people to come back complaining,
then we know there are real users in very obscure cases and will fix
it then.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-22 Thread Junio C Hamano
Duy Nguyen  writes:

>> So I wonder if a better change would be more like
>>
>> for (...) {
>> if (!S_ISREG(ce->ce_mode))
>> continue; /* not a regular file */
>> if (!ce_path_match(ce, pathspec, NULL)
>> continue; /* uninteresting */
>> +   if (cached && ce_intent_to_add(ce))
>> +   continue; /* path not yet in the index */
>>
>> if (cached || ...)
>> UNCHANGED FROM THE ORIGINAL
>>
>> perhaps?
>
> I did wonder a bit about these cases. But, can i-t-a really be
> combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is
> automatically set and should not cover i-t-a entries imo (I didn't
> check the implementation). CE_VALID is about real entries, yes you
> could do "git update-index --assume-unchanged " but it does
> not feel right to me.

Yeah but we know people are stupid^W^Wdo unexpected things ;-)

> If cached is false and ce_ita() is true and either CE_VALID or
> CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1.
> But I think we should grep_file() instead, at least for CE_VALID.

Yes, that is the breakage I noticed in the patch under discussion
and that I wanted to fix in the "I wonder if a better change would
be..." version.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-22 Thread Duy Nguyen
On Wed, Jun 22, 2016 at 12:49 AM, Junio C Hamano  wrote:
>> @@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
>> pathspec *pathspec, int
>>* cache version instead
>>*/
>>   if (cached || (ce->ce_flags & CE_VALID) || 
>> ce_skip_worktree(ce)) {
>> - if (ce_stage(ce))
>> + if (ce_stage(ce) || ce_intent_to_add(ce))
>>   continue;
>>   hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name);
>>   }
>
> OK, so this function handles searching in either the index or the
> working tree.
>
> The first hunk used to unconditionally discard paths marked as
> i-t-a, even when we are looking at the working tree, which is
> clearly useless, and we stop rejecting i-t-a paths too early, which
> is good.
>
> The second hunk is for "grep --cached" but also covers two other
> cases.  What are these?
>
> CE_VALID is used by "Assume unchanged".  Because the user promised
> that s/he will take responsibility of keeping the working tree
> contents in sync with what is in the index by not modifying it, even
> when we are not doing "grep --cached", we pick up the contents from
> the index and look for the string in there, instead of going to the
> working tree.  In other words, even though at the mechanical level
> we are looking into the index, logically we are searching in the
> working tree.  Is it sensible to skip i-t-a entries in that case?
>
> I think the same discussion would apply to CE_SKIP_WORKTREE (see
> "Skip-worktree bit" in Documentation/git-update-index.txt).
>
> So I wonder if a better change would be more like
>
> for (...) {
> if (!S_ISREG(ce->ce_mode))
> continue; /* not a regular file */
> if (!ce_path_match(ce, pathspec, NULL)
> continue; /* uninteresting */
> +   if (cached && ce_intent_to_add(ce))
> +   continue; /* path not yet in the index */
>
> if (cached || ...)
> UNCHANGED FROM THE ORIGINAL
>
> perhaps?

I did wonder a bit about these cases. But, can i-t-a really be
combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is
automatically set and should not cover i-t-a entries imo (I didn't
check the implementation). CE_VALID is about real entries, yes you
could do "git update-index --assume-unchanged " but it does
not feel right to me.

If cached is false and ce_ita() is true and either CE_VALID or
CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1.
But I think we should grep_file() instead, at least for CE_VALID.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-22 Thread Duy Nguyen
On Wed, Jun 22, 2016 at 3:13 AM, Eric Sunshine  wrote:
> On Tue, Jun 21, 2016 at 5:14 PM, Charles Bailey  wrote:
>> From: Charles Bailey 
>>
>> This reverts commit 4d552005323034c1d6311796ac1074e9a4b4b57e and adds an
>> alternative fix to maintain the -L --cached behavior.
>
> It is common to provide some context along with the (shortened) commit
> ID. For instance:
>
> This reverts 4d55200 (grep: make it clear i-t-a entries are
> ignored, 2015-12-27) and adds ...

And that could be produced with some git alias like

git config alias.one 'show -s --date=short --pretty='format:%h (%s - %ad)'

No point in manually copy/pasting the context.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-21 Thread Eric Sunshine
On Tue, Jun 21, 2016 at 5:14 PM, Charles Bailey  wrote:
> From: Charles Bailey 
>
> This reverts commit 4d552005323034c1d6311796ac1074e9a4b4b57e and adds an
> alternative fix to maintain the -L --cached behavior.

It is common to provide some context along with the (shortened) commit
ID. For instance:

This reverts 4d55200 (grep: make it clear i-t-a entries are
ignored, 2015-12-27) and adds ...

> 4d5520053 caused 'git grep' to no longer find matches in new files in
> the working tree where the corresponding index entry had the "intent to
> add" bit set, despite the fact that these files are tracked.
> [...]
> Helped-by: Nguyễn Thái Ngọc Duy 
> Signed-off-by: Charles Bailey 
> ---
>
> Is "Helped-by" an appropriate attribution in this case?

Very much so.

> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> @@ -1364,4 +1364,42 @@ test_expect_success 'grep --color -e A --and -e B -p 
> with context' '
> +test_expect_success 'grep can find things only in the work tree' '
> +   touch work-tree-only &&

Avoid 'touch' if the timestamp of the file has no significance. Use '>' instead:

>work-tree-only &&

> +   git add work-tree-only &&
> +   echo "find in work tree" >work-tree-only &&
> +   git grep --quiet "find in work tree" &&
> +   test_must_fail git grep --quiet --cached "find in work tree" &&
> +   test_must_fail git grep --quiet "find in work tree" HEAD &&
> +   git rm -f work-tree-only

If any statement before this cleanup code fails, then the cleanup will
never take place (due to the &&-chain). To ensure cleanup regardless
of test outcome, instead use test_when_finished() at the beginning of
the test:

test_when_finished "git rm -f work-tree-only" &&

Same applies to other added tests.

> +'
> +
> +test_expect_success 'grep can find things only in the work tree (i-t-a)' '
> +   echo "intend to add this" >intend-to-add &&
> +   git add -N intend-to-add &&
> +   git grep --quiet "intend to add this" &&
> +   test_must_fail git grep --quiet --cached "intend to add this" &&
> +   test_must_fail git grep --quiet "intend to add this" HEAD &&
> +   git rm -f intend-to-add
> +'
> +
> +test_expect_success 'grep can find things only in the index' '
> +   echo "only in the index" >cache-this &&
> +   git add cache-this &&
> +   rm cache-this &&
> +   test_must_fail git grep --quiet "only in the index" &&
> +   git grep --quiet --cached "only in the index" &&
> +   test_must_fail git grep --quiet "only in the index" HEAD &&
> +   git rm --cached cache-this
> +'
> +
> +test_expect_success 'grep does not report i-t-a with -L --cached' '
> +   echo "intend to add this" >intend-to-add &&
> +   git add -N intend-to-add &&
> +   git ls-files | grep -v "^intend-to-add\$" >expected &&
> +   git grep -L --cached "nonexistent_string" >actual &&
> +   test_cmp expected actual &&
> +   git rm -f intend-to-add
> +'
> +
>  test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files

2016-06-21 Thread Junio C Hamano
Charles Bailey  writes:

> Is "Helped-by" an appropriate attribution in this case?

Sure.

> diff --git a/builtin/grep.c b/builtin/grep.c
> index 462e607..ae73831 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -386,7 +386,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
> pathspec *pathspec, int
>  
>   for (nr = 0; nr < active_nr; nr++) {
>   const struct cache_entry *ce = active_cache[nr];
> - if (!S_ISREG(ce->ce_mode) || ce_intent_to_add(ce))
> + if (!S_ISREG(ce->ce_mode))
>   continue;
>   if (!ce_path_match(ce, pathspec, NULL))
>   continue;
> @@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
> pathspec *pathspec, int
>* cache version instead
>*/
>   if (cached || (ce->ce_flags & CE_VALID) || 
> ce_skip_worktree(ce)) {
> - if (ce_stage(ce))
> + if (ce_stage(ce) || ce_intent_to_add(ce))
>   continue;
>   hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name);
>   }

OK, so this function handles searching in either the index or the
working tree.

The first hunk used to unconditionally discard paths marked as
i-t-a, even when we are looking at the working tree, which is
clearly useless, and we stop rejecting i-t-a paths too early, which
is good.

The second hunk is for "grep --cached" but also covers two other
cases.  What are these?

CE_VALID is used by "Assume unchanged".  Because the user promised
that s/he will take responsibility of keeping the working tree
contents in sync with what is in the index by not modifying it, even
when we are not doing "grep --cached", we pick up the contents from
the index and look for the string in there, instead of going to the
working tree.  In other words, even though at the mechanical level
we are looking into the index, logically we are searching in the
working tree.  Is it sensible to skip i-t-a entries in that case?

I think the same discussion would apply to CE_SKIP_WORKTREE (see
"Skip-worktree bit" in Documentation/git-update-index.txt).

So I wonder if a better change would be more like

for (...) {
if (!S_ISREG(ce->ce_mode))
continue; /* not a regular file */
if (!ce_path_match(ce, pathspec, NULL)
continue; /* uninteresting */
+   if (cached && ce_intent_to_add(ce))
+   continue; /* path not yet in the index */

if (cached || ...)
UNCHANGED FROM THE ORIGINAL

perhaps?

I actually think (ce->ce_flags & CE_VALID) case should go to working
tree for performance, but that is a separate topic.

> +test_expect_success 'grep can find things only in the work tree' '
> + touch work-tree-only &&

Please do not use "touch" if the reason to use the command is *not*
to muck with the timestamp, e.g. to create an empty file.

> + git add work-tree-only &&
> + echo "find in work tree" >work-tree-only &&
> + git grep --quiet "find in work tree" &&
> + test_must_fail git grep --quiet --cached "find in work tree" &&
> + test_must_fail git grep --quiet "find in work tree" HEAD &&
> + git rm -f work-tree-only
> +'
> +
> +test_expect_success 'grep can find things only in the work tree (i-t-a)' '
> + echo "intend to add this" >intend-to-add &&
> + git add -N intend-to-add &&
> + git grep --quiet "intend to add this" &&
> + test_must_fail git grep --quiet --cached "intend to add this" &&
> + test_must_fail git grep --quiet "intend to add this" HEAD &&
> + git rm -f intend-to-add
> +'
> +
> +test_expect_success 'grep can find things only in the index' '
> + echo "only in the index" >cache-this &&
> + git add cache-this &&
> + rm cache-this &&
> + test_must_fail git grep --quiet "only in the index" &&
> + git grep --quiet --cached "only in the index" &&
> + test_must_fail git grep --quiet "only in the index" HEAD &&
> + git rm --cached cache-this
> +'
> +
> +test_expect_success 'grep does not report i-t-a with -L --cached' '
> + echo "intend to add this" >intend-to-add &&
> + git add -N intend-to-add &&
> + git ls-files | grep -v "^intend-to-add\$" >expected &&
> + git grep -L --cached "nonexistent_string" >actual &&
> + test_cmp expected actual &&
> + git rm -f intend-to-add
> +'
> +
>  test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html