Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
Duy Nguyenwrites: >>> If cached is false and ce_ita() is true and either CE_VALID or >>> CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1. >>> But I think we should grep_file() instead, at least for CE_VALID. >> >> Yes, that is the breakage I noticed in the patch under discussion >> and that I wanted to fix in the "I wonder if a better change would >> be..." version. > > Heh.. I did guess that. Since neither solution is complete, I'm in > favor of Charles's and assume that i-t-a forces to ignore CE_SKIP and > CE_SKIP_WORKTREE. I could wait for people to come back complaining, > then we know there are real users in very obscure cases and will fix > it then. I said something that can be misunderstood. I meant "I wonder if ..." version is correct. Charles's has the bugs you mentioned and I wanted to fix them by sending the "I wonder if..." version out. But you seem to have misread my statement as "A bug is in my version and I want to fix that bug in my version". That is not what I meant. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
On Wed, Jun 22, 2016 at 8:00 PM, Junio C Hamanowrote: > Duy Nguyen writes: > >>> So I wonder if a better change would be more like >>> >>> for (...) { >>> if (!S_ISREG(ce->ce_mode)) >>> continue; /* not a regular file */ >>> if (!ce_path_match(ce, pathspec, NULL) >>> continue; /* uninteresting */ >>> + if (cached && ce_intent_to_add(ce)) >>> + continue; /* path not yet in the index */ >>> >>> if (cached || ...) >>> UNCHANGED FROM THE ORIGINAL >>> >>> perhaps? >> >> I did wonder a bit about these cases. But, can i-t-a really be >> combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is >> automatically set and should not cover i-t-a entries imo (I didn't >> check the implementation). CE_VALID is about real entries, yes you >> could do "git update-index --assume-unchanged " but it does >> not feel right to me. > > Yeah but we know people are stupid^W^Wdo unexpected things ;-) > >> If cached is false and ce_ita() is true and either CE_VALID or >> CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1. >> But I think we should grep_file() instead, at least for CE_VALID. > > Yes, that is the breakage I noticed in the patch under discussion > and that I wanted to fix in the "I wonder if a better change would > be..." version. Heh.. I did guess that. Since neither solution is complete, I'm in favor of Charles's and assume that i-t-a forces to ignore CE_SKIP and CE_SKIP_WORKTREE. I could wait for people to come back complaining, then we know there are real users in very obscure cases and will fix it then. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
Duy Nguyenwrites: >> So I wonder if a better change would be more like >> >> for (...) { >> if (!S_ISREG(ce->ce_mode)) >> continue; /* not a regular file */ >> if (!ce_path_match(ce, pathspec, NULL) >> continue; /* uninteresting */ >> + if (cached && ce_intent_to_add(ce)) >> + continue; /* path not yet in the index */ >> >> if (cached || ...) >> UNCHANGED FROM THE ORIGINAL >> >> perhaps? > > I did wonder a bit about these cases. But, can i-t-a really be > combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is > automatically set and should not cover i-t-a entries imo (I didn't > check the implementation). CE_VALID is about real entries, yes you > could do "git update-index --assume-unchanged " but it does > not feel right to me. Yeah but we know people are stupid^W^Wdo unexpected things ;-) > If cached is false and ce_ita() is true and either CE_VALID or > CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1. > But I think we should grep_file() instead, at least for CE_VALID. Yes, that is the breakage I noticed in the patch under discussion and that I wanted to fix in the "I wonder if a better change would be..." version. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
On Wed, Jun 22, 2016 at 12:49 AM, Junio C Hamanowrote: >> @@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct >> pathspec *pathspec, int >>* cache version instead >>*/ >> if (cached || (ce->ce_flags & CE_VALID) || >> ce_skip_worktree(ce)) { >> - if (ce_stage(ce)) >> + if (ce_stage(ce) || ce_intent_to_add(ce)) >> continue; >> hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name); >> } > > OK, so this function handles searching in either the index or the > working tree. > > The first hunk used to unconditionally discard paths marked as > i-t-a, even when we are looking at the working tree, which is > clearly useless, and we stop rejecting i-t-a paths too early, which > is good. > > The second hunk is for "grep --cached" but also covers two other > cases. What are these? > > CE_VALID is used by "Assume unchanged". Because the user promised > that s/he will take responsibility of keeping the working tree > contents in sync with what is in the index by not modifying it, even > when we are not doing "grep --cached", we pick up the contents from > the index and look for the string in there, instead of going to the > working tree. In other words, even though at the mechanical level > we are looking into the index, logically we are searching in the > working tree. Is it sensible to skip i-t-a entries in that case? > > I think the same discussion would apply to CE_SKIP_WORKTREE (see > "Skip-worktree bit" in Documentation/git-update-index.txt). > > So I wonder if a better change would be more like > > for (...) { > if (!S_ISREG(ce->ce_mode)) > continue; /* not a regular file */ > if (!ce_path_match(ce, pathspec, NULL) > continue; /* uninteresting */ > + if (cached && ce_intent_to_add(ce)) > + continue; /* path not yet in the index */ > > if (cached || ...) > UNCHANGED FROM THE ORIGINAL > > perhaps? I did wonder a bit about these cases. But, can i-t-a really be combined with CE_VALID or CE_SKIP_WORKTREE? CE_SKIP_... is automatically set and should not cover i-t-a entries imo (I didn't check the implementation). CE_VALID is about real entries, yes you could do "git update-index --assume-unchanged " but it does not feel right to me. If cached is false and ce_ita() is true and either CE_VALID or CE_SKIP_WORKTREE is set, we would continue to grep an _empty_ SHA-1. But I think we should grep_file() instead, at least for CE_VALID. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
On Wed, Jun 22, 2016 at 3:13 AM, Eric Sunshinewrote: > On Tue, Jun 21, 2016 at 5:14 PM, Charles Bailey wrote: >> From: Charles Bailey >> >> This reverts commit 4d552005323034c1d6311796ac1074e9a4b4b57e and adds an >> alternative fix to maintain the -L --cached behavior. > > It is common to provide some context along with the (shortened) commit > ID. For instance: > > This reverts 4d55200 (grep: make it clear i-t-a entries are > ignored, 2015-12-27) and adds ... And that could be produced with some git alias like git config alias.one 'show -s --date=short --pretty='format:%h (%s - %ad)' No point in manually copy/pasting the context. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
On Tue, Jun 21, 2016 at 5:14 PM, Charles Baileywrote: > From: Charles Bailey > > This reverts commit 4d552005323034c1d6311796ac1074e9a4b4b57e and adds an > alternative fix to maintain the -L --cached behavior. It is common to provide some context along with the (shortened) commit ID. For instance: This reverts 4d55200 (grep: make it clear i-t-a entries are ignored, 2015-12-27) and adds ... > 4d5520053 caused 'git grep' to no longer find matches in new files in > the working tree where the corresponding index entry had the "intent to > add" bit set, despite the fact that these files are tracked. > [...] > Helped-by: Nguyễn Thái Ngọc Duy > Signed-off-by: Charles Bailey > --- > > Is "Helped-by" an appropriate attribution in this case? Very much so. > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh > @@ -1364,4 +1364,42 @@ test_expect_success 'grep --color -e A --and -e B -p > with context' ' > +test_expect_success 'grep can find things only in the work tree' ' > + touch work-tree-only && Avoid 'touch' if the timestamp of the file has no significance. Use '>' instead: >work-tree-only && > + git add work-tree-only && > + echo "find in work tree" >work-tree-only && > + git grep --quiet "find in work tree" && > + test_must_fail git grep --quiet --cached "find in work tree" && > + test_must_fail git grep --quiet "find in work tree" HEAD && > + git rm -f work-tree-only If any statement before this cleanup code fails, then the cleanup will never take place (due to the &&-chain). To ensure cleanup regardless of test outcome, instead use test_when_finished() at the beginning of the test: test_when_finished "git rm -f work-tree-only" && Same applies to other added tests. > +' > + > +test_expect_success 'grep can find things only in the work tree (i-t-a)' ' > + echo "intend to add this" >intend-to-add && > + git add -N intend-to-add && > + git grep --quiet "intend to add this" && > + test_must_fail git grep --quiet --cached "intend to add this" && > + test_must_fail git grep --quiet "intend to add this" HEAD && > + git rm -f intend-to-add > +' > + > +test_expect_success 'grep can find things only in the index' ' > + echo "only in the index" >cache-this && > + git add cache-this && > + rm cache-this && > + test_must_fail git grep --quiet "only in the index" && > + git grep --quiet --cached "only in the index" && > + test_must_fail git grep --quiet "only in the index" HEAD && > + git rm --cached cache-this > +' > + > +test_expect_success 'grep does not report i-t-a with -L --cached' ' > + echo "intend to add this" >intend-to-add && > + git add -N intend-to-add && > + git ls-files | grep -v "^intend-to-add\$" >expected && > + git grep -L --cached "nonexistent_string" >actual && > + test_cmp expected actual && > + git rm -f intend-to-add > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] grep: fix grepping for "intent to add" files
Charles Baileywrites: > Is "Helped-by" an appropriate attribution in this case? Sure. > diff --git a/builtin/grep.c b/builtin/grep.c > index 462e607..ae73831 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -386,7 +386,7 @@ static int grep_cache(struct grep_opt *opt, const struct > pathspec *pathspec, int > > for (nr = 0; nr < active_nr; nr++) { > const struct cache_entry *ce = active_cache[nr]; > - if (!S_ISREG(ce->ce_mode) || ce_intent_to_add(ce)) > + if (!S_ISREG(ce->ce_mode)) > continue; > if (!ce_path_match(ce, pathspec, NULL)) > continue; > @@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct > pathspec *pathspec, int >* cache version instead >*/ > if (cached || (ce->ce_flags & CE_VALID) || > ce_skip_worktree(ce)) { > - if (ce_stage(ce)) > + if (ce_stage(ce) || ce_intent_to_add(ce)) > continue; > hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name); > } OK, so this function handles searching in either the index or the working tree. The first hunk used to unconditionally discard paths marked as i-t-a, even when we are looking at the working tree, which is clearly useless, and we stop rejecting i-t-a paths too early, which is good. The second hunk is for "grep --cached" but also covers two other cases. What are these? CE_VALID is used by "Assume unchanged". Because the user promised that s/he will take responsibility of keeping the working tree contents in sync with what is in the index by not modifying it, even when we are not doing "grep --cached", we pick up the contents from the index and look for the string in there, instead of going to the working tree. In other words, even though at the mechanical level we are looking into the index, logically we are searching in the working tree. Is it sensible to skip i-t-a entries in that case? I think the same discussion would apply to CE_SKIP_WORKTREE (see "Skip-worktree bit" in Documentation/git-update-index.txt). So I wonder if a better change would be more like for (...) { if (!S_ISREG(ce->ce_mode)) continue; /* not a regular file */ if (!ce_path_match(ce, pathspec, NULL) continue; /* uninteresting */ + if (cached && ce_intent_to_add(ce)) + continue; /* path not yet in the index */ if (cached || ...) UNCHANGED FROM THE ORIGINAL perhaps? I actually think (ce->ce_flags & CE_VALID) case should go to working tree for performance, but that is a separate topic. > +test_expect_success 'grep can find things only in the work tree' ' > + touch work-tree-only && Please do not use "touch" if the reason to use the command is *not* to muck with the timestamp, e.g. to create an empty file. > + git add work-tree-only && > + echo "find in work tree" >work-tree-only && > + git grep --quiet "find in work tree" && > + test_must_fail git grep --quiet --cached "find in work tree" && > + test_must_fail git grep --quiet "find in work tree" HEAD && > + git rm -f work-tree-only > +' > + > +test_expect_success 'grep can find things only in the work tree (i-t-a)' ' > + echo "intend to add this" >intend-to-add && > + git add -N intend-to-add && > + git grep --quiet "intend to add this" && > + test_must_fail git grep --quiet --cached "intend to add this" && > + test_must_fail git grep --quiet "intend to add this" HEAD && > + git rm -f intend-to-add > +' > + > +test_expect_success 'grep can find things only in the index' ' > + echo "only in the index" >cache-this && > + git add cache-this && > + rm cache-this && > + test_must_fail git grep --quiet "only in the index" && > + git grep --quiet --cached "only in the index" && > + test_must_fail git grep --quiet "only in the index" HEAD && > + git rm --cached cache-this > +' > + > +test_expect_success 'grep does not report i-t-a with -L --cached' ' > + echo "intend to add this" >intend-to-add && > + git add -N intend-to-add && > + git ls-files | grep -v "^intend-to-add\$" >expected && > + git grep -L --cached "nonexistent_string" >actual && > + test_cmp expected actual && > + git rm -f intend-to-add > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html