On Mon, Jul 10, 2017 at 2:48 PM, Durham Goode <dur...@fb.com> wrote:
>
>
> On 7/10/17 1:04 PM, Martin von Zweigbergk wrote:
>>
>> On Mon, Jul 10, 2017 at 11:58 AM, Durham Goode <dur...@fb.com> wrote:
>>>
>>>
>>>
>>> On 7/10/17 11:55 AM, Martin von Zweigbergk wrote:
>>>>
>>>>
>>>> On Mon, Jul 10, 2017 at 11:45 AM, Durham Goode <dur...@fb.com> wrote:
>>>>>
>>>>>
>>>>> On 7/10/17 10:01 AM, Martin von Zweigbergk wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> (For Durham)
>>>>>>
>>>>>> On Sat, Jul 8, 2017 at 4:29 PM, Gregory Szorc
>>>>>> <gregory.sz...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> # HG changeset patch
>>>>>>> # User Gregory Szorc <gregory.sz...@gmail.com>
>>>>>>> # Date 1499555309 25200
>>>>>>> #      Sat Jul 08 16:08:29 2017 -0700
>>>>>>> # Node ID 94f98bc84936defadb959e31012555dba170d8cd
>>>>>>> # Parent  a2867557f9c2314aeea19a946dfb8e167def4fb8
>>>>>>> dirstate: integrate sparse matcher with _ignore (API)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Why does sparse do it this way instead of intersecting the sparse
>>>>>> matcher with the user's matcher?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I'm not sure I understand the question.  What is the "user's matcher"
>>>>> here?
>>>>> The ignore matcher?
>>>>
>>>>
>>>>
>>>> I mean the matcher the user may have provided on the command line (or
>>>> match.always() by default), as in "hg status dir/" (where the matcher
>>>> would be "relpath:dir").
>>>>
>>>>>
>>>>> This code produces a matcher that returns true for any file that should
>>>>> be
>>>>> ignored.  Since both hgignore files and sparse-ignored files should be
>>>>> ignored, I'm not sure how that could be expressed with intersection of
>>>>> those
>>>>> two matchers?
>>>>
>>>>
>>>>
>>>> For narrowhg, we did it the other way around: filtering in instead of
>>>> filtering out. So if the narrowspec (like sparse config, IIUC) says to
>>>> include foo/ and bar/ and the user provides 'glob:*c', we'd intersect
>>>> that and list *.c files in those two directories (recursively).
>>>
>>>
>>>
>>> I'd have to look at the code to be specific, but I think the dirstate
>>> ignore
>>> logic covers more cases than the user provided matcher logic. I'd be
>>> surprised if all commands that hit dirstate.ignore also happened to take
>>> patterns at the command level.
>>
>>
>> If they don't, then the sparse matcher can be passed as is.
>>
>>>  It just seemed cleaner to have a unified
>>> matcher for ignored files at the repo level.  The user specific matcher
>>> can
>>> always be added on top of it later for commands that take patterns.
>>
>>
>> For narrow, we have to apply the matcher when walking the manifest
>> too. The user can pass a matcher to e.g. "hg status -c ." or "hg files
>> -r ." and in those cases we need to intersect the narrow matcher with
>> the user-supplied one. It seemed more natural to do the same for
>> dirstate walks.
>>
>> It also seems simpler to control which directories are visited if
>> using a positive matcher than a negative one. For example, let's say
>> the narrow matcher is path:dir/. The narrowhg code will then restrict
>> the walk to visit only the root directory, dir/, and subdirectories of
>> dir/ (both for manifest walks and dirstate walks). I think we can
>> simply make negatematcher's visitdir return False iff the
>> narrow/sparse matcher returns 'all', so it's probably easy to get it
>> to work. It still seems more natural to me to match what should be
>> included.
>>
>
> I don't have a strong opinion either way. When I made sparse, it was
> specific to the working copy, so it mapped to the ignore matcher very
> tightly. If that needs to change, that's fine.

I just tried this. I think the result is cleaner, but there's a
functional change that perhaps suggests the reason you modified the
ignores instead of modifying the walk: untracked files outside the
sparse config are reported as ignored with your version and are not
reported at all with my version. That's an interesting aspect that I
had not thought about.
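
To make that difference concrete, here is a toy sketch. It is not the
real dirstate or matcher code: matchers are modeled as plain
path -> bool functions, and every name in it (pathmatcher,
status_via_ignore, status_via_walk, the sample paths) is made up for
illustration.

```python
# Toy model only: a "matcher" is any function path -> bool.
# None of these names come from Mercurial itself.

def pathmatcher(prefix):
    """Hypothetical helper: match every file under directory `prefix`."""
    return lambda f: f.startswith(prefix)


def status_via_ignore(files, tracked, hgignore, sparse):
    """Fold "outside the sparse config" into the ignore matcher."""
    def ignore(f):
        return hgignore(f) or not sparse(f)

    result = {'unknown': [], 'ignored': []}
    for f in sorted(files):
        if f in tracked:
            continue
        result['ignored' if ignore(f) else 'unknown'].append(f)
    return result


def status_via_walk(files, tracked, hgignore, sparse,
                    usermatch=lambda f: True):
    """Intersect the sparse matcher with the user's matcher and only
    walk files that pass both; the ignore matcher stays untouched."""
    def walk(f):
        return sparse(f) and usermatch(f)

    result = {'unknown': [], 'ignored': []}
    for f in sorted(files):
        if not walk(f) or f in tracked:
            continue
        result['ignored' if hgignore(f) else 'unknown'].append(f)
    return result


files = {'dir/a.c', 'dir/new.c', 'dir/build.o', 'other/stray.txt'}
tracked = {'dir/a.c'}
hgignore = lambda f: f.endswith('.o')
sparse = pathmatcher('dir/')

by_ignore = status_via_ignore(files, tracked, hgignore, sparse)
by_walk = status_via_walk(files, tracked, hgignore, sparse)
```

In this model, other/stray.txt shows up under 'ignored' in by_ignore
and is absent from by_walk, which is the functional change described
above. Files inside dir/ are classified the same way by both.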

I had previously considered adding an option to status for narrowhg to
make it also list files outside the narrow config. One option would be
to not include them by default and list their status as something
special (maybe O for outside?) since neither untracked nor ignored is
really accurate. They may (kind of) be tracked, just not in the
working copy, and with narrowhg with treemanifests we can't even tell
if they're in the manifest or not.

So, would you be okay with changing the behavior to not report these
files as ignored by default and instead adding an option to status
(maybe --no-sparse)?

For narrow, even committed revisions will be filtered by the
narrowspec by default. One can imagine passing --no-sparse (or
--no-narrow?) for committed revisions as well. We could then
potentially report things as outside the narrowspec and added,
removed, modified, or clean, so there would be those four statuses
with an extra bit saying that it's also outside the narrowspec (since
they're all letters, we could use lowercase 'a', 'r', 'm', 'c', but
I'm not sure that's a good idea). We can and should discuss the
specifics later.
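
Purely to illustrate the lowercase idea (not a proposal for the actual
status code), a sketch could look like the following. The function
name, the state names and the "inside" flag are all hypothetical; only
the letters come from the paragraph above.

```python
# Hypothetical sketch: the usual status letter, lowercased when the
# file is outside the narrowspec. Not Mercurial's real status code.

LETTERS = {
    'added': 'A',
    'removed': 'R',
    'modified': 'M',
    'clean': 'C',
}


def statuschar(state, inside):
    """Return the status letter for `state`; lowercase marks "outside"."""
    ch = LETTERS[state]
    return ch if inside else ch.lower()


def formatstatus(entries):
    """Format (path, state, inside) tuples as "X path" lines."""
    return ['%s %s' % (statuschar(state, inside), path)
            for path, state, inside in entries]


lines = formatstatus([
    ('dir/a.c', 'modified', True),
    ('other/b.c', 'modified', False),
    ('other/c.c', 'added', False),
])
```

Here the state and the "outside" bit stay separate in the data and are
only combined when the letter is printed, so a different presentation
(for example a dedicated 'O' letter) would only change statuschar().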

(When running hg on top of our overlay file system at Google, we
definitely do not want to walk the entire working copy (because it has
millions of files outside the sparse/narrow config). But that's not a
normal hg setup, so it would be our problem to set up an override for
that if we decided to report these files as ignored in hg core.)

>
> I just want to avoid duplicating repetitive matcher logic amongst individual
> commands.
