Re: [PATCH v4 09/10] config: add core.untrackedCache

2016-01-04 Thread Junio C Hamano
Christian Couder  writes:

> What scenario do you have in mind where people would have to do things
> differently?

They eventually will see a system in which that they do not have do
anything after flipping the configuration, yet will still see stale
"you must run 'git status'" on their websearches and SO questions,
which would be a cost for them to remember that they no longer have
to do the extra 'git status'.

>> Itt sounds like somewhat a short-sighted mindset to design the
>> system, and I was hoping that by now you would have become better
>> than that.
>>
>> The real question is what are the problems in implementing this in
>> the way Duy suggested in the previous discussion.  The answer may
>> fall into somewhere between "that approach does not work in such and
>> such cases, so this is the best I could come up with" and "I know
>> that approach is far superiour, but I have invested too much in this
>> inferiour approach and refuse to rework it further to make it better."
>
> My question is why should I invest time thinking about and testing
> another approach when the current approach seems simpler, less bug
> prone, faster and without any downside?

As to the downside, I think Duy's "at the time we read the index" is
the least problematic route from the maintainability point of view.
What are you doing to help people who make changes in the future to
the system to make "git add $dir" or "git clean $dir" aware of the
untracked cache not to forget doing the "oh, the config says we want
to add the untracked cache if missing, so we do it here" in their
new codepath?  Whey you say "This is only about status", you are
essentially saying "It's outside the scope of my job, I was hired to
improve the usability of 'git status' with untracked cache and I do
not care about the longer term overall health of the system".

Now, you said something about "simpler", "less bug prone" and
"faster" (I doubt you can make such statements that involve
comparison without investing time thinking about other approaches,
though, but that is a separate topic).  That would mean that you see
"complexity", "error prone-ness", and "slowness" in the way Duy
suggested---that is exactly the question I asked you in the message
you are responding to.  What are the problems in implementing this
in the way Duy suggested?  What kind of "complexity" do you see?
Which part is more "error prone"?  Why does it have to be "slower"?

>> Using of not using untracked-cache is (supposed to be) purely
>> performance and not correctness thing, and I do not think the users
>> and the scripts do not deserve to see a failure from "update-index
>> --untracked-cache" only because there is a stray core.untrackedCache
>> set to 'false' somewhere.
>
> This "stray core.untrackedCache" could not have been put there by
> users of previous git versions because it has no meaning before this
> patch series. So I don't understand why you call it "a stray
> core.untrackedCache".
>
> It is no more "stray" to me than the call to "update-index --untracked-cache".
>
> If it has been put in /etc/git.config by an admin and if the user
> thinks he knows better, the user can still change the config locally
> to override the system one.

You are assuming that everybody constantly looks at /etc/git.config
to make sure evil admins won't do things that affect their
repositories and use of Git in potentially negative way.  I doubt
anybody does.

By the way, I understand that this "stray one affects without user
being aware of it" is the primary the reason why Ævar wants this
'configuration automatically adds untracked cache even to the
existing index' feature---everybody will get it without even be
aware of the change their admins make.  Which may be a good thing
for those who use the configuration variable.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 09/10] config: add core.untrackedCache

2015-12-31 Thread Christian Couder
On Thu, Dec 31, 2015 at 12:23 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> On Tue, Dec 29, 2015 at 11:35 PM, Junio C Hamano  wrote:
>> ...
>>> While the above is not wrong per-se, from the point of those who
>>> looked for these options (that is, those who wanted to do a one-shot
>>> enabling or disabling of the feature, perhaps to try it out to see
>>> how well it helps on their system), I think the explanation of the
>>> interaction between the option and the config is backwards.  For
>>> their purpose, setting it to `true` or `false` will be hinderance,
>>> because the options are made no-op by such a setting.  IOW, the
>>> advertisement "once you decided that you want to use the feature,
>>> configuration is a better place to set it" does not belong here.
>>>
>>> If I were writing this documentation, I'd probably rephrase the
>>> second paragraph like so:
>>>
>>> These options take effect only when the
>>> `core.untrackedCache` configuration variable (see
>>> linkgit:git-config[1]) is set to `keep` (or it is left
>>> unset).  When the configuration variable is set to `true`,
>>> the untracked cache feature is always enabled (and when it
>>> is set to `false`, it is always disabled), making these
>>> options no-op.
>>>
>>> perhaps.
>>
>> Yeah, but "no-op" is not technically true, as if you just set the
>> config variable to true for example and then use "--untracked-cache"
>> then the cache will be added immediately.
>
> The "update-index --[no-]untracked-cache" command is made to no
> longer follow the user's immediate desire (i.e. enable or disable)
> expressed by the invocation of the command.  That is what I meant by
> 'no-op'.
>
>> ... for example "git update-index --untracked-cache" will die() if
>> the config variable is set to false.
>
> As I think it is a BUG to make these die(), it is an irrelevant
> objection ;-).

Ok I think I will just use what you suggest except the "no-op" part:

  These options take effect only when the
  `core.untrackedCache` configuration variable (see
  linkgit:git-config[1]) is set to `keep` (or it is left
  unset).  When the configuration variable is set to `true`,
  the untracked cache feature is always enabled (and when it
  is set to `false`, it is always disabled).


>>> I somehow thought, per the previous discussion with Duy, that the
>>> check and addition of an empty one (if missing in the index and
>>> configuration is set to `true`) or removal of an existing one (if
>>> configuration is set to `false`) were to be done when the index is
>>> read by read_index_from(), though.  If you have to say "After
>>> setting the configuration, you must run this command", that does not
>>> sound like a huge improvement over "If you want to enable this
>>> feature, you must run this command".
>>
>> The untracked-cache feature, as I tried to explain in an email in the
>> previous discussion, has an effect only on git status when it has to
>> show untracked files. Otherwise the untracked-cache is simply not
>> used. It might be a goal to use it more often, but it's not my patch
>> series' goal.
>
> In the future we may extend the system to allow "git add somedir/"
> to take advantage of the untracked cache (i.e. we may already know
> what untracked paths there are without letting readdir(3) to scan it
> in somedir/), for example.  Is it reasonable to force users to say
> "git status", in such a future?

If a new feature takes advantage of the untracked cache, it is
reasonable that this feature enables or disables it if should be
enabled or disabled. Maybe we can just say in the doc something like:

When the `core.untrackedCache` configuration variable is changed, the
untracked cache is added to or removed from the index the next time
a command that can use the untracked cache is run; while when
`--[no-|force-]untracked-cache` are used, the untracked cache is
immediately added to or removed from the index. Currently only
"git status" can use the untracked cache.

> And more importantly, when that
> future comes, is it reasonable to force users to re-learn Git to do
> something else to do things differently at that point?

I don't think people will have to relearn anything or do things
differently especially if things are explained like above.

If they want to use the untracked cache by setting a config option,
and then want to enable it right away, they can still use git status
for that, so things will continue to work.

What scenario do you have in mind where people would have to do things
differently?

> Itt sounds like somewhat a short-sighted mindset to design the
> system, and I was hoping that by now you would have become better
> than that.
>
> The real question is what are the problems in implementing this in
> the way Duy suggested in the previous discussion.  The answer may
> fall into somewhere between "that approach does not work in such and
> su

Re: [PATCH v4 09/10] config: add core.untrackedCache

2015-12-30 Thread Junio C Hamano
Christian Couder  writes:

> On Tue, Dec 29, 2015 at 11:35 PM, Junio C Hamano  wrote:
> ...
>> While the above is not wrong per-se, from the point of those who
>> looked for these options (that is, those who wanted to do a one-shot
>> enabling or disabling of the feature, perhaps to try it out to see
>> how well it helps on their system), I think the explanation of the
>> interaction between the option and the config is backwards.  For
>> their purpose, setting it to `true` or `false` will be hinderance,
>> because the options are made no-op by such a setting.  IOW, the
>> advertisement "once you decided that you want to use the feature,
>> configuration is a better place to set it" does not belong here.
>>
>> If I were writing this documentation, I'd probably rephrase the
>> second paragraph like so:
>>
>> These options take effect only when the
>> `core.untrackedCache` configuration variable (see
>> linkgit:git-config[1]) is set to `keep` (or it is left
>> unset).  When the configuration variable is set to `true`,
>> the untracked cache feature is always enabled (and when it
>> is set to `false`, it is always disabled), making these
>> options no-op.
>>
>> perhaps.
>
> Yeah, but "no-op" is not technically true, as if you just set the
> config variable to true for example and then use "--untracked-cache"
> then the cache will be added immediately.

The "update-index --[no-]untracked-cache" command is made to no
longer follow the user's immediate desire (i.e. enable or disable)
expressed by the invocation of the command.  That is what I meant by
'no-op'.

> ... for example "git update-index --untracked-cache" will die() if
> the config variable is set to false.

As I think it is a BUG to make these die(), it is an irrelevant
objection ;-).

>> I somehow thought, per the previous discussion with Duy, that the
>> check and addition of an empty one (if missing in the index and
>> configuration is set to `true`) or removal of an existing one (if
>> configuration is set to `false`) were to be done when the index is
>> read by read_index_from(), though.  If you have to say "After
>> setting the configuration, you must run this command", that does not
>> sound like a huge improvement over "If you want to enable this
>> feature, you must run this command".
>
> The untracked-cache feature, as I tried to explain in an email in the
> previous discussion, has an effect only on git status when it has to
> show untracked files. Otherwise the untracked-cache is simply not
> used. It might be a goal to use it more often, but it's not my patch
> series' goal.

In the future we may extend the system to allow "git add somedir/"
to take advantage of the untracked cache (i.e. we may already know
what untracked paths there are without letting readdir(3) to scan it
in somedir/), for example.  Is it reasonable to force users to say
"git status", in such a future?  And more importantly, when that
future comes, is it reasonable to force users to re-learn Git to do
something else to do things differently at that point?

Itt sounds like somewhat a short-sighted mindset to design the
system, and I was hoping that by now you would have become better
than that.

The real question is what are the problems in implementing this in
the way Duy suggested in the previous discussion.  The answer may
fall into somewhere between "that approach does not work in such and
such cases, so this is the best I could come up with" and "I know
that approach is far superiour, but I have invested too much in this
inferiour approach and refuse to rework it further to make it better."

I unfortunately am not getting the sense that I know where your
answer would fall into that spectrum from your response.

>> Exiting with 0 (= success) after issuing a warning() might be more
>> appropriate.
>
> Scripts users might just not look at the warnings and expect things to
> work if the exit code is 0.

That is exactly why I said it is inappropriate to error out.

Using of not using untracked-cache is (supposed to be) purely
performance and not correctness thing, and I do not think the users
and the scripts do not deserve to see a failure from "update-index
--untracked-cache" only because there is a stray core.untrackedCache
set to 'false' somewhere.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 09/10] config: add core.untrackedCache

2015-12-30 Thread Christian Couder
On Tue, Dec 29, 2015 at 11:35 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> +core.untrackedCache::
>> + Determines if untracked cache will be automatically enabled or
>> + disabled. It can be `keep`, `true` or `false`. Before setting
>> + it to `true`, you should check that mtime is working properly
>> + on your system.
>> + See linkgit:git-update-index[1]. `keep` by default.
>> +
>
> Before "Before setting it to `true`", the reader needs to be told
> what would happen when this is set to each of these three values (as
> well as what happens when this is not set at all).

Ok, then I think I will write something like:

core.untrackedCache::
 Determines what to do about the untracked cache feature of the
index. It will
 be kept, if this variable is unset or set to `keep`. It will
automatically be added
 if set to `true`. And it will automatically be removed, if set to
`false`. Before
 setting it to `true`, you should check that mtime is working
properly on your
 system.
 See linkgit:git-update-index[1]. `keep` by default.

>> diff --git a/Documentation/git-update-index.txt 
>> b/Documentation/git-update-index.txt
>> index a0afe17..813f6cc 100644
>> --- a/Documentation/git-update-index.txt
>> +++ b/Documentation/git-update-index.txt
>> @@ -174,27 +174,31 @@ may not support it yet.
>>
>>  --untracked-cache::
>>  --no-untracked-cache::
>> - Enable or disable untracked cache extension. This could speed
>> - up for commands that involve determining untracked files such
>> - as `git status`. The underlying operating system and file
>> - system must change `st_mtime` field of a directory if files
>> - are added or deleted in that directory.
>> + Enable or disable untracked cache extension. Please use
>> + `--test-untracked-cache` before enabling it.
>
> "extension" is a term that is fairly close to the machinery.  In
> other parts of the documentation (like the added text below in this
> patch), it is called "untracked cache FEATURE", which probably is a
> better word to use here as well.

Ok, I will use "feature".

>> +These options cannot override the `core.untrackedCache` configuration
>> +variable when it is set to `true` or `false` (see
>> +linkgit:git-config[1]). They can override it only when it is unset or
>> +set to `keep`. If you want untracked cache to persist, it is better
>> +anyway to just set this configuration variable to `true` or `false`
>> +directly.
>
> While the above is not wrong per-se, from the point of those who
> looked for these options (that is, those who wanted to do a one-shot
> enabling or disabling of the feature, perhaps to try it out to see
> how well it helps on their system), I think the explanation of the
> interaction between the option and the config is backwards.  For
> their purpose, setting it to `true` or `false` will be hinderance,
> because the options are made no-op by such a setting.  IOW, the
> advertisement "once you decided that you want to use the feature,
> configuration is a better place to set it" does not belong here.
>
> If I were writing this documentation, I'd probably rephrase the
> second paragraph like so:
>
> These options take effect only when the
> `core.untrackedCache` configuration variable (see
> linkgit:git-config[1]) is set to `keep` (or it is left
> unset).  When the configuration variable is set to `true`,
> the untracked cache feature is always enabled (and when it
> is set to `false`, it is always disabled), making these
> options no-op.
>
> perhaps.

Yeah, but "no-op" is not technically true, as if you just set the
config variable to true for example and then use "--untracked-cache"
then the cache will be added immediately. Also it does not explain
that for example "git update-index --untracked-cache" will die() if
the config variable is set to false.

>> @@ -385,6 +389,34 @@ Although this bit looks similar to assume-unchanged 
>> bit, its goal is
>>  different from assume-unchanged bit's. Skip-worktree also takes
>>  precedence over assume-unchanged bit when both are set.
>>
>> +Untracked cache
>> +---
>> +
>> +This cache could speed up commands that involve determining untracked
>> +...
>> +It is recommended to use the `core.untrackedCache` configuration
>> +variable (see linkgit:git-config[1]) to enable or disable this feature
>> +instead of using the `--[no-|force-]untracked-cache`, as it is more
>> +persistant and visible with a configuration variable.
>
> s/persistant/persistent/; but more importantly, I do not think
> persistence has much to do with the choice between the option and
> configuration.  Once it is set with `--untracked-cache`, it should
> persist until you explicitly use `--no-untracked-cache` to disable
> it, no?  Otherwise you have a bug that needs to be fixed.

Yeah, it should persist except if you clone and copy the config file.
I agree that it is not clear what "per

Re: [PATCH v4 09/10] config: add core.untrackedCache

2015-12-29 Thread Junio C Hamano
Christian Couder  writes:

> +core.untrackedCache::
> + Determines if untracked cache will be automatically enabled or
> + disabled. It can be `keep`, `true` or `false`. Before setting
> + it to `true`, you should check that mtime is working properly
> + on your system.
> + See linkgit:git-update-index[1]. `keep` by default.
> +

Before "Before setting it to `true`", the reader needs to be told
what would happen when this is set to each of these three values (as
well as what happens when this is not set at all).

> diff --git a/Documentation/git-update-index.txt 
> b/Documentation/git-update-index.txt
> index a0afe17..813f6cc 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -174,27 +174,31 @@ may not support it yet.
>  
>  --untracked-cache::
>  --no-untracked-cache::
> - Enable or disable untracked cache extension. This could speed
> - up for commands that involve determining untracked files such
> - as `git status`. The underlying operating system and file
> - system must change `st_mtime` field of a directory if files
> - are added or deleted in that directory.
> + Enable or disable untracked cache extension. Please use
> + `--test-untracked-cache` before enabling it.

"extension" is a term that is fairly close to the machinery.  In
other parts of the documentation (like the added text below in this
patch), it is called "untracked cache FEATURE", which probably is a
better word to use here as well.

> ++
> +These options cannot override the `core.untrackedCache` configuration
> +variable when it is set to `true` or `false` (see
> +linkgit:git-config[1]). They can override it only when it is unset or
> +set to `keep`. If you want untracked cache to persist, it is better
> +anyway to just set this configuration variable to `true` or `false`
> +directly.

While the above is not wrong per-se, from the point of those who
looked for these options (that is, those who wanted to do a one-shot
enabling or disabling of the feature, perhaps to try it out to see
how well it helps on their system), I think the explanation of the
interaction between the option and the config is backwards.  For
their purpose, setting it to `true` or `false` will be hinderance,
because the options are made no-op by such a setting.  IOW, the
advertisement "once you decided that you want to use the feature,
configuration is a better place to set it" does not belong here.

If I were writing this documentation, I'd probably rephrase the
second paragraph like so:

These options take effect only when the
`core.untrackedCache` configuration variable (see
linkgit:git-config[1]) is set to `keep` (or it is left
unset).  When the configuration variable is set to `true`,
the untracked cache feature is always enabled (and when it
is set to `false`, it is always disabled), making these
options no-op.

perhaps.

> @@ -385,6 +389,34 @@ Although this bit looks similar to assume-unchanged bit, 
> its goal is
>  different from assume-unchanged bit's. Skip-worktree also takes
>  precedence over assume-unchanged bit when both are set.
>  
> +Untracked cache
> +---
> +
> +This cache could speed up commands that involve determining untracked
> +...
> +It is recommended to use the `core.untrackedCache` configuration
> +variable (see linkgit:git-config[1]) to enable or disable this feature
> +instead of using the `--[no-|force-]untracked-cache`, as it is more
> +persistant and visible with a configuration variable.

s/persistant/persistent/; but more importantly, I do not think
persistence has much to do with the choice between the option and
configuration.  Once it is set with `--untracked-cache`, it should
persist until you explicitly use `--no-untracked-cache` to disable
it, no?  Otherwise you have a bug that needs to be fixed.

The reason you'd want to use a configuration is because the effect
of using the `--untracked-cache` option is per repository (rather,
per index-file).

If you want to enable (or disable) this feature, it is easier to
use the `core.untrackedCache` configuration variable than using
the `--untracked-cache` option to `git update-index` in each
repository, especially if you want to do so across all
repositories you use, because you can set the configuration
variable to `true` (or `false`) in your `$HOME/.gitconfig` just
once and have it affect all repositories you touch.

or something, perhaps.

> +When the `core.untrackedCache` configuration variable is changed, the
> +untracked cache is added to or removed from the index the next time
> +"git status" is run; while when `--[no-|force-]untracked-cache` are
> +used, the untracked cache is immediately added to or removed from the
> +index.

Is it only "git status", or anything that writes/updates the index
file?  The above makes it sound as if you are saying "If you change
the variable, you must run `

[PATCH v4 09/10] config: add core.untrackedCache

2015-12-28 Thread Christian Couder
When we know that mtime is fully supported by the environment, we
might want the untracked cache to be always used by default without
any mtime test or kernel version check being performed.

Also when we know that mtime is not supported by the environment,
for example because the repo is shared over a network file system,
then we might want 'git update-index --untracked-cache' to fail
immediately instead of performing tests (because it might work on
some systems using the repo over the network file system but not
others).

The normal way to achieve the above in Git is to use a config
variable. That's why this patch introduces "core.untrackedCache".

This variable is a tristate, `keep`, `false` and `true`, which
default to `keep`.

When "git status" is run, it now adds or removes the untracked
cache in the index to respect the value of this variable. So it
does nothing if the value is `keep` or if the variable is unset;
it adds the untracked cache if the value is `true`; and it
removes the cache if the value is `false`.

`git update-index --[no-|force-]untracked-cache` still adds or
removes the untracked cache from the index, but this now fails
if it goes against the value of core.untrackedCache.

Also `--untracked-cache` used to check that the underlying
operating system and file system change `st_mtime` field of a
directory if files are added or deleted in that directory. But
those tests take a long time and there is now
`--test-untracked-cache` to perform them.

That's why, to be more consistent with other git commands, this
patch prevents `--untracked-cache` to perform tests, so that
after this patch there is no difference any more between
`--untracked-cache` and `--force-untracked-cache`.

This last change is backward incompatible and should be
mentioned in the release notes.

Signed-off-by: Christian Couder 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   |  7 
 Documentation/git-update-index.txt | 65 ++
 builtin/update-index.c | 35 --
 cache.h|  1 +
 config.c   | 11 ++
 contrib/completion/git-completion.bash |  1 +
 environment.c  |  1 +
 t/t7063-status-untracked-cache.sh  |  4 +--
 wt-status.c| 13 +++
 9 files changed, 109 insertions(+), 29 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f617886..d892127 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -308,6 +308,13 @@ core.trustctime::
crawlers and some backup systems).
See linkgit:git-update-index[1]. True by default.
 
+core.untrackedCache::
+   Determines if untracked cache will be automatically enabled or
+   disabled. It can be `keep`, `true` or `false`. Before setting
+   it to `true`, you should check that mtime is working properly
+   on your system.
+   See linkgit:git-update-index[1]. `keep` by default.
+
 core.checkStat::
Determines which stat fields to match between the index
and work tree. The user can set this to 'default' or
diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index a0afe17..813f6cc 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -174,27 +174,31 @@ may not support it yet.
 
 --untracked-cache::
 --no-untracked-cache::
-   Enable or disable untracked cache extension. This could speed
-   up for commands that involve determining untracked files such
-   as `git status`. The underlying operating system and file
-   system must change `st_mtime` field of a directory if files
-   are added or deleted in that directory.
+   Enable or disable untracked cache extension. Please use
+   `--test-untracked-cache` before enabling it.
++
+These options cannot override the `core.untrackedCache` configuration
+variable when it is set to `true` or `false` (see
+linkgit:git-config[1]). They can override it only when it is unset or
+set to `keep`. If you want untracked cache to persist, it is better
+anyway to just set this configuration variable to `true` or `false`
+directly.
 
 --test-untracked-cache::
Only perform tests on the working directory to make sure
untracked cache can be used. You have to manually enable
-   untracked cache using `--force-untracked-cache` (or
-   `--untracked-cache` but this will run the tests again)
-   afterwards if you really want to use it. If a test fails
-   the exit code is 1 and a message explains what is not
-   working as needed, otherwise the exit code is 0 and OK is
-   printed.
+   untracked cache using `--untracked-cache` or
+   `--force-untracked-cache` or the `core.untrackedCache`
+   configuration variable afterwards if you really want to use
+   it. If a test fails the exit code is 1 and a me