Re: [PATCH] http-backend: allow empty CONTENT_LENGTH

2018-09-08 Thread Max Kirillov
On Fri, Sep 07, 2018 at 02:49:23AM -0700, Junio C Hamano wrote:
> In any case, hopefully we can fix this before the final, as this is
> a regression introduced during this cycle?

I think I am going to stop at the v4. Unless there are some
corrections requested.


[PATCH v4] http-backend: allow empty CONTENT_LENGTH

2018-09-08 Thread Max Kirillov
Before 817f7dc223, CONTENT_LENGTH variable was never considered,
http-backend was just reading request body from standard input until EOF
when it, or a command started by it, needed it.

Then it was discovered that some HTTP do not close standard input, instead
expecting CGI scripts to obey CONTENT_LENGTH. In 817f7dc223, behavior
was changed to consider the CONTENT_LENGTH variable when it is set. Case
of unset CONTENT_LENGTH was kept to mean "read until EOF" which is not
compliant to the RFC3875 (which treats it as empty body), but
practically is used when client uses chunked encoding to submit big
request.

Case of empty CONTENT_LENGTH has slept through this conditions.
Apparently, it is used for GET requests, and RFC3875 does specify that
it also means empty body. Current implementation, however, fails to
parse it and aborts the request.

Fix the case of empty CONTENT_LENGTH to be treated as zero-length body
is expected, as specified by RFC3875. It does not actually matter what
does it mean because body is never read anyway, it just should not cause
parse error. Add a test for the case.

Reported-By: Jelmer Vernooij 
Authored-by: Jeff King 
Signed-off-by: Max Kirillov 
---
The fix suggested by Jeff. I supposed there should be "signed-off"
The tests pass as well
 http-backend.c | 24 ++--
 t/t5562-http-backend-content-length.sh | 11 +++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index e88d29f62b..949821b46f 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -353,8 +353,28 @@ static ssize_t get_content_length(void)
ssize_t val = -1;
const char *str = getenv("CONTENT_LENGTH");
 
-   if (str && !git_parse_ssize_t(str, ))
-   die("failed to parse CONTENT_LENGTH: %s", str);
+   if (!str) {
+   /*
+* RFC3875 says this must mean "no body", but in practice we
+* receive chunked encodings with no CONTENT_LENGTH. Tell the
+* caller to read until EOF.
+*/
+   val = -1;
+   } else if (!*str) {
+   /*
+* An empty length should be treated as "no body" according to
+* RFC3875, and this seems to hold in practice.
+*/
+   val = 0;
+   } else {
+   /*
+* We have a non-empty CONTENT_LENGTH; trust what's in it as 
long
+* as it can be parsed.
+*/
+   if (!git_parse_ssize_t(str, ))
+   die("failed to parse CONTENT_LENGTH: '%s'", str);
+   }
+
return val;
 }
 
diff --git a/t/t5562-http-backend-content-length.sh 
b/t/t5562-http-backend-content-length.sh
index 057dcb85d6..b28c3c4765 100755
--- a/t/t5562-http-backend-content-length.sh
+++ b/t/t5562-http-backend-content-length.sh
@@ -152,4 +152,15 @@ test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
grep "fatal:.*CONTENT_LENGTH" err
 '
 
+test_expect_success 'empty CONTENT_LENGTH' '
+   env \
+   QUERY_STRING="/repo.git/info/refs?service=git-receive-pack" \
+   PATH_TRANSLATED="$PWD"/.git/info/refs \
+   GIT_HTTP_EXPORT_ALL=TRUE \
+   REQUEST_METHOD=GET \
+   CONTENT_LENGTH="" \
+   git http-backend act.out 2>act.err &&
+   verify_http_result "200 OK"
+'
+
 test_done
-- 
2.17.0.1185.g782057d875



Re: git silently ignores include directive with single quotes

2018-09-08 Thread Paul Smith
On Sat, 2018-09-08 at 13:13 -0700, Stas Bekman wrote:
> I remind that the original problem came from a simple command:
> 
>  git config --local include.path '../.gitconfig'
> 
> Which on linux removed the quotes and all was fine, and on windows
> the same command kept the quotes and the user was tearing his hair
> out trying to understand why the custom config was ignored.

I'm quite sure that the user was not using the Git Bash shell when they
entered this command, but instead using command.com or powershell or
some variant of that.

If you use Git Bash as your shell then quotes will be handled like any
POSIX shell and the above will do what (we all) expect.

If you use command.com and powershell and use single quotes, then the
single quotes will be put into the config file as you observed, because
those shells don't deal with single quotes.

You could use double-quotes, which ARE handled by command.com and
powershell; in that case they would be stripped out and would not
appear in the config.


If we were designing from scratch maybe using something like GNU make's
"include" vs "sinclude" (silent include--this is another name for the
already mentioned "-include") would work; maybe "path" and "spath" or
something.  But to make it work right you really want the default
behavior to be "warn if the file is not found" and have the special
behavior be "quiet if the file is not found" otherwise it doesn't
really help beginners to avoid errors.  And that's a backward-
compatibility problem.  To my mind, adding extra "check this" options
isn't very useful either: these kinds of warnings need to be on by
default to be effective.  The beginners, who need them, aren't going to
remember to add extra options to enable more checking.

What I personally think would be more useful would be some sort of
"verbose parsing" option to git config, that would parse the
configuration just as a normal Git command would and show diagnostic
output as the entire config is parsed: for each action line the config
file name and line number, and the operation performed (and any message
about it) would be printed.  This could be useful in a variety of
situations, for instance to discover conflicts between local, global,
and system configuration, easily see where settings are coming from,
etc.

And as part of this output, when an include file was not present or we
didn't have permissions or whatever, an appropriate error message would
be generated.


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 07:51 PM, Paul Smith wrote:
[...]
> What I personally think would be more useful would be some sort of
> "verbose parsing" option to git config, that would parse the
> configuration just as a normal Git command would and show diagnostic
> output as the entire config is parsed: for each action line the config
> file name and line number, and the operation performed (and any message
> about it) would be printed.  This could be useful in a variety of
> situations, for instance to discover conflicts between local, global,
> and system configuration, easily see where settings are coming from,
> etc.
> 
> And as part of this output, when an include file was not present or we
> didn't have permissions or whatever, an appropriate error message would
> be generated.

I was thinking along the same lines, Paul - i.e. no need to change
anything in the config syntax, but to provide better diagnostics.

I quote below what I suggested in an earlier email, but I like Paul's
idea even better as it'd be useful to many situations and not just the
one that started this thread.

> 1) I suggest this is done via:
>
>   git config --list --show-origin
>
> where the new addition would be to also show configuration parts that
> are not active and indicating why it is so.
>
> So for example currently I get on a valid configuration setup and having
> git/../.gitconfig in place the following output:
>
> [...]
> file:/home/stas/.gitconfig  mergetool.prompt=false
> [...]
> file:.git/configinclude.path=../.gitconfig
> [...]
> file:.git/../.gitconfig
> filter.fastai-nbstripout-code.clean=tools/fastai-nbstripout
> [...]
>
> Now, if include.path=../.gitconfig is there and file:.git/../.gitconfig
> is not found, it will indicate that in some way that stands out for the
> user. Perhaps:
>
> [...]
> file:/home/stas/.gitconfig  mergetool.prompt=false
> [...]
> file:.git/configinclude.path=../.gitconfig
> [...]
> file:.git/../.gitconfig FILE NOT FOUND! Ignored configuration
> [...]
>
> So that would allow things to work as before, but now we have a way to
> debug user-side configuration. And of course hoping that the docs would
> indicate that method for debugging configuration problems.
>
> I hope this is a reasonable suggestion that doesn't require any
> modification on the users' part who rely on this silent ignoring
> "feature", yet lending to a configuration debug feature.



-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 11:10:44PM +0100, Ramsay Jones wrote:

> > Probably:
> > 
> >   [include]
> >   warnOnMissing = false
> >   path = ...
> 
> I was going to suggest, inspired by Makefile syntax, that
> [-include] would not complain if the file was missing ...
> except, of course, it's too late for that! ;-)
> 
> I suppose [+include] could complain if the file is missing
> instead, ... dunno.

I think that's syntactically invalid. At any rate, there are clearly
three options for setting a bit:

  1. In the section header (+include, or Ævar's includeIf suggestion).

  2. In another key (which looks pretty clean, but does introduce
 ordering constraints).

  3. In the key name (maybePath or similar).

I don't have a huge preference between them.

-Peff


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 03:49:01PM -0700, Stas Bekman wrote:

> On 2018-09-08 02:22 PM, Jeff King wrote:
> [...]
> >> The original problem cropped up due to using:
> >>
> >>  git config --local include.path '../.gitconfig'
> >>
> >> which on linux stripped the single quotes, but on some windows git bash
> >> emulation it kept them.
> > 
> > That sounds like a bug in git bash, if it is not treating single quotes
> > in the usual shell way. But I'd also expect such a bug to cause loads of
> > problems in all of the shell scripts. Are you sure it wasn't cmd.exe or
> > some other interpreter?
> 
> I don't know, Jeff. I think the user said it was first anaconda shell.
> And then the user tried gitforwindows with same results. I don't know
> MSwindows at all.
> 
> But it doesn't matter at the end of the day, since we can't cover all
> possible unix shell emulations out there. What matters is that there is
> a way to flag the misconfiguration, either by default, or through a
> special check - some ideas I suggested in my previous email, but surely
> you have a much better insight of how to deal with that.

Yes, I agree this is pretty orthogonal to the config discussion. I was
just wondering if there was a separate bug to look into. I'm willing to
shrug and say "it was probably user error" until we see more definite
details. Thanks.

-Peff


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Jeff King
On Sun, Sep 09, 2018 at 12:32:57AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > You could even do:
> >
> >   [include]
> >   warnOnMissing = false
> >   path = one
> >   warnOnMissing = true
> >   path = two
> >
> > to treat two includes differently (though I'm not sure why you would
> > want to).
> 
> I think this is introducing a brand new caveat into our *.ini syntax,
> i.e. that we're sensitive to the order in which we're parsing
> *different* keys.
> 
> I.e. we already had the concept that some keys override existing sets
> (e.g. user.name), but not that a x.y=foo controls the behavior of a
> subsequent a.b=bar, or the other way around.

This already exists. For example:

  echo '[foo]bar = inc' >config.inc
  echo '[foo]bar = main' >config.main
  echo '[include]path = config.inc' >>config.main
  git config -f config.main --includes foo.bar

Arriving at that answer requires expanding the include's contents at
the exact same spot in the file.

As far as I know this is the only case, though, so include.path really
is special. And this would be expanding that specialness to other things
in include.*. But I'm not sure that is really that big a deal. That
warnOnMissing isn't meant to be just a normal key. It's an
order-sensitive directive for further parsing, just like include.path
is. Either the parser understands includes (and has to handle these) or
it doesn't.

So I'm not worried about any burden on the parsing side, but...

> This also makes programmatic (via "git config") editing of the config
> hard, we'd need to introduce something like:
> 
> git config -f ~/.gitconfig a.b bar
> git config -f ~/.gitconfig --before a.b x.y foo
> 
> To set a.b=bar before x.y=foo, or --after or whatever.

Yes, I don't think "git config include.warnOnMissing true" would be very
useful, because it would generally be added after any includes you have,
and therefore not affect them.

I think this is generally an issue with include.path, too. My assumption
has been that anybody at the level of using includes is probably going
to be hand-editing anyway.

But if include.* is special on the parsing side, I don't know that it is
that bad to make it special on the writing side, too. I.e., to recognize
"git config include.warnOnMissing true" and always add it at the head of
any existing include block.

It certainly _feels_ hacky, but I think it would behave sensibly and
predictably. And it would just work, as opposed to requiring something
like "--before", which would be quite a subtle gotcha for somebody to
forget to use.

> Yeah, we could do that, and it wouldn't break the model described above,
> We can make that work, but this would be nasty. E.g. are we going to
> treat EACCES and ENOENT the same way in this construct?

I don't have a strong opinion (after all, I already wrote the behavior I
thought was reasonable long ago ;) ). So I think it would be up to
somebody to propose. We do already report and die on EACCES (and
basically any other error except ENOENT). So if we did treat them both
as a warning, that would be a weakening for EACCES.

-Peff


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 02:22 PM, Jeff King wrote:
[...]
>> The original problem cropped up due to using:
>>
>>  git config --local include.path '../.gitconfig'
>>
>> which on linux stripped the single quotes, but on some windows git bash
>> emulation it kept them.
> 
> That sounds like a bug in git bash, if it is not treating single quotes
> in the usual shell way. But I'd also expect such a bug to cause loads of
> problems in all of the shell scripts. Are you sure it wasn't cmd.exe or
> some other interpreter?

I don't know, Jeff. I think the user said it was first anaconda shell.
And then the user tried gitforwindows with same results. I don't know
MSwindows at all.

But it doesn't matter at the end of the day, since we can't cover all
possible unix shell emulations out there. What matters is that there is
a way to flag the misconfiguration, either by default, or through a
special check - some ideas I suggested in my previous email, but surely
you have a much better insight of how to deal with that.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 08 2018, Jeff King wrote:

> On Sat, Sep 08, 2018 at 09:54:14PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> The reason missing includes are ignored is that the way this is expected
>> to be used is e.g.:
>>
>> [include]
>> path ~/.gitconfig.work
>>
>> Where .gitconfig.work is some configuration you're going to drop into
>> place on your $dayjob servers, but not on your personal machine, even
>> though you sync the same ~/.gitconfig everywhere.
>>
>> A lot of people who use includes rely on this, but I see from this
>> thread this should be better documented.
>
> Right, this was an intentional choice at the time the feature was added,
> to support this kind of feature. I'd note also that it mirrors other
> misspelled keys. E.g.:
>
>   [include]
>   psth = whatever
>
> will also not generate an error. This is also intentional, for two
> reasons:
>
>   1. Git's config format has always been designed to carry extra keys
>  used by third-party scripts and porcelain. So we don't actually
>  know the complete set of valid keys. (Though you could make an
>  argument that git-core could stake out include.* as its own).
>
>   2. It makes using multiple git versions easier in some ways (though
>  also harder in others). A config key that isn't known to the
>  current version will be quietly ignored.
>
> Of course those things mean that true spelling mistakes are harder to
> catch as such, because Git doesn't know that's what they are. And here
> I'm talking config _keys_, not values. So I'm just explaining the
> philosophical thinking that led to the "missing file is a silent noop".
> It doesn't _have_ to behave the same.
>
> That said, it _does_ behave the same and people are likely depending on
> it at this point. So if we introduce a warning, for example, there needs
> to be some way to suppress it.
>
> Probably:
>
>   [include]
>   warnOnMissing = false
>   path = ...
>
> would be enough (with the default being "true").
>
> You could even do:
>
>   [include]
>   warnOnMissing = false
>   path = one
>   warnOnMissing = true
>   path = two
>
> to treat two includes differently (though I'm not sure why you would
> want to).

I think this is introducing a brand new caveat into our *.ini syntax,
i.e. that we're sensitive to the order in which we're parsing
*different* keys.

I.e. we already had the concept that some keys override existing sets
(e.g. user.name), but not that a x.y=foo controls the behavior of a
subsequent a.b=bar, or the other way around.

This also makes programmatic (via "git config") editing of the config
hard, we'd need to introduce something like:

git config -f ~/.gitconfig a.b bar
git config -f ~/.gitconfig --before a.b x.y foo

To set a.b=bar before x.y=foo, or --after or whatever.

>> If we were to make nonexisting files an error, we'd need something like
>> an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
>> conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
>> include", 2017-03-01). I.e.:
>>
>> [includeIfcond "test -e ~/.gitconfig.work"]
>> path = ~/.gitconfig.work
>>
>> Or something like that, this is getting increasingly harder to shove
>> into the *.ini config syntax.
>
> I think it would be simpler to just introduce a new key that's a variant
> of "path". Like:
>
>   [include]
>   maybePath = ~/.gitconfig.work
>
> Though if it really is just a warning, the "warnOnMissing" above would
> make that unnecessary (and it also scales better if we have to end up
> adding more behavior tweaks in the future).

Yeah, we could do that, and it wouldn't break the model described above,
We can make that work, but this would be nasty. E.g. are we going to
treat EACCES and ENOENT the same way in this construct?


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ramsay Jones



On 08/09/18 22:14, Jeff King wrote:
> On Sat, Sep 08, 2018 at 09:54:14PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
>> The reason missing includes are ignored is that the way this is expected
>> to be used is e.g.:
>>
>> [include]
>> path ~/.gitconfig.work
>>
>> Where .gitconfig.work is some configuration you're going to drop into
>> place on your $dayjob servers, but not on your personal machine, even
>> though you sync the same ~/.gitconfig everywhere.
>>
>> A lot of people who use includes rely on this, but I see from this
>> thread this should be better documented.
> 
> Right, this was an intentional choice at the time the feature was added,
> to support this kind of feature. I'd note also that it mirrors other
> misspelled keys. E.g.:
> 
>   [include]
>   psth = whatever
> 
[snip]
> That said, it _does_ behave the same and people are likely depending on
> it at this point. So if we introduce a warning, for example, there needs
> to be some way to suppress it.
> 
> Probably:
> 
>   [include]
>   warnOnMissing = false
>   path = ...

I was going to suggest, inspired by Makefile syntax, that
[-include] would not complain if the file was missing ...
except, of course, it's too late for that! ;-)

I suppose [+include] could complain if the file is missing
instead, ... dunno.

ATB,
Ramsay Jones



Re: git silently ignores include directive with single quotes

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 11:58:47AM -0700, Stas Bekman wrote:

> This doesn’t:
> 
> [include]
> path = '../.gitconfig'

So I think it's been covered elsewhere that single quotes aren't a thing
in git's config format. I will say that this was actually a minor
surprise to me, after a decade of working with the format. ;)

I don't know if it's worth changing now or not It would be
backwards-incompatible, but I wonder if we could do it in a sane way.
E.g., with a rule like:

  - if the first non-whitespace character of the value is a
single-quote, assume the value is quoted and apply normal shell
rules (i.e., no backslash escapes until the ending single-quote)

  - otherwise, single-quotes are not special at all

That would allow things like:

  [diff "foo"]
  textconv = some_shell_hackery 'with quotes' | foo

to continue working, but make:

  [some]
  path = 'this has "double quotes" in it!'

do what the user probably intended. It would be a regression for anybody
who literally has a value that starts with a single-quote, but that
seems like it would be pretty rare. Or I dunno, maybe people do it on
Windows to try to protect path-names that get interpreted by the shell.

> The original problem cropped up due to using:
> 
>  git config --local include.path '../.gitconfig'
> 
> which on linux stripped the single quotes, but on some windows git bash
> emulation it kept them.

That sounds like a bug in git bash, if it is not treating single quotes
in the usual shell way. But I'd also expect such a bug to cause loads of
problems in all of the shell scripts. Are you sure it wasn't cmd.exe or
some other interpreter?

-Peff


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 09:54:14PM +0200, Ævar Arnfjörð Bjarmason wrote:

> The reason missing includes are ignored is that the way this is expected
> to be used is e.g.:
> 
> [include]
> path ~/.gitconfig.work
> 
> Where .gitconfig.work is some configuration you're going to drop into
> place on your $dayjob servers, but not on your personal machine, even
> though you sync the same ~/.gitconfig everywhere.
> 
> A lot of people who use includes rely on this, but I see from this
> thread this should be better documented.

Right, this was an intentional choice at the time the feature was added,
to support this kind of feature. I'd note also that it mirrors other
misspelled keys. E.g.:

  [include]
  psth = whatever

will also not generate an error. This is also intentional, for two
reasons:

  1. Git's config format has always been designed to carry extra keys
 used by third-party scripts and porcelain. So we don't actually
 know the complete set of valid keys. (Though you could make an
 argument that git-core could stake out include.* as its own).

  2. It makes using multiple git versions easier in some ways (though
 also harder in others). A config key that isn't known to the
 current version will be quietly ignored.

Of course those things mean that true spelling mistakes are harder to
catch as such, because Git doesn't know that's what they are. And here
I'm talking config _keys_, not values. So I'm just explaining the
philosophical thinking that led to the "missing file is a silent noop".
It doesn't _have_ to behave the same.

That said, it _does_ behave the same and people are likely depending on
it at this point. So if we introduce a warning, for example, there needs
to be some way to suppress it.

Probably:

  [include]
  warnOnMissing = false
  path = ...

would be enough (with the default being "true").

You could even do:

  [include]
  warnOnMissing = false
  path = one
  warnOnMissing = true
  path = two

to treat two includes differently (though I'm not sure why you would
want to).

> If we were to make nonexisting files an error, we'd need something like
> an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
> conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
> include", 2017-03-01). I.e.:
> 
> [includeIfcond "test -e ~/.gitconfig.work"]
> path = ~/.gitconfig.work
> 
> Or something like that, this is getting increasingly harder to shove
> into the *.ini config syntax.

I think it would be simpler to just introduce a new key that's a variant
of "path". Like:

  [include]
  maybePath = ~/.gitconfig.work

Though if it really is just a warning, the "warnOnMissing" above would
make that unnecessary (and it also scales better if we have to end up
adding more behavior tweaks in the future).

-Peff


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 01:28 PM, Ævar Arnfjörð Bjarmason wrote:
[...]
> Yeah, some version of this is sensible. There's at least a doc patch in
> here somewhere, if not some "warn if missing" mode.
> 
> So don't take any of this as minimizing that aspect of your bug report.
> 
> *But*
> 
> There's just no way that "git" the tool can somehow in a sane way rescue
> you from knowing the quoting rules of the shell on your system, which
> differ wildly between the likes of Windows and Linux.

I understand. All your explanations are perfectly reasonable, Ævar.
Thank you.

Yet, there needs to be some way for a user to know that git ignored
something if their configuration doesn't work as expected.

1) I suggest this is done via:

  git config --list --show-origin

where the new addition would be to also show configuration parts that
are not active and indicating why it is so.

So for example currently I get on a valid configuration setup and having
git/../.gitconfig in place the following output:

[...]
file:/home/stas/.gitconfig  mergetool.prompt=false
[...]
file:.git/configinclude.path=../.gitconfig
[...]
file:.git/../.gitconfig
filter.fastai-nbstripout-code.clean=tools/fastai-nbstripout
[...]

Now, if include.path=../.gitconfig is there and file:.git/../.gitconfig
is not found, it will indicate that in some way that stands out for the
user. Perhaps:

[...]
file:/home/stas/.gitconfig  mergetool.prompt=false
[...]
file:.git/configinclude.path=../.gitconfig
[...]
file:.git/../.gitconfig FILE NOT FOUND! Ignored configuration
[...]

So that would allow things to work as before, but now we have a way to
debug user-side configuration. And of course hoping that the docs would
indicate that method for debugging configuration problems.

I hope this is a reasonable suggestion that doesn't require any
modification on the users' part who rely on this silent ignoring
"feature", yet lending to a configuration debug feature.

2) And a secondary suggestion I mentioned earlier is to also have a flag
for git config to validate the path as it is being configured:

 git config --local include.path '../.gitconfig' --validate-path

so that on shells that deal with quoting differently, than what git
expects, this git command will fail saying:

error: can't find file:.git/'../.gitconfig'

or at the very least give a warning if we don't want it be fatal. Though
I see no problem with it being fatal if a user uses a special flag.

I made this second suggestion since it will help users to detect the
problem early on. Before they need to search for another debug solution
such as the first one suggested in this email.

3) Finally, it'd be useful to have GIT_TRACE=1 state that so and so
include path wasn't found and was ignored during various 'git whatever'
commands.

I am open to any or all of these solutions, or alternative suggestions
of course.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 08 2018, Stas Bekman wrote:

> On 2018-09-08 01:02 PM, Ævar Arnfjörð Bjarmason wrote:
>
>> Aside from other issues here, in the "wold of unix" (not that we only
>> use the git config syntax on those sort of systems) you can't assume
>> that just because some quoting construct works in the shell, that it
>> works the same way in some random config format. If you look in your
>> /etc/ you'll find plenty of config formats where you can't use single,
>> double and no quotes interchangeably, so I don't see what hte confusion
>> is with that particular aspect of this.
>>
>> Although as I mentioned in <87bm97rcih@evledraar.gmail.com> the fact
>> that we ignore missing includes definitely needs to be documented, but
>> that our quoting constructs in our config format behave like they do in
>> POSIX shells I see as a non-issue.
>
> I agree that I should make no such assumptions. Thank you. But it is a
> cross-platform problem. I remind that the original problem came from a
> simple command:
>
>  git config --local include.path '../.gitconfig'
>
> Which on linux removed the quotes and all was fine, and on windows the
> same command kept the quotes and the user was tearing his hair out
> trying to understand why the custom config was ignored.
>
> So you can say, don't use the quotes in first place. But what if you have:
>
>  git config --local include.path 'somewhere on the system/.gitconfig'
>
> you have to use single or double quotes inside the shell to keep it as a
> single argument, yet on some windows set ups it'll result in git
> ignoring this configuration directive, as the quotes will end up in git
> config file.
>
> I'd say at the very least 'git config' could have an option
> --verify-path or something similar and for it to validate that the path
> is there exactly as it adds it to .git/config at the time of running
> this command to help the user debug the situation. Of course this won't
> help if .git/config is modified manually. But it's a step towards
> supporting users.
>
> I hope this clarifies the situation.

Yeah, some version of this is sensible. There's at least a doc patch in
here somewhere, if not some "warn if missing" mode.

So don't take any of this as minimizing that aspect of your bug report.

*But*

There's just no way that "git" the tool can somehow in a sane way rescue
you from knowing the quoting rules of the shell on your system, which
differ wildly between the likes of Windows and Linux.

We guarantee that if you pass us the string "foo" it'll work the same
(for the purposes of config syntax, and most other things on all
systems).

We can't guarantee that just because on one system/shell "foo" means the
same as 'foo' when you type it into the terminal, but others it doesn't
that we'll treat it the same way, it's ultimately up to you to know the
quoting rules of your system shell.

On linux/bash I can also do "git config foo.bar <(some-command)", and
there's some systems where that'll be passed in as though (on
linux/bash) we'd passed in:

git config foo.bar "<(some-command)"

What are we supposed to do with that? In particular in the case where
"foo.bar" is supposed to point to a valid filename, and
"<(some-command)" *is* a valid filename?


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 08 2018, Stas Bekman wrote:

> On 2018-09-08 12:54 PM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Sat, Sep 08 2018, Martin Ågren wrote:
>>
>>> Hi Stas
>>>
>>> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
 [include]
 path = '../.gitconfig'

 Notice the single quotes around the filename. When this is the case git
 silently (!) ignores the custom configuration, which is clearly a bug.
>>>
>>> Thanks for reporting and describing out your expectations and what you
>>> observed.
>>>
>>> Actually, there is a test explicitly testing that 'missing include files
>>> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
>>> (config: add include directive, 2012-02-06).
>>>
 The original problem cropped up due to using:

  git config --local include.path '../.gitconfig'

 which on linux stripped the single quotes, but on some windows git bash
 emulation it kept them.
>>>
>>> Huh, I wouldn't have expected them to be kept. You learn something
>>> new every day...
>>>
 What am I suggesting is that git:

 (1) should complain if it encounters an invalid configuration and not
 silently ignore it. It took quite some effort and time to figure the
 culprit.
>>>
>>> Sounds reasonable to me, but I might be missing something. I'm cc-ing
>>> the original author. Maybe he can recall why he made sure it silently
>>> ignores missing files.
>>>
 (2) probably allow the quoted location of the file, but it's much less
 important, as it's easy to rectify once git gives user #1
>>>
>>> I don't think this will work. Allowing quoting for just this one item,
>>> or for all? Any and all quoting or just at the first and last character?
>>> What about those config items where quotes might legitimately occur,
>>> i.e., we'd need some escaping? Actually, something like '.gitconfig'
>>> *with* *those* *quotes* is a valid filename on my machine.
>>
>> The reason missing includes are ignored is that the way this is expected
>> to be used is e.g.:
>>
>> [include]
>> path ~/.gitconfig.work
>>
>> Where .gitconfig.work is some configuration you're going to drop into
>> place on your $dayjob servers, but not on your personal machine, even
>> though you sync the same ~/.gitconfig everywhere.
>
> Thank you for clarifying why this is done silently, Ævar. It makes sense
> then.
>
>> If we were to make nonexisting files an error, we'd need something like
>> an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
>> conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
>> include", 2017-03-01). I.e.:
>>
>> [includeIfcond "test -e ~/.gitconfig.work"]
>> path = ~/.gitconfig.work
>>
>> Or something like that, this is getting increasingly harder to shove
>> into the *.ini config syntax.
>
> This suggestion won't solve the real problem. The real problem is that
> git can't find '.gitconfig' even though it's there, due to single quotes
> around the filepath. So the suggested check will still ignore the
> configuration even if it's there.

...because that's not how the *.ini syntax works. That means to look up
a file called '.gitconfig', as opposed to .gitconfig, ie. one that
actually starts with a single quote. On POSIX systems filenames can
include all bytes except \0, so we need some way to include those.

I've just created a 'foo' file (i.e. one that has a 5-chararcer name,
including single quotes), and including it via git's config works, as
opposed to the filename foo (i.e. the three-character version).

I can see how this is confusing, but we can't have some way to have this
"ignore missing" feature and warn about stuff like 'foo' v.s. "foo"
v.s. foo without carrying some list of quoting constructs deemed to be
confusing, and forbidding includes from files that look like that.

> This also leads me to think what if the include path has spaces in it?
>
> path = ~/somewhere on my system/.gitconfig.work
>
> most people would assume quotes are needed around the filepath.


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 01:02 PM, Ævar Arnfjörð Bjarmason wrote:

> Aside from other issues here, in the "wold of unix" (not that we only
> use the git config syntax on those sort of systems) you can't assume
> that just because some quoting construct works in the shell, that it
> works the same way in some random config format. If you look in your
> /etc/ you'll find plenty of config formats where you can't use single,
> double and no quotes interchangeably, so I don't see what hte confusion
> is with that particular aspect of this.
> 
> Although as I mentioned in <87bm97rcih@evledraar.gmail.com> the fact
> that we ignore missing includes definitely needs to be documented, but
> that our quoting constructs in our config format behave like they do in
> POSIX shells I see as a non-issue.

I agree that I should make no such assumptions. Thank you. But it is a
cross-platform problem. I remind that the original problem came from a
simple command:

 git config --local include.path '../.gitconfig'

Which on linux removed the quotes and all was fine, and on windows the
same command kept the quotes and the user was tearing his hair out
trying to understand why the custom config was ignored.

So you can say, don't use the quotes in first place. But what if you have:

 git config --local include.path 'somewhere on the system/.gitconfig'

you have to use single or double quotes inside the shell to keep it as a
single argument, yet on some windows set ups it'll result in git
ignoring this configuration directive, as the quotes will end up in git
config file.

I'd say at the very least 'git config' could have an option
--verify-path or something similar and for it to validate that the path
is there exactly as it adds it to .git/config at the time of running
this command to help the user debug the situation. Of course this won't
help if .git/config is modified manually. But it's a step towards
supporting users.

I hope this clarifies the situation.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:54 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sat, Sep 08 2018, Martin Ågren wrote:
> 
>> Hi Stas
>>
>> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>>> [include]
>>> path = '../.gitconfig'
>>>
>>> Notice the single quotes around the filename. When this is the case git
>>> silently (!) ignores the custom configuration, which is clearly a bug.
>>
>> Thanks for reporting and describing out your expectations and what you
>> observed.
>>
>> Actually, there is a test explicitly testing that 'missing include files
>> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
>> (config: add include directive, 2012-02-06).
>>
>>> The original problem cropped up due to using:
>>>
>>>  git config --local include.path '../.gitconfig'
>>>
>>> which on linux stripped the single quotes, but on some windows git bash
>>> emulation it kept them.
>>
>> Huh, I wouldn't have expected them to be kept. You learn something
>> new every day...
>>
>>> What am I suggesting is that git:
>>>
>>> (1) should complain if it encounters an invalid configuration and not
>>> silently ignore it. It took quite some effort and time to figure the
>>> culprit.
>>
>> Sounds reasonable to me, but I might be missing something. I'm cc-ing
>> the original author. Maybe he can recall why he made sure it silently
>> ignores missing files.
>>
>>> (2) probably allow the quoted location of the file, but it's much less
>>> important, as it's easy to rectify once git gives user #1
>>
>> I don't think this will work. Allowing quoting for just this one item,
>> or for all? Any and all quoting or just at the first and last character?
>> What about those config items where quotes might legitimately occur,
>> i.e., we'd need some escaping? Actually, something like '.gitconfig'
>> *with* *those* *quotes* is a valid filename on my machine.
> 
> The reason missing includes are ignored is that the way this is expected
> to be used is e.g.:
> 
> [include]
> path ~/.gitconfig.work
> 
> Where .gitconfig.work is some configuration you're going to drop into
> place on your $dayjob servers, but not on your personal machine, even
> though you sync the same ~/.gitconfig everywhere.

Thank you for clarifying why this is done silently, Ævar. It makes sense
then.

> If we were to make nonexisting files an error, we'd need something like
> an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
> conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
> include", 2017-03-01). I.e.:
> 
> [includeIfcond "test -e ~/.gitconfig.work"]
> path = ~/.gitconfig.work
> 
> Or something like that, this is getting increasingly harder to shove
> into the *.ini config syntax.

This suggestion won't solve the real problem. The real problem is that
git can't find '.gitconfig' even though it's there, due to single quotes
around the filepath. So the suggested check will still ignore the
configuration even if it's there.

This also leads me to think what if the include path has spaces in it?

path = ~/somewhere on my system/.gitconfig.work

most people would assume quotes are needed around the filepath.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 08 2018, Stas Bekman wrote:

> On 2018-09-08 12:30 PM, Martin Ågren wrote:
>> Hi Stas
>>
>> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>>> [include]
>>> path = '../.gitconfig'
>
>> Actually, there is a test explicitly testing that 'missing include files
>> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
>> (config: add include directive, 2012-02-06).
>
> And also to stress out, that the file is not missing.  At least in the
> world of unix, in particular its many shells, - command line arguments
> "xyz", 'xyz', xyz are often deemed to be the same if there are no spaces
> in the word. So that's why it took us a lot of trial and error to even
> consider the quotes in '../.gitconfig' as a problem. While git deems it
> different, to me:
>
> path = '../.gitconfig'
> path = "../.gitconfig"
> path = ../.gitconfig
>
> appear to be the "same". So git needs to have a way to say otherwise.
>
> I realize I am going back to the issue of quoting here, after suggesting
> to ignore it. So to clarify I'm bringing it up only in the context of
> wanting git to tell the user what it wants, and not necessarily asking
> to support all the possible ways one could quote a filepath.

Aside from other issues here, in the "wold of unix" (not that we only
use the git config syntax on those sort of systems) you can't assume
that just because some quoting construct works in the shell, that it
works the same way in some random config format. If you look in your
/etc/ you'll find plenty of config formats where you can't use single,
double and no quotes interchangeably, so I don't see what hte confusion
is with that particular aspect of this.

Although as I mentioned in <87bm97rcih@evledraar.gmail.com> the fact
that we ignore missing includes definitely needs to be documented, but
that our quoting constructs in our config format behave like they do in
POSIX shells I see as a non-issue.


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 08 2018, Martin Ågren wrote:

> Hi Stas
>
> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>> [include]
>> path = '../.gitconfig'
>>
>> Notice the single quotes around the filename. When this is the case git
>> silently (!) ignores the custom configuration, which is clearly a bug.
>
> Thanks for reporting and describing out your expectations and what you
> observed.
>
> Actually, there is a test explicitly testing that 'missing include files
> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
> (config: add include directive, 2012-02-06).
>
>> The original problem cropped up due to using:
>>
>>  git config --local include.path '../.gitconfig'
>>
>> which on linux stripped the single quotes, but on some windows git bash
>> emulation it kept them.
>
> Huh, I wouldn't have expected them to be kept. You learn something
> new every day...
>
>> What am I suggesting is that git:
>>
>> (1) should complain if it encounters an invalid configuration and not
>> silently ignore it. It took quite some effort and time to figure the
>> culprit.
>
> Sounds reasonable to me, but I might be missing something. I'm cc-ing
> the original author. Maybe he can recall why he made sure it silently
> ignores missing files.
>
>> (2) probably allow the quoted location of the file, but it's much less
>> important, as it's easy to rectify once git gives user #1
>
> I don't think this will work. Allowing quoting for just this one item,
> or for all? Any and all quoting or just at the first and last character?
> What about those config items where quotes might legitimately occur,
> i.e., we'd need some escaping? Actually, something like '.gitconfig'
> *with* *those* *quotes* is a valid filename on my machine.

The reason missing includes are ignored is that the way this is expected
to be used is e.g.:

[include]
path ~/.gitconfig.work

Where .gitconfig.work is some configuration you're going to drop into
place on your $dayjob servers, but not on your personal machine, even
though you sync the same ~/.gitconfig everywhere.

A lot of people who use includes rely on this, but I see from this
thread this should be better documented.

If we were to make nonexisting files an error, we'd need something like
an extension of the includeIf syntax added in 3efd0bedc6 ("config: add
conditional include", 2017-03-01) 3efd0bedc6 ("config: add conditional
include", 2017-03-01). I.e.:

[includeIfcond "test -e ~/.gitconfig.work"]
path = ~/.gitconfig.work

Or something like that, this is getting increasingly harder to shove
into the *.ini config syntax.


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:30 PM, Martin Ågren wrote:
> Hi Stas
> 
> On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
>> [include]
>> path = '../.gitconfig'

> Actually, there is a test explicitly testing that 'missing include files
> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
> (config: add include directive, 2012-02-06).

And also to stress out, that the file is not missing.  At least in the
world of unix, in particular its many shells, - command line arguments
"xyz", 'xyz', xyz are often deemed to be the same if there are no spaces
in the word. So that's why it took us a lot of trial and error to even
consider the quotes in '../.gitconfig' as a problem. While git deems it
different, to me:

path = '../.gitconfig'
path = "../.gitconfig"
path = ../.gitconfig

appear to be the "same". So git needs to have a way to say otherwise.

I realize I am going back to the issue of quoting here, after suggesting
to ignore it. So to clarify I'm bringing it up only in the context of
wanting git to tell the user what it wants, and not necessarily asking
to support all the possible ways one could quote a filepath.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
On 2018-09-08 12:30 PM, Martin Ågren wrote:

> Actually, there is a test explicitly testing that 'missing include files
> are ignored'. I couldn't find a motivation for this in 9b25a0b52e
> (config: add include directive, 2012-02-06).

Thank you for the follow up, Martin. And discovering that it is by design.

I suppose this could have been done to optimize run-time performance.
But there must be a way for a user to validate their custom
configuration. So perhaps there should be a specific directive to do so?
One could argue that:

  git config --list --show-origin

does exactly that. Except it should probably also indicate that some
configuration file or parts of were ignored - and clearly indicate the
exact nature of the problem. In which case it'd be sufficient.

>> (2) probably allow the quoted location of the file, but it's much less
>> important, as it's easy to rectify once git gives user #1
> 
> I don't think this will work. Allowing quoting for just this one item,
> or for all? Any and all quoting or just at the first and last character?
> What about those config items where quotes might legitimately occur,
> i.e., we'd need some escaping? Actually, something like '.gitconfig'
> *with* *those* *quotes* is a valid filename on my machine.

Let's ignore this sub-issue for now. If we can get git to report when
something is mis-configured, this issue can then be easily resolved.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: git silently ignores include directive with single quotes

2018-09-08 Thread Martin Ågren
Hi Stas

On Sat, 8 Sep 2018 at 21:00, Stas Bekman  wrote:
> [include]
> path = '../.gitconfig'
>
> Notice the single quotes around the filename. When this is the case git
> silently (!) ignores the custom configuration, which is clearly a bug.

Thanks for reporting and describing out your expectations and what you
observed.

Actually, there is a test explicitly testing that 'missing include files
are ignored'. I couldn't find a motivation for this in 9b25a0b52e
(config: add include directive, 2012-02-06).

> The original problem cropped up due to using:
>
>  git config --local include.path '../.gitconfig'
>
> which on linux stripped the single quotes, but on some windows git bash
> emulation it kept them.

Huh, I wouldn't have expected them to be kept. You learn something
new every day...

> What am I suggesting is that git:
>
> (1) should complain if it encounters an invalid configuration and not
> silently ignore it. It took quite some effort and time to figure the
> culprit.

Sounds reasonable to me, but I might be missing something. I'm cc-ing
the original author. Maybe he can recall why he made sure it silently
ignores missing files.

> (2) probably allow the quoted location of the file, but it's much less
> important, as it's easy to rectify once git gives user #1

I don't think this will work. Allowing quoting for just this one item,
or for all? Any and all quoting or just at the first and last character?
What about those config items where quotes might legitimately occur,
i.e., we'd need some escaping? Actually, something like '.gitconfig'
*with* *those* *quotes* is a valid filename on my machine.

Thank you for reporting.

Martin


git silently ignores include directive with single quotes

2018-09-08 Thread Stas Bekman
Hi,

One of the windows users discovered this bug, and I was able to
reproduce it on linux.

We are using a custom content filter configuration REPO/.gitconfig which
needs to be enabled inside REPO/.git/config:

This works:

[include]
path = ../.gitconfig

This doesn’t:

[include]
path = '../.gitconfig'

Notice the single quotes around the filename. When this is the case git
silently (!) ignores the custom configuration, which is clearly a bug.

I found the easiest to debug this is by using:

git config --list --show-origin

In the former case it shows the custom config, in the latter it does not.

Yet, git gives no indication of any errors, not even with GIT_TRACE and
other debug vars.

The original problem cropped up due to using:

 git config --local include.path '../.gitconfig'

which on linux stripped the single quotes, but on some windows git bash
emulation it kept them.

What am I suggesting is that git:

(1) should complain if it encounters an invalid configuration and not
silently ignore it. It took quite some effort and time to figure the
culprit.

(2) probably allow the quoted location of the file, but it's much less
important, as it's easy to rectify once git gives user #1

I don't have the details about the windows user setup, but I was able to
reproduce this bug with git version 2.17.1 on linux.

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: [PATCH] Revert "Merge branch 'sb/submodule-core-worktree'" (was Re: Old submodules broken in 2.19rc1 and 2.19rc2)

2018-09-08 Thread Johannes Sixt
Am 08.09.2018 um 04:04 schrieb Junio C Hamano:
> Jonathan Nieder  writes:
> 
>> It is late in the release cycle, so revert the whole 3-patch series.
>> We can try again later for 2.20.
>>
>> Reported-by: Allan Sandfeld Jensen 
>> Helped-by: Stefan Beller 
>> Signed-off-by: Jonathan Nieder 
>> ---
>> Stefan Beller wrote:
>>> Jonathan Nieder wrote:
>>
 I think we
 should revert e98317508c0 in "master" (for 2.19) and keep making use
 of that 'second try' in "next" (for 2.20).
>>>
>>> Actually I'd rather revert the whole topic leading up to
>>> 7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18)
>>> as the last patch in there doesn't work well without e98317508c0 IIRC.
>>>
>>> And having only the first patch would bring an inconsistent state as
>>> then different commands behave differently w.r.t. setting core.worktree.
>>
>> Like this (generated using "git revert -m1)?
> 
> OK.  Thanks for taking care of it.

Please don't forget to remove the corresponding release notes entry.

diff --git a/Documentation/RelNotes/2.19.0.txt 
b/Documentation/RelNotes/2.19.0.txt
index bcbfbc2041..834454ffb9 100644
--- a/Documentation/RelNotes/2.19.0.txt
+++ b/Documentation/RelNotes/2.19.0.txt
@@ -296,12 +296,6 @@ Fixes since v2.18
to the submodule was changed in the range of commits in the
superproject, sometimes showing "(null)".  This has been corrected.
 
- * "git submodule" did not correctly adjust core.worktree setting that
-   indicates whether/where a submodule repository has its associated
-   working tree across various state transitions, which has been
-   corrected.
-   (merge 984cd77ddb sb/submodule-core-worktree later to maint).
-
  * Bugfix for "rebase -i" corner case regression.
(merge a9279c6785 pw/rebase-i-keep-reword-after-conflict later to maint).
 


Re: [PATCH 2/2] mingw: fix mingw_open_append to work with named pipes

2018-09-08 Thread Johannes Sixt

Am 08.09.2018 um 11:26 schrieb Johannes Sixt:

Am 07.09.2018 um 20:19 schrieb Jeff Hostetler via GitGitGadget:

diff --git a/compat/mingw.c b/compat/mingw.c
index 858ca14a57..ef03bbe5d2 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -355,7 +355,7 @@ static int mingw_open_append(wchar_t const 
*wfilename, int oflags, ...)

   * FILE_SHARE_WRITE is required to permit child processes
   * to append to the file.
   */
-    handle = CreateFileW(wfilename, FILE_APPEND_DATA,
+    handle = CreateFileW(wfilename, FILE_WRITE_DATA | FILE_APPEND_DATA,
  FILE_SHARE_WRITE | FILE_SHARE_READ,
  NULL, create, FILE_ATTRIBUTE_NORMAL, NULL);
  if (handle == INVALID_HANDLE_VALUE)



I did not go with this version because the documentation 
https://docs.microsoft.com/en-us/windows/desktop/fileio/file-access-rights-constants 
says:


FILE_APPEND_DATA: For a file object, the right to append data to the 
file. (For local files, write operations will not overwrite existing 
data if this flag is specified without FILE_WRITE_DATA.) [...]


which could be interpreted as: Only if FILE_WRITE_DATA is not set, we 
have the guarantee that existing data in local files is not overwritten, 
i.e., new data is appended atomically.


Is this interpretation too narrow and we do get atomicity even when 
FILE_WRITE_DATA is set?


Here is are some comments on stackoverflow which let me think that 
FILE_APPEND_DATA with FILE_WRITE_DATA is no longer atomic:


https://stackoverflow.com/questions/20093571/difference-between-file-write-data-and-file-append-data#comment29995346_20108249

-- Hannes


Re: gitweb: HTML output is not always encoded in UTF-8 when using --fastcgi.

2018-09-08 Thread Julien Moutinho
Le sam. 08 sept. 2018 à 19:15:32 +0200, Julien Moutinho a écrit :
> As a quick workaround this hotpatch can even be put in $GITWEB_CONFIG
> by removing the `local` before `*FCGI::Stream::PRINT`.

Turns out to require more care than that,
due to $per_request_config reloading $GITWEB_CONFIG at each request,
overwriting FCGI::Stream::PRINT multiple times, messing the encoding.
This seems to work(around):
| if ($first_request) {
| my $enc = Encode::find_encoding('UTF-8');
| my $org = \::Stream::PRINT;
| no warnings 'redefine';
| *FCGI::Stream::PRINT = sub {
| my @OUTPUT = @_;
| for (my $i = 1; $i < @_; $i++) {
| $OUTPUT[$i] = $enc->encode($_[$i], 
Encode::FB_CROAK|Encode::LEAVE_SRC);
| }
| @_ = @OUTPUT;
| goto $org;
| };
| }


gitweb: HTML output is not always encoded in UTF-8 when using --fastcgi.

2018-09-08 Thread Julien Moutinho
Description
===
An old (2011) description of the problem is here:
https://stackoverflow.com/questions/7285215/nginx-fastcgi-utf-8-encoding-output-iso-8859-1-instead-of-utf8#answer-18149487

Basically, gitweb's HTML output is not always encoded in UTF-8
when using --fastcgi.


Reproduction

gitweb v2.18.0
perl   v5.28.0

| echo Système >test.git/description

According to the 2011 problem report,
the problem only appears when using gitweb.cgi --fastcgi
not when gitweb.cgi is spawned by fcgiwrap.

And apparently, the text must not contain one character
which cannot be correctly converted to ISO-8859-1,
or an UTF-8 encoding is done (not sure by what);
which made this bug harder to spot.


Explanation
===
According to Christian Hansen (chansen), the problem is that:
> FCGI streams are implemented using the older stream API,
> TIEHANDLE. Applying PerlIO layers using binmode() has no effect.
https://stackoverflow.com/questions/5005104/how-to-force-fastcgi-to-encode-form-data-as-utf-8-as-cgi-pm-has-option#answer-7097698
https://perldoc.perl.org/perltie.html

Indeed:
> FCGI.pm isn't Unicode aware,
> only characters within the range 0x00-0xFF are supported.
https://metacpan.org/pod/FCGI#LIMITATIONS

But, as stated in gitweb's to_utf8():
> gitweb writes out in utf-8
> thanks to "binmode STDOUT, ':utf8'" at beginning"


Correction
==
Christian Hansen suggested that:
"The proper solution would be to encode your data before outputting it,
but if thats not an option I can offer this hotpatch:"

| my $enc = Encode::find_encoding('UTF-8');
| my $org = \::Stream::PRINT;
| no warnings 'redefine';
| local *FCGI::Stream::PRINT = sub {
| my @OUTPUT = @_;
| for (my $i = 1; $i < @_; $i++) {
| $OUTPUT[$i] = $enc->encode($_[$i], 
Encode::FB_CROAK|Encode::LEAVE_SRC);
| }
| @_ = @OUTPUT;
| goto $org;
| };

As a quick workaround this hotpatch can even be put in $GITWEB_CONFIG
by removing the `local` before `*FCGI::Stream::PRINT`.


Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension

2018-09-08 Thread Martin Ågren
On Sat, 8 Sep 2018 at 16:04, Ben Peart  wrote:
> On 9/8/2018 2:29 AM, Martin Ågren wrote:
> > Maybe it all works out, e.g., so that when someone (brian) merges a
> > NewHash and runs the testsuite, this will fail consistently and in a
> > safe way. But I wonder if it would be too hard to avoid the hardcoded 24
> > already now.
>
> I can certainly change this to be:
>
> #define EOIE_SIZE (4 + GIT_SHA1_RAWSZ)
>
> which should (hopefully) make it easier to find this hard coded hash
> length in the sea of hard coded "20" and "160" (bits) littered through
> the codebase.

Yeah, that seems more grep-friendly.

Martin


Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases

2018-09-08 Thread brian m. carlson
On Fri, Sep 07, 2018 at 12:23:53AM -0700, Jonathan Nieder wrote:
> Ævar Arnfjörð Bjarmason wrote:
> > If this turns out to be a common use-case perhaps the easiest way to
> > support that would be to make the hashmap (optionally?) ordered, as Ruby
> > 1.9 did with their hash implementation:
> > https://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
> 
> That's about recording the order of insertion.  I'm talking about
> something much simpler: sorting an array (as preparation for emitting
> it) and binary searching to find an entry in that array.

If you want both a collection that is always sorted and has efficient
lookup for an arbitrary entry, then a B-tree is probably a better
choice.  That's the primitive that Rust provides for that situation.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH 5/5] split-index: smudge and add racily clean cache entries to split index

2018-09-08 Thread Duy Nguyen
On Thu, Sep 6, 2018 at 4:48 AM SZEDER Gábor  wrote:
>
> Ever since the split index feature was introduced [1], refreshing a
> split index is prone to a variant of the classic racy git problem.
>
> Consider the following sequence of commands updating the split index
> when the shared index contains a racily clean cache entry, i.e. an
> entry whose cached stat data matches with the corresponding file in
> the worktree and the cached mtime matches that of the index:
>
>   echo "cached content" >file
>   git update-index --split-index --add file
>   echo "dirty worktree" >file# size stays the same!
>   # ... wait ...
>   git update-index --add other-file
>
> Normally, when a non-split index is updated, then do_write_index()
> (the function responsible for writing all kinds of indexes, "regular",
> split, and shared) recognizes racily clean cache entries, and writes
> them with smudged stat data, i.e. with file size set to 0.  When
> subsequent git commands read the index, they will notice that the
> smudged stat data doesn't match with the file in the worktree, and
> then go on to check the file's content.
>
> In the above example, however, in the second 'git update-index'
> prepare_to_write_split_index() gathers all cache entries that should
> be written to the new split index.  Alas, this function never looks
> out for racily clean cache entries, and since the file's stat data in
> the worktree hasn't changed since the shared index was written, it
> won't be replaced in the new split index.  Consequently,
> do_write_index() doesn't even get this racily clean cache entry, and
> can't smudge its stat data.  Subsequent git commands will then see
> that the index has more recent mtime than the file and that the (not
> smudged) cached stat data still matches with the file in the worktree,
> and, ultimately, will erroneously consider the file clean.
>
> Modify prepare_to_write_split_index() to recognize racily clean cache
> entries, and mark them to be added to the split index.  This way
> do_write_index() will get these racily clean cache entries as well,
> and will then write them with smudged stat data to the new split
> index.

Ack. I was aware of the first half of of the racy solution but did not
pay attention to this smudging business.

I wonder if untracked cache is also racy like this. It also only has
half the racy solution because I only knew that much. I'll check this
later.

> Note that after this change if the index is split when it contains a
> racily clean cache entry, then a smudged cache entry will be written
> both to the new shared and to the new split indexes.  This doesn't
> affect regular git commands: as far as they are concerned this is just
> an entry in the split index replacing an outdated entry in the shared
> index.  It did affect a few tests in 't1700-split-index.sh', though,
> because they actually check which entries are stored in the split
> index; the previous patch made the necessary adjustments.  And racily
> clean cache entries and index splitting are rare enough to not worry
> about the resulting duplicated smudged cache entries, and the
> additional complexity required to prevent them is not worth it.

Yes. If we have to make updates (racy or not) we have to make updates,
and the version in the shared index becomes obsolete by design.
-- 
Duy


Re: [RFC PATCH v4 2/3] Show the call history when an alias is looping

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 03:34:34PM +0200, Duy Nguyen wrote:

> On Sat, Sep 8, 2018 at 12:44 AM Tim Schumacher  wrote:
> >
> > Just printing the command that the user entered is not particularly
> > helpful when trying to find the alias that causes the loop.
> >
> > Print the history of substituted commands to help the user find the
> > offending alias. Mark the entrypoint of the loop with "<==" and the
> > last command (which looped back to the entrypoint) with "==>".
> 
> An even simpler way to give this information is simply suggest the
> user tries again with GIT_TRACE=1. All alias expansion is shown there
> and we teach the user about GIT_TRACE. But your approach is probably
> more user friendly.

Good point. I'm OK with the amount of code here for the nicer message
(but would be happy either way).

If we were going to track cross-process loops like Ævar suggested, I
think I'd rather go with a simple counter and just ask the user to run
with GIT_TRACE when it exceeds some maximum sanity value. For two
reasons:

  1. Passing a counter through the environment is way simpler than
 an arbitrarily-sized list.

  2. When you get into multiple processes, there's potentially more
 going on than just Git commands. You might have a git command which
 runs a hook which runs a third party script which runs a git
 command, which runs a hook, and so on. That full dump is going to
 be more useful.

-Peff


[PATCH v2] config.mak.dev: add -Wformat-security

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 03:38:19PM +0200, Duy Nguyen wrote:

> On Fri, Sep 7, 2018 at 8:21 PM Jeff King  wrote:
> >
> > We currently build cleanly with -Wformat-security, and it's
> > a good idea to make sure we continue to do so (since calls
> > that trigger the warning may be security vulnerabilities).
> 
> Nice. I had this flag in my config.mak too before switching to
> DEVELOPER=1. Didn't realize I lost the flag until now.

I thought I used to use it, too, but I realized recently that I never
did (I experimented with format-nonliteral a long time ago but gave up).

AFAIK we've never actually had a case that this triggered, because it's
so narrow (as opposed to format-nonliteral).

> > diff --git a/config.mak.dev b/config.mak.dev
> > index 9a998149d9..f832752454 100644
> > --- a/config.mak.dev
> > +++ b/config.mak.dev
> > @@ -14,6 +14,7 @@ CFLAGS += -Wpointer-arith
> >  CFLAGS += -Wstrict-prototypes
> >  CFLAGS += -Wunused
> >  CFLAGS += -Wvla
> > +CFLAGS += -Wformat-security
> 
> Maybe keep it sorted

Heh, I didn't even notice the original was sorted. Here it is in the
right spot:

-- >8 --
Subject: config.mak.dev: add -Wformat-security

We currently build cleanly with -Wformat-security, and it's
a good idea to make sure we continue to do so (since calls
that trigger the warning may be security vulnerabilities).

Note that we cannot use the stronger -Wformat-nonliteral, as
there are case where we are clever with passing around
pointers to string literals. E.g., bisect_rev_setup() takes
bad_format and good_format parameters. These ultimately come
from literals, but they still trigger the warning.

Some of these might be fixable (e.g., by passing flags from
which we locally select a format), and might even be worth
fixing (not because of security, but just because it's an
easy mistake to pass the wrong format). But there are other
cases which are likely quite hard to fix (we actually
generate formats in a local buffer in some cases). So let's
punt on that for now and start with -Wformat-security, which
is supposed to catch the most important cases.

Signed-off-by: Jeff King 
---
 config.mak.dev | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.dev b/config.mak.dev
index 9a998149d9..92d268137f 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -7,6 +7,7 @@ CFLAGS += -pedantic
 CFLAGS += -DUSE_PARENS_AROUND_GETTEXT_N=0
 endif
 CFLAGS += -Wdeclaration-after-statement
+CFLAGS += -Wformat-security
 CFLAGS += -Wno-format-zero-length
 CFLAGS += -Wold-style-definition
 CFLAGS += -Woverflow
-- 
2.19.0.rc2.594.gc4806cd463



Re: [PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing

2018-09-08 Thread Jeff King
On Fri, Sep 07, 2018 at 11:28:41PM -0400, Todd Zullinger wrote:

> > So this fix looks fine. It might be worth a comment above the creation
> > of expect_cookies.txt to mention it must be in sorted order (of course
> > anybody modifying it would see a test failure).
> 
> I thought about running the expect_cookies.txt file through
> sort as well.  That would ensure that both files were using
> the same sorting.  Whether that's needed on any platform
> now, I don't know.  Maybe that would be a useful way to
> protect against future edits to the expect_cookies.txt file
> catching the editor?

Yes, I think sorting the expect file would work fine. I'm OK with that,
or just leaving a comment. The comment has the bonus that it does not
cost an extra process at runtime. I'd probably use a sort if we expected
the list to be long and complicated, since it makes life easier on a
future developer. But since there are only 2 lines, I don't think it's a
big deal either way (or even just leaving it as-is without a comment is
probably OK).

> I thought there might be a test function to sort the output,
> but I was (incorrectly) thinking of check_access_log() which
> Gábor added in e8b3b2e275 ("t/lib-httpd: avoid occasional
> failures when checking access.log", 2018-07-12).
> 
> Perhaps it would be useful to have a test_cmp_sorted() to do
> the simple dance of sorting the actual & expected.  I
> haven't looked through the tests to see how often such a
> function might be useful.

I suspect it would occasionally be useful, but I don't recall it coming
up all that often.

-Peff


Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never

2018-09-08 Thread Jeff King
On Sat, Sep 01, 2018 at 08:34:41PM -0400, Noam Postavsky wrote:

> On 6 August 2018 at 17:26, Jeff King  wrote:
> 
> > I suspect it still has a bug, which is that it is handling this
> > first-parent-goes-left case, but probably gets the straight-parent case
> > wrong. But at least in this form, I think it is obvious to see where
> > that bug is (the "three" in the comment is not accurate in that latter
> > case, and it should be two).
> 
> Yes, thanks, it makes a lot more sense this way. I believe the
> attached handles both parent types correctly.

Great (and sorry for the delayed response).

Let me see if I understand it...

> diff --git a/graph.c b/graph.c
> index e1f6d3bdd..478c86dfb 100644
> --- a/graph.c
> +++ b/graph.c
> @@ -848,21 +848,45 @@ static int graph_draw_octopus_merge(struct git_graph 
> *graph,
>   struct strbuf *sb)
>  {
>   /*
> -  * Here dashless_commits represents the number of parents
> -  * which don't need to have dashes (because their edges fit
> -  * neatly under the commit).
> +  * Here dashless_commits represents the number of parents which don't
> +  * need to have dashes (because their edges fit neatly under the
> +  * commit).  And dashful_commits are the remaining ones.
>*/
>   const int dashless_commits = 2;

Why is dashless commits always going to be 2? I thought at first that it
was representing the merge commit itself, plus the line of development
to the left. But the latter could be arbitrarily sized.

But that's not what it is, because we handle that by using
graph->commit_index in the iteration below. So I get that the merge
commit itself does not need a dash. What's the other one?

> + int dashful_commits = graph->num_parents - dashless_commits;

OK, this makes sense.

> + /*
> +  * Usually, each parent gets its own column, like this:
> +  *
> +  * | *-.
> +  * | |\ \
> +  * | | | *
> +  *
> +  * Sometimes the first parent goes into an existing column, like this:
> +  *
> +  * | *-.
> +  * | |\ \
> +  * |/ / /
> +  *
> +  */
> + int parent_in_existing_cols = graph->num_parents -
> + (graph->num_new_columns - graph->num_columns);

Ah, OK, this is the magic part: we compare num_new_columns versus
num_columns to see which case we have. Makes sense. And this comment is
very welcome to explain it visually.

> + /*
> +  * Draw the dashes.  We start in the column following the
> +  * dashless_commits, but subtract out the parent which goes to an
> +  * existing column: we've already counted that column in commit_index.
> +  */
> + int first_col = graph->commit_index + dashless_commits
> + - parent_in_existing_cols;
> + int i;

OK, so we start at the commit_index, which makes sense. We skip past the
dashless commits (which includes the merge itself, plus the other
mystery one). And then we go back by the parents in existing columns,
which I think is either 1 or 2.

And I think that may be the root of my confusion. The other "dashless"
commit is the parent we've already printed before hitting this function,
the left-hand line that goes all the way down below where we print the
other parents.

So I think this is doing the right thing. I'm not sure if there's a
better way to explain "dashless" or not.

> + for (i = 0; i < dashful_commits; i++) {
> + strbuf_write_column(sb, >new_columns[i+first_col], '-');
> + strbuf_write_column(sb, >new_columns[i+first_col],
> + i == dashful_commits-1 ? '.' : '-');
>   }

And this loop is nice and simple now. Good.

So I think this patch looks right. This is all sufficiently complex that
we probably want to add something to the test suite. For reference,
here's how I hacked up your original script to put more commits on the
left:

--- test-multiway-merge.sh.orig 2018-09-08 12:04:23.007468601 -0400
+++ test-multiway-merge.sh  2018-09-08 12:11:02.267750789 -0400
@@ -25,6 +25,11 @@
 
 
 "$GIT" init
+for base in 1 2 3 4; do
+echo base-$base >foo
+git add foo
+git commit -m base-$base
+done
 echo 0 > foo
 "$GIT" add foo
 "$GIT" commit -m 0
@@ -47,4 +52,10 @@
 "$GIT" checkout m
 "$GIT" merge -m "merge a b $*" a b "$@"
 
+sleep 1
+for i in 1 2 3 4; do
+git checkout -b $i-prime master~$i
+git commit --allow-empty -m side-$i
+done
+
 valgrind "$GIT" log --oneline --graph --all

Then running it as "test-multiway-merge d e f" gives a nice wide graph
that should show any off-by-one mistakes. We should be able to do away
with the "sleep" in a real test if we make use of test_commit(), which
advances GIT_COMMITTER_DATE by one second for each commit.

Do you feel comfortable trying to add something to the test suite for
this?

-Peff


Re: Git in Outreachy Dec-Mar?

2018-09-08 Thread Jeff King
On Sat, Sep 08, 2018 at 10:57:46AM +0200, Christian Couder wrote:

> On Thu, Sep 6, 2018 at 9:31 PM, Jeff King  wrote:
> > On Thu, Sep 06, 2018 at 11:51:49AM +0200, Christian Couder wrote:
> >
> >> Yeah, I think the https://git.github.io/Outreachy-17/ is not actually 
> >> necessary.
> >
> > I think it still may be helpful for explaining in further detail things
> > like #leftoverbits (though I see you put some of that in your project
> > description).
> 
> You mean in https://git.github.io/Outreachy-17/ or somewhere else?
> 
> It is already described in https://git.github.io/SoC-2018-Microprojects/.

Yeah, I meant it may still be useful to have an Outreachy page for our
community explaining community-specific procedures. I agree it's mostly
redundant with what's on the GSoC page, but it might be easier on
applicants to have a page tailored directly towards Outreachy. But I
haven't gone over the material as recently as you, so I'd leave that
decision to you.

-Peff


Re: [PATCH v3 3/4] read-cache: load cache extensions on a worker thread

2018-09-08 Thread Ben Peart




On 9/7/2018 5:10 PM, Junio C Hamano wrote:

Ben Peart  writes:


+struct load_index_extensions
+{
+#ifndef NO_PTHREADS
+   pthread_t pthread;
+#endif
+   struct index_state *istate;
+   void *mmap;
+   size_t mmap_size;
+   unsigned long src_offset;


If the file format only allows uint32_t on any platform, perhaps
this is better specified as uint32_t?  Or if this is offset into
a mmap'ed region of memory, size_t may be more appropriate.

Same comment applies to "extension_offset" we see below (which in
turn means the returned type of read_eoie_extension() function may
want to match).


+ };


Space before '}'??


+
+static void *load_index_extensions(void *_data)
+{
+   struct load_index_extensions *p = _data;


Perhaps we are being superstitious, but I think our code try to
avoid leading underscore when able, i.e.

load_index_extensions(void *data_)
{
struct load_index_extensions *p = data;


That's what I get for copying code from elsewhere in the source. :-)

static void *preload_thread(void *_data)
{
int nr;
struct thread_data *p = _data;

since there isn't any need for the underscore at all, I'll just make it:

static void *load_index_extensions(void *data)
{
struct load_index_extensions *p = data;




+   unsigned long src_offset = p->src_offset;
+
+   while (src_offset <= p->mmap_size - the_hash_algo->rawsz - 8) {
+   /* After an array of active_nr index entries,
+* there can be arbitrary number of extended
+* sections, each of which is prefixed with
+* extension name (4-byte) and section length
+* in 4-byte network byte order.
+*/
+   uint32_t extsize;
+   memcpy(, (char *)p->mmap + src_offset + 4, 4);
+   extsize = ntohl(extsize);


The same "ntohl(), not get_be32()?" question as the one for the
previous step applies here, too.  I think the answer is "the
original was written that way" and that is acceptable, but once this
series lands, we may want to review the whole file and see if it is
worth making them consistent with a separate clean-up patch.



Makes sense, I'll add a cleanup patch to fix the inconsistency and have 
them use get_be32().



I think mmap() and munmap() are the only places that wants p->mmap
and mmap parameters passed around in various callchains to be of
type "void *"---I wonder if it is simpler to use "const char *"
throughout and only cast it to "void *" when necessary (I suspect
that there is nowhere we need to cast to or from "void *" explicitly
if we did so---assignment and argument passing would give us an
appropriate cast for free)?


Sure, I'll add minimizing the casting to the clean up patch.




+   if (read_index_extension(p->istate,
+   (const char *)p->mmap + src_offset,
+   (char *)p->mmap + src_offset + 8,
+   extsize) < 0) {
+   munmap(p->mmap, p->mmap_size);
+   die("index file corrupt");
+   }
+   ...
@@ -1907,6 +1951,11 @@ ...
...
+   p.mmap = mmap;
+   p.mmap_size = mmap_size;
+
+#ifndef NO_PTHREADS
+   nr_threads = git_config_get_index_threads();
+   if (!nr_threads)
+   nr_threads = online_cpus();
+
+   if (nr_threads >= 2) {
+   extension_offset = read_eoie_extension(mmap, mmap_size);
+   if (extension_offset) {
+   /* create a thread to load the index extensions */
+   p.src_offset = extension_offset;
+   if (pthread_create(, NULL, load_index_extensions, 
))
+   die(_("unable to create 
load_index_extensions_thread"));
+   }
+   }
+#endif


Makes sense.



Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension

2018-09-08 Thread Ben Peart




On 9/8/2018 2:29 AM, Martin Ågren wrote:

On Fri, 7 Sep 2018 at 22:24, Ben Peart  wrote:

Ben Peart  writes:



- 160-bit SHA-1 over the extension types and their sizes (but not
their contents).  E.g. if we have "TREE" extension that is N-bytes
long, "REUC" extension that is M-bytes long, followed by "EOIE",
then the hash would be:



The purpose of the SHA isn't to detect disk corruption (we already have
a SHA for the entire index that can serve that purpose) but to help
ensure that this was actually a valid EOIE extension and not a lucky
random set of bytes. [...]



+#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */



+the_hash_algo->init_fn();
+while (src_offset < mmap_size - the_hash_algo->rawsz - 
EOIE_SIZE_WITH_HEADER) {

[...]

+the_hash_algo->final_fn(hash, );
+if (hashcmp(hash, (unsigned char *)index))
+return 0;
+
+/* Validate that the extension offsets returned us back to the eoie 
extension. */
+if (src_offset != mmap_size - the_hash_algo->rawsz - EOIE_SIZE_WITH_HEADER)
+return 0;


Besides the issue you and Junio discussed with "should we document this
as being SHA-1 or NewHash" (or "the hash algo"), it seems to me that
this implementation is living somewhere between using SHA-1 and "the
hash algo". The hashing uses `the_hash_algo`, but the hash size is
hardcoded at 20 bytes.

Maybe it all works out, e.g., so that when someone (brian) merges a
NewHash and runs the testsuite, this will fail consistently and in a
safe way. But I wonder if it would be too hard to avoid the hardcoded 24
already now.

Martin



I can certainly change this to be:

#define EOIE_SIZE (4 + GIT_SHA1_RAWSZ)

which should (hopefully) make it easier to find this hard coded hash 
length in the sea of hard coded "20" and "160" (bits) littered through 
the codebase.


Re: [PATCH 1/5] t1700-split-index: drop unnecessary 'grep'

2018-09-08 Thread Duy Nguyen
On Thu, Sep 6, 2018 at 4:48 AM SZEDER Gábor  wrote:
>
> The test 'disable split index' in 't1700-split-index.sh' runs the
> following pipeline:
>
>   cmd | grep  | sed s///
>
> Drop that 'grep' from the pipeline, and let 'sed' take over its
> duties.

Years of using sed and I never realized -n could simplify this
pattern. Thank you.
-- 
Duy


Re: Temporary git files for the gitdir created on a separate drive in workdir

2018-09-08 Thread Duy Nguyen
On Sat, Sep 8, 2018 at 3:09 PM Duy Nguyen  wrote:
>
> On Sat, Sep 8, 2018 at 11:28 AM Hultqvist  wrote:
> >
> > The bash commands are using a git and bash bundle that was installed
> > in parallel with gitextensions(a gui for git)
> >
> > G:\Min enhet> set GIT_TRACE_SETUP=1
> > G:\Min enhet> git st
> > 10:40:28.881927 trace.c:318 setup: git_dir:
> > C:/Users/hultqvist/Drive.git
> > 10:40:28.881927 trace.c:319 setup: git_common_dir:
> > C:/Users/hultqvist/Drive.git
> > 10:40:28.881927 trace.c:320 setup: worktree: G:/Min enhet
> > 10:40:28.881927 trace.c:321 setup: cwd: G:/Min enhet
> > 10:40:28.881927 trace.c:322 setup: prefix: (null)
> > 10:40:28.882930 chdir-notify.c:67   setup: chdir from 'G:/Min
> > enhet' to 'G:/Min enhet'
>
> Unfortunately this looks good. Whenever those files 'index',
> 'config'... are created, they should always be created in
> C:\Users\hultqvist\Drive.git, not G:\Min enhet, including their
> temporary versions. I don't know if there are any more changes on the
> windows fork that could affect this though, I only checked git.git.

BTW do you notice these files showing up after any particular command
or they're always there after cloning? Perhaps some command got the
".git" directory discovery wrong and assumed $GIT_DIR=$GIT_WORK_TREE.
We have a much bigger problem then.
-- 
Duy


Re: [PATCH] config.mak.dev: add -Wformat-security

2018-09-08 Thread Duy Nguyen
On Fri, Sep 7, 2018 at 8:21 PM Jeff King  wrote:
>
> We currently build cleanly with -Wformat-security, and it's
> a good idea to make sure we continue to do so (since calls
> that trigger the warning may be security vulnerabilities).

Nice. I had this flag in my config.mak too before switching to
DEVELOPER=1. Didn't realize I lost the flag until now.

> Signed-off-by: Jeff King 
> ---
>  config.mak.dev | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/config.mak.dev b/config.mak.dev
> index 9a998149d9..f832752454 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -14,6 +14,7 @@ CFLAGS += -Wpointer-arith
>  CFLAGS += -Wstrict-prototypes
>  CFLAGS += -Wunused
>  CFLAGS += -Wvla
> +CFLAGS += -Wformat-security

Maybe keep it sorted
-- 
Duy


Re: [RFC PATCH v4 2/3] Show the call history when an alias is looping

2018-09-08 Thread Duy Nguyen
On Sat, Sep 8, 2018 at 12:44 AM Tim Schumacher  wrote:
>
> Just printing the command that the user entered is not particularly
> helpful when trying to find the alias that causes the loop.
>
> Print the history of substituted commands to help the user find the
> offending alias. Mark the entrypoint of the loop with "<==" and the
> last command (which looped back to the entrypoint) with "==>".

An even simpler way to give this information is simply suggest the
user tries again with GIT_TRACE=1. All alias expansion is shown there
and we teach the user about GIT_TRACE. But your approach is probably
more user friendly.
-- 
Duy


Re: [RFC PATCH v4 1/3] Add support for nested aliases

2018-09-08 Thread Duy Nguyen
On Sat, Sep 8, 2018 at 12:44 AM Tim Schumacher  wrote:
> +   /*
> +* It could be an alias -- this works around the insanity
>  * of overriding "git log" with "git show" by having
>  * alias.log = show
>  */

I think this comment block is about the next two lines you just
deleted. So delete it to instead of fixing style.

> -   if (done_alias)
> -   break;
> if (!handle_alias(argcp, argv))
> break;
> done_alias = 1;
> }
-- 
Duy


Re: [PATCH v3 0/4] read-cache: speed up index load through parallelization

2018-09-08 Thread Duy Nguyen
On Fri, Sep 7, 2018 at 7:21 PM Junio C Hamano  wrote:
>
> Ben Peart  writes:
>
> > On further investigation with the previous patch, I noticed that my test
> > repos didn't contain the cache tree extension in their index. After doing a
> > commit to ensure they existed, I realized that in some instances, the time
> > to load the cache tree exceeded the time to load all the cache entries in
> > parallel.  Because the thread to read the cache tree was started last (due
> > to having to parse through all the cache entries first) we weren't always
> > getting optimal performance.
> >
> > To better optimize for this case, I decided to write the EOIE extension
> > as suggested by Junio [1] in response to my earlier multithreading patch
> > series [2].  This enables me to spin up the thread to load the extensions
> > earlier as it no longer has to parse through all the cache entries first.
>
> Hmph. I kinda liked the simplicity of the previous one, but if we
> need to start reading the extension sections sooner by eliminating
> the overhead to scan the cache entries, perhaps we should bite the
> bullet now.

My view is slightly different. If we have to optimize might as well
try to squeeze the best out of it. Simplicity is already out of the
window at this point (but maintainability remains).

> > The big changes in this iteration are:
> >
> > - add the EOIE extension
> > - update the index extension worker thread to start first
>
> I guess I'd need to see the actual patch to find this out, but once
> we rely on a new extension, then we could omit scanning the main
> index even to partition the work among workers (i.e. like the topic
> long ago, you can have list of pointers into the main index to help
> partitioning, plus reset the prefix compression used in v4).  I
> think you didn't get that far in this round, which is good.  If the
> gain with EOIE alone (and starting the worker for the extension
> section early) is large enough without such a pre-computed work
> partition table, the simplicity of this round may give us a good
> stopping point.

I suspect the reduced gain in 1M files case compared to 100k files in
4/4 is because of scanning the index to split work to worker threads.
Since the index is now larger, the scanning takes more time before we
can start worker threads and we gain less from parallelization. I have
not experimented to see if this is true or there is something else.

> > This patch conflicts with Duy's patch to remove the double memory copy and
> > pass in the previous ce instead.  The two will need to be merged/reconciled
> > once they settle down a bit.
>
> Thanks.  I have a feeling that 67922abb ("read-cache.c: optimize
> reading index format v4", 2018-09-02) is already 'next'-worthy
> and ready to be built on, but I'd prefer to hear from Duy to double
> check.

Yes I think it's good. I ran the entire test suite with v4 just to
double check (and thinking of testing v4 version in travis too).
-- 
Duy


Re: Temporary git files for the gitdir created on a separate drive in workdir

2018-09-08 Thread Duy Nguyen
On Sat, Sep 8, 2018 at 11:28 AM Hultqvist  wrote:
>
> The bash commands are using a git and bash bundle that was installed
> in parallel with gitextensions(a gui for git)
>
> G:\Min enhet> set GIT_TRACE_SETUP=1
> G:\Min enhet> git st
> 10:40:28.881927 trace.c:318 setup: git_dir:
> C:/Users/hultqvist/Drive.git
> 10:40:28.881927 trace.c:319 setup: git_common_dir:
> C:/Users/hultqvist/Drive.git
> 10:40:28.881927 trace.c:320 setup: worktree: G:/Min enhet
> 10:40:28.881927 trace.c:321 setup: cwd: G:/Min enhet
> 10:40:28.881927 trace.c:322 setup: prefix: (null)
> 10:40:28.882930 chdir-notify.c:67   setup: chdir from 'G:/Min
> enhet' to 'G:/Min enhet'

Unfortunately this looks good. Whenever those files 'index',
'config'... are created, they should always be created in
C:\Users\hultqvist\Drive.git, not G:\Min enhet, including their
temporary versions. I don't know if there are any more changes on the
windows fork that could affect this though, I only checked git.git.
-- 
Duy


Re: Temporary git files for the gitdir created on a separate drive in workdir

2018-09-08 Thread Hultqvist
The bash commands are using a git and bash bundle that was installed
in parallel with gitextensions(a gui for git)

G:\Min enhet> set GIT_TRACE_SETUP=1
G:\Min enhet> git st
10:40:28.881927 trace.c:318 setup: git_dir:
C:/Users/hultqvist/Drive.git
10:40:28.881927 trace.c:319 setup: git_common_dir:
C:/Users/hultqvist/Drive.git
10:40:28.881927 trace.c:320 setup: worktree: G:/Min enhet
10:40:28.881927 trace.c:321 setup: cwd: G:/Min enhet
10:40:28.881927 trace.c:322 setup: prefix: (null)
10:40:28.882930 chdir-notify.c:67   setup: chdir from 'G:/Min
enhet' to 'G:/Min enhet'
On branch master
Your branch is up to date with 'nas/master'.

nothing to commit, working tree clean

$ cat .git
gitdir: C:\Users\hultqvist\Drive.git

$cat C:/Users/hultqvist/Drive.git/config
[core]
repositoryformatversion = 0
filemode = false
bare = false
logallrefupdates = true
ignorecase = true
autocrlf = false
...

G:\Min enhet> git version
$ git version
git version 2.18.0.windows.1




Den lör 8 sep. 2018 kl 08:08 skrev Duy Nguyen :
>
> On Fri, Sep 7, 2018 at 6:48 PM Junio C Hamano  wrote:
> >
> > Hultqvist  writes:
> >
> > > Considering that the gitdir could be located on a different drive than
> > > the workdir wouldn't it make more sense to create the temporary files
> > > in a subdirectory inside the gitdir rather tan in the workdir?
> >
> > I do not think we intend to create temporary files, whose final
> > destination is somewhere under $GIT_DIR/, in any working tree;
> > rather, I think we try to create them inside $GIT_DIR (or possibly
> > if the destination is a file in a subdirectory of $GIT_DIR, then in
> > the same subdirectory).  What you are seeing definitely smells like
> > a bug in the worktree code, perhaps getting confused by the fact
> > that the full path to these places look "unusual" by starting with a
> > single alphabet followed by a colon (IOW, this may manifest only in
> > Windows port).
>
> I agree. Auditing the setup code did not reveal anything though. Our
> code should recognize these unusual Windows paths as absolute and
> while I spotted an incorrect use of '/' (instead of is_dir_sep) it
> does not explain the problem here.
>
> Hultqvist, if you set environment variable GIT_TRACE_SETUP to 1 and
> run "git status" in G:\Test1, what does it say?
> --
> Duy


Re: [PATCH 2/2] mingw: fix mingw_open_append to work with named pipes

2018-09-08 Thread Johannes Sixt

Am 07.09.2018 um 20:19 schrieb Jeff Hostetler via GitGitGadget:

From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
  compat/mingw.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 858ca14a57..ef03bbe5d2 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -355,7 +355,7 @@ static int mingw_open_append(wchar_t const *wfilename, int 
oflags, ...)
 * FILE_SHARE_WRITE is required to permit child processes
 * to append to the file.
 */
-   handle = CreateFileW(wfilename, FILE_APPEND_DATA,
+   handle = CreateFileW(wfilename, FILE_WRITE_DATA | FILE_APPEND_DATA,
FILE_SHARE_WRITE | FILE_SHARE_READ,
NULL, create, FILE_ATTRIBUTE_NORMAL, NULL);
if (handle == INVALID_HANDLE_VALUE)



I did not go with this version because the documentation 
https://docs.microsoft.com/en-us/windows/desktop/fileio/file-access-rights-constants 
says:


FILE_APPEND_DATA: For a file object, the right to append data to the 
file. (For local files, write operations will not overwrite existing 
data if this flag is specified without FILE_WRITE_DATA.) [...]


which could be interpreted as: Only if FILE_WRITE_DATA is not set, we 
have the guarantee that existing data in local files is not overwritten, 
i.e., new data is appended atomically.


Is this interpretation too narrow and we do get atomicity even when 
FILE_WRITE_DATA is set?


-- Hannes


Re: Git in Outreachy Dec-Mar?

2018-09-08 Thread Christian Couder
On Thu, Sep 6, 2018 at 9:34 PM, Jeff King  wrote:
>
> By the way, I've got funding from GitHub lined up, so we are good on
> that front.

Great, thanks!


Re: Git in Outreachy Dec-Mar?

2018-09-08 Thread Christian Couder
On Thu, Sep 6, 2018 at 9:31 PM, Jeff King  wrote:
> On Thu, Sep 06, 2018 at 11:51:49AM +0200, Christian Couder wrote:
>
>> Yeah, I think the https://git.github.io/Outreachy-17/ is not actually 
>> necessary.
>
> I think it still may be helpful for explaining in further detail things
> like #leftoverbits (though I see you put some of that in your project
> description).

You mean in https://git.github.io/Outreachy-17/ or somewhere else?

It is already described in https://git.github.io/SoC-2018-Microprojects/.

>> I did that for the "Improve `git bisect`" project. As the
>> "coordinator", you will need to approve that project.
>
> Thanks. I approved it, though a few of the descriptions are a little
> funny. For instance, the text says "we use an issue tracker", which then
> links to public-inbox. I assume this is because you filled in a field
> for "issue tracker" and then the system generated the text.

Yeah, it was generated from fields that I filled in.

> I don't know if there's a way go into more detail there.

I don't think so, though we could perhaps improve our web pages.


Re: [PATCH v2 10/11] builtin rebase: only store fully-qualified refs in `options.head_name`

2018-09-08 Thread SZEDER Gábor
On Tue, Sep 04, 2018 at 02:27:20PM -0700, Pratik Karki via GitGitGadget wrote:
> From: Pratik Karki 
> 
> When running a rebase on a detached HEAD, we currently store the string
> "detached HEAD" in options.head_name. That is a faithful translation of
> the shell script version, and we still kind of need it for the purposes of
> the scripted backends.
> 
> It is poor style for C, though, where we would really only want a valid,
> fully-qualified ref name as value, and NULL for detached HEADs, using
> "detached HEAD" for display only. Make it so.
> 
> Signed-off-by: Pratik Karki 
> Signed-off-by: Johannes Schindelin 
> ---
>  builtin/rebase.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index d45f8f9008..afc75fe731 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c

> @@ -575,7 +579,8 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>   branch_name = options.head_name;
>  
>   } else {
> - options.head_name = xstrdup("detached HEAD");
> + free(options.head_name);
> + options.head_name = NULL;

Please use FREE_AND_NULL(options.head_name) here.

>   branch_name = "HEAD";
>   }
>   if (get_oid("HEAD", _head))
> -- 
> gitgitgadget
> 


Re: [PATCH 0/4] un-breaking pack-objects with bitmaps

2018-09-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 01 2018, Jeff King wrote:

> On Fri, Aug 31, 2018 at 06:55:58PM -0400, Jeff King wrote:
>
>> On Fri, Aug 31, 2018 at 05:23:17PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> > On Tue, Aug 21 2018, Jeff King wrote:
>> >
>> > > +int bitmap_has_sha1_in_uninteresting(struct bitmap_index *bitmap_git,
>> > > + const unsigned char *sha1)
>> > > +{
>> > > +int pos;
>> > > +
>> > > +if (!bitmap_git)
>> > > +return 0; /* no bitmap loaded */
>> > > +if (!bitmap_git->result)
>> > > +BUG("failed to perform bitmap walk before querying");
>> >
>> > Some part of what calls this completely breaks pushing from the "next"
>> > branch when you have local bitmaps (we *really* should have some tests
>> > for this...).
>>
>> Yikes, thanks for reporting. I agree we need better tests here.
>
> OK, here is the fix. Since the problem is in 'next', this is done as a
> patch on top of jk/pack-delta-reuse-with-bitmap. But since we're set to
> rewind 'next' post-release anyway, we could squash it directly into
> 30cdc33fba from the original series. That would help later bisections
> from running into it, which may be worth it as it's a pretty severe
> breakage.  Or maybe not:

Junio: Just a reminder that next is still broken with this, and I see
e.g. the Debian "experimental" has the bug but not the fix at this
point.

I'm just reverting jk/pack-delta-reuse-with-bitmap out of next when
building my own package of git, but I think this really should be fixed
in that branch, either by merging the fix down or reverting the original
series out of next, I think just merging the fix down makes sense, but
have no strong opinion on it.


Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension

2018-09-08 Thread Martin Ågren
On Fri, 7 Sep 2018 at 22:24, Ben Peart  wrote:
> > Ben Peart  writes:

> >> - 160-bit SHA-1 over the extension types and their sizes (but not
> >> their contents).  E.g. if we have "TREE" extension that is N-bytes
> >> long, "REUC" extension that is M-bytes long, followed by "EOIE",
> >> then the hash would be:

> The purpose of the SHA isn't to detect disk corruption (we already have
> a SHA for the entire index that can serve that purpose) but to help
> ensure that this was actually a valid EOIE extension and not a lucky
> random set of bytes. [...]

> >> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */

> >> +the_hash_algo->init_fn();
> >> +while (src_offset < mmap_size - the_hash_algo->rawsz - 
> >> EOIE_SIZE_WITH_HEADER) {
[...]
> >> +the_hash_algo->final_fn(hash, );
> >> +if (hashcmp(hash, (unsigned char *)index))
> >> +return 0;
> >> +
> >> +/* Validate that the extension offsets returned us back to the eoie 
> >> extension. */
> >> +if (src_offset != mmap_size - the_hash_algo->rawsz - 
> >> EOIE_SIZE_WITH_HEADER)
> >> +return 0;

Besides the issue you and Junio discussed with "should we document this
as being SHA-1 or NewHash" (or "the hash algo"), it seems to me that
this implementation is living somewhere between using SHA-1 and "the
hash algo". The hashing uses `the_hash_algo`, but the hash size is
hardcoded at 20 bytes.

Maybe it all works out, e.g., so that when someone (brian) merges a
NewHash and runs the testsuite, this will fail consistently and in a
safe way. But I wonder if it would be too hard to avoid the hardcoded 24
already now.

Martin


Re: Temporary git files for the gitdir created on a separate drive in workdir

2018-09-08 Thread Duy Nguyen
On Fri, Sep 7, 2018 at 6:48 PM Junio C Hamano  wrote:
>
> Hultqvist  writes:
>
> > Considering that the gitdir could be located on a different drive than
> > the workdir wouldn't it make more sense to create the temporary files
> > in a subdirectory inside the gitdir rather tan in the workdir?
>
> I do not think we intend to create temporary files, whose final
> destination is somewhere under $GIT_DIR/, in any working tree;
> rather, I think we try to create them inside $GIT_DIR (or possibly
> if the destination is a file in a subdirectory of $GIT_DIR, then in
> the same subdirectory).  What you are seeing definitely smells like
> a bug in the worktree code, perhaps getting confused by the fact
> that the full path to these places look "unusual" by starting with a
> single alphabet followed by a colon (IOW, this may manifest only in
> Windows port).

I agree. Auditing the setup code did not reveal anything though. Our
code should recognize these unusual Windows paths as absolute and
while I spotted an incorrect use of '/' (instead of is_dir_sep) it
does not explain the problem here.

Hultqvist, if you set environment variable GIT_TRACE_SETUP to 1 and
run "git status" in G:\Test1, what does it say?
-- 
Duy