Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-04-08 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Yes, it is a workaround.  Making shell faster on windows would of
>> course be one possible solution to make t/t*.sh scripts go faster
>> ;-)  Or update parts of t/t*.sh so that the equivalent test coverage
>> can be kept while running making them go faster on Windows.
>
> What makes you think that I did not try my hardest for around 812 hours in
> total so far to make the shell faster?

Nowhere in these four lines I ever said that I think you did not
work hard to solve the performance issues you have.



Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-04-03 Thread Johannes Schindelin
Hi Duy,

On Tue, 3 Apr 2018, Duy Nguyen wrote:

> On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelin
>  wrote:
> > It is very frustrating to spend that much time with only little gains
> > here and there (and BusyBox-w32 is simply not robust enough yet, apart
> > from also not showing a significant improvement in performance).
> 
> You still use busybox-w32?

Yes.

> It's amazing that people still use it after the linux subsystem comes.

I use WSL myself. But you need to realize that WSL is only available on
Windows 10 (many users still use Windows 7), and it is a little tricky to
get to work in Docker containers, I heard, so I did not even try.

Also, many Windows users are unfamiliar with Linux, and forcing them to
learn and install a Linux distribution on their machine when all they want
is to use Git is a bit... much.

> busybox has a lot of commands built in (i.e. no new processes) and
> unless rmyorston did something more, the "fork" in ash shell should be
> as cheap as it could be: it simply serializes data and sends to the new
> process.

Yes, I had the pleasure of reading that code.

It might surprise you, but I had to come up with quite a bit of patches to
make the test suite pass. And it does not really pass, as I randomly get
hangs...

> If performance does not improve, I guess the process creation cost
> dominates. There's not much we could do except moving away from the
> zillion processes test framework: either something C-based or another
> scripting language (ok I don't want to bring this up again)

There is no need to guess. I now have .pdb files, and once I have a good
example of a shell script construct that is particularly slow, and once I
find some time to work on it, I will dig into the bottlenecks.

Ciao,
Dscho


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-04-03 Thread Duy Nguyen
On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelin
 wrote:
> It is very frustrating to spend that much time with only little gains here
> and there (and BusyBox-w32 is simply not robust enough yet, apart from
> also not showing a significant improvement in performance).

You still use busybox-w32? It's amazing that people still use it after
the linux subsystem comes. busybox has a lot of commands built in
(i.e. no new processes) and unless rmyorston did something more, the
"fork" in ash shell should be as cheap as it could be: it simply
serializes data and sends to the new process.

If performance does not improve, I guess the process creation cost
dominates. There's not much we could do except moving away from the
zillion processes test framework: either something C-based or another
scripting language (ok I don't want to bring this up again)
-- 
Duy


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-04-03 Thread Johannes Schindelin
Hi Junio,

On Fri, 30 Mar 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > What would be a *really* good strategy is: "Oh, there is a problem! Let's
> > acknowledge it and try to come up with a solution rather than a
> > work-around".
> >
> > EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.
> 
> Yes, it is a workaround.  Making shell faster on windows would of
> course be one possible solution to make t/t*.sh scripts go faster
> ;-)  Or update parts of t/t*.sh so that the equivalent test coverage
> can be kept while running making them go faster on Windows.

What makes you think that I did not try my hardest for around 812 hours in
total so far to make the shell faster?

Ciao,
Dscho

P.S.: I do not have the actual number of hours I spent on both MSYS2's
runtime and BusyBox and Git to find *some* way to make it faster, as my
time-keeping is organized in a different way that makes it hard to query
the overall number. But I can state with confidence that it is easily in
the 200-300 hour range, if not beyond that.

It is very frustrating to spend that much time with only little gains here
and there (and BusyBox-w32 is simply not robust enough yet, apart from
also not showing a significant improvement in performance). Please do not
make this experience even more frustrating. Thanks.


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Duy Nguyen
On Fri, Mar 30, 2018 at 8:53 PM, Johannes Schindelin
 wrote:
> I think the best course of action would be to incrementally do away with
> the shell scripted test framework, in the way you outlined earlier this
> year. This would *also* buy us a wealth of other benefits, such as better
> control over the parallelization, resource usage, etc.

If you have not noticed, I'm a bit busy with all sorts of stuff and
probably won't continue that work. And since it affects you the most,
you probably have the best motive to tackle it ;-) I don't think
complaining about slow test suite helps. And avoiding adding more
tests because of that definitely does not help.

> It would also finally make it easier to introduce something like "smart
> testing" where code coverage could be computed (this works only for C
> code, of course, not for the many scripted parts of core Git), and a diff
> could be inspected to discover which tests *really* need to be run,
> skipping the tests that would only touch unchanged code.
-- 
Duy


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> What would be a *really* good strategy is: "Oh, there is a problem! Let's
> acknowledge it and try to come up with a solution rather than a
> work-around".
>
> EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.

Yes, it is a workaround.  Making shell faster on windows would of
course be one possible solution to make t/t*.sh scripts go faster
;-)  Or update parts of t/t*.sh so that the equivalent test coverage
can be kept while running making them go faster on Windows.






Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Johannes Schindelin
Hi Duy,

On Fri, 30 Mar 2018, Duy Nguyen wrote:

> On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin
>  wrote:
> >
> > On Thu, 29 Mar 2018, Jeff King wrote:
> >
> >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
> >>
> >> > > When calling `git config --unset abc.a` on this file, it leaves this
> >> > > (invalid) config behind:
> >> > >
> >> > > [
> >> > > [xyz]
> >> > > key = value
> >> > >
> >> > > The reason is that we try to search for the beginning of the line (or
> >> > > for the end of the preceding section header on the same line) that
> >> > > defines abc.a, but as an optimization, we subtract 2 from the offset
> >> > > pointing just after the definition before we call
> >> > > find_beginning_of_line(). That function, however, *also* performs that
> >> > > optimization and promptly fails to find the section header correctly.
> >> >
> >> > This commit message would be more convincing if we had it in test form.
> >>
> >> I agree a test might be nice. But I don't find the commit message
> >> unconvincing at all. It explains pretty clearly why the bug occurs, and
> >> you can verify it by looking at find_beginning_of_line.
> >>
> >> > [abc]a
> >> >
> >> > is not written by Git, but would be written from an outside tool or 
> >> > person
> >> > and we barely cope with it?
> >>
> >> Yes, I don't think git would ever write onto the same line. But clearly
> >> we should handle anything that's syntactically valid.
> >
> > I was tempted to add the test case, because it is easy to test it.
> >
> > But I then decided *not* to add it. Why? Testing is a balance between "can
> > do" and "need to do".
> >
> > Can you imagine that I did *not* run the entire test suite before
> > submitting this patch series, because it takes an incredible *90 minutes*
> > to run *on a fast Windows machine*?
> 
> What's wrong with firing up a new worktree, run the test suite there
> and go back to do something else so you won't waste time just waiting
> for test results and submit? Sure there is a mental overhead for
> switching tasks, but at 90 minutes, I think it's worth doing.

Of course it is worth doing. That's why I often test the end result on
Windows (waiting those 90 minutes, but I do not fire up a new worktree, I
use my cloud privilege and let Azure/Visual Studio Team Services do the
work for me, without slowing down my laptop).

What I would love to do, however, would be to test all intermediate
patches, too, as that often shows a problem with my frequent reorderings
via interactive rebases. And 90 minutes times 9 is... 13 hours and 30
minutes. That's a really long time.

I think the best course of action would be to incrementally do away with
the shell scripted test framework, in the way you outlined earlier this
year. This would *also* buy us a wealth of other benefits, such as better
control over the parallelization, resource usage, etc.

It would also finally make it easier to introduce something like "smart
testing" where code coverage could be computed (this works only for C
code, of course, not for the many scripted parts of core Git), and a diff
could be inspected to discover which tests *really* need to be run,
skipping the tests that would only touch unchanged code.

Ciao,
Dscho


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Johannes Schindelin
Hi Junio,

On Fri, 30 Mar 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
> 
> > I think if it's worth fixing it's worth testing for, a future change to
> > the config code could easily introduce a regression for this, and
> > particularly in this type of code obscure edge cases like this can point
> > to bugs elsewhere.
> 
> Yup.  "The port to my favourite platform is too slow, and everybody
> should learn to live with thin test coverage" would not be a good
> strategy in the longer run.

What would be a *really* good strategy is: "Oh, there is a problem! Let's
acknowledge it and try to come up with a solution rather than a
work-around".

EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.

And you are actively hurting my ability to contribute, I hope you are
aware of that.

Ciao,
Dscho

Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Duy Nguyen
On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin
 wrote:
> Hi,
>
> On Thu, 29 Mar 2018, Jeff King wrote:
>
>> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
>>
>> > > When calling `git config --unset abc.a` on this file, it leaves this
>> > > (invalid) config behind:
>> > >
>> > > [
>> > > [xyz]
>> > > key = value
>> > >
>> > > The reason is that we try to search for the beginning of the line (or
>> > > for the end of the preceding section header on the same line) that
>> > > defines abc.a, but as an optimization, we subtract 2 from the offset
>> > > pointing just after the definition before we call
>> > > find_beginning_of_line(). That function, however, *also* performs that
>> > > optimization and promptly fails to find the section header correctly.
>> >
>> > This commit message would be more convincing if we had it in test form.
>>
>> I agree a test might be nice. But I don't find the commit message
>> unconvincing at all. It explains pretty clearly why the bug occurs, and
>> you can verify it by looking at find_beginning_of_line.
>>
>> > [abc]a
>> >
>> > is not written by Git, but would be written from an outside tool or person
>> > and we barely cope with it?
>>
>> Yes, I don't think git would ever write onto the same line. But clearly
>> we should handle anything that's syntactically valid.
>
> I was tempted to add the test case, because it is easy to test it.
>
> But I then decided *not* to add it. Why? Testing is a balance between "can
> do" and "need to do".
>
> Can you imagine that I did *not* run the entire test suite before
> submitting this patch series, because it takes an incredible *90 minutes*
> to run *on a fast Windows machine*?

What's wrong with firing up a new worktree, run the test suite there
and go back to do something else so you won't waste time just waiting
for test results and submit? Sure there is a mental overhead for
switching tasks, but at 90 minutes, I think it's worth doing.
-- 
Duy


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> I think if it's worth fixing it's worth testing for, a future change to
> the config code could easily introduce a regression for this, and
> particularly in this type of code obscure edge cases like this can point
> to bugs elsewhere.

Yup.  "The port to my favourite platform is too slow, and everybody
should learn to live with thin test coverage" would not be a good
strategy in the longer run.



Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Ævar Arnfjörð Bjarmason

On Fri, Mar 30 2018, Johannes Schindelin wrote:

> On Thu, 29 Mar 2018, Jeff King wrote:
>
>> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
>>
>> > > When calling `git config --unset abc.a` on this file, it leaves this
>> > > (invalid) config behind:
>> > >
>> > > [
>> > > [xyz]
>> > > key = value
>> > >
>> > > The reason is that we try to search for the beginning of the line (or
>> > > for the end of the preceding section header on the same line) that
>> > > defines abc.a, but as an optimization, we subtract 2 from the offset
>> > > pointing just after the definition before we call
>> > > find_beginning_of_line(). That function, however, *also* performs that
>> > > optimization and promptly fails to find the section header correctly.
>> >
>> > This commit message would be more convincing if we had it in test form.
>>
>> I agree a test might be nice. But I don't find the commit message
>> unconvincing at all. It explains pretty clearly why the bug occurs, and
>> you can verify it by looking at find_beginning_of_line.
>>
>> > [abc]a
>> >
>> > is not written by Git, but would be written from an outside tool or person
>> > and we barely cope with it?
>>
>> Yes, I don't think git would ever write onto the same line. But clearly
>> we should handle anything that's syntactically valid.
>
> I was tempted to add the test case, because it is easy to test it.
>
> But I then decided *not* to add it. Why? Testing is a balance between "can
> do" and "need to do".
>
> Can you imagine that I did *not* run the entire test suite before
> submitting this patch series, because it takes an incredible *90 minutes*
> to run *on a fast Windows machine*?

I think if it's worth fixing it's worth testing for, a future change to
the config code could easily introduce a regression for this, and
particularly in this type of code obscure edge cases like this can point
to bugs elsewhere.

We have the EXPENSIVE_ON_WINDOWS prerequisite already in master from an
earlier series of mine, maybe we could use that here, or add some other
prereq like OVERLY_EXHAUSTIVE which by default could depend on
EXPENSIVE_ON_WINDOWS, i.e. we'd have a set of overly pedantic tests that
we skip on Windows by default, as there's no reason to suspect they're
platform-dependent, but we'd like to know if they regress.


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-30 Thread Johannes Schindelin
Hi,

On Thu, 29 Mar 2018, Jeff King wrote:

> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
> 
> > > When calling `git config --unset abc.a` on this file, it leaves this
> > > (invalid) config behind:
> > >
> > > [
> > > [xyz]
> > > key = value
> > >
> > > The reason is that we try to search for the beginning of the line (or
> > > for the end of the preceding section header on the same line) that
> > > defines abc.a, but as an optimization, we subtract 2 from the offset
> > > pointing just after the definition before we call
> > > find_beginning_of_line(). That function, however, *also* performs that
> > > optimization and promptly fails to find the section header correctly.
> > 
> > This commit message would be more convincing if we had it in test form.
> 
> I agree a test might be nice. But I don't find the commit message
> unconvincing at all. It explains pretty clearly why the bug occurs, and
> you can verify it by looking at find_beginning_of_line.
> 
> > [abc]a
> > 
> > is not written by Git, but would be written from an outside tool or person
> > and we barely cope with it?
> 
> Yes, I don't think git would ever write onto the same line. But clearly
> we should handle anything that's syntactically valid.

I was tempted to add the test case, because it is easy to test it.

But I then decided *not* to add it. Why? Testing is a balance between "can
do" and "need to do".

Can you imagine that I did *not* run the entire test suite before
submitting this patch series, because it takes an incredible *90 minutes*
to run *on a fast Windows machine*?

Seriously, this is hurting me. I do not complain about this due to some
mental illness forcing me to do it. I complain about this so often
*because it slows me down*, you gentle people. And you don't seem to care,
at least the test suite gets noticably worse by the month. I frankly do
not know what to do about this, as you keep adding and adding and it gets
less and less feasible for me to run the full test suite. I seem to be
totally unable to get through to you with the message that this is a real
problem with a real need to get fixed.

So with this in mind, I do not want to add a test case for a concocted
example that won't affect anybody except users who *want* to trigger this
bug.

I hope you agree,
Dscho

P.S.: Of course I ran the entire test suite. Not on Windows, but in a
Linux VM, because Linux is what Git is fine-tuned for, most obviously so.
An alien digging up ancient Earth history in the far future might be
tempted to assume that Git was developed to develop Linux which was
developed to develop Git, and then ask herself why humans bothered at all.

I actually ran the entire test suite on Linux on every single patch, via
`git rebase -x "make -j15 DEVELOPER=1 test" @{u}`, as I usually do before
submitting a patch series.

And it *did* find an obscure bug in an earlier iteration, where
t5512-ls-remote.sh demonstrated that looking at only one entry at a time
is not enough: `git config --unset-all uploadpack.hiderefs` *also* needs
to remove the now-empty section, because we might end up with the empty
sections in the wrong order, and the order of [transfer] and [uploadpack]
*matters* if the transfer.hiderefs setting is negative and the
uploadpack.hiderefs setting is positive, as is the case in 'overrides work
between mixed transfer/upload-pack hideRefs'. (Side-note: this looks like
a pretty obvious design bug to me, as there is *no tooling* to switch
around the order of these settings. Even worse: if somebody gets
instructions to add those settings, and there is already a [transfer]
section in the config: you're out of luck! You will have to *know* that
the order matters, *and add a second [transfer] section manually*!)


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-29 Thread Jeff King
On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:

> > When calling `git config --unset abc.a` on this file, it leaves this
> > (invalid) config behind:
> >
> > [
> > [xyz]
> > key = value
> >
> > The reason is that we try to search for the beginning of the line (or
> > for the end of the preceding section header on the same line) that
> > defines abc.a, but as an optimization, we subtract 2 from the offset
> > pointing just after the definition before we call
> > find_beginning_of_line(). That function, however, *also* performs that
> > optimization and promptly fails to find the section header correctly.
> 
> This commit message would be more convincing if we had it in test form.

I agree a test might be nice. But I don't find the commit message
unconvincing at all. It explains pretty clearly why the bug occurs, and
you can verify it by looking at find_beginning_of_line.

> [abc]a
> 
> is not written by Git, but would be written from an outside tool or person
> and we barely cope with it?

Yes, I don't think git would ever write onto the same line. But clearly
we should handle anything that's syntactically valid.

-Peff


Re: [PATCH 1/9] git_config_set: fix off-by-two

2018-03-29 Thread Stefan Beller
On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelin
 wrote:
> Currently, we are slightly overzealous When removing an entry from a
> config file of this form:
>
> [abc]a
> [xyz]
> key = value
>
> When calling `git config --unset abc.a` on this file, it leaves this
> (invalid) config behind:
>
> [
> [xyz]
> key = value
>
> The reason is that we try to search for the beginning of the line (or
> for the end of the preceding section header on the same line) that
> defines abc.a, but as an optimization, we subtract 2 from the offset
> pointing just after the definition before we call
> find_beginning_of_line(). That function, however, *also* performs that
> optimization and promptly fails to find the section header correctly.

This commit message would be more convincing if we had it in test form.

[abc]a

is not written by Git, but would be written from an outside tool or person
and we barely cope with it?

Thanks,
Stefan

>
> Signed-off-by: Johannes Schindelin 
> ---
>  config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/config.c b/config.c
> index b0c20e6cb8a..5cc049aaef0 100644
> --- a/config.c
> +++ b/config.c
> @@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
> } else
> copy_end = find_beginning_of_line(
> contents, contents_sz,
> -   store.offset[i]-2, _line);
> +   store.offset[i], _line);
>
> if (copy_end > 0 && contents[copy_end-1] != '\n')
> new_line = 1;
> --
> 2.16.2.windows.1.26.g2cc3565eb4b
>
>