Re: [PATCH 1/9] git_config_set: fix off-by-two
Johannes Schindelinwrites: >> Yes, it is a workaround. Making shell faster on windows would of >> course be one possible solution to make t/t*.sh scripts go faster >> ;-) Or update parts of t/t*.sh so that the equivalent test coverage >> can be kept while running making them go faster on Windows. > > What makes you think that I did not try my hardest for around 812 hours in > total so far to make the shell faster? Nowhere in these four lines I ever said that I think you did not work hard to solve the performance issues you have.
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Duy, On Tue, 3 Apr 2018, Duy Nguyen wrote: > On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelin >wrote: > > It is very frustrating to spend that much time with only little gains > > here and there (and BusyBox-w32 is simply not robust enough yet, apart > > from also not showing a significant improvement in performance). > > You still use busybox-w32? Yes. > It's amazing that people still use it after the linux subsystem comes. I use WSL myself. But you need to realize that WSL is only available on Windows 10 (many users still use Windows 7), and it is a little tricky to get to work in Docker containers, I heard, so I did not even try. Also, many Windows users are unfamiliar with Linux, and forcing them to learn and install a Linux distribution on their machine when all they want is to use Git is a bit... much. > busybox has a lot of commands built in (i.e. no new processes) and > unless rmyorston did something more, the "fork" in ash shell should be > as cheap as it could be: it simply serializes data and sends to the new > process. Yes, I had the pleasure of reading that code. It might surprise you, but I had to come up with quite a bit of patches to make the test suite pass. And it does not really pass, as I randomly get hangs... > If performance does not improve, I guess the process creation cost > dominates. There's not much we could do except moving away from the > zillion processes test framework: either something C-based or another > scripting language (ok I don't want to bring this up again) There is no need to guess. I now have .pdb files, and once I have a good example of a shell script construct that is particularly slow, and once I find some time to work on it, I will dig into the bottlenecks. Ciao, Dscho
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelinwrote: > It is very frustrating to spend that much time with only little gains here > and there (and BusyBox-w32 is simply not robust enough yet, apart from > also not showing a significant improvement in performance). You still use busybox-w32? It's amazing that people still use it after the linux subsystem comes. busybox has a lot of commands built in (i.e. no new processes) and unless rmyorston did something more, the "fork" in ash shell should be as cheap as it could be: it simply serializes data and sends to the new process. If performance does not improve, I guess the process creation cost dominates. There's not much we could do except moving away from the zillion processes test framework: either something C-based or another scripting language (ok I don't want to bring this up again) -- Duy
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Junio, On Fri, 30 Mar 2018, Junio C Hamano wrote: > Johannes Schindelinwrites: > > > What would be a *really* good strategy is: "Oh, there is a problem! Let's > > acknowledge it and try to come up with a solution rather than a > > work-around". > > > > EXPENSIVE_ON_WINDOWS is a symptom. Not a solution. > > Yes, it is a workaround. Making shell faster on windows would of > course be one possible solution to make t/t*.sh scripts go faster > ;-) Or update parts of t/t*.sh so that the equivalent test coverage > can be kept while running making them go faster on Windows. What makes you think that I did not try my hardest for around 812 hours in total so far to make the shell faster? Ciao, Dscho P.S.: I do not have the actual number of hours I spent on both MSYS2's runtime and BusyBox and Git to find *some* way to make it faster, as my time-keeping is organized in a different way that makes it hard to query the overall number. But I can state with confidence that it is easily in the 200-300 hour range, if not beyond that. It is very frustrating to spend that much time with only little gains here and there (and BusyBox-w32 is simply not robust enough yet, apart from also not showing a significant improvement in performance). Please do not make this experience even more frustrating. Thanks.
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30, 2018 at 8:53 PM, Johannes Schindelinwrote: > I think the best course of action would be to incrementally do away with > the shell scripted test framework, in the way you outlined earlier this > year. This would *also* buy us a wealth of other benefits, such as better > control over the parallelization, resource usage, etc. If you have not noticed, I'm a bit busy with all sorts of stuff and probably won't continue that work. And since it affects you the most, you probably have the best motive to tackle it ;-) I don't think complaining about slow test suite helps. And avoiding adding more tests because of that definitely does not help. > It would also finally make it easier to introduce something like "smart > testing" where code coverage could be computed (this works only for C > code, of course, not for the many scripted parts of core Git), and a diff > could be inspected to discover which tests *really* need to be run, > skipping the tests that would only touch unchanged code. -- Duy
Re: [PATCH 1/9] git_config_set: fix off-by-two
Johannes Schindelinwrites: > What would be a *really* good strategy is: "Oh, there is a problem! Let's > acknowledge it and try to come up with a solution rather than a > work-around". > > EXPENSIVE_ON_WINDOWS is a symptom. Not a solution. Yes, it is a workaround. Making shell faster on windows would of course be one possible solution to make t/t*.sh scripts go faster ;-) Or update parts of t/t*.sh so that the equivalent test coverage can be kept while running making them go faster on Windows.
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Duy, On Fri, 30 Mar 2018, Duy Nguyen wrote: > On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin >wrote: > > > > On Thu, 29 Mar 2018, Jeff King wrote: > > > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: > >> > >> > > When calling `git config --unset abc.a` on this file, it leaves this > >> > > (invalid) config behind: > >> > > > >> > > [ > >> > > [xyz] > >> > > key = value > >> > > > >> > > The reason is that we try to search for the beginning of the line (or > >> > > for the end of the preceding section header on the same line) that > >> > > defines abc.a, but as an optimization, we subtract 2 from the offset > >> > > pointing just after the definition before we call > >> > > find_beginning_of_line(). That function, however, *also* performs that > >> > > optimization and promptly fails to find the section header correctly. > >> > > >> > This commit message would be more convincing if we had it in test form. > >> > >> I agree a test might be nice. But I don't find the commit message > >> unconvincing at all. It explains pretty clearly why the bug occurs, and > >> you can verify it by looking at find_beginning_of_line. > >> > >> > [abc]a > >> > > >> > is not written by Git, but would be written from an outside tool or > >> > person > >> > and we barely cope with it? > >> > >> Yes, I don't think git would ever write onto the same line. But clearly > >> we should handle anything that's syntactically valid. > > > > I was tempted to add the test case, because it is easy to test it. > > > > But I then decided *not* to add it. Why? Testing is a balance between "can > > do" and "need to do". > > > > Can you imagine that I did *not* run the entire test suite before > > submitting this patch series, because it takes an incredible *90 minutes* > > to run *on a fast Windows machine*? > > What's wrong with firing up a new worktree, run the test suite there > and go back to do something else so you won't waste time just waiting > for test results and submit? Sure there is a mental overhead for > switching tasks, but at 90 minutes, I think it's worth doing. Of course it is worth doing. That's why I often test the end result on Windows (waiting those 90 minutes, but I do not fire up a new worktree, I use my cloud privilege and let Azure/Visual Studio Team Services do the work for me, without slowing down my laptop). What I would love to do, however, would be to test all intermediate patches, too, as that often shows a problem with my frequent reorderings via interactive rebases. And 90 minutes times 9 is... 13 hours and 30 minutes. That's a really long time. I think the best course of action would be to incrementally do away with the shell scripted test framework, in the way you outlined earlier this year. This would *also* buy us a wealth of other benefits, such as better control over the parallelization, resource usage, etc. It would also finally make it easier to introduce something like "smart testing" where code coverage could be computed (this works only for C code, of course, not for the many scripted parts of core Git), and a diff could be inspected to discover which tests *really* need to be run, skipping the tests that would only touch unchanged code. Ciao, Dscho
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Junio, On Fri, 30 Mar 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmasonwrites: > > > I think if it's worth fixing it's worth testing for, a future change to > > the config code could easily introduce a regression for this, and > > particularly in this type of code obscure edge cases like this can point > > to bugs elsewhere. > > Yup. "The port to my favourite platform is too slow, and everybody > should learn to live with thin test coverage" would not be a good > strategy in the longer run. What would be a *really* good strategy is: "Oh, there is a problem! Let's acknowledge it and try to come up with a solution rather than a work-around". EXPENSIVE_ON_WINDOWS is a symptom. Not a solution. And you are actively hurting my ability to contribute, I hope you are aware of that. Ciao, Dscho
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelinwrote: > Hi, > > On Thu, 29 Mar 2018, Jeff King wrote: > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: >> >> > > When calling `git config --unset abc.a` on this file, it leaves this >> > > (invalid) config behind: >> > > >> > > [ >> > > [xyz] >> > > key = value >> > > >> > > The reason is that we try to search for the beginning of the line (or >> > > for the end of the preceding section header on the same line) that >> > > defines abc.a, but as an optimization, we subtract 2 from the offset >> > > pointing just after the definition before we call >> > > find_beginning_of_line(). That function, however, *also* performs that >> > > optimization and promptly fails to find the section header correctly. >> > >> > This commit message would be more convincing if we had it in test form. >> >> I agree a test might be nice. But I don't find the commit message >> unconvincing at all. It explains pretty clearly why the bug occurs, and >> you can verify it by looking at find_beginning_of_line. >> >> > [abc]a >> > >> > is not written by Git, but would be written from an outside tool or person >> > and we barely cope with it? >> >> Yes, I don't think git would ever write onto the same line. But clearly >> we should handle anything that's syntactically valid. > > I was tempted to add the test case, because it is easy to test it. > > But I then decided *not* to add it. Why? Testing is a balance between "can > do" and "need to do". > > Can you imagine that I did *not* run the entire test suite before > submitting this patch series, because it takes an incredible *90 minutes* > to run *on a fast Windows machine*? What's wrong with firing up a new worktree, run the test suite there and go back to do something else so you won't waste time just waiting for test results and submit? Sure there is a mental overhead for switching tasks, but at 90 minutes, I think it's worth doing. -- Duy
Re: [PATCH 1/9] git_config_set: fix off-by-two
Ævar Arnfjörð Bjarmasonwrites: > I think if it's worth fixing it's worth testing for, a future change to > the config code could easily introduce a regression for this, and > particularly in this type of code obscure edge cases like this can point > to bugs elsewhere. Yup. "The port to my favourite platform is too slow, and everybody should learn to live with thin test coverage" would not be a good strategy in the longer run.
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30 2018, Johannes Schindelin wrote: > On Thu, 29 Mar 2018, Jeff King wrote: > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: >> >> > > When calling `git config --unset abc.a` on this file, it leaves this >> > > (invalid) config behind: >> > > >> > > [ >> > > [xyz] >> > > key = value >> > > >> > > The reason is that we try to search for the beginning of the line (or >> > > for the end of the preceding section header on the same line) that >> > > defines abc.a, but as an optimization, we subtract 2 from the offset >> > > pointing just after the definition before we call >> > > find_beginning_of_line(). That function, however, *also* performs that >> > > optimization and promptly fails to find the section header correctly. >> > >> > This commit message would be more convincing if we had it in test form. >> >> I agree a test might be nice. But I don't find the commit message >> unconvincing at all. It explains pretty clearly why the bug occurs, and >> you can verify it by looking at find_beginning_of_line. >> >> > [abc]a >> > >> > is not written by Git, but would be written from an outside tool or person >> > and we barely cope with it? >> >> Yes, I don't think git would ever write onto the same line. But clearly >> we should handle anything that's syntactically valid. > > I was tempted to add the test case, because it is easy to test it. > > But I then decided *not* to add it. Why? Testing is a balance between "can > do" and "need to do". > > Can you imagine that I did *not* run the entire test suite before > submitting this patch series, because it takes an incredible *90 minutes* > to run *on a fast Windows machine*? I think if it's worth fixing it's worth testing for, a future change to the config code could easily introduce a regression for this, and particularly in this type of code obscure edge cases like this can point to bugs elsewhere. We have the EXPENSIVE_ON_WINDOWS prerequisite already in master from an earlier series of mine, maybe we could use that here, or add some other prereq like OVERLY_EXHAUSTIVE which by default could depend on EXPENSIVE_ON_WINDOWS, i.e. we'd have a set of overly pedantic tests that we skip on Windows by default, as there's no reason to suspect they're platform-dependent, but we'd like to know if they regress.
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: > > > > When calling `git config --unset abc.a` on this file, it leaves this > > > (invalid) config behind: > > > > > > [ > > > [xyz] > > > key = value > > > > > > The reason is that we try to search for the beginning of the line (or > > > for the end of the preceding section header on the same line) that > > > defines abc.a, but as an optimization, we subtract 2 from the offset > > > pointing just after the definition before we call > > > find_beginning_of_line(). That function, however, *also* performs that > > > optimization and promptly fails to find the section header correctly. > > > > This commit message would be more convincing if we had it in test form. > > I agree a test might be nice. But I don't find the commit message > unconvincing at all. It explains pretty clearly why the bug occurs, and > you can verify it by looking at find_beginning_of_line. > > > [abc]a > > > > is not written by Git, but would be written from an outside tool or person > > and we barely cope with it? > > Yes, I don't think git would ever write onto the same line. But clearly > we should handle anything that's syntactically valid. I was tempted to add the test case, because it is easy to test it. But I then decided *not* to add it. Why? Testing is a balance between "can do" and "need to do". Can you imagine that I did *not* run the entire test suite before submitting this patch series, because it takes an incredible *90 minutes* to run *on a fast Windows machine*? Seriously, this is hurting me. I do not complain about this due to some mental illness forcing me to do it. I complain about this so often *because it slows me down*, you gentle people. And you don't seem to care, at least the test suite gets noticably worse by the month. I frankly do not know what to do about this, as you keep adding and adding and it gets less and less feasible for me to run the full test suite. I seem to be totally unable to get through to you with the message that this is a real problem with a real need to get fixed. So with this in mind, I do not want to add a test case for a concocted example that won't affect anybody except users who *want* to trigger this bug. I hope you agree, Dscho P.S.: Of course I ran the entire test suite. Not on Windows, but in a Linux VM, because Linux is what Git is fine-tuned for, most obviously so. An alien digging up ancient Earth history in the far future might be tempted to assume that Git was developed to develop Linux which was developed to develop Git, and then ask herself why humans bothered at all. I actually ran the entire test suite on Linux on every single patch, via `git rebase -x "make -j15 DEVELOPER=1 test" @{u}`, as I usually do before submitting a patch series. And it *did* find an obscure bug in an earlier iteration, where t5512-ls-remote.sh demonstrated that looking at only one entry at a time is not enough: `git config --unset-all uploadpack.hiderefs` *also* needs to remove the now-empty section, because we might end up with the empty sections in the wrong order, and the order of [transfer] and [uploadpack] *matters* if the transfer.hiderefs setting is negative and the uploadpack.hiderefs setting is positive, as is the case in 'overrides work between mixed transfer/upload-pack hideRefs'. (Side-note: this looks like a pretty obvious design bug to me, as there is *no tooling* to switch around the order of these settings. Even worse: if somebody gets instructions to add those settings, and there is already a [transfer] section in the config: you're out of luck! You will have to *know* that the order matters, *and add a second [transfer] section manually*!)
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: > > When calling `git config --unset abc.a` on this file, it leaves this > > (invalid) config behind: > > > > [ > > [xyz] > > key = value > > > > The reason is that we try to search for the beginning of the line (or > > for the end of the preceding section header on the same line) that > > defines abc.a, but as an optimization, we subtract 2 from the offset > > pointing just after the definition before we call > > find_beginning_of_line(). That function, however, *also* performs that > > optimization and promptly fails to find the section header correctly. > > This commit message would be more convincing if we had it in test form. I agree a test might be nice. But I don't find the commit message unconvincing at all. It explains pretty clearly why the bug occurs, and you can verify it by looking at find_beginning_of_line. > [abc]a > > is not written by Git, but would be written from an outside tool or person > and we barely cope with it? Yes, I don't think git would ever write onto the same line. But clearly we should handle anything that's syntactically valid. -Peff
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelinwrote: > Currently, we are slightly overzealous When removing an entry from a > config file of this form: > > [abc]a > [xyz] > key = value > > When calling `git config --unset abc.a` on this file, it leaves this > (invalid) config behind: > > [ > [xyz] > key = value > > The reason is that we try to search for the beginning of the line (or > for the end of the preceding section header on the same line) that > defines abc.a, but as an optimization, we subtract 2 from the offset > pointing just after the definition before we call > find_beginning_of_line(). That function, however, *also* performs that > optimization and promptly fails to find the section header correctly. This commit message would be more convincing if we had it in test form. [abc]a is not written by Git, but would be written from an outside tool or person and we barely cope with it? Thanks, Stefan > > Signed-off-by: Johannes Schindelin > --- > config.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/config.c b/config.c > index b0c20e6cb8a..5cc049aaef0 100644 > --- a/config.c > +++ b/config.c > @@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char > *config_filename, > } else > copy_end = find_beginning_of_line( > contents, contents_sz, > - store.offset[i]-2, _line); > + store.offset[i], _line); > > if (copy_end > 0 && contents[copy_end-1] != '\n') > new_line = 1; > -- > 2.16.2.windows.1.26.g2cc3565eb4b > >