On Mon, Jun 19, 2017 at 3:32 PM, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote: > > On Mon, Jun 19 2017, Stefan Beller jotted: > >>> Now, git-grep could make use of the pluggable error facility added in >>> commit c19a490e37 ("usage: allow pluggable die-recursion checks", >>> 2013-04-16). >> >> I think we should do that instead (though I have not looked at the downsides >> of this), because... > > It makes sense to do that in addition to what I'm doing here (or if the > approach I'm taking doesn't make sense, some other patch to fix the > general issue in the default handler). > > I'm going to try to get around to fixing the grep behavior in a > follow-up patch, this is a fix for the overzealous recursion detection > in the default handler needlessly causing other issues. > >>> >>> So let's just set the recursion limit to a number higher than the >>> number of threads we're ever likely to spawn. Now we won't lose >>> errors, and if we have a recursing die handler we'll still die within >>> microseconds. >> >> how are we handling access to that global variable? >> Do we need to hold a mutex to be correct? or rather hope that >> it works across threads, not counting on it, because each thread >> individually would count up to 1024? > > It's not guarded by a mutex and the ++ here and the reads from it are > both racy. > > However, for its stated purpose that's fine, even if we're racily > incrementing it and losing some updates some will get through, which is > good enough for an infinite recursion detection. We don't really care if > we die at exactly 1 or exactly 1024. > >> I would prefer if we kept the number as low as "at most >> one screen of lines". > > In practice this is the case in git, because the programs that would > encounter this are going to be spawning less than screenfull of threads, > assuming (as is the case) that each thread might print out one error. > > The semantics of that aren't changed with this patch, the difference is > that you're going to get e.g. N repeats of a meaningful error instead of > N repeats of either the meaningful error OR "recursion detected in die > handler", depending on your luck. > > I.e. in current git (after a few runs to get an unlucky one): > > $ git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: recursion detected in die handler > fatal: recursion detected in die handler > fatal: recursion detected in die handler > > Or if you're lucky at least one of these will be the actual error: > > $ git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: recursion detected in die handler > fatal: pcre_exec failed with error code -8 > fatal: recursion detected in die handler > fatal: recursion detected in die handler > fatal: recursion detected in die handler > > But with this change: > > $ ~/g/git/git-grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > fatal: pcre2_jit_match failed with error code -47: match limit exceeded > > (The error message is different because I compiled with PCRE v2 locally, > instead of the system PCRE v1, but that doesn't matter for this example) > > Over 1000 runs thish is how that breaks down on my machine, without this > patch. I've replaced the recursion error with just "R" and the PCRE > error with "P", and shown them in descending order by occurrence, lines > without a "P" only printed out the recursion error: > > $ (for i in {1..1000}; do git grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: > r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort > -nr|head -n 10 > 247 R R > 136 P R R > 122 P R > 112 R R R > 108 R > 59 R P R > 54 R P > 54 P > 31 P R R R > 21 R R P > > There's a long tail I've omitted there of alterations to that. As this > shows in >10% of cases we don't print out any meaningful error at > all. But with this change: > > $ (for i in {1..1000}; do ~/g/git/git-grep -P --threads=10 > '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: > r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort > -nr|head -n 10 > 377 P P > 358 P P P > 192 P > 63 P P P P > 8 P P P P P > 2 P P P P P P > > We will always show a meaningful error, but may of course do so multiple > times, which is a subject for a fix in git-grep in particular, but the > point is again, to fix the general case for the default handler. > > Something something sorry about the long mail didn't have time to write > a shorter one :) >
Actually this convinced me (and it would be lovely to have such observations in the commit message). Thanks, Stefan