Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-09 Thread Ævar Arnfjörð Bjarmason
On Tue, May 9, 2017 at 4:22 PM, demerphq  wrote:
> On 9 May 2017 at 13:12, Ævar Arnfjörð Bjarmason  wrote:
>> On Tue, May 9, 2017 at 2:37 AM, brian m. carlson
>>  wrote:
>>> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> * gitweb is vulnerable to CPU DoS now in its default configuration.
>> It's easy to provide an ERE that ends up slurping up 100% CPU for
>> several seconds on any non-trivial sized repo, do that in parallel &
>> you have a DoS vector.
>
> Does one need an ERE? Can't one do that now to many parts of git just
> with a glob?

in practice I don't think so because:

1) I'm now aware of any place where we expose globbing over the wire.

2) AFAICT for the issue detailed in [1] to trigger you also need a
pathological filename in the repo, e.g. I can't get git-ls-files to go
quadratic on either git.git or linux.git, whereas it's pretty easy to
come up with a really expensive regex since there's more content to
choose from when matching file content than filenames.

1. https://public-inbox.org/git/20170424211249.28553-1-ava...@gmail.com/


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-09 Thread demerphq
On 9 May 2017 at 13:12, Ævar Arnfjörð Bjarmason  wrote:
> On Tue, May 9, 2017 at 2:37 AM, brian m. carlson
>  wrote:
>> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
> * gitweb is vulnerable to CPU DoS now in its default configuration.
> It's easy to provide an ERE that ends up slurping up 100% CPU for
> several seconds on any non-trivial sized repo, do that in parallel &
> you have a DoS vector.

Does one need an ERE? Can't one do that now to many parts of git just
with a glob?

Yves


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-09 Thread Ævar Arnfjörð Bjarmason
On Tue, May 9, 2017 at 12:40 PM, Johannes Schindelin
 wrote:
> Hi,
>
> On Tue, 9 May 2017, brian m. carlson wrote:
>
>> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> > On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
>> >  wrote:
>> > > PCRE and PCRE2 also tend to have a lot of security updates, so I
>> > > would prefer if we didn't import them into the tree.  It is far
>> > > better for users to use their distro's packages for PCRE, as it
>> > > means they get automatic security updates even if they're using an
>> > > old Git.
>> > >
>> > > We shouldn't consider shipping anything with a remotely frequent
>> > > history of security updates in our tree, since people very
>> > > frequently run old or ancient versions of Git.
>> >
>> > I'm aware of its security record[1], but I wonder what threat model
>> > you have in mind here. I'm not aware of any parts of git (except maybe
>> > gitweb?) where we take regexes from untrusted sources.
>> >
>> > I.e. yes there have been DoS's & even some overflow bugs leading code
>> > execution in PCRE, but in the context of powering git-grep & git-log
>> > with PCRE this falls into the "stop hitting yourself" category.
>>
>> Just because you don't drive Git with untrusted regexes doesn't mean
>> other people don't.
>
> Or other applications.
>
>> It's not a good idea to require a stronger security model than we
>> absolutely have to, since people can and will violate it.  Think how
>> devastating Shellshock was even though technically nobody should provide
>> insecure environment variables to the shell.
>>
>> And, yes, gitweb does in fact call git grep. That means that git grep
>> must in fact be secure against untrusted regexes, or you have a remote
>> code execution vulnerability.
>
> And not only grep is affected. Think HEAD^{/}. There are plenty of
> sites where you are allowed to specify revs in a freer form than SHA-1s.

That will still use reg(comp|exec) for the foreseeable future. We have
plenty of manual use of that all over the place:

$ git grep 'reg(comp|exec)\(' *.[ch] builtin/*.[ch]

And the ^{/rx} feature is powered by the one in sha1_name.c

> Having said that, I do like the prospect of a faster git grep.
>
> Hopefully there will be a way to make use of PCRE that can be switched
> off? Like, a compile-time replacement of the regex API backed by PCRE v2
> *iff* PCRE v2 is used for building?

Yup, see my just-sent

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-09 Thread Ævar Arnfjörð Bjarmason
On Tue, May 9, 2017 at 2:37 AM, brian m. carlson
 wrote:
> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
>>  wrote:
>> > PCRE and PCRE2 also tend to have a lot of security updates, so I would
>> > prefer if we didn't import them into the tree.  It is far better for
>> > users to use their distro's packages for PCRE, as it means they get
>> > automatic security updates even if they're using an old Git.
>> >
>> > We shouldn't consider shipping anything with a remotely frequent history
>> > of security updates in our tree, since people very frequently run old or
>> > ancient versions of Git.
>>
>> I'm aware of its security record[1], but I wonder what threat model
>> you have in mind here. I'm not aware of any parts of git (except maybe
>> gitweb?) where we take regexes from untrusted sources.
>>
>> I.e. yes there have been DoS's & even some overflow bugs leading code
>> execution in PCRE, but in the context of powering git-grep & git-log
>> with PCRE this falls into the "stop hitting yourself" category.
>
> Just because you don't drive Git with untrusted regexes doesn't mean
> other people don't.  It's not a good idea to require a stronger security
> model than we absolutely have to, since people can and will violate it.
> Think how devastating Shellshock was even though technically nobody
> should provide insecure environment variables to the shell.

Yes this is definitely something we should keep in mind. I see my
comment could be read as dismissing your concerns, I didn't mean that.
This is definitely something to be concerned about.

I was trying to solicit feedback on whether there were any other parts
of stock git that could take external input as regexes than gitweb,
I'm not aware of any.

I thought that gitweb wouldn't take regexes by default, but I see now
that the 'search' feature is on by default, and that allows you to
grep / pickaxe with regexes. Either -E or -F for grep, or
--pickaxe-regex (which implies -E) for log.

> And, yes, gitweb does in fact call git grep.  That means that git grep
> must in fact be secure against untrusted regexes, or you have a remote
> code execution vulnerability.

Indeed, but it's not as clear-cut as turning on on PCRE making you vulnerable.

* gitweb is vulnerable to CPU DoS now in its default configuration.
It's easy to provide an ERE that ends up slurping up 100% CPU for
several seconds on any non-trivial sized repo, do that in parallel &
you have a DoS vector.

* Obviously something that's 2x as fast or more (which my WIP code is)
makes this better.

* PCRE tends to be worse at pathological patterns, but this is because
it has really large limits by default and will try rather insanely
hard to match your pattern.

-DHEAP_LIMIT=2000 \
-DMATCH_LIMIT=1000 \
-DMATCH_LIMIT_DEPTH=1000 \
-DMAX_NAME_COUNT=1 \
-DMAX_NAME_SIZE=32 \
-DPARENS_NEST_LIMIT=250 \

This can be reduced dynamically via the API (and the patterns can't
change it, except to reduce it). For example on my system 2.11.0
(before any of my changes) on linux.git:

$ time git grep -E -- '-?-?-?-?-?-?-?-?-?-?-?---$' |wc -l
12434
real0m0.444s
$ time git grep -P -- '-?-?-?-?-?-?-?-?-?-?-?---$' | wc -l
12434
real1m20.849s

With my JIT changes:

$ time ~/g/git/git --exec-path=/home/avar/g/git grep -P --
'-?-?-?-?-?-?-?-?-?-?-?---$' | wc -l
12434
real0m5.334s

However for user-supplied patterns we can just turn on really
conservative settings:

$ time ~/g/git/git --exec-path=/home/avar/g/git grep -P --
'(*LIMIT_RECURSION=1000)(*LIMIT_MATCH=1000)-?-?-?-?-?-?-?-?-?-?-?---$'
| wc -l
fatal: pcre2_jit_match failed with error code -47: match limit exceeded
0
real0m0.021s

When I locally compile git with something like this:

-   -DMATCH_LIMIT=1000 \
-   -DMATCH_LIMIT_DEPTH=1000 \
-   -DMATCH_LIMIT_RECURSION=1000 \
-   -DMAX_NAME_COUNT=1 \
+   -DMATCH_LIMIT=1000 \
+   -DMATCH_LIMIT_DEPTH=1000 \
+   -DMATCH_LIMIT_RECURSION=1000 \
+   -DMAX_NAME_COUNT=100 \
-DMAX_NAME_SIZE=32 \
-   -DPARENS_NEST_LIMIT=250 \
+   -DPARENS_NEST_LIMIT=10 \

All tests still pass, and from my own testing I can't find any
non-pathological patterns this would break. So we might actually
consider turning this down for git by default.

* Much of the rest of the patterns PCRE has CVEs for can similarly be
mitigated by simply turning various features off.

> Furthermore, at work we distribute Git with all releases of our product.
> We normally only do non-security updates to the last couple of releases,
> but we provide security updates to all supported 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-09 Thread Johannes Schindelin
Hi,

On Tue, 9 May 2017, brian m. carlson wrote:

> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
> > On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
> >  wrote:
> > > PCRE and PCRE2 also tend to have a lot of security updates, so I
> > > would prefer if we didn't import them into the tree.  It is far
> > > better for users to use their distro's packages for PCRE, as it
> > > means they get automatic security updates even if they're using an
> > > old Git.
> > >
> > > We shouldn't consider shipping anything with a remotely frequent
> > > history of security updates in our tree, since people very
> > > frequently run old or ancient versions of Git.
> > 
> > I'm aware of its security record[1], but I wonder what threat model
> > you have in mind here. I'm not aware of any parts of git (except maybe
> > gitweb?) where we take regexes from untrusted sources.
> > 
> > I.e. yes there have been DoS's & even some overflow bugs leading code
> > execution in PCRE, but in the context of powering git-grep & git-log
> > with PCRE this falls into the "stop hitting yourself" category.
> 
> Just because you don't drive Git with untrusted regexes doesn't mean
> other people don't.

Or other applications.

> It's not a good idea to require a stronger security model than we
> absolutely have to, since people can and will violate it.  Think how
> devastating Shellshock was even though technically nobody should provide
> insecure environment variables to the shell.
> 
> And, yes, gitweb does in fact call git grep. That means that git grep
> must in fact be secure against untrusted regexes, or you have a remote
> code execution vulnerability.

And not only grep is affected. Think HEAD^{/}. There are plenty of
sites where you are allowed to specify revs in a freer form than SHA-1s.

Having said that, I do like the prospect of a faster git grep.

Hopefully there will be a way to make use of PCRE that can be switched
off? Like, a compile-time replacement of the regex API backed by PCRE v2
*iff* PCRE v2 is used for building?

Ciao,
Dscho

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread brian m. carlson
On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
>  wrote:
> > PCRE and PCRE2 also tend to have a lot of security updates, so I would
> > prefer if we didn't import them into the tree.  It is far better for
> > users to use their distro's packages for PCRE, as it means they get
> > automatic security updates even if they're using an old Git.
> >
> > We shouldn't consider shipping anything with a remotely frequent history
> > of security updates in our tree, since people very frequently run old or
> > ancient versions of Git.
> 
> I'm aware of its security record[1], but I wonder what threat model
> you have in mind here. I'm not aware of any parts of git (except maybe
> gitweb?) where we take regexes from untrusted sources.
> 
> I.e. yes there have been DoS's & even some overflow bugs leading code
> execution in PCRE, but in the context of powering git-grep & git-log
> with PCRE this falls into the "stop hitting yourself" category.

Just because you don't drive Git with untrusted regexes doesn't mean
other people don't.  It's not a good idea to require a stronger security
model than we absolutely have to, since people can and will violate it.
Think how devastating Shellshock was even though technically nobody
should provide insecure environment variables to the shell.

And, yes, gitweb does in fact call git grep.  That means that git grep
must in fact be secure against untrusted regexes, or you have a remote
code execution vulnerability.

Furthermore, at work we distribute Git with all releases of our product.
We normally only do non-security updates to the last couple of releases,
but we provide security updates to all supported versions.  I'm not
comfortable shipping the entirety of PCRE or PCRE2 to customers without
providing security updates, so you're going to make my job (and my
coworkers') a lot harder by shipping it.  Please don't.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread Ævar Arnfjörð Bjarmason
On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
 wrote:
> On Mon, May 08, 2017 at 04:10:41PM +0900, Junio C Hamano wrote:
>> Ævar Arnfjörð Bjarmason  writes:
>>
>> > This won't be in my next PCRE submission, but I have a path locally to
>> > simply import PCRE into git.git as compat/pcre2, so it can be compiled
>> > with NO_PCRE=Y similar to how NO_REGEX=Y works.
>> >
>> > This will hopefully address your concerns partially, i.e. when you do
>> > want to try it out it'll be easier.
>>
>> Eek, please don't.
>>
>> Until pcre2 becomes _so_ stable that all reasonable distros give
>> choice to the end-users to install it easily in a packaged form,
>> such a "not a fork, but a copy" will be a moving target that I'd
>> rather not to have in compat/.  IOW, our compat/$pkg should be a
>> last resort to help those on distros that are so hard to convince to
>> carry the version/variant of $pkg we would like to use.

The reason I'm looking into this is because the WIP part of my PCRE
branch has changes which start to use PCRE as a general matching
engine in git, even. I.e.:

* git grep -F will be powered by it rather than kwset (which'll be git rm'd)
* Long standing limitations with \0s in regexes go away.
* grep -G and -E will use PCRE via a WIP POSIX -> PCRE  pattern
translator (https://bugs.exim.org/show_bug.cgi?id=2106)
* Perhaps we can remove compat/regex/ entirely & use PCRE via its
POSIX emulation mode + pattern translator (we use regcomp/regexec a
lot for non-grep/log), I'm not sure yet.

I have messy but working code for this in a WIP branch, here's the
performance improvement against linux.git:

Test   v2.13.0-rc2   HEAD
---
7820.1: basic grep how.to  0.31(1.20+0.46)
0.19(0.33+0.55) -38.7%
7820.2: extended grep how.to   0.31(1.19+0.46)
0.19(0.33+0.55) -38.7%
7820.3: perl grep how.to   0.30(1.12+0.46)
0.19(0.28+0.62) -36.7%
7820.5: basic grep ^how to 0.31(1.24+0.39)
0.19(0.32+0.56) -38.7%
7820.6: extended grep ^how to  0.30(1.18+0.44)
0.19(0.22+0.66) -36.7%
7820.7: perl grep ^how to  0.55(2.68+0.41)
0.19(0.32+0.56) -65.5%
7820.9: basic grep [how] to0.47(2.17+0.44)
0.22(0.39+0.54) -53.2%
7820.10: extended grep [how] to0.47(2.21+0.40)
0.22(0.39+0.55) -53.2%
7820.11: perl grep [how] to0.53(2.64+0.39)
0.21(0.37+0.58) -60.4%
7820.13: basic grep \(e.t[^ ]*\|v.ry\) rare0.63(3.16+0.48)
0.21(0.48+0.53) -66.7%
7820.14: extended grep (e.t[^ ]*|v.ry) rare0.64(3.28+0.38)
0.21(0.45+0.57) -67.2%
7820.15: perl grep (e.t[^ ]*|v.ry) rare1.00(5.77+0.37)
0.21(0.50+0.53) -79.0%
7820.17: basic grep m\(ú\|u\)ult.b\(æ\|y\)te   0.31(1.23+0.51)
0.19(0.30+0.58) -38.7%
7820.18: extended grep m(ú|u)ult.b(æ|y)te  0.32(1.26+0.47)
0.19(0.27+0.61) -40.6%
7820.19: perl grep m(ú|u)ult.b(æ|y)te  0.36(1.61+0.37)
0.19(0.30+0.57) -47.2%
7821.1: fixed grep int 0.52(1.71+0.64)
0.43(1.11+0.72) -17.3%
7821.2: basic grep int 0.54(1.62+0.70)
0.42(1.14+0.62) -22.2%
7821.3: extended grep int  0.53(1.67+0.64)
0.51(1.17+0.62) -3.8%
7821.4: perl grep int  0.53(1.71+0.59)
0.72(1.13+0.63) +35.8%
7821.6: fixed grep -i int  0.58(1.86+0.67)
0.47(1.32+0.62) -19.0%
7821.7: basic grep -i int  0.62(1.94+0.61)
0.57(1.25+0.72) -8.1%
7821.8: extended grep -i int   0.82(1.86+0.68)
0.50(1.41+0.56) -39.0%
7821.9: perl grep -i int   0.70(1.88+0.68)
0.56(1.25+0.70) -20.0%
7821.11: fixed grep æ  0.33(1.30+0.43)
0.19(0.22+0.64) -42.4%
7821.12: basic grep æ  0.33(1.35+0.38)
0.18(0.26+0.59) -45.5%
7821.13: extended grep æ   0.33(1.20+0.52)
0.18(0.32+0.53) -45.5%
7821.14: perl grep æ   0.33(1.31+0.40)
0.18(0.28+0.56) -45.5%
7821.16: fixed grep -i æ   0.25(0.87+0.50)
0.18(0.24+0.60) -28.0%
7821.17: basic grep -i æ   0.26(0.88+0.48)
0.18(0.24+0.60) -30.8%
7821.18: extended grep -i æ0.26(0.92+0.44)
0.18(0.24+0.61) -30.8%
7821.19: perl grep -i æ0.25(0.79+0.45)
0.19(0.32+0.56) -24.0%

In case that comes out misformatted it's also available at
https://github.com/avar/git/commit/ee5b2040e2c697e22a73c7b5f07f1b1e591f07e3

I.e. grepping is sped up by 50% and more in many cases, even for -G
and -E patterns which now get translated internally into PCRE
patterns.

> PCRE and PCRE2 also tend to have a lot of security updates, so I would
> prefer if we didn't import them into the tree.  It is far better for
> users to use their distro's packages for PCRE, as it means they get
> automatic 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread brian m. carlson
On Mon, May 08, 2017 at 04:10:41PM +0900, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  writes:
> 
> > This won't be in my next PCRE submission, but I have a path locally to
> > simply import PCRE into git.git as compat/pcre2, so it can be compiled
> > with NO_PCRE=Y similar to how NO_REGEX=Y works.
> >
> > This will hopefully address your concerns partially, i.e. when you do
> > want to try it out it'll be easier.
> 
> Eek, please don't.
> 
> Until pcre2 becomes _so_ stable that all reasonable distros give
> choice to the end-users to install it easily in a packaged form,
> such a "not a fork, but a copy" will be a moving target that I'd
> rather not to have in compat/.  IOW, our compat/$pkg should be a
> last resort to help those on distros that are so hard to convince to
> carry the version/variant of $pkg we would like to use.

PCRE and PCRE2 also tend to have a lot of security updates, so I would
prefer if we didn't import them into the tree.  It is far better for
users to use their distro's packages for PCRE, as it means they get
automatic security updates even if they're using an old Git.

We shouldn't consider shipping anything with a remotely frequent history
of security updates in our tree, since people very frequently run old or
ancient versions of Git.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> This won't be in my next PCRE submission, but I have a path locally to
> simply import PCRE into git.git as compat/pcre2, so it can be compiled
> with NO_PCRE=Y similar to how NO_REGEX=Y works.
>
> This will hopefully address your concerns partially, i.e. when you do
> want to try it out it'll be easier.

Eek, please don't.  

Until pcre2 becomes _so_ stable that all reasonable distros give
choice to the end-users to install it easily in a packaged form,
such a "not a fork, but a copy" will be a moving target that I'd
rather not to have in compat/.  IOW, our compat/$pkg should be a
last resort to help those on distros that are so hard to convince to
carry the version/variant of $pkg we would like to use.






Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread Ævar Arnfjörð Bjarmason
On Thu, May 4, 2017 at 1:32 PM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Thu, 4 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> So I think if your criteria for working on integrating v2 is users
>> noticing it elsewhere and asking you for it you'll likely never switch
>> to it.
>
> Speaking for myself, my biggest problem with your patch series is to find
> the time to work on packaging PCRE v2 as an MSYS2 package.

This won't be in my next PCRE submission, but I have a path locally to
simply import PCRE into git.git as compat/pcre2, so it can be compiled
with NO_PCRE=Y similar to how NO_REGEX=Y works.

This will hopefully address your concerns partially, i.e. when you do
want to try it out it'll be easier.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-04 Thread Ævar Arnfjörð Bjarmason
On Thu, May 4, 2017 at 1:32 PM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Thu, 4 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> So I think if your criteria for working on integrating v2 is users
>> noticing it elsewhere and asking you for it you'll likely never switch
>> to it.
>
> Speaking for myself, my biggest problem with your patch series is to find
> the time to work on packaging PCRE v2 as an MSYS2 package.
>
> If you force me to make that time *now*, by forcing the switch to v2 in
> `pu`, you will disimprove my mood. That's all.
>
> If, however, you are gently with people like me, offering this as an
> opt-in, so that I can play with it when I find some time to build PCRE v2
> in MSYS2's context, I will be pretty happy, and of course will ship Git
> for Windows with the faster PCRE support as soon as I am satisfied that it
> works correctly.
>
> That should make you happy, too, as you will get quite a bit of testers
> that way. Gently.

Sure, all makes sense. As noted I'll keep it from being the default for now.

The only reason I even replied & kept this thread going now was
because I noticed "make" not enabling it, REG_STARTEND making \0
searches work in practice (even if not specified by the NetBSD
extension) etc.

All stuff I would surely forget by the time we have the discussion
again down the road, so I wanted to get that E-Mail out now for future
reference.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-04 Thread Johannes Schindelin
Hi Ævar,

On Thu, 4 May 2017, Ævar Arnfjörð Bjarmason wrote:

> So I think if your criteria for working on integrating v2 is users
> noticing it elsewhere and asking you for it you'll likely never switch
> to it.

Speaking for myself, my biggest problem with your patch series is to find
the time to work on packaging PCRE v2 as an MSYS2 package.

If you force me to make that time *now*, by forcing the switch to v2 in
`pu`, you will disimprove my mood. That's all.

If, however, you are gently with people like me, offering this as an
opt-in, so that I can play with it when I find some time to build PCRE v2
in MSYS2's context, I will be pretty happy, and of course will ship Git
for Windows with the faster PCRE support as soon as I am satisfied that it
works correctly.

That should make you happy, too, as you will get quite a bit of testers
that way. Gently.

Ciao,
Dscho

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-04 Thread Ævar Arnfjörð Bjarmason
On Thu, May 4, 2017 at 11:11 AM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Wed, 3 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> [Just replying to you & Duy in the same mail, easier]
>
> It makes it harder on everybody else, though, as two slightly different
> discussion points are conflated now. Also, no single online mail archive
> will be able to render the thread correctly (assuming that you edited in
> the In-Reply-To header to loop back to Duy's mail).
>
>> On Wed, May 3, 2017 at 11:45 AM, Johannes Schindelin
>>  wrote:
>> >
>> > At this point, I feel that someone should recall into our collective
>> > memory what happened when we made a change similar in nature that
>> > broke existing build setups: by requiring REG_STARTEND all of a sudden
>> > ("you can easily flip the NO_REGEX switch", as if everybody should
>> > know about those Makefile flags we have).
>>
>> And as a result grep/log -G got faster by default,
>
> Sure. For those developers where the build was not broken.
>
> Software maintenance is always a trade-off, and with software as popular
> as Git, maintainers bear a special responsibility to *not* break builds
> easily, as it is more likely than not that anybody who wants to build Git
> is *unfamiliar* with the specifics.
>
> That is the main reason why we have a configure, even if we try hard to
> make things work with a straight `make`: people who are happily oblivious
> of our discussions on this here high-volume mailing list will be able to
> build Git without even consulting the documentation, just by doing what
> they would do with any Unix-based software: ./configure && make

And they still will. That sort of invocation will not build with
PCRE[1]. This change to v2 by default only impacts builders who are
already going out of their way to customize their build with
USE_LIBPCRE=YesPlease.

I don't think your comments make sense in that context. We're not even
talking about the advanced user/newbie developer who wants to build
git for the first time, we're talking about the user who's already
involved enough to start scouring the Makefile & tweaking the various
flags there.

Changing the current advice from "if you want PCRE enable this" to "if
you want PCRE enable this, we use v2 then, turn this other thing on if
you want v1" is a *very* small imposition on people already deeply
customizing their build to the extent of turning on the non-default
PCRE in the first place.

Yes that's a trade-off, and I'm not denying that some advanced users
will notice / be annoyed by that change. I just think the end-user
benefit of bringing v2 to the attention of distributors who are doing
such customized builds outweighs that.

>> and more importantly since v2.10.1 which includes your 2f8952250a and
>> made a REG_STARTEND engine a hard requirement nobody using git is
>> mysteriously going to miss grep results because of some stray \0 in the
>> string being matched.
>
> That is a misinterpretation of what the REG_STARTEND flag is supposed to
> do. In *some* implementations, REG_STARTEND allows NULs in the haystack.
> Some other implementations do not allow that. It is ill-defined.

We have a test in t7008-grep-binary.sh that'll fail if REG_STARTEND
doesn't behave as I described. Actually I think I got the history a
bit wrong here, I marked that test as succeeding in commit 7e36de5859
("t/t7008-grep-binary.sh: un-TODO a test that needs REG_STARTEND",
2010-08-17).

So I think since 2010 either distributors without REG_STARTEND have
been enabling NO_REGEX=YesPlease giving users the right behavior with
a \0 in their haystack, or their tests have been consistently failing
for 7 years, hopefully the former is the case.

>> I agree that I should drop the patch to make v2 the default from this
>> series for now. Clearly it's controversial, and can be considered on
>> its own merits once the supporting code is in. I'll do that in the
>> next submission, which I'm planning to send after v2.13.0 comes out.
>
> Good. I am really glad that we agree that the move to v2 should be a
> two-step process, with the uncontroversial "optionally use PCRE v2 for
> speed and robustness" first.
>
> Once enough users like it (and speaking for myself, once I heard from
> enough users how good it is so that I can justify to set aside enough time
> to support PCRE v2 in MSYS2), you will find it much smoother sailing to go
> to phase 2.

The v2 API is not a user-visible change. It'll help performance, and
over the long term subject users to fewer internal PCRE bugs in v1
that'll never get fixed.

That's pretty much a textbook example of the sort of thing users will
never even notice, but which is a good thing to do anyway, users don't
notice most of the incremental performance improvements made to git,
but they matter to them in the aggregate.

So I think if your criteria for working on integrating v2 is users
noticing it elsewhere and asking you for it you'll likely 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-04 Thread Johannes Schindelin
Hi Ævar,

On Wed, 3 May 2017, Ævar Arnfjörð Bjarmason wrote:

> [Just replying to you & Duy in the same mail, easier]

It makes it harder on everybody else, though, as two slightly different
discussion points are conflated now. Also, no single online mail archive
will be able to render the thread correctly (assuming that you edited in
the In-Reply-To header to loop back to Duy's mail).

> On Wed, May 3, 2017 at 11:45 AM, Johannes Schindelin
>  wrote:
> >
> > At this point, I feel that someone should recall into our collective
> > memory what happened when we made a change similar in nature that
> > broke existing build setups: by requiring REG_STARTEND all of a sudden
> > ("you can easily flip the NO_REGEX switch", as if everybody should
> > know about those Makefile flags we have).
> 
> And as a result grep/log -G got faster by default,

Sure. For those developers where the build was not broken.

Software maintenance is always a trade-off, and with software as popular
as Git, maintainers bear a special responsibility to *not* break builds
easily, as it is more likely than not that anybody who wants to build Git
is *unfamiliar* with the specifics.

That is the main reason why we have a configure, even if we try hard to
make things work with a straight `make`: people who are happily oblivious
of our discussions on this here high-volume mailing list will be able to
build Git without even consulting the documentation, just by doing what
they would do with any Unix-based software: ./configure && make

> and more importantly since v2.10.1 which includes your 2f8952250a and
> made a REG_STARTEND engine a hard requirement nobody using git is
> mysteriously going to miss grep results because of some stray \0 in the
> string being matched.

That is a misinterpretation of what the REG_STARTEND flag is supposed to
do. In *some* implementations, REG_STARTEND allows NULs in the haystack.
Some other implementations do not allow that. It is ill-defined.

> I agree that I should drop the patch to make v2 the default from this
> series for now. Clearly it's controversial, and can be considered on
> its own merits once the supporting code is in. I'll do that in the
> next submission, which I'm planning to send after v2.13.0 comes out.

Good. I am really glad that we agree that the move to v2 should be a
two-step process, with the uncontroversial "optionally use PCRE v2 for
speed and robustness" first.

Once enough users like it (and speaking for myself, once I heard from
enough users how good it is so that I can justify to set aside enough time
to support PCRE v2 in MSYS2), you will find it much smoother sailing to go
to phase 2.

Ciao,
Dscho

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-03 Thread Ævar Arnfjörð Bjarmason
[Just replying to you & Duy in the same mail, easier]

On Wed, May 3, 2017 at 11:45 AM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Tue, 2 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> On Tue, May 2, 2017 at 6:05 PM, Johannes Schindelin
>>  wrote:
>>
>> > [... complaint about the abrupt change that makes Git for Windows fail
>> >  to build because pcre2.h is missing ...]
>>
>> I think it's worth it to copy/paste the commit message where I made
>> this change, since we're having this discussion in this thread:
>>
>> Makefile & configure: make PCRE v2 the default PCRE implementation
>>
>> Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the
>> Makefile & configure script, respectively, to mean use PCRE v2, not
>> PCRE v1.
>>
>> The legacy library previously available via those options is still
>> available on request via USE_LIBPCRE1=YesPlease or
>> --with-libpcre1. The existing USE_LIBPCRE2=YesPlease & --with-libpcre2
>> still explicitly ask for v2.
>>
>> The v2 PCRE is stable & end-user compatible, all this change does is
>> change the default. Someone building a new git is likely to also have
>> packaged PCRE v2 sometime in the last 2 years since it was released.
>> If not they can choose to use the legacy v2 library by making the
>> trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2.
>>
>> New releases of PCRE v2 are already faster than PCRE v1, and not only
>> is all significant development is happening on v2, but bugs in
>> reported against v1 have started getting WONTFIX'd asking users to
>> just upgrade to v2.
>>
>> So it makes sense to give our downstream distributors a nudge to
>> switch over to it.
>
> I see where you are coming from.
>
> At this point, I feel that someone should recall into our collective
> memory what happened when we made a change similar in nature that broke
> existing build setups: by requiring REG_STARTEND all of a sudden ("you can
> easily flip the NO_REGEX switch", as if everybody should know about those
> Makefile flags we have).

And as a result grep/log -G got faster by default, and more
importantly since v2.10.1 which includes your 2f8952250a and made a
REG_STARTEND engine a hard requirement nobody using git is
mysteriously going to miss grep results because of some stray \0 in
the string being matched.

I agree that I should drop the patch to make v2 the default from this
series for now. Clearly it's controversial, and can be considered on
its own merits once the supporting code is in. I'll do that in the
next submission, which I'm planning to send after v2.13.0 comes out.

But aside from that, I must say I completely disagree with this line
of argument. I think to the extent that we did anything wrong with
NO_REGEX it's that I should have noticed that we could use
REG_STARTEND after I imported gawk's engine in compat/regex &
submitted a patch like yours back in 2010. Instead we got 6 years of
git releases where

Similarly, if PCRE v2 is less buggy & faster then we're looking at man
days/weeks and possibly months/years saved given how many people use
git v.s. how many are developing it & write build scripts for it.

Leaving user benefits on the table to optimize for reducing the
one-time inconvenience of packagers is the wrong move.

> [...]
On Wed, May 3, 2017 at 1:15 PM, Duy Nguyen  wrote:
> On Wed, May 3, 2017 at 4:45 PM, Johannes Schindelin
>  wrote:
>>> So it makes sense to give our downstream distributors a nudge to
>>> switch over to it.
>
> Some contributor (i.e. me) was not happy with this nudging though. The
> other day I switched to some branch (probably 'pu') and the build
> failed because, guess what, I didn't have pcre2 installed. If I set
> USE_LIBPCRE1 then I lose pcre support when switching to other
> branches. And no, I don't want to install libpcre2, not when I'm
> forced to this way.

Assuming that we'd want to change the default to v2, can you think of
a better way to handle this transition? As pointed out in
 I
can't.

Perhaps the best way would be to migrate this change directly to
master if there's consensus to move to v2 by default, that would avoid
the pain for developers related to switching between concurrent
branches while the change is cooking.

> 188 packages on Gentoo optionally depend on libpcre, 6 packages on
> libpcre2. Chances that a Gentoo user has libpcre2 already are rather
> low.

It sounds like two things are being conflated here, surely "a Gentoo
user", i.e. someone who's not actively contributing to git.git will
just get this via `emerge git` if this change were released, since the
ebuild will depend on libpcre2?

> I'll revisit my installation when the level of libpcre2 support
> grows a bit more than that.

There being few existing reverse 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-03 Thread Duy Nguyen
On Wed, May 3, 2017 at 4:45 PM, Johannes Schindelin
 wrote:
>> So it makes sense to give our downstream distributors a nudge to
>> switch over to it.

Some contributor (i.e. me) was not happy with this nudging though. The
other day I switched to some branch (probably 'pu') and the build
failed because, guess what, I didn't have pcre2 installed. If I set
USE_LIBPCRE1 then I lose pcre support when switching to other
branches. And no, I don't want to install libpcre2, not when I'm
forced to this way.

188 packages on Gentoo optionally depend on libpcre, 6 packages on
libpcre2. Chances that a Gentoo user has libpcre2 already are rather
low. I'll revisit my installation when the level of libpcre2 support
grows a bit more than that. You can nudge distributors directly,
probably more efficient too.

> ...
>
> I hate to be that someone, but it has to be said: this is a disruptive
> change, and it would be a lot better to make it an opt-in at first, and
> when the dust settled about this option and many distributions have opted
> in already because of the benefits and tested this thoroughly in practice,

Agreed.
-- 
Duy


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-03 Thread Johannes Schindelin
Hi Ævar,

On Tue, 2 May 2017, Ævar Arnfjörð Bjarmason wrote:

> On Tue, May 2, 2017 at 6:05 PM, Johannes Schindelin
>  wrote:
>
> > [... complaint about the abrupt change that makes Git for Windows fail
> >  to build because pcre2.h is missing ...]
>
> I think it's worth it to copy/paste the commit message where I made
> this change, since we're having this discussion in this thread:
> 
> Makefile & configure: make PCRE v2 the default PCRE implementation
> 
> Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the
> Makefile & configure script, respectively, to mean use PCRE v2, not
> PCRE v1.
> 
> The legacy library previously available via those options is still
> available on request via USE_LIBPCRE1=YesPlease or
> --with-libpcre1. The existing USE_LIBPCRE2=YesPlease & --with-libpcre2
> still explicitly ask for v2.
> 
> The v2 PCRE is stable & end-user compatible, all this change does is
> change the default. Someone building a new git is likely to also have
> packaged PCRE v2 sometime in the last 2 years since it was released.
> If not they can choose to use the legacy v2 library by making the
> trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2.
> 
> New releases of PCRE v2 are already faster than PCRE v1, and not only
> is all significant development is happening on v2, but bugs in
> reported against v1 have started getting WONTFIX'd asking users to
> just upgrade to v2.
> 
> So it makes sense to give our downstream distributors a nudge to
> switch over to it.

I see where you are coming from.

At this point, I feel that someone should recall into our collective
memory what happened when we made a change similar in nature that broke
existing build setups: by requiring REG_STARTEND all of a sudden ("you can
easily flip the NO_REGEX switch", as if everybody should know about those
Makefile flags we have).

I hate to be that someone, but it has to be said: this is a disruptive
change, and it would be a lot better to make it an opt-in at first, and
when the dust settled about this option and many distributions have opted
in already because of the benefits and tested this thoroughly in practice,
then we can think about making this an opt-out option, potentially putting
some serious thought into trying to make it painless to upgrade for
users casually compiling Git (and not, woohoo, being subscribed to the Git
mailing list, let alone knowing about all those Makefile knobs).

In short: it would be a mistake to force PCRE v2 support on by default.

Ciao,
Dscho

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Ævar Arnfjörð Bjarmason
On Tue, May 2, 2017 at 10:51 PM, Kevin Daudt  wrote:
> On Tue, May 02, 2017 at 08:52:21PM +0200, Ęvar Arnfjörš Bjarmason wrote:
>>
>>  * Due to the bizarro existing semantics of the configure script noted
>> upthread if you have a git build script that does --with-libpcre & you
>> have libpcre1 installed, it'll link to it, but now since
>> --with-libpcre defaults to libpcre2 it'll silently skip linking to it
>> if you don't have it installed.
>>
>
> Case in point: The Archlinux git-git aur package[0] (community maintained,
> latest git version) does run ./configure without --with-libpcre, but
>  requests it from make with USE_LIBPCRE=1.
>
> I noticed when trying git grep -P which then failed.
>
> [0]: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=git-git

Whatever's going on there is unrelated to the issue I'm talking about,
but if that's producing a package where -P doesn't work that looks
like a bug in the Makefile.

The way --with-libpcre works now is that it second guesses you, i.e.
it'll put USE_LIBPCRE=YesPlease into config.mak.autogen (sourced by
the Makefile) only if you supply --with-libpcre *and* it can find a
working libpcre on the system, otherwise it silently ignores you.

Whatever you supply to the configure script it'll be overridden by
Makefile arguments if present, but even if there was another bug where
./configure arguments took precedence over Makefile arguments I don't
see how it would apply here, in this case nothing libpcre related will
be written to config.mak.autogen.

So I must be missing something. I don't see how that package build
doesn't either fail at compile time because it doesn't have pcre, or
work, in which case the -P option will work.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Kevin Daudt
On Tue, May 02, 2017 at 08:52:21PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
>  * Due to the bizarro existing semantics of the configure script noted
> upthread if you have a git build script that does --with-libpcre & you
> have libpcre1 installed, it'll link to it, but now since
> --with-libpcre defaults to libpcre2 it'll silently skip linking to it
> if you don't have it installed.
> 

Case in point: The Archlinux git-git aur package[0] (community maintained,
latest git version) does run ./configure without --with-libpcre, but
 requests it from make with USE_LIBPCRE=1.

I noticed when trying git grep -P which then failed.

[0]: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=git-git


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Ævar Arnfjörð Bjarmason
On Tue, May 2, 2017 at 6:05 PM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Tue, 2 May 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> On Tue, May 2, 2017 at 2:09 PM, Johannes Schindelin
>>  wrote:
>> >
>> > On Sun, 30 Apr 2017, Junio C Hamano wrote:
>> >
>> >> * ab/grep-pcre-v2 (2017-04-25) 20 commits
>> >>  - SQUASH???
>> >>  - Makefile & configure: make PCRE v2 the default PCRE implementation
>> >>  - grep: remove support for concurrent use of both PCRE v1 & v2
>> >>  - grep: add support for PCRE v2
>> >>  - grep: add support for the PCRE v1 JIT API
>> >>  - perf: add a performance comparison test of grep -E and -P
>> >>  - grep: change the internal PCRE code & header names to be PCRE1
>> >>  - grep: change the internal PCRE macro names to be PCRE1
>> >>  - test-lib: rename the LIBPCRE prerequisite to PCRE
>> >>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
>> >>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
>> >>  - log: add -P as a synonym for --perl-regexp
>> >>  - log: add exhaustive tests for pattern style options & config
>> >>  - grep: add a test for backreferences in PCRE patterns
>> >>  - Makefile & configure: reword outdated comment about PCRE
>> >>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
>> >>  - grep: remove redundant regflags assignment under PCRE
>> >>  - grep: submodule-related case statements should die if new fields are 
>> >> added
>> >>  - grep: add tests for grep pattern types being passed to submodules
>> >>  - grep: amend submodule recursion test in preparation for rx engine 
>> >> testing
>> >>
>> >>  PCRE2, which has an API different from and incompatible with PCRE,
>> >>  can now be chosen to support "grep -P -e ''" and friends.
>> >
>> > FWIW for quite a couple of recent builds, `pu` fails on Windows with a
>> > variation of this error:
>> >
>> > CC blob.o
>> > In file included from revision.h:5:0,
>> >  from bisect.c:4:
>> > grep.h:16:19: fatal error: pcre2.h: No such file or directory
>> >  #include 
>> >^
>> > compilation terminated.
>> >
>> > Maybe this can be fixed before hitting `next`?
>>
>> This will be due to a combination of the build machine not having pcre
>> v2 (but having v1) & my "Makefile & configure: make PCRE v2 the
>> default PCRE implementation" patch, which makes v2 the default for
>> USE_LIBPCRE=YesPlease.
>>
>> Is it easy to install v2 on these build machines? Alternatively that
>> patch could be ejected out of pu, or you could USE_LIBPCRE1=YesPlease
>> to use v1, but as explained in that commit I think it makes sense to
>> make v2 the default.
>
> Let me put it this way: Installing PCRE v1 in MSYS2 is as easy as
>
> pacman -Sy mingw-w64-x86_64-pcre
>
> To install PCRE v2, you would have to copy-edit
> https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD,
> make sure that it builds by running it through
>
> makepkg-mingw -s
>
> possibly initializing a Git repository in the
> mingw-w64-pcre/src/${_realname}-${pkgver}/ directory, patching the source
> until it builds, committing the changes, adding them as patch files to the
> same directory as the new PKGBUILD, adjusting the PKGBUILD file to list
> the patch files with their checksums and to add the commands to apply
> them.
>
> Then (and this is a one time cost, fortunately) initializing two packages
> on BinTray (which we use to serve the Pacman packages of Git for Windows),
> then build and upload the packages.
>
> In short, PCRE v2 would be slightly (ahem, ahem) more involved than PCRE
> v1.
>
> I cannot imagine that MSYS2 is the only environment with that issue.

I think it's worth it to copy/paste the commit message where I made
this change, since we're having this discussion in this thread:

Makefile & configure: make PCRE v2 the default PCRE implementation

Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the
Makefile & configure script, respectively, to mean use PCRE v2, not
PCRE v1.

The legacy library previously available via those options is still
available on request via USE_LIBPCRE1=YesPlease or
--with-libpcre1. The existing USE_LIBPCRE2=YesPlease & --with-libpcre2
still explicitly ask for v2.

The v2 PCRE is stable & end-user compatible, all this change does is
change the default. Someone building a new git is likely to also have
packaged PCRE v2 sometime in the last 2 years since it was released.
If not they can choose to use the legacy v2 library by making the
trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2.

New releases of PCRE v2 are already faster than PCRE v1, and not only
is all significant development is happening on v2, but bugs in
reported against v1 have started getting WONTFIX'd asking users to
just upgrade to v2.

So it makes 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Ævar Arnfjörð Bjarmason
On Tue, May 2, 2017 at 7:43 PM, Brandon Williams  wrote:
> On 05/02, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, May 2, 2017 at 2:09 PM, Johannes Schindelin
>>  wrote:
>> > Hi Ævar,
>> >
>> > On Sun, 30 Apr 2017, Junio C Hamano wrote:
>> >
>> >> * ab/grep-pcre-v2 (2017-04-25) 20 commits
>> >>  - SQUASH???
>> >>  - Makefile & configure: make PCRE v2 the default PCRE implementation
>> >>  - grep: remove support for concurrent use of both PCRE v1 & v2
>> >>  - grep: add support for PCRE v2
>> >>  - grep: add support for the PCRE v1 JIT API
>> >>  - perf: add a performance comparison test of grep -E and -P
>> >>  - grep: change the internal PCRE code & header names to be PCRE1
>> >>  - grep: change the internal PCRE macro names to be PCRE1
>> >>  - test-lib: rename the LIBPCRE prerequisite to PCRE
>> >>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
>> >>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
>> >>  - log: add -P as a synonym for --perl-regexp
>> >>  - log: add exhaustive tests for pattern style options & config
>> >>  - grep: add a test for backreferences in PCRE patterns
>> >>  - Makefile & configure: reword outdated comment about PCRE
>> >>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
>> >>  - grep: remove redundant regflags assignment under PCRE
>> >>  - grep: submodule-related case statements should die if new fields are 
>> >> added
>> >>  - grep: add tests for grep pattern types being passed to submodules
>> >>  - grep: amend submodule recursion test in preparation for rx engine 
>> >> testing
>> >>
>> >>  PCRE2, which has an API different from and incompatible with PCRE,
>> >>  can now be chosen to support "grep -P -e ''" and friends.
>> >
>> > FWIW for quite a couple of recent builds, `pu` fails on Windows with a
>> > variation of this error:
>> >
>> > CC blob.o
>> > In file included from revision.h:5:0,
>> >  from bisect.c:4:
>> > grep.h:16:19: fatal error: pcre2.h: No such file or directory
>> >  #include 
>> >^
>> > compilation terminated.
>> >
>> > Maybe this can be fixed before hitting `next`?
>>
>> This will be due to a combination of the build machine not having pcre
>> v2 (but having v1) & my "Makefile & configure: make PCRE v2 the
>> default PCRE implementation" patch, which makes v2 the default for
>> USE_LIBPCRE=YesPlease.
>>
>> Is it easy to install v2 on these build machines? Alternatively that
>> patch could be ejected out of pu, or you could USE_LIBPCRE1=YesPlease
>> to use v1, but as explained in that commit I think it makes sense to
>> make v2 the default.
>
> Shouldn't the Makefile check for the existence of PCREv2 before making
> it the default?

We can't check for the existence of anything in the Makefile, that's
what the configure script is for.

The configure script does the "right" thing with pcre v2.

The "right" thing being the bizarro pre-existing behavior of "oh you
asked for PCRE? Well let me check if it's on the system, if it's not
I'll just silently undo what you asked for". Which is crazy, but was
there before my patches, so I didn't change it.

I.e. instead of the more sane behavior of opting the user into an
optional library if it's found on the system, and producing an error
if you supply --with-that-library but don't have thatlibrary.so.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Brandon Williams
On 05/02, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 2, 2017 at 2:09 PM, Johannes Schindelin
>  wrote:
> > Hi Ævar,
> >
> > On Sun, 30 Apr 2017, Junio C Hamano wrote:
> >
> >> * ab/grep-pcre-v2 (2017-04-25) 20 commits
> >>  - SQUASH???
> >>  - Makefile & configure: make PCRE v2 the default PCRE implementation
> >>  - grep: remove support for concurrent use of both PCRE v1 & v2
> >>  - grep: add support for PCRE v2
> >>  - grep: add support for the PCRE v1 JIT API
> >>  - perf: add a performance comparison test of grep -E and -P
> >>  - grep: change the internal PCRE code & header names to be PCRE1
> >>  - grep: change the internal PCRE macro names to be PCRE1
> >>  - test-lib: rename the LIBPCRE prerequisite to PCRE
> >>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
> >>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
> >>  - log: add -P as a synonym for --perl-regexp
> >>  - log: add exhaustive tests for pattern style options & config
> >>  - grep: add a test for backreferences in PCRE patterns
> >>  - Makefile & configure: reword outdated comment about PCRE
> >>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
> >>  - grep: remove redundant regflags assignment under PCRE
> >>  - grep: submodule-related case statements should die if new fields are 
> >> added
> >>  - grep: add tests for grep pattern types being passed to submodules
> >>  - grep: amend submodule recursion test in preparation for rx engine 
> >> testing
> >>
> >>  PCRE2, which has an API different from and incompatible with PCRE,
> >>  can now be chosen to support "grep -P -e ''" and friends.
> >
> > FWIW for quite a couple of recent builds, `pu` fails on Windows with a
> > variation of this error:
> >
> > CC blob.o
> > In file included from revision.h:5:0,
> >  from bisect.c:4:
> > grep.h:16:19: fatal error: pcre2.h: No such file or directory
> >  #include 
> >^
> > compilation terminated.
> >
> > Maybe this can be fixed before hitting `next`?
> 
> This will be due to a combination of the build machine not having pcre
> v2 (but having v1) & my "Makefile & configure: make PCRE v2 the
> default PCRE implementation" patch, which makes v2 the default for
> USE_LIBPCRE=YesPlease.
> 
> Is it easy to install v2 on these build machines? Alternatively that
> patch could be ejected out of pu, or you could USE_LIBPCRE1=YesPlease
> to use v1, but as explained in that commit I think it makes sense to
> make v2 the default.

Shouldn't the Makefile check for the existence of PCREv2 before making
it the default?

-- 
Brandon Williams


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Johannes Schindelin
Hi Ævar,

On Tue, 2 May 2017, Ævar Arnfjörð Bjarmason wrote:

> On Tue, May 2, 2017 at 2:09 PM, Johannes Schindelin
>  wrote:
> >
> > On Sun, 30 Apr 2017, Junio C Hamano wrote:
> >
> >> * ab/grep-pcre-v2 (2017-04-25) 20 commits
> >>  - SQUASH???
> >>  - Makefile & configure: make PCRE v2 the default PCRE implementation
> >>  - grep: remove support for concurrent use of both PCRE v1 & v2
> >>  - grep: add support for PCRE v2
> >>  - grep: add support for the PCRE v1 JIT API
> >>  - perf: add a performance comparison test of grep -E and -P
> >>  - grep: change the internal PCRE code & header names to be PCRE1
> >>  - grep: change the internal PCRE macro names to be PCRE1
> >>  - test-lib: rename the LIBPCRE prerequisite to PCRE
> >>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
> >>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
> >>  - log: add -P as a synonym for --perl-regexp
> >>  - log: add exhaustive tests for pattern style options & config
> >>  - grep: add a test for backreferences in PCRE patterns
> >>  - Makefile & configure: reword outdated comment about PCRE
> >>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
> >>  - grep: remove redundant regflags assignment under PCRE
> >>  - grep: submodule-related case statements should die if new fields are 
> >> added
> >>  - grep: add tests for grep pattern types being passed to submodules
> >>  - grep: amend submodule recursion test in preparation for rx engine 
> >> testing
> >>
> >>  PCRE2, which has an API different from and incompatible with PCRE,
> >>  can now be chosen to support "grep -P -e ''" and friends.
> >
> > FWIW for quite a couple of recent builds, `pu` fails on Windows with a
> > variation of this error:
> >
> > CC blob.o
> > In file included from revision.h:5:0,
> >  from bisect.c:4:
> > grep.h:16:19: fatal error: pcre2.h: No such file or directory
> >  #include 
> >^
> > compilation terminated.
> >
> > Maybe this can be fixed before hitting `next`?
> 
> This will be due to a combination of the build machine not having pcre
> v2 (but having v1) & my "Makefile & configure: make PCRE v2 the
> default PCRE implementation" patch, which makes v2 the default for
> USE_LIBPCRE=YesPlease.
> 
> Is it easy to install v2 on these build machines? Alternatively that
> patch could be ejected out of pu, or you could USE_LIBPCRE1=YesPlease
> to use v1, but as explained in that commit I think it makes sense to
> make v2 the default.

Let me put it this way: Installing PCRE v1 in MSYS2 is as easy as

pacman -Sy mingw-w64-x86_64-pcre

To install PCRE v2, you would have to copy-edit
https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD,
make sure that it builds by running it through

makepkg-mingw -s

possibly initializing a Git repository in the
mingw-w64-pcre/src/${_realname}-${pkgver}/ directory, patching the source
until it builds, committing the changes, adding them as patch files to the
same directory as the new PKGBUILD, adjusting the PKGBUILD file to list
the patch files with their checksums and to add the commands to apply
them.

Then (and this is a one time cost, fortunately) initializing two packages
on BinTray (which we use to serve the Pacman packages of Git for Windows),
then build and upload the packages.

In short, PCRE v2 would be slightly (ahem, ahem) more involved than PCRE
v1.

I cannot imagine that MSYS2 is the only environment with that issue.

Ciao,
Dscho

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Ævar Arnfjörð Bjarmason
On Tue, May 2, 2017 at 2:09 PM, Johannes Schindelin
 wrote:
> Hi Ævar,
>
> On Sun, 30 Apr 2017, Junio C Hamano wrote:
>
>> * ab/grep-pcre-v2 (2017-04-25) 20 commits
>>  - SQUASH???
>>  - Makefile & configure: make PCRE v2 the default PCRE implementation
>>  - grep: remove support for concurrent use of both PCRE v1 & v2
>>  - grep: add support for PCRE v2
>>  - grep: add support for the PCRE v1 JIT API
>>  - perf: add a performance comparison test of grep -E and -P
>>  - grep: change the internal PCRE code & header names to be PCRE1
>>  - grep: change the internal PCRE macro names to be PCRE1
>>  - test-lib: rename the LIBPCRE prerequisite to PCRE
>>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
>>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
>>  - log: add -P as a synonym for --perl-regexp
>>  - log: add exhaustive tests for pattern style options & config
>>  - grep: add a test for backreferences in PCRE patterns
>>  - Makefile & configure: reword outdated comment about PCRE
>>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
>>  - grep: remove redundant regflags assignment under PCRE
>>  - grep: submodule-related case statements should die if new fields are added
>>  - grep: add tests for grep pattern types being passed to submodules
>>  - grep: amend submodule recursion test in preparation for rx engine testing
>>
>>  PCRE2, which has an API different from and incompatible with PCRE,
>>  can now be chosen to support "grep -P -e ''" and friends.
>
> FWIW for quite a couple of recent builds, `pu` fails on Windows with a
> variation of this error:
>
> CC blob.o
> In file included from revision.h:5:0,
>  from bisect.c:4:
> grep.h:16:19: fatal error: pcre2.h: No such file or directory
>  #include 
>^
> compilation terminated.
>
> Maybe this can be fixed before hitting `next`?

This will be due to a combination of the build machine not having pcre
v2 (but having v1) & my "Makefile & configure: make PCRE v2 the
default PCRE implementation" patch, which makes v2 the default for
USE_LIBPCRE=YesPlease.

Is it easy to install v2 on these build machines? Alternatively that
patch could be ejected out of pu, or you could USE_LIBPCRE1=YesPlease
to use v1, but as explained in that commit I think it makes sense to
make v2 the default.


PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-02 Thread Johannes Schindelin
Hi Ævar,

On Sun, 30 Apr 2017, Junio C Hamano wrote:

> * ab/grep-pcre-v2 (2017-04-25) 20 commits
>  - SQUASH???
>  - Makefile & configure: make PCRE v2 the default PCRE implementation
>  - grep: remove support for concurrent use of both PCRE v1 & v2
>  - grep: add support for PCRE v2
>  - grep: add support for the PCRE v1 JIT API
>  - perf: add a performance comparison test of grep -E and -P
>  - grep: change the internal PCRE code & header names to be PCRE1
>  - grep: change the internal PCRE macro names to be PCRE1
>  - test-lib: rename the LIBPCRE prerequisite to PCRE
>  - grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
>  - grep & rev-list doc: stop promising libpcre for --perl-regexp
>  - log: add -P as a synonym for --perl-regexp
>  - log: add exhaustive tests for pattern style options & config
>  - grep: add a test for backreferences in PCRE patterns
>  - Makefile & configure: reword outdated comment about PCRE
>  - grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
>  - grep: remove redundant regflags assignment under PCRE
>  - grep: submodule-related case statements should die if new fields are added
>  - grep: add tests for grep pattern types being passed to submodules
>  - grep: amend submodule recursion test in preparation for rx engine testing
> 
>  PCRE2, which has an API different from and incompatible with PCRE,
>  can now be chosen to support "grep -P -e ''" and friends.

FWIW for quite a couple of recent builds, `pu` fails on Windows with a
variation of this error:

CC blob.o
In file included from revision.h:5:0,
 from bisect.c:4:
grep.h:16:19: fatal error: pcre2.h: No such file or directory
 #include 
   ^
compilation terminated.

Maybe this can be fixed before hitting `next`?

Ciao,
Dscho