Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Fri, Jul 31, 2020 at 8:39 AM Li-Wen Hsu wrote:
>
> On Fri, Jul 31, 2020 at 9:50 AM Kyle Evans wrote:
> >
> > On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans wrote:
> > >
> > > On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu wrote:
> > > >
> > > > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
> > > > >
> > > > > Author: kevans
> > > > > Date: Wed Jul 29 23:21:56 2020
> > > > > New Revision: 363679
> > > > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > > > >
> > > > > Log:
> > > > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > > > >
> > > > > [...]
> > > >
> > > > I think there are 3 test cases need to be modified after this change:
> > > >
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> > > >
> > >
> > > CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
> > >
> > > Testing my libregex GNU extensions revealed that I'm really not ready
> > > to commit that just yet. We have two options here for googletest:
> > >
> > > 1. Disable it and create a PR to be fixed when my changes are done,
> > > hopefully by the end of the week, or
> > > 2. Fix the expressions in
> > > contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> > > compliant and upstream that.
> > >
> > > #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> > > [^[:alnum:]] and maybe \s -> [[:space:]].
> > >
> >
> > Sorry, to be more precise: disable it meaning expect failure of that
> > specific test or something similar.
>
> I think there's no need to let a known issue generate lots of failure
> reports for more than 24 hours, I suggest let's go with 1) first. For
> 2), It's also good that both libregex and googletest can aware the
> difference between POSIX and GNU extensions, but I am not sure how
> upstream thinks about this. Still worth trying, though.
>

Sure, if you have time and no one objects, please proceed with #1 (no
time at the moment myself) and I'll get it fixed this weekend, even if
I have to hold back implementation of some of the GNU extensions to nab
the few that googletest's tests care about.

Thanks,

Kyle Evans
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Fri, Jul 31, 2020 at 9:50 AM Kyle Evans wrote:
>
> On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans wrote:
> >
> > On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu wrote:
> > >
> > > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
> > > >
> > > > Author: kevans
> > > > Date: Wed Jul 29 23:21:56 2020
> > > > New Revision: 363679
> > > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > > >
> > > > Log:
> > > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > > >
> > > > [...]
> > >
> > > I think there are 3 test cases need to be modified after this change:
> > >
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> > >
> >
> > CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
> >
> > Testing my libregex GNU extensions revealed that I'm really not ready
> > to commit that just yet. We have two options here for googletest:
> >
> > 1. Disable it and create a PR to be fixed when my changes are done,
> > hopefully by the end of the week, or
> > 2. Fix the expressions in
> > contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> > compliant and upstream that.
> >
> > #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> > [^[:alnum:]] and maybe \s -> [[:space:]].
> >
>
> Sorry, to be more precise: disable it meaning expect failure of that
> specific test or something similar.

I think there's no need to let a known issue generate lots of failure
reports for more than 24 hours; I suggest we go with 1) first. For 2),
it's also good for both libregex and googletest to be aware of the
difference between POSIX and GNU extensions, but I am not sure what
upstream thinks about this. Still worth trying, though.

Best,
Li-Wen
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans wrote:
>
> On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu wrote:
> >
> > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
> > >
> > > Author: kevans
> > > Date: Wed Jul 29 23:21:56 2020
> > > New Revision: 363679
> > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > >
> > > Log:
> > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > >
> > > [...]
> > >
> > > This is the final piece needed before enhancing libregex with GNU extensions
> > > and flipping the switch on bsdgrep.
> > >
> > > [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> > >
> > > PR: 229925 (exp-run, courtesy of antoine)
> > > Differential Revision: https://reviews.freebsd.org/D10510
> > >
> > > Modified:
> > >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> > >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> > >   head/lib/libc/regex/Symbol.map
> > >   head/lib/libc/regex/regcomp.c
> >
> > I think there are 3 test cases need to be modified after this change:
> >
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> >
>
> CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
>
> Testing my libregex GNU extensions revealed that I'm really not ready
> to commit that just yet. We have two options here for googletest:
>
> 1. Disable it and create a PR to be fixed when my changes are done,
> hopefully by the end of the week, or
> 2. Fix the expressions in
> contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> compliant and upstream that.
>
> #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> [^[:alnum:]] and maybe \s -> [[:space:]].
>

Sorry, to be more precise: disable it meaning expect failure of that
specific test or something similar.
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu wrote:
>
> On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
> >
> > Author: kevans
> > Date: Wed Jul 29 23:21:56 2020
> > New Revision: 363679
> > URL: https://svnweb.freebsd.org/changeset/base/363679
> >
> > Log:
> >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> >
> > [...]
> >
> > [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> >
> > PR: 229925 (exp-run, courtesy of antoine)
> > Differential Revision: https://reviews.freebsd.org/D10510
> >
> > Modified:
> >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> >   head/lib/libc/regex/Symbol.map
> >   head/lib/libc/regex/regcomp.c
>
> I think there are 3 test cases need to be modified after this change:
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
>

CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.

Testing my libregex GNU extensions revealed that I'm really not ready
to commit that just yet. We have two options here for googletest:

1. Disable it and create a PR to be fixed when my changes are done,
   hopefully by the end of the week, or
2. Fix the expressions in
   contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
   compliant and upstream that.

#2 is generally a replacement of \w -> [[:alnum:]] and \W ->
[^[:alnum:]] and maybe \s -> [[:space:]].

Thoughts?

Thanks,

Kyle Evans
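For illustration, the kind of rewrite option #2 describes can be sketched
in C against the plain POSIX regcomp(3)/regexec(3) interface. This is only
a sketch: the helper name ere_matches is made up for the example and is
not taken from googletest or libregex. Note that GNU's \w also matches an
underscore, so [[:alnum:]_] is the closer spelling where that matters.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if `text` matches the POSIX ERE `pattern`,
 * 0 if it does not match, and -1 if the pattern fails to compile.
 */
int
ere_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);

	/* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

With this, a GNU-style "^\w+$" would be written "^[[:alnum:]_]+$", "\W"
as "[^[:alnum:]_]", and "\s" as "[[:space:]]"; all three use only bracket
expressions and character classes that POSIX defines.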
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On 30.07.20 at 13:54, Kyle Evans wrote:
> On Thu, Jul 30, 2020 at 6:48 AM Gordon Bergling wrote:
>> I got the same error this morning and was able to solve it by doing a full
>> buildworld without NO_CLEAN=yes.
>>
>> You may want to try this in case you are using NO_CLEAN=yes.
>>
>
> This is interesting; there shouldn't be any NO_CLEAN implications with
> this change. There were no dependency changes, libc should definitely
> get rebuilt because regcomp.c changed and thus, the libc in your
> objdir should have the symbol. The binary referenced above is one that
> we symlink into OBJDIR from the host system.
>
> I think it's also likely your problem was just fixed by the second
> installworld. The first one will manage to get libc installed, but not
> before you get errors from all the other stuff.

This appears to be true: after once completing installworld with
WITHOUT_TESTS=yes, the build and installation also succeed for
subsequent runs with WITH_TESTS=yes.

My guess is that "make install" in tests tries to link against the base
system version of the library, and the freshly built one with the correct
symbol version has not been installed yet.

Regards, STefan
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On 30.07.20 at 13:48, Gordon Bergling wrote:
> On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
>> On 30.07.20 at 01:21, Kyle Evans wrote:
>> [...]
>>> This change bumps the symbol version of regcomp to FBSD_1.6 and provides
>>> the old escape semantics for legacy applications, just in case one has an
>>> older application that would immediately turn into a pumpkin because of an
>>> extraneous escape that's embedded or otherwise critical to its operation.
>>
>> I get an error during make buildworld with option WITH_TESTS=yes:
>>
>> ===> usr.bin/bmake/tests (install)
>> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
>> symbol "regcomp@FBSD_1.6"
>>
>> Regards, STefan
>
> I got the same error this morning and was able to solve it by doing a full
> buildworld without NO_CLEAN=yes.
>
> You may want to try this in case you are using NO_CLEAN=yes.

Too late ... but thanks for the hint ...

I have restarted make buildworld installworld on an unmodified source
tree from when the error occurred and it just finished, without error
this time.

Maybe it will work with WITH_TESTS too, now. I'll start another
build/install cycle now and will report back.

If it does not work, I'll try without NO_CLEAN (I'm building with
META_MODE enabled, normally).

Regards, STefan
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Thu, Jul 30, 2020 at 6:48 AM Gordon Bergling wrote:
>
> On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
> > On 30.07.20 at 01:21, Kyle Evans wrote:
> > [...]
> > > This change bumps the symbol version of regcomp to FBSD_1.6 and
> > > provides the old escape semantics for legacy applications, just in
> > > case one has an older application that would immediately turn into a
> > > pumpkin because of an extraneous escape that's embedded or otherwise
> > > critical to its operation.
> >
> > I get an error during make buildworld with option WITH_TESTS=yes:
> >
> > ===> usr.bin/bmake/tests (install)
> > ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> > symbol "regcomp@FBSD_1.6"
> >
> > Regards, STefan
>
> I got the same error this morning and was able to solve it by doing a full
> buildworld without NO_CLEAN=yes.
>
> You may want to try this in case you are using NO_CLEAN=yes.
>

This is interesting; there shouldn't be any NO_CLEAN implications with
this change. There were no dependency changes; libc should definitely
get rebuilt because regcomp.c changed, and thus the libc in your
objdir should have the symbol. The binary referenced above is one that
we symlink into OBJDIR from the host system.

I think it's also likely your problem was just fixed by the second
installworld. The first one will manage to get libc installed, but not
before you get errors from all the other stuff.

Thanks,

Kyle Evans
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
> On 30.07.20 at 01:21, Kyle Evans wrote:
> [...]
> > This change bumps the symbol version of regcomp to FBSD_1.6 and provides
> > the old escape semantics for legacy applications, just in case one has an
> > older application that would immediately turn into a pumpkin because of an
> > extraneous escape that's embedded or otherwise critical to its operation.
>
> I get an error during make buildworld with option WITH_TESTS=yes:
>
> ===> usr.bin/bmake/tests (install)
> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> symbol "regcomp@FBSD_1.6"
>
> Regards, STefan

I got the same error this morning and was able to solve it by doing a full
buildworld without NO_CLEAN=yes.

You may want to try this in case you are using NO_CLEAN=yes.

--Gordon
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Thu, Jul 30, 2020 at 6:26 AM Stefan Eßer wrote:
>
> On 30.07.20 at 01:21, Kyle Evans wrote:
> [...]
> > This change bumps the symbol version of regcomp to FBSD_1.6 and provides
> > the old escape semantics for legacy applications, just in case one has an
> > older application that would immediately turn into a pumpkin because of an
> > extraneous escape that's embedded or otherwise critical to its operation.
>
> I get an error during make buildworld with option WITH_TESTS=yes:
>
> ===> usr.bin/bmake/tests (install)
> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> symbol "regcomp@FBSD_1.6"
>
> Regards, STefan

Hi,

Can you describe the environment in which you're running installworld,
please? I.e., is it just a raw installworld directly in your shell, or
something more complicated?

I observed this in testing an exceptional scenario: running
installworld in a buildenv. installworld injects .WAIT between lib and
libexec + other subdirs, which is supposed to prevent stuff like this
(a new binary getting installed, linked against the new libc, before
the new libc itself is installed). Running in a buildenv set SYSROOT
and stripped out the .WAITs, leaving me with an annoyance where I had
to installworld twice.

Thanks,

Kyle Evans
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On 30.07.20 at 01:21, Kyle Evans wrote:
[...]
> This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
> old escape semantics for legacy applications, just in case one has an older
> application that would immediately turn into a pumpkin because of an
> extraneous escape that's embedded or otherwise critical to its operation.

I get an error during make buildworld with option WITH_TESTS=yes:

===> usr.bin/bmake/tests (install)
ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined symbol "regcomp@FBSD_1.6"

Regards, STefan
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
Sorry, on mobile, so doubling down on bad formatting by top-posting...

The sed/diff tests are easy to fix, will do those in about 8/9 hours.

The Google test failure is interesting: this expression has clearly been
wrong and getting the wrong results, so we've caught a legitimate issue
here. I think the best path forward for that one is to commit my libregex
extensions and link that baby up so that \w works.

Thanks,

Kyle Evans

On Wed, Jul 29, 2020, 22:53 Li-Wen Hsu wrote:

> On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
> >
> > Author: kevans
> > Date: Wed Jul 29 23:21:56 2020
> > New Revision: 363679
> > URL: https://svnweb.freebsd.org/changeset/base/363679
> >
> > Log:
> >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> >
> > [...]
>
> I think there are 3 test cases need to be modified after this change:
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
>
> Please help to check them, thanks!
>
> Li-Wen
Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans wrote:
>
> Author: kevans
> Date: Wed Jul 29 23:21:56 2020
> New Revision: 363679
> URL: https://svnweb.freebsd.org/changeset/base/363679
>
> Log:
>   regex(3): Interpret many escaped ordinary characters as EESCAPE
>
> [...]
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
>
> PR: 229925 (exp-run, courtesy of antoine)
> Differential Revision: https://reviews.freebsd.org/D10510
>
> Modified:
>   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
>   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
>   head/lib/libc/regex/Symbol.map
>   head/lib/libc/regex/regcomp.c

I think there are 3 test cases that need to be modified after this change:

https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/

Please help to check them, thanks!

Li-Wen
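The behaviour change described in the commit log can be probed with a few
lines of C. This is a rough sketch against the POSIX regcomp(3) interface
only; the helper name compile_status is hypothetical. On a FreeBSD libc
that includes r363679, compiling "a\bc" is expected to fail with
REG_EESCAPE, in line with the updated meta.in entry. Other libcs may
accept the same pattern (GNU treats \b as a word boundary), since the
standard leaves the result undefined.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile `pattern` with the given cflags and return
 * regcomp(3)'s status, i.e. 0 on success or a REG_* error code.
 */
int
compile_status(const char *pattern, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL);

	rc = regcomp(&re, pattern, cflags);
	if (rc == 0)
		regfree(&re);	/* only a successfully compiled regex_t is freed */
	return (rc);
}
```

A caller would compare compile_status("a\\bc", REG_EXTENDED) against
REG_EESCAPE to see which interpretation the local libc applies. An escaped
special character such as "a\\*c" should compile everywhere, and a
trailing backslash ("a\\") is the EESCAPE case that meta.in already
covered before this change.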
svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex
Author: kevans
Date: Wed Jul 29 23:21:56 2020
New Revision: 363679
URL: https://svnweb.freebsd.org/changeset/base/363679

Log:
  regex(3): Interpret many escaped ordinary characters as EESCAPE

  In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
  any character to be escaped, but "ORD_CHAR preceded by an unescaped
  <backslash> character [gives undefined results]".

  Historically, we've interpreted an escaped ordinary character as the
  ordinary character itself. This becomes problematic when some extensions
  give special meanings to an otherwise ordinary character
  (e.g. GNU's \b, \s, \w), meaning we may have two different valid
  interpretations of the same sequence.

  To make this easier to deal with and given that the standard calls this
  undefined, we should throw an error (EESCAPE) if we run into this scenario
  to ease transition into a state where some escaped ordinaries are blessed
  with a special meaning -- it will either error out or have extended
  behavior, rather than have two entirely different versions of undefined
  behavior that leave the consumer of regex(3) guessing as to what behavior
  will be used or leaving them with false impressions.

  This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
  old escape semantics for legacy applications, just in case one has an older
  application that would immediately turn into a pumpkin because of an
  extraneous escape that's embedded or otherwise critical to its operation.

  This is the final piece needed before enhancing libregex with GNU extensions
  and flipping the switch on bsdgrep.

  [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/

  PR: 229925 (exp-run, courtesy of antoine)
  Differential Revision: https://reviews.freebsd.org/D10510

Modified:
  head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
  head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
  head/lib/libc/regex/Symbol.map
  head/lib/libc/regex/regcomp.c

Modified: head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
==
--- head/contrib/netbsd-tests/lib/libc/regex/data/meta.in Wed Jul 29 23:17:16 2020(r363678)
+++ head/contrib/netbsd-tests/lib/libc/regex/data/meta.in Wed Jul 29 23:21:56 2020(r363679)
@@ -4,7 +4,9 @@ a[bc]d & abd abd
 a\*c & a*c a*c
 a\\b & a\b a\b
 a\\\*b & a\*ba\*b
-a\bc & abc abc
+# Begin FreeBSD
+a\bc &C EESCAPE
+# End FreeBSD
 a\ &C EESCAPE
 a\\bc & a\bca\bc
 \{ bC BADRPT

Modified: head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
==
--- head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in Wed Jul 29 23:17:16 2020(r363678)
+++ head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in Wed Jul 29 23:21:56 2020(r363679)
@@ -12,7 +12,7 @@ a(b+)c- abbbc abbbc bbb
 a(b*)c - ac ac @c
 (a|ab)(bc([de]+)f|cde) - abcdef abcdef a,bcdef,de
 # Begin FreeBSD
-a\(b\|c\)d b ab|cd ab|cd b|c
+a\(b|c\)d b ab|cd ab|cd b|c
 # End FreeBSD
 # the regression tester only asks for 9 subexpressions
 a(b)(c)(d)(e)(f)(g)(h)(i)(j)k - abcdefghijk abcdefghijk b,c,d,e,f,g,h,i,j

Modified: head/lib/libc/regex/Symbol.map
==
--- head/lib/libc/regex/Symbol.map Wed Jul 29 23:17:16 2020 (r363678)
+++ head/lib/libc/regex/Symbol.map Wed Jul 29 23:21:56 2020 (r363679)
@@ -3,8 +3,11 @@
  */
 
 FBSD_1.0 {
-	regcomp;
 	regerror;
 	regexec;
 	regfree;
+};
+
+FBSD_1.6 {
+	regcomp;
 };

Modified: head/lib/libc/regex/regcomp.c
==
--- head/lib/libc/regex/regcomp.c Wed Jul 29 23:17:16 2020 (r363678)
+++ head/lib/libc/regex/regcomp.c Wed Jul 29 23:21:56 2020 (r363679)
@@ -102,11 +102,14 @@ struct parse {
 	sopno pend[NPAREN]; /* -> ) ([0] unused) */
 	bool allowbranch; /* can this expression branch? */
 	bool bre; /* convenience; is this a BRE? */
+	int pflags; /* other parsing flags -- legacy escapes? */
 	bool (*parse_expr)(struct parse *, struct branchc *);
 	void (*pre_parse)(struct parse *, struct branchc *);
 	void (*post_parse)(struct parse *, struct branchc *);
 };
 
+#define PFLAG_LEGACY_ESC 0x0001
+
 /* = begin header generated by ./mkh = */
 #ifdef __cplusplus
 extern "C" {
@@ -132,6 +135,7 @@ static void p_b_cclass(struct parse *p, cset *cs);
 static void p_b_eclass(struct parse *p, cset *cs)