Bug#355442: mawk: missing Posix ERE curly braces
On Mon, Aug 23, 2021 at 01:20:27PM -0400, Greg Wooledge wrote:
> The version of mawk shipped with bullseye still has this bug (or what
> upstream calls a known limitation).

yes. I spent some time last year (July-October) working on mawk, and have
most of this feature implemented. However, I won't make a new version
available until I finish that (since it would only lead to bug reports
by end-users).

Before that, I intend fixing breakage here:

https://github.com/macports/macports-ports/blob/master/editors/vile/Portfile

... and at the moment am working on xterm.

> I've also verified that the bug occurs with upstream mawk, version
> 1.3.4-20200120, both with and without the --without-builtin-regex
> compile time option.
>
> unicorn:~/tmp/mawk-1.3.4-20200120$ ./configure --without-builtin-regex
> [ output omitted ]
> unicorn:~/tmp/mawk-1.3.4-20200120$ make
> [ output omitted ]
> unicorn:~/tmp/mawk-1.3.4-20200120$ echo 99 | ./mawk '/^9{1}/'
> unicorn:~/tmp/mawk-1.3.4-20200120$
>
> (Output should have been 99.)
>
> Since it's failing even with --without-builtin-regex I figure this might
> be something upstream should hear about, so I'm Cc-ing Mr. Dickey.

I get the mail without a cc.

--
Thomas E. Dickey
https://invisible-island.net
ftp://ftp.invisible-island.net
Bug#355442: mawk: missing Posix ERE curly braces
The version of mawk shipped with bullseye still has this bug (or what
upstream calls a known limitation).

I've also verified that the bug occurs with upstream mawk, version
1.3.4-20200120, both with and without the --without-builtin-regex
compile time option.

unicorn:~/tmp/mawk-1.3.4-20200120$ ./configure --without-builtin-regex
[ output omitted ]
unicorn:~/tmp/mawk-1.3.4-20200120$ make
[ output omitted ]
unicorn:~/tmp/mawk-1.3.4-20200120$ echo 99 | ./mawk '/^9{1}/'
unicorn:~/tmp/mawk-1.3.4-20200120$

(Output should have been 99.)

Since it's failing even with --without-builtin-regex I figure this might
be something upstream should hear about, so I'm Cc-ing Mr. Dickey.
Bug#355442: mawk: missing Posix ERE curly braces
Hi,

Fabien COUTANT wrote:
> I am talking about {m}/{m,}/{m,n} at the place where you use ?, *
> or +.
>
> The sample program:
>
> BEGIN {
>     s="abacab"
>     r1="(a[bc]){2,3}"
>     r2="(a[bc])(a[bc])+"
>     print s~r1, s~r2
> }
>
> should print "1 1" but it doesn't.

With Aleksey Cheusov’s patch, this can be fixed by using libc regexp,
but ideally I would like mawk’s internal regexp to handle this, too.

I have a skeleton of an implementation mocked up, but I stopped when the
relationship between REcompile and RE_lex started getting ugly and I ran
out of time. How should the pair of interval endpoints be passed from
RE_lex to REcompile? And how should they be stored on the op_stack?

Ideas welcome.

Jonathan
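One possible shape for an answer to the question above, as a minimal sketch only: the lexer parses the brace expression into a small struct of two integers, and the parser keeps that struct next to the repetition operator. Every name here (`struct interval`, `parse_interval`, `read_count`, `INTERVAL_UNBOUNDED`, `MAX_REPEAT`) is hypothetical and is not taken from the mawk sources; the cap of 255 is an assumption based on the smallest RE_DUP_MAX that POSIX permits.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical payload for a repetition token; not mawk's real structure. */
struct interval {
    int min;    /* lower bound m */
    int max;    /* upper bound n, or INTERVAL_UNBOUNDED for {m,} */
};

#define INTERVAL_UNBOUNDED (-1)
#define MAX_REPEAT 255          /* assumed cap, mirrors _POSIX_RE_DUP_MAX */

/* Read a decimal count at *pp and advance *pp past it.
 * Returns the value, or -1 if there are no digits or the value is too big. */
static int read_count(const char **pp)
{
    const char *p = *pp;
    int value = 0;

    if (!isdigit((unsigned char)*p))
        return -1;
    while (isdigit((unsigned char)*p)) {
        value = value * 10 + (*p - '0');
        if (value > MAX_REPEAT)
            return -1;
        p++;
    }
    *pp = p;
    return value;
}

/* Parse "{m}", "{m,}" or "{m,n}" at the start of s.
 * On success, fills *out and returns the number of characters consumed.
 * Returns 0 (leaving *out untouched) if s does not start a valid interval. */
size_t parse_interval(const char *s, struct interval *out)
{
    const char *p = s;
    int lo, hi;

    assert(s != NULL && out != NULL);
    if (*p++ != '{')
        return 0;
    lo = read_count(&p);
    if (lo < 0)
        return 0;
    if (*p == '}') {
        hi = lo;                        /* {m} means exactly m */
    } else if (*p == ',') {
        p++;
        if (*p == '}') {
            hi = INTERVAL_UNBOUNDED;    /* {m,} means m or more */
        } else {
            hi = read_count(&p);
            if (hi < lo)                /* missing digits (-1) or n < m */
                return 0;
        }
    } else {
        return 0;
    }
    if (*p != '}')
        return 0;
    p++;
    out->min = lo;
    out->max = hi;
    return (size_t)(p - s);
}
```

With a helper in this style, a lexer could return a single "repeat" token and hand the two bounds back through the struct, so an operator stack would only need to carry two extra integers per entry. Whether that fits the actual RE_lex/REcompile interface is not something this sketch can settle.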
Bug#355442: mawk: missing Posix ERE curly braces
FC> On Monday, 06 March 2006, you (Aleksey Cheusov) wrote:
>> Look at this patch
>> http://www.mova.org/~cheusov/pub/mawk_external_regexp.patch
>>
>> It allows to link mawk with external regexp library.
>>
>> 0 ~>mawk '/^a{3,5}$/'
>> aa
>> aaa
>> aaa
>>
>>
>> a
>> a
>> aa
>> 0 ~>

FC> Thanks, it looks interesting, I will try it. Is there any chance
FC> it is integrated into mainstream, and become widespread ? (which
FC> was, indirectly, the purpose of my bug report)

Do not forget to autoconf mawk sources using autoconf 2.13,
later versions didn't work for me.

It looks like mawk is almost dead. AFAIK nobody maintains it.

P.S.
Here is my bugreport for mawk
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=314323
It looks very similar to your BR.

--
Best regards, Aleksey Cheusov.
Bug#355442: mawk: missing Posix ERE curly braces
On Monday, 06 March 2006, you (Aleksey Cheusov) wrote:
> Look at this patch
> http://www.mova.org/~cheusov/pub/mawk_external_regexp.patch
>
> It allows to link mawk with external regexp library.
>
> 0 ~>mawk '/^a{3,5}$/'
> aa
> aaa
> aaa
>
>
> a
> a
> aa
> 0 ~>
>
> --
> Best regards, Aleksey Cheusov.

Thanks, it looks interesting, I will try it. Is there any chance it is
integrated into mainstream, and become widespread ? (which was,
indirectly, the purpose of my bug report)

--
Hope this helps, Fabien.
Bug#355442: mawk: missing Posix ERE curly braces
FC> Package: mawk
FC> Version: 1.3.3-11
FC> Severity: normal

FC> mawk claims to comply with Posix 1003.2. I can't check it directly, but
FC> checking instead SUSv2 (which I think equals Posix concerning Awk), it
FC> mandates that regular expressions support the interval repetition count
FC> feature. I am talking about {m}/{m,}/{m,n}
FC> at the place where you use ?, *
FC> or +.

Look at this patch
http://www.mova.org/~cheusov/pub/mawk_external_regexp.patch

It allows to link mawk with external regexp library.

0 ~>mawk '/^a{3,5}$/'
aa
aaa
aaa


a
a
aa
0 ~>

--
Best regards, Aleksey Cheusov.
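For reference, the behaviour an external regexp library is expected to provide can be checked directly against the POSIX `regcomp`/`regexec` interface from `<regex.h>`. The sketch below is not part of the patch linked above; `ere_matches` is a hypothetical helper name, and the code assumes a libc whose extended regular expressions implement interval expressions.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches the POSIX extended regular expression,
 * 0 if it does not match (or regexec reports an error),
 * and -1 if the pattern fails to compile. */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* REG_EXTENDED selects ERE syntax, where {m,n} needs no backslashes;
     * REG_NOSUB skips submatch bookkeeping because only yes/no is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

On such a libc, `ere_matches("^a{3,5}$", "aaa")` should return 1 while `ere_matches("^a{3,5}$", "aa")` should return 0, which corresponds to the session above where only `aaa` is echoed back.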
Bug#355442: mawk: missing Posix ERE curly braces
Package: mawk
Version: 1.3.3-11
Severity: normal

mawk claims to comply with Posix 1003.2. I can't check it directly, but
checking instead SUSv2 (which I think equals Posix concerning Awk), it
mandates that regular expressions support the interval repetition count
feature. I am talking about {m}/{m,}/{m,n} at the place where you use ?, *
or +.

The sample program:

BEGIN {
    s="abacab"
    r1="(a[bc]){2,3}"
    r2="(a[bc])(a[bc])+"
    print s~r1, s~r2
}

should print "1 1" but it doesn't.

This is both a program bug (incomplete regexp implementation) and a
documentation bug (not 100% Posix compatible, should be added to the
BUGS section unless corrected).

-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.6.15-1-k7-smp
Locale: [EMAIL PROTECTED], [EMAIL PROTECTED] (charmap=ISO-8859-15)

Versions of packages mawk depends on:
ii  libc6  2.3.2.ds1-22  GNU C Library: Shared libraries an

-- no debconf information

--
Hope this helps, Fabien.
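Until intervals are supported, a pattern can be rewritten by hand using only the operators that already work, in the same spirit as r2 in the sample program (which is the `{2,}` form of the same group). The sketch below mechanises that rewrite as an illustration only: `expand_interval` and `append` are hypothetical names, the atom is assumed to be a single unit such as `a`, `[bc]` or `(a[bc])`, and repeating a parenthesised group changes subexpression numbering, which matters only where submatches are used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to the NUL-terminated string in buf (capacity cap bytes).
 * Returns 0 on success, -1 if the result would not fit. */
static int append(char *buf, size_t cap, const char *src)
{
    size_t used = strlen(buf);
    size_t need = strlen(src);

    if (need >= cap - used)     /* one byte must remain for the NUL */
        return -1;
    memcpy(buf + used, src, need + 1);
    return 0;
}

/* Rewrite atom{min,max} using only concatenation, '?' and '*'.
 * A negative max means "no upper bound", i.e. atom{min,}.
 * Returns 0 on success, -1 on bad bounds or if out is too small. */
int expand_interval(const char *atom, int min, int max,
                    char *out, size_t cap)
{
    int i;

    assert(atom != NULL && out != NULL);
    if (cap == 0 || min < 0 || (max >= 0 && max < min))
        return -1;
    out[0] = '\0';

    /* The mandatory part: min copies of the atom. */
    for (i = 0; i < min; i++)
        if (append(out, cap, atom) != 0)
            return -1;

    if (max < 0) {
        /* Unbounded tail: any number of further copies. */
        if (append(out, cap, atom) != 0 || append(out, cap, "*") != 0)
            return -1;
    } else {
        /* Bounded tail: (max - min) optional copies. */
        for (i = min; i < max; i++)
            if (append(out, cap, atom) != 0 || append(out, cap, "?") != 0)
                return -1;
    }
    return 0;
}
```

Under these assumptions, expanding `(a[bc])` with bounds 2 and 3 gives `(a[bc])(a[bc])(a[bc])?`, which accepts the same strings as r1 without using braces.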