Re: [ast-users] {x, y} in ere should match x times but no more than y times but does not work in ast-ksh.20120612

Roland Mainz Mon, 18 Jun 2012 12:20:12 -0700

On Mon, Jun 18, 2012 at 9:14 PM, Glenn Fowler <[email protected]> wrote:
> On Mon, 18 Jun 2012 20:50:43 +0200 Roland Mainz wrote:
>> On Mon, Jun 18, 2012 at 6:07 PM, Glenn Fowler <[email protected]> wrote:
>> > On Mon, 18 Jun 2012 12:01:25 -0400 Glenn Fowler wrote:
>> >> On Mon, 18 Jun 2012 17:28:43 +0200 ольга крыжановская wrote:
>> >> > On Mon, Jun 18, 2012 at 5:21 PM, Glenn Fowler <[email protected]> wrote:
>> >> > >
>> >> > > On Mon, 18 Jun 2012 17:04:44 +0200 ольга крыжановская wrote:
>> >> > >> from what I understand a {x,y} in extended regular expressions should
>> >> > >> match x times but no more than y times. But ksh (ast-ksh.20120612)
>> >> > >> returns no matches at all:
>> >> > >> ksh -c 's="abbbc" ; d="${s/~(E)b{2,4}/dummy}" ; print -v .sh.match'
>> >> > >
>> >> > >> Is this a bug?
>> >> > >
>> >> > > first run with -x to check the parse
>> >> > >
>> >> > >        ksh -cx 's="abbbc" ; d="${s/~(E)b{2,4}/dummy}" ; print -v .sh.match'
>> >> > >
>> >> > > and it does show a problem
>> >> > > --
>> >> > > +t+ s=bbb
>> >> > > +t+ d='bbb/dummy}' <======
>> >> > > +t+ print -v .sh.match
>> >> > > --
>> >> > >
>> >> > > we can double verify that the regex is ok by using the regex test harness
>> >> > > --
>> >> > > bin/package use
>> >> > > cd re
>> >> > > print $'K\t~(E)b{2,4}\tabbbc\t(1,4)' > t.dat
>> >> > > ./testregex t.dat
>> >> > > --
>> >> > >
>> >> > > so it looks like a battle between the 2 '}' in the ${...} expansion
>> >> >
>> >> > So what should I do? Escape the } and {?
>> >
>> >> aha
>> >> not sure
>> >> it looks like it involves the ksh lexer/parser and how it handles the
>> >> tokenization implications of ~(...) mid-stream
>> >> dgk and I will talk about it this afternoon
>> >
>> > until the lex/parse is resolved you can put the pattern in a separate var
>> >
>> > ksh -cx 's="abbbc" ; p="~(E)b{2,4}" ; d="${s/$p/dummy}" ; print -v .sh.match ; print -v $d'
>
>> Mhhh... what about _always_ requiring to have "such" [1] special
>> characters quoted ? AFAIK we had this issue two or three times before
>> this one and maybe we should try a
>> catch-them-all-by-force-to-quote-them-all solution (this would/should
>> safeguard against future pattern system enhancements, too) ... :-)
>
>> [1]=The list should be defined once (by crawling over the whole ASCII
>> range and define what has to be quoted) and listed in the ksh93(1)
>> manual page.
>
> ksh already handles ~(E) changing the lexer on the fly
> hopefully this is just another place that is easy to fix
>
> otherwise we risk descending into a nroff-like \ hell
> because regex and the RE families it supports have their own
> often family-specific quoting rules


Mhhh... my concern was that if we later add a new pattern system with
"yet unknown" syntax... how do we handle that (e.g. Olga
(un-)fortunately found the "agrep" (approximate grep, see
http://laurikari.net/tre/about/) thing in grep.c... no one knows where
this will lead or which syntax it will bring with it... ;-) ) ?
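
For reference, the interval semantics asked about at the top of the thread can be cross-checked outside the ksh lexer. The following is only a sketch, not the ksh fix: it assumes a sed that accepts -E for extended regular expressions, and ere_sub is a hypothetical helper name that does not come from ksh or the AST tools. Like the workaround quoted above, it keeps the pattern in a separate variable, so the '}' of the interval never sits inside a ${...} expansion.

```shell
#!/bin/sh
# Hypothetical helper (not part of ksh or AST): replace the first match of
# the ERE in $1 within the string $2 by the word "dummy".
# Assumes sed supports -E and that the pattern contains no '/'.
ere_sub() {
    printf '%s\n' "$2" | sed -E "s/$1/dummy/"
}

p='b{2,4}'                        # pattern kept in its own variable
d=$(ere_sub "$p" "abbbc")         # three b's fall inside the {2,4} interval
printf '%s\n' "$d"                # prints: adummyc

ere_sub "$p" "abbbbbc"            # prints: adummybc (at most four b's consumed)
ere_sub "$p" "abc"                # prints: abc (a single b does not match)
```

The first result agrees with the (1,4) match offsets expected by the testregex line quoted above.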

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [email protected]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)

_______________________________________________
ast-users mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-users
