On Sat, 23 Jun 2012 22:38:36 +0200 Olga Kryzhanovska wrote:
> I do not see a mistake in the regular expression itself. Either it's a
> hard to spot quoting issue or a full bug in ksh93 or libast regex.
> Glenn, what do you think?

to rule out any possible ksh quoting conflict
express the pattern and subject string as a testregex test
that will make it easy to rule in/out regex itself

> Olga

> On Sat, Jun 23, 2012 at 9:08 AM, Lionel Cons
> <[email protected]> wrote:
> > On 23 June 2012 06:40, Roland Mainz <[email protected]> wrote:
> >> On Sat, Jun 23, 2012 at 6:22 AM, Roland Mainz <[email protected]>
> >> wrote:
> >>> On Sat, Jun 23, 2012 at 5:55 AM, Glenn Fowler <[email protected]>
> >>> wrote:
> >>>> On Sat, 23 Jun 2012 03:40:15 +0200 Roland Mainz wrote:
> >>>>> On Sat, Jun 23, 2012 at 2:34 AM, Roland Mainz
> >>>>> <[email protected]> wrote:
> >>>>> > Here's another issue with regex. We wrote the following script to
> >>>>> > parse XML fragments:
> >>>>> > -- snip --
> >>>>> > typeset -r xmltext='<h1 ><div> a text </div>More [TEXT].<!-- a comment
> >>>>> > (<disabled>) --></h1>'
> >>>>> >
> >>>>> > #
> >>>>> > # parse the XML data
> >>>>> > #
> >>>>> > typeset dummy
> >>>>> >
> >>>>> > dummy="${xmltext//~(Ex)(?:
> >>>>> > (<!--.+-->)+?| # xml comments
> >>>>> > (<[[:alnum:]_-:]+
> >>>>> > (?: # attributes
> >>>>> > [:space:]+
> >>>>
> >>>>> Grumpf... this should be [[:space:]]+ ...
> >>>>> > (?:[[:alnum:]_-:]+=[^[:space:]\"\']+)|
> >>>>> > #x='foo=bar huz=123'
> >>>>> > (?:[[:alnum:]_-:]+=\"[^\"]*\")|
> >>>>> > #x='foo="ba=r o" huz=123'
> >>>>> > (?:[[:alnum:]_-:]+=\'[^\"]*\')|
> >>>>> > #x="foo='ba=r o' huz=123"
> >>>>> > (?:[[:alnum:]_-:]+)
> >>>>> > #x="foox huz=123"
> >>>>> > )*
> >>>>> > [:space:]*
> >>>>
> >>>>> Grumpf... this should be [[:space:]]* ...
> >>>>
> >>>>> Erm... David/Glenn... can we get a ~(<modifer>) flag which enables...
> >>>>> 1. ... strict pattern interpretation
> >>>>> 2. ... forces (controlled by ~(<modifer>) ... unless there's something
> >>>>> else which already enabled that elsewhere (like a global shell
> >>>>> option)) ksh93 to print runtime error messages if a pattern fails to
> >>>>> compile
> >>>>> ... please ?
> >>>>
> >>>> well again I think this is regex giving users enough rope to do whatever
> >>
> >> BTW: Can you look at the script in
> >> http://opensolaris.pastebin.ca/2164009 and tell me why it works when
> >> the variable "working" is set to "true" and fails when the variable is
> >> set to "false" ? The difference is the style of XML attribute value
> >> quoting used in the embedded test string, e.g. ...
> >> -- snip --
> >> if ${working} ; then
> >> typeset -r xmltext=$'<h1 style=\'foo\' h=\'bar\'><div> a text
> >> </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
> >> else
> >> typeset -r xmltext=$'<h1 style=\'foo\' h="bar"><div> a text
> >> </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
> >> fi
> >> -- snip --
> >>
> >> I don't see why name='value' should be matched differently than
> >> name="value" by this pattern in the script:
> >> -- snip --
> >> dummy="${xmltext//~(Ex)(?:
> >> (<!--.+-->)+?| # xml comments
> >> (<[[:alnum:]_-:]+
> >> (?: # attributes
> >> [[:space:]]+
> >> (?:[[:alnum:]_-:]+=[^[:space:]\"]+?)| #x='foo=bar
> >> huz=123'
> >> (?:[[:alnum:]_-:]+=\"[^\"]*?\")|
> >> #x='foo="ba=r o" huz=123'
> >> (?:[[:alnum:]_-:]+=\'[^\']*?\')|
> >> #x="foo='ba=r o' huz=123"
> >> (?:[[:alnum:]_-:]+) #x="foox
> >> huz=123"
> >> )*
> >> [[:space:]]*
> >> \/? # start tags which are end tags, too (like <foo\/>)
> >> >)+?| # xml start tags
> >> (<\/[[:alnum:]_-:]+>)+?| # xml end tags
> >> ([^><]+) # xml text
> >> )/D}"
> >> -- snip --
> >>
> >> Do you see anything suspicious ?
> >
> > The suspicious part is that your ${s//../..} <expression> is wrapped
> > in double-quotes and your matching fails when it tries to match
> > double-quotes. I suspect there's something which causes the shell to
> > misparse the regex near the double-quotes.
> >
> > Lionel
> >
> > _______________________________________________
> > ast-developers mailing list
> > [email protected]
> > https://mailman.research.att.com/mailman/listinfo/ast-developers
> --
>       ,   _                                    _   ,
>      { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
> .----'-/`-/       [email protected]       \-`\-'----.
>  `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
>       /\/\     Solaris/BSD//C/C++ programmer   /\/\
>       `--`                                      `--`
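
A note on the [:space:] slip corrected in the quoted pattern: outside a bracket expression, a POSIX regex reads [:space:] as an ordinary set of the characters ':', 's', 'p', 'a', 'c', 'e', so the pattern can compile without complaint and just never match whitespace. The sketch below reproduces that with grep -E rather than ksh93/libast (an assumption, made so it runs in any POSIX sh; the subject string and the variable names are made up for illustration). Some grep implementations reject the bare form with an error instead, which the sketch also counts as "no match".

```shell
#!/bin/sh
# Hypothetical subject string, not taken from the thread.
subject='foo bar'

# Bare [:space:] is a set of the letters ':', 's', 'p', 'a', 'c', 'e',
# not the whitespace class. Some greps refuse it with an error, so
# stderr is discarded and any non-zero exit is treated as "no match".
if printf '%s\n' "$subject" | grep -E 'foo[:space:]+bar' >/dev/null 2>&1; then
    wrong='matched'
else
    wrong='no match'
fi

# The class name has to sit inside a bracket expression: [[:space:]].
if printf '%s\n' "$subject" | grep -Eq 'foo[[:space:]]+bar'; then
    right='matched'
else
    right='no match'
fi

printf '%s\n' "[:space:]+   -> $wrong" "[[:space:]]+ -> $right"
```

This only shows the generic POSIX behaviour; whether libast regex reports or silently accepts the bare form is exactly what the strict-mode request in the quoted thread is about, and the sketch makes no claim about that.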
