I do not see a mistake in the regular expression itself. Either it's a
hard-to-spot quoting issue or an outright bug in ksh93 or in the libast
regex code.
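
One way to tell the two apart might be a reduced test along these lines.
This is only a sketch under assumptions: the names s, q, nested and plain
are made up here, and it uses an ordinary shell pattern instead of the
full ~(Ex) expression:

```shell
# Hypothetical reduction of the failing case; s, q, nested and plain are
# names invented for this check, not taken from the original script.
s='h="bar"'
q='"'                      # the double quote, kept out of any nested quoting

nested="${s//\"/D}"        # escaped double quote inside a double-quoted expansion
plain=${s//"$q"/D}         # same substitution, quote supplied through a variable

if [ "$nested" = "$plain" ]; then
    printf 'same result: %s\n' "$nested"
else
    printf 'results differ: nested=%s plain=%s\n' "$nested" "$plain"
fi
```

If the two results differ on the affected build, that would point at how
the shell parses quotes inside "${...//...}"; if they agree, the libast
regex code becomes the more likely suspect.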

Glenn, what do you think?

Olga

On Sat, Jun 23, 2012 at 9:08 AM, Lionel Cons
<[email protected]> wrote:
> On 23 June 2012 06:40, Roland Mainz <[email protected]> wrote:
>> On Sat, Jun 23, 2012 at 6:22 AM, Roland Mainz <[email protected]> wrote:
>>> On Sat, Jun 23, 2012 at 5:55 AM, Glenn Fowler <[email protected]> wrote:
>>>> On Sat, 23 Jun 2012 03:40:15 +0200 Roland Mainz wrote:
>>>>> On Sat, Jun 23, 2012 at 2:34 AM, Roland Mainz <[email protected]> wrote:
>>>>> > Here's another issue with regex. We wrote the following script to
>>>>> > parse XML fragments:
>>>>> > -- snip --
>>>>> > typeset -r xmltext='<h1 ><div> a text </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
>>>>> >
>>>>> > #
>>>>> > # parse the XML data
>>>>> > #
>>>>> > typeset dummy
>>>>> >
>>>>> > dummy="${xmltext//~(Ex)(?:
>>>>> >        (<!--.+-->)+?|  # xml comments
>>>>> >        (<[[:alnum:]_-:]+
>>>>> >                (?: # attributes
>>>>> >                        [:space:]+
>>>>
>>>>> Grumpf... this should be [[:space:]]+ ...
>>>>> >                        (?:[[:alnum:]_-:]+=[^[:space:]\"\']+)|  #x='foo=bar huz=123'
>>>>> >                        (?:[[:alnum:]_-:]+=\"[^\"]*\")|         #x='foo="ba=r o" huz=123'
>>>>> >                        (?:[[:alnum:]_-:]+=\'[^\"]*\')|         #x="foo='ba=r o' huz=123"
>>>>> >                        (?:[[:alnum:]_-:]+)                     #x="foox huz=123"
>>>>> >                )*
>>>>> >                [:space:]*
>>>>
>>>>> Grumpf... this should be [[:space:]]* ...
>>>>
>>>>> Erm... David/Glenn... can we get a ~(<modifier>) flag which enables...
>>>>> 1. ... strict pattern interpretation
>>>>> 2. ... forces (controlled by ~(<modifier>) ... unless there's something
>>>>> else which already enabled that elsewhere (like a global shell
>>>>> option)) ksh93 to print runtime error messages if a pattern fails to
>>>>> compile
>>>>> ... please ?
>>>>
>>>> well again I think this is regex giving users enough rope to do whatever
>>
>> BTW: Can you look at the script in
>> http://opensolaris.pastebin.ca/2164009 and tell me why it works when
>> the variable "working" is set to "true" and fails when the variable is
>> set to "false" ? The difference is the style of XML attribute value
>> quoting used in the embedded test string, e.g. ...
>> -- snip --
>> if ${working} ; then
>>        typeset -r xmltext=$'<h1 style=\'foo\' h=\'bar\'><div> a text </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
>> else
>>        typeset -r xmltext=$'<h1 style=\'foo\' h="bar"><div> a text </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
>> fi
>> -- snip --
>>
>> I don't see why name='value' should be matched differently than
>> name="value" by this pattern in the script:
>> -- snip --
>> dummy="${xmltext//~(Ex)(?:
>>        (<!--.+-->)+?|  # xml comments
>>        (<[[:alnum:]_-:]+
>>                (?: # attributes
>>                        [[:space:]]+
>>                        (?:[[:alnum:]_-:]+=[^[:space:]\"]+?)|   #x='foo=bar huz=123'
>>                        (?:[[:alnum:]_-:]+=\"[^\"]*?\")|        #x='foo="ba=r o" huz=123'
>>                        (?:[[:alnum:]_-:]+=\'[^\']*?\')|        #x="foo='ba=r o' huz=123"
>>                        (?:[[:alnum:]_-:]+)                     #x="foox huz=123"
>>                )*
>>                [[:space:]]*
>>                \/?     # start tags which are end tags, too (like <foo\/>)
>>        >)+?|                           # xml start tags
>>        (<\/[[:alnum:]_-:]+>)+?|        # xml end tags
>>        ([^><]+)                        # xml text
>>        )/D}"
>> -- snip --
>>
>> Do you see anything suspicious ?
>
> The suspicious part is that your ${s//../..} <expression> is wrapped
> in double-quotes and your matching fails when it tries to match
> double-quotes. I suspect there's something which causes the shell to
> misparse the regex near the double-quotes.
>
> Lionel
>
> _______________________________________________
> ast-developers mailing list
> [email protected]
> https://mailman.research.att.com/mailman/listinfo/ast-developers



-- 
      ,   _                                    _   ,
     { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
.----'-/`-/     [email protected]   \-`\-'----.
 `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
      /\/\     Solaris/BSD//C/C++ programmer   /\/\
      `--`                                      `--`
