On Wed 06 Sep, David Corbin wrote:
> Nathan Wiger wrote:
> > 
> > > It would be useful (and increasingly more common) to be able to match
> > > qr|<\s*(\w+)([^>]*)>| to qr|<\s*/\1\s*>|, and handle the case where
> > > those can nest as well.  Something like
> > >
> > > <list>    match this with
> > >    <list>
> > >    </list>   not this but
> > > </list>   this.
> > 
> > I suspect this is going to need a ?[ and ?] of its own. I've been
> > thinking about this since your email on the subject yesterday, and I
> > don't see how either RFC 145 or this alternative method could support
> > it, since there are two tags - > and </ - which are paired
> > asymmetrically, and neither approach gives any credence to what's
> > contained inside the tag. So <tag> would be matched itself as "< matches
> > >".
> 
> Actually, in one of my responses I did outline a syntax which would handle
> this with reasonable ease, I think.  If the contents of (?[) is considered
> a pattern, then you can define a matching pattern.

I think it should be a list of patterns rather than a single pattern.

Each pattern in the list is attempted left to right until one matches.  I no
longer think it should be a hash, since it needs to be ordered.  But using =>
as the left/right separator does make it clear.
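
To make the ordered-list idea concrete, here is a minimal sketch.  It is
written in Python's re module rather than Perl, and it does not implement the
proposed (?[ ... (?] syntax.  The names PAIRS and match_balanced are
hypothetical, as is the choice of a second "( => )" pair; the only things taken
from the thread are "try each opener left to right" and "the closer is derived
from what the opener captured".

```python
import re

# Hypothetical ordered list of delimiter pairs.  Each entry is
# (opener regex, function returning (re-opener pattern, closer pattern)),
# where the closer is built from what the opener captured.
PAIRS = [
    # <tag ...>  =>  </tag>
    (re.compile(r"<\s*(\w+)[^>]*>"),
     lambda m: (r"<\s*%s\b[^>]*>" % re.escape(m.group(1)),
                r"<\s*/%s\s*>" % re.escape(m.group(1)))),
    # (  =>  )
    (re.compile(r"\("), lambda m: (r"\(", r"\)")),
]

def match_balanced(text, pos=0):
    """Return the end index of the balanced span starting at pos, or None."""
    for opener, delimiters in PAIRS:          # attempted left to right
        m = opener.match(text, pos)
        if m is None:
            continue                          # try the next pair in the list
        reopen, close = delimiters(m)
        token = re.compile("(?P<open>%s)|(?P<close>%s)" % (reopen, close))
        depth, scan = 1, m.end()
        while depth:
            t = token.search(text, scan)
            if t is None:
                return None                   # ran out of input while nested
            depth += 1 if t.group("open") is not None else -1
            scan = t.end()
        return scan
    return None

text = "<list> a <list> b </list> c </list> tail"
end = match_balanced(text)
print(text[:end])
```

Because the list is ordered, the first opener that matches at the current
position decides which closer is looked for, which is the reason a hash would
not do.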

> 
> m:(?['<\w+>' => '</\1>').*(?]):
> 
> 
> I'll grant you it's not the simplest syntax, but it's a lot simpler than
> using the 5.6 method... :)

Actually, that simple case is already handled by m:<(\w+)>.*</\1>: but I
think this is getting somewhere.  This is a rich syntax that has lots of
potential uses, not just for HTML.
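
For reference, a small sketch of what the plain backreference pattern does and
does not do.  It uses Python's re module as a stand-in for the Perl match, and
the sample strings are made up for illustration.  The backreference ties the
closing tag name to the opening one, but nothing counts nesting depth.

```python
import re

GREEDY = re.compile(r"<(\w+)>.*</\1>")    # same shape as m:<(\w+)>.*</\1>:
LAZY = re.compile(r"<(\w+)>.*?</\1>")     # non-greedy variant for comparison

flat = "<b>bold</b>"
nested = "<list> <list> </list> </list>"
siblings = "<a>x</a> and <a>y</a>"

# The flat case works: \1 forces the closing name to equal the opening name.
flat_hit = GREEDY.match(flat).group(0)

# Non-greedy stops at the first </list>, which belongs to the inner tag.
lazy_nested = LAZY.match(nested).group(0)

# Greedy runs to the last </a>, swallowing two separate sibling elements.
greedy_siblings = GREEDY.match(siblings).group(0)

print(flat_hit)
print(lazy_nested)
print(greedy_siblings)
```

Neither variant is right for both the nested and the sibling input, which is
where a construct that tracks nesting would earn its keep.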

> > 
> > What if we added special XML/HTML-parsing ?< and ?> operators?
> > Unfortunately, as Richard notes, ?> is already taken, but I will use it
> > for the examples to make things symmetrical.
> > 
> >    ?<  =  opening tag (with name specified)
> >    ?>  =  closing tag (matches based on nesting)

We are running out of (? syntax, so we might want to find some other construct
before long.  Anyway, XML/HTML is important, but I am not convinced
that what is being covered here really helps.  I am working on an RFC
to allow boolean logic (&&, || and !) to apply a number of patterns to
the same substring, to allow easier mining of information out of such
constructs.
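
The RFC is not written yet, so the following is only a rough sketch of the
general idea of applying several patterns to the same substring and combining
the results with and/or/not.  It is plain Python, not proposed Perl syntax; the
helpers matches_all and matches_any, the function wanted, and the sample tags
are all hypothetical.

```python
import re

def matches_all(text, *patterns):
    """True if every pattern matches somewhere in text (the && case)."""
    return all(re.search(p, text) for p in patterns)

def matches_any(text, *patterns):
    """True if at least one pattern matches somewhere in text (the || case)."""
    return any(re.search(p, text) for p in patterns)

def wanted(tag):
    # An <img> tag with a src, with an alt or a title, and without a width.
    return (matches_all(tag, r"^<img\b", r'\bsrc="[^"]*"')
            and matches_any(tag, r'\balt="', r'\btitle="')
            and not matches_any(tag, r"\bwidth="))

good = '<img src="a.png" alt="logo">'
bad = '<img src="a.png" alt="logo" width="10">'
print(wanted(good), wanted(bad))
```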

Richard

