Aidan Lister wrote:
> Hello list,
>
> I'm pretty terrible with regular expressions, I was wondering if
> someone would be able to help me with this
> http://paste.phpfi.com/31964
>
> The problem is detailed in the above link. Basically I need to match
> the contents of any HTML tag, except a link. I'm pretty sure a
> lookbehind set is needed in the center (%s) bit.
>
> Any suggestions would be appreciated, but it's not quite as simple as
> it sounds - if possible please make sure you run the above script and
> see if it "PASSED".

So basically, you want to put a link around "foo", only if it doesn't
already have one, right?

The problem with look-behind assertions is that they have to be fixed-width.
If you know exactly what kind of data you'll be dealing with, a fixed-width
assertion may be sufficient.  For example, I came up with a regex that will
PASS your script, but I seriously doubt it'll be very useful to you, since it
would be easy to break with other test cases.  For your single test case,
however, this works:

/(?<!<a href="foo">)(?<!<a href=")(foo)/
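To make the behaviour concrete, here is a minimal sketch in Python, whose `re` module accepts this pattern as written. The input strings and the `add_link` helper are made up for illustration (the original test script is only available via the paste link), so treat this as an assumption about the kind of data involved rather than a reproduction of that script.

```python
import re

# The pattern from the message above, unchanged.
pattern = re.compile(r'(?<!<a href="foo">)(?<!<a href=")(foo)')

def add_link(html):
    """Wrap every match of the pattern in a link (hypothetical helper)."""
    return pattern.sub(r'<a href="foo">\1</a>', html)

# Hypothetical input: one bare "foo" and one that is already linked.
simple = 'foo and <a href="foo">foo</a>'
print(add_link(simple))
# -> <a href="foo">foo</a> and <a href="foo">foo</a>

# One extra space inside the tag defeats both look-behinds, so the
# already-linked "foo" (and the one inside href) gets wrapped again.
spaced = '<a  href="foo">foo</a>'
print(add_link(spaced) != spaced)   # True: the markup was mangled
```

The second input shows why the pattern only holds up when the markup is uniform.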

The problem is that HTML tags can be split across lines...they can have any
amount of whitespace within the tag...they can have other attributes (class,
id, onClick), etc.  Since look-behind assertions have to be fixed-width, it'd
be impossible (IMHO) to come up with a single regex that would match all
cases, unless the input data was uniform.  For example, stuff like

<a   href = "foo" ID="id1" class="redlink"
onClick="javascript:someFunction();">foo</a>

and its infinite variants could not be caught by a single regex, since you
cannot have an infinite number of fixed-width look-behind assertions.  If
quantifiers such as '*', '+', and '?' were allowed in look-behind assertions
it would be possible, but they aren't (see "man perlre").
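The restriction is easy to see in practice. The sketch below uses Python's `re` module as a stand-in, since it enforces a comparable fixed-width rule for look-behind; the exact rules and error messages in Perl or PCRE may differ. The `compiles` helper is hypothetical.

```python
import re

def compiles(pattern):
    """Return True if the regex engine accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed-width look-behind: accepted.
fixed = r'(?<!<a href=")foo'

# Look-behind containing a quantifier (\s+): rejected at compile time.
variable = r'(?<!<a\s+href=")foo'

print(compiles(fixed))      # True
print(compiles(variable))   # False
```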

If your data is coming from unknown sources you'll probably have to use a
full-fledged HTML parser to pull out text that isn't already part of an <a>
tag.  I know there are several of these available for Perl, and I'm sure
there are for PHP too, but I'm not familiar with them.
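As a rough illustration of the parser approach, here is a minimal sketch using `html.parser` from Python's standard library rather than any particular Perl or PHP parser. The `Linker` class and `link_word` function are hypothetical names. It assumes reasonably well-formed input, and it does not treat <script> or <style> content specially or escape the href value.

```python
import re
from html.parser import HTMLParser


class Linker(HTMLParser):
    """Wrap a word in a link unless it already sits inside an <a> element."""

    def __init__(self, word, href):
        # Leave character references alone so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(re.escape(word))
        self.link = '<a href="%s">%s</a>' % (href, word)
        self.depth = 0   # number of <a> elements currently open
        self.out = []    # pieces of the rebuilt document

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag == "a" and self.depth > 0:
            self.depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only text outside any <a> element is eligible for linking.
        if self.depth == 0:
            data = self.pattern.sub(lambda m: self.link, data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def link_word(html, word, href):
    parser = Linker(word, href)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


messy = ('<p>foo</p> <a   href = "foo" ID="id1" class="redlink"\n'
         'onClick="javascript:someFunction();">foo</a>')
print(link_word(messy, "foo", "foo"))
```

Because the parser tracks whether it is inside an <a> element, the whitespace, extra attributes and line break in the tag no longer matter: only the "foo" inside <p> gets a new link.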

Sorry if this isn't terribly helpful.  Maybe I'm overlooking something and
someone else will point out a simple way to accomplish what you're trying to
do...

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
