On Tue, Nov 4, 2008 at 10:31 AM, Pete <[EMAIL PROTECTED]> wrote:
> In message <[EMAIL PROTECTED]>
> , Phill Sparks <[EMAIL PROTECTED]> writes
>
>>On Tue, Nov 4, 2008 at 9:29 AM, Pete <[EMAIL PROTECTED]> wrote:
>>> I need it to match with "the first > after you have found <div"
>
>>Hi Pete,
>>
>>I tend to ask reg for any character that is not a >, like this...
>>
>>$rep = "/<div[^>]*>/";
>>
>>Phill
>
> Thanks for that, but I was looking for a more generic answer. "the
> first... (anything) after you have found the start of the phrase". I
> suspected that it was ?, which makes reg "ungreedy", stops it as soon as
> it can. But I couldn't get the syntax correct, so I assumed that I was
> wrong.
>
> I eventually found this:
> "/<div(.*?)>/";
>
> The ? is the magic extra touch that was missing. It means "everything
> from '<div' to the first '>'". So I can now use those two as variables
> in a class that will remove whatever I specify.
>

It actually means "match as few characters as possible between '<div'
and the next '>'", which amounts to what you're asking for.  There's
also a modifier, U, that makes every quantifier in the pattern ungreedy
by default.  You'd use it like this...

"/<div(.*)>/U";
// See http://php.net/manual/en/reference.pcre.pattern.modifiers.php
// for more modifiers
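
To make the difference concrete, here is a minimal sketch that runs the
greedy, lazy, U-modified and negated-class patterns against the same
string.  The sample input and variable names are made up for
illustration; they are not from Pete's code.

```php
<?php
// Hypothetical sample input, not taken from the original thread.
$html = '<div class="a">one</div><div id="b">two</div>';

// Greedy: .* grabs as much as it can, so the match ends at the LAST ">".
preg_match('/<div(.*)>/', $html, $greedy);

// Lazy: .*? grabs as little as it can, so the match ends at the first ">".
preg_match('/<div(.*?)>/', $html, $lazy);

// U modifier: every quantifier in the pattern becomes lazy by default.
preg_match('/<div(.*)>/U', $html, $ungreedy);

// Negated class: [^>]* can never step over a ">" in the first place.
preg_match('/<div[^>]*>/', $html, $negated);

echo $greedy[0], "\n";   // the whole input string
echo $lazy[0], "\n";     // <div class="a">
echo $ungreedy[0], "\n"; // <div class="a">
echo $negated[0], "\n";  // <div class="a">
```

Note that U inverts greediness rather than switching it off: with U in
effect, adding ? after a quantifier makes it greedy again.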

As a side point, if you're not interested in what (.*?) actually
matches, you might consider using (?:.*?) or even just .*? instead.
The (?: ) form groups part of a pattern without capturing it for later
use, which is slightly more efficient :-)

"/<div.*>/U";  // Much more readable!

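A rough sketch of the capturing point, plus one way the start and end
markers could be passed in as variables.  The remove_between() helper
and the sample input are hypothetical, not part of PHP or of Pete's
class; preg_match(), preg_replace() and preg_quote() are the standard
PCRE functions.

```php
<?php
// Hypothetical sample input, not taken from the original thread.
$html = '<div class="a">one</div>';

preg_match('/<div(.*?)>/', $html, $capturing); // group 1 is stored
preg_match('/<div(?:.*?)>/', $html, $grouped); // grouped, nothing stored
preg_match('/<div.*?>/', $html, $plain);       // no group at all

echo count($capturing), "\n"; // 2: the full match plus group 1
echo count($grouped), "\n";   // 1: the full match only
echo count($plain), "\n";     // 1: the full match only

// Hypothetical helper: remove everything from $start up to the first
// $end that follows it.  preg_quote() escapes both markers so they are
// matched literally; the s modifier lets . match newlines as well.
function remove_between($start, $end, $text)
{
    $pattern = '/' . preg_quote($start, '/') . '.*?'
             . preg_quote($end, '/') . '/s';
    return preg_replace($pattern, '', $text);
}

echo remove_between('<div', '>', $html), "\n"; // one</div>
```
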
Phill
