On Tue, Nov 4, 2008 at 10:31 AM, Pete <[EMAIL PROTECTED]> wrote:
> In message <[EMAIL PROTECTED]>, Phill Sparks <[EMAIL PROTECTED]> writes
>
>>On Tue, Nov 4, 2008 at 9:29 AM, Pete <[EMAIL PROTECTED]> wrote:
>>> I need it to match with "the first > after you have found <div"
>
>>Hi Pete,
>>
>>I tend to ask reg for any character that is not a >, like this...
>>
>>$rep = "/<div[^>]*>/";
>>
>>Phill
>
> Thanks for that, but I was looking for a more generic answer. "the
> first... (anything) after you have found the start of the phrase". I
> suspected that it was ?, which makes reg "ungreedy", stops it as soon as
> it can. But I couldn't get the syntax correct, so I assumed that I was
> wrong.
>
> I eventually found this:
> "/<div(.*?)>/";
>
> The ? is the magic extra touch that was missing. It means "everything
> from "<div" to the first ">". So I can now use those two as variables
> in a class that will remove whatever I specify.
>

It actually means "find me the smallest possible match between '<div'
and '>'" (counting from the first '<div' it finds), which roughly
translates to what you're asking for.

There's also a modifier that makes every quantifier in the pattern
ungreedy by default, which you'd use like this...

"/<div(.*)>/U";
// See http://php.net/manual/en/reference.pcre.pattern.modifiers.php
// for more modifiers

As a side point, if you're not interested in what (.*?) actually
matches, you might consider using (?:.*?) or even just .*? instead.
The (?: ) groups something without capturing it for later use, so
it's slightly more efficient :-)

"/<div.*>/U"; // Much more readable!

Phill
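
P.S. To compare the patterns from this thread side by side, here is a
minimal sketch. The sample string in $html and the variable names are
made up for illustration and are not from Pete's code. It also shows
one thing to watch for: under the U modifier a trailing ? flips a
quantifier back to greedy, so don't combine .*? with /U.

```php
<?php
// Hypothetical sample input, not taken from the thread.
$html = '<div class="a">one</div><p>two</p>';

// Greedy: .* runs on to the LAST ">" in the string.
preg_match('/<div.*>/', $html, $greedy);

// Lazy: .*? stops at the first ">" after "<div".
preg_match('/<div.*?>/', $html, $lazy);

// U modifier: quantifiers are lazy by default, so plain .* is enough.
preg_match('/<div.*>/U', $html, $ungreedy);

// U modifier plus ?: the ? inverts it again, so this one is greedy.
preg_match('/<div.*?>/U', $html, $flipped);

// Negated class: can never run past a ">" in the first place.
preg_match('/<div[^>]*>/', $html, $negated);

echo $greedy[0], "\n";   // <div class="a">one</div><p>two</p>
echo $lazy[0], "\n";     // <div class="a">
echo $ungreedy[0], "\n"; // <div class="a">
echo $flipped[0], "\n";  // <div class="a">one</div><p>two</p>
echo $negated[0], "\n";  // <div class="a">
```

For this particular job the [^>]* version and the lazy version give
the same result; the lazy form is just the more general "stop at the
first occurrence of whatever comes next" idiom Pete was after.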