> -----Original Message-----
> From: Jennifer Swofford [mailto:[EMAIL PROTECTED]]
> Sent: 01 October 2002 21:21
> To: [EMAIL PROTECTED]
> Subject: [PHP] eregi_replace / preg_match_all
> 
> 
> Why does this work:
> 
> $contents =
> eregi_replace("(\")(.(/))*[A-Z0-9_/-]+(.gif|.jpg)",
> "\"blah.gif", $contents);
> 
> But this does not:
> 
> preg_match_all("(\")(.(/))*[A-Z0-9_/-]+(.gif|.jpg)",
> $contents, $matches);
> 
> for ($i=0; $i< count($matches[0]); $i++) {
> echo "matched: ".$matches[0][$i]."\n";
> }
> 
> I get this error:
> 
> Warning: Unknown modifier '(' in
> /home/littleduck/www/www/newcontrol/temp/fread.php on
> line 20

Well, first of all you've switched from a POSIX-extended regular expression
function, eregi_replace(), to a Perl-compatible regular expression (PCRE)
function, preg_match_all() -- the POSIX-extended function names mostly begin
with "ereg" (split(), spliti() and sql_regcase() are the exceptions), and all
the PCRE ones with "preg".  The rules are different for the two sets, and one
of the most important differences is that the PCRE functions require the
pattern to be enclosed within delimiters, and the POSIX-extended ones don't
(this is because you can add optional _modifiers_ after the ending delimiter
for PCRE matching).  The syntax of the regular expressions is also slightly
different, and a pattern that works reliably for PCRE won't necessarily work
for POSIX-extended, and vice versa -- especially as PCRE is far more
versatile and feature-rich.
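
As a small illustration of delimiters and modifiers, here is a minimal
sketch.  The sample string and variable names are made up for the purpose
and are not from the original question:

```php
<?php
// Hypothetical sample text, invented for this illustration.
$text = 'Logo.GIF and photo.jpg';

// "/" is the delimiter on both sides and no modifier follows it, so the
// match is case-sensitive: only the lower-case file name is found.
$n1 = preg_match_all('/[a-z]+\.(gif|jpg)/', $text, $m1);

// Same pattern, with the "i" (case-insensitive) modifier placed after the
// closing delimiter: both file names are found.
$n2 = preg_match_all('/[a-z]+\.(gif|jpg)/i', $text, $m2);

echo "$n1 $n2\n";
```

The POSIX-extended functions have no place to put such a modifier, which is
why case-insensitivity there needs a separate function (eregi_replace()
rather than ereg_replace()).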

So, your basic problem in the preg_match_all() is that you haven't *added*
delimiters -- the requirement is that you must use the same character both
before and after your pattern (anything that isn't alphanumeric, a backslash
or whitespace), or a matched pair from one of the sets (), {}, [], and <>.
So, the pattern parser is seeing the initial "(" in your expression,
treating it as the opening delimiter, and taking your whole delimited
pattern to be the text up to the matching ")" -- that is, just (\"), a
pattern that matches nothing but a double quote.  It then looks to see if
the next character is a valid _modifier_ (remember them?), finds "(" which
isn't one, and complains -- and bingo!, there's your "Unknown modifier '('"
message.
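
If you want to see this for yourself, the following sketch feeds the
undelimited pattern to preg_match_all() and captures the warning text rather
than letting PHP print it.  The subject string and the $warnings array are
my own inventions for the demonstration, and I'm assuming a PHP version with
closures (5.3 or later):

```php
<?php
// Collect warnings in an array so the message text can be inspected.
$warnings = [];
set_error_handler(function ($errno, $errstr) use (&$warnings) {
    $warnings[] = $errstr;
    return true; // skip PHP's default warning output
});

// The pattern from the question, with no delimiters added: the leading
// "(" and its matching ")" get treated as the delimiter pair.
$result = preg_match_all(
    '(")(.(/))*[A-Z0-9_/-]+(.gif|.jpg)',
    'src="a/b.gif"',   // hypothetical subject string
    $matches
);

restore_error_handler();

var_dump($result);   // false: the pattern could not be compiled
print_r($warnings);  // the captured "Unknown modifier" message
```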

So, this, for example, would work:

  preg_match_all("{(\")(.(/))*[A-Z0-9_/-]+(.gif|.jpg)}",
                 $contents, $matches);

although I'd point out that "." is a pattern-matching element that means
"any character" (except, by default, a newline), so you might prefer to
escape it -- unfortunately, both regular expressions and PHP's double-quoted
strings use \ as their escape character, so to get a single backslash into
the expression to escape the ".", you've got to write two (\\), and to match
an actual single backslash, you have to write no fewer than *four*
backslashes!  So this would give:

  preg_match_all("{(\")(\\.(/))*[A-Z0-9_/-]+(\\.gif|\\.jpg)}",
                 $contents, $matches);
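
To make the backslash counting concrete, here is a rough sketch using
double-quoted strings; the variable names and test strings are purely
illustrative:

```php
<?php
// In a double-quoted string, "\\." is two characters: a backslash and a dot.
$escapedDot = "\\.";
$lenDot = strlen($escapedDot);

// An unescaped "." matches any character; an escaped one only a real dot.
$loose  = preg_match("/a.gif/", "axgif");
$strict = preg_match("/a\\.gif/", "axgif");

// Four backslashes in the source leave two in the string, which the regex
// engine then reads as one literal backslash.
$backslashPattern = "/\\\\/";
$lenPattern = strlen($backslashPattern);   // slash, two backslashes, slash
$found = preg_match($backslashPattern, 'C:\temp');

echo "$lenDot $loose $strict $lenPattern $found\n";
```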

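Because of the problems with "backslash breeding" that this can cause, I'd
always recommend using single-quoted strings for your patterns -- especially
as, in this case, you have a double-quote in the pattern!  One more thing:
eregi_replace() matched case-insensitively, whereas PCRE is case-sensitive
by default, so to keep the same behaviour you need the "i" modifier after
the closing delimiter.  Thus, my final effort would be:

  preg_match_all('{(")(\.(/))*[A-Z0-9_/-]+(\.gif|\.jpg)}i',
                 $contents, $matches);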
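
Put together with a loop like yours, a complete script might look something
like the sketch below.  The $contents value is a made-up stand-in for
whatever your script actually reads, and the "i" modifier is there only to
mirror the case-insensitivity of eregi_replace():

```php
<?php
// Hypothetical page fragment standing in for the real $contents.
$contents = '<img src="images/Logo.GIF"> <img src="./pics/photo.jpg">';

// Single-quoted pattern, {} as delimiters, "i" for case-insensitivity.
$count = preg_match_all(
    '{(")(\.(/))*[A-Z0-9_/-]+(\.gif|\.jpg)}i',
    $contents,
    $matches
);

// $matches[0] holds the full text of each match, including the leading quote.
foreach ($matches[0] as $match) {
    echo "matched: " . $match . "\n";
}
```

Note that each match starts at the opening double quote, because the quote
is part of the pattern, and that (\.(/))* only covers leading "./" segments.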

Lastly, on the relative efficiency of the two types of expressions: the PCRE
functions are the ones usually recommended -- the PHP manual notes that
preg_match() is often a faster alternative to ereg() -- and you may care to
search the archives of this list for further postings on the subject.

Hope this is helpful,

Cheers!

Mike

---------------------------------------------------------------------
Mike Ford,  Electronic Information Services Adviser,
Learning Support Services, Learning & Information Services,
JG125, James Graham Building, Leeds Metropolitan University,
Beckett Park, LEEDS,  LS6 3QS,  United Kingdom
Email: [EMAIL PROTECTED]
Tel: +44 113 283 2600 extn 4730      Fax:  +44 113 283 3211 



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
