Dave M G wrote:
> Jochem,
>
> Thank you for responding.
>
>>
>> does this one work?:
>> preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);
>
> Yes, that works. I don't think I would have ever figured that out on my
> own - it's certainly much more complicated than the ereg equivalent.
1. the '^' at the start of the regexp states 'must match at the start of
   the string (or of a line in multiline mode)'
2. the 'i' after the closing regexp delimiter states 'match
   case-insensitively'
3. the 's' after the closing regexp delimiter states 'the dot also
   matches newlines'
4. the '<ul[^>]*>' matches a UL tag with any number of attributes ... the
   '[^>]*' matches any number (including zero) of characters that are not
   a '>' character - the square brackets denote a character class (in
   this case with just one character in it) and the '^' at the start of
   the character class definition negates the class (i.e. turns the
   character class definition to mean every character *not* defined in
   the class)
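
to see those four points in action, here is a small sketch - the sample
page in $htmlPage is made up for illustration, your real page will differ.
note that '(.*)' is greedy, so if a page holds several <ul> tags everything
up to the *last* one is removed:

```php
<?php
// Hypothetical sample page (an assumption, not the real $htmlPage).
$htmlPage = "<!DOCTYPE html>\n<html>\n<body>\n<UL class=\"menu\">\n<li>First<br>\n</ul>";

// '^' anchors at the start of the string, 'i' lets '<ul' match '<UL',
// 's' lets '.' cross the newlines, '[^>]*' swallows the tag's attributes.
$stripped = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#is', '', $htmlPage);

// Same pattern without 's': '.' stops at the first newline, so nothing
// matches and the string comes back unchanged.
$withoutS = preg_replace('#^<\!DOCTYPE(.*)<ul[^>]*>#i', '', $htmlPage);

var_dump($stripped === "\n<li>First<br>\n</ul>"); // bool(true)
var_dump($withoutS === $htmlPage);                // bool(true)
```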
PCRE is a lot more powerful [imho]; the downside is that it has more
modifiers and syntax to control the meaning of the patterns...
read and become familiar with these 2 pages:
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
http://php.net/manual/en/reference.pcre.pattern.syntax.php
and remember that writing patterns is often quite complex - when you build
one just take it one 'assertion' at a time, i.e. build the pattern up step
by step... if you give it a good go and get stuck, then there is always
the list.
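
one way to work step by step is to test each stage with preg_match(), which
returns 1 on a match and 0 otherwise. a rough sketch, again with a made-up
sample page and stage patterns chosen just for illustration:

```php
<?php
// Hypothetical sample page (an assumption for this sketch).
$htmlPage = "<!DOCTYPE html>\n<html>\n<ul id=\"nav\">\n<li>Home<br>";

// Each stage adds one piece to the previous pattern.
$steps = [
    '#^<!DOCTYPE#',               // stage 1: the start of the string
    '#^<!DOCTYPE.*<ul#',          // stage 2: fails, '.' stops at a newline
    '#^<!DOCTYPE.*<ul#s',         // stage 3: 's' lets '.' cross newlines
    '#^<!DOCTYPE.*<ul[^>]*>#is',  // stage 4: whole tag, any letter case
];

$results = [];
foreach ($steps as $pattern) {
    $results[$pattern] = preg_match($pattern, $htmlPage);
    echo $pattern, ' => ', $results[$pattern], "\n";
}
// the four stages give 1, 0, 1, 1
```

the stage that returns 0 shows you exactly which addition broke the match.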
>
> If I may push for just one more example of how to properly use regular
> expressions with preg:
>
> It occurs to me that I've been assuming that with regular expressions I
> could only remove or change specified text.
essentially regexps are a pattern syntax for asserting whether something
matches a pattern (or not) - there are various functions that allow you to
act upon the results of the pattern matching depending on your needs (see
below)
>
> What if I wanted to get rid of everything *other* than the specified text?
>
> Can I form an expression that would take $htmlPage and delete everything
> *except* text that is between a <li> tag and a <br> tag?
yes, but you wouldn't use preg_replace() but rather preg_match() or
preg_match_all(), which give you back an array (via the 3rd argument,
which is passed by reference) containing the texts that matched (and that
you therefore want to keep).
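
a minimal sketch of that idea, assuming a made-up list in $htmlPage and a
pattern that is only one possible way to write it - the '(.*?)' is a
non-greedy capture, so each match stops at the first <br> that follows:

```php
<?php
// Hypothetical sample markup (an assumption for this sketch).
$htmlPage = "<ul>\n<li>Apples<br>\n<li>Pears<br />\n<li>Plums<br>\n</ul>";

// $matches is filled by reference: $matches[0] holds the full matches,
// $matches[1] holds whatever the first pair of parentheses captured.
$count = preg_match_all('#<li[^>]*>(.*?)<br[^>]*>#is', $htmlPage, $matches);

$items = $matches[1];

echo $count, "\n";                // 3
echo implode(', ', $items), "\n"; // Apples, Pears, Plums
```

everything outside the captured texts is simply never copied into $items,
which amounts to 'deleting everything except' the parts you asked for.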
>
> Or is that something that requires much more than a single use of
> preg_replace?
>
> --
> Dave M G
>
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php