>I want to extract from a large number of html files everything between
>the following specified comments, including the comments themselves:
>
><!--Begin CMS Content-->...<!-- End CMS Content-->
<snip>
>And the regular expression I've got is
>
>'/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s'
>
>I expected that when I ran this using preg_match_all I would get two
>matches
Those brackets mean "match any one of the characters found within", so it
will match '<', or '!', or '-', or 'B', or...
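
A quick way to see this is to run the original pattern against a small page. This is only a sketch: the sample string and the variable names ($sample, $found, $hits) are made up for illustration, since the real file was snipped. Each bracket group matches a single character, and the greedy .+ in between swallows everything else, so the whole string comes back as one match:

```php
<?php
// Hypothetical sample page, not the poster's real file.
$sample = "<p>intro</p>\n"
        . "<!--Begin CMS Content-->\nfirst\n<!-- End CMS Content-->\n"
        . "<p>middle</p>\n"
        . "<!--Begin CMS Content-->\nsecond\n<!-- End CMS Content-->\n"
        . "<p>outro</p>";

// The pattern from the question: two character classes around a greedy .+
$original = '/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s';

$hits = preg_match_all($original, $sample, $found);

// The first class matches the very first '<', the last class matches the
// very last '>', and .+ (with /s) covers everything in between.
var_dump($hits);                    // number of matches found
var_dump($found[0][0] === $sample); // is the single match the whole input?
```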
You want this:
'/<!--Begin CMS Content-->(.+)<!-- End CMS Content-->/Uis'
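...which gets you this (I added the parentheses in the middle so you could
also get the stuff inside the CMS content delimiters; the U modifier makes
.+ ungreedy, so each block is matched separately instead of everything from
the first Begin to the last End):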
Array
(
    [0] => Array
        (
            [0] => <!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
            [1] => <!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
        )

    [1] => Array
        (
            [0] =>
<span class="headline">Breadth Requirement</span>
<hr class="under" />

            [1] =>
<strong>More Matched Content!</strong>

        )

)
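
For anyone who wants to reproduce that output, here is a minimal sketch. The page content in $html is a stand-in I assembled from the matches shown above (the real file was snipped), and the variable names are only illustrative:

```php
<?php
// Hypothetical sample page built from the matched content shown above.
$html = implode("\n", [
    '<p>Page chrome</p>',
    '<!--Begin CMS Content-->',
    '<span class="headline">Breadth Requirement</span>',
    '<hr class="under" />',
    '<!-- End CMS Content-->',
    '<p>More chrome</p>',
    '<!--Begin CMS Content-->',
    '<strong>More Matched Content!</strong>',
    '<!-- End CMS Content-->',
]);

// U = ungreedy quantifiers, i = case-insensitive, s = dot also matches newlines
$pattern = '/<!--Begin CMS Content-->(.+)<!-- End CMS Content-->/Uis';

$count = preg_match_all($pattern, $html, $matches);

// $matches[0] holds each full block, comments included;
// $matches[1] holds only the text between the comments.
print_r($matches);
```

Writing the middle part as (.+?) and dropping the U modifier should behave the same way here, if you would rather keep the laziness visible in the pattern itself.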
---------------------------------------------------------------------
michal migurski- contact info and pgp key:
sf/ca http://mike.teczno.com/contact.html
--
PHP General Mailing List (http://www.php.net/)