Hi folks. I'm having a little trouble with a regular expression and I'm
hoping someone can point out what I'm doing wrong.
From a large number of HTML files, I want to extract everything between
the following comments, including the comments themselves:
<!--Begin CMS Content-->...<!-- End CMS Content-->
The string I'm testing the expression against is:
'Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code.'
And the regular expression I've got is
'/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s'
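In case it helps, here is a minimal sketch of how I'm running the test.
The variable names ($subject, $pattern, $matches, $count) are only
illustrative; the string and the pattern are the ones quoted above.

```php
<?php
// Test string, copied verbatim from above.
$subject = 'Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code.';

// The pattern exactly as posted; the /s modifier lets "." match newlines.
$pattern = '/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s';

// preg_match_all() returns the number of full-pattern matches
// and fills $matches[0] with the matched strings.
$count = preg_match_all($pattern, $subject, $matches);

print_r($matches);
```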
I expected that when I ran this using preg_match_all I would get two
matches, each consisting of a comment pair and the content between
them, but instead I get the following:
Array
(
    [0] => Array
        (
            [0] => Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code
        )

)
which is a single match covering the whole string except the period at
the very end.
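My working guess, which I have not confirmed, is that the square
brackets turn each comment into a character class (any one character
from the set) rather than a literal string, and that .+ is greedy and
so runs as far as it can. Below is a sketch of what I would try next,
assuming that guess is right: the comments written as literals and a
lazy .+? in between. The variable names are again only illustrative.

```php
<?php
// Same test string as in the post.
$subject = 'Some code that will be ignored.
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
This is some more content that will not be matched.
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
Some more ignored code.';

// Assumption: no square brackets, so each comment is matched literally,
// and ".+?" (lazy) stops at the first closing comment it reaches.
// The /s modifier still lets "." match newlines.
$pattern = '/<!--Begin CMS Content-->.+?<!-- End CMS Content-->/s';

$count = preg_match_all($pattern, $subject, $matches);

// $matches[0] holds one entry per comment pair found.
print_r($matches[0]);
```

If that reasoning holds, $matches[0] should contain two entries, one per
comment pair, each starting and ending with the comments themselves.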
Can anybody point out where I'm going wrong here?
Cheers and TIA,
Pablo.