>I want to extract from a large number of html files everything between
>the following specified comments, including the comments themselves:
>
><!--Begin CMS Content-->...<!-- End CMS Content-->
<snip>
>And the regular expression I've got is
>
>'/[<!--Begin CMS Content\-\->].+[<!-- End CMS Content\-\->]/s'
>
>I expected that when I ran this using preg_match_all I would get two
>matches

Those brackets define a character class, which means "match any one of the
characters found within", so it will match '<', or '!', or '-', or 'B', or...
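
To make the difference concrete, here is a small sketch comparing a bracketed
set with a plain literal sequence. The sample string 'cabbage' and the variable
names are made up for illustration and are not from your files:

```php
<?php
// Hypothetical sample string, chosen only to show the difference.
$text = 'cabbage';

// Character class: every match is ONE character that is 'a', 'b', or 'c'.
preg_match_all('/[abc]/', $text, $class);

// Literal sequence: matches the three characters 'abb' in that exact order.
preg_match_all('/abb/', $text, $literal);

print_r($class[0]);   // five one-character matches: c, a, b, b, a
print_r($literal[0]); // one match: abb
```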

You want this:
        '/<!--Begin CMS Content-->(.+)<!-- End CMS Content-->/Uis'

...which gets you this (I added the parentheses in the middle so you could
also get the stuff inside the CMS content delimiters):

        Array
        (
            [0] => Array
                (
                    [0] => <!--Begin CMS Content-->
        <span class="headline">Breadth Requirement</span>
        <hr class="under" />
        <!-- End CMS Content-->
                    [1] => <!--Begin CMS Content-->
        <strong>More Matched Content!</strong>
        <!-- End CMS Content-->
                )

            [1] => Array
                (
                    [0] =>
        <span class="headline">Breadth Requirement</span>
        <hr class="under" />

                    [1] =>
        <strong>More Matched Content!</strong>

                )

        )
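
For reference, output like the above can be produced with something along
these lines. This is only a sketch: the $html string is a made-up stand-in for
one of your files, and the variable names are my own.

```php
<?php
// Hypothetical sample input standing in for the contents of one HTML file.
$html = <<<'HTML'
<p>page header</p>
<!--Begin CMS Content-->
<span class="headline">Breadth Requirement</span>
<hr class="under" />
<!-- End CMS Content-->
<p>navigation</p>
<!--Begin CMS Content-->
<strong>More Matched Content!</strong>
<!-- End CMS Content-->
HTML;

// U = ungreedy (stop at the first closing comment, not the last),
// i = case-insensitive, s = let '.' match newlines as well.
$pattern = '/<!--Begin CMS Content-->(.+)<!-- End CMS Content-->/Uis';

$count = preg_match_all($pattern, $html, $matches);

echo $count, "\n";   // number of delimited blocks found
print_r($matches);   // [0] = full matches, [1] = text between the comments
```

Without the U modifier, the .+ would be greedy and swallow everything from the
first opening comment to the last closing comment as a single match. Writing
.+? and dropping the U should behave the same way here.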


---------------------------------------------------------------------
michal migurski- contact info and pgp key:
sf/ca            http://mike.teczno.com/contact.html

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
