Hello all
I have been scratching my head for the last two days about this regular
expression problem. I would be really VERY happy if someone could help me!
I have the following text in the file 'text.htm', for example:
--
<BLOCKQUOTE><P>
Cow, Cow, Cow, Cow, Cow
Cow, Cow, Cow, Cow, Cow
Cow, Cow, Cow, Cow, Cow
a lot of lines
</P></BLOCKQUOTE>
<p>boring stuff - we are not interested in this....</p>
<BLOCKQUOTE><P>
Chicken, Chicken, Chicken
Chicken, Chicken, Chicken
Chicken, Chicken, Chicken
more lines
</P></BLOCKQUOTE>
<p>more boring stuff - we are not interested in this....</p>
<BLOCKQUOTE><P>
Rabbit, Rabbit, Rabbit, Rabbit
</P></BLOCKQUOTE>
<p>even more boring stuff - we are not interested in this....</p>
<BLOCKQUOTE><P>
Pig, Pig, Pig, Pig, Pig
</P></BLOCKQUOTE>
--
I want to return all the stuff between <BLOCKQUOTE><P> ... </P></BLOCKQUOTE>
in an array, one element per match. For example, for the above text, I would
like to get back an array like this:
array(
"Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow a lot of lines",
"Chicken, Chicken, Chicken Chicken, Chicken, Chicken Chicken, Chicken, Chicken more lines",
"Rabbit, Rabbit, Rabbit, Rabbit",
"Pig, Pig, Pig, Pig, Pig"
)
I have been trying to do this with (many variations of) the following code:
--
<?PHP
// open file
$fd = fopen ("./text.htm", "r");
// load contents into a variable (initialised first so .= does not append to an undefined variable)
$content = "";
while (!feof ($fd))
{
$content .= fgets($fd, 4096);
}
// close file
fclose ($fd);
// remove char returns and co.
$content = preg_replace("/(\r\n)|(\n\r)|(\n|\r)/", " ",$content);
// match against regex -- this does not work correctly....
if (preg_match("/<BLOCKQUOTE><P>(.*)<\/P><\/BLOCKQUOTE>/i",$content,$matches))
{
echo "<pre>";
var_dump($matches);
echo "</pre>";
}
?>
--
For the above, var_dump() returns this:
--
array(2) {
[0]=>
string(556) "<BLOCKQUOTE><P> Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow,
Cow Cow, Cow, Cow, Cow, Cow a lot of lines </P></BLOCKQUOTE> <p>boring
stuff - we are not interested in this....</p> <BLOCKQUOTE><P> Chicken,
Chicken, Chicken Chicken, Chicken, Chicken Chicken, Chicken, Chicken more
lines </P></BLOCKQUOTE> <p>more boring stuff - we are not interested in
this....</p> <BLOCKQUOTE><P> Rabbit, Rabbit, Rabbit, Rabbit
</P></BLOCKQUOTE> <p>even more boring stuff - we are not interested in
this....</p> <BLOCKQUOTE><P> Pig, Pig, Pig, Pig, Pig </P></BLOCKQUOTE>"
[1]=>
string(524) " Cow, Cow, Cow, Cow, Cow Cow, Cow, Cow, Cow, Cow Cow, Cow,
Cow, Cow, Cow a lot of lines </P></BLOCKQUOTE> <p>boring stuff - we are not
interested in this....</p> <BLOCKQUOTE><P> Chicken, Chicken, Chicken
Chicken, Chicken, Chicken Chicken, Chicken, Chicken more lines
</P></BLOCKQUOTE> <p>more boring stuff - we are not interested in
this....</p> <BLOCKQUOTE><P> Rabbit, Rabbit, Rabbit, Rabbit
</P></BLOCKQUOTE> <p>even more boring stuff - we are not interested in
this....</p> <BLOCKQUOTE><P> Pig, Pig, Pig, Pig, Pig "
}
--
Clearly not what I want.
Is my approach here incorrect? Or is it indeed possible to construct a regex
to do what I want (with just one pass of the text)?
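One direction that looks promising, as a minimal sketch rather than a confirmed fix for the real file: the (.*) in the pattern above is greedy, so it runs on to the last </P></BLOCKQUOTE> in the text, and preg_match() stops after the first match anyway. Switching to the lazy (.*?) and to preg_match_all() should give one element per block. The shortened $content string and the $blocks variable below are made up for illustration and stand in for the real text.htm.

```php
<?php
// Sketch only: $content is a shortened, hypothetical stand-in for text.htm.
$content = "<BLOCKQUOTE><P>\nCow, Cow\nCow, Cow\n</P></BLOCKQUOTE>\n"
         . "<p>boring stuff</p>\n"
         . "<BLOCKQUOTE><P>\nPig, Pig\n</P></BLOCKQUOTE>\n";

// Collapse every run of whitespace (including line breaks) into one space.
$content = preg_replace('/\s+/', ' ', $content);

// (.*?) is lazy: it stops at the first closing tag instead of the last one.
// preg_match_all() collects every match; $matches[1] holds the captured groups.
preg_match_all('/<BLOCKQUOTE><P>(.*?)<\/P><\/BLOCKQUOTE>/is', $content, $matches);

// Trim the space left over next to the tags.
$blocks = array_map('trim', $matches[1]);

var_dump($blocks);
```

With the sample string, $blocks holds two elements, "Cow, Cow Cow, Cow" and "Pig, Pig". The s modifier lets . match newlines, so the pattern would also work without flattening the line breaks first, though the captured text would then keep its line breaks.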
Thank you in advance.
:-))
S.
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php