Andy - do yourself a favor and don't try it manually. This problem has been 
solved by a number of publicly available modules already. Take a look at 
HTML::Parser (or HTML::PullParser) for example, which offers extraction of 
content between opening and closing tags.
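As a rough sketch of what that could look like (not a definitive implementation): the snippet below feeds your three test cases to HTML::Parser and collects the text inside each <div>. The variable names ($html, @divs, $depth) and the handler bodies are only illustrative assumptions, and it assumes the HTML::Parser module is installed.

```perl
use strict;
use warnings;
use HTML::Parser;

# Sample input mirroring the three test cases from the original post.
my $html = <<'END_HTML';
Case #1
<div > Some text  Number 1
</div>

Case #2
<div > Some text  Number 2 </div>

Case #3
<div >
  Some text  Number 3
</div>
END_HTML

my @divs;         # text collected for each outermost <div> (illustrative name)
my $depth = 0;    # greater than 0 while the parser is inside a <div>

my $parser = HTML::Parser->new(
    api_version => 3,
    # Opening tag: start a new entry when entering an outermost <div>.
    start_h => [ sub {
        my ($tag) = @_;
        return unless $tag eq 'div';
        push @divs, '' if $depth == 0;
        $depth++;
    }, 'tagname' ],
    # Text may arrive in several chunks, so append rather than assign.
    text_h => [ sub {
        my ($text) = @_;
        $divs[-1] .= $text if $depth > 0;
    }, 'dtext' ],
    # Closing tag: leave the <div>.
    end_h => [ sub {
        my ($tag) = @_;
        $depth-- if $tag eq 'div' && $depth > 0;
    }, 'tagname' ],
);

$parser->parse($html);
$parser->eof;

for my $div (@divs) {
    $div =~ s/^\s+|\s+$//g;    # trim surrounding whitespace and newlines
    print "[$div]\n";
}
```

Because the parser tracks tags itself, it does not matter whether the opening and closing tags sit on the same line.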

Hope this helps
  Tobias
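
P.S. In case you are curious why the regexp itself misses Case #1 and Case #3: a likely cause, assuming the script reads the file one line at a time, is that each match attempt only ever sees a single line, so the "s" modifier has nothing to span and chomp() does not change that. The sketch below compares line-by-line matching with matching against the whole text at once. It reads from an in-memory filehandle so that it is self-contained; $html and the array names are illustrative, and with a real file you would pass the file name to open instead.

```perl
use strict;
use warnings;

# Sample input mirroring the three test cases from the original post.
my $html = <<'END_HTML';
Case #1
<div > Some text  Number 1
</div>

Case #2
<div > Some text  Number 2 </div>

Case #3
<div >
  Some text  Number 3
</div>
END_HTML

# Approach 1: match each line separately (assumed to be what the failing script does).
open my $fh, '<', \$html or die "open failed: $!";
my @by_line;
while (my $line = <$fh>) {
    # Only one line is visible here, so a tag pair split across lines cannot match.
    push @by_line, $1 if $line =~ m{<div >(.*?)</div>}s;
}
close $fh;

# Approach 2: read everything into one string, then match with /s and /g.
open $fh, '<', \$html or die "open failed: $!";
my $content = do { local $/; <$fh> };    # undefined $/ makes <$fh> return the whole input
close $fh;
my @slurped = $content =~ m{<div >(.*?)</div>}sg;

print scalar(@by_line), " match(es) line by line\n";
print scalar(@slurped), " match(es) on the whole text\n";
```

This works for simple, non-nested tags like the ones in your test file; for real-world HTML the parser modules remain the safer choice.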


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy Postulka
Sent: Tuesday, October 21, 2008 10:47 AM
To: perl-win32-users@listserv.ActiveState.com
Subject: RegExp matching over multiple lines

Hi to everyone,

First off, I hope I'm posting this to the right mailing list.
If not, please forgive me in advance. I just started using ActivePerl.

I'm having difficulty using RegExp to match/extract across several lines in an 
HTML file.

I want to match and extract everything between a pair of HTML tags.
This Match/Extraction can occur over multiple lines in an HTML file.

I'm using the following RegExp to test the HTML file for a Match/Extraction:

                /<div >(.*?)<\/div>/s


Here is my Test HTML file.

Case #1
<div > Some text  Number 1
</div>

Case #2
<div > Some text  Number 2 </div>

Case #3
<div >
  Some text  Number 3
</div>

The only match that occurs is "Case #2", where the Tag pair is on the same 
line.
So that tells me the RegExp works for a single line.

When I split the Tag pair over more than one line (Case #1 & Case #3), it fails 
to match.
Since the "." does not match "new line" characters, I've used the "s" 
modifier, but that does not seem to work.

I've tried different combinations of the "s" & "m" modifiers, which didn't 
seem to help.
I've also tried to "chomp()" each line to strip the new line characters 
before applying the RegExp, again with different combinations of the 
"sm" modifiers. That didn't work either.

I can't figure out what I'm doing wrong.
I'm using Active Perl 5.10.0 on a Windows XP computer.

Any help to solve this problem will be greatly appreciated.

Andy
