I am trying to extract a table (<table class="xxxx"><tr><td>...... until
</table>) and its content from an HTML file.

In the file I have something like this:

<div id="product" class="product">
<table border="0" cellspacing="0" cellpadding="0" class="prodc"
title="Product ">
.
.
.
</table>
</div>

There could be more than one table in the file. However, I am only interested
in the table within <div id="product" class="product"> </div>.

/^.*<div id="product" class="product">.+?(<table border="0".+?\s+<\/table>)\s*<\/div>.*$/ims

The above, and the variations of it that I tried, do not match.

I can easily match this using sed; however, I need to do it in Perl.

This sed pipeline works just fine:

sed -n '/<div id="product" class="product">/,/<\/table>/p' thelo826.html |
  sed -n '/<table border.*/,/<\/table>/p' | sed -e 's/class=".*"//g'

Thanks

Mimi
