Paul Kraus wrote:

> Someone want to show me how this module can help parse out html?
>
> I want to grab text between <td>text</td> being able to apply regexp to
> get what I want.
>
> The problem is my text is among 10,000 td tags. With the only difference
> being what the above <th> tag has in it.
>
> So if th tag = then store text between <td> into an array.
>
> Paul

Hi Paul,

Sorry that earlier response was so dumb. I didn't connect the content with the subject line, I'm afraid. That may be because it included neither a data illustration nor anything you had tried on your own. Anyway, I hope this makes up for my negligence a bit.

I'm not sure that HTML::TokeParser::Simple adds anything to the functionality of HTML::TokeParser for your purposes [at least what you have described here]. The Simple part mostly has to do with making the tag types and attributes more transparent. I didn't see much in the docs about the data itself. Neither module seems all that user-friendly, but I got something along that line working.

With a simple table using headers:

table_test.html:

    <html>
    <head>
    <title> HTML::TokeParser Test </title>
    </head>
    <body>
    <table rows=4 cols=3>
    <tr> <th> Key </th> <th> name </th> <th> Address </th> </tr>
    <tr> <td> 1 </td> <td> George </td> <td> farewell </td> </tr>
    <tr> <td> 2 </td> <td> Abe </td> <td> Gettysburg </td> </tr>
    <tr> <td> 3 </td> <td> Joseph </td> <td> E-Mail </td> </tr>
    </table>

This [after many hours of near-misses] seemed to work:

E:\d_drive\perlStuff>perl -w -MHTML::TokeParser
my $tp = HTML::TokeParser->new('table_test.html')
    or die "Can't open table_test.html: $!";

# Collect the header names. Priming read first, then a plain test in
# the loop condition; the parser advances on every pass, so the loop
# always reaches the row's closing </tr>.
my @fields;
my $open_tag = $tp->get_tag('th');
while ($open_tag and $open_tag->[0] ne '/tr') {
    push @fields, $tp->get_text('/th');
    $open_tag = $tp->get_tag('th', '/tr');
}

# Each remaining <tr> is a data row: one <td> per header, in order.
my @data;
my $data_start = $tp->get_tag('tr');
while ($data_start) {
    my $values = {};
    foreach (@fields) {
        $tp->get_tag('td');
        $values->{$_} = $tp->get_text('/td');
    }
    push @data, $values;
    $data_start = $tp->get_tag('tr');
}

# Print each row as "header: value; " pairs.
foreach my $row (@data) {
    print "$_: $row->{$_}; " foreach keys %$row;
    print "\n";
}
^Z
Address : farewell ; name : George ; Key : 1 ;
Address : Gettysburg ; name : Abe ; Key : 2 ;
Address : E-Mail ; name : Joseph ; Key : 3 ;

It simply would not come together until I dealt with holiday cooking and celebrations, though. The main problem was that I had been trying to do too much in the control blocks of the while loops. These "shortcuts" kept creating situations where the loop would pass beyond the desired data and consume the whole file. Doing a priming round, and then doing a simple test of the value in the loop condition, helped a lot.

Of course, you still need a way to pick the particular row that you want, a complication that you didn't mention.

Joseph
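
The selection step left open above could be sketched roughly as follows, reading the question as "collect every <td> that sits under a given <th>". This is only a sketch under that assumption, not a tested solution for the real 10,000-cell page. The helper name column_under_header is made up for illustration and is not part of HTML::TokeParser. The table is passed as a string reference instead of a file name, and get_trimmed_text is used so the surrounding spaces are dropped.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not part of HTML::TokeParser): return the text of
# every <td> found in the column whose <th> text equals $wanted.
sub column_under_header {
    my ($html, $wanted) = @_;
    my $tp = HTML::TokeParser->new(\$html) or die "cannot parse HTML";

    # Priming read, then a plain test in the loop condition.
    my @fields;
    my $tag = $tp->get_tag('th');
    while ($tag and $tag->[0] ne '/tr') {
        push @fields, $tp->get_trimmed_text('/th');
        $tag = $tp->get_tag('th', '/tr');
    }

    # Position of the wanted header; return nothing if it is absent.
    my ($col) = grep { $fields[$_] eq $wanted } 0 .. $#fields;
    return () unless defined $col;

    # Walk the data rows, stopping each row at its own </tr>.
    my @cells;
    while ($tp->get_tag('tr')) {
        my @row;
        my $t = $tp->get_tag('td', '/tr');
        while ($t and $t->[0] ne '/tr') {
            push @row, $tp->get_trimmed_text('/td');
            $t = $tp->get_tag('td', '/tr');
        }
        push @cells, $row[$col] if defined $row[$col];
    }
    return @cells;
}

# Same table as table_test.html, inlined so the sketch is self-contained.
my $html = <<'HTML';
<table>
<tr> <th> Key </th> <th> name </th> <th> Address </th> </tr>
<tr> <td> 1 </td> <td> George </td> <td> farewell </td> </tr>
<tr> <td> 2 </td> <td> Abe </td> <td> Gettysburg </td> </tr>
<tr> <td> 3 </td> <td> Joseph </td> <td> E-Mail </td> </tr>
</table>
HTML

my @names   = column_under_header($html, 'name');
my @matches = grep { /^[AJ]/ } @names;   # apply any regexp you like here
print "@matches\n";                      # prints "Abe Joseph"
```

Because each row is read up to its own </tr>, a short row cannot pull cells out of the next one. Real pages with nested tables or colspan attributes would need more care than this.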