On Wed, Jun 11, 2003 at 04:00:06PM -0500, Jay Strauss wrote:
> Found HTML::TableContentParser which does some of the heavy lifting for me
> playing with it now
> > http://quote.cboe.com/QuoteTable.asp?TICKER=qqq&ALL=2
> >
> > It seems like there would be a cpan thing to read in a string (html), then
> > would let me navigate.  That is, give me the third table, give me the
> > first row, give me the first table data

Save the HTML into a file with wget, then feed that file as an argument
to the Perl script below. If you want the calls and puts broken into
separate arrays or into hashes, it should be easy from here.

I would do a cleaner example (like pulling the page with LWP and
storing the data into a hash) if I thought I'd get paid for it. ;)

    TTFN,
      Mike

ps: If you want to see what each step is doing to the data, put a
    "print $_;" line after that step and pipe the output into less,
    so you can see the null characters clearly.

    This is a very simple table; I normally need to use \00, \01, \02,
    etc. to mark different chunks of data, so that after the HTML is
    gone I can identify what was what.

====
#! /usr/bin/perl -w

$_ = join '', <>;                       # suck in the html
s#^.*<!--Start Options Table-->##s;     # strip before interest
s#<!--End Options Table-->.*##s;        # strip after interest
s#^.*(<table)#$1#is;                    # fine tune strip before

s#<td[^>]*?>##g;                        # nuke table data starts
s#</td[^>]*?>#\00#g;                    # mark table data stops
s#[\r\n]##g;                            # nuke return and newline
s#\s+# #g;                              # nuke multiple spaces
s#<tr[^>]*?>##g;                        # nuke table record starts
s#</tr[^>]*?>#\n#g;                     # mark table record stops
s#</?[^>]*?>##g;                        # nuke all remaining html
s#^ ##mg;                               # nuke leading spaces

foreach $line (split '\n', $_) {        # work on each table record
  @ray = split "\0", $line;             # split based on data marks
  next if (@ray != 14);                 # ignore incomplete rows

  printf "%-23s %-9s %-5s %-5s %-5s %-4s %-8s " .
         "%-24s %-9s %-5s %-5s %-5s %-5s %-8s\n",
         @ray;                          # print the data nicely.
}
====

-- 
GPG key: http://simons-clan.com/~msimons/gpg/msimons.asc
Fingerprint: 524D A726 77CB 62C9 4D56  8109 E10C 249F B7FA ACBE
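For the "calls and puts into hashes" variation, here is a rough sketch
of one way it might look. It is not tested against the real page. It
assumes the first 7 fields of each 14-field row are the call side and
the last 7 are the put side (going by the two groups of seven in the
printf format), and that the first field of each half works as a hash
key. The split_row helper and the c1..c7 / p1..p7 sample row are made
up for illustration.

```perl
#! /usr/bin/perl -w
use strict;

# Hypothetical helper: split one null-marked record into a call half
# and a put half. Assumes 7 call columns followed by 7 put columns.
sub split_row {
    my ($line) = @_;
    my @ray = split "\0", $line;          # split based on data marks
    return unless @ray == 14;             # ignore incomplete rows
    return ([ @ray[0 .. 6] ], [ @ray[7 .. 13] ]);
}

my (%calls, %puts);

# Made-up placeholder row; real rows would come from the cleaned-up
# html, one record per line, as in the script above.
my $row = join("\0", (map { "c$_" } 1 .. 7), (map { "p$_" } 1 .. 7));

if (my ($call, $put) = split_row($row)) {
    $calls{ $call->[0] } = $call;         # keyed on first call field
    $puts{ $put->[0] }   = $put;          # keyed on first put field
}

print join(' ', @{ $calls{c1} }), "\n";
print join(' ', @{ $puts{p1} }), "\n";
```

In the real script you would call split_row($line) inside the foreach
loop in place of the printf. Keep in mind that split drops trailing
empty fields, so a row whose last cell is empty will come out shorter
than 14 and get skipped.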