I am an intermediate Perl user. I taught myself Perl by reading "Learning Perl" along with some online tutorials, and I have a few other reference texts. I can generally do what I need to with Perl, but my code is far from elegant. I understand the very basics of object-oriented programming in Perl, but I generally need sample code to get started with modules from CPAN. I am a professor at Rice University and have found Perl, especially its regular expression capabilities, to be invaluable for extracting data for my research. I have been unable to attend any of the monthly meetings, but hope to in the future.
For my current project, I am trying to extract historical financial statement data from www.marketwatch.com. The URL is:

    http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0

I use WWW::Mechanize to download the web page and then HTML::TableExtract to extract the text that I need. After extracting the table at depth=1, count=1, I want to transpose it so that each year is a row and each variable is a column. I have not been able to find any documentation on how to extract a column from a table using HTML::TableExtract.

The following simple program downloads the page with WWW::Mechanize, extracts the table with HTML::TableExtract, and prints each row:

    #!/usr/bin/perl
    use strict;
    use HTML::TableExtract;
    use WWW::Mechanize;

    # Download the page; autocheck makes a failed request fatal.
    my $marketwatch = WWW::Mechanize->new( autocheck => 1 );
    $marketwatch->get("http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0");
    chomp(my $html = $marketwatch->content);

    # Select the table at depth 1, count 1 and parse the page.
    my $table = HTML::TableExtract->new(
        keep_html    => 0,
        depth        => 1,
        count        => 1,
        br_translate => 0,
    );
    $table->parse($html);

    # Print each row as a tab-separated line.
    foreach my $row ($table->rows) {
        print join("\t", @$row), "\n";
    }

I am not able to figure out how to use the columns method. My intuition tells me it should be something like the following (but my intuition is wrong):

    foreach my $column ($table->columns) {
        print join("\t", @$column), "\n";
    }

The error message I get says:

    Can't locate object method "columns" via package "HTML::TableExtract".

The documentation doesn't shed much light (for me, anyway). I can see in the module's code that the columns method belongs to HTML::TableExtract::Table, but I can't figure out how to use it.

I appreciate any help. For an experienced programmer, I am sure this is trivial, but I am the closest thing to a programmer in my department, and I don't really have anyone nearby to ask for help.
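One way to get the year-per-row layout without calling the columns method at all is to transpose the extracted rows by hand. The following is only a minimal sketch under stated assumptions: the `transpose` helper is a hypothetical name introduced here, and the `@rows` data is made up to stand in for the list of array references returned by `$table->rows`. It is not taken from the MarketWatch page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a list of row array references into a list
# of column array references. Short rows are padded with empty strings
# so that every output row has one entry per input row.
sub transpose {
    my @rows = @_;

    # Find the widest row so no trailing cells are lost.
    my $width = 0;
    for my $row (@rows) {
        $width = @$row if @$row > $width;
    }

    my @columns;
    for my $c (0 .. $width - 1) {
        push @columns, [ map { defined $_->[$c] ? $_->[$c] : '' } @rows ];
    }
    return @columns;
}

# Made-up sample data standing in for the result of $table->rows.
my @rows = (
    [ 'Item',       '2004', '2005' ],
    [ 'Sales',      '10',   '12'   ],
    [ 'Net income', '1',    '2'    ],
);

# After transposing, each year is a row and each variable is a column.
my @transposed = transpose(@rows);
print join("\t", @$_), "\n" for @transposed;
```

In the program above, `transpose($table->rows)` would be the equivalent call. Since the columns method is defined in HTML::TableExtract::Table rather than in HTML::TableExtract itself, it is presumably meant to be called on the individual table objects (for example, those returned by `$table->tables`) rather than on the extractor; that is worth checking against the documentation for the installed version.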
_______________________________________________
Houston mailing list
Houston@pm.org
http://mail.pm.org/mailman/listinfo/houston
Website: http://houston.pm.org/