Hi Greg,

This is one of the many things that I don’t know much about. My first thought was that WWW::Mechanize might be the way to go, but I found this in the WWW::Mechanize documentation:

[quote]
Please note that Mech does NOT support JavaScript, you need additional software for that. Please check "JavaScript" in WWW::Mechanize::FAQ for more.
[end quote]

(“Mech” is just an abbreviation for “WWW::Mechanize”.) It’s probably a good idea for you to peruse that FAQ:
http://search.cpan.org/~jesse/WWW-Mechanize-1.72/lib/WWW/Mechanize/FAQ.pod

In there, I found lots of stuff, including:

[quote]
Which modules work like Mechanize and have JavaScript support?
In no particular order: Gtk2::WebKit::Mechanize, Win32::IE::Mechanize, WWW::Mechanize::Firefox, WWW::Scripter, WWW::Selenium
[end quote]

For 32-bit perls, I provide ppm packages for Gtk2::WebKit (and dependencies). Gtk2::WebKit::Mechanize is pure perl, and would (presumably) be installable with:

    cpan -i Gtk2::WebKit::Mechanize

once Gtk2::WebKit and its dependencies have been installed. However, for 64-bit perls I don’t yet have ppm packages for Gtk2::WebKit. I think the holdup was some problem I had in obtaining a 64-bit WebKit library.

For 64-bit 5.16.x perl, ActiveState provide ppm packages for both WWW::Scripter and WWW::Selenium, so that might be your best bet for x64 perl-5.16.0.

On 64-bit Strawberry, ‘cpan -i Win32::IE::Mechanize’ hung during ‘dmake test’, but the module might be serviceable if you download the source, unpack it, then install manually with ‘perl Makefile.PL’, ‘dmake test’, ‘dmake install’.

Hope there’s something there that helps. You might also try perlmonks, where you’ll possibly encounter more people who have faced (and, hopefully, solved) the same problem on Windows. They’re a bit finicky over there, however, and don’t respond all that well to requests for help if no code attempts are provided.
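Before choosing among the modules in the FAQ list, it may help to see which of them your perl can already load. The following is only a minimal sketch using core Perl; the helper name is_installed is made up for illustration, and "can be loaded" is taken as a stand-in for "installed" (a module that is present but broken will be reported as not installed).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Candidate modules, taken from the WWW::Mechanize FAQ list quoted above.
my @candidates = qw(
    Gtk2::WebKit::Mechanize
    Win32::IE::Mechanize
    WWW::Mechanize::Firefox
    WWW::Scripter
    WWW::Selenium
);

# Return 1 if the named module can be loaded by this perl, 0 otherwise.
# (is_installed is a hypothetical helper name, not part of any library.)
sub is_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report the status of each candidate, one per line.
for my $module (@candidates) {
    printf "%-26s %s\n", $module,
        is_installed($module) ? 'installed' : 'not installed';
}
```

Run it with the same perl you intend to use for scraping (for example the 64-bit ActiveState or Strawberry build), since each perl installation has its own set of modules.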
Cheers,
Rob

From: Greg Aiken
Sent: Friday, February 15, 2013 11:36 AM
To: Perl-Win32-Users@listserv.ActiveState.com
Subject: perl to web-scrape an html page that requires 'Javascript' to fully render

ive used the sample/simple lwpget.pl sample script (thats found in numerous place on the internet, which uses LWP package).

if the url requested is 'static content', the page is received in its full/complete content.

if however, the url requested is dynamic content (that normally requires Javascript to be running in the 'client-browser-environment' - to be fully rendered) - then obviously, in this case - such a page does not retrieve with all parts being present.

is there any perl package that adds Javascript functionality to the basic lwpget.pl - so that when the perl-script-acting-as-a-web-client requests the page, its actually parsed, and javascript functions are performed - to allow the perl requested page to be the full equivalent as if the page were rendered in a fully Javascript aware web browser?

any help here would be appreciated, as well as a sample script.

sincerely,
greg

--------------------------------------------------------------------------------
_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs