Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file
> "TS" == Tolkin, Steve <[EMAIL PROTECTED]> writes:

  TS> Checking if your kit is complete...
  TS> Looks good
  TS> Warning: prerequisite LWP::UserAgent 2.024 not found. We have 1.004.
  TS> Warning: prerequisite Test::LongString 0 not found.
  TS> Warning: prerequisite URI 1.25 not found. We have 1.19.
  TS> Writing Makefile for WWW::Mechanize

use CPAN.pm, which can install prerequisites for you. no need for a bundle.

  TS> if ( !$skiplive ) {
  TS>     require IO::Socket;
  TS>     my $s = IO::Socket::INET->new(
  TS>         PeerAddr => "www.google.com:80",
  TS>         Timeout  => 10,
  TS>     );

  TS> I think my proxy is set up correctly.

  TS> C:\perl_install\WWW-Mechanize-1.14>env | grep -i proxy
  TS> FTP_PROXY=http://proxbos1.fmr.com:8000
  TS> HTTP_PROXY=http://proxbos1.fmr.com:8000

the proxy settings you refer to only apply if you use an HTTP client such as LWP or WWW::Mechanize or a browser. IO::Socket knows nothing about protocols or proxies, as it is a direct connection to a socket (any socket). IO::Socket speaks no proxy protocol of its own (there are programs which will do socket redirection). the problem is that you can't redirect on the fly, and sockets don't have any way of conveying out of band data (yes, there is socket OOB, but it is rarely used). so how would a socket proxy know that google.com:80 means to first connect to your proxy and then redirect to that host/port address? with web proxies, the client connects to the proxy and sends it a web request (inband and in the proxy format), and the proxy then connects to the real url.

  TS> How can I learn more about why IO::Socket::INET->new failed?

it failed because fidelity doesn't allow direct access to the outside world on port 80 (or any other port). all web access is via proxies. i wouldn't worry about passing the tests, as mechanize is very stable in general. just install the dependencies you want (some may be optional) and install mechanize. mechanize will obey proxies since it subclasses LWP, which does the web work.
uri

--
Uri Guttman -- [EMAIL PROTECTED] http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs http://jobs.perl.org
___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm
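To make the "inband and in the proxy format" point concrete, here is a minimal sketch of what an HTTP client writes to the socket in each case. The helper name build_request is made up for illustration and is not part of LWP or IO::Socket; the request lines are plain HTTP/1.0, and nothing here touches the network.

```perl
use strict;
use warnings;

# Hypothetical helper: return the host:port a client would connect to,
# plus the text it would then write to that socket.
sub build_request {
    my ($url, $proxy) = @_;

    my ($host, $port, $path) = $url =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?$}
        or die "not a plain http URL: $url\n";
    $port ||= 80;
    $path = '/' unless defined $path;
    my $host_header = $port == 80 ? $host : "$host:$port";

    if (defined $proxy) {
        # Via a proxy: connect to the proxy, and put the whole URL in the
        # request line so the proxy learns where to forward the request.
        my ($phost, $pport) = $proxy =~ m{^http://([^/:]+):(\d+)/?$}
            or die "unexpected proxy URL: $proxy\n";
        return ("$phost:$pport",
                "GET $url HTTP/1.0\r\nHost: $host_header\r\n\r\n");
    }

    # Direct: connect to the server itself and send only the path.
    return ("$host:$port",
            "GET $path HTTP/1.0\r\nHost: $host_header\r\n\r\n");
}

my ($peer, $request) = build_request('http://www.google.com/',
                                     'http://proxbos1.fmr.com:8000');
print "connect to: $peer\n$request";
```

The destination travels inside the request text itself, which is why a bare IO::Socket::INET->new(PeerAddr => "www.google.com:80") has nowhere to put it.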
Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file
I didn't notice or use a bundle. CPANPLUS will handle dependencies, they say, but I just kept grabbing modules until it shut up.

# 1. What might cause IO::Socket::INET->new to fail?

# I *am* directly connected to the Internet, so the first warning is probably caused by a proxy problem.

No, you are not directly connected. (Try a traceroute to find out just how indirect you are ;-)

Assuming this is on the FMR internal network, you are on a private NAT'd subnet whose firewall/gateways require a proxy connection for port 80 traffic.

So yes, you do need to set the proxies:

    FTP_PROXY=...
    HTTP_PROXY=...

and no, those will not make the Makefile.PL think you're directly connected. Those are used by LWP:: for the HTTP: and FTP: schemes, but not by raw IO::Socket::INET, which is what tested whether you were DIRECTLY connected. The LWP proxy scheme probably uses IO::Socket::INET to connect to the proxy.

If you have those proxies set, you can probably let the Makefile.PL create those tests anyway.

-- Bill
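One way to confirm that LWP really does see those variables is to ask it. This is a sketch only: lwp_http_proxy is a made-up helper name, and it assumes LWP::UserAgent's env_proxy method and its one-argument proxy('http') getter, returning nothing if LWP is not installed.

```perl
use strict;
use warnings;

# Report which HTTP proxy, if any, LWP::UserAgent picks up from the
# environment. Returns undef when LWP is missing or no proxy is set.
sub lwp_http_proxy {
    return unless eval { require LWP::UserAgent; 1 };

    my $ua = LWP::UserAgent->new;
    $ua->env_proxy;              # load the *_proxy environment variables
    return $ua->proxy('http');   # with one argument, returns the current setting
}

my $proxy = lwp_http_proxy();
print defined $proxy
    ? "LWP will use $proxy for http\n"
    : "LWP sees no http proxy (or LWP is not installed)\n";
```

A plain LWP::UserAgent object ignores the environment until env_proxy is called. As far as I know, WWW::Mechanize makes that call itself in its constructor unless told not to, but check the documentation for your version.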
Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file
Summary:
1. What might cause IO::Socket::INET->new to fail?
2. Is there a bundle for WWW-Mechanize?

Details:
I went to http://search.cpan.org/dist/WWW-Mechanize/ and read the doc, and it looks promising. I downloaded the tar.gz file, extracted all its files, and started the usual install process. Unfortunately I hit a variety of problems. Here is the output:

    C:\perl_install\WWW-Mechanize-1.14>perl makefile.pl
    It seems that you are not directly connected to the Internet. Some of the
    WWW::Mechanize tests interact with websites such as Google, in addition to
    its own internal tests.
    Do you want to skip these tests? [y] y
    Do you want to install the mech-dump utility? [y] y
    It looks like you don't have SSL capability (like IO::Socket::SSL) installed.
    You will not be able to process https:// URLs correctly.
    WWW::Mechanize likes to have a lot of test modules for some of its tests.
    The following are modules that would be nice to have, but not required.
    Test::Pod
    Test::Memory::Cycle
    Test::Warn
    Checking if your kit is complete...
    Looks good
    Warning: prerequisite LWP::UserAgent 2.024 not found. We have 1.004.
    Warning: prerequisite Test::LongString 0 not found.
    Warning: prerequisite URI 1.25 not found. We have 1.19.
    Writing Makefile for WWW::Mechanize

I *am* directly connected to the Internet, so the first warning is probably caused by a proxy problem. Looking inside the Makefile.PL, I think the specific test that failed is:

    if ( !$skiplive ) {
        require IO::Socket;
        my $s = IO::Socket::INET->new(
            PeerAddr => "www.google.com:80",
            Timeout  => 10,
        );

I think my proxy is set up correctly.

    C:\perl_install\WWW-Mechanize-1.14>env | grep -i proxy
    FTP_PROXY=http://proxbos1.fmr.com:8000
    HTTP_PROXY=http://proxbos1.fmr.com:8000

How can I learn more about why IO::Socket::INET->new failed?

The other errors are dependencies on other modules, or on newer versions of modules. Is there a "bundle" for WWW-Mechanize?

Thanks,
Steve

-----Original Message-----
From: Ricker, William
Sent: Wednesday, September 14, 2005 5:05 PM
To: Tolkin, Steve; L-boston-pm
Subject: RE: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file

Is this to implement the missing PRINTABLE PAGE button for just yourself, or as part of the website?

This sounds a lot like one of the examples in MDJ's new Higher Order Perl book.

Outside of HOP, WWW::Mechanize is the new wrapper around LWP::UserAgent for this sort of thing. Makes my old LWP-wielding cache-and-smash implementation look lumpy ...

Bill
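On the "how can I learn more about why it failed" question: IO::Socket::INET->new returns undef on failure and leaves the reason in $@, so a small probe script can print it. The sketch below is not from the Makefile.PL; try_connect and the script name probe.pl are made up for illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Try a plain TCP connection. Returns ($socket, undef) on success or
# (undef, $reason) on failure, where $reason includes the text from $@.
sub try_connect {
    my ($peer, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $peer,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return ($sock, undef) if $sock;
    return (undef, "connect to $peer failed: $@");
}

# Only probe when a host:port is given on the command line.
if (@ARGV) {
    my ($sock, $err) = try_connect($ARGV[0], 10);
    print $sock ? "direct connection to $ARGV[0] works\n" : "$err\n";
}
```

Run it as `perl probe.pl www.google.com:80`. Behind a firewall that only allows proxied web access, expect a timeout or a refused connection rather than anything mentioning the proxy.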
[Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file
This seems like a problem that would be easily solved with a small perl script.

Many web pages have a large list of links. I would like to follow all the links, to some small depth (typically just 1), and put their output into one file, in some format suitable for printing. I am flexible about the order of the links, the details of the format, etc.

This has probably been written already. Having it in perl would let me modify it, which might be useful. (If there is a reliable freeware or shareware program, I would also be interested in that.)

Thanks,
Steve

P.S. perl -v says:

    This is perl, v5.8.0 built for MSWin32-x86-multi-thread
    (with 1 registered patch, see perl -V for more detail)

    Copyright 1987-2002, Larry Wall

    Binary build 805 provided by ActiveState Corp. http://www.ActiveState.com
    Built 18:08:02 Feb 4 2003
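The "follow links to depth n and glue the results together" part can be sketched independently of how pages are fetched. Nobody in the thread posted this; it is one possible shape, a breadth-first walk. combine_pages and the $fetch callback interface are assumptions made up for this example, and the in-memory %site stands in for real web pages so the script runs without a network.

```perl
use strict;
use warnings;

# Breadth-first walk from $start, following links at most $max_depth steps.
# $fetch is a caller-supplied code ref: given a URL it returns
# ($text, @links), or an empty list if the page cannot be fetched.
# Returns one string holding every page reached, each under a header line.
sub combine_pages {
    my ($start, $max_depth, $fetch) = @_;

    my %seen  = ($start => 1);
    my @queue = ([$start, 0]);
    my $out   = '';

    while (my $item = shift @queue) {
        my ($url, $depth) = @$item;
        my ($text, @links) = $fetch->($url) or next;
        $out .= "==== $url ====\n$text\n\n";

        next if $depth >= $max_depth;      # do not expand past the limit
        for my $link (@links) {
            push @queue, [$link, $depth + 1] unless $seen{$link}++;
        }
    }
    return $out;
}

# Stand-in for real fetching: a tiny in-memory "site".
my %site = (
    'a' => ['Page A', 'b', 'c'],
    'b' => ['Page B', 'd'],
    'c' => ['Page C', 'a'],
    'd' => ['Page D'],
);
my $fetch = sub { my $page = $site{ $_[0] } or return; return @$page };

print combine_pages('a', 1, $fetch);
```

With a depth of 1 this prints pages a, b and c and never visits d. For real use, $fetch would wrap an HTTP client; with WWW::Mechanize that would presumably mean calling get, then returning content plus the absolute URLs from links, with the HTML-to-printable-text step left to whatever formatter you prefer.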