Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file

2005-09-15 Thread Uri Guttman
> "TS" == Tolkin, Steve <[EMAIL PROTECTED]> writes:

  TS> Checking if your kit is complete...
  TS> Looks good
  TS> Warning: prerequisite LWP::UserAgent 2.024 not found. We have 1.004.
  TS> Warning: prerequisite Test::LongString 0 not found.
  TS> Warning: prerequisite URI 1.25 not found. We have 1.19.
  TS> Writing Makefile for WWW::Mechanize
 
use CPAN.pm, which can install prerequisites for you (e.g.
perl -MCPAN -e "install WWW::Mechanize"). no need for a bundle.

  TS> if ( !$skiplive ) {
  TS> require IO::Socket;
  TS> my $s = IO::Socket::INET->new(
  TS> PeerAddr => "www.google.com:80",
  TS> Timeout  => 10,
  TS> );

  TS> I think my proxy is set up correctly.
  TS> C:\perl_install\WWW-Mechanize-1.14>env | grep -i proxy
  TS> FTP_PROXY=http://proxbos1.fmr.com:8000
  TS> HTTP_PROXY=http://proxbos1.fmr.com:8000

the proxy settings you refer to only apply if you use an HTTP client such as LWP
or WWW::Mechanize or a browser. IO::Socket knows not from any protocols
or proxies as it is a direct connection to a socket (any socket). there
is no standard socket proxy protocol (there are programs which will do
socket redirection). the problem is that you can't redirect on the fly
and sockets don't have any way of conveying out of band data (yes there
is socket OOB but it is rarely used). so how would the socket proxy know
that google.com:80 means to first connect to your proxy and then
redirect to that host/port address? a web proxy is connected to by a
client, which then sends a web request (in-band and in the proxy format),
and the proxy then connects to the real url.
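
To make "in the proxy format" concrete, here is a minimal sketch of the difference between a direct request and a proxied one. The helper name build_get_request is made up for illustration, and it only handles plain http:// URLs; real clients such as LWP send more headers than this.

```perl
use strict;
use warnings;

# Build the text of a minimal HTTP/1.0 GET request (illustrative helper).
#   $via_proxy false: request line carries only the path (sent to the origin server)
#   $via_proxy true:  request line carries the absolute URL (sent to the proxy)
sub build_get_request {
    my ($url, $via_proxy) = @_;
    my ($host, $port, $path) = $url =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?$}
        or die "unsupported URL: $url";
    $path = '/' unless defined $path;
    my $hostport = defined $port ? "$host:$port" : $host;
    my $target   = $via_proxy ? "http://$hostport$path" : $path;
    return "GET $target HTTP/1.0\r\nHost: $hostport\r\n\r\n";
}

# Same URL, two wire formats.
print build_get_request('http://www.google.com/', 0);
print build_get_request('http://www.google.com/', 1);
```

The socket is opened to a different place in each case: to www.google.com:80 for the first form, and to the proxy's own host and port for the second. A bare IO::Socket::INET connect carries no such request, so a web proxy has nothing to act on.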

  TS> How can I learn more about why IO::Socket::INET->new failed?

it failed because fidelity doesn't allow direct access to the outside
world on port 80 (or any other port). all web access is via proxies.
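
As for seeing the reason directly: IO::Socket::INET leaves an error message in $@ when new() returns undef. A small probe along these lines should show it (a sketch; try_connect is a made-up helper name, and the exact wording of the message depends on the OS and the module version):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Try a plain TCP connect and report why it failed, if it did.
# On failure IO::Socket::INET->new returns undef and puts its reason in $@.
sub try_connect {
    my ($peer, $timeout) = @_;
    my $s = IO::Socket::INET->new(
        PeerAddr => $peer,
        Timeout  => $timeout,
    );
    return $s ? "connected to $peer" : "connect to $peer failed: $@";
}

# Probe every host:port given on the command line (nothing runs without arguments).
print try_connect($_, 10), "\n" for @ARGV;
```

Saved as, say, probe.pl (a hypothetical file name), running `perl probe.pl www.google.com:80 proxbos1.fmr.com:8000` would be expected to fail on the first address and succeed on the second if only the proxy is reachable from inside the firewall.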

i wouldn't worry about passing the tests as mechanize is very stable in
general. just install the dependencies you want (some may be optional)
and install mechanize. mechanize will obey proxies since it subclasses
LWP which does the web work.
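
To check what LWP will actually pick up from the environment, something like the following should do. It is a sketch using LWP::UserAgent's env_proxy and proxy methods; proxy_from_env is a made-up helper name. As far as I know, WWW::Mechanize calls env_proxy itself in its constructor unless told otherwise, so the same variables apply to it.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Report which proxy (if any) a fresh LWP::UserAgent would use for a scheme,
# based only on the *_proxy environment variables.
sub proxy_from_env {
    my ($scheme) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->env_proxy;                 # load http_proxy, ftp_proxy, no_proxy, ...
    return $ua->proxy($scheme);     # one-argument form returns the current setting
}

for my $scheme (qw(http ftp)) {
    my $proxy = proxy_from_env($scheme);
    printf "%-4s => %s\n", $scheme, defined $proxy ? $proxy : '(direct)';
}
```

With the HTTP_PROXY and FTP_PROXY settings shown above, both lines should name the proxy host rather than "(direct)".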

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org
 
___
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file

2005-09-15 Thread Ricker, William
I didn't notice or use a bundle.  CPANPLUS will handle dependencies they
say, but I just kept grabbing modules until it shut up.

# 1. What might cause IO::Socket::INET->new to fail?
# I *am* directly connected to the Internet, so the first warning is
# probably caused by a proxy problem.

No, you are not directly connected. ( Try a traceroute to find out just
how indirect you are ;-)
 
Assuming this is on the FMR internal network, you are on a private NAT'd
subnet whose firewall/gateways require a proxy connection for port 80
traffic.  So yes, you do need to set the proxies:

 FTP_PROXY=...
 HTTP_PROXY=...

and no, those will not make the Makefile think you're directly
connected.  Those are used by LWP::, for HTTP: and FTP: schemes, but not
by raw IO::Socket::INET, which tested if you were DIRECTLY connected.
The LWP proxy scheme probably uses IO::Socket::INET to connect to the
proxy.

If you have those proxies set, you can probably let the Makefile.PL
create those tests anyway.


-- Bill




Re: [Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file

2005-09-15 Thread Tolkin, Steve
Summary:
1. What might cause IO::Socket::INET->new to fail?
2. Is there a bundle for WWW-Mechanize?  

Details:
I went to http://search.cpan.org/dist/WWW-Mechanize/ and read the doc
and it looks promising.  
I downloaded the tar.gz file, extracted all its files, and started the
usual install process.
Unfortunately I hit a variety of problems.  Here is the output:

C:\perl_install\WWW-Mechanize-1.14>perl makefile.pl

It seems that you are not directly connected to the Internet.  Some
of the WWW::Mechanize tests interact with websites such as Google,
in addition to its own internal tests.

Do you want to skip these tests? [y] y
Do you want to install the mech-dump utility? [y] y

It looks like you don't have SSL capability (like IO::Socket::SSL)
installed.
You will not be able to process https:// URLs correctly.


WWW::Mechanize likes to have a lot of test modules for some of its
tests.
The following are modules that would be nice to have, but not required.

Test::Pod
Test::Memory::Cycle
Test::Warn


Checking if your kit is complete...
Looks good
Warning: prerequisite LWP::UserAgent 2.024 not found. We have 1.004.
Warning: prerequisite Test::LongString 0 not found.
Warning: prerequisite URI 1.25 not found. We have 1.19.
Writing Makefile for WWW::Mechanize
 
//

I *am* directly connected to the Internet, so the first warning is
probably caused by a proxy problem. 
Looking inside the Makefile.PL I think the specific test that failed is:

if ( !$skiplive ) {
require IO::Socket;
my $s = IO::Socket::INET->new(
PeerAddr => "www.google.com:80",
Timeout  => 10,
);

I think my proxy is set up correctly.
C:\perl_install\WWW-Mechanize-1.14>env | grep -i proxy
FTP_PROXY=http://proxbos1.fmr.com:8000
HTTP_PROXY=http://proxbos1.fmr.com:8000

How can I learn more about why IO::Socket::INET->new failed?

The other errors are dependencies on other modules, or newer versions
of modules.
Is there a "bundle" for WWW-Mechanize?


Thanks,
Steve

-----Original Message-----
From: Ricker, William 
Sent: Wednesday, September 14, 2005 5:05 PM
To: Tolkin, Steve; L-boston-pm
Subject: RE: [Boston.pm] Combining the nodes reachable in n steps from a
web page into one printable file


Is this to implement the missing PRINTABLE PAGE button for just yourself
or as part of the website?

This sounds a lot like one of the examples in MDJ's new Higher Order
Perl book.

Outside of HOP, WWW::Mechanize is the new wrapper around LWP::UserAgent
for this sort of thing.

Makes my old LWP-wielding cache-and-smash implementation look lumpy ...

Bill




[Boston.pm] Combining the nodes reachable in n steps from a web page into one printable file

2005-09-14 Thread Tolkin, Steve
This seems like a problem that would be easily solved with a small perl
script.

Many web pages have a large list of links.  
I would like to follow all the links, to some small depth (typically
just 1) and put their output into one file, in some format suitable for
printing.  
I am flexible about the order of the links, and the details of the
format, etc.
This has probably been written already. 
Having it in perl would let me modify it, which might be useful.
(If there is a reliable freeware or shareware program, I would also be
interested in that.)
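
For concreteness, the traversal part of the idea might look like the sketch below. It is only an outline under stated assumptions: collect_pages and the $fetch callback are made-up names, and the in-memory %site stands in for real HTTP fetches so the walk can be tried without a network.

```perl
use strict;
use warnings;

# Breadth-first walk from $start, following links at most $max_depth steps.
# $fetch is a caller-supplied code ref: $fetch->($url) returns ($text, @links).
# Returns a list of [$url, $text] pairs in visit order, each URL at most once.
sub collect_pages {
    my ($start, $max_depth, $fetch) = @_;
    my %seen  = ($start => 1);
    my @queue = ([$start, 0]);
    my @pages;
    while (my $item = shift @queue) {
        my ($url, $depth) = @$item;
        my ($text, @links) = $fetch->($url);
        next unless defined $text;      # skip pages that could not be fetched
        push @pages, [$url, $text];
        next if $depth >= $max_depth;   # do not follow links past the depth limit
        for my $link (@links) {
            push @queue, [$link, $depth + 1] unless $seen{$link}++;
        }
    }
    return @pages;
}

# Hypothetical in-memory "site": page name => [text, links...].
my %site = (
    'index' => ['Index page', 'a', 'b'],
    'a'     => ['Page A', 'c'],
    'b'     => ['Page B', 'index'],
    'c'     => ['Page C'],
);
my $fetch = sub { my $p = $site{ $_[0] } or return; return @$p };

# Depth 1: the start page plus everything it links to directly.
for my $page (collect_pages('index', 1, $fetch)) {
    print "==== $page->[0] ====\n$page->[1]\n\n";
}
```

Keeping the fetching behind a callback means the same walk could be driven by a real HTTP client that returns each page's content and its absolute link URLs, with the printed output redirected to one file for printing.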


Thanks,
Steve

P.S.  perl -v says:
This is perl, v5.8.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2002, Larry Wall

Binary build 805 provided by ActiveState Corp.
http://www.ActiveState.com
Built 18:08:02 Feb  4 2003
