RE: RobotRules fails on user-agents with spaces

2005-10-14 Thread Matthew.van.Eerde
Gisle Aas wrote: > <[EMAIL PROTECTED]> writes: > >> The problem... if I include a space in my robot's user agent, it >> will fail to recognize robots.txt records targeted to my robot. > > You are not allowed to have space in the user agent name. See section > "3.8 Product Tokens" of RFC 2616 [1]

Re: RobotRules fails on user-agents with spaces

2005-10-14 Thread Jonathan Kamens
Nigel Horne writes: > Perhaps it would help if WWW::RobotRules were to warn/die when setting > an agent with a space in? An excellent message would be "RFC2616 forbids > spaces in an agent's name". Impossible, because the whole user agent string *is* allowed to have spaces in it. It's just th

Re: RobotRules fails on user-agents with spaces

2005-10-14 Thread Nigel Horne
On Fri, 2005-10-14 at 10:37, Gisle Aas wrote: > <[EMAIL PROTECTED]> writes: > > > The problem... if I include a space in my robot's user agent, it > > will fail to recognize robots.txt records targeted to my robot. > > You are not allowed to have space in the user agent name. See section > "3.8

Re: RobotRules fails on user-agents with spaces

2005-10-14 Thread Gisle Aas
<[EMAIL PROTECTED]> writes: > The problem... if I include a space in my robot's user agent, it > will fail to recognize robots.txt records targeted to my robot. You are not allowed to have space in the user agent name. See section "3.8 Product Tokens" of RFC 2616 [1]. Isn't it an option to just
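
For readers following the RFC 2616 reference, here is a minimal sketch of what the "product token" rule means in practice. It uses only core Perl and is not part of WWW::RobotRules; the function name is_product_token and the space-free agent name HispanicBusinessSpider are assumptions made up for illustration. The character class follows the token definition in RFC 2616 section 2.2 (no control characters, no separators, and space and tab are separators).

```perl
use strict;
use warnings;

# RFC 2616 section 2.2: a token is one or more US-ASCII characters,
# excluding control characters and the separators
# ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
my $TOKEN = qr/[!#\$%&'*+\-.^_`|~0-9A-Za-z]+/;

# RFC 2616 section 3.8: product = token ["/" product-version]
# is_product_token is a hypothetical helper, not a libwww-perl function.
sub is_product_token {
    my ($s) = @_;
    return 0 unless defined $s;
    return ($s =~ m{\A$TOKEN(?:/$TOKEN)?\z}) ? 1 : 0;
}

# The first name is the one from the original report; the second is an
# assumed space-free alternative.
for my $agent ('Hispanic Business Inc. Spider/1.0', 'HispanicBusinessSpider/1.0') {
    printf "%-36s %s\n", $agent,
        is_product_token($agent) ? 'single product token' : 'not a single product token';
}
```

Note that this only checks a single product token. A full User-Agent header may still contain several products and comments separated by spaces, which is the distinction raised elsewhere in this thread.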

RobotRules fails on user-agents with spaces

2005-10-13 Thread Matthew.van.Eerde
The problem... if I include a space in my robot's user agent, it will fail to recognize robots.txt records targeted to my robot. My robot's user agent: Hispanic Business Inc. Spider/1.0 Robots.txt file: User-agent: Hispanic Business Inc. Spider Disallow: User-agent: * Disallow: / My robot will

Re: user agents

2004-12-04 Thread Mattias Holmlund
Gisle Aas wrote: Mattias Holmlund <[EMAIL PROTECTED]> writes: Gisle Aas wrote: It is documented (barely) that the module exports the variable '$ua'. A side effect of importing this variable is that this forces the full LWP::UserAgent implementation to be used, otherwise settings on the $ua o

Re: user agents

2004-12-03 Thread Gisle Aas
Mattias Holmlund <[EMAIL PROTECTED]> writes: > Gisle Aas wrote: > > >It is documented (barely) that the module exports the variable '$ua'. > >A side effect of importing this variable is that this forces the full > >LWP::UserAgent implementation to be used, otherwise settings on the > >$ua object w

Re: user agents

2004-12-02 Thread Mattias Holmlund
Gisle Aas wrote: It is documented (barely) that the module exports the variable '$ua'. A side effect of importing this variable is that this forces the full LWP::UserAgent implementation to be used, otherwise settings on the $ua object would have no effect. I want to declare this as the official in
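
As a rough illustration of the '$ua' import discussed here, the sketch below asks LWP::Simple for its '$ua' variable and changes the agent string on it. It assumes libwww-perl is installed; the agent name ExampleBot/1.0 is made up for the example, and no request is actually sent.

```perl
use strict;
use warnings;

# Requesting $ua in the import list makes the LWP::UserAgent object
# behind LWP::Simple available to the caller.
use LWP::Simple qw($ua);

# 'ExampleBot/1.0' is a hypothetical agent name, not one from the thread.
$ua->agent('ExampleBot/1.0');

# Pick up proxy settings from the environment, as the LWP::Simple
# documentation quoted in this thread describes.
$ua->env_proxy;

print $ua->agent, "\n";
```

Settings made this way apply to later calls such as get() only when they go through the full LWP::UserAgent implementation, which is the behaviour the messages above are about.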

Re: user agents

2004-12-02 Thread Gisle Aas
Zed Lopez <[EMAIL PROTECTED]> writes: > On 01 Dec 2004 01:35:13 -0800, Gisle Aas <[EMAIL PROTECTED]> wrote: > > Zed Lopez <[EMAIL PROTECTED]> writes: > > > I'd like to suggest these differences be documented. > > > > I agree this is wrong. Do you want to suggest a doc patch? > > I'm working on

Re: user agents

2004-12-02 Thread Zed Lopez
On 01 Dec 2004 01:35:13 -0800, Gisle Aas <[EMAIL PROTECTED]> wrote: > Zed Lopez <[EMAIL PROTECTED]> writes: > > I'd like to suggest these differences be documented. > > I agree this is wrong. Do you want to suggest a doc patch? I'm working on the doc patch... would it be considered desirable to

Re: user agents

2004-12-01 Thread Gisle Aas
Zed Lopez <[EMAIL PROTECTED]> writes: > I'd like to suggest these differences be documented. I agree this is wrong. Do you want to suggest a doc patch? > Does anyone know why _trivial_http_get uses its own user agent and > HTTP version? Because it is a totally different client implementation w

user agents

2004-12-01 Thread Zed Lopez
I just found some behavior that surprised me. LWP::Simple's perldoc says: The user agent created by this module will identify itself as "LWP::Simple/#.##" (where "#.##" is the libwww-perl version number) and will initialize its proxy defaults from the environment (by calling $ua->env_proxy).