Hi Susam,
Very sorry for the mistake in the first config. I had put <default/> in the
file but omitted that line when I sent it across to you :(.

Our intranet sites do not require a proxy, so I have now removed the proxy
settings, made sure only default authentication is configured, and run the
crawl again. I have attached the log; I am still getting the same 401 :(

Would you like me to send the nutch-site.xml and nutch-default.xml files?

Thanks,
Rochelle


On Fri, May 15, 2009 at 4:54 PM, Susam Pal <[email protected]> wrote:

> On Fri, May 15, 2009 at 2:43 PM, Rochelle D'souza
> <[email protected]> wrote:
> > hi Susam,
> >
> > Many thanks for your reply.
> >
> >
> >
> > As requested I have given only default authentication. Below is the
> > httpclient-auth.xml
> >
> > <?xml version="1.0"?>
> >
> > <auth-configuration>
> >
> >       <credentials username="devadmin" password="password">
> >
> >       </credentials>
> >
> > </auth-configuration>
>
> This is not the correct way to configure default credentials. You need
> to put a <default/> tag within the <credentials> tag.
>
> Please read the section, 'Crawling an Intranet with Default
> Authentication Scope' in
> http://wiki.apache.org/nutch/HttpAuthenticationSchemes to see an
> example.
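>
> For reference, here is a minimal sketch of such a file. It reuses the
> username and password from your message as placeholders, and the
> comment reflects my understanding of the wiki section above, so please
> check it against the example there:
>
> ```xml
> <?xml version="1.0"?>
> <auth-configuration>
>   <!-- <default/> marks these credentials as the default ones, used
>        for any host not covered by a more specific authscope -->
>   <credentials username="devadmin" password="password">
>     <default/>
>   </credentials>
> </auth-configuration>
> ```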
>
> >
> > The logs for the same are
> >
> > I have only masked the agent and proxy host name since I am sharing the
> > log file.
> >
> >
> >
> > Then I changed the httpclient-auth.xml to the below code
> >
> > <?xml version="1.0"?>
> >
> > <auth-configuration>
> >
> >       <credentials username="devadmin" password="password">
> >
> >             <authscope host="googly" port="80" realm="xyz"/>
> >
> >       </credentials>
> >
> > </auth-configuration>
>
> From the logs obtained with this configuration, I see:
>
> 2009-05-15 14:32:40,971 INFO  auth.AuthChallengeProcessor - ntlm
> authentication scheme selected
> 2009-05-15 14:32:41,002 INFO  httpclient.HttpMethodDirector - Failure
> authenticating with NTLM <any realm>@googly:80
> 2009-05-15 14:32:41,002 DEBUG httpclient.Http - url: http://googly/;
> status code: 401; bytes received: 1539; Content-Length: 1539
> 2009-05-15 14:32:41,205 DEBUG httpclient.Http - 401 Authentication Required
>
> These lines show that authentication was attempted but failed.
>
> In the logs I also see these lines:
>
> 2009-05-15 14:32:35,661 INFO  httpclient.Http - http.proxy.host =
> proxy.companyname.com
> 2009-05-15 14:32:35,739 INFO  httpclient.Http - http.proxy.port = 6050
>
> So it seems you have configured a proxy server. Have you configured
> http.proxy.username and http.proxy.password too? If so, this may be
> the cause of the problem: authenticating with both a proxy server and
> a web server is not supported at the moment. For more on this, please
> go through the 'NTLM' section of this article:
> http://hc.apache.org/httpclient-3.x/authentication.html
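>
> If the proxy is not needed for the intranet hosts, a sketch of the
> overrides in conf/nutch-site.xml might look like the following. This
> assumes the proxy was set through these four properties and that empty
> values disable it; please verify against your own nutch-default.xml.
>
> ```xml
> <configuration>
>   <!-- Empty values: no proxy and no proxy credentials (assumption) -->
>   <property>
>     <name>http.proxy.host</name>
>     <value></value>
>   </property>
>   <property>
>     <name>http.proxy.port</name>
>     <value></value>
>   </property>
>   <property>
>     <name>http.proxy.username</name>
>     <value></value>
>   </property>
>   <property>
>     <name>http.proxy.password</name>
>     <value></value>
>   </property>
> </configuration>
> ```
>
> After a crawl with these settings, the http.proxy.host line in the log
> should no longer show proxy.companyname.com.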
>
> Regards,
> Susam Pal
>
