Is HTTP Basic authentication working at all? I've been working with v0.9 for two days now, and I have yet to get this working.
I have one test directory with an .htaccess file requiring a username:password just for the fetcher. I can access this directory with a browser using that username:password.

In nutch-site.xml I have replaced 'protocol-http' with 'protocol-httpclient' in the 'plugin.includes' property, and added the following:

    <property>
      <name>http.auth.basic.IT.user</name>
      <value>spider</value>
      <description>HTTP Basic Authentication</description>
    </property>
    <property>
      <name>http.auth.basic.IT.pass</name>
      <value>pissword</value>
      <description>HTTP Basic Authentication</description>
    </property>

'IT' is the realm (AuthName "IT"). I've tried defining these properties as 'http.auth.basic.IT.user', 'http.auth.basic..user', and 'http.auth.basic.user', as I've found in several others' examples in the Nutch Wiki.

I see this in hadoop.log:

    2007-08-06 16:12:45,856 INFO httpclient.HttpMethodDirector - No credentials available for BASIC 'IT'@spock.abaqus.com:80

I see the fetcher hitting the server, but it never tries the 'spider' user to authenticate:

    172.17.25.27 - - [06/Aug/2007:16:12:45 -0400] "GET /development HTTP/1.0" 401 1287 "-" "ABAQUS/Nutch-0.9 (moin; http://spock; [EMAIL PROTECTED])"

Please tell me whether I should expect the basic authentication mechanism to work at all. I've already spent so much time trying to figure this out.

Regards,
Clarence Donath

Spelling is a lossed art.
