Thanks for responding! I've hit it again with TRACE logging enabled. Here are the results:

2019-04-25 08:53:10,261 INFO parse.ParserChecker - fetching: http://url.com/crawltest.html
2019-04-25 08:53:10,268 INFO plugin.PluginRepository - Plugins: looking in: C:\nutch\apache-nutch-1.5.1\plugins
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Registered Plugins:
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Http / Https Protocol Plug-in (protocol-httpclient)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - URL Validator (urlfilter-validator)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Registered Extension-Points:
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2019-04-25 08:53:10,350 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2019-04-25 08:53:10,377 INFO httpclient.Http - http.proxy.host = null
2019-04-25 08:53:10,377 INFO httpclient.Http - http.proxy.port = 8080
2019-04-25 08:53:10,378 INFO httpclient.Http - http.timeout = 10000
2019-04-25 08:53:10,379 INFO httpclient.Http - http.content.limit = -1
2019-04-25 08:53:10,379 INFO httpclient.Http - http.agent = Spider/Nutch-1.5.1
2019-04-25 08:53:10,379 INFO httpclient.Http - http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
2019-04-25 08:53:10,380 INFO httpclient.Http - http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
2019-04-25 08:53:10,385 TRACE httpclient.Http - Credentials - username: user; set as default for realm: ntdomain; scheme:
2019-04-25 08:53:10,392 TRACE httpclient.Http - Pre-configured credentials with scope - host: url.com; port: 80; not found for url: http://url.com/crawltest.html
2019-04-25 08:53:10,449 DEBUG auth.AuthChallengeProcessor - Supported authentication schemes in the order of preference: [ntlm, digest, basic]
2019-04-25 08:53:10,449 INFO auth.AuthChallengeProcessor - ntlm authentication scheme selected
2019-04-25 08:53:10,450 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2019-04-25 08:53:10,450 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2019-04-25 08:53:10,452 TRACE auth.NTLMScheme - enter NTLMScheme.authenticate(Credentials, HttpMethod)
2019-04-25 08:53:10,460 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2019-04-25 08:53:10,460 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2019-04-25 08:53:10,461 TRACE auth.NTLMScheme - enter NTLMScheme.authenticate(Credentials, HttpMethod)
2019-04-25 08:53:10,952 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2019-04-25 08:53:10,953 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2019-04-25 08:53:10,955 INFO httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@url.com:80
2019-04-25 08:53:10,959 TRACE httpclient.Http - url: http://url.com/crawltest.html; status code: 401; bytes received: 6322; Content-Length: 6322
2019-04-25 08:53:11,033 TRACE httpclient.Http - 401 Authentication Required
2019-04-25 08:53:11,133 INFO crawl.SignatureFactory - Using Signature impl: org.apache.nutch.crawl.MD5Signature
2019-04-25 08:53:11,135 INFO parse.ParserChecker - parsing: http://urlcom/crawltest.html
2019-04-25 08:53:11,135 INFO parse.ParserChecker - contentType: application/xhtml+xml
2019-04-25 08:53:11,138 INFO parse.ParserChecker - signature: 495abb7f991fb4dd6a056f748908a2d9

Regarding what's in the server's security events, a couple of interesting things:

1. The server sees the attempt, but the failure reason is "Unknown user name or bad password". The username and password being sent from httpclient-auth.xml are exactly the same as what I'm sending with the curl command.
2. Unlike with the curl command, the Account Name being sent over is all upper case! I suspect this has something to do with it. Again, though, the username in httpclient-auth.xml is NOT all upper case.
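
In case it helps anyone lining the client log up against the server's security events, here is a minimal Python sketch that keeps only the entries from the auth.* and httpclient.* loggers. The regular expression assumes the log4j layout seen in the log above (timestamp, level, logger, " - ", message); the function name auth_entries and the three sample lines are just for illustration, and note that the httpclient.* filter also lets the http.* configuration entries through.

```python
import re

# Assumed layout, taken from the log above:
#   "<date> <time>,<ms> <LEVEL> <logger> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def auth_entries(lines):
    """Yield (timestamp, level, logger, message) for auth-related entries.

    Hypothetical helper: "auth-related" here simply means the logger name
    starts with "auth." or "httpclient.".
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a log entry
        logger = m.group("logger")
        if logger.startswith(("auth.", "httpclient.")):
            yield m.group("ts"), m.group("level"), logger, m.group("msg")


# Three entries copied from the log above.
sample = [
    "2019-04-25 08:53:10,449 INFO auth.AuthChallengeProcessor - ntlm authentication scheme selected",
    "2019-04-25 08:53:10,350 INFO plugin.PluginRepository - URL Validator (urlfilter-validator)",
    "2019-04-25 08:53:10,955 INFO httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@url.com:80",
]

for ts, level, logger, msg in auth_entries(sample):
    print(ts, level, logger, msg)
```

To run it against a real log, pass an open file instead of the sample list, e.g. auth_entries(open("hadoop.log")), where the file name is whatever your Nutch install writes to.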