Yes, I will check that. I cranked up the logging and ran it again, to see if
you might spot something odd.

2016-11-02 14:23:01,652 INFO  parse.ParserChecker - fetching: 
http://iis75.intranet.org
2016-11-02 14:23:01,684 INFO  plugin.PluginRepository - Plugins: looking in: 
/opt/nutch/plugins
2016-11-02 14:23:01,684 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/creativecommons/plugin.xml
2016-11-02 14:23:01,693 DEBUG plugin.PluginRepository - plugin: 
id=creativecommons name=Creative Commons Plugins version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:01,705 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.creativecommons.nutch.CCParseFilter
2016-11-02 14:23:01,705 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.creativecommons.nutch.CCIndexingFilter
2016-11-02 14:23:01,706 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/feed/plugin.xml
2016-11-02 14:23:01,708 DEBUG plugin.PluginRepository - plugin: id=feed 
name=Feed Parse/Index/Query Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,708 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.feed.FeedParser
2016-11-02 14:23:01,709 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.feed.FeedIndexingFilter
2016-11-02 14:23:01,709 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/headings/plugin.xml
2016-11-02 14:23:01,711 DEBUG plugin.PluginRepository - plugin: id=headings 
name=Headings Parse Filter version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,711 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.parse.headings.HeadingsParseFilter
2016-11-02 14:23:01,712 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-anchor/plugin.xml
2016-11-02 14:23:01,714 DEBUG plugin.PluginRepository - plugin: id=index-anchor 
name=Anchor Indexing Filter version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,714 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2016-11-02 14:23:01,715 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-basic/plugin.xml
2016-11-02 14:23:01,732 DEBUG plugin.PluginRepository - plugin: id=index-basic 
name=Basic Indexing Filter version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,732 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.basic.BasicIndexingFilter
2016-11-02 14:23:01,733 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-geoip/plugin.xml
2016-11-02 14:23:01,735 DEBUG plugin.PluginRepository - plugin: id=index-geoip 
name=GeoIP2 Indexing Filter version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,735 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.geoip.GeoIPIndexingFilter
2016-11-02 14:23:01,736 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-links/plugin.xml
2016-11-02 14:23:01,738 DEBUG plugin.PluginRepository - plugin: id=index-links 
name=Index inlinks and outlinks version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,738 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.links.LinksIndexingFilter
2016-11-02 14:23:01,738 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-metadata/plugin.xml
2016-11-02 14:23:01,754 DEBUG plugin.PluginRepository - plugin: 
id=index-metadata name=Index Metadata version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,754 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.metadata.MetadataIndexer
2016-11-02 14:23:01,754 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-more/plugin.xml
2016-11-02 14:23:01,756 DEBUG plugin.PluginRepository - plugin: id=index-more 
name=More Indexing Filter version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,757 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.more.MoreIndexingFilter
2016-11-02 14:23:01,757 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-replace/plugin.xml
2016-11-02 14:23:01,764 DEBUG plugin.PluginRepository - plugin: 
id=index-replace name=Replace Indexer version=1.0 
provider=PeterCiuffetticlass=null
2016-11-02 14:23:01,764 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.replace.ReplaceIndexer
2016-11-02 14:23:01,764 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/index-static/plugin.xml
2016-11-02 14:23:01,779 DEBUG plugin.PluginRepository - plugin: id=index-static 
name=Index Static version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,779 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.staticfield.StaticFieldIndexer
2016-11-02 14:23:01,779 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/indexer-cloudsearch/plugin.xml
2016-11-02 14:23:01,782 DEBUG plugin.PluginRepository - plugin: 
id=indexer-cloudsearch name=CloudSearchIndexWriter version=1.0.0 
provider=nutch.apache.orgclass=null
2016-11-02 14:23:01,783 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexWriter 
class=org.apache.nutch.indexwriter.cloudsearch.CloudSearchIndexWriter
2016-11-02 14:23:01,784 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/indexer-dummy/plugin.xml
2016-11-02 14:23:01,785 DEBUG plugin.PluginRepository - plugin: 
id=indexer-dummy name=DummyIndexWriter version=1.0.0 
provider=nutch.apache.orgclass=null
2016-11-02 14:23:01,786 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexWriter 
class=org.apache.nutch.indexwriter.dummy.DummyIndexWriter
2016-11-02 14:23:01,786 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/indexer-elastic/plugin.xml
2016-11-02 14:23:01,804 DEBUG plugin.PluginRepository - plugin: 
id=indexer-elastic name=ElasticIndexWriter version=1.0.0 
provider=nutch.apache.orgclass=null
2016-11-02 14:23:01,804 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexWriter 
class=org.apache.nutch.indexwriter.elastic.ElasticIndexWriter
2016-11-02 14:23:01,805 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/indexer-solr/plugin.xml
2016-11-02 14:23:01,807 DEBUG plugin.PluginRepository - plugin: id=indexer-solr 
name=SolrIndexWriter version=1.0.0 provider=nutch.apache.orgclass=null
2016-11-02 14:23:01,808 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexWriter 
class=org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-11-02 14:23:01,809 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/language-identifier/plugin.xml
2016-11-02 14:23:01,811 DEBUG plugin.PluginRepository - plugin: 
id=language-identifier name=Language Identification Parser/Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:01,811 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.analysis.lang.HTMLLanguageParser
2016-11-02 14:23:01,811 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.analysis.lang.LanguageIndexingFilter
2016-11-02 14:23:01,811 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-htmlunit/plugin.xml
2016-11-02 14:23:01,833 DEBUG plugin.PluginRepository - plugin: id=lib-htmlunit 
name=HTTP Framework version=1.0 provider=org.apache.nutchclass=null
2016-11-02 14:23:01,838 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-http/plugin.xml
2016-11-02 14:23:01,839 DEBUG plugin.PluginRepository - plugin: id=lib-http 
name=HTTP Framework version=1.0 provider=org.apache.nutchclass=null
2016-11-02 14:23:01,840 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-nekohtml/plugin.xml
2016-11-02 14:23:01,841 DEBUG plugin.PluginRepository - plugin: id=lib-nekohtml 
name=CyberNeko HTML Parser version=1.9.19 
provider=net.sourceforge.nekohtmlclass=null
2016-11-02 14:23:01,842 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-regex-filter/plugin.xml
2016-11-02 14:23:01,872 DEBUG plugin.PluginRepository - plugin: 
id=lib-regex-filter name=Regex URL Filter Framework version=1.0 
provider=org.apache.nutchclass=null
2016-11-02 14:23:01,872 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-selenium/plugin.xml
2016-11-02 14:23:01,874 DEBUG plugin.PluginRepository - plugin: id=lib-selenium 
name=HTTP Framework version=1.0 provider=org.apache.nutchclass=null
2016-11-02 14:23:01,878 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/lib-xml/plugin.xml
2016-11-02 14:23:01,880 DEBUG plugin.PluginRepository - plugin: id=lib-xml 
name=XML Libraries version=1.0 provider=org.apache.nutch.xmlclass=null
2016-11-02 14:23:01,880 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/microformats-reltag/plugin.xml
2016-11-02 14:23:01,902 DEBUG plugin.PluginRepository - plugin: 
id=microformats-reltag name=Rel-Tag microformat Parser/Indexer/Querier 
version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,902 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.microformats.reltag.RelTagParser
2016-11-02 14:23:01,902 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.microformats.reltag.RelTagIndexingFilter
2016-11-02 14:23:01,902 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/mimetype-filter/plugin.xml
2016-11-02 14:23:01,904 DEBUG plugin.PluginRepository - plugin: 
id=mimetype-filter name=Filter indexed documents by the detected MIME 
version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,904 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.filter.MimeTypeIndexingFilter
2016-11-02 14:23:01,905 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/nutch-extensionpoints/plugin.xml
2016-11-02 14:23:01,906 DEBUG plugin.PluginRepository - plugin: 
id=nutch-extensionpoints name=the nutch core extension points version=2.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:01,907 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-ext/plugin.xml
2016-11-02 14:23:01,909 DEBUG plugin.PluginRepository - plugin: id=parse-ext 
name=External Parser Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,909 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.ext.ExtParser
2016-11-02 14:23:01,909 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.ext.ExtParser
2016-11-02 14:23:01,910 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-html/plugin.xml
2016-11-02 14:23:01,935 DEBUG plugin.PluginRepository - plugin: id=parse-html 
name=Html Parse Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,935 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
2016-11-02 14:23:01,935 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-js/plugin.xml
2016-11-02 14:23:01,937 DEBUG plugin.PluginRepository - plugin: id=parse-js 
name=JavaScript Parser version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,937 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser 
class=org.apache.nutch.parse.js.JSParseFilter
2016-11-02 14:23:01,937 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.parse.js.JSParseFilter
2016-11-02 14:23:01,938 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-metatags/plugin.xml
2016-11-02 14:23:01,939 DEBUG plugin.PluginRepository - plugin: 
id=parse-metatags name=MetaTags version=1.0 provider=digitalpebble.comclass=null
2016-11-02 14:23:01,939 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.parse.metatags.MetaTagsParser
2016-11-02 14:23:01,939 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-swf/plugin.xml
2016-11-02 14:23:01,941 DEBUG plugin.PluginRepository - plugin: id=parse-swf 
name=SWF Parse Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,941 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.swf.SWFParser
2016-11-02 14:23:01,942 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-tika/plugin.xml
2016-11-02 14:23:01,944 DEBUG plugin.PluginRepository - plugin: id=parse-tika 
name=Tika Parser Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,944 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.tika.TikaParser
2016-11-02 14:23:01,966 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parse-zip/plugin.xml
2016-11-02 14:23:01,968 DEBUG plugin.PluginRepository - plugin: id=parse-zip 
name=Zip Parse Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:01,968 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.zip.ZipParser
2016-11-02 14:23:01,968 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parsefilter-naivebayes/plugin.xml
2016-11-02 14:23:01,970 DEBUG plugin.PluginRepository - plugin: 
id=parsefilter-naivebayes name=Naive Bayes Parse Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:01,970 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.parsefilter.naivebayes.NaiveBayesParseFilter
2016-11-02 14:23:01,971 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/parsefilter-regex/plugin.xml
2016-11-02 14:23:01,973 DEBUG plugin.PluginRepository - plugin: 
id=parsefilter-regex name=Regex Parse Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:01,999 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.parse.HtmlParseFilter 
class=org.apache.nutch.parsefilter.regex.RegexParseFilter
2016-11-02 14:23:02,000 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-file/plugin.xml
2016-11-02 14:23:02,001 DEBUG plugin.PluginRepository - plugin: 
id=protocol-file name=File Protocol Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,002 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.file.File
2016-11-02 14:23:02,002 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-ftp/plugin.xml
2016-11-02 14:23:02,003 DEBUG plugin.PluginRepository - plugin: id=protocol-ftp 
name=Ftp Protocol Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:02,003 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.ftp.Ftp
2016-11-02 14:23:02,004 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-htmlunit/plugin.xml
2016-11-02 14:23:02,005 DEBUG plugin.PluginRepository - plugin: 
id=protocol-htmlunit name=HtmlUnit Protocol Plug-in version=1.0.0 
provider=nutch.apache.orgclass=null
2016-11-02 14:23:02,005 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.htmlunit.Http
2016-11-02 14:23:02,005 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.htmlunit.Http
2016-11-02 14:23:02,006 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-http/plugin.xml
2016-11-02 14:23:02,007 DEBUG plugin.PluginRepository - plugin: 
id=protocol-http name=Http Protocol Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,007 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.http.Http
2016-11-02 14:23:02,007 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.http.Http
2016-11-02 14:23:02,008 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-httpclient/plugin.xml
2016-11-02 14:23:02,009 DEBUG plugin.PluginRepository - plugin: 
id=protocol-httpclient name=Http / Https Protocol Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,009 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.httpclient.Http
2016-11-02 14:23:02,009 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.httpclient.Http
2016-11-02 14:23:02,010 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-interactiveselenium/plugin.xml
2016-11-02 14:23:02,031 DEBUG plugin.PluginRepository - plugin: 
id=protocol-interactiveselenium name=Http Protocol Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,031 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.interactiveselenium.Http
2016-11-02 14:23:02,031 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/protocol-selenium/plugin.xml
2016-11-02 14:23:02,040 DEBUG plugin.PluginRepository - plugin: 
id=protocol-selenium name=Http Protocol Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,041 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.protocol.Protocol 
class=org.apache.nutch.protocol.selenium.Http
2016-11-02 14:23:02,041 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/scoring-depth/plugin.xml
2016-11-02 14:23:02,042 DEBUG plugin.PluginRepository - plugin: 
id=scoring-depth name=Scoring plugin for depth-limited crawling. version=1.0.0 
provider=ant.comclass=null
2016-11-02 14:23:02,042 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.depth.DepthScoringFilter
2016-11-02 14:23:02,043 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/scoring-link/plugin.xml
2016-11-02 14:23:02,044 DEBUG plugin.PluginRepository - plugin: id=scoring-link 
name=Link Analysis Scoring Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:02,044 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.link.LinkAnalysisScoringFilter
2016-11-02 14:23:02,044 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/scoring-opic/plugin.xml
2016-11-02 14:23:02,046 DEBUG plugin.PluginRepository - plugin: id=scoring-opic 
name=OPIC Scoring Plug-in version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:02,046 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.opic.OPICScoringFilter
2016-11-02 14:23:02,046 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/scoring-similarity/plugin.xml
2016-11-02 14:23:02,047 DEBUG plugin.PluginRepository - plugin: 
id=scoring-similarity name=Similarity based Scoring Plug-in version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,048 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.similarity.SimilarityScoringFilter
2016-11-02 14:23:02,048 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/subcollection/plugin.xml
2016-11-02 14:23:02,062 DEBUG plugin.PluginRepository - plugin: 
id=subcollection name=Subcollection indexing and query filter version=1.0.0 
provider=apache.orgclass=null
2016-11-02 14:23:02,062 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.subcollection.SubcollectionIndexingFilter
2016-11-02 14:23:02,062 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/tld/plugin.xml
2016-11-02 14:23:02,064 DEBUG plugin.PluginRepository - plugin: id=tld name=Top 
Level Domain Plugin version=1.0.0 provider=nutch.orgclass=null
2016-11-02 14:23:02,064 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.tld.TLDIndexingFilter
2016-11-02 14:23:02,064 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.tld.TLDScoringFilter
2016-11-02 14:23:02,064 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-automaton/plugin.xml
2016-11-02 14:23:02,066 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-automaton name=Automaton URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,066 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.automaton.AutomatonURLFilter
2016-11-02 14:23:02,066 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-domain/plugin.xml
2016-11-02 14:23:02,067 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-domain name=Domain URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,068 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.domain.DomainURLFilter
2016-11-02 14:23:02,068 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-domainblacklist/plugin.xml
2016-11-02 14:23:02,069 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-domainblacklist name=Domain Blacklist URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,069 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.domainblacklist.DomainBlacklistURLFilter
2016-11-02 14:23:02,069 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-ignoreexempt/plugin.xml
2016-11-02 14:23:02,071 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-ignoreexempt name=External Domain Ignore Exemption version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,071 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLExemptionFilter 
class=org.apache.nutch.urlfilter.ignoreexempt.ExemptionUrlFilter
2016-11-02 14:23:02,071 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-prefix/plugin.xml
2016-11-02 14:23:02,086 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-prefix name=Prefix URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,087 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.prefix.PrefixURLFilter
2016-11-02 14:23:02,087 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-regex/plugin.xml
2016-11-02 14:23:02,088 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-regex name=Regex URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,088 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.regex.RegexURLFilter
2016-11-02 14:23:02,089 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-suffix/plugin.xml
2016-11-02 14:23:02,090 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-suffix name=Suffix URL Filter version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,090 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.suffix.SuffixURLFilter
2016-11-02 14:23:02,090 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlfilter-validator/plugin.xml
2016-11-02 14:23:02,097 DEBUG plugin.PluginRepository - plugin: 
id=urlfilter-validator name=URL Validator version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,097 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLFilter 
class=org.apache.nutch.urlfilter.validator.UrlValidator
2016-11-02 14:23:02,097 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlmeta/plugin.xml
2016-11-02 14:23:02,098 DEBUG plugin.PluginRepository - plugin: id=urlmeta 
name=URL Meta Indexing Filter version=1.0.0 provider=sgonyeaclass=null
2016-11-02 14:23:02,098 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.indexer.IndexingFilter 
class=org.apache.nutch.indexer.urlmeta.URLMetaIndexingFilter
2016-11-02 14:23:02,098 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.scoring.ScoringFilter 
class=org.apache.nutch.scoring.urlmeta.URLMetaScoringFilter
2016-11-02 14:23:02,099 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-ajax/plugin.xml
2016-11-02 14:23:02,100 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-ajax name=AJAX URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,100 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.ajax.AjaxURLNormalizer
2016-11-02 14:23:02,100 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-basic/plugin.xml
2016-11-02 14:23:02,102 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-basic name=Basic URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,102 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer
2016-11-02 14:23:02,102 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-host/plugin.xml
2016-11-02 14:23:02,103 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-host name=Host URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,104 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.host.HostURLNormalizer
2016-11-02 14:23:02,104 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-pass/plugin.xml
2016-11-02 14:23:02,105 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-pass name=Pass-through URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,105 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.pass.PassURLNormalizer
2016-11-02 14:23:02,105 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-protocol/plugin.xml
2016-11-02 14:23:02,121 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-protocol name=Protocol URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,121 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.protocol.ProtocolURLNormalizer
2016-11-02 14:23:02,121 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-querystring/plugin.xml
2016-11-02 14:23:02,122 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-querystring name=Querystrings URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,123 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.querystring.QuerystringURLNormalizer
2016-11-02 14:23:02,123 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-regex/plugin.xml
2016-11-02 14:23:02,124 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-regex name=Regex URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,124 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.regex.RegexURLNormalizer
2016-11-02 14:23:02,124 DEBUG plugin.PluginRepository - parsing: 
/opt/nutch/plugins/urlnormalizer-slash/plugin.xml
2016-11-02 14:23:02,126 DEBUG plugin.PluginRepository - plugin: 
id=urlnormalizer-slash name=Slash URL Normalizer version=1.0.0 
provider=nutch.orgclass=null
2016-11-02 14:23:02,126 DEBUG plugin.PluginRepository - impl: 
point=org.apache.nutch.net.URLNormalizer 
class=org.apache.nutch.net.urlnormalizer.slash.SlashURLNormalizer
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
index-geoip
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: lib-http
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
nutch-extensionpoints
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: lib-xml
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
language-identifier
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
indexer-dummy
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
lib-nekohtml
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
subcollection
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
urlfilter-validator
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: urlmeta
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
scoring-depth
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
indexer-cloudsearch
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
microformats-reltag
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
urlfilter-ignoreexempt
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
protocol-interactiveselenium
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
protocol-ftp
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
parsefilter-regex
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: parse-ext
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: parse-zip
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
lib-htmlunit
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
urlnormalizer-querystring
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: feed
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
index-more
2016-11-02 14:23:02,127 DEBUG plugin.PluginRepository - not including: 
urlnormalizer-slash
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: headings
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
index-links
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
creativecommons
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
parse-metatags
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: parse-swf
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlnormalizer-protocol
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
lib-selenium
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
protocol-htmlunit
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlnormalizer-ajax
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
index-metadata
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
protocol-selenium
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
parsefilter-naivebayes
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
mimetype-filter
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlfilter-suffix
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlfilter-domain
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlfilter-domainblacklist
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: parse-js
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
index-static
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: tld
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
lib-regex-filter
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlfilter-automaton
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
urlfilter-prefix
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
scoring-link
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
protocol-http
2016-11-02 14:23:02,128 DEBUG plugin.PluginRepository - not including: 
scoring-similarity
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - not including: 
urlnormalizer-host
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - not including: 
protocol-file
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - not including: 
index-replace
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - not including: 
indexer-elastic
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.indexer.IndexingFilter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.indexer.IndexWriter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.parse.Parser
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.parse.HtmlParseFilter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.protocol.Protocol
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.net.URLFilter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.net.URLExemptionFilter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.net.URLNormalizer
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.scoring.ScoringFilter
2016-11-02 14:23:02,129 DEBUG plugin.PluginRepository - Adding extension point 
org.apache.nutch.segment.SegmentMergeFilter
2016-11-02 14:23:02,129 INFO  plugin.PluginRepository - Plugin Auto-activation 
mode: [true]
2016-11-02 14:23:02,129 INFO  plugin.PluginRepository - Registered Plugins:
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Regex URL 
Filter (urlfilter-regex)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Html Parse 
Plug-in (parse-html)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         HTTP Framework 
(lib-http)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Http / Https 
Protocol Plug-in (protocol-httpclient)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         the nutch core 
extension points (nutch-extensionpoints)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Basic Indexing 
Filter (index-basic)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Anchor Indexing 
Filter (index-anchor)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Tika Parser 
Plug-in (parse-tika)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Basic URL 
Normalizer (urlnormalizer-basic)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Regex URL 
Filter Framework (lib-regex-filter)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Regex URL 
Normalizer (urlnormalizer-regex)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         CyberNeko HTML 
Parser (lib-nekohtml)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         OPIC Scoring 
Plug-in (scoring-opic)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Pass-through 
URL Normalizer (urlnormalizer-pass)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         SolrIndexWriter 
(indexer-solr)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository - Registered 
Extension-Points:
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch Content 
Parser (org.apache.nutch.parse.Parser)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch URL 
Filter (org.apache.nutch.net.URLFilter)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         HTML Parse 
Filter (org.apache.nutch.parse.HtmlParseFilter)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch Scoring 
(org.apache.nutch.scoring.ScoringFilter)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch URL 
Normalizer (org.apache.nutch.net.URLNormalizer)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch Protocol 
(org.apache.nutch.protocol.Protocol)
2016-11-02 14:23:02,130 INFO  plugin.PluginRepository -         Nutch URL 
Ignore Exemption Filter (org.apache.nutch.net.URLExemptionFilter)
2016-11-02 14:23:02,147 INFO  plugin.PluginRepository -         Nutch Index 
Writer (org.apache.nutch.indexer.IndexWriter)
2016-11-02 14:23:02,147 INFO  plugin.PluginRepository -         Nutch Segment 
Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2016-11-02 14:23:02,147 INFO  plugin.PluginRepository -         Nutch Indexing 
Filter (org.apache.nutch.indexer.IndexingFilter)
2016-11-02 14:23:02,148 DEBUG util.ObjectCache - No object cache found for 
conf=Configuration: core-default.xml, core-site.xml, nutch-default.xml, 
nutch-site.xml, instantiating a new object cache
2016-11-02 14:23:02,196 DEBUG params.DefaultHttpParams - Set parameter 
http.useragent = Jakarta Commons-HttpClient/3.1
2016-11-02 14:23:02,197 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.version = HTTP/1.1
2016-11-02 14:23:02,198 DEBUG params.DefaultHttpParams - Set parameter 
http.connection-manager.class = class 
org.apache.commons.httpclient.SimpleHttpConnectionManager
2016-11-02 14:23:02,198 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.cookie-policy = default
2016-11-02 14:23:02,198 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.element-charset = US-ASCII
2016-11-02 14:23:02,198 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.content-charset = ISO-8859-1
2016-11-02 14:23:02,215 DEBUG params.DefaultHttpParams - Set parameter 
http.method.retry-handler = 
org.apache.commons.httpclient.DefaultHttpMethodRetryHandler@7fad8c79
2016-11-02 14:23:02,215 DEBUG params.DefaultHttpParams - Set parameter 
http.dateparser.patterns = [EEE, dd MMM yyyy HH:mm:ss zzz, EEEE, dd-MMM-yy 
HH:mm:ss zzz, EEE MMM d HH:mm:ss yyyy, EEE, dd-MMM-yyyy HH:mm:ss z, EEE, 
dd-MMM-yyyy HH-mm-ss z, EEE, dd MMM yy HH:mm:ss z, EEE dd-MMM-yyyy HH:mm:ss z, 
EEE dd MMM yyyy HH:mm:ss z, EEE dd-MMM-yyyy HH-mm-ss z, EEE dd-MMM-yy HH:mm:ss 
z, EEE dd MMM yy HH:mm:ss z, EEE,dd-MMM-yy HH:mm:ss z, EEE,dd-MMM-yyyy HH:mm:ss 
z, EEE, dd-MM-yyyy HH:mm:ss z]
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Java version: 1.8.0_111
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Java vendor: Oracle 
Corporation
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Java class path: 
/opt/nutch:/opt/nutch/conf:/usr/lib/jvm/jre-1.8.0/lib/tools.jar:/opt/nutch/lib/activation-1.1.jar:/opt/nutch/lib/aopalliance-1.0.jar:/opt/nutch/lib/apache-nutch-1.12.jar:/opt/nutch/lib/args4j-2.0.16.jar:/opt/nutch/lib/asm-3.3.1.jar:/opt/nutch/lib/avro-1.7.4.jar:/opt/nutch/lib/bootstrap-3.0.3.jar:/opt/nutch/lib/cglib-2.2.1-v20090111.jar:/opt/nutch/lib/cglib-2.2.2.jar:/opt/nutch/lib/closure-compiler-v20130603.jar:/opt/nutch/lib/commons-cli-1.2.jar:/opt/nutch/lib/commons-codec-1.10.jar:/opt/nutch/lib/commons-collections-3.2.1.jar:/opt/nutch/lib/commons-collections4-4.0.jar:/opt/nutch/lib/commons-compress-1.9.jar:/opt/nutch/lib/commons-configuration-1.8.jar:/opt/nutch/lib/commons-daemon-1.0.13.jar:/opt/nutch/lib/commons-el-1.0.jar:/opt/nutch/lib/commons-httpclient-3.1.jar:/opt/nutch/lib/commons-io-2.4.jar:/opt/nutch/lib/commons-jexl-2.1.1.jar:/opt/nutch/lib/commons-lang-2.6.jar:/opt/nutch/lib/commons-lang3-3.1.jar:/opt/nutch/lib/commons-logging-1.1.3.jar:/opt/nutch/lib/commons-math3-3.1.1.jar:/opt/nutch/lib/commons-net-3.1.jar:/opt/nutch/lib/crawler-commons-0.6.jar:/opt/nutch/lib/cxf-core-3.0.4.jar:/opt/nutch/lib/cxf-rt-bindings-soap-3.0.4.jar:/opt/nutch/lib/cxf-rt-bindings-xml-3.0.4.jar:/opt/nutch/lib/cxf-rt-databinding-jaxb-3.0.4.jar:/opt/nutch/lib/cxf-rt-frontend-jaxrs-3.0.4.jar:/opt/nutch/lib/cxf-rt-frontend-jaxws-3.0.4.jar:/opt/nutch/lib/cxf-rt-frontend-simple-3.0.4.jar:/opt/nutch/lib/cxf-rt-transports-http-3.0.4.jar:/opt/nutch/lib/cxf-rt-transports-http-jetty-3.0.4.jar:/opt/nutch/lib/cxf-rt-ws-addr-3.0.4.jar:/opt/nutch/lib/cxf-rt-wsdl-3.0.4.jar:/opt/nutch/lib/cxf-rt-ws-policy-3.0.4.jar:/opt/nutch/lib/dom4j-1.6.1.jar:/opt/nutch/lib/dsiutils-2.0.12.jar:/opt/nutch/lib/fastutil-6.5.2.jar:/opt/nutch/lib/geronimo-servlet_3.0_spec-1.0.jar:/opt/nutch/lib/guava-16.0.1.jar:/opt/nutch/lib/guice-3.0.jar:/opt/nutch/lib/guice-servlet-3.0.jar:/opt/nutch/lib/h2-1.4.180.jar:/opt/nutch/lib/hadoop-annotations-2.4.0.jar:/opt/nutch/lib/hadoop-auth-2.4.0.jar:/opt/nutch/lib/hadoop-client-2.2.0.jar:/opt/nutch/lib/hadoop-common-2.4.0.jar:/opt/nutch/lib/hadoop-hdfs-2.4.0.jar:/opt/nutch/lib/hadoop-mapreduce-client-app-2.2.0.jar:/opt/nutch/lib/hadoop-mapreduce-client-common-2.4.0.jar:/opt/nutch/lib/hadoop-mapreduce-client-core-2.4.0.jar:/opt/nutch/lib/hadoop-mapreduce-client-jobclient-2.4.0.jar:/opt/nutch/lib/hadoop-mapreduce-client-shuffle-2.4.0.jar:/opt/nutch/lib/hadoop-yarn-api-2.4.0.jar:/opt/nutch/lib/hadoop-yarn-client-2.4.0.jar:/opt/nutch/lib/hadoop-yarn-common-2.4.0.jar:/opt/nutch/lib/hadoop-yarn-server-common-2.4.0.jar:/opt/nutch/lib/hadoop-yarn-server-nodemanager-2.4.0.jar:/opt/nutch/lib/htmlparser-1.6.jar:/opt/nutch/lib/httpclient-4.3.5.jar:/opt/nutch/lib/httpcore-4.3.2.jar:/opt/nutch/lib/icu4j-55.1.jar:/opt/nutch/lib/jackson-annotations-2.5.0.jar:/opt/nutch/lib/jackson-core-2.5.1.jar:/opt/nutch/lib/jackson-core-asl-1.8.8.jar:/opt/nutch/lib/jackson-databind-2.5.1.jar:/opt/nutch/lib/jackson-dataformat-cbor-2.5.1.jar:/opt/nutch/lib/jackson-jaxrs-1.8.8.jar:/opt/nutch/lib/jackson-jaxrs-base-2.5.1.jar:/opt/nutch/lib/jackson-jaxrs-json-provider-2.5.1.jar:/opt/nutch/lib/jackson-mapper-asl-1.8.8.jar:/opt/nutch/lib/jackson-module-jaxb-annotations-2.5.1.jar:/opt/nutch/lib/jackson-xc-1.8.8.jar:/opt/nutch/lib/jasper-compiler-5.5.23.jar:/opt/nutch/lib/jasper-runtime-5.5.23.jar:/opt/nutch/lib/javassist-3.12.1.GA.jar:/opt/nutch/lib/javax.annotation-api-1.2.jar:/opt/nutch/lib/javax.inject-1.jar:/opt/nutch/lib/java-xmlbuilder-0.4.jar:/opt/nutch/lib/javax.persistence-2.0.0.jar:/opt/nutch/lib/javax.ws.rs-api-2.0.1.jar:/opt/nutch/lib/jaxb-api-2.2.2.jar:/opt/nutch/lib/jaxb-core-2.1.14.jar:/opt/nutch/lib/jaxb-impl-2.2.3-1.jar:/opt/nutch/lib/jersey-client-1.9.jar:/opt/nutch/lib/jersey-core-1.9.jar:/opt/nutch/lib/jersey-guice-1.9.jar:/opt/nutch/lib/jersey-json-1.9.jar:/opt/nutch/lib/jersey-server-1.9.jar:/opt/nutch/lib/jettison-1.1.jar:/opt/nutch/lib/jetty-6.1.26.jar:/opt/nutch/lib/jetty-continuation-8.1.15.v20140411.jar:/opt/nutch/lib/jetty-http-8.1.15.v20140411.jar:/opt/nutch/lib/jetty-io-8.1.15.v20140411.jar:/opt/nutch/lib/jetty-security-8.1.15.v20140411.jar:/opt/nutch/lib/jetty-server-8.1.15.v20140411.jar:/opt/nutch/lib/jetty-util-6.1.26.jar:/opt/nutch/lib/jetty-util-8.1.15.v20140411.jar:/opt/nutch/lib/joda-time-2.3.jar:/opt/nutch/lib/jquery-2.0.3-1.jar:/opt/nutch/lib/jquerypp-1.0.1.jar:/opt/nutch/lib/jquery-selectors-0.0.3.jar:/opt/nutch/lib/jquery-ui-1.10.2-1.jar:/opt/nutch/lib/jsap-2.1.jar:/opt/nutch/lib/jsch-0.1.42.jar:/opt/nutch/lib/json-20131018.jar:/opt/nutch/lib/jsp-api-2.1.jar:/opt/nutch/lib/jsr305-1.3.9.jar:/opt/nutch/lib/juniversalchardet-1.0.3.jar:/opt/nutch/lib/libidn-1.15.jar:/opt/nutch/lib/log4j-1.2.17.jar:/opt/nutch/lib/lucene-analyzers-common-4.10.2.jar:/opt/nutch/lib/lucene-core-4.10.2.jar:/opt/nutch/lib/maven-parent-config-0.3.4.jar:/opt/nutch/lib/modernizr-2.6.2-1.jar:/opt/nutch/lib/neethi-3.0.3.jar:/opt/nutch/lib/netty-3.6.2.Final.jar:/opt/nutch/lib/ormlite-core-4.48.jar:/opt/nutch/lib/ormlite-jdbc-4.48.jar:/opt/nutch/lib/oro-2.0.8.jar:/opt/nutch/lib/paranamer-2.3.jar:/opt/nutch/lib/protobuf-java-2.5.0.jar:/opt/nutch/lib/reflections-0.9.8.jar:/opt/nutch/lib/servlet-api-2.5.jar:/opt/nutch/lib/slf4j-api-1.7.9.jar:/opt/nutch/lib/slf4j-log4j12-1.7.5.jar:/opt/nutch/lib/snappy-java-1.0.4.1.jar:/opt/nutch/lib/spring-aop-4.0.4.RELEASE.jar:/opt/nutch/lib/spring-beans-4.0.4.RELEASE.jar:/opt/nutch/lib/spring-context-4.0.4.RELEASE.jar:/opt/nutch/lib/spring-core-4.0.4.RELEASE.jar:/opt/nutch/lib/spring-expression-4.0.4.RELEASE.jar:/opt/nutch/lib/spring-web-4.0.4.RELEASE.jar:/opt/nutch/lib/stax2-api-3.1.4.jar:/opt/nutch/lib/stax-api-1.0-2.jar:/opt/nutch/lib/t-digest-3.1.jar:/opt/nutch/lib/tika-core-1.12.jar:/opt/nutch/lib/typeaheadjs-0.9.3.jar:/opt/nutch/lib/warc-hadoop-0.1.0.jar:/opt/nutch/lib/webarchive-commons-1.1.5.jar:/opt/nutch/lib/wicket-bootstrap-core-0.9.2.jar:/opt/nutch/lib/wicket-bootstrap-extensions-0.9.2.jar:/opt/nutch/lib/wicket-core-6.16.0.jar:/opt/nutch/lib/wicket-extensions-6.13.0.jar:/opt/nutch/lib/wicket-ioc-6.16.0.jar:/opt/nutch/lib/wicket-request-6.16.0.jar:/opt/nutch/lib/wicket-spring-6.16.0.jar:/opt/nutch/lib/wicket-util-6.16.0.jar:/opt/nutch/lib/wicket-webjars-0.4.0.jar:/opt/nutch/lib/woodstox-core-asl-4.4.1.jar:/opt/nutch/lib/wsdl4j-1.6.3.jar:/opt/nutch/lib/xercesImpl-2.11.0.jar:/opt/nutch/lib/xml-apis-1.4.01.jar:/opt/nutch/lib/xmlenc-0.52.jar:/opt/nutch/lib/xmlParserAPIs-2.6.2.jar:/opt/nutch/lib/xml-resolver-1.2.jar:/opt/nutch/lib/xmlschema-core-2.2.1.jar
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Operating system name: 
Linux
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Operating system 
architecture: amd64
2016-11-02 14:23:02,222 DEBUG httpclient.HttpClient - Operating system version: 
3.10.0-327.36.3.el7.x86_64
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SUN 1.8: SUN (DSA 
key/parameter generation; DSA signing; SHA-1, MD5 digests; SecureRandom; X.509 
certificates; JKS & DKS keystores; PKIX CertPathValidator; PKIX 
CertPathBuilder; LDAP, Collection CertStores, JavaPolicy Policy; 
JavaLoginConfig Configuration)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunRsaSign 1.8: Sun RSA 
signature provider
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunJSSE 1.8: Sun JSSE 
provider(PKCS12, SunX509/PKIX key/trust factories, SSLv3/TLSv1/TLSv1.1/TLSv1.2)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunJCE 1.8: SunJCE 
Provider (implements RSA, DES, Triple DES, AES, Blowfish, ARCFOUR, RC2, PBE, 
Diffie-Hellman, HMAC)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunJGSS 1.8: Sun 
(Kerberos v5, SPNEGO)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunSASL 1.8: Sun SASL 
provider(implements client mechanisms for: DIGEST-MD5, GSSAPI, EXTERNAL, PLAIN, 
CRAM-MD5, NTLM; server mechanisms for: DIGEST-MD5, GSSAPI, CRAM-MD5, NTLM)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - XMLDSig 1.8: XMLDSig (DOM 
XMLSignatureFactory; DOM KeyInfoFactory; C14N 1.0, C14N 1.1, Exclusive C14N, 
Base64, Enveloped, XPath, XPath2, XSLT TransformServices)
2016-11-02 14:23:02,229 DEBUG httpclient.HttpClient - SunPCSC 1.8: Sun PC/SC 
provider
2016-11-02 14:23:02,253 INFO  protocol.RobotRulesParser - Whitelisted hosts: 
[iis75.intranet.org]
2016-11-02 14:23:02,253 INFO  httpclient.Http - http.proxy.host = null
2016-11-02 14:23:02,253 INFO  httpclient.Http - http.proxy.port = 8080
2016-11-02 14:23:02,253 INFO  httpclient.Http - http.proxy.exception.list = 
false
2016-11-02 14:23:02,265 INFO  httpclient.Http - http.timeout = 36000
2016-11-02 14:23:02,265 INFO  httpclient.Http - http.content.limit = 65536
2016-11-02 14:23:02,265 INFO  httpclient.Http - http.agent = 
APL-Nutch-Spider/Nutch-1.12 ([email protected])
2016-11-02 14:23:02,265 INFO  httpclient.Http - http.accept.language = 
en-us,en-gb,en;q=0.7,*;q=0.3
2016-11-02 14:23:02,265 INFO  httpclient.Http - http.accept = 
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
2016-11-02 14:23:02,268 DEBUG params.DefaultHttpParams - Set parameter 
http.connection.timeout = 36000
2016-11-02 14:23:02,268 DEBUG params.DefaultHttpParams - Set parameter 
http.socket.timeout = 36000
2016-11-02 14:23:02,268 DEBUG params.DefaultHttpParams - Set parameter 
http.socket.sendbuffer = 8192
2016-11-02 14:23:02,268 DEBUG params.DefaultHttpParams - Set parameter 
http.socket.receivebuffer = 8192
2016-11-02 14:23:02,268 DEBUG params.DefaultHttpParams - Set parameter 
http.connection-manager.max-total = 50
2016-11-02 14:23:02,269 DEBUG params.DefaultHttpParams - Set parameter 
http.connection-manager.max-per-host = {HostConfiguration[]=10}
2016-11-02 14:23:02,269 DEBUG params.DefaultHttpParams - Set parameter 
http.connection-manager.timeout = 36000
2016-11-02 14:23:02,270 DEBUG params.DefaultHttpParams - Set parameter 
http.default-headers = [Accept-Language: en-us,en-gb,en;q=0.7,*;q=0.3
, Accept-Charset: utf-8,ISO-8859-1;q=0.7,*;q=0.7
, Accept: 
text/html,application/xml;q=0.9,application/xhtml+xml,text/xml;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
, Accept-Encoding: x-gzip, gzip, deflate
]
2016-11-02 14:23:02,279 TRACE httpclient.Http - Credentials - username: 
domainuser; set as default for realm: domain; scheme: ntlm
2016-11-02 14:23:02,296 TRACE httpclient.HttpState - enter 
HttpState.getCredentials(AuthScope)
2016-11-02 14:23:02,296 TRACE httpclient.Http - Pre-configured credentials with 
scope -  host: iis75.intranet.org; port: 80; not found for url: 
http://iis75.intranet.org
2016-11-02 14:23:02,297 TRACE httpclient.HttpState - enter 
HttpState.setCredentials(AuthScope, Credentials)
2016-11-02 14:23:02,352 TRACE methods.GetMethod - enter GetMethod(String)
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.version = HTTP/1.0
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.unambiguous-statusline = false
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.single-cookie-header = false
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.strict-transfer-encoding = false
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.reject-head-body = false
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.warn-extra-input = false
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.status-line-garbage-limit = 2147483647
2016-11-02 14:23:02,352 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.content-charset = UTF-8
2016-11-02 14:23:02,353 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.cookie-policy = compatibility
2016-11-02 14:23:02,353 DEBUG params.DefaultHttpParams - Set parameter 
http.protocol.single-cookie-header = true
2016-11-02 14:23:02,353 DEBUG params.DefaultHttpParams - Set parameter 
http.useragent = APL-Nutch-Spider/Nutch-1.12 ([email protected])
2016-11-02 14:23:02,353 TRACE httpclient.HttpClient - enter 
HttpClient.executeMethod(HttpMethod)
2016-11-02 14:23:02,353 TRACE httpclient.HttpClient - enter 
HttpClient.executeMethod(HostConfiguration,HttpMethod,HttpState)
2016-11-02 14:23:02,373 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,373 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,374 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,374 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,374 TRACE httpclient.MultiThreadedHttpConnectionManager - 
enter HttpConnectionManager.getConnectionWithTimeout(HostConfiguration, long)
2016-11-02 14:23:02,374 DEBUG httpclient.MultiThreadedHttpConnectionManager - 
HttpConnectionManager.getConnection:  config = 
HostConfiguration[host=http://iis75.intranet.org], timeout = 36000
2016-11-02 14:23:02,374 TRACE httpclient.MultiThreadedHttpConnectionManager - 
enter HttpConnectionManager.ConnectionPool.getHostPool(HostConfiguration)
2016-11-02 14:23:02,374 TRACE httpclient.MultiThreadedHttpConnectionManager - 
enter HttpConnectionManager.ConnectionPool.getHostPool(HostConfiguration)
2016-11-02 14:23:02,375 DEBUG httpclient.MultiThreadedHttpConnectionManager - 
Allocating new connection, 
hostConfig=HostConfiguration[host=http://iis75.intranet.org]
2016-11-02 14:23:02,391 TRACE httpclient.HttpMethodDirector - Attempt number 1 
to process request
2016-11-02 14:23:02,391 TRACE httpclient.HttpConnection - enter 
HttpConnection.open()
2016-11-02 14:23:02,391 DEBUG httpclient.HttpConnection - Open connection to 
iis75.intranet.org:80
2016-11-02 14:23:02,441 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.execute(HttpState, HttpConnection)
2016-11-02 14:23:02,441 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequest(HttpState, HttpConnection)
2016-11-02 14:23:02,441 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestLine(HttpState, HttpConnection)
2016-11-02 14:23:02,441 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.generateRequestLine(HttpConnection, String, String, String, 
String)
2016-11-02 14:23:02,442 DEBUG wire.header - >> "GET / HTTP/1.0[\r][\n]"
2016-11-02 14:23:02,442 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,443 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,443 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,443 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,443 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,443 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addUserAgentRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,443 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addHostRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,443 DEBUG httpclient.HttpMethodBase - Adding Host request 
header
2016-11-02 14:23:02,443 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addCookieRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,455 TRACE httpclient.HttpState - enter 
HttpState.getCookies()
2016-11-02 14:23:02,455 TRACE cookie.CookieSpec - enter 
CookieSpecBase.match(String, int, String, boolean, Cookie[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addProxyConnectionHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "Accept-Language: 
en-us,en-gb,en;q=0.7,*;q=0.3[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "Accept-Charset: 
utf-8,ISO-8859-1;q=0.7,*;q=0.7[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "Accept: 
text/html,application/xml;q=0.9,application/xhtml+xml,text/xml;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "Accept-Encoding: x-gzip, gzip, 
deflate[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "User-Agent: 
APL-Nutch-Spider/Nutch-1.12 ([email protected])[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,456 DEBUG wire.header - >> "Host: 
iis75.intranet.org[\r][\n]"
2016-11-02 14:23:02,456 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.writeLine()
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,457 DEBUG wire.header - >> "[\r][\n]"
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.flushRequestOutputStream()
2016-11-02 14:23:02,457 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,457 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readStatusLine(HttpState, HttpConnection)
2016-11-02 14:23:02,457 TRACE httpclient.HttpConnection - enter 
HttpConnection.readLine()
2016-11-02 14:23:02,458 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,458 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,487 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,487 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,488 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,488 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HeaderParser.parseHeaders(InputStream, String)
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,488 DEBUG wire.header - << "Server: 
Microsoft-IIS/7.5[\r][\n]"
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,488 DEBUG wire.header - << "WWW-Authenticate: 
Negotiate[\r][\n]"
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,488 DEBUG wire.header - << "WWW-Authenticate: NTLM[\r][\n]"
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,488 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "WWW-Authenticate: Basic 
realm="iis75.intranet.org"[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "X-Powered-By: ASP.NET[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "Date: Wed, 02 Nov 2016 19:23:03 
GMT[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "Connection: close[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "Content-Length: 0[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,489 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,489 DEBUG wire.header - << "[\r][\n]"
2016-11-02 14:23:02,489 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processResponseHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,489 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processCookieHeaders(Header[], HttpState, HttpConnection)
2016-11-02 14:23:02,489 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpState, HttpConnection)
2016-11-02 14:23:02,489 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpConnection)
2016-11-02 14:23:02,489 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,489 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.canResponseHaveBody(int)
2016-11-02 14:23:02,490 DEBUG httpclient.HttpMethodDirector - Authorization 
required
2016-11-02 14:23:02,490 TRACE httpclient.HttpMethodDirector - enter 
HttpMethodBase.processAuthenticationResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,496 DEBUG auth.AuthChallengeProcessor - Supported 
authentication schemes in the order of preference: [ntlm, digest, basic]
2016-11-02 14:23:02,496 INFO  auth.AuthChallengeProcessor - ntlm authentication 
scheme selected
2016-11-02 14:23:02,496 DEBUG auth.AuthChallengeProcessor - Using 
authentication scheme: ntlm
2016-11-02 14:23:02,496 DEBUG auth.AuthChallengeProcessor - Authorization 
challenge processed
2016-11-02 14:23:02,496 DEBUG httpclient.HttpMethodDirector - Authentication 
scope: NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,496 TRACE httpclient.HttpState - enter 
HttpState.getCredentials(AuthScope)
2016-11-02 14:23:02,497 DEBUG httpclient.HttpMethodDirector - Retry 
authentication
2016-11-02 14:23:02,499 DEBUG httpclient.HttpMethodBase - Should close 
connection in response to directive: close
2016-11-02 14:23:02,499 TRACE httpclient.HttpConnection - enter 
HttpConnection.close()
2016-11-02 14:23:02,499 TRACE httpclient.HttpConnection - enter 
HttpConnection.closeSockedAndStreams()
2016-11-02 14:23:02,499 DEBUG httpclient.HttpMethodDirector - Authenticating 
with NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,499 TRACE httpclient.HttpState - enter 
HttpState.getCredentials(AuthScope)
2016-11-02 14:23:02,499 TRACE auth.NTLMScheme - enter 
NTLMScheme.authenticate(Credentials, HttpMethod)
2016-11-02 14:23:02,501 DEBUG params.HttpMethodParams - Credential charset not 
configured, using HTTP element charset
2016-11-02 14:23:02,504 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,504 TRACE httpclient.HttpMethodDirector - Attempt number 1 
to process request
2016-11-02 14:23:02,504 TRACE httpclient.HttpConnection - enter 
HttpConnection.open()
2016-11-02 14:23:02,504 DEBUG httpclient.HttpConnection - Open connection to 
iis75.intranet.org:80
2016-11-02 14:23:02,507 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.execute(HttpState, HttpConnection)
2016-11-02 14:23:02,507 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequest(HttpState, HttpConnection)
2016-11-02 14:23:02,507 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestLine(HttpState, HttpConnection)
2016-11-02 14:23:02,507 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.generateRequestLine(HttpConnection, String, String, String, 
String)
2016-11-02 14:23:02,507 DEBUG wire.header - >> "GET / HTTP/1.1[\r][\n]"
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addUserAgentRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addHostRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,508 DEBUG httpclient.HttpMethodBase - Adding Host request 
header
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addCookieRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,508 TRACE httpclient.HttpState - enter 
HttpState.getCookies()
2016-11-02 14:23:02,508 TRACE cookie.CookieSpec - enter 
CookieSpecBase.match(String, int, String, boolean, Cookie[])
2016-11-02 14:23:02,508 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addProxyConnectionHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,508 DEBUG wire.header - >> "Accept-Language: 
en-us,en-gb,en;q=0.7,*;q=0.3[\r][\n]"
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,508 DEBUG wire.header - >> "Accept-Charset: 
utf-8,ISO-8859-1;q=0.7,*;q=0.7[\r][\n]"
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,508 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,508 DEBUG wire.header - >> "Accept: 
text/html,application/xml;q=0.9,application/xhtml+xml,text/xml;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 DEBUG wire.header - >> "Accept-Encoding: x-gzip, gzip, 
deflate[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 DEBUG wire.header - >> "User-Agent: 
APL-Nutch-Spider/Nutch-1.12 ([email protected])[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 DEBUG wire.header - >> "Authorization: NTLM TlRMTVNTU 
<snip by bob> MENPQUNE[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 DEBUG wire.header - >> "Host: 
iis75.intranet.org[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.writeLine()
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,509 DEBUG wire.header - >> "[\r][\n]"
2016-11-02 14:23:02,509 TRACE httpclient.HttpConnection - enter 
HttpConnection.flushRequestOutputStream()
2016-11-02 14:23:02,509 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,510 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readStatusLine(HttpState, HttpConnection)
2016-11-02 14:23:02,510 TRACE httpclient.HttpConnection - enter 
HttpConnection.readLine()
2016-11-02 14:23:02,510 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,510 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,603 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,604 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,604 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,604 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HeaderParser.parseHeaders(InputStream, String)
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,604 DEBUG wire.header - << "Content-Type: text/html; 
charset=us-ascii[\r][\n]"
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,604 DEBUG wire.header - << "Server: 
Microsoft-HTTPAPI/2.0[\r][\n]"
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,604 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,605 DEBUG wire.header - << "WWW-Authenticate: NTLM 
TlRMTVNTUAACAAAABQAFADgAAAAGAoECr+K/ <snip by bob> AAAAA[\r][\n]"
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,605 DEBUG wire.header - << "Date: Wed, 02 Nov 2016 19:23:03 
GMT[\r][\n]"
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,605 DEBUG wire.header - << "Content-Length: 341[\r][\n]"
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,605 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,605 DEBUG wire.header - << "[\r][\n]"
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processResponseHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processCookieHeaders(Header[], HttpState, HttpConnection)
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpState, HttpConnection)
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpConnection)
2016-11-02 14:23:02,605 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.canResponseHaveBody(int)
2016-11-02 14:23:02,605 DEBUG httpclient.HttpMethodDirector - Authorization 
required
2016-11-02 14:23:02,605 TRACE httpclient.HttpMethodDirector - enter 
HttpMethodBase.processAuthenticationResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,606 DEBUG auth.AuthChallengeProcessor - Using 
authentication scheme: ntlm
2016-11-02 14:23:02,606 DEBUG auth.AuthChallengeProcessor - Authorization 
challenge processed
2016-11-02 14:23:02,606 DEBUG httpclient.HttpMethodDirector - Authentication 
scope: NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,606 TRACE httpclient.HttpState - enter 
HttpState.getCredentials(AuthScope)
2016-11-02 14:23:02,606 DEBUG httpclient.HttpMethodDirector - Retry 
authentication
2016-11-02 14:23:02,606 DEBUG wire.content - << "<!DOCTYPE HTML PUBLIC 
"-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd";>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG wire.content - << "<HTML><HEAD><TITLE>Not 
Authorized</TITLE>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG wire.content - << "<META 
HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG wire.content - << "<BODY><h2>Not 
Authorized</h2>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG wire.content - << "<hr><p>HTTP Error 401. The 
requested resource requires user authentication.</p>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG wire.content - << "</BODY></HTML>[\r][\n]"
2016-11-02 14:23:02,606 DEBUG httpclient.HttpMethodBase - Resorting to protocol 
version default close connection policy
2016-11-02 14:23:02,606 DEBUG httpclient.HttpMethodBase - Should NOT close 
connection, using HTTP/1.1
2016-11-02 14:23:02,607 TRACE httpclient.HttpConnection - enter 
HttpConnection.isResponseAvailable()
2016-11-02 14:23:02,607 DEBUG httpclient.HttpMethodDirector - Authenticating 
with NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,607 TRACE httpclient.HttpState - enter 
HttpState.getCredentials(AuthScope)
2016-11-02 14:23:02,607 TRACE auth.NTLMScheme - enter 
NTLMScheme.authenticate(Credentials, HttpMethod)
2016-11-02 14:23:02,607 DEBUG params.HttpMethodParams - Credential charset not 
configured, using HTTP element charset
2016-11-02 14:23:02,658 TRACE httpclient.HttpMethodBase - 
HttpMethodBase.addRequestHeader(Header)
2016-11-02 14:23:02,658 TRACE httpclient.HttpMethodDirector - Attempt number 1 
to process request
2016-11-02 14:23:02,660 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.execute(HttpState, HttpConnection)
2016-11-02 14:23:02,660 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequest(HttpState, HttpConnection)
2016-11-02 14:23:02,660 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestLine(HttpState, HttpConnection)
2016-11-02 14:23:02,660 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.generateRequestLine(HttpConnection, String, String, String, 
String)
2016-11-02 14:23:02,660 DEBUG wire.header - >> "GET / HTTP/1.1[\r][\n]"
2016-11-02 14:23:02,660 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,660 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.writeRequestHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addUserAgentRequestHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addHostRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,661 DEBUG httpclient.HttpMethodBase - Adding Host request 
header
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addCookieRequestHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,661 TRACE httpclient.HttpState - enter 
HttpState.getCookies()
2016-11-02 14:23:02,661 TRACE cookie.CookieSpec - enter 
CookieSpecBase.match(String, int, String, boolean, Cookie[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.addProxyConnectionHeader(HttpState, HttpConnection)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "Accept-Language: 
en-us,en-gb,en;q=0.7,*;q=0.3[\r][\n]"
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "Accept-Charset: 
utf-8,ISO-8859-1;q=0.7,*;q=0.7[\r][\n]"
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "Accept: 
text/html,application/xml;q=0.9,application/xhtml+xml,text/xml;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[\r][\n]"
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "Accept-Encoding: x-gzip, gzip, 
deflate[\r][\n]"
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "User-Agent: 
APL-Nutch-Spider/Nutch-1.12 ([email protected])[\r][\n]"
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,661 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,661 DEBUG wire.header - >> "Authorization: NTLM 
TlRMTVNTUAADAAAAGAAYAFUAAAAAAAAAbQAAAAUABQBAAAAABQAF <snip by bob> A==[\r][\n]"
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,662 DEBUG wire.header - >> "Host: 
iis75.intranet.org[\r][\n]"
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.print(String)
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.writeLine()
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[])
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.write(byte[], int, int)
2016-11-02 14:23:02,662 DEBUG wire.header - >> "[\r][\n]"
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.flushRequestOutputStream()
2016-11-02 14:23:02,662 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,662 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readStatusLine(HttpState, HttpConnection)
2016-11-02 14:23:02,662 TRACE httpclient.HttpConnection - enter 
HttpConnection.readLine()
2016-11-02 14:23:02,662 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,662 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,953 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,954 DEBUG wire.header - << "HTTP/1.1 401 
Unauthorized[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseHeaders(HttpState,HttpConnection)
2016-11-02 14:23:02,954 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HeaderParser.parseHeaders(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "Server: 
Microsoft-IIS/7.5[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "WWW-Authenticate: 
Negotiate[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "WWW-Authenticate: NTLM[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "WWW-Authenticate: Basic 
realm="iis75.intranet.org"[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "X-Powered-By: ASP.NET[\r][\n]"
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,954 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,954 DEBUG wire.header - << "Date: Wed, 02 Nov 2016 19:23:03 
GMT[\r][\n]"
2016-11-02 14:23:02,955 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,955 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,955 DEBUG wire.header - << "Content-Length: 0[\r][\n]"
2016-11-02 14:23:02,955 TRACE httpclient.HttpParser - enter 
HttpParser.readLine(InputStream, String)
2016-11-02 14:23:02,955 TRACE httpclient.HttpParser - enter 
HttpParser.readRawLine()
2016-11-02 14:23:02,955 DEBUG wire.header - << "[\r][\n]"
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processResponseHeaders(HttpState, HttpConnection)
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.processCookieHeaders(Header[], HttpState, HttpConnection)
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpState, HttpConnection)
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.readResponseBody(HttpConnection)
2016-11-02 14:23:02,955 TRACE httpclient.HttpConnection - enter 
HttpConnection.getResponseInputStream()
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodBase - enter 
HttpMethodBase.canResponseHaveBody(int)
2016-11-02 14:23:02,955 DEBUG httpclient.HttpMethodDirector - Authorization 
required
2016-11-02 14:23:02,955 TRACE httpclient.HttpMethodDirector - enter 
HttpMethodBase.processAuthenticationResponse(HttpState, HttpConnection)
2016-11-02 14:23:02,955 DEBUG auth.AuthChallengeProcessor - Using 
authentication scheme: ntlm
2016-11-02 14:23:02,955 DEBUG auth.AuthChallengeProcessor - Authorization 
challenge processed
2016-11-02 14:23:02,955 DEBUG httpclient.HttpMethodDirector - Authentication 
scope: NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,955 DEBUG httpclient.HttpMethodDirector - Credentials 
required
2016-11-02 14:23:02,955 DEBUG httpclient.HttpMethodDirector - Credentials 
provider not available
2016-11-02 14:23:02,955 INFO  httpclient.HttpMethodDirector - Failure 
authenticating with NTLM <any realm>@iis75.intranet.org:80
2016-11-02 14:23:02,958 DEBUG httpclient.HttpMethodBase - Resorting to protocol 
version default close connection policy
2016-11-02 14:23:02,959 DEBUG httpclient.HttpMethodBase - Should NOT close 
connection, using HTTP/1.1
2016-11-02 14:23:02,959 TRACE httpclient.HttpConnection - enter 
HttpConnection.isResponseAvailable()
2016-11-02 14:23:02,959 TRACE httpclient.HttpConnection - enter 
HttpConnection.releaseConnection()
2016-11-02 14:23:02,959 DEBUG httpclient.HttpConnection - Releasing connection 
back to connection manager.
2016-11-02 14:23:02,959 TRACE httpclient.MultiThreadedHttpConnectionManager - 
enter HttpConnectionManager.releaseConnection(HttpConnection)
2016-11-02 14:23:02,959 DEBUG httpclient.MultiThreadedHttpConnectionManager - 
Freeing connection, hostConfig=HostConfiguration[host=http://iis75.intranet.org]
2016-11-02 14:23:02,959 TRACE httpclient.MultiThreadedHttpConnectionManager - 
enter HttpConnectionManager.ConnectionPool.getHostPool(HostConfiguration)
2016-11-02 14:23:02,959 DEBUG util.IdleConnectionHandler - Adding connection 
at: 1478114582959
2016-11-02 14:23:02,959 DEBUG httpclient.MultiThreadedHttpConnectionManager - 
Notifying no-one, there are no waiting threads
2016-11-02 14:23:02,959 TRACE httpclient.Http - url: http://iis75.intranet.org; 
status code: 401; bytes received: 0; Content-Length: 0
2016-11-02 14:23:03,239 DEBUG util.ObjectCache - No object cache found for 
conf=Configuration: core-default.xml, core-site.xml, nutch-default.xml, 
nutch-site.xml, instantiating a new object cache
2016-11-02 14:23:03,356 TRACE httpclient.Http - 401 Authentication Required
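
A side note for reading the wire log above: the type of each NTLM token can be read from its leading bytes, which shows how far the handshake got. Below is a minimal Python sketch, not part of Nutch or httpclient; the function name is made up for illustration. It assumes only the standard NTLMSSP layout: an 8-byte "NTLMSSP\0" signature followed by a 4-byte little-endian message type (1 = negotiate, 2 = challenge, 3 = authenticate).

```python
import base64
import struct

NTLM_SIGNATURE = b"NTLMSSP\x00"


def ntlm_message_type(token_prefix: str) -> int:
    """Return the NTLM message type from the start of a base64 token.

    Hypothetical helper for reading logs; only the first 16 base64
    characters are needed, so a snipped token still works.
    """
    # 16 base64 characters decode to exactly 12 bytes:
    # the 8-byte signature plus the 4-byte little-endian type field.
    raw = base64.b64decode(token_prefix[:16])
    if len(raw) < 12 or raw[:8] != NTLM_SIGNATURE:
        raise ValueError("not an NTLMSSP token")
    (message_type,) = struct.unpack("<I", raw[8:12])
    return message_type


# Token prefixes copied from the wire.header lines in the log above.
print(ntlm_message_type("TlRMTVNTUAACAAAABQAFADgAAAAGAoECr+K/"))  # server WWW-Authenticate
print(ntlm_message_type("TlRMTVNTUAADAAAAGAAYAFUAAAAAAAAAbQAAAAUABQBAAAAABQAF"))  # client Authorization
```

Applied to the tokens in this log, the server's WWW-Authenticate token is a type 2 message and the client's second Authorization token is a type 3 message, so the exchange reaches the final step before IIS answers 401 again.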


-----Original Message-----
From: Furkan KAMACI [mailto:[email protected]] 
Sent: Wednesday, November 02, 2016 2:20 PM
To: [email protected]
Cc: Bell, Bob <[email protected]>
Subject: Re: Nutch 1.12 NTLM authentication IIS 7.5 Intranet

Hi Bob,

The server may require the domain as part of the username, for example 
"domain\\user". Could you check that?

Kind Regards,
Furkan KAMACI

On Wed, Nov 2, 2016 at 9:11 PM, Bell, Bob <[email protected]> wrote:

> <iis75.intranet> is just a string replacement for our actual intranet 
> name, something like blah.intranet.org; I use the <> convention when I 
> am obscuring actual data.
>
> What might the log4j.properties entry for httpclient.Http be?  I see it 
> is only at INFO level logging, but I do not know the proper object 
> path to set it up.
>
> Thanks,
> Bob
>
> >Hi Bob,
> >
> >Do you write host as <iis75.intranet> or iis75.intranet ?
> >
> >Kind Regards,
> >Furkan KAMACI
>
> -----Original Message-----
> From: Bell, Bob
> Sent: Wednesday, November 02, 2016 12:17 PM
> To: '[email protected]' <[email protected]>
> Cc: Bell, Bob <[email protected]>
> Subject: Nutch 1.12 NTLM authentication IIS 7.5 Intranet
>
> I have been trying for more than a year to get NTLM to work with IIS 7.5
> without success. I was happy to see the recent 1.12 release, and thought
> OK, I will give it a shot again. I am almost to the point where I do not
> believe it works with NTLM, or it does not know how to handle the multiple
> 401s that are returned, or I have some fundamental problem somewhere. I
> have tried everything I could think of, and am at a loss on how to solve
> this mystery. My Nutch server is CentOS 7 in VirtualBox. I am using the
> httpclient as indicated in the docs, but with no love. I can fetch with
> anonymous, but I need NTLM to work.
>
> I am using plugin.includes = >protocol-httpclient
>
> nutch-site.xml:
> <property>
> <name>http.auth.file</name>
> <value>httpclient-auth.xml</value>
> <description>Authentication configuration file for 'protocol-httpclient'
> plugin.
> </description>
> </property>
>
> httpclient-auth.xml for local user:
> <auth-configuration>
>     <credentials username="nutch" password="<somepassword>">
>         <default  scheme="basic" port="80"/>
>     </credentials>
> </auth-configuration>
>
> Here is the output with a local user account on the server. One thing I
> notice is that I cannot force authentication to be anything other than
> NTLM, even though I support NTLM, basic, and digest. Notice the scheme
> was basic, but it goes through NTLM regardless.
>
> [root@localhost nutch]# nutch parsechecker http://<iis75.intranet>
> fetching: http://<iis75.intranet>
> Whitelisted hosts: [<iis75.intranet>]
> http.proxy.host = null
> http.proxy.port = 8080
> http.proxy.exception.list = false
> http.timeout = 36000
> http.content.limit = 65536
> http.agent = APL-Nutch-Spider/Nutch-1.12
> http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
> http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Credentials - username: nutch; set as default for realm: ; scheme: basic
> Pre-configured credentials with scope -  host: <iis75.intranet>; port: 80; not found for url: http://<iis75.intranet>
> Authorization required
> Supported authentication schemes in the order of preference: [ntlm, digest, basic]
> ntlm authentication scheme selected
> Using authentication scheme: ntlm
> Authorization challenge processed
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> Credentials required
> Credentials provider not available
> No credentials available for NTLM <any realm>@<iis75.intranet>:80
> url: http://<iis75.intranet>; status code: 401; bytes received: 0; Content-Length: 0
> 401 Authentication Required
> Fetch failed with protocol status: access_denied(17), lastModified=0: Authentication required: http://<iis75.intranet>
> [root@localhost nutch]#
>
>
> httpclient-auth.xml for domain  user:
> <auth-configuration>
>     <credentials username="<domainuser>" password="<domainpassword>">
>         <default host="<iis75.intranet>" scheme="ntlm" port="80"
> realm="<domain>"/>
>     </credentials>
> </auth-configuration>
>
> Note: it doesn't matter what I put in the host; it doesn't seem to change 
> anything.
>
> [root@localhost nutch]# nutch parsechecker http://<iis75.intranet>
> fetching: http://<iis75.intranet>
> Whitelisted hosts: [<iis75.intranet>]
> http.proxy.host = null
> http.proxy.port = 8080
> http.proxy.exception.list = false
> http.timeout = 36000
> http.content.limit = 65536
> http.agent = APL-Nutch-Spider/Nutch-1.12
> http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
> http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Credentials - username: <domainuser>"; set as default for realm: =<domain>; scheme: ntlm
> Pre-configured credentials with scope -  host: <iis75.intranet>; port: 80; not found for url: http://<iis75.intranet>
> Authorization required
> Supported authentication schemes in the order of preference: [ntlm, digest, basic]
> ntlm authentication scheme selected
> Using authentication scheme: ntlm
> Authorization challenge processed
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> Retry authentication
> Authenticating with NTLM <any realm>@<iis75.intranet>:80
> enter NTLMScheme.authenticate(Credentials, HttpMethod)
> Authorization required
> Using authentication scheme: ntlm
> Authorization challenge processed
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> Retry authentication
> Authenticating with NTLM <any realm>@<iis75.intranet>:80
> enter NTLMScheme.authenticate(Credentials, HttpMethod)
> Authorization required
> Using authentication scheme: ntlm
> Authorization challenge processed
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> Credentials required
> Credentials provider not available
> Failure authenticating with NTLM <any realm>@<iis75.intranet>:80
> url: http://<iis75.intranet>; status code: 401; bytes received: 0; Content-Length: 0
> 401 Authentication Required
> Fetch failed with protocol status: access_denied(17), lastModified=0: Authentication required: http://<iis75.intranet>
>
> Last entry in  Hadoop.log:
>
> 2016-11-02 12:08:49,568 INFO  parse.ParserChecker - fetching: http:// 
> <iis75.intranet>
> 2016-11-02 12:08:50,040 DEBUG util.ObjectCache - No object cache found 
> for
> conf=Configuration: core-default.xml, core-site.xml, 
> nutch-default.xml, nutch-site.xml, instantiating a new object cache
> 2016-11-02 12:08:50,119 INFO  protocol.RobotRulesParser - Whitelisted
> hosts: [<iis75.intranet>]
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.proxy.host = null
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.proxy.port = 8080
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - 
> http.proxy.exception.list = false
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.timeout = 36000
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.content.limit = 
> 65536
> 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.agent =
> APL-Nutch-Spider/Nutch-1.12 ([email protected])
> 2016-11-02 12:08:50,120 INFO  httpclient.Http - http.accept.language =
> en-us,en-gb,en;q=0.7,*;q=0.3
> 2016-11-02 12:08:50,120 INFO  httpclient.Http - http.accept =
> text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> 2016-11-02 12:08:50,133 TRACE httpclient.Http - Credentials - username:
> <domainuser>; set as default for realm: <domain>; scheme: ntlm
> 2016-11-02 12:08:50,134 TRACE httpclient.Http - Pre-configured 
> credentials with scope -  host: <iis75.intranet>; port: 80; not found 
> for url: http:// <iis75.intranet>
> 2016-11-02 12:08:50,313 DEBUG httpclient.HttpMethodDirector - 
> Authorization required
> 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor - Supported 
> authentication schemes in the order of preference: [ntlm, digest, 
> basic]
> 2016-11-02 12:08:50,320 INFO  auth.AuthChallengeProcessor - ntlm 
> authentication scheme selected
> 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor - Using 
> authentication scheme: ntlm
> 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor - 
> Authorization challenge processed
> 2016-11-02 12:08:50,320 DEBUG httpclient.HttpMethodDirector - 
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,320 DEBUG httpclient.HttpMethodDirector - Retry 
> authentication
> 2016-11-02 12:08:50,321 DEBUG httpclient.HttpMethodDirector - 
> Authenticating with NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,321 TRACE auth.NTLMScheme - enter 
> NTLMScheme.authenticate(Credentials, HttpMethod)
> 2016-11-02 12:08:50,351 DEBUG httpclient.HttpMethodDirector - 
> Authorization required
> 2016-11-02 12:08:50,352 DEBUG auth.AuthChallengeProcessor - Using 
> authentication scheme: ntlm
> 2016-11-02 12:08:50,352 DEBUG auth.AuthChallengeProcessor - 
> Authorization challenge processed
> 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector - 
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector - Retry 
> authentication
> 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector - 
> Authenticating with NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,352 TRACE auth.NTLMScheme - enter 
> NTLMScheme.authenticate(Credentials, HttpMethod)
> 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector - 
> Authorization required
> 2016-11-02 12:08:50,393 DEBUG auth.AuthChallengeProcessor - Using 
> authentication scheme: ntlm
> 2016-11-02 12:08:50,393 DEBUG auth.AuthChallengeProcessor - 
> Authorization challenge processed
> 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector - 
> Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector - 
> Credentials required
> 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector - 
> Credentials provider not available
> 2016-11-02 12:08:50,393 INFO  httpclient.HttpMethodDirector - Failure 
> authenticating with NTLM <any realm>@<iis75.intranet>:80
> 2016-11-02 12:08:50,395 TRACE httpclient.Http - url: 
> http://<iis75.intranet>; status code: 401; bytes received: 0; 
> Content-Length: 0
> 2016-11-02 12:08:50,681 DEBUG util.ObjectCache - No object cache found 
> for
> conf=Configuration: core-default.xml, core-site.xml, 
> nutch-default.xml, nutch-site.xml, instantiating a new object cache
> 2016-11-02 12:08:50,804 TRACE httpclient.Http - 401 Authentication 
> Required
>
> Any help is appreciated, as I am about to move on to another spider 
> for Solr.
>
> Thanks,
> Bob
>
>
