Hello!

I get the following error when I run Nutch 1.0 and 1.1dev on some sites:

Error parsing: http://www.nasa.gov/centers/goddard/home/index.html: failed(2,200): java.nio.charset.IllegalCharsetNameException: .utf8

Nutch log:

2009-04-10 02:23:56,902 WARN parse.html - java.nio.charset.IllegalCharsetNameException: .utf8
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.checkName(Charset.java:285)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.lookup2(Charset.java:459)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.lookup(Charset.java:438)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.isSupported(Charset.java:480)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.resolveEncodingAlias(EncodingDetector.java:310)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.addClue(EncodingDetector.java:201)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.addClue(EncodingDetector.java:208)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.autoDetectClues(EncodingDetector.java:193)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:136)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:82)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:766)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:552)
2009-04-10 02:23:56,911 WARN fetcher.Fetcher - Error parsing: http://www.nasa.gov/centers/goddard/home/index.html: failed(2,200): java.nio.charset.IllegalCharsetNameException: .utf8
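
The trace above shows the exception coming out of Charset.isSupported, which suggests the JDK rejects the name ".utf8" as syntactically illegal (a charset name may not start with a period) before any lookup happens. Below is a minimal sketch that reproduces this outside Nutch and shows one possible guard. The class and method names (CharsetNameCheck, isUsableCharset) are hypothetical and not part of Nutch; this is not the actual EncodingDetector code, only an illustration of how the unchecked exception could be handled.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of Nutch.
class CharsetNameCheck {

    // Returns true only if the name is syntactically legal and
    // the running JVM supports the charset.
    static boolean isUsableCharset(String name) {
        if (name == null) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException is a subclass of
            // IllegalArgumentException, so an illegal name such as
            // ".utf8" lands here instead of aborting the parse.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableCharset("UTF-8")); // true
        System.out.println(isUsableCharset(".utf8")); // false
    }
}
```

With a guard like this, a malformed charset clue could be skipped and the detector could fall back to its other clues, though whether that is the right fix for Nutch is an assumption on my part.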


Thanks!

