[ https://issues.apache.org/jira/browse/NUTCH-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850039#comment-17850039 ]
Hudson commented on NUTCH-3044:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #163 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/163/])
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/4b263533a9cdea208383fdbb0a8cc0b537423d7f])
* (edit) src/java/org/apache/nutch/crawl/Generator.java
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/4729786e4d7f9e1136580ceb191274862d03ba5b])
* (edit) src/test/org/apache/nutch/crawl/TestGenerator.java
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/b153279ad5844b32560ecf62a8e7f83f8ecbd43c])
* (edit) src/java/org/apache/nutch/crawl/Generator.java
* (edit) src/test/org/apache/nutch/crawl/TestGenerator.java

> Generator: NPE when extracting the host part of a URL fails
> -----------------------------------------------------------
>
>                 Key: NUTCH-3044
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3044
>             Project: Nutch
>          Issue Type: Bug
>          Components: generator
>    Affects Versions: 1.20
>            Reporter: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.21
>
>
> When extracting the host part of a URL fails, the Generator job fails because
> of an NPE in the SelectorReducer. This issue is reproducible if the CrawlDb
> contains a malformed URL, for example, a URL with an unsupported scheme
> (smb://).
> {noformat}
> Caused by: java.lang.NullPointerException
>         at org.apache.nutch.crawl.Generator$SelectorReducer.reduce(Generator.java:439)
>         at org.apache.nutch.crawl.Generator$SelectorReducer.reduce(Generator.java:300)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
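
For illustration, the failure mode described in the issue can be sketched in plain Java. This is a minimal sketch under stated assumptions, not the committed fix: the class `HostGuard` and the methods `hostOrNull` and `countByHost` are hypothetical names, and the code uses only `java.net.URI` and `java.net.URL` rather than Nutch's own URL utilities. It assumes a default JDK with no protocol handler registered for smb://, so host extraction fails for such URLs. The caller then skips the record instead of dereferencing a null host.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class HostGuard {

  /**
   * Hypothetical helper (not a Nutch API): returns the lowercased host of
   * the URL, or null if the host cannot be extracted.
   */
  static String hostOrNull(String url) {
    try {
      // toURL() throws MalformedURLException for schemes without a
      // registered protocol handler, e.g. smb:// on a default JDK.
      String host = URI.create(url).toURL().getHost();
      return host.isEmpty() ? null : host.toLowerCase(Locale.ROOT);
    } catch (MalformedURLException | IllegalArgumentException e) {
      // Invalid URI syntax, non-absolute URI, or unsupported scheme.
      return null;
    }
  }

  /** Counts URLs per host, skipping URLs whose host cannot be extracted. */
  static Map<String, Integer> countByHost(List<String> urls) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String url : urls) {
      String host = hostOrNull(url);
      if (host == null) {
        // Guard: calling a method on the null host here is what would
        // raise the NullPointerException.
        continue;
      }
      counts.merge(host, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> urls = List.of(
        "https://Example.org/a",
        "smb://fileserver/share",
        "https://example.org/b");
    System.out.println(countByHost(urls));
  }
}
```

Whether to skip such records silently, log them, or count them is a design choice. The sketch skips them only to keep the example short.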