[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
[ https://issues.apache.org/jira/browse/NUTCH-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850039#comment-17850039 ]

Hudson commented on NUTCH-3044:
---

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #163 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/163/])
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/4b263533a9cdea208383fdbb0a8cc0b537423d7f])
* (edit) src/java/org/apache/nutch/crawl/Generator.java
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/4729786e4d7f9e1136580ceb191274862d03ba5b])
* (edit) src/test/org/apache/nutch/crawl/TestGenerator.java
NUTCH-3044 Generator: NPE when extracting the host part of a URL fails (snagel: [https://github.com/apache/nutch/commit/b153279ad5844b32560ecf62a8e7f83f8ecbd43c])
* (edit) src/java/org/apache/nutch/crawl/Generator.java
* (edit) src/test/org/apache/nutch/crawl/TestGenerator.java

> Generator: NPE when extracting the host part of a URL fails
> ---
>
> Key: NUTCH-3044
> URL: https://issues.apache.org/jira/browse/NUTCH-3044
> Project: Nutch
> Issue Type: Bug
> Components: generator
> Affects Versions: 1.20
> Reporter: Sebastian Nagel
> Priority: Minor
> Fix For: 1.21
>
> When extracting the host part of a URL fails, the Generator job fails because
> of an NPE in the SelectorReducer. This issue is reproducible if the CrawlDb
> contains a malformed URL, for example, a URL with an unsupported scheme
> (smb://).
> {noformat}
> Caused by: java.lang.NullPointerException
>     at org.apache.nutch.crawl.Generator$SelectorReducer.reduce(Generator.java:439)
>     at org.apache.nutch.crawl.Generator$SelectorReducer.reduce(Generator.java:300)
> {noformat}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
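The failure mode described in this issue can be illustrated with a small standalone sketch. It is not the code from Generator.java and not the committed fix. The class name HostGuardSketch and the helpers hostOrNull and countByHost are hypothetical, and the sketch assumes a stock JDK with no protocol handler registered for smb. It shows host extraction returning null for such a URL, and a null check that skips the entry instead of dereferencing it.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not the code from Generator.java.
class HostGuardSketch {

    /** Returns the host part of the URL, or null if it cannot be extracted. */
    static String hostOrNull(String urlString) {
        try {
            // toURL() throws MalformedURLException when the JDK has no
            // protocol handler for the scheme (assumed to be the case for smb).
            return new URI(urlString).toURL().getHost();
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            return null;
        }
    }

    /** Counts URLs per host, skipping URLs whose host cannot be extracted. */
    static Map<String, Integer> countByHost(List<String> urls) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String url : urls) {
            String host = hostOrNull(url);
            if (host == null) {
                // Without this guard, host.toLowerCase(...) below would throw
                // a NullPointerException for the smb:// entry.
                continue;
            }
            counts.merge(host.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
            "https://nutch.apache.org/",
            "smb://fileserver/share/doc.txt",
            "https://nutch.apache.org/downloads.html");
        System.out.println(countByHost(urls));
    }
}
```

Whether such URLs should be skipped, logged, or grouped under a fallback key is a design choice; the sketch only skips them.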
[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands
[ https://issues.apache.org/jira/browse/NUTCH-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850040#comment-17850040 ]

Hudson commented on NUTCH-3055:
---

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #163 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/163/])
NUTCH-3055 README: fix Github "hub" commands (snagel: [https://github.com/apache/nutch/commit/ca03d9b76485b7c9d50dff2c3946bb8189daf5e1])
* (edit) README.md

> README: fix Github "hub" commands
> ---
>
> Key: NUTCH-3055
> URL: https://issues.apache.org/jira/browse/NUTCH-3055
> Project: Nutch
> Issue Type: Bug
> Components: documentation
> Affects Versions: 1.20
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Trivial
> Fix For: 1.21
>
> The [README.md|https://github.com/apache/nutch/blob/master/README.md]
> contains [Github hub|https://hub.github.com/] commands but with "git" as
> command (executable) name, maybe an alias or some other magic. However, if
> hub isn't installed, these commands fail with {{git: 'pull-request' is not a
> git command. See 'git --help'.}} or similar.
> We should use the command "hub" instead.
[jira] [Resolved] (NUTCH-3055) README: fix Github "hub" commands
[ https://issues.apache.org/jira/browse/NUTCH-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-3055.
Resolution: Fixed
[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands
[ https://issues.apache.org/jira/browse/NUTCH-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850005#comment-17850005 ]

ASF GitHub Bot commented on NUTCH-3055:
---

sebastian-nagel merged PR #818:
URL: https://github.com/apache/nutch/pull/818
Re: [PR] NUTCH-3055 README: fix Github "hub" commands [nutch]
sebastian-nagel merged PR #818:
URL: https://github.com/apache/nutch/pull/818

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@nutch.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Resolved] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
[ https://issues.apache.org/jira/browse/NUTCH-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-3044.
Resolution: Fixed
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
[ https://issues.apache.org/jira/browse/NUTCH-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850004#comment-17850004 ]

ASF GitHub Bot commented on NUTCH-3044:
---

sebastian-nagel merged PR #815:
URL: https://github.com/apache/nutch/pull/815
Re: [PR] NUTCH-3044 Generator: NPE when extracting the host part of a URL fails [nutch]
sebastian-nagel merged PR #815:
URL: https://github.com/apache/nutch/pull/815