[jira] [Commented] (NUTCH-2046) The crawl script should be able to skip an initial injection.

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960235#comment-15960235 ] ASF GitHub Bot commented on NUTCH-2046: lewismc commented on issue #161: Fix for NU

[jira] [Commented] (NUTCH-2193) Upgrade feed parser plugin to use rome 1.5

2017-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958961#comment-15958961 ] Hudson commented on NUTCH-2193: SUCCESS: Integrated in Jenkins build Nutch-trunk #3423 (See

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958941#comment-15958941 ] Markus Jelsma commented on NUTCH-2335: Thanks!

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958931#comment-15958931 ] Sebastian Nagel commented on NUTCH-2335: Or to include multiple commits of a pull

[jira] [Resolved] (NUTCH-2193) Upgrade feed parser plugin to use rome 1.5

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2193. Resolution: Fixed Committed to master/1.x ([c181953|https://github.com/apache/nutch/commit/

[jira] [Assigned] (NUTCH-2193) Upgrade feed parser plugin to use rome 1.5

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2193: Assignee: Sebastian Nagel

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958906#comment-15958906 ] Markus Jelsma commented on NUTCH-2335: Ah, making it a patch file: https://github.co

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958893#comment-15958893 ] Markus Jelsma commented on NUTCH-2335: Ah, i didn't see this one. Anyone knows how i

[jira] [Closed] (NUTCH-2371) Injector to support noFilter and noNormalize

2017-04-06 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma closed NUTCH-2371. Resolution: Duplicate

[jira] [Created] (NUTCH-2371) Injector to support noFilter and noNormalize

2017-04-06 Thread Markus Jelsma (JIRA)
Markus Jelsma created NUTCH-2371: Summary: Injector to support noFilter and noNormalize; Key: NUTCH-2371; URL: https://issues.apache.org/jira/browse/NUTCH-2371; Project: Nutch; Issue Type: Bug

[jira] [Commented] (NUTCH-2269) Clean not working after crawl

2017-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958866#comment-15958866 ] Hudson commented on NUTCH-2269: SUCCESS: Integrated in Jenkins build Nutch-trunk #3422 (See

[jira] [Commented] (NUTCH-2269) Clean not working after crawl

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958822#comment-15958822 ] Sebastian Nagel commented on NUTCH-2269: Committed to master/1.x ([d2e60ef|https:

[jira] [Resolved] (NUTCH-2269) Clean not working after crawl

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2269. Resolution: Fixed

[jira] [Updated] (NUTCH-2269) Clean not working after crawl

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2269: Fix Version/s: 2.4

[jira] [Commented] (NUTCH-2269) Clean not working after crawl

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958800#comment-15958800 ] ASF GitHub Bot commented on NUTCH-2269: sebastian-nagel closed pull request #156: f

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958788#comment-15958788 ] Hudson commented on NUTCH-2335: SUCCESS: Integrated in Jenkins build Nutch-trunk #3421 (See

[jira] [Assigned] (NUTCH-2281) Support non-default FileSystem

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2281: Assignee: Sebastian Nagel

[jira] [Resolved] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2335. Resolution: Fixed. Assignee: Sebastian Nagel. Merged into master, 37d8aea. Thanks!

[jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958764#comment-15958764 ] ASF GitHub Bot commented on NUTCH-2335: sebastian-nagel closed pull request #158: N

[jira] [Commented] (NUTCH-2351) Log with Generic Class Name at Nutch 2.x

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958749#comment-15958749 ] ASF GitHub Bot commented on NUTCH-2351: sebastian-nagel commented on issue #171: NU

[jira] [Commented] (NUTCH-2351) Log with Generic Class Name at Nutch 2.x

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958748#comment-15958748 ] ASF GitHub Bot commented on NUTCH-2351: sebastian-nagel closed pull request #171: N

[jira] [Commented] (NUTCH-2281) Support non-default FileSystem

2017-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958729#comment-15958729 ] Hudson commented on NUTCH-2281: SUCCESS: Integrated in Jenkins build Nutch-trunk #3420 (See

[jira] [Commented] (NUTCH-2336) SegmentReader to implement Tool

2017-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958730#comment-15958730 ] Hudson commented on NUTCH-2336: SUCCESS: Integrated in Jenkins build Nutch-trunk #3420 (See

[jira] [Resolved] (NUTCH-2281) Support non-default FileSystem

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2281. Resolution: Fixed. Merged into master (f046e63). Thanks!

[jira] [Commented] (NUTCH-2281) Support non-default FileSystem

2017-04-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958684#comment-15958684 ] ASF GitHub Bot commented on NUTCH-2281: sebastian-nagel closed pull request #119: N

[jira] [Updated] (NUTCH-2071) A parser failure on a single document may fail crawling job

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2071: Affects Version/s: 1.11

[jira] [Commented] (NUTCH-2071) A parser failure on a single document may fail crawling job

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958660#comment-15958660 ] Sebastian Nagel commented on NUTCH-2071: - caused by a library/dependency conflict

[jira] [Resolved] (NUTCH-2319) Link with "rel=alternate" doesn't return in crawl

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2319. Resolution: Not A Problem Hi [~zbhatuk], please reopen if the problem persists. It's not a b

[jira] [Commented] (NUTCH-2365) HTTP Redirects to SubDomains don't get crawled

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958608#comment-15958608 ] Sebastian Nagel commented on NUTCH-2365: See also [thread on user mailing list|ht

[jira] [Updated] (NUTCH-2365) HTTP Redirects to SubDomains don't get crawled

2017-04-06 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2365: Fix Version/s: 1.14