[jira] [Commented] (NUTCH-2197) Add solr5 solrcloud indexer support

2016-02-15 Thread Arun Kumar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148104#comment-15148104 ] Arun Kumar commented on NUTCH-2197: --- Hi Hudson, would you know if it is applied to main

[jira] [Commented] (NUTCH-2197) Add solr5 solrcloud indexer support

2016-02-15 Thread Arun Kumar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148102#comment-15148102 ] Arun Kumar commented on NUTCH-2197: --- The patch is working with Nutch 1.9 with Solr cloud

[jira] [Commented] (NUTCH-2210) Upgrade to Tika 1.12

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147735#comment-15147735 ] Markus Jelsma commented on NUTCH-2210: -- Apache Tika 1.12 is available. Will upgrade a

[jira] [Updated] (NUTCH-2221) Introduce db.ignore.internal.links to FetcherThread

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2221: - Attachment: NUTCH-2216-NUTCH-2220-NUTCH-2221.patch Patch for trunk. This includes all three patche

[jira] [Updated] (NUTCH-2216) db.ignore.*.links to optionally follow internal redirects

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2216: - Attachment: NUTCH-2216.patch Patch for trunk, introducing db.ignore.treat.redirects.as.links. Sett

[jira] [Updated] (NUTCH-2216) db.ignore.*.links to optionally follow internal redirects

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2216: - Summary: db.ignore.*.links to optionally follow internal redirects (was: ignore.internal.links to

[jira] [Updated] (NUTCH-2221) Introduce db.ignore.internal.links to FetcherThread

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2221: - Attachment: NUTCH-2221.patch Patch for trunk. This includes the modified config of NUTCH-2220, and

[jira] [Updated] (NUTCH-2220) Rename db.* options used only by the linkdb to linkdb.*

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2220: - Patch Info: Patch Available > Rename db.* options used only by the linkdb to linkdb.* > --

[jira] [Updated] (NUTCH-2220) Rename db.* options used only by the linkdb to linkdb.*

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2220: - Attachment: NUTCH-2220.patch Patch for trunk > Rename db.* options used only by the linkdb to lin

[jira] [Updated] (NUTCH-2221) Introduce db.ignore.internal.links to FetcherThread

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2221: - Description: FetcherThread has support for db.ignore.external.links. In config you can find db.ig

[jira] [Updated] (NUTCH-2221) Introduce db.ignore.internal.links to FetcherThread

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2221: - Summary: Introduce db.ignore.internal.links to FetcherThread (was: Introduce db.ignore.external.l

[jira] [Created] (NUTCH-2221) Introduce db.ignore.external.links to FetcherThread

2016-02-15 Thread Markus Jelsma (JIRA)
Markus Jelsma created NUTCH-2221: Summary: Introduce db.ignore.external.links to FetcherThread Key: NUTCH-2221 URL: https://issues.apache.org/jira/browse/NUTCH-2221 Project: Nutch Issue Type:

[jira] [Created] (NUTCH-2220) Rename db.* options used only by the linkdb to linkdb.*

2016-02-15 Thread Markus Jelsma (JIRA)
Markus Jelsma created NUTCH-2220: Summary: Rename db.* options used only by the linkdb to linkdb.* Key: NUTCH-2220 URL: https://issues.apache.org/jira/browse/NUTCH-2220 Project: Nutch Issue T

[jira] [Updated] (NUTCH-2219) Dedup script, allow users to change the order in which main documents are selected.

2016-02-15 Thread Ron van der Vegt (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron van der Vegt updated NUTCH-2219: Attachment: NUTCH-2219.patch > Dedup script, allow users to change the order in which main d

[jira] [Created] (NUTCH-2219) Dedup script, allow users to change the order in which main documents are selected.

2016-02-15 Thread Ron van der Vegt (JIRA)
Ron van der Vegt created NUTCH-2219: --- Summary: Dedup script, allow users to change the order in which main documents are selected. Key: NUTCH-2219 URL: https://issues.apache.org/jira/browse/NUTCH-2219

[jira] [Resolved] (NUTCH-2189) Domain filter must deactivate if no rules are present

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-2189. -- Resolution: Fixed > Domain filter must deactivate if no rules are present >

[jira] [Updated] (NUTCH-2189) Domain filter must deactivate if no rules are present

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2189: - Fix Version/s: 1.12 > Domain filter must deactivate if no rules are present >

[jira] [Updated] (NUTCH-2189) Domain filter must deactivate if no rules are present

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2189: - Affects Version/s: 1.11 > Domain filter must deactivate if no rules are present >

[jira] [Reopened] (NUTCH-2189) Domain filter must deactivate if no rules are present

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma reopened NUTCH-2189: -- Fix version missing > Domain filter must deactivate if no rules are present > -

[jira] [Closed] (NUTCH-2189) Domain filter must deactivate if no rules are present

2016-02-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma closed NUTCH-2189. > Domain filter must deactivate if no rules are present > --

[GitHub] nutch pull request: fix for NUTCH-2217 contributed by dawid.wolski

2016-02-15 Thread merito
Github user merito closed the pull request at: https://github.com/apache/nutch/pull/90

[jira] [Commented] (NUTCH-2217) Crawl pages with specified language

2016-02-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147124#comment-15147124 ] ASF GitHub Bot commented on NUTCH-2217: --- Github user merito closed the pull request