[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355396#comment-15355396 ]

ASF GitHub Bot commented on NUTCH-2234:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/nutch/pull/118

> Upgrade to elasticsearch 2.1.1
> ------------------------------
>
>                 Key: NUTCH-2234
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2234
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.11
>            Reporter: Tien Nguyen Manh
>            Assignee: Markus Jelsma
>             Fix For: 1.13
>
>         Attachments: NUTCH-2234.patch
>
>
> Currently we use elasticsearch 1.x. We should upgrade to 2.x.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301990#comment-15301990 ]

ASF GitHub Bot commented on NUTCH-2234:
---------------------------------------

Github user naegelejd commented on a diff in the pull request:

    https://github.com/apache/nutch/pull/118#discussion_r64734526

    --- Diff: ivy/ivy.xml ---
    @@ -105,6 +105,10 @@
    +
    --- End diff --

    The Tomcat dependencies were previously pulled in by Hadoop 2.4.0. They are needed for the protocol-http[client] JUnit tests using JSP.
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301536#comment-15301536 ]

ASF GitHub Bot commented on NUTCH-2234:
---------------------------------------

Github user lewismc commented on a diff in the pull request:

    https://github.com/apache/nutch/pull/118#discussion_r64692054

    --- Diff: ivy/ivy.xml ---
    @@ -105,6 +105,10 @@
    +
    --- End diff --

    Why are these Tomcat dependencies added?
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300627#comment-15300627 ]

ASF GitHub Bot commented on NUTCH-2234:
---------------------------------------

GitHub user naegelejd opened a pull request:

    https://github.com/apache/nutch/pull/118

    fix for NUTCH-2234 and NUTCH-2236

    Upgrade Elasticsearch and Lucene dependencies, which, in turn, requires updates to Guava and Hadoop dependencies:

    - Elasticsearch 1.4.1 -> Elasticsearch 2.3.3
    - Lucene 4.10.2 -> 5.5.0
    - Guava 16.0.1 -> Guava 18.0
    - Hadoop 2.4.0 -> 2.7.2

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/naegelejd/nutch NUTCH-2234

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #118

commit 31e738a014576d8a4d4c8e8d3a0fc8d9fe5f8077
Author: Joseph Naegele
Date:   2016-05-25T18:27:31Z

    fix for NUTCH-2234 and NUTCH-2236

    upgrades Elasticsearch and Lucene dependencies, which, in turn, requires updates to Guava and Hadoop dependencies:

    - Elasticsearch 1.4.1 -> Elasticsearch 2.3.3
    - Lucene 4.10.2 -> 5.5.0
    - Guava 16.0.1 -> Guava 18.0
    - Hadoop 2.4.0 -> 2.7.2
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300590#comment-15300590 ]

Joseph Naegele commented on NUTCH-2234:
---------------------------------------

Understood. The update to Lucene analyzers requires minor programmatic API changes in scoring-similarity, but nothing big. None of the indexers have tests, so I'm testing indexer-elastic manually for now.

Unfortunately updating Elasticsearch breaks the plugin due to differences in guava versions: indexer-elastic depends on guava-18.0, which it declares in its plugin.xml, but guava-16.0.1 is a Nutch-wide dependency (for Hadoop). We avoided this issue in the past by also updating Nutch's Hadoop dependency from 2.4.0 -> 2.7.1, which is why Tien created NUTCH-2246.

I'll open the PR with all aforementioned dependency updates.
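
When two versions of a library such as guava are visible at once, it can help to check which jar a class was actually loaded from. The following is only a generic JVM diagnostic sketch, not part of the patch: the class name `WhichJar` and its `locate` helper are hypothetical, and inside Nutch the result may depend on which plugin classloader performs the lookup.

```java
import java.security.CodeSource;

public class WhichJar {
    /** Returns the location (jar or directory) a class was loaded from, if known. */
    public static String locate(Class<?> cls) {
        // Classes loaded by the bootstrap loader have no CodeSource.
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. a guava class, as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locate(Class.forName(name)));
    }
}
```

Running it with a guava class name on the same classpath as the job should show whether the 16.0.1 or the 18.0 jar is the one being picked up.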
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298924#comment-15298924 ]

Joseph Naegele commented on NUTCH-2234:
---------------------------------------

Hmm, I'm a bit confused. ES 2.3.3 depends on Lucene 5.5.0 libraries. It appears indexer-solr does not depend on Lucene, only Solrj.

lucene-analyzers-common 4.10.2 is a Nutch-wide dependency in ivy/ivy.xml, but it appears to only be used by plugins: indexer-elastic, parsefilter-naivebayes, and scoring-similarity, of which indexer-elastic and parsefilter-naivebayes specify their Lucene dependencies in their own plugin.xml (scoring-similarity appears to rely on lucene-core 4.10.2 being a transitive dependency through lucene-analyzers-common).

Changing the Lucene version in ivy/ivy.xml requires changes to the scoring-similarity plugin, which I think should be its own issue.
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298842#comment-15298842 ]

Lewis John McGibbney commented on NUTCH-2234:
---------------------------------------------

bq. I can update the patch or open a PR on Github.

Please do. Please make sure that you run the tests, as the dependencies have caught us out before. Please also consider that we want to keep indexer-elastic and indexer-solr (and any other indexers) relying upon the same underlying version of Lucene if possible.
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298791#comment-15298791 ]

Joseph Naegele commented on NUTCH-2234:
---------------------------------------

Since this also adds support for multiple, comma-separated Elasticsearch hosts in {{elastic.host}}, the description in {{nutch-default.xml}} should be updated accordingly.

Is there any reason not to update this to use the most recent version of Elasticsearch (2.3.3)? I can update the patch or open a PR on Github.
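
For illustration, a comma-separated {{elastic.host}} value could be split into individual hosts roughly as follows. This is a minimal sketch and not the code from the patch: the class name `ElasticHosts`, the `parse` helper, and the example host names are all hypothetical, and the actual plugin may read the property differently.

```java
import java.util.ArrayList;
import java.util.List;

public class ElasticHosts {
    /** Splits a comma-separated host list, trimming whitespace and dropping empty entries. */
    public static List<String> parse(String value) {
        List<String> hosts = new ArrayList<>();
        if (value == null) {
            return hosts; // property not set: no hosts configured
        }
        for (String part : value.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Hypothetical property value with inconsistent spacing.
        System.out.println(parse("es1.example.org, es2.example.org,es3.example.org"));
        // prints [es1.example.org, es2.example.org, es3.example.org]
    }
}
```

Trimming each entry means a value written with or without spaces after the commas yields the same host list, which is worth stating in the property description.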
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171264#comment-15171264 ]

Tien Nguyen Manh commented on NUTCH-2234:
-----------------------------------------

elasticsearch 2.1.1 uses httpclient 4.3.6

> Upgrade to elasticsearch 2.1.1
> ------------------------------
>
>                 Key: NUTCH-2234
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2234
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.11
>            Reporter: Tien Nguyen Manh
>            Assignee: Markus Jelsma
>             Fix For: 1.12
>
>         Attachments: NUTCH-2234.patch
>
>
> Currently we use elasticsearch 1.x. We should upgrade to 2.x.
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169521#comment-15169521 ]

Lewis John McGibbney commented on NUTCH-2234:
---------------------------------------------

Out of curiosity, what versions of httpcore and httpclient does 2.X use?
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169182#comment-15169182 ]

Markus Jelsma commented on NUTCH-2234:
--------------------------------------

Nice! I'll get this in once I have that Git thing sorted out.
[jira] [Commented] (NUTCH-2234) Upgrade to elasticsearch 2.1.1
[ https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169167#comment-15169167 ]

Otis Gospodnetic commented on NUTCH-2234:
-----------------------------------------

+1, works for us.

> Upgrade to elasticsearch 2.1.1
> ------------------------------
>
>                 Key: NUTCH-2234
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2234
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.11
>            Reporter: Tien Nguyen Manh
>         Attachments: NUTCH-2234.patch
>
>
> Currently we use elasticsearch 1.x, We should upgrade to 2.x