[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970317#comment-15970317 ]

Kaidul Islam commented on NUTCH-2333:

Hi [~roannel], I would like to write the Nutch 2.x version of this plugin. Should I proceed, or do you plan to write it yourself?

> Indexer for RabbitMQ
>
> Key: NUTCH-2333
> URL: https://issues.apache.org/jira/browse/NUTCH-2333
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Affects Versions: 1.12
> Reporter: Roannel Fernández Hernández
> Priority: Minor
> Fix For: 1.14
>
> A plugin to send the documents to a RabbitMQ server.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969918#comment-15969918 ]

Hudson commented on NUTCH-2333:

FAILURE: Integrated in Jenkins build Nutch-trunk #3425 (See [https://builds.apache.org/job/Nutch-trunk/3425/])

Fixes for NUTCH-2333: Added the lines for ant runtime task (gitRoann3l;fhdez: [https://github.com/apache/nutch/commit/5873a24d3845563bd1028f6a27e22438670b4063])
* (edit) src/plugin/build.xml
* (edit) build.xml

Fixes for NUTCH-2333: Added the logic for indexing process (gitRoann3l;fhdez: [https://github.com/apache/nutch/commit/62496aec84cbf889f14175dbf03f0e8a1200ac9c])
* (add) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitDocument.java
* (add) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitMQConstants.java
* (add) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitMessage.java
* (add) src/plugin/indexer-rabbit/plugin.xml
* (add) src/plugin/indexer-rabbit/build-ivy.xml
* (add) src/plugin/indexer-rabbit/build.xml
* (add) src/plugin/indexer-rabbit/ivy.xml
* (add) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitIndexWriter.java

Fixes for NUTCH-2333: Added the properties for RabbitMQ indexer. (gitRoann3l;fhdez: [https://github.com/apache/nutch/commit/594564b27258fbcca68e90e41db801a750d11426])
* (edit) conf/nutch-default.xml

Fixes for NUTCH-2333: Added new properties to indexer (gitRoann3l;fhdez: [https://github.com/apache/nutch/commit/17886f722ff16da0aa29bd059953feca609a5165])
* (edit) conf/nutch-default.xml

Fixes for NUTCH-2333: Corrected some comments in the configuration file (gitRoann3l;fhdez: [https://github.com/apache/nutch/commit/c0af89aeb0e5c9e2059192eac7514cea3825b7e2])
* (edit) conf/nutch-default.xml
* (edit) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitIndexWriter.java
* (edit) src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitMQConstants.java
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969913#comment-15969913 ]

ASF GitHub Bot commented on NUTCH-2333:

lewismc closed pull request #168: fix for NUTCH-2333 contributed by r0ann3l
URL: https://github.com/apache/nutch/pull/168

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828234#comment-15828234 ]

Sebastian Nagel commented on NUTCH-2333:

+1 looks good, although I haven't tested it.

Yes, there is some overlap between indexer-rabbit and publish-rabbitmq, mostly regarding the configuration of and connection to RabbitMQ. Eventually, code could be shared in a lib-rabbitmq plugin, now or as a later improvement.

Implementing indexer-rabbit as a NutchPublisher seems difficult: the IndexWriter and NutchPublisher interfaces are different, especially in how objects are serialized (a specific object, "NutchDocument", vs. an unknown but universally JSON-serializable object). Of course, one could think of indexing as an event, but in reality it's likely that different consumers/queues are used for monitoring and for indexing content.

> Indexer for RabbitMQ
>
> Key: NUTCH-2333
> URL: https://issues.apache.org/jira/browse/NUTCH-2333
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Affects Versions: 1.12
> Reporter: Roannel Fernández Hernández
> Priority: Minor
> Fix For: 1.13
>
> A plugin to send the documents to a RabbitMQ server.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816791#comment-15816791 ]

ASF GitHub Bot commented on NUTCH-2333:

GitHub user r0ann3l opened a pull request:

    https://github.com/apache/nutch/pull/168

    fix for NUTCH-2333 contributed by r0ann3l

    An indexer for RabbitMQ sends crawled documents to a queue on a RabbitMQ server, just as solr-indexer or elastic-indexer do for their backends. This indexer sends documents to RabbitMQ in batches rather than one by one; the batch size is set by the rabbitmq.indexer.commit.size value.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/r0ann3l/nutch NUTCH-2333

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/168.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #168

commit 5873a24d3845563bd1028f6a27e22438670b4063
Author: r0ann3l
Date: 2016-11-02T21:40:01Z

    Fixes for NUTCH-2333: Added the lines for ant runtime task

commit 62496aec84cbf889f14175dbf03f0e8a1200ac9c
Author: r0ann3l
Date: 2016-11-03T15:46:06Z

    Fixes for NUTCH-2333: Added the logic for indexing process

commit 594564b27258fbcca68e90e41db801a750d11426
Author: r0ann3l
Date: 2016-11-03T15:47:10Z

    Fixes for NUTCH-2333: Added the properties for RabbitMQ indexer.

commit 17886f722ff16da0aa29bd059953feca609a5165
Author: r0ann3l
Date: 2016-11-03T20:58:38Z

    Fixes for NUTCH-2333: Added new properties to indexer

commit c0af89aeb0e5c9e2059192eac7514cea3825b7e2
Author: r0ann3l
Date: 2017-01-11T01:21:17Z

    Fixes for NUTCH-2333: Corrected some comments in the configuration file and indexer description message.
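The batching behaviour mentioned in the pull request description (documents are sent in groups sized by rabbitmq.indexer.commit.size, not one by one) can be illustrated with a minimal sketch. This is not the code from the pull request: the class name BatchingWriterSketch, its methods, and the Consumer used as a stand-in for the RabbitMQ publish call are all hypothetical, and documents are reduced to plain strings so that the example runs without Nutch or a RabbitMQ client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: buffers documents and hands them to a sink in
 * batches of commitSize. The sink stands in for "publish to RabbitMQ".
 */
final class BatchingWriterSketch {
    private final int commitSize;                 // plays the role of rabbitmq.indexer.commit.size
    private final Consumer<List<String>> sink;    // assumed stand-in for the real publish call
    private final List<String> buffer = new ArrayList<>();

    BatchingWriterSketch(int commitSize, Consumer<List<String>> sink) {
        if (commitSize < 1) {
            throw new IllegalArgumentException("commitSize must be >= 1");
        }
        this.commitSize = commitSize;
        this.sink = sink;
    }

    /** Buffers one document and flushes once the buffer reaches commitSize. */
    void write(String doc) {
        buffer.add(doc);
        if (buffer.size() >= commitSize) {
            commit();
        }
    }

    /** Sends the buffered documents as one batch; does nothing if the buffer is empty. */
    void commit() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // copy, so clearing the buffer does not alter the sent batch
        buffer.clear();
    }

    /** Flushes whatever is left, so a final partial batch is not lost. */
    void close() {
        commit();
    }

    public static void main(String[] args) {
        List<List<String>> sent = new ArrayList<>();
        BatchingWriterSketch writer = new BatchingWriterSketch(2, sent::add);
        for (String doc : new String[] {"a", "b", "c", "d", "e"}) {
            writer.write(doc);
        }
        writer.close();
        System.out.println(sent); // prints [[a, b], [c, d], [e]]
    }
}
```

With a commit size of 2, five documents leave the writer as two full batches plus one partial batch flushed on close. A real index writer would serialize each batch into a message body before publishing it.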
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15635112#comment-15635112 ]

Chris A. Mattmann commented on NUTCH-2333:

Even more so, I would recommend that the work proposed by [~roannel] be integrated directly as a plugin to the existing, already committed NUTCH-2132. cc [~sujenshah]
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15635113#comment-15635113 ]

Chris A. Mattmann commented on NUTCH-2333:

Can you please integrate this into the publisher framework?
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634348#comment-15634348 ]

Roannel Fernández Hernández commented on NUTCH-2333:

No, it is not related. This plugin is like the indexer-solr or indexer-elastic plugins. The idea is to send the documents to a RabbitMQ server during the indexing step.
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634238#comment-15634238 ]

Sujen Shah commented on NUTCH-2333:

Thanks! You may want to check out https://issues.apache.org/jira/browse/NUTCH-2132 to see if it is related to this.
[jira] [Commented] (NUTCH-2333) Indexer for RabbitMQ
[ https://issues.apache.org/jira/browse/NUTCH-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630730#comment-15630730 ]

Sujen Shah commented on NUTCH-2333:

Hi [~roannel], it would be great if you could add some description to this issue. Thanks!