[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2018-02-07 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356095#comment-16356095 ] Noble Paul commented on SOLR-5069: -- Streaming API is the way to go > MapReduce for SolrCloud >

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552175#comment-14552175 ] Noble Paul commented on SOLR-5069: -- Is there some low hanging fruit that we can achieve

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Joel Bernstein
If you're going to be shuffling data to multiple worker nodes then data will be crossing the network. Shuffling provides the foundation for certain parallel computing tasks, such as performing large-scale parallel relational algebra. For machine learning algorithms we'll likely need a parallel
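
To illustrate the shuffling idea mentioned above: a minimal sketch in Python, assuming records are plain dicts and that each shard is just a list of them. The names `partition`, `shuffle`, and the sample data are hypothetical and are not part of Solr or its Streaming API. It only shows why shuffling moves data over the network: every record is routed to the worker that owns its key, so records sharing a key meet on one node.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_workers: int) -> int:
    # Use a stable hash so the same key always maps to the same worker.
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle(shards, key_field, num_workers):
    """Route every record from every shard to the worker that owns its key."""
    workers = defaultdict(list)
    for shard in shards:
        for record in shard:
            workers[partition(record[key_field], num_workers)].append(record)
    return dict(workers)


# Hypothetical sample data: two shards holding click records.
shards = [
    [{"user": "a", "clicks": 1}, {"user": "b", "clicks": 2}],
    [{"user": "a", "clicks": 3}, {"user": "c", "clicks": 4}],
]
by_worker = shuffle(shards, "user", num_workers=2)
```

Once records with the same key sit on the same worker, per-key operations such as joins or group-bys can run there without further coordination.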

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Noble Paul
On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote: On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote: The problem with streaming is data locality. Data needs to be transferred across network to do the processing Nothing saying that you can't

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Yonik Seeley
On Wed, May 20, 2015 at 12:04 PM, Noble Paul noble.p...@gmail.com wrote: On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote: On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote: The problem with streaming is data locality. Data needs to be transferred

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Noble Paul
On Wed, May 20, 2015 at 10:17 PM, Yonik Seeley ysee...@gmail.com wrote: On Wed, May 20, 2015 at 12:04 PM, Noble Paul noble.p...@gmail.com wrote: On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote: On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Joel Bernstein
The Streaming Expressions language is a DSL to process docs and emit processed data. The parallel SQL engine will also fit into this category. Both of these languages compile to the Streaming API, which is basically a real-time map-reduce framework that runs on SolrCloud worker nodes. The

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Noble Paul
Joel, is this ticket an attempt to solve that? SOLR-7560 On Wed, May 20, 2015 at 11:08 PM, Joel Bernstein joels...@gmail.com wrote: The Streaming Expressions language is a DSL to process docs and emit processed data. The parallel SQL engine will also fit into this category. Both of these

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Joel Bernstein
SOLR-7560 will provide a parallel SQL engine for SolrCloud. It's designed to run interactive SQL queries across large clusters of servers. This is one of the core big data use cases. Joel Bernstein http://joelsolr.blogspot.com/ On Wed, May 20, 2015 at 7:07 PM, Noble Paul noble.p...@gmail.com

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Noble Paul
The problem with streaming is data locality. Data needs to be transferred across network to do the processing On May 20, 2015 8:15 PM, Yonik Seeley (JIRA) j...@apache.org wrote: [

Re: [jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Yonik Seeley
On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote:
> The problem with streaming is data locality. Data needs to be transferred across network to do the processing

Nothing saying that you can't process data before it's streamed out, right?

-Yonik
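
One way to read the point about processing data before it is streamed out: a minimal sketch in Python, assuming each shard can collapse its own documents into partial results locally so that only the partials cross the network. The functions `local_aggregate` and `merge` and the sample documents are hypothetical illustrations, not Solr APIs.

```python
from collections import Counter


def local_aggregate(shard_docs, field):
    # Runs where the data lives: collapse documents into per-value counts.
    return Counter(doc[field] for doc in shard_docs)


def merge(partials):
    # Runs on the node that receives the (much smaller) partial results.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical sample data: two shards of documents with a "lang" field.
shards = [
    [{"lang": "en"}, {"lang": "en"}, {"lang": "de"}],
    [{"lang": "en"}, {"lang": "fr"}],
]
partials = [local_aggregate(docs, "lang") for docs in shards]
totals = merge(partials)
```

The trade-off is that only operations that can be split into a local step and a merge step (counts, sums, min/max) benefit this way; anything that needs all records for a key in one place still requires moving them.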

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-20 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552414#comment-14552414 ] Yonik Seeley commented on SOLR-5069: Looks like SOLR-6526 (Solr Streaming API) is

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2015-05-18 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549383#comment-14549383 ] Markus Jelsma commented on SOLR-5069: - [~ab] anything new to add to this topic? I am

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2014-02-25 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911597#comment-13911597 ] ASF subversion and git services commented on SOLR-5069: --- Commit

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2014-02-25 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911600#comment-13911600 ] ASF subversion and git services commented on SOLR-5069: --- Commit

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-26 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720869#comment-13720869 ] Andrzej Bialecki commented on SOLR-5069: - See here for an explanation how this

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719540#comment-13719540 ] Otis Gospodnetic commented on SOLR-5069: This is great to see - I asked about this

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719550#comment-13719550 ] Noble Paul commented on SOLR-5069: -- bq.Would it be better to mark some nodes as eligible

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719554#comment-13719554 ] Otis Gospodnetic commented on SOLR-5069: bq. Instead of marking certain nodes as

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719556#comment-13719556 ] Noble Paul commented on SOLR-5069: -- bq.But if searches are running on all nodes, then the

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719563#comment-13719563 ] Otis Gospodnetic commented on SOLR-5069: bq. It should be something we should think

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719593#comment-13719593 ] Yonik Seeley commented on SOLR-5069: bq. It should be something we should think of as a

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719612#comment-13719612 ] Andrzej Bialecki commented on SOLR-5069: - bq. some things will be completely

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-25 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719654#comment-13719654 ] Andrzej Bialecki commented on SOLR-5069: - An alternative solution for minimizing

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718058#comment-13718058 ] Andrzej Bialecki commented on SOLR-5069: - bq. why can't reduce start as soon as

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Lukas Vlcek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718068#comment-13718068 ] Lukas Vlcek commented on SOLR-5069: --- Hello, may be OT but in spite of the fact that

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718067#comment-13718067 ] Noble Paul commented on SOLR-5069: -- I guess I haven't explained correctly. The reducer

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718152#comment-13718152 ] Noble Paul commented on SOLR-5069: -- bq.MR tasks can be both RAM and IO (disk,network)

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718434#comment-13718434 ] Andrzej Bialecki commented on SOLR-5069: - bq. The reducer output is available only

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Lukas Vlcek (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718518#comment-13718518 ] Lukas Vlcek commented on SOLR-5069: --- [~porqpine]: Well, I see the point. From the user

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718578#comment-13718578 ] Noble Paul commented on SOLR-5069: -- bq.that's why I suggested that this framework should

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718762#comment-13718762 ] Yonik Seeley commented on SOLR-5069: Awesome stuff Noble! bq. why can't reduce start

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-24 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718785#comment-13718785 ] Eks Dev commented on SOLR-5069: --- wow, this is getting pretty close to collection clustering

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-23 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716486#comment-13716486 ] Andrzej Bialecki commented on SOLR-5069: - Exciting idea! Almost as exciting as

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-23 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716624#comment-13716624 ] Noble Paul commented on SOLR-5069: -- Thanks Andrzej I started off with a simple model so

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-23 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716655#comment-13716655 ] Andrzej Bialecki commented on SOLR-5069: - bq. Sure, here the idea is to do some

[jira] [Commented] (SOLR-5069) MapReduce for SolrCloud

2013-07-23 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717922#comment-13717922 ] Noble Paul commented on SOLR-5069: -- bq.reduce() can start only when all mappers are