[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356095#comment-16356095
]
Noble Paul commented on SOLR-5069:
--
Streaming API is the way to go
> MapReduce for SolrCloud
>
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552175#comment-14552175
]
Noble Paul commented on SOLR-5069:
--
Is there some low hanging fruit that we can achieve
If you're going to be shuffling data to multiple worker nodes then data
will be crossing the network. Shuffling provides the foundation for certain
parallel computing tasks, such as performing large scale parallel
relational algebra.
For machine learning algorithms we'll likely need a parallel
On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote:
On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote:
The problem with streaming is data locality. Data needs to be transferred
across network to do the processing
Nothing saying that you can't
On Wed, May 20, 2015 at 12:04 PM, Noble Paul noble.p...@gmail.com wrote:
On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote:
On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote:
The problem with streaming is data locality. Data needs to be
transferred
On Wed, May 20, 2015 at 10:17 PM, Yonik Seeley ysee...@gmail.com wrote:
On Wed, May 20, 2015 at 12:04 PM, Noble Paul noble.p...@gmail.com wrote:
On Wed, May 20, 2015 at 8:41 PM, Yonik Seeley ysee...@gmail.com wrote:
On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com
The Streaming Expressions language is a DSL to process docs and emit
processed data. The parallel SQL engine will also fit into this category.
Both of these languages compile to the Streaming API which is basically a
real-time map-reduce framework that runs on SolrCloud worker nodes.
The
Joel, is this ticket an attempt to solve that? SOLR-7560
On Wed, May 20, 2015 at 11:08 PM, Joel Bernstein joels...@gmail.com wrote:
The Streaming Expressions language is a DSL to process docs and emit
processed data. The parallel SQL engine will also fit into this category.
Both of these
SOLR-7560 will provide a parallel SQL engine for SolrCloud. It's designed
to run interactive SQL queries across large clusters of servers. This is
one of the core big data use cases.
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, May 20, 2015 at 7:07 PM, Noble Paul noble.p...@gmail.com
The problem with streaming is data locality. Data needs to be transferred
across network to do the processing
On May 20, 2015 8:15 PM, Yonik Seeley (JIRA) j...@apache.org wrote:
[
On Wed, May 20, 2015 at 11:06 AM, Noble Paul noble.p...@gmail.com wrote:
The problem with streaming is data locality. Data needs to be transferred
across network to do the processing
Nothing saying that you can't process data before it's streamed out, right?
-Yonik
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552414#comment-14552414
]
Yonik Seeley commented on SOLR-5069:
Looks like SOLR-6526 (Solr Streaming API) is
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549383#comment-14549383
]
Markus Jelsma commented on SOLR-5069:
-
[~ab] anything new to add to this topic? I am
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911597#comment-13911597
]
ASF subversion and git services commented on SOLR-5069:
---
Commit
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911600#comment-13911600
]
ASF subversion and git services commented on SOLR-5069:
---
Commit
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720869#comment-13720869
]
Andrzej Bialecki commented on SOLR-5069:
-
See here for an explanation how this
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719540#comment-13719540
]
Otis Gospodnetic commented on SOLR-5069:
This is great to see - I asked about this
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719550#comment-13719550
]
Noble Paul commented on SOLR-5069:
--
bq. Would it be better to mark some nodes as eligible
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719554#comment-13719554
]
Otis Gospodnetic commented on SOLR-5069:
bq. Instead of marking certain nodes as
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719556#comment-13719556
]
Noble Paul commented on SOLR-5069:
--
bq. But if searches are running on all nodes, then the
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719563#comment-13719563
]
Otis Gospodnetic commented on SOLR-5069:
bq. It should be something we should think
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719593#comment-13719593
]
Yonik Seeley commented on SOLR-5069:
bq. It should be something we should think of as a
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719612#comment-13719612
]
Andrzej Bialecki commented on SOLR-5069:
-
bq. some things will be completely
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719654#comment-13719654
]
Andrzej Bialecki commented on SOLR-5069:
-
An alternative solution for minimizing
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718058#comment-13718058
]
Andrzej Bialecki commented on SOLR-5069:
-
bq. why can't reduce start as soon as
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718068#comment-13718068
]
Lukas Vlcek commented on SOLR-5069:
---
Hello,
may be OT but in spite of the fact that
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718067#comment-13718067
]
Noble Paul commented on SOLR-5069:
--
I guess I haven't explained correctly.
The reducer
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718152#comment-13718152
]
Noble Paul commented on SOLR-5069:
--
bq. MR tasks can be both RAM and IO (disk, network)
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718434#comment-13718434
]
Andrzej Bialecki commented on SOLR-5069:
-
bq. The reducer output is available only
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718518#comment-13718518
]
Lukas Vlcek commented on SOLR-5069:
---
[~porqpine]: Well, I see the point. From the user
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718578#comment-13718578
]
Noble Paul commented on SOLR-5069:
--
bq. that's why I suggested that this framework should
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718762#comment-13718762
]
Yonik Seeley commented on SOLR-5069:
Awesome stuff Noble!
bq. why can't reduce start
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718785#comment-13718785
]
Eks Dev commented on SOLR-5069:
---
wow, this is getting pretty close to collection clustering
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716486#comment-13716486
]
Andrzej Bialecki commented on SOLR-5069:
-
Exciting idea! Almost as exciting as
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716624#comment-13716624
]
Noble Paul commented on SOLR-5069:
--
Thanks Andrzej
I started off with a simple model so
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716655#comment-13716655
]
Andrzej Bialecki commented on SOLR-5069:
-
bq. Sure, here the idea is to do some
[
https://issues.apache.org/jira/browse/SOLR-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717922#comment-13717922
]
Noble Paul commented on SOLR-5069:
--
bq. reduce() can start only when all mappers are