Re: Equivalent to Storm's 'field grouping' in Spark.

2015-06-04 Thread luke89
Hi Matei,

thank you for answering.

According to what you said, am I wrong in thinking that tuples with the
same key might eventually be spread across more than one node if an
overloaded worker can no longer accept tuples?
In other words, suppose a worker (processing key K) cannot accept any more
tuples: how does Spark Streaming handle the remaining K-keyed tuples? Systems
like Storm do not provide any mechanism to handle such a situation.

I am pretty new to Spark, and I apologize if the question sounds too naive,
but I am having some trouble understanding Spark internals!

Thank you, again!



2015-06-03 19:34 GMT+02:00 Matei Zaharia matei.zaha...@gmail.com:

 This happens automatically when you use the byKey operations, e.g.
 reduceByKey, updateStateByKey, etc. Spark Streaming keeps the state for a
 given set of keys on a specific node and sends new tuples with that key to
 that node.

 Matei

  On Jun 3, 2015, at 6:31 AM, allonsy luke1...@gmail.com wrote:
 
  Hi everybody,
  does Spark have anything similar in philosophy to Storm's field grouping?
 
  I'd like to control data partitioning across the workers by sending tuples
  that share the same key to the same worker in the cluster, but I could not
  find any method to do that.
 
  Suggestions?
 
  :)
 
 
 




Equivalent to Storm's 'field grouping' in Spark.

2015-06-03 Thread allonsy
Hi everybody,
does Spark have anything similar in philosophy to Storm's field grouping?

I'd like to control data partitioning across the workers by sending tuples
that share the same key to the same worker in the cluster, but I could not
find any method to do that.

Suggestions?

:)



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Equivalent-to-Storm-s-field-grouping-in-Spark-tp23135.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Equivalent to Storm's 'field grouping' in Spark.

2015-06-03 Thread Matei Zaharia
This happens automatically when you use the byKey operations, e.g. reduceByKey,
updateStateByKey, etc. Spark Streaming keeps the state for a given set of keys
on a specific node and sends new tuples with that key to that node.
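
For example, here is a minimal (untested) sketch with updateStateByKey; the
socket source and the running-count state are illustrative assumptions, not
anything specific to your setup:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StatefulCount")
    val ssc = new StreamingContext(conf, Seconds(1))
    // updateStateByKey needs a checkpoint directory for the key state.
    ssc.checkpoint("/tmp/spark-checkpoint")

    // Hypothetical source: lines of "key value" pairs from a socket.
    val pairs = ssc.socketTextStream("localhost", 9999)
      .map(_.split(" "))
      .map(fields => (fields(0), fields(1).toInt))

    // Merge this batch's values for a key with the previously stored state.
    val updateFunc = (newValues: Seq[Int], state: Option[Int]) =>
      Some(newValues.sum + state.getOrElse(0))

    // The state for each key lives in one partition; Spark hash-partitions
    // incoming tuples so updates for key K reach the node holding K's state.
    val counts = pairs.updateStateByKey[Int](updateFunc)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}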

Matei
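
P.S. For plain (non-streaming) RDDs you can get the same key-to-worker
routing explicitly with partitionBy; a small sketch with made-up data:

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object FieldGroupingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("FieldGroupingSketch"))

    // Toy key-value data.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // HashPartitioner sends every tuple with the same key to the same
    // partition, the closest RDD analogue of Storm's field grouping.
    val grouped = pairs.partitionBy(new HashPartitioner(4))

    // All tuples for a given key now land in exactly one partition.
    grouped.foreachPartition(iter =>
      iter.foreach { case (k, v) => println(s"$k -> $v") })

    sc.stop()
  }
}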

 On Jun 3, 2015, at 6:31 AM, allonsy luke1...@gmail.com wrote:
 
 Hi everybody,
 does Spark have anything similar in philosophy to Storm's field grouping?
 
 I'd like to control data partitioning across the workers by sending tuples
 that share the same key to the same worker in the cluster, but I could not
 find any method to do that.
 
 Suggestions?
 
 :)
 
 
 
 

