Hi Matei,
thank you for answering.
According to what you said, am I mistaken in thinking that tuples with the
same key might eventually be spread across more than one node if an
overloaded worker can no longer accept tuples?
In other words, suppose a worker (processing key K) cannot accept any more
tuples: will new tuples with key K then be sent to a different worker?
Hi everybody,
is there anything in Spark that shares the philosophy of Storm's fields grouping?
I'd like to manage data partitioning across the workers by sending tuples
that share the same key to the very same worker in the cluster, but I could
not find any method to do that.
Suggestions?
:)
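To make the intent concrete, here is roughly what I would like to express (a
sketch only; HashPartitioner and the partition count of 4 are just my guesses
at the right tools):

    import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

    object KeyGroupingSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("key-grouping-sketch"))

        // (key, value) tuples: every tuple for a given key should end up
        // on the same worker, like Storm's fields grouping.
        val pairs = sc.parallelize(Seq(("userA", 1), ("userB", 2), ("userA", 3)))

        // HashPartitioner routes all tuples with the same key to the
        // same partition, and hence to the same worker.
        val grouped = pairs.partitionBy(new HashPartitioner(4))

        // Show which partition each tuple landed in.
        grouped.mapPartitionsWithIndex { (idx, it) =>
          it.map { case (k, v) => s"partition $idx -> ($k, $v)" }
        }.collect().foreach(println)

        sc.stop()
      }
    }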
This happens automatically when you use the byKey operations, e.g. reduceByKey,
updateStateByKey, etc. Spark Streaming keeps the state for a given set of keys
on a specific node and sends new tuples with that key to that node.
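For instance, a minimal sketch with updateStateByKey (the socket source, batch
interval, and checkpoint path are illustrative; updateStateByKey requires a
checkpoint directory to be set):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("state-by-key-sketch")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("/tmp/spark-checkpoint") // required by updateStateByKey

    // Illustrative source: whitespace-separated records arriving on a
    // socket, keyed by their first field.
    val pairs = ssc.socketTextStream("localhost", 9999)
      .map(line => (line.split(" ")(0), 1))

    // Running count per key. The state for a given key lives on one node,
    // and new tuples with that key are sent to it.
    val counts = pairs.updateStateByKey[Int] { (values: Seq[Int], state: Option[Int]) =>
      Some(values.sum + state.getOrElse(0))
    }
    counts.print()

    ssc.start()
    ssc.awaitTermination()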
Matei
On Jun 3, 2015, at 6:31 AM, allonsy luke1...@gmail.com wrote: