Spark Streaming with Redis? Working with a large number of model objects at Spark compute nodes.

2014-06-16 Thread tnegi
We are creating a real-time stream processing system with Spark Streaming that applies a large number (millions) of analytic models to RDDs in many different types of streams. Since we do not know which Spark node will process a specific RDD, we need to make these models available at each

Number of Spark streams in YARN cluster

2014-06-11 Thread tnegi
Hi, I am trying to get a sense of the number of streams we can process in parallel on a Spark Streaming cluster (Hadoop YARN). Is there any benchmark for this? We need a large number of streams (original + transformed) to be processed in parallel. The number is approximately 30,.