Hm, isn't this just showing that you're rate-limited by how fast you
can get events to the cluster? There is likely more of a network
bottleneck between the data source and the cluster in the cloud than
there is with your local cluster.
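
One way to check this, as a rough sketch only: take Spark out of the
picture and measure how fast a plain socket reader can pull lines from
the generator. The names below (SourceRateCheck, startGenerator, drain,
recordsPerSecond) are made up for illustration, and the generator is a
stand-in for the one you linked, not a copy of it.

```scala
import java.io.{BufferedReader, InputStreamReader, PrintWriter}
import java.net.{ServerSocket, Socket}

// Hypothetical helper, not part of Spark: measures raw socket throughput.
object SourceRateCheck {
  // Records per second for `count` records seen over `elapsedMillis`.
  def recordsPerSecond(count: Long, elapsedMillis: Long): Double =
    if (elapsedMillis <= 0) 0.0 else count * 1000.0 / elapsedMillis

  // Serves `total` lines to the first client that connects, then closes.
  // Stand-in for the real data generator (an assumption for this sketch).
  def startGenerator(server: ServerSocket, total: Int): Thread = {
    val t = new Thread(new Runnable {
      def run(): Unit = {
        val client = server.accept()
        val out = new PrintWriter(client.getOutputStream, false)
        try {
          var i = 0
          while (i < total) { out.println("word" + (i % 100)); i += 1 }
          out.flush()
        } finally { out.close(); client.close() }
      }
    })
    t.setDaemon(true)
    t.start()
    t
  }

  // Reads lines from host:port until the stream ends.
  // Returns (number of lines read, elapsed milliseconds).
  def drain(host: String, port: Int): (Long, Long) = {
    val socket = new Socket(host, port)
    val in = new BufferedReader(new InputStreamReader(socket.getInputStream))
    val start = System.nanoTime()
    var count = 0L
    try {
      while (in.readLine() != null) count += 1
    } finally { in.close(); socket.close() }
    (count, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(0) // port 0: let the OS pick a free port
    try {
      startGenerator(server, 200000)
      val (count, millis) = drain("localhost", server.getLocalPort)
      println(f"Read $count%d records in $millis%d ms: " +
        f"${recordsPerSecond(count, millis)}%.0f records/second")
    } finally server.close()
  }
}
```

To test the cloud case you would run only drain on a cluster node,
pointed at the host and port of your real generator. If that number is
already close to what Spark reports, the source or the network is
probably the limit rather than Spark.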

On Tue, Oct 14, 2014 at 9:44 PM, danilopds <danilob...@gmail.com> wrote:
> Hi,
> I'm learning about Apache Spark Streaming and I'm doing some tests.
>
> Now,
> I have a modified version of the NetworkWordCount app that performs a
> /reduceByKeyAndWindow/ with a 10-second window sliding every 5 seconds.
>
> I'm also using this function to measure the rate in records/second:
> words.foreachRDD(rdd => {
>   val count = rdd.count()
>   println("Current rate: " + (count / 1) + " records/second")
> })
>
> Then,
> On my computer with 4 cores and 8 GB (running /"local[4]"/), I get this
> average result:
> Current rate: 130 000
>
> Running locally with my computer as both /master and worker/, I get this:
> Current rate: 25 000
>
> And running on an Azure cloud instance with 4 cores and 7 GB, the result is:
> Current rate: 10 000
>
> I read the Spark Streaming paper
> <http://www.eecs.berkeley.edu/~matei/papers/2013/sosp_spark_streaming.pdf>
> and the performance evaluation of a similar application reported 250 000
> records/second.
>
> To send data to the socket, I'm using an application similar to this:
> http://apache-spark-user-list.1001560.n3.nabble.com/streaming-code-to-simulate-a-network-socket-data-source-td3431.html#a13814
>
> So,
> can anyone suggest something to improve this rate?
> /(I increased the executor memory and didn't get better results.)/
>
> Thanks!
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/A-question-about-streaming-throughput-tp16416.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>

