Best Regards,
Raymond Liu
-----Original Message-----
From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Wednesday, April 30, 2014 1:22 PM
To: user@spark.apache.org
Subject: Re: How fast would you expect shuffle serialize to be?
Hm - I'm still not sure if you mean 100MB/s for each task, or 100MB/s in total across all the tasks.
Hi
I am running a WordCount program which counts words from HDFS, and I
noticed that the serializer part of the code takes a lot of CPU time. On a
16-core/32-thread node, the total throughput is around 50MB/s with JavaSerializer,
and if I switch to KryoSerializer, it roughly doubles to around 100MB/s.
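For reference, the JavaSerializer/KryoSerializer switch discussed above is controlled by the spark.serializer setting; a minimal spark-defaults.conf sketch (property name per the Spark configuration documentation):

```
spark.serializer  org.apache.spark.serializer.KryoSerializer
```

The same value can also be set on a SparkConf before creating the context.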
Is this the serialization throughput per task or the serialization
throughput for all the tasks?
On Tue, Apr 29, 2014 at 9:34 PM, Liu, Raymond raymond@intel.com wrote:
For all the tasks, say 32 tasks in total.
Best Regards,
Raymond Liu
By the way, to be clear: I run repartition first to make all the data go through
the shuffle, instead of running reduceByKey etc. directly (which would reduce the data
that needs to be shuffled and serialized), so all of the 50MB/s of data from HDFS goes
to the serializer. (In fact, I also tried generating the data in memory.)
In the latter case, the throughput is the total aggregated from all cores.
Best Regards,
Raymond Liu
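For anyone who wants a rough point of comparison outside Spark, here is a minimal stand-alone sketch that times plain java.io object serialization on a WordCount-like list of strings (the workload, sizes, and class name are assumptions for illustration; absolute MB/s will vary with JVM and hardware):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Single-thread micro-benchmark of java.io serialization throughput,
// one way to sanity-check the MB/s figures discussed in this thread.
public class SerThroughput {
    public static void main(String[] args) throws IOException {
        // Repetitive short strings, loosely resembling a WordCount input.
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            words.add("word" + (i % 1000));
        }
        long start = System.nanoTime();
        long bytes = 0;
        for (int round = 0; round < 10; round++) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(words);
            }
            bytes += bos.size(); // serialized size of this round
        }
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("java.io serialization throughput: %.1f MB/s%n",
                bytes / 1e6 / secs);
    }
}
```

Multiplying a per-core number like this by the number of concurrently running tasks gives a ballpark for the aggregate figure being discussed.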