Yes, I realize that there's the standard way, and then there's the way where the client asks "how fast can it write the data?". That is what I'm trying to figure out. At the moment I'm far from the disks' theoretical write speed when combining all the disks together.
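For the raw GB/sec question, a minimal local sketch of a write-throughput measurement might look like the following. This is plain Python rather than Spark, and the chunk size, temp-file path, and 64 MB test size are arbitrary assumptions for illustration; it times a chunked write including an fsync so the flush-to-disk cost is counted.

```python
import os
import tempfile
import time

def measure_write_throughput(path, total_bytes, chunk_size=4 * 1024 * 1024):
    """Write total_bytes of dummy data in chunks and return GB/sec."""
    chunk = b"x" * chunk_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            f.write(chunk)
            written += chunk_size
        f.flush()
        os.fsync(f.fileno())  # include the flush to disk in the timing
    elapsed = time.perf_counter() - start
    return (written / 1e9) / elapsed

# 64 MB test write to a temporary file (small stand-in for a real benchmark)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
rate = measure_write_throughput(path, 64 * 1024 * 1024)
os.remove(path)
print(f"{rate:.2f} GB/sec")
```

Note that a single-file, single-process test like this measures one spindle/path; aggregating across all disks and workers (as in the question) would mean running one such writer per worker and summing the rates.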
On 05 Apr 2016, at 23:21, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

So that's throughput per second. You can try Spark Streaming saving it to HDFS and increase the throttle. The generally accepted approach is to measure service time, i.e. the average service time for IO requests in ms.

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 5 April 2016 at 20:56, Jan Holmberg <jan.holmb...@perigeum.fi> wrote:

I'm trying to get a rough estimate of how much data I can write within a certain time period (GB/sec).

-jan

On 05 Apr 2016, at 22:49, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi Jan,

What is the definition of a stress test here? What are the metrics? Throughput of data, latency, velocity, volume?

HTH

On 5 April 2016 at 20:42, Jan Holmberg <jan.holmb...@perigeum.fi> wrote:

Hi,

I'm trying to figure out how to write lots of data from each worker. I tried rdd.saveAsTextFile but got an OOM when generating a 1024 MB string for a worker. Increasing worker memory would mean that I should drop the number of workers. So, any idea how to write e.g. a 1 GB file from each worker?

cheers,
-jan

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
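One way to avoid the OOM from building a single 1024 MB string is to stream fixed-size records lazily instead of concatenating them; inside Spark the same idea could be applied by having rdd.mapPartitions yield lines that saveAsTextFile writes out incrementally. A plain-Python sketch of the principle (the 1 MB total and 1 KB record size here are placeholder assumptions, scaled down from the 1 GB in the question):

```python
import os
import tempfile

def records(total_bytes, record_size=1024):
    """Lazily yield fixed-size text records totalling total_bytes.

    Only one record is ever held in memory, so total_bytes can be
    arbitrarily large without growing the heap.
    """
    line = b"x" * (record_size - 1) + b"\n"
    for _ in range(total_bytes // record_size):
        yield line

# Write 1 MB here as a stand-in; per-worker 1 GB would just change the total.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    for rec in records(1024 * 1024):
        f.write(rec)

size = os.path.getsize(path)
os.remove(path)
print(size)  # 1048576
```

The memory footprint stays at one record regardless of the total written, which is the property the 1024 MB string was missing.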