I am trying to benchmark Spark on a Hadoop cluster.
I need to design sample Spark jobs to measure CPU utilization, RAM usage,
input throughput, output throughput, and duration of execution in the
cluster.

I need to test the state of the cluster with:

A Spark job that uses high CPU
A Spark job that uses high RAM
A Spark job that requires high input throughput
A Spark job that requires high output throughput
A Spark job that takes a long time to run

These have to be tested individually, and combinations of these scenarios
would also be used (rough sketches of two of the cases follow below).
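For the high-CPU case, this is roughly the kind of job I have in mind (a
minimal Scala sketch; the element count, partition count, and per-element
loop size are placeholders I would tune for the cluster):

import org.apache.spark.{SparkConf, SparkContext}

// CPU-bound sketch: each element runs a long arithmetic loop, so
// executors spend their time computing rather than doing I/O.
object CpuBurn {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cpu-burn"))

    val partitions = 200     // placeholder: roughly one per available core
    val perElement = 100000  // placeholder: inner-loop work per element

    val checksum = sc.parallelize(1L to 10000000L, partitions)
      .map { x =>
        // Cheap arithmetic step repeated many times: pure CPU,
        // no allocation, no shuffle.
        var acc = x
        var i = 0
        while (i < perElement) {
          acc = acc * 6364136223846793005L + 1442695040888963407L
          i += 1
        }
        acc
      }
      .reduce(_ ^ _)         // small action that forces the computation

    println(s"checksum = $checksum")
    sc.stop()
  }
}

A high-RAM variant could keep the same shape but cache() a large generated
dataset, or force a wide shuffle (e.g. with groupByKey), so that executor
memory rather than CPU becomes the bottleneck.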

Please help me understand which characteristics of a Spark job drive CPU
utilization, RAM usage, input throughput, output throughput, and duration
of execution in the cluster, so that I can design Spark jobs for these
tests.
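For the throughput cases, I am considering a two-pass job along these
lines (again only a sketch; the HDFS path, record size, and record count
are placeholders): the write pass should stress output throughput and the
read-back pass input throughput, with minimal CPU work per byte.

import org.apache.spark.{SparkConf, SparkContext}

// I/O-bound sketch: a write pass for output throughput, then a read
// pass for input throughput, keeping CPU work per byte as low as possible.
object IoBound {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("io-bound"))

    // Placeholder path; saveAsTextFile fails if it already exists.
    val path = "hdfs:///tmp/spark-bench/data"

    val payload = "x" * 1024              // ~1 KB per record, placeholder
    sc.parallelize(1L to 10000000L, 400)  // ~10 GB in total, placeholder
      .map(i => s"$i,$payload")
      .saveAsTextFile(path)               // output-throughput pass

    val lines = sc.textFile(path).count() // input-throughput pass
    println(s"read back $lines lines")
    sc.stop()
  }
}

A long-running job for the duration test could then simply chain several
of these stages together, or raise the placeholder sizes.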



Thanks
Shalish.
