What application are you running? Here are a few things to check:

- You will hit a CPU bottleneck if you are doing complex computation
(like parsing JSON, etc.).
- You will hit a memory bottleneck if the data/objects used in your
program are large (like building HashMaps inside your map* operations).
Here you can set spark.executor.memory to a higher number, and you can
also change spark.storage.memoryFraction, whose default value is 0.6 of
your executor memory (see the first sketch after this list).
- Network will be a bottleneck if data is not available locally on one
of the workers and hence has to be fetched from others, which means a
lot of serialization and data transfer across your cluster (the second
sketch below touches on this).
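For the memory knobs above, here is a minimal sketch (assuming Spark
1.x, where spark.storage.memoryFraction still applies; the app name and
the 20g figure are placeholders, not values from your setup):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.SparkContext._  // pair-RDD ops on Spark < 1.3

  val conf = new SparkConf()
    .setAppName("WordPairCount")                 // placeholder name
    .set("spark.executor.memory", "20g")         // heap per executor
    .set("spark.storage.memoryFraction", "0.6")  // default 0.6; raise it if you cache a lot
  val sc = new SparkContext(conf)

And since shuffles are where the serialization and network cost shows
up, it usually pays to aggregate map-side. A sketch of word-pair
counting with reduceByKey (which combines counts on each node before
shuffling, unlike groupByKey), assuming an RDD[String] of sentences;
the HDFS path is a placeholder:

  // Emit one ((wordA, wordB), 1) per co-occurring pair in a sentence,
  // then sum the counts; only partial sums cross the network.
  val sentences = sc.textFile("hdfs:///path/to/corpus")  // placeholder
  val pairCounts = sentences
    .flatMap { s =>
      val words = s.split("\\s+")
      for (a <- words; b <- words if a < b) yield ((a, b), 1)
    }
    .reduceByKey(_ + _)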

Thanks
Best Regards

On Tue, Feb 17, 2015 at 11:20 AM, Julaiti Alafate <jalaf...@eng.ucsd.edu>
wrote:

> Hi there,
>
> I am trying to scale up the data size that my application is handling.
> This application is running on a cluster with 16 slave nodes. Each slave
> node has 60GB of memory. It is running in standalone mode. The data
> comes from HDFS, which is also on the same local network.
>
> In order to understand how my program is running, I also have Ganglia
> installed on the cluster. From a previous run, I know the stage that
> takes the longest time is counting word pairs (my RDD consists of
> sentences from a corpus). My goal is to identify the bottleneck of my
> application, then modify my program or hardware configuration
> accordingly.
>
> Unfortunately, I didn't find much information on Spark monitoring and
> optimization topics. Reynold Xin gave a great talk at Spark Summit 2014
> on application tuning from a task perspective. Basically, his focus is
> on tasks that are oddly slower than the average. However, it didn't
> solve my problem, because there are no tasks that run much slower than
> the others in my case.
>
> So I tried to identify the bottleneck from a hardware perspective. I
> want to know what the limitation of the cluster is. I think if the
> executors are working hard, either CPU, memory, or network bandwidth
> (or some combination) should be hitting the roof. But Ganglia reports
> that the CPU utilization of the cluster is no more than 50%, and
> network utilization is high for several seconds at the beginning, then
> drops close to 0. From the Spark UI, I can see that the node with the
> maximum memory usage is consuming around 6GB, while
> "spark.executor.memory" is set to 20GB.
>
> I am very confused that the program is not running fast enough while
> the hardware resources are not in short supply. Could you please give
> me some hints about what determines the performance of a Spark
> application from a hardware perspective?
>
> Thanks!
>
> Julaiti
>
>
