Re: Identify the performance bottleneck from hardware perspective

2015-03-05 Thread jalafate
Hi David,

It is a great point, and it is actually one of the reasons my program was
slow. I found that the major cause of the slowness was the huge garbage
collection time: I created too many small objects in the map step, which
triggered GC frequently. After I improved my program to create fewer
objects, performance got much better.
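
As a sketch of the kind of change that helped (simplified, not my exact
code; assume sentences is an RDD[String]):

    // Before: emits one short-lived tuple per word pair, creating
    // heavy GC pressure
    val pairCounts = sentences.flatMap { s =>
      val words = s.split(" ")
      for (a <- words; b <- words if a != b) yield ((a, b), 1)
    }.reduceByKey(_ + _)

    // After: pre-aggregate into one mutable map per partition, so far
    // fewer temporary objects are allocated before the shuffle
    import scala.collection.mutable
    val pairCounts2 = sentences.mapPartitions { iter =>
      val counts = mutable.HashMap.empty[(String, String), Int]
      for (s <- iter; words = s.split(" "); a <- words; b <- words if a != b)
        counts((a, b)) = counts.getOrElse((a, b), 0) + 1
      counts.iterator
    }.reduceByKey(_ + _)

The second version does the same computation but allocates far fewer
temporary objects, which is the general idea behind my fix.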

Here are two videos that may help other people who are also struggling to
find the bottleneck of their Spark applications.

1. A Deeper Understanding of Spark Internals - Aaron Davidson (Databricks)
http://youtu.be/dmL0N3qfSc8

2. Spark Summit 2014 - Advanced Spark Training - Advanced Spark Internals
and Tuning
http://youtu.be/HG2Yd-3r4-M

I personally learned a lot from the points mentioned in the two videos
above.

In practice, I monitor CPU user time, CPU idle time (if disk IO is the
bottleneck, CPU idle time should be significant), memory usage, network IO,
and garbage collection time per task (shown on the Spark web UI). Ganglia
is helpful for monitoring CPU, memory, and network IO.
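
If the per-task GC time column is not detailed enough, one option (assuming
you can restart the application) is the GC logging flags suggested in the
Spark tuning guide, set via spark.executor.extraJavaOptions, e.g. in
spark-defaults.conf:

    spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

The GC activity then shows up in each executor's stdout log on the worker
nodes.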

Best,
Julaiti



On Thu, Mar 5, 2015 at 1:39 AM, davidkl [via Apache Spark User List] <
ml-node+s1001560n21927...@n3.nabble.com> wrote:

> Hello Julaiti,
>
> Maybe I am just asking the obvious :-) but did you check disk IO?
> Depending on what you are doing, that could be the bottleneck.
>
> In my case, none of the HW resources was a bottleneck, but I was using
> some distributed features that were blocking execution (e.g. Hazelcast).
> Could that be your case as well?
>
> Regards
>





Identify the performance bottleneck from hardware perspective

2015-02-16 Thread jalafate
Hi there,

I am trying to scale up the data size that my application handles. The
application runs on a cluster with 16 slave nodes; each slave node has 60GB
of memory. It is running in standalone mode. The data comes from HDFS,
which is also on the same local network.

In order to understand how my program runs, I also installed Ganglia on
the cluster. From previous runs, I know that the stage taking the longest
time is counting word pairs (my RDD consists of sentences from a corpus).
My goal is to identify the bottleneck of my application, and then modify my
program or hardware configuration accordingly.
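
For concreteness, the stage in question looks roughly like this
(simplified, not my exact code; assume sentences is an RDD[String] loaded
from HDFS):

    val pairCounts = sentences
      .flatMap { s =>
        val words = s.split(" ")
        for (a <- words; b <- words if a != b) yield ((a, b), 1)
      }
      .reduceByKey(_ + _)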

Unfortunately, I didn't find much information on Spark monitoring and
optimization topics. Reynold Xin gave a great talk at Spark Summit 2014 on
application tuning from the tasks perspective; his focus was on tasks that
are oddly slower than the average. However, that didn't solve my problem,
because in my case there are no tasks that run much slower than the others.

So I tried to identify the bottleneck from the hardware perspective. I
want to know what the limiting resource of the cluster is. I think that if
the executors were running hard, then CPU, memory, or network bandwidth (or
some combination of them) would be hitting its ceiling. But Ganglia reports
that the CPU utilization of the cluster is no more than 50%, and that
network utilization is high for several seconds at the beginning, then
drops close to 0. From the Spark UI, I can see that the node with the
maximum memory usage consumes around 6GB, while "spark.executor.memory" is
set to 20GB.

I am very confused that the program is not running fast enough while no
hardware resource is in shortage. Could you please give me some hints about
what determines the performance of a Spark application from the hardware
perspective?

Thanks!

Julaiti




-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org