Re: Profile Spark Executors

Oleg Mazurov Fri, 18 Jan 2019 07:49:39 -0800

statsd-jvm-profiler is a very basic profiler written in Java. It's little
better than running jstack in a loop. From that perspective, it can't do
real CPU profiling.
What it presents as such is a collection of stack traces of Java threads in
RUNNABLE state. RUNNABLE doesn't mean actually running on a CPU (the JVM
wouldn't know that), only that the thread is not in BLOCKED, WAITING, or any
other state recognizable by the JVM.
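
To make the "jstack in a loop" idea concrete, here is a minimal sketch of a
sampler that counts leaf frames of threads the JVM reports as RUNNABLE. It is
not statsd-jvm-profiler's code; the class name RunnableSampler, the method
sampleOnce, and the 10 ms interval are assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class RunnableSampler {
    /** One sampling pass: count the leaf frame of every thread reported as RUNNABLE. */
    static void sampleOnce(Map<String, Integer> counts) {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            // The state is read slightly after the stack was captured, so the two
            // can disagree occasionally; a sketch like this simply tolerates that.
            if (e.getKey().getState() == Thread.State.RUNNABLE && stack.length > 0) {
                String leaf = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(leaf, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < 50; i++) {   // hypothetical: 50 samples, 10 ms apart
            sampleOnce(counts);
            Thread.sleep(10);
        }
        counts.forEach((leaf, n) -> System.out.println(n + "\t" + leaf));
    }
}
```

Nothing in this loop can tell whether a RUNNABLE thread was actually on a CPU
when it was sampled, so the counts are only an upper bound on CPU activity.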
Now, a Java thread can be effectively blocked in a system call but still
reported as RUNNABLE by the JVM. This is what happens when a thread, for
example, is reading from a socket: your leaf Java frame will be
sun.nio.ch.EPollArrayWrapper.epollWait(), which is part of the
sun.nio.ch.SelectorImpl.select() implementation, called by socket I/O.
Spark's own threads that can be found in that state are shuffle-server,
shuffle-client, and rpc-client threads. If your application accesses HDFS or
Zookeeper, there will be others.
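
You can check the RUNNABLE-but-blocked behaviour yourself with a small
experiment. The sketch below parks a thread in Selector.select() with no
channels registered and then asks the JVM for its state. The class name
RunnableButBlocked, the thread name, and the 500 ms settle time are
assumptions for illustration only.

```java
import java.nio.channels.Selector;

public class RunnableButBlocked {
    /** What the JVM reported for the thread while it sat in select(). */
    static final class Sample {
        final Thread.State state;
        final String leafFrame;
        Sample(Thread.State state, String leafFrame) {
            this.state = state;
            this.leafFrame = leafFrame;
        }
    }

    static Sample sampleBlockedSelect() throws Exception {
        try (Selector selector = Selector.open()) {
            Thread t = new Thread(() -> {
                try {
                    selector.select(); // nothing registered: waits until wakeup()
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }, "blocked-in-select");
            t.setDaemon(true);
            t.start();
            Thread.sleep(500); // crude: give the thread time to enter the native wait

            Thread.State state = t.getState();
            StackTraceElement[] stack = t.getStackTrace();
            String leaf = stack.length > 0 ? stack[0].toString() : "(no frames)";

            selector.wakeup(); // let select() return so the thread can exit
            t.join();
            return new Sample(state, leaf);
        }
    }

    public static void main(String[] args) throws Exception {
        Sample s = sampleBlockedSelect();
        System.out.println("state = " + s.state);
        System.out.println("leaf  = " + s.leafFrame);
    }
}
```

The thread does no work at all, yet the reported state should be RUNNABLE.
The leaf frame depends on the JDK version and operating system, so it will
not necessarily be EPollArrayWrapper.epollWait() on your machine.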


    -- Oleg

On Fri, Jan 18, 2019 at 2:49 AM Jack Kolokasis <koloka...@ics.forth.gr>
wrote:

> Hi all,
>
>      I am trying to profile the performance of my Spark executors when
> using the on-heap persistence level compared with the off-heap persistence
> level. I use statsd-jvm-profiler to profile each executor.
>
> From the results I see that the application's threads spend 71.92% of
> their time in the method sun.nio.ch.EPollArrayWrapper.epollWait(). This
> happens in every benchmark run (Matrix Factorization, Linear Regression,
> etc.).
>
> Can anyone explain why this happens?
>
> Thanks for your help,
> --Iacovos Kolokasis
