Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
numStreams is 5 in my case. (sc here is the JavaStreamingContext, despite the name; KafkaUtils.createStream takes one as its first argument.)

    List<JavaPairDStream<byte[], byte[]>> kafkaStreams =
        new ArrayList<JavaPairDStream<byte[], byte[]>>(numStreams);
    for (int i = 0; i < numStreams; i++) {
        kafkaStreams.add(KafkaUtils.createStream(sc, byte[].class, byte[].class,
            DefaultDecoder.class, DefaultDecoder.class, kafkaConf, topicMap,
            StorageLevel.MEMORY_ONLY_SER()));
    }
    // union the receivers' outputs so the rest of the job sees one DStream
    JavaPairDStream<byte[], byte[]> ks = sc.union(kafkaStreams.remove(0), kafkaStreams);
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Hello Guys,

I've repartitioned my kafkaStream so that it gets evenly distributed among the executors, and the results are better. Still, from the executors page it seems that all 8 cores of only one executor are being used, while the other executors are using just 1 core each. Is this the correct interpretation of the data below? If so, how can we fix it?

[image: Inline image 1]
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Hi Mukesh,

How are you creating your receivers? Could you post the (relevant) code?

-kr, Gerard.
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Thanks Sandy, it was the issue with the number of cores.

Another issue I was facing is that tasks were not getting distributed evenly among the executors and were running at the NODE_LOCAL locality level, i.e. all the tasks ran on the same executor where my Kafka receiver(s) were running, even though the other executors were idle. I configured *spark.locality.wait=50* instead of the default 3000 ms, which forced task rebalancing among the nodes; let me know if there is a better way to deal with this.
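(For reference, a minimal sketch of where that setting can live; only the spark.locality.wait value comes from Mukesh's message, while the app name and the 5 sec batch duration are borrowed from the spark-submit commands elsewhere in this thread, and everything else is illustrative. The same setting can also be passed at submit time via --conf spark.locality.wait=50.)

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class LocalityWaitSketch {
      public static void main(String[] args) {
        // Shorten how long the scheduler holds a task waiting for a
        // node-local slot before shipping it to an idle executor.
        // The value is in milliseconds; the default is 3000.
        SparkConf conf = new SparkConf()
            .setAppName("CmsgK2H")
            .set("spark.locality.wait", "50");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(5000));
        // ... build and transform the Kafka streams here, then:
        // jssc.start(); jssc.awaitTermination();
      }
    }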
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
That is kind of expected due to data locality, though you should see some tasks running on the other executors as the received data gets replicated to other nodes, which then qualify to run tasks based on locality. You have two solutions:

1. kafkaStream.repartition() to explicitly redistribute the received data across the cluster.
2. Create multiple Kafka streams and union them together.

See http://spark.apache.org/docs/latest/streaming-programming-guide.html#reducing-the-processing-time-of-each-batch
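(A minimal self-contained sketch of option 1; option 2, the multi-receiver union, is the code Mukesh posted elsewhere in this thread. The ZooKeeper address, group id, and topic below are taken from the spark-submit commands in this thread, and the partition count of 10 is an illustrative choice, not a recommendation.)

    import java.util.HashMap;
    import java.util.Map;
    import kafka.serializer.DefaultDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.storage.StorageLevel;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class RepartitionSketch {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("kafka-repartition-sketch");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(5000));

        Map<String, String> kafkaConf = new HashMap<String, String>();
        kafkaConf.put("zookeeper.connect", "vm.cloud.com:2181/kafka");
        kafkaConf.put("group.id", "spark-yarn");
        Map<String, Integer> topicMap = new HashMap<String, Integer>();
        topicMap.put("avro", 1); // topic -> number of receiver threads

        JavaPairDStream<byte[], byte[]> stream = KafkaUtils.createStream(
            jssc, byte[].class, byte[].class,
            DefaultDecoder.class, DefaultDecoder.class,
            kafkaConf, topicMap, StorageLevel.MEMORY_ONLY_SER());

        // Option 1: spread the received blocks across the cluster so every
        // executor gets tasks, not just the host running the receiver.
        JavaPairDStream<byte[], byte[]> rebalanced = stream.repartition(10);

        rebalanced.count().print(); // placeholder output action
        jssc.start();
        jssc.awaitTermination();
      }
    }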
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Hi Mukesh,

Based on your spark-submit command, it looks like you're only running with 2 executors on YARN. Also, how many cores does each machine have?

-Sandy

On Mon, Dec 29, 2014 at 4:36 AM, Mukesh Jha me.mukesh@gmail.com wrote:

Hello Experts,

I'm benchmarking Spark on YARN (https://spark.apache.org/docs/latest/running-on-yarn.html) vs a standalone Spark cluster (https://spark.apache.org/docs/latest/spark-standalone.html). I have a standalone cluster with 3 executors, and a Spark app running on YARN with 4 executors, as shown below. The job running inside YARN is 10x slower than the one on the standalone cluster (even though YARN has more workers); in both cases all the executors are in the same datacenter, so there shouldn't be any latency. On YARN each 5 sec batch of Kafka data takes 5 sec to process; on the standalone cluster each 5 sec batch is processed in 0.4 sec.

Also, in YARN mode the executors are not used evenly: vm-13 and vm-14 run most of the tasks, whereas in standalone mode all the executors run tasks. Do I need to set some configuration to distribute the tasks evenly? And do you have any pointers on why the YARN job is 10x slower than the standalone job? Any suggestion is greatly appreciated, thanks in advance.

YARN (5 workers + driver)
=========================
ID      Address                RDD Blocks  Memory Used   Disk Used  Active Tasks  Failed Tasks  Complete Tasks  Total Tasks  Task Time  Input  Shuffle Read  Shuffle Write
1       vm-18.cloud.com:51796  0           0.0B/530.3MB  0.0 B      1             0             16              17           634 ms     0.0 B  2047.0 B      1710.0 B
2       vm-13.cloud.com:57264  0           0.0B/530.3MB  0.0 B      0             0             1427            1427         5.5 m      0.0 B  0.0 B         0.0 B
3       vm-14.cloud.com:54570  0           0.0B/530.3MB  0.0 B      0             0             1379            1379         5.2 m      0.0 B  1368.0 B      2.8 KB
4       vm-11.cloud.com:56201  0           0.0B/530.3MB  0.0 B      0             0             10              10           625 ms     0.0 B  1368.0 B      1026.0 B
5       vm-5.cloud.com:42958   0           0.0B/530.3MB  0.0 B      0             0             22              22           632 ms     0.0 B  1881.0 B      2.8 KB
driver  vm.cloud.com:51847     0           0.0B/530.0MB  0.0 B      0             0             0               0            0 ms       0.0 B  0.0 B         0.0 B

/homext/spark/bin/spark-submit --master yarn-cluster --num-executors 2 --driver-memory 512m --executor-memory 512m --executor-cores 2 --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar vm.cloud.com:2181/kafka spark-yarn avro 1 5000

STANDALONE (3 workers + driver)
===============================
ID      Address                RDD Blocks  Memory Used   Disk Used  Active Tasks  Failed Tasks  Complete Tasks  Total Tasks  Task Time  Input  Shuffle Read  Shuffle Write
0       vm-71.cloud.com:55912  0           0.0B/265.0MB  0.0 B      0             0             1069            1069         6.0 m      0.0 B  1534.0 B      3.0 KB
1       vm-72.cloud.com:40897  0           0.0B/265.0MB  0.0 B      0             0             1057            1057         5.9 m      0.0 B  1368.0 B      4.0 KB
2       vm-73.cloud.com:37621  0           0.0B/265.0MB  0.0 B      1             0             1059            1060         5.9 m      0.0 B  2.0 KB        1368.0 B
driver  vm.cloud.com:58299     0           0.0B/265.0MB  0.0 B      0             0             0               0            0 ms       0.0 B  0.0 B         0.0 B

/homext/spark/bin/spark-submit --master spark://chsnmvproc71vm3.usdc2.oraclecloud.com:7077 --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar vm.cloud.com:2181/kafka spark-standalone avro 1 5000

PS: I did go through the Spark website and http://www.virdata.com/tuning-spark/, but had no luck.

--
Cheers, Mukesh Jha
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Sorry Sandy, the command is just for reference, but I can confirm that there are 4 executors and a driver, as shown in the Spark UI page. Each of these machines is an 8 core box with ~15G of RAM.
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
And this is with Spark version 1.2.0.
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Are you setting --num-executors to 8?
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
*oops, I mean are you setting --executor-cores to 8?
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Nope, I am setting 5 executors with 2 cores each. Below is the command that I'm using to submit in YARN mode. This starts up 5 executors and a driver, as per the Spark application master UI.

spark-submit --master yarn-cluster --num-executors 5 --driver-memory 1024m --executor-memory 1024m --executor-cores 2 --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar vm.cloud.com:2181/kafka spark-yarn avro 1 5000
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
When running in standalone mode, each executor will be able to use all 8 cores on the box. When running on YARN, each executor will only have access to 2 cores. That's 3 x 8 = 24 concurrent task slots on standalone vs 5 x 2 = 10 on YARN, so the comparison doesn't seem fair, no?

-Sandy
Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster
Makes sense. I've also tried it in standalone mode, where all 3 workers + driver were running on the same 8 core box, and the results were similar. Anyway, I will share the results in YARN mode with 8 core YARN containers.
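(For concreteness, that follow-up run would presumably be the earlier YARN command with only the core count changed, along the lines of the sketch below; whether YARN actually grants 8-vcore containers depends on the cluster's yarn.scheduler.maximum-allocation-vcores limit.)

spark-submit --master yarn-cluster --num-executors 5 --driver-memory 1024m --executor-memory 1024m --executor-cores 8 --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar vm.cloud.com:2181/kafka spark-yarn avro 1 5000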