I also have this problem. The total time to launch receivers seems
related to the total number of executors. In my case, when I run 400
executors with 200 receivers, it takes about a minute for all receivers
to become active, but with 800 executors, it takes 3 minutes to activate
all receivers.
Additional information: the batch duration in my app is 1 minute. In the
Spark UI, for each batch, the difference between Output Op Duration and Job
Duration is large, e.g. Output Op Duration is 1 min while Job Duration is 19 s.
2016-07-14 10:49 GMT-07:00 Renxia Wang :
Hi all,
I am running a Spark Streaming application with Kinesis on EMR 4.7.1. The
application runs on YARN in client mode. There are 17 worker nodes
(c3.8xlarge) with 100 executors and 100 receivers. This setup works fine.
But when I increase the number of worker nodes to 50, and increase
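For reference, a setup like the one described above would usually be launched with a spark-submit line along these lines on YARN; --num-executors matches the message, while the executor memory and core values (and the jar name) are placeholder assumptions, not from the original:

```shell
# Sketch: spark-submit for a 100-executor YARN client-mode setup.
# --num-executors matches the setup above; memory/cores/jar are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --num-executors 100 \
  --executor-cores 4 \
  --executor-memory 8g \
  my-streaming-app.jar
```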
Hi,
I am using Spark 1.6.1 on EMR, running a streaming app on YARN. From the
Spark UI I see that for each batch, the *Output Op Duration* is larger
than the *Job Duration* (screenshot attached). What's the difference between
these two? Does the *Job Duration* only count the executor time of each time, b
Hi all,
Has anybody tried out Spark 2.0 on EMR 4.x? Will it work? I am looking for
a bootstrap action script to install it on EMR; does someone have a
working one to share? I'd appreciate it!
Best,
Renxia
Additional Info: I am running Spark on YARN.
2015-10-01 15:42 GMT-07:00 Renxia Wang :
Hi guys,
I know there is a way to set the number of retries for failed tasks, using
spark.task.maxFailures. What is the default policy for retrying failed
tasks? Is it exponential backoff? My tasks sometimes fail because of
socket connection timeout/reset, and even with retry, some of the tasks will
f
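For comparison, exponential backoff retry, which as far as I know Spark does not apply to task retries by default (spark.task.maxFailures just caps how many times a task is re-scheduled), would look roughly like this in plain Python. The function name and delay parameters here are made up for illustration, not anything from Spark:

```python
import time

def retry_with_backoff(func, max_failures=4, base_delay=1.0, factor=2.0,
                       sleep=time.sleep):
    """Call func(), retrying up to max_failures times in total.

    Waits base_delay * factor**attempt between attempts (exponential
    backoff: 1s, 2s, 4s, ...). Re-raises the last exception when all
    attempts fail.
    """
    for attempt in range(max_failures):
        try:
            return func()
        except Exception:
            if attempt == max_failures - 1:
                raise
            sleep(base_delay * factor ** attempt)

# Example: a flaky call that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionResetError("socket reset")
    return "ok"

delays = []  # record the backoff delays instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
# result == "ok", delays == [1.0, 2.0]
```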