Okay, my bad for not testing out the documented arguments - once I use the correct ones, the query completes in ~55s (I can probably make it faster). Thanks for the help, eh?!
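For anyone who finds this thread later, the corrected submission looks roughly like the sketch below; the class name, jar path, and resource sizes are placeholders, not the actual job:

    # YARN-specific flags must come before the application jar, otherwise
    # spark-submit treats them as application arguments (see Ashish's note
    # further down the thread) and the app falls back to the default 2
    # executors.
    spark-submit \
      --master yarn-cluster \
      --num-executors 10 \
      --executor-cores 4 \
      --executor-memory 8g \
      --class com.example.MyQuery \
      hdfs:///apps/my-query-app.jar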
On Fri Dec 05 2014 at 10:34:50 PM Denny Lee <denny.g....@gmail.com> wrote:

> Sorry for the delay in my response - for my Spark calls for standalone
> and YARN, I am using --executor-memory and --total-executor-cores for
> the submission. In standalone, my baseline query completes in ~40s,
> while in YARN it completes in ~1800s. It does not appear from the RM
> web UI that it's asking for more resources than available, but by the
> same token it appears to be using only a small amount of the available
> cores and memory.
>
> Saying this, let me re-try using the --executor-cores,
> --executor-memory, and --num-executors arguments as suggested (and
> documented) vs. --total-executor-cores.
>
> On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <and...@databricks.com> wrote:
>
>> Hey Arun, I've seen that behavior before. It happens when the cluster
>> doesn't have enough resources to offer and the RM hasn't given us our
>> containers yet. Can you check the RM Web UI at port 8088 to see whether
>> your application is requesting more resources than the cluster has to
>> offer?
>>
>> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sandy.r...@cloudera.com>:
>>
>>> Hey Arun,
>>>
>>> The sleeps would only cause about 5 seconds of overhead at most. The
>>> idea was to give executors some time to register. In more recent
>>> versions, they were replaced with the
>>> spark.scheduler.minRegisteredResourcesRatio and
>>> spark.scheduler.maxRegisteredResourcesWaitingTime settings. As of 1.1,
>>> by default YARN will wait until either 30 seconds have passed or 80%
>>> of the requested executors have registered.
>>>
>>> -Sandy
>>>
>>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <arang...@gmail.com>
>>> wrote:
>>>
>>>> Likely this is not the case here, yet one thing to point out with
>>>> YARN parameters like --num-executors is that they should be specified
>>>> *before* the app jar and app args on the spark-submit command line;
>>>> otherwise the app only gets the default number of containers, which
>>>> is 2.
>>>>
>>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sandy.r...@cloudera.com> wrote:
>>>>
>>>>> Hi Denny,
>>>>>
>>>>> Those sleeps were only at startup, so if jobs are taking
>>>>> significantly longer on YARN, that should be a different problem.
>>>>> When you ran on YARN, did you use the --executor-cores,
>>>>> --executor-memory, and --num-executors arguments? When running
>>>>> against a standalone cluster, by default Spark will make use of all
>>>>> the cluster resources, but when running against YARN, Spark defaults
>>>>> to a couple of tiny executors.
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <denny.g....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few
>>>>>> thousand steps. If I ran this in standalone cluster mode, the query
>>>>>> finished in 55s, but on YARN the query was still running 30 minutes
>>>>>> later. Could the hard-coded sleeps potentially be in play here?
>>>>>>
>>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sandy.r...@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tobias,
>>>>>>>
>>>>>>> What version are you using? In some recent versions, we had a
>>>>>>> couple of large hardcoded sleeps on the Spark side.
>>>>>>>
>>>>>>> -Sandy
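The registration settings Sandy mentions are ordinary Spark conf entries, so they can be set per job from spark-submit. A minimal sketch that just spells out the 1.1 defaults he cites (80% ratio, 30-second cap in milliseconds), again with placeholder class and jar names:

    # Scheduling starts once 80% of requested executors have registered,
    # or after 30000 ms, whichever comes first (the 1.1 defaults).
    spark-submit \
      --master yarn-cluster \
      --conf spark.scheduler.minRegisteredResourcesRatio=0.8 \
      --conf spark.scheduler.maxRegisteredResourcesWaitingTime=30000 \
      --class com.example.MyQuery \
      hdfs:///apps/my-query-app.jar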
>>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <and...@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Tobias,
>>>>>>>>
>>>>>>>> As you suspect, the reason it's slow is that the resource manager
>>>>>>>> in YARN takes a while to grant resources. This is because YARN
>>>>>>>> needs to first set up the application master container, and then
>>>>>>>> this AM needs to request more containers for the Spark executors.
>>>>>>>> I think this accounts for most of the overhead. The rest probably
>>>>>>>> comes from how our own YARN integration code polls the application
>>>>>>>> state (every second) and the cluster resource state (every 5
>>>>>>>> seconds, IIRC). I haven't explored in detail whether there are
>>>>>>>> optimizations there that could speed this up, but I believe most
>>>>>>>> of the overhead comes from YARN itself.
>>>>>>>>
>>>>>>>> In other words, no, I don't know of any quick fix on your end to
>>>>>>>> speed this up.
>>>>>>>>
>>>>>>>> -Andrew
>>>>>>>>
>>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <t...@preferred.jp>:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar and my
>>>>>>>>> application jar in HDFS, and I can see from the logging output
>>>>>>>>> that both files are used from there. However, it still takes
>>>>>>>>> about 10 seconds for my application's yarnAppState to switch
>>>>>>>>> from ACCEPTED to RUNNING.
>>>>>>>>>
>>>>>>>>> I am aware that this is probably not a Spark issue but some YARN
>>>>>>>>> configuration setting (or YARN-inherent slowness); I was just
>>>>>>>>> wondering if anyone has advice on how to speed this up.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Tobias
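As a footnote to Tobias's setup: on the Spark 1.x line, pointing spark.yarn.jar at a pre-staged assembly in HDFS is what lets each submission skip re-uploading the assembly. A sketch with hypothetical paths; this trims submission time, but not the ACCEPTED-to-RUNNING wait itself, which is the YARN container setup Andrew describes:

    # Reuse an assembly already staged in HDFS instead of uploading it
    # on every submission (paths below are hypothetical examples).
    spark-submit \
      --master yarn-cluster \
      --conf spark.yarn.jar=hdfs:///apps/spark/spark-assembly.jar \
      --class com.example.MyApp \
      hdfs:///apps/my-app.jar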