Great to hear!

-Sandy
On Fri, Dec 5, 2014 at 11:17 PM, Denny Lee <denny.g....@gmail.com> wrote:

> Okay, my bad for not testing out the documented arguments - once I use the
> correct ones, the query completes in ~55s (I can probably make it faster).
> Thanks for the help, eh?!
>
> On Fri Dec 05 2014 at 10:34:50 PM Denny Lee <denny.g....@gmail.com> wrote:
>
>> Sorry for the delay in my response - for my Spark calls for standalone
>> and YARN, I am using --executor-memory and --total-executor-cores for
>> the submission. In standalone, my baseline query completes in ~40s, while
>> in YARN it completes in ~1800s. It does not appear from the RM web UI
>> that it's asking for more resources than available, but by the same token
>> it appears that it's only using a small amount of the cores and available
>> memory.
>>
>> Saying this, let me re-try using the --executor-cores,
>> --executor-memory, and --num-executors arguments as suggested (and
>> documented) vs. --total-executor-cores.
>>
>> On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <and...@databricks.com> wrote:
>>
>>> Hey Arun, I've seen that behavior before. It happens when the cluster
>>> doesn't have enough resources to offer and the RM hasn't given us our
>>> containers yet. Can you check the RM web UI at port 8088 to see whether
>>> your application is requesting more resources than the cluster has to
>>> offer?
>>>
>>> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sandy.r...@cloudera.com>:
>>>
>>>> Hey Arun,
>>>>
>>>> The sleeps would only cause at most about 5 seconds of overhead. The
>>>> idea was to give executors some time to register. In more recent
>>>> versions, they were replaced with the
>>>> spark.scheduler.minRegisteredResourcesRatio and
>>>> spark.scheduler.maxRegisteredResourcesWaitingTime settings. As of 1.1,
>>>> by default YARN will wait until either 30 seconds have passed or 80%
>>>> of the requested executors have registered.
>>>>
>>>> -Sandy
>>>>
>>>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <arang...@gmail.com>
>>>> wrote:
>>>>
>>>>> Likely this is not the case here, yet one thing to point out with
>>>>> YARN parameters like --num-executors is that they should be specified
>>>>> *before* the app jar and app args on the spark-submit command line;
>>>>> otherwise the app only gets the default number of containers, which
>>>>> is 2.
>>>>>
>>>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sandy.r...@cloudera.com> wrote:
>>>>>
>>>>>> Hi Denny,
>>>>>>
>>>>>> Those sleeps were only at startup, so if jobs are taking
>>>>>> significantly longer on YARN, that should be a different problem.
>>>>>> When you ran on YARN, did you use the --executor-cores,
>>>>>> --executor-memory, and --num-executors arguments? When running
>>>>>> against a standalone cluster, by default Spark will make use of all
>>>>>> the cluster resources, but when running against YARN, Spark defaults
>>>>>> to a couple of tiny executors.
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <denny.g....@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few
>>>>>>> thousand steps. If I ran this in standalone cluster mode, the query
>>>>>>> finished in 55s, but on YARN the query was still running 30 min
>>>>>>> later. Would the hard-coded sleeps potentially be in play here?
>>>>>>>
>>>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sandy.r...@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tobias,
>>>>>>>>
>>>>>>>> What version are you using? In some recent versions, we had a
>>>>>>>> couple of large hardcoded sleeps on the Spark side.
>>>>>>>>
>>>>>>>> -Sandy
>>>>>>>>
>>>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <and...@databricks.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hey Tobias,
>>>>>>>>>
>>>>>>>>> As you suspect, the reason it's slow is that the resource manager
>>>>>>>>> in YARN takes a while to grant resources. This is because YARN
>>>>>>>>> needs to first set up the application master container, and then
>>>>>>>>> this AM needs to request more containers for the Spark executors.
>>>>>>>>> I think this accounts for most of the overhead. The remaining
>>>>>>>>> source probably comes from how our own YARN integration code
>>>>>>>>> polls application state (every second) and cluster resource state
>>>>>>>>> (every 5 seconds, IIRC). I haven't explored in detail whether
>>>>>>>>> there are optimizations there that could speed this up, but I
>>>>>>>>> believe most of the overhead comes from YARN itself.
>>>>>>>>>
>>>>>>>>> In other words, no, I don't know of any quick fix on your end to
>>>>>>>>> speed this up.
>>>>>>>>>
>>>>>>>>> -Andrew
>>>>>>>>>
>>>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <t...@preferred.jp>:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar and my
>>>>>>>>>> application jar in HDFS, and I can see from the logging output
>>>>>>>>>> that both files are used from there. However, it still takes
>>>>>>>>>> about 10 seconds for my application's yarnAppState to switch
>>>>>>>>>> from ACCEPTED to RUNNING.
>>>>>>>>>>
>>>>>>>>>> I am aware that this is probably not a Spark issue but some YARN
>>>>>>>>>> configuration setting (or YARN-inherent slowness); I was just
>>>>>>>>>> wondering if anyone has advice for how to speed this up.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Tobias
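
For reference, a YARN submission along the lines the thread converges on
might look like the sketch below. The flags (--num-executors,
--executor-cores, --executor-memory) and their placement before the app jar
are taken from the thread; the main class, jar path, and resource sizes are
placeholder assumptions, not values anyone posted.

    # Hypothetical submission; resource sizes are illustrative only.
    # Per Ashish's note, all flags must precede the application jar,
    # or the app falls back to the default of two small executors.
    spark-submit \
      --master yarn-cluster \
      --num-executors 8 \
      --executor-cores 4 \
      --executor-memory 8g \
      --class com.example.MyApp \
      hdfs:///apps/my-app.jar \
      arg1 arg2

(--total-executor-cores, which Denny was using, applies to standalone and
Mesos masters, which is why it had no effect under YARN.)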
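
The registration-wait settings Sandy mentions can also be set per
submission with --conf. A minimal sketch, assuming the 1.1-era defaults he
quotes (80% of executors or 30 seconds), and assuming the wait time is
expressed in milliseconds as in that era's releases:

    # Illustrative only - these mirror the defaults Sandy describes:
    # proceed once 80% of requested executors have registered, or
    # after 30 seconds, whichever comes first.
    spark-submit \
      --master yarn-cluster \
      --conf spark.scheduler.minRegisteredResourcesRatio=0.8 \
      --conf spark.scheduler.maxRegisteredResourcesWaitingTime=30000 \
      ...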
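
Andrew's port-8088 check can also be done from the command line: the RM
serves a REST API on the same port as its web UI. A sketch (the host is a
placeholder, and field names may vary slightly across Hadoop versions):

    # Compare the cluster's free resources against what the job asks
    # for (num-executors x executor-memory / executor-cores).
    curl -s http://<rm-host>:8088/ws/v1/cluster/metrics
    # Look for availableMB and availableVirtualCores in the response.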