Re: Submit many spark applications

2018-05-25 Thread yncxcw
Hi, please try reducing the default heap size for the machine you use to submit applications. For example:

export _JAVA_OPTIONS="-Xmx512M"

The submitter, which is itself a JVM, does not need to reserve lots of memory.

Wei

Re: Submit many spark applications

2018-05-25 Thread Marcelo Vanzin
I already gave my recommendation in my very first reply to this thread...

On Fri, May 25, 2018 at 10:23 AM, raksja wrote:
> ok, when to use what?
> do you have any recommendation?

Re: Submit many spark applications

2018-05-25 Thread raksja
Ok, when to use what? Do you have any recommendation?

Re: Submit many spark applications

2018-05-25 Thread Marcelo Vanzin
On Fri, May 25, 2018 at 10:18 AM, raksja wrote:
> InProcessLauncher would just start a subprocess as you mentioned earlier.

No. As the name says, it runs things in the same process.

-- Marcelo

Re: Submit many spark applications

2018-05-25 Thread raksja
When you say Spark uses it, did you mean this: https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala? InProcessLauncher would just start a subprocess as you mentioned earlier. How about this: does this make a REST API call to

Re: Submit many spark applications

2018-05-25 Thread Marcelo Vanzin
That's what Spark uses.

On Fri, May 25, 2018 at 10:09 AM, raksja wrote:
> thanks for the reply.
>
> Have you tried submit a spark job directly to Yarn using YarnClient.
> https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html
>
> Not sure whether its performant and scalable?

Re: Submit many spark applications

2018-05-25 Thread raksja
Thanks for the reply. Have you tried submitting a spark job directly to Yarn using YarnClient?
https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html
Not sure whether it's performant and scalable?

Re: Submit many spark applications

2018-05-23 Thread Marcelo Vanzin
On Wed, May 23, 2018 at 12:04 PM, raksja wrote:
> So InProcessLauncher wouldn't use the native memory, so will it overload the
> mem of the parent process?

It will still use "native memory" (since the parent process will still use memory), just less of it. But yes, it will use

Re: Submit many spark applications

2018-05-23 Thread raksja
Hi Marcelo, I'm facing the same issue when making spark-submits from an EC2 instance and hitting the native memory limit sooner. We have #1, but we are still on Spark 2.1.0, so we couldn't try #2. So InProcessLauncher wouldn't use the native memory, so will it overload the memory of the parent process? Is

Re: Submit many spark applications

2018-05-16 Thread ayan guha
How about using Livy to submit jobs?

On Thu, 17 May 2018 at 7:24 am, Marcelo Vanzin wrote:
> You can either:
>
> - set spark.yarn.submit.waitAppCompletion=false, which will make
> spark-submit go away once the app starts in cluster mode.
> - use the (new in 2.3) InProcessLauncher class + some custom Java code
> to submit all the apps from the same "launcher" process.
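Livy keeps the per-job JVM off the submitting host entirely: each batch job is submitted with a plain HTTP POST to Livy's /batches endpoint. A minimal stdlib-only sketch; the Livy URL, jar path, and class name below are illustrative placeholders, and the actual HTTP call is left commented out since it needs a running Livy server:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LivySubmit {
    // Build the JSON body for Livy's POST /batches call.
    // Both arguments are placeholders supplied by the caller.
    static String batchPayload(String jar, String mainClass) {
        return "{\"file\": \"" + jar + "\", \"className\": \"" + mainClass + "\"}";
    }

    public static void main(String[] args) throws Exception {
        String body = batchPayload("hdfs:///jars/myapp.jar", "com.example.MyApp");
        System.out.println(body);

        // Uncomment to actually submit (assumes a Livy server on localhost:8998):
        // HttpURLConnection conn = (HttpURLConnection)
        //         new URL("http://localhost:8998/batches").openConnection();
        // conn.setRequestMethod("POST");
        // conn.setRequestProperty("Content-Type", "application/json");
        // conn.setDoOutput(true);
        // try (OutputStream os = conn.getOutputStream()) {
        //     os.write(body.getBytes(StandardCharsets.UTF_8));
        // }
        // System.out.println("HTTP " + conn.getResponseCode());
    }
}
```

Since submission is just an HTTP request, the submitting process never pays the cost of a spark-submit JVM per application.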

Re: Submit many spark applications

2018-05-16 Thread Marcelo Vanzin
You can either:

- set spark.yarn.submit.waitAppCompletion=false, which will make spark-submit go away once the app starts in cluster mode.
- use the (new in 2.3) InProcessLauncher class + some custom Java code to submit all the apps from the same "launcher" process.

On Wed, May 16, 2018 at 1:45
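The second option above can be sketched roughly as follows. This is not runnable as-is: it assumes Spark 2.3+'s spark-launcher artifact on the classpath plus a reachable YARN cluster, and the jar path and main class are placeholders. All submissions happen inside the one long-lived JVM instead of forking a spark-submit process per application:

```java
import java.io.IOException;
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

public class BulkSubmit {
    public static void main(String[] args) throws IOException {
        // Submit several applications from this single "launcher" process.
        for (int i = 0; i < 10; i++) {
            SparkAppHandle handle = new InProcessLauncher()
                    .setMaster("yarn")
                    .setDeployMode("cluster")
                    .setAppResource("hdfs:///jars/myapp.jar") // placeholder jar
                    .setMainClass("com.example.MyApp")        // placeholder class
                    // Don't block waiting for each app to finish on YARN.
                    .setConf("spark.yarn.submit.waitAppCompletion", "false")
                    .startApplication(new SparkAppHandle.Listener() {
                        @Override public void stateChanged(SparkAppHandle h) {
                            System.out.println(h.getAppId() + " -> " + h.getState());
                        }
                        @Override public void infoChanged(SparkAppHandle h) { }
                    });
        }
    }
}
```

Combining both suggestions (in-process launch plus waitAppCompletion=false) keeps memory on the submitting host roughly constant regardless of how many applications are queued.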

Submit many spark applications

2018-05-16 Thread Shiyuan
Hi Spark-users, I want to submit as many spark applications as the resources permit. I am using cluster mode on a yarn cluster. Yarn can queue and launch these applications without problems. The problem lies in spark-submit itself: spark-submit starts a JVM, which could fail due to insufficient memory.