Hi Vipul,

Some advantages of using YARN:

* YARN allows you to dynamically share and centrally configure the same pool of cluster resources between all frameworks that run on YARN. You can throw your entire cluster at a MapReduce job, then use some of it on an Impala query and the rest on a Spark application, without any changes in configuration.
* You can take advantage of all the features of YARN schedulers for categorizing, isolating, and prioritizing workloads.
* YARN provides CPU isolation between processes with CGroups. Spark standalone mode requires each application to run an executor on every node in the cluster; with YARN, you choose the number of executors to use.
* YARN is the only cluster manager for Spark that supports security and Kerberized clusters.
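
To make the executor point concrete, here is a rough sketch of how the two modes differ at submit time. It only builds a `spark-submit` argument list and does not launch anything. The helper name `spark_submit_args`, the host `master-host`, and the app `my_app.py` are made up for illustration. The flags themselves (`--master`, `--executor-memory`, `--num-executors` on YARN, `--total-executor-cores` on standalone) are real `spark-submit` options, though exact master URL forms vary by Spark version, so treat this as an assumption to check against your release.

```python
from typing import List, Optional


def spark_submit_args(
    master: str,
    app: str,
    num_executors: Optional[int] = None,
    total_executor_cores: Optional[int] = None,
    executor_memory: str = "2g",
) -> List[str]:
    """Build a spark-submit argument list (hypothetical helper, nothing is run)."""
    args = ["spark-submit", "--master", master, "--executor-memory", executor_memory]

    if master == "yarn" or master.startswith("yarn-"):
        # On YARN you ask for an explicit number of executors.
        if num_executors is not None:
            args += ["--num-executors", str(num_executors)]
    elif master.startswith("spark://"):
        # Standalone has no executor-count flag here; you cap the total
        # cores the application may take across the cluster instead.
        if total_executor_cores is not None:
            args += ["--total-executor-cores", str(total_executor_cores)]

    args.append(app)
    return args


if __name__ == "__main__":
    print(" ".join(spark_submit_args("yarn", "my_app.py", num_executors=10)))
    print(" ".join(spark_submit_args("spark://master-host:7077", "my_app.py",
                                     total_executor_cores=40)))
```

On YARN the request is expressed in executors; on standalone the closest knob in this sketch is a cap on total cores.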

Some advantages of using standalone:

* It has been around for longer, so it is likely a little more stable.
* Many report faster startup times for apps.

-Sandy

On Wed, May 14, 2014 at 3:06 PM, Vipul Pandey <vipan...@gmail.com> wrote:

> So here's a followup question : What's the preferred mode?
> We have a new cluster coming up with petabytes of data and we intend to
> take Spark to production. We are trying to figure out what mode would be
> safe and stable for production like environment.
> pros and cons? anyone?
>
> Any reasons why one would chose Standalone over YARN?
>
> Thanks,
> Vipul
>
> On May 4, 2014, at 5:56 PM, Liu, Raymond <raymond....@intel.com> wrote:
>
> > In the core, they are not quite different
> > In standalone mode, you have spark master and spark worker who allocate
> > driver and executors for your spark app.
> > While in Yarn mode, Yarn resource manager and node manager do this work.
> > When the driver and executors have been launched, the rest part of
> > resource scheduling go through the same process, say between driver and
> > executor through akka actor.
> >
> > Best Regards,
> > Raymond Liu
> >
> > -----Original Message-----
> > From: Sophia [mailto:sln-1...@163.com]
> >
> > Hey you guys,
> > What is the different in spark on yarn mode and standalone mode about
> > resource schedule?
> > Wish you happy everyday.
> >
> > --
> > View this message in context:
> > http://apache-spark-user-list.1001560.n3.nabble.com/different-in-spark-on-yarn-mode-and-standalone-mode-tp5300.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.