Thanks Mich. Great explanation.

On Saturday, 11 June 2016, 22:35, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
Hi Gavin,

I believe in standalone mode a simple cluster manager is included with Spark that makes it easy to set up a cluster. It does not rely on YARN or Mesos. In summary, from my notes:

- Spark Local – Spark runs on the localhost. This is the simplest setup, best suited for learners who want to understand the different concepts of Spark and for those performing unit testing.
- Spark Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
- YARN Cluster Mode – the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.
- YARN Client Mode – the driver runs in the client process, and the application master is only used for requesting resources from YARN. Unlike Local or Spark Standalone modes, in which the master's address is specified in the --master parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the --master parameter is yarn.
- Mesos – I have not used it so cannot comment.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 11 June 2016 at 22:26, Gavin Yue <yue.yuany...@gmail.com> wrote:

Standalone mode is in contrast to YARN mode or Mesos mode, in which Spark uses YARN or Mesos for cluster management. Local mode is effectively a standalone mode in which everything runs on a single local machine instead of on remote clusters. That is my understanding.

On Sat, Jun 11, 2016 at 12:40 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

Thank you, grateful.

I know I can start spark-shell by launching the shell itself:

spark-shell

Now I know that in standalone mode I can also connect to a master:

spark-shell --master spark://<HOST>:7077

My point is: what are the differences between these two start-up modes for spark-shell?
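To make the modes above concrete, here is a quick cheat-sheet of the --master values they correspond to. This is a sketch, not from the thread itself: the host name "master-host" and the thread count in local[4] are placeholder values, and it assumes Spark's bin directory is on the PATH (and, for the yarn case, that HADOOP_CONF_DIR points at your Hadoop configuration).

```shell
# Local mode: everything in one JVM on this machine.
spark-shell --master local        # single worker thread
spark-shell --master local[4]     # 4 worker threads (placeholder count)

# Spark Standalone: connect to a running standalone master.
# "master-host" is a placeholder; 7077 is the default master port.
spark-shell --master spark://master-host:7077

# YARN: no host in --master; the ResourceManager address comes
# from the Hadoop configuration (HADOOP_CONF_DIR / YARN_CONF_DIR).
spark-shell --master yarn
```

The same --master values apply to spark-submit, which is why the answer to Ashok's later question is that spark-shell and spark-submit behave alike in this respect.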
If I start spark-shell and connect to the master, what performance gain, if any, will I get, or does it not matter? Is it the same for spark-submit?

Regards

On Saturday, 11 June 2016, 19:39, Mohammad Tariq <donta...@gmail.com> wrote:

Hi Ashok,

In local mode all the processes run inside a single JVM, whereas in standalone mode we have separate master and worker processes running in their own JVMs. To quickly test your code from within your IDE you could probably use local mode. However, to get a real feel of how Spark operates I would suggest you have a standalone setup as well. It's just a matter of launching a standalone cluster, either manually (by starting a master and workers by hand) or by using the launch scripts provided with the Spark package. You can find more on this here.

HTH

Tariq, Mohammad
about.me/mti

On Sat, Jun 11, 2016 at 11:38 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

Hi,

What is the difference between running Spark in local mode or standalone mode? Are they the same? If they are not, which is best suited for non-prod work? I am also aware that one can run Spark in YARN mode as well.

Thanks
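The manual standalone launch that Tariq mentions can be sketched as follows. This is an assumption-laden example, not part of the thread: SPARK_HOME and "master-host" are placeholders, and the script name start-slave.sh is what Spark used at the time of this thread (newer Spark releases renamed it start-worker.sh).

```shell
# On the machine that will act as master:
$SPARK_HOME/sbin/start-master.sh
# The master then listens on spark://master-host:7077 by default,
# with a web UI on port 8080.

# On each worker machine, register against that master:
$SPARK_HOME/sbin/start-slave.sh spark://master-host:7077

# Alternatively, with worker hosts listed in conf/slaves and
# passwordless SSH configured, launch everything from the master:
$SPARK_HOME/sbin/start-all.sh

# Shut the cluster down again:
$SPARK_HOME/sbin/stop-all.sh
```

Once the cluster is up, a spark-shell started with --master spark://master-host:7077 runs its tasks in the separate worker JVMs, which is exactly the difference from local mode described above.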