Thanks Cody,

As I already mentioned, I am running Spark Streaming on an EC2 cluster in standalone mode. Now, in addition to streaming, I want to be able to run a Spark batch job hourly and ad hoc queries using Zeppelin.
Can you please confirm that a standalone cluster is OK for this? Please point me to some links to help me get started.

Thanks
-Anna

On Wed, Apr 26, 2017 at 7:46 PM, Cody Koeninger <c...@koeninger.org> wrote:
> The standalone cluster manager is fine for production. Don't use Yarn
> or Mesos unless you already have another need for it.
>
> On Wed, Apr 26, 2017 at 4:53 PM, anna stax <annasta...@gmail.com> wrote:
> > Hi Sam,
> >
> > Thank you for the reply.
> >
> > What do you mean by
> > "I doubt people run spark in a single EC2 instance, certainly not in
> > production I don't think"?
> >
> > What is wrong with having a data pipeline on EC2 that reads data from Kafka,
> > processes it using Spark and outputs to Cassandra? Please explain.
> >
> > Thanks
> > -Anna
> >
> > On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin <hussam.ela...@gmail.com> wrote:
> >>
> >> Hi Anna
> >>
> >> There are a variety of options for launching spark clusters. I doubt
> >> people run spark in a single EC2 instance, certainly not in production I
> >> don't think
> >>
> >> I don't have enough information about what you are trying to do, but if you
> >> are just trying to set things up from scratch then I think you can just use
> >> EMR, which will create a cluster for you and attach a zeppelin instance as
> >> well
> >>
> >> You can also use databricks for ease of use and very little management, but
> >> you will pay a premium for that abstraction
> >>
> >> Regards
> >> Sam
> >>
> >> On Wed, 26 Apr 2017 at 22:02, anna stax <annasta...@gmail.com> wrote:
> >>>
> >>> I need to set up a spark cluster for Spark streaming, scheduled batch
> >>> jobs and ad hoc queries.
> >>> Please give me some suggestions. Can this be done in standalone mode?
> >>>
> >>> Right now we have a spark cluster in standalone mode on AWS EC2 running
> >>> a spark streaming application. Can we run spark batch jobs and zeppelin on
> >>> the same cluster? Do we need a better resource manager like Mesos?
> >>>
> >>> Are there any companies or individuals that can help in setting this up?
> >>>
> >>> Thank you.
> >>>
> >>> -Anna