Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Takeshi Yamamuro
Hi,

How about checking the 2015 Spark survey results at
https://databricks.com/blog/2015/09/24/spark-survey-results-2015-are-now-available.html
for the statistics?

// maropu

On Fri, Apr 15, 2016 at 4:52 AM, Mark Hamstra wrote:

> That's also available in standalone.


-- 
---
Takeshi Yamamuro


Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Mark Hamstra
That's also available in standalone.

On Thu, Apr 14, 2016 at 12:47 PM, Alexander Pivovarov wrote:

> Spark on Yarn supports dynamic resource allocation
>
> So, you can run several spark-shells / spark-submits / spark-jobserver /
> zeppelin on one cluster without defining upfront how many executors /
> memory you want to allocate to each app
>
> Great feature for regular users who just want to run Spark / Spark SQL


Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Alexander Pivovarov
Spark on YARN supports dynamic resource allocation.

So you can run several spark-shells / spark-submits / spark-jobserver /
Zeppelin instances on one cluster without defining upfront how many
executors or how much memory to allocate to each app.

It's a great feature for regular users who just want to run Spark / Spark SQL.
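
For illustration, here is a minimal sketch of what turning this on can look like. It only builds a spark-submit command line; it does not talk to a cluster. The property names are the ones in the Spark configuration docs, but the executor bounds, the application file name my_app.py, and the helper functions themselves are assumptions made up for this example.

```python
# Sketch: build a spark-submit command line that enables dynamic allocation.
# The executor bounds and the application file name are placeholders.

def dynamic_allocation_conf(min_executors, max_executors):
    """Return Spark properties that enable dynamic executor allocation."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Executors can only be released safely if shuffle files are served
        # by the external shuffle service rather than by the executor itself.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def spark_submit_args(master, app, conf):
    """Render a spark-submit argument list for the given master and properties."""
    args = ["spark-submit", "--master", master]
    for key in sorted(conf):
        args += ["--conf", "{}={}".format(key, conf[key])]
    args.append(app)
    return args


if __name__ == "__main__":
    conf = dynamic_allocation_conf(min_executors=1, max_executors=20)
    print(" ".join(spark_submit_args("yarn", "my_app.py", conf)))
```

On YARN the external shuffle service also has to be set up on the NodeManagers. The same properties are documented for standalone mode, where the workers need to be started with the shuffle service enabled.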


On Thu, Apr 14, 2016 at 12:05 PM, Sean Owen wrote:

> I don't think usage is the differentiating factor. YARN and standalone
> are pretty well supported. If you are only running a Spark cluster by
> itself with nothing else, standalone is probably simpler than setting
> up YARN just for Spark. However if you're running on a cluster that
> will host other applications, you'll need to integrate with a shared
> resource manager and its security model, and for anything
> Hadoop-related that's YARN. Standalone wouldn't make as much sense.


Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Sean Owen
I don't think usage is the differentiating factor. YARN and standalone
are both pretty well supported. If you are only running a Spark cluster by
itself with nothing else, standalone is probably simpler than setting
up YARN just for Spark. However, if you're running on a cluster that
will host other applications, you'll need to integrate with a shared
resource manager and its security model, and for anything
Hadoop-related that's YARN. Standalone wouldn't make as much sense there.
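
As a rough sketch of that rule of thumb, the choice comes down to which master URL you hand to spark-submit. The function name, the boolean flag, and the host name below are hypothetical; "yarn" and the spark://host:port form are the standard master URLs, and 7077 is the default port of a standalone master.

```python
# Sketch: pick a Spark master URL from one yes/no question about the cluster.
# The host name is a placeholder, not a real machine.

def choose_master(shares_cluster_with_hadoop,
                  standalone_host="spark-master.example.com",
                  standalone_port=7077):
    """Return the master URL to pass to spark-submit via --master."""
    if shares_cluster_with_hadoop:
        # A shared Hadoop cluster already has YARN as its resource manager.
        return "yarn"
    # A Spark-only cluster can use the built-in standalone master.
    return "spark://{}:{}".format(standalone_host, standalone_port)


if __name__ == "__main__":
    print(choose_master(True))
    print(choose_master(False))
```

This deliberately ignores everything else that goes into the decision (security setup, existing operations tooling, and so on); it only shows where the two options differ on the command line.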

On Thu, Apr 14, 2016 at 6:46 PM, Alexander Pivovarov wrote:
> AWS EMR includes Spark on Yarn
> Hortonworks and Cloudera platforms include Spark on Yarn as well

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Mich Talebzadeh
Hi Alex,

Do you mean using Spark with Yarn-client compared to using Spark Local?

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com



On 14 April 2016 at 18:46, Alexander Pivovarov wrote:

> AWS EMR includes Spark on Yarn
> Hortonworks and Cloudera platforms include Spark on Yarn as well


Re: YARN vs Standalone Spark Usage in production

2016-04-14 Thread Alexander Pivovarov
AWS EMR includes Spark on YARN.
The Hortonworks and Cloudera platforms include Spark on YARN as well.


On Thu, Apr 14, 2016 at 7:29 AM, Arkadiusz Bicz wrote:

> Hello,
>
> Are there any statistics on YARN vs. Standalone Spark usage in
> production?
>
> I would like to choose the most widely supported and used technology in
> production for our project.
>
>
> BR,
>
> Arkadiusz Bicz