What's the current way of deploying Spark Streaming apps on EMR? Using YARN?

There are some out-of-date solutions like
https://github.com/ianoc/SparkEMRBootstrap which set up Mesos on EMR. I
wonder if this can be simplified with Spark 0.9.

Spark-ec2 comes with a considerable amount of configuration and some
useful utilities, such as deploying to workers, so porting it to a managed
service such as EMR is not as trivial as it might seem.


On Fri, Feb 28, 2014 at 6:19 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:

> I think what you are looking for is sort of a managed service ala EMR or
> Qubole. Spark-ec2 is just software to boot up machines & integrate them
> together using Whirr.
> I agree a managed service for Streaming would be really useful.
> Regards
> Mayur
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Fri, Feb 28, 2014 at 8:50 AM, Aureliano Buendia
> <buendia...@gmail.com> wrote:
>
>> Another subject that was not that important in Spark, but could be
>> crucial for 24/7 Spark Streaming, is the replacement of lost nodes. By that,
>> I do not mean reconstruction of lost data by self-healing, but bringing up new
>> EC2 instances once they die for whatever reason. Is this also supported in
>> spark-ec2?
>>
>>
>> On Fri, Feb 28, 2014 at 2:24 AM, Tathagata Das <
>> tathagata.das1...@gmail.com> wrote:
>>
>>> Yes, the default spark EC2 cluster runs the standalone deploy mode.
>>> Since Spark 0.9, the standalone deploy mode allows you to launch the driver
>>> app within the cluster itself and automatically restart it if it fails. You
>>> can read about launching your app inside the cluster here:
>>> <http://spark.incubator.apache.org/docs/latest/spark-standalone.html#connecting-an-application-to-the-cluster>.
>>> Using this you can launch your streaming app as well.
>>>
>>> TD
>>>
>>>
>>> On Thu, Feb 27, 2014 at 5:35 PM, Aureliano Buendia <buendia...@gmail.com
>>> > wrote:
>>>
>>>> How about the Spark Streaming app itself? Does the ec2 script also provide
>>>> means for daemonizing and monitoring Spark Streaming apps which are
>>>> supposed to run 24/7? If not, any suggestions for how to do this?
>>>>
>>>>
>>>> On Thu, Feb 27, 2014 at 8:23 PM, Tathagata Das <
>>>> tathagata.das1...@gmail.com> wrote:
>>>>
>>>>> Zookeeper is automatically set up in the cluster as Spark uses
>>>>> Zookeeper. However, you have to set up your own input source like Kafka or
>>>>> Flume.
>>>>>
>>>>> TD
>>>>>
>>>>>
>>>>> On Thu, Feb 27, 2014 at 10:32 AM, Aureliano Buendia <
>>>>> buendia...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 27, 2014 at 6:17 PM, Tathagata Das <
>>>>>> tathagata.das1...@gmail.com> wrote:
>>>>>>
>>>>>>> Yes! Spark Streaming programs are just like any Spark program, and so
>>>>>>> any EC2 cluster set up using the spark-ec2 scripts can be used to run
>>>>>>> Spark Streaming programs as well.
>>>>>>>
>>>>>>
>>>>>> Great. Does it come with any input source support as well? (E.g. Kafka
>>>>>> requires setting up Zookeeper.)
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 27, 2014 at 10:11 AM, Aureliano Buendia <
>>>>>>> buendia...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Does the ec2 support for spark 0.9 also include spark streaming? If
>>>>>>>> not, is there an equivalent?
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
