Re: Spark GCE Script

2014-05-23 Thread Aureliano Buendia
On Fri, May 16, 2014 at 11:19 AM, Akhil Das wrote:

> Hi
>
> I have sent a pull request (https://github.com/apache/spark/pull/681);
> you can verify it and add it :)
>

Matei,

Would you please verify this pull request for Jenkins? It has been a couple
of weeks.


>
>
> Thanks
> Best Regards
>
>
> On Thu, May 8, 2014 at 2:58 AM, Aureliano Buendia wrote:
>
>> Please send a pull request; this should be maintained by the community,
>> in case you do not feel like continuing to maintain it.
>>
>> Also, it's nice to see that the GCE version is shorter than the AWS version.
>>
>>
>> On Tue, May 6, 2014 at 10:11 AM, Akhil Das wrote:
>>
>>> Hi Matei,
>>>
>>> Will clean up the code a little bit and send the pull request :)
>>>
>>> Thanks
>>> Best Regards
>>>
>>>
>>> On Tue, May 6, 2014 at 1:00 AM, François Le lay  wrote:
>>>
 Has anyone considered using jclouds tooling to support multiple cloud
 providers? Maybe using Pallet?

 François

 On May 5, 2014, at 3:22 PM, Nicholas Chammas <
 nicholas.cham...@gmail.com> wrote:

 I second this motion. :)

 A unified "cloud deployment" tool would be absolutely great.


 On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia 
 wrote:

> Very cool! Have you thought about sending this as a pull request? We’d
> be happy to maintain it inside Spark, though it might be interesting to
> find a single Python package that can manage clusters across both EC2 and
> GCE.
>
> Matei
>
> On May 5, 2014, at 7:18 AM, Akhil Das 
> wrote:
>
> Hi Sparkers,
>
> We have created a quick spark_gce script which can launch a spark
> cluster in the Google Cloud. I'm sharing it because it might be helpful 
> for
> someone using the Google Cloud for deployment rather than AWS.
>
> Here's the link to the script
>
> https://github.com/sigmoidanalytics/spark_gce
>
> Feel free to use it and suggest any feedback around it.
>
> In short here's what it does:
>
> Just like the spark_ec2 script, this one reads certain command-line
> arguments (see the GitHub page for more details), such as the cluster
> name, then starts the machines in Google Cloud, sets up the network, adds
> a 500GB empty disk to each machine, generates SSH keys on the master and
> transfers them to all slaves, installs Java, and downloads and configures
> Spark/Shark/Hadoop. It also starts the Shark server automatically.
> Currently the version is 0.9.1, but I'm happy to add/support more
> versions if anyone is interested.
>
>
> Cheers.
>
>
> Thanks
> Best Regards
>
>
>

>>>
>>
>


Re: Spark GCE Script

2014-05-16 Thread Akhil Das
Hi

I have sent a pull request (https://github.com/apache/spark/pull/681); you
can verify it and add it :)


Thanks
Best Regards


On Thu, May 8, 2014 at 2:58 AM, Aureliano Buendia wrote:

> Please send a pull request; this should be maintained by the community,
> in case you do not feel like continuing to maintain it.
>
> Also, it's nice to see that the GCE version is shorter than the AWS version.
>
>
> On Tue, May 6, 2014 at 10:11 AM, Akhil Das wrote:
>
>> Hi Matei,
>>
>> Will clean up the code a little bit and send the pull request :)
>>
>> Thanks
>> Best Regards
>>
>>
>> On Tue, May 6, 2014 at 1:00 AM, François Le lay  wrote:
>>
>>> Has anyone considered using jclouds tooling to support multiple cloud
>>> providers? Maybe using Pallet?
>>>
>>> François
>>>
>>> On May 5, 2014, at 3:22 PM, Nicholas Chammas 
>>> wrote:
>>>
>>> I second this motion. :)
>>>
>>> A unified "cloud deployment" tool would be absolutely great.
>>>
>>>
>>> On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia 
>>> wrote:
>>>
 Very cool! Have you thought about sending this as a pull request? We’d
 be happy to maintain it inside Spark, though it might be interesting to
 find a single Python package that can manage clusters across both EC2 and
 GCE.

 Matei

 On May 5, 2014, at 7:18 AM, Akhil Das 
 wrote:

 Hi Sparkers,

 We have created a quick spark_gce script which can launch a spark
 cluster in the Google Cloud. I'm sharing it because it might be helpful for
 someone using the Google Cloud for deployment rather than AWS.

 Here's the link to the script

 https://github.com/sigmoidanalytics/spark_gce

 Feel free to use it and suggest any feedback around it.

 In short here's what it does:

 Just like the spark_ec2 script, this one reads certain command-line
 arguments (see the GitHub page for more details), such as the cluster
 name, then starts the machines in Google Cloud, sets up the network, adds
 a 500GB empty disk to each machine, generates SSH keys on the master and
 transfers them to all slaves, installs Java, and downloads and configures
 Spark/Shark/Hadoop. It also starts the Shark server automatically.
 Currently the version is 0.9.1, but I'm happy to add/support more
 versions if anyone is interested.


 Cheers.


 Thanks
 Best Regards



>>>
>>
>


Re: Spark GCE Script

2014-05-15 Thread Aureliano Buendia
Please send a pull request; this should be maintained by the community, in
case you do not feel like continuing to maintain it.

Also, it's nice to see that the GCE version is shorter than the AWS version.


On Tue, May 6, 2014 at 10:11 AM, Akhil Das wrote:

> Hi Matei,
>
> Will clean up the code a little bit and send the pull request :)
>
> Thanks
> Best Regards
>
>
> On Tue, May 6, 2014 at 1:00 AM, François Le lay  wrote:
>
>> Has anyone considered using jclouds tooling to support multiple cloud
>> providers? Maybe using Pallet?
>>
>> François
>>
>> On May 5, 2014, at 3:22 PM, Nicholas Chammas 
>> wrote:
>>
>> I second this motion. :)
>>
>> A unified "cloud deployment" tool would be absolutely great.
>>
>>
>> On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia wrote:
>>
>>> Very cool! Have you thought about sending this as a pull request? We’d
>>> be happy to maintain it inside Spark, though it might be interesting to
>>> find a single Python package that can manage clusters across both EC2 and
>>> GCE.
>>>
>>> Matei
>>>
>>> On May 5, 2014, at 7:18 AM, Akhil Das 
>>> wrote:
>>>
>>> Hi Sparkers,
>>>
>>> We have created a quick spark_gce script which can launch a spark
>>> cluster in the Google Cloud. I'm sharing it because it might be helpful for
>>> someone using the Google Cloud for deployment rather than AWS.
>>>
>>> Here's the link to the script
>>>
>>> https://github.com/sigmoidanalytics/spark_gce
>>>
>>> Feel free to use it and suggest any feedback around it.
>>>
>>> In short here's what it does:
>>>
>>> Just like the spark_ec2 script, this one reads certain command-line
>>> arguments (see the GitHub page for more details), such as the cluster
>>> name, then starts the machines in Google Cloud, sets up the network, adds
>>> a 500GB empty disk to each machine, generates SSH keys on the master and
>>> transfers them to all slaves, installs Java, and downloads and configures
>>> Spark/Shark/Hadoop. It also starts the Shark server automatically.
>>> Currently the version is 0.9.1, but I'm happy to add/support more
>>> versions if anyone is interested.
>>>
>>>
>>> Cheers.
>>>
>>>
>>> Thanks
>>> Best Regards
>>>
>>>
>>>
>>
>


Re: Spark GCE Script

2014-05-06 Thread Akhil Das
Hi Matei,

Will clean up the code a little bit and send the pull request :)

Thanks
Best Regards


On Tue, May 6, 2014 at 1:00 AM, François Le lay  wrote:

> Has anyone considered using jclouds tooling to support multiple cloud
> providers? Maybe using Pallet?
>
> François
>
> On May 5, 2014, at 3:22 PM, Nicholas Chammas 
> wrote:
>
> I second this motion. :)
>
> A unified "cloud deployment" tool would be absolutely great.
>
>
> On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia wrote:
>
>> Very cool! Have you thought about sending this as a pull request? We’d be
>> happy to maintain it inside Spark, though it might be interesting to find a
>> single Python package that can manage clusters across both EC2 and GCE.
>>
>> Matei
>>
>> On May 5, 2014, at 7:18 AM, Akhil Das  wrote:
>>
>> Hi Sparkers,
>>
>> We have created a quick spark_gce script which can launch a spark cluster
>> in the Google Cloud. I'm sharing it because it might be helpful for someone
>> using the Google Cloud for deployment rather than AWS.
>>
>> Here's the link to the script
>>
>> https://github.com/sigmoidanalytics/spark_gce
>>
>> Feel free to use it and suggest any feedback around it.
>>
>> In short here's what it does:
>>
>> Just like the spark_ec2 script, this one reads certain command-line
>> arguments (see the GitHub page for more details), such as the cluster
>> name, then starts the machines in Google Cloud, sets up the network, adds
>> a 500GB empty disk to each machine, generates SSH keys on the master and
>> transfers them to all slaves, installs Java, and downloads and configures
>> Spark/Shark/Hadoop. It also starts the Shark server automatically.
>> Currently the version is 0.9.1, but I'm happy to add/support more
>> versions if anyone is interested.
>>
>>
>> Cheers.
>>
>>
>> Thanks
>> Best Regards
>>
>>
>>
>


Re: Spark GCE Script

2014-05-05 Thread François Le lay
Has anyone considered using jclouds tooling to support multiple cloud 
providers? Maybe using Pallet?

François

> On May 5, 2014, at 3:22 PM, Nicholas Chammas  
> wrote:
> 
> I second this motion. :)
> 
> A unified "cloud deployment" tool would be absolutely great.
> 
> 
> On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia  wrote:
>> Very cool! Have you thought about sending this as a pull request? We’d be 
>> happy to maintain it inside Spark, though it might be interesting to find a 
>> single Python package that can manage clusters across both EC2 and GCE.
>> 
>> Matei
>> 
>>> On May 5, 2014, at 7:18 AM, Akhil Das  wrote:
>>> 
>>> Hi Sparkers,
>>> 
>>> We have created a quick spark_gce script which can launch a spark cluster 
>>> in the Google Cloud. I'm sharing it because it might be helpful for someone 
>>> using the Google Cloud for deployment rather than AWS.
>>> 
>>> Here's the link to the script
>>> 
>>> https://github.com/sigmoidanalytics/spark_gce
>>> 
>>> Feel free to use it and suggest any feedback around it.
>>> 
>>> In short here's what it does:
>>> 
>>> Just like the spark_ec2 script, this one reads certain command-line
>>> arguments (see the GitHub page for more details), such as the cluster
>>> name, then starts the machines in Google Cloud, sets up the network, adds
>>> a 500GB empty disk to each machine, generates SSH keys on the master and
>>> transfers them to all slaves, installs Java, and downloads and configures
>>> Spark/Shark/Hadoop. It also starts the Shark server automatically.
>>> Currently the version is 0.9.1, but I'm happy to add/support more
>>> versions if anyone is interested.
>>> 
>>> 
>>> Cheers.
>>> 
>>> 
>>> Thanks
>>> Best Regards
> 


Re: Spark GCE Script

2014-05-05 Thread Nicholas Chammas
I second this motion. :)

A unified "cloud deployment" tool would be absolutely great.


On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia wrote:

> Very cool! Have you thought about sending this as a pull request? We’d be
> happy to maintain it inside Spark, though it might be interesting to find a
> single Python package that can manage clusters across both EC2 and GCE.
>
> Matei
>
> On May 5, 2014, at 7:18 AM, Akhil Das  wrote:
>
> Hi Sparkers,
>
> We have created a quick spark_gce script which can launch a spark cluster
> in the Google Cloud. I'm sharing it because it might be helpful for someone
> using the Google Cloud for deployment rather than AWS.
>
> Here's the link to the script
>
> https://github.com/sigmoidanalytics/spark_gce
>
> Feel free to use it and suggest any feedback around it.
>
> In short here's what it does:
>
> Just like the spark_ec2 script, this one reads certain command-line
> arguments (see the GitHub page for more details), such as the cluster
> name, then starts the machines in Google Cloud, sets up the network, adds
> a 500GB empty disk to each machine, generates SSH keys on the master and
> transfers them to all slaves, installs Java, and downloads and configures
> Spark/Shark/Hadoop. It also starts the Shark server automatically.
> Currently the version is 0.9.1, but I'm happy to add/support more
> versions if anyone is interested.
>
>
> Cheers.
>
>
> Thanks
> Best Regards
>
>
>


Re: Spark GCE Script

2014-05-05 Thread Matei Zaharia
Very cool! Have you thought about sending this as a pull request? We’d be happy 
to maintain it inside Spark, though it might be interesting to find a single 
Python package that can manage clusters across both EC2 and GCE.

Matei

On May 5, 2014, at 7:18 AM, Akhil Das  wrote:

> Hi Sparkers,
> 
> We have created a quick spark_gce script which can launch a spark cluster in 
> the Google Cloud. I'm sharing it because it might be helpful for someone 
> using the Google Cloud for deployment rather than AWS.
> 
> Here's the link to the script
> 
> https://github.com/sigmoidanalytics/spark_gce
> 
> Feel free to use it and suggest any feedback around it.
> 
> In short here's what it does:
> 
> Just like the spark_ec2 script, this one reads certain command-line
> arguments (see the GitHub page for more details), such as the cluster
> name, then starts the machines in Google Cloud, sets up the network, adds
> a 500GB empty disk to each machine, generates SSH keys on the master and
> transfers them to all slaves, installs Java, and downloads and configures
> Spark/Shark/Hadoop. It also starts the Shark server automatically.
> Currently the version is 0.9.1, but I'm happy to add/support more
> versions if anyone is interested.
> 
> 
> Cheers.
> 
> 
> Thanks
> Best Regards
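
Matei's suggestion of a single Python package that manages clusters on both
EC2 and GCE is essentially what Apache Libcloud offers. The sketch below is
only an illustration of that idea, not part of spark_ec2 or spark_gce; the
credentials, project, zone, and instance names are placeholders.

    # Hypothetical sketch: one code path that can start nodes on either EC2
    # or GCE through Apache Libcloud. All credentials and names below are
    # placeholders.
    from libcloud.compute.types import Provider
    from libcloud.compute.providers import get_driver

    def make_driver(cloud):
        """Return a Libcloud compute driver for the requested cloud."""
        if cloud == "ec2":
            cls = get_driver(Provider.EC2)
            return cls("AWS_ACCESS_KEY", "AWS_SECRET_KEY", region="us-east-1")
        if cloud == "gce":
            cls = get_driver(Provider.GCE)
            return cls("1234@developer.gserviceaccount.com", "/path/to/key.pem",
                       project="my-project", datacenter="us-central1-a")
        raise ValueError("unsupported cloud: %s" % cloud)

    def launch_cluster(cloud, cluster_name, num_slaves):
        """Start one master and num_slaves workers on the chosen cloud."""
        driver = make_driver(cloud)
        size = driver.list_sizes()[0]    # pick a machine type
        image = driver.list_images()[0]  # pick a base image
        names = ["%s-master" % cluster_name] + \
                ["%s-slave%d" % (cluster_name, i) for i in range(num_slaves)]
        return [driver.create_node(name=n, size=size, image=image)
                for n in names]

    # Example: launch_cluster("gce", "spark-test", num_slaves=2)

Whether Libcloud (or jclouds/Pallet, as François suggests) is the right fit
depends on how much provider-specific setup the script needs, but it would
avoid maintaining two separate launchers.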



Spark GCE Script

2014-05-05 Thread Akhil Das
Hi Sparkers,

We have created a quick spark_gce script which can launch a Spark cluster
on Google Cloud. I'm sharing it because it might be helpful for someone
deploying on Google Cloud rather than AWS.

Here's the link to the script

https://github.com/sigmoidanalytics/spark_gce

Feel free to use it and send any feedback.

In short here's what it does:

Just like the spark_ec2 script, this one reads certain command-line
arguments (see the GitHub page for more details), such as the cluster name,
then starts the machines in Google Cloud, sets up the network, adds a 500GB
empty disk to each machine, generates SSH keys on the master and transfers
them to all slaves, installs Java, and downloads and configures
Spark/Shark/Hadoop. It also starts the Shark server automatically. Currently
the version is 0.9.1, but I'm happy to add/support more versions if anyone
is interested.


Cheers.


Thanks
Best Regards
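
For anyone curious what the flow described above looks like in code, here is
a minimal, hypothetical sketch of the same sequence of steps: create the
instances, add and attach an empty 500GB disk, then configure everything
over SSH. It is not the actual spark_gce script; it shells out to the gcloud
CLI purely for illustration, and the zone, machine type, and naming scheme
are assumptions. See the GitHub repository for the real implementation.

    #!/usr/bin/env python
    # Hypothetical sketch of a spark_gce-style launch flow; not the actual
    # script. Zone, machine type, and names are placeholder assumptions.
    import subprocess

    def run(cmd):
        """Run a shell command and fail loudly, as a launcher script would."""
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    def launch_cluster(cluster, num_slaves, zone="us-central1-a",
                       machine_type="n1-standard-4", disk_gb=500):
        nodes = ["%s-master" % cluster] + \
                ["%s-slave%d" % (cluster, i) for i in range(num_slaves)]
        for node in nodes:
            # Start the VM (network and firewall setup omitted here).
            run(["gcloud", "compute", "instances", "create", node,
                 "--zone", zone, "--machine-type", machine_type])
            # Create an empty persistent disk and attach it to the VM.
            run(["gcloud", "compute", "disks", "create", node + "-data",
                 "--size", "%dGB" % disk_gb, "--zone", zone])
            run(["gcloud", "compute", "instances", "attach-disk", node,
                 "--disk", node + "-data", "--zone", zone])
        # The real script would then generate SSH keys on the master, copy
        # them to the slaves, and install Java, Spark, Shark and Hadoop
        # over SSH before starting the Shark server.

    if __name__ == "__main__":
        launch_cluster("spark-test", num_slaves=2)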