Re: Installing a python library along with ec2 cluster
Hi,

Please take a look at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/creating-an-ami-ebs.html

Cheers
Gen

On Mon, Feb 9, 2015 at 6:41 AM, Chengi Liu chengi.liu...@gmail.com wrote:

Hi,

I am very new to both Spark and AWS. Say I want to install pandas on EC2 (pip install pandas). How do I create an image that includes this library so that it can be used from pyspark?

Thanks

On Sun, Feb 8, 2015 at 3:03 AM, gen tang gen.tan...@gmail.com wrote:

Hi,

You can make an EC2 image with all the Python libraries installed and create a bash script in the /etc/init.d/ directory that exports PYTHONPATH. Then you can launch the cluster with this image and ec2.py.

Hope this helps.

Cheers
Gen

On Sun, Feb 8, 2015 at 9:46 AM, Chengi Liu chengi.liu...@gmail.com wrote:

Hi,

I want to install a couple of Python libraries (pip install python_library) to use on a pyspark cluster that was launched using the ec2 scripts. Is there a way to specify these libraries when I am building those EC2 clusters? What's the best way to install these libraries on each EC2 node?

Thanks
Re: Installing a python library along with ec2 cluster
Hi,

You can make an EC2 image with all the Python libraries installed and create a bash script in the /etc/init.d/ directory that exports PYTHONPATH. Then you can launch the cluster with this image and ec2.py.

Hope this helps.

Cheers
Gen

On Sun, Feb 8, 2015 at 9:46 AM, Chengi Liu chengi.liu...@gmail.com wrote:

Hi,

I want to install a couple of Python libraries (pip install python_library) to use on a pyspark cluster that was launched using the ec2 scripts. Is there a way to specify these libraries when I am building those EC2 clusters? What's the best way to install these libraries on each EC2 node?

Thanks
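
For anyone following along, here is a rough sketch of the "launch the cluster with this image" step. It only builds the command line and runs nothing. It assumes the spark-ec2 wrapper script and its --ami, --key-pair, --identity-file and --slaves options; check ./spark-ec2 --help for your version. The cluster name, AMI ID, key pair and key file are placeholders, and build_launch_command is a hypothetical helper, not part of Spark.

```python
import shlex


def build_launch_command(cluster_name, ami_id, key_pair, identity_file, num_slaves=2):
    """Return the argument list for launching a cluster from a custom AMI.

    Hypothetical helper: it only assembles the arguments. The option names
    are assumed to match the spark-ec2 script shipped with your Spark version.
    """
    return [
        "./spark-ec2",
        "--key-pair", key_pair,
        "--identity-file", identity_file,
        "--slaves", str(num_slaves),
        "--ami", ami_id,  # the image that already has the Python libraries installed
        "launch", cluster_name,
    ]


# Placeholder values, replace them with your own.
cmd = build_launch_command("my-cluster", "ami-xxxxxxxx", "my-keypair", "my-key.pem", num_slaves=3)

# Print the command so it can be reviewed before running it by hand.
print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command first makes it easy to confirm that the AMI ID points at the image you prepared before any instances are started.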
Re: Installing a python library along with ec2 cluster
Hi,

I am very new to both Spark and AWS. Say I want to install pandas on EC2 (pip install pandas). How do I create an image that includes this library so that it can be used from pyspark?

Thanks

On Sun, Feb 8, 2015 at 3:03 AM, gen tang gen.tan...@gmail.com wrote:

Hi,

You can make an EC2 image with all the Python libraries installed and create a bash script in the /etc/init.d/ directory that exports PYTHONPATH. Then you can launch the cluster with this image and ec2.py.

Hope this helps.

Cheers
Gen

On Sun, Feb 8, 2015 at 9:46 AM, Chengi Liu chengi.liu...@gmail.com wrote:

Hi,

I want to install a couple of Python libraries (pip install python_library) to use on a pyspark cluster that was launched using the ec2 scripts. Is there a way to specify these libraries when I am building those EC2 clusters? What's the best way to install these libraries on each EC2 node?

Thanks
Re: Installing a python library along with ec2 cluster
You can add one function call to install the packages you want. If you look at the spark-ec2 script, there is a function named setup_cluster(..) that does all the setup: https://github.com/apache/spark/blob/master/ec2/spark_ec2.py#L625

If you want to install a Python library (assuming pip is already installed), you can add one more line to that function, like:

ssh(master, opts, "pip install pandas")

This installs it on the master node. The slave_nodes variable holds the information for all the slave machines; you can iterate over it and do the same on each one.

Thanks
Best Regards

On Sun, Feb 8, 2015 at 2:16 PM, Chengi Liu chengi.liu...@gmail.com wrote:

Hi,

I want to install a couple of Python libraries (pip install python_library) to use on a pyspark cluster that was launched using the ec2 scripts. Is there a way to specify these libraries when I am building those EC2 clusters? What's the best way to install these libraries on each EC2 node?

Thanks
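
A minimal sketch of the loop described above, for reference. It is not the actual spark_ec2.py code: install_python_packages is a hypothetical helper, the ssh helper is passed in as a parameter so the function can be tried out without a cluster, and the nodes are assumed to expose a public_dns_name attribute (as boto instance objects do). Inside setup_cluster you would pass the script's own ssh function.

```python
def install_python_packages(master_nodes, slave_nodes, opts, packages, ssh):
    """Run `pip install` for `packages` on the master and on every slave.

    `ssh` is expected to be callable as ssh(host, opts, command).
    Assumption: each node object has a `public_dns_name` attribute.
    """
    command = "pip install " + " ".join(packages)
    hosts = [node.public_dns_name for node in list(master_nodes) + list(slave_nodes)]
    for host in hosts:
        ssh(host, opts, command)
    return hosts


# Dry run with stand-in objects, so nothing is actually installed anywhere.
class FakeNode(object):
    def __init__(self, public_dns_name):
        self.public_dns_name = public_dns_name


recorded = []


def fake_ssh(host, opts, command):
    # Record the call instead of opening a real SSH connection.
    recorded.append((host, command))


visited = install_python_packages(
    [FakeNode("master.example.com")],
    [FakeNode("slave1.example.com"), FakeNode("slave2.example.com")],
    None,
    ["pandas"],
    fake_ssh,
)
print(visited)
```

Keep in mind that nodes added to the cluster later would not get the packages this way, which is one reason the custom image approach mentioned earlier in the thread may be preferable.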