PySpark: Python 2.7 cluster installation script (with NumPy, IPython, etc.)

Sebastián Ramírez Wed, 11 Mar 2015 14:43:12 -0700

Often, when I'm setting up a cluster, I have to use an operating
system (such as RedHat or CentOS 6.5) that ships with an old version of
Python (Python 2.6). This happens, for example, when using a Hadoop
distribution that only supports those operating systems (such as
Hortonworks' HDP or Cloudera).


That also makes it difficult to install additional Python packages
(such as NumPy, IPython, etc.).

So I tend to use Anaconda Python, a Python distribution with many of
those packages pre-built and pre-installed.

But installing Anaconda on each node of the cluster can be tedious.

So I made a *simple script that makes it easier to install Anaconda
Python on the machines of a cluster*.

I wanted to share it here, in case it can help anyone who wants to use
PySpark.

https://github.com/tiangolo/anaconda_cluster_install
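
As a rough illustration of the general idea (this is not the contents of
the repository above, just a minimal sketch), the per-node work can be
reduced to running the Anaconda installer non-interactively over SSH. The
sketch below only builds the commands; it does not run them. It is written
for Python 3 on the machine you administer the cluster from. The host
names, the installer path and the install prefix are all hypothetical, and
it assumes passwordless SSH and that the installer file has already been
copied to each node. The `-b` (batch mode) and `-p` (install prefix)
options are the ones the Anaconda shell installer accepts.

```python
import shlex


def build_install_commands(hosts,
                           installer="/tmp/Anaconda-installer.sh",  # hypothetical path
                           prefix="/opt/anaconda"):                 # hypothetical prefix
    """Return one ssh command (as an argument list) per host.

    Each command runs the Anaconda installer on the remote node in
    batch mode (-b), installing into the given prefix (-p).
    """
    remote = "bash {} -b -p {}".format(shlex.quote(installer),
                                       shlex.quote(prefix))
    return [["ssh", host, remote] for host in hosts]


if __name__ == "__main__":
    # Hypothetical node names; replace with the real cluster hosts.
    nodes = ["node1.example.com", "node2.example.com"]
    for cmd in build_install_commands(nodes):
        # Print the commands for review instead of executing them.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each argument list could then be passed to `subprocess.check_call` once it
has been reviewed. After the installation, PySpark can be pointed at the
new interpreter through the `PYSPARK_PYTHON` environment variable (for
example `/opt/anaconda/bin/python`, assuming the prefix used above).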


*Sebastián Ramírez*
Head of Software Development

 <http://www.senseta.com>
________________
 Tel: (+571) 795 7950 ext: 1012
 Cel: (+57) 300 370 77 10
 Calle 73 No 7 - 06  Piso 4
 Linkedin: co.linkedin.com/in/tiangolo/
 Twitter: @tiangolo <https://twitter.com/tiangolo>
 Email: sebastian.rami...@senseta.com
 www.senseta.com

