Re: numpy + pyspark

2014-06-27 Thread Avishek Saha
I felt the same, Nick, but unfortunately I don't have root privileges on
the cluster. Are there any alternatives?


On 27 June 2014 08:04, Nick Pentreath nick.pentre...@gmail.com wrote:

 I've not tried this, but numpy is a tricky, complex package with many
 dependencies on Fortran/C libraries. I'd say that by the time you figure out
 how to deploy numpy correctly in this manner, you may as well have built
 it into your cluster bootstrap process, or used PSSH to install it on each node...


 On Fri, Jun 27, 2014 at 4:58 PM, Avishek Saha avishek.s...@gmail.com
 wrote:

 To clarify, I tried it and it almost worked, but I am getting some
 problems from numpy's random module. If anyone has successfully passed
 a numpy module to spark-submit (via the --py-files option), please let
 me know.

 Thanks !!
 Avishek


 On 26 June 2014 17:45, Avishek Saha avishek.s...@gmail.com wrote:

 Hi all,

 Instead of installing numpy on each worker node, is it possible to
 ship numpy (perhaps via the --py-files option) when invoking
 spark-submit?

 Thanks,
 Avishek






Re: numpy + pyspark

2014-06-27 Thread Shannon Quinn
Would deploying virtualenv on each directory on the cluster be viable? 
The dependencies would get tricky but I think this is the sort of 
situation it's built for.
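
A rough sketch of the virtualenv idea, in Python. It uses the
standard-library venv module in place of the virtualenv tool (so it
assumes Python 3), and it assumes the environment is created at the same
user-writable path on every node. The helper names create_user_env and
spark_env_for and the path ~/envs/numpy-env are made up for illustration.
PYSPARK_PYTHON is the environment variable PySpark reads to choose its
Python interpreter.

```python
import os
import venv
from pathlib import Path


def create_user_env(env_dir, with_pip=True):
    """Create a virtual environment in a user-writable directory (no root needed).

    Returns the path of the environment's Python interpreter.
    """
    env_path = Path(env_dir).expanduser()
    venv.create(env_path, with_pip=with_pip)
    # The interpreter lives in bin/ on POSIX and Scripts/ on Windows.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return env_path / bin_dir / exe


def spark_env_for(interpreter, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON pointing at interpreter."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = str(interpreter)
    return env


if __name__ == "__main__":
    # Hypothetical location; it would have to exist on every worker node.
    python_path = create_user_env("~/envs/numpy-env")
    # numpy can then be installed with the environment's own pip, e.g.
    #   ~/envs/numpy-env/bin/pip install numpy
    print(spark_env_for(python_path)["PYSPARK_PYTHON"])
```

The dictionary returned by spark_env_for could be passed as the env
argument of subprocess.run when launching spark-submit.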










Re: numpy + pyspark

2014-06-27 Thread Shannon Quinn
I suppose along those lines, there's also Anaconda: 
https://store.continuum.io/cshop/anaconda/


On 6/27/14, 11:13 AM, Nick Pentreath wrote:
Hadoopy uses http://www.pyinstaller.org/ to package things up into an
executable that should be runnable without root privileges. It says it
supports numpy.














numpy + pyspark

2014-06-26 Thread Avishek Saha
Hi all,

Instead of installing numpy on each worker node, is it possible to
ship numpy (perhaps via the --py-files option) when invoking
spark-submit?

Thanks,
Avishek
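
A minimal Python sketch of a pre-flight check for the --py-files route.
Python's zipimport mechanism cannot load compiled extension modules
(.so/.pyd files) from a zip archive, which is a plausible reason a
package like numpy would only partly work when shipped this way. The
function names find_compiled_modules and zip_pure_python_package are
hypothetical, and the suffix list is a simplification.

```python
import os
import zipfile

# Suffixes of compiled extension modules; zipimport cannot load these.
COMPILED_SUFFIXES = (".so", ".pyd")


def find_compiled_modules(package_dir):
    """Return sorted paths (relative to package_dir) of compiled extension files."""
    found = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            if name.endswith(COMPILED_SUFFIXES):
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, package_dir))
    return sorted(found)


def zip_pure_python_package(package_dir, zip_path):
    """Zip a package for --py-files, refusing if it contains compiled modules."""
    compiled = find_compiled_modules(package_dir)
    if compiled:
        raise ValueError(
            "compiled modules cannot be imported from a zip: %s" % compiled
        )
    parent = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent directory so the
                # package directory itself sits at the root of the zip.
                zf.write(full, os.path.relpath(full, parent))
    return zip_path
```

For a pure-Python package, the resulting archive could then be passed as
spark-submit --py-files mypkg.zip job.py (file names here are
placeholders). For a package that fails the check, one of the per-node
installation approaches discussed in the replies is the safer bet.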