You can do it.
If you understand how Hadoop works, you'll realize that this is really
a Python question and a Linux question.
Pass the native files via -files and set up the environment variables
via mapred.child.env.
I've done a similar thing with Ruby. For Ruby, the environment
variables are PATH,
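
To make the two pieces concrete, here is a rough sketch of what such a
streaming invocation could look like. The archive name, the #python27
symlink, the HDFS path, and the mapper/reducer script names are all
illustrative assumptions, not from this thread; also note that a packed
Python build is shipped with -archives (which unpacks on the task
nodes), while loose native files and scripts go through -files:

```shell
# Hypothetical layout: python27.tar.gz contains a relocatable Python 2.7
# build with scipy installed; it is unpacked on each task node and
# symlinked into the working directory as ./python27.
${HADOOP_HOME}/bin/hadoop jar ${STREAMING_JAR} \
    -D mapred.child.env="PATH=./python27/bin:.,PYTHONHOME=./python27" \
    -archives hdfs:///tools/python27.tar.gz#python27 \
    -files mapper.py,reducer.py \
    -input "${input}" \
    -output "${output}" \
    -mapper "./python27/bin/python mapper.py" \
    -reducer "./python27/bin/python reducer.py"
```

Generic options (-D, -files, -archives) have to come before the
streaming-specific options, or the job will fail to parse them.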
Hi,
I have successfully installed scipy for Python 2.7 on my local Linux
machine, and I want to pack my Python 2.7 (with scipy) onto Hadoop and
run my Python MapReduce scripts, like this:
${HADOOP_HOME}/bin/hadoop streaming \
    -input ${input} \
    -output ${output} \