Thanks Marcelo.

But my case is different. My mypython/libs/numpy-1.9.2.zip is in a *local
directory* (I can also put it in HDFS), but the job still fails.

SPARK-5479 <https://issues.apache.org/jira/browse/SPARK-5479>, however, is
about supporting *non-local* Python files for PySpark in YARN mode.

The job fails only when I try to include a third-party dependency from my
local computer with --py-files (in Spark 1.4).

Both of these commands succeed:

./bin/spark-submit --master yarn-cluster --verbose hdfs:///pi.py
./bin/spark-submit --master yarn-cluster --deploy-mode cluster --verbose \
  examples/src/main/python/pi.py

But this particular example, which uses the third-party numpy module, fails:

./bin/spark-submit --verbose --master yarn-cluster --py-files \
  mypython/libs/numpy-1.9.2.zip --deploy-mode cluster \
  mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0


Of these files, mypython/libs/numpy-1.9.2.zip and mypython/scripts/kmeans.py
are local files; kmeans_data.txt is in HDFS.
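
One thing that can be checked independently of SPARK-5479 is how the zip
itself is laid out. Entries passed with --py-files are put on the Python
path, and Python can only import a package from a zip if the package folder
sits at the zip root; it also cannot load compiled extension modules (.so,
.pyd) from inside a zip. Below is a minimal sketch of such a check, not a
confirmed diagnosis. The helper name inspect_py_files_zip, the keys it
returns, and the demo archive layout are all assumptions made up for
illustration.

```python
import io
import zipfile


def inspect_py_files_zip(zip_source, package):
    """Summarize how `package` is laid out inside a zip given to --py-files.

    zip_source may be a path or a file-like object. This helper and its
    result keys are hypothetical, written only for this sketch.
    """
    with zipfile.ZipFile(zip_source) as zf:
        names = zf.namelist()
    init_path = package + "/__init__.py"
    return {
        # Importable from the zip only if <package>/__init__.py is at the root.
        "package_at_root": init_path in names,
        # Where the package actually sits, e.g. under a versioned top folder.
        "package_locations": [
            n for n in names if n == init_path or n.endswith("/" + init_path)
        ],
        # Unbuilt C/Cython sources suggest a source archive that needs a build.
        "has_unbuilt_sources": any(n.endswith((".c", ".pyx")) for n in names),
        # Compiled extensions cannot be imported from inside a zip.
        "has_compiled_extensions": any(
            n.endswith((".so", ".pyd")) for n in names
        ),
    }


# Demo on an in-memory zip with an ASSUMED source-archive layout
# (not the contents of the real numpy-1.9.2.zip).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("numpy-1.9.2/setup.py", "")
    zf.writestr("numpy-1.9.2/numpy/__init__.py", "")
    zf.writestr("numpy-1.9.2/numpy/core/src/multiarray/arrayobject.c", "")
demo.seek(0)
report = inspect_py_files_zip(demo, "numpy")
print(report)
```

To run it against the real archive, call
inspect_py_files_zip("mypython/libs/numpy-1.9.2.zip", "numpy"). If the
package is not at the root, or the archive holds unbuilt or compiled
sources, the zip is unlikely to be importable as-is.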


Thanks.


On Thu, Jun 25, 2015 at 12:22 PM, Marcelo Vanzin <van...@cloudera.com>
wrote:

> That sounds like SPARK-5479 which is not in 1.4...
>
> On Thu, Jun 25, 2015 at 12:17 PM, Elkhan Dadashov <elkhan8...@gmail.com>
> wrote:
>
>> In addition to previous emails, when I try to execute this command from
>> command line:
>>
>> ./bin/spark-submit --verbose --master yarn-cluster --py-files
>>  mypython/libs/numpy-1.9.2.zip --deploy-mode cluster
>> mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0
>>
>>
>> - numpy-1.9.2.zip - is downloaded numpy package
>> - kmeans.py is default example which comes with Spark 1.4
>> - kmeans_data.txt  - is default data file which comes with Spark 1.4
>>
>>
>> It fails saying that it could not find numpy:
>>
>> File "kmeans.py", line 31, in <module>
>>     import numpy
>> ImportError: No module named numpy
>>
>> Has anyone run Python Spark application on Yarn-cluster mode ? (which has
>> 3rd party Python modules to be shipped with)
>>
>> What are the configurations or installations to be done before running
>> Python Spark job with 3rd party dependencies on Yarn-cluster ?
>>
>> Thanks in advance.
>>
>>
> --
> Marcelo
>



-- 

Best regards,
Elkhan Dadashov
