Re: spark python script importError problem

2019-07-16 Thread Patrick McCarthy
Your module 'feature' isn't available to the yarn workers, so you'll need
to either install it on them if you have access, or else upload to the
workers at runtime using --py-files or similar.
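
For the --py-files route, here is a minimal sketch of packaging the module, assuming 'feature' is an ordinary package directory with __init__.py files next to your script. The helper name zip_package and the file names feature.zip and your_script.py are made up for illustration; they do not come from your setup.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip a package so the archive root holds the package folder itself.

    Hypothetical helper: the archive must contain feature/__init__.py
    (not just __init__.py at the root) for `import feature...` to resolve.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Stored as e.g. feature/user/user_feature.py
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path
```

After calling zip_package("feature", "feature.zip"), something like `spark-submit --master yarn --py-files feature.zip your_script.py` should ship the archive with the job. Calling sc.addPyFile("feature.zip") on the SparkContext before the first action is the in-script alternative.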

On Tue, Jul 16, 2019 at 7:16 AM zenglong chen wrote:

> Hi all,
>   When I run a Python script with spark-submit, it works in local[*]
> mode, but not in standalone mode or YARN mode. The error is below:
>
> Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/pyspark/worker.py", line 364, in main
>     func, profiler, deserializer, serializer = read_command(pickleSer, infile)
>   File "/usr/local/lib/python2.7/dist-packages/pyspark/worker.py", line 69, in read_command
>     command = serializer._read_with_length(file)
>   File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 172, in _read_with_length
>     return self.loads(obj)
>   File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 583, in loads
>     return pickle.loads(obj)
> ImportError: No module named feature.user.user_feature
>
> The script also runs fine when the cluster is started with
> "sbin/start-master.sh sbin/start-slave.sh", but it hits the same
> ImportError when started with "sbin/start-master.sh sbin/start-slaves.sh".
> The contents of conf/slaves is 'localhost'.
>
> What should I do to solve this import problem? Thanks!
>


-- 


Patrick McCarthy
Senior Data Scientist, Machine Learning Engineering
Dstillery
470 Park Ave South, 17th Floor, NYC 10016


spark python script importError problem

2019-07-16 Thread zenglong chen
Hi all,
  When I run a Python script with spark-submit, it works in local[*]
mode, but not in standalone mode or YARN mode. The error is below:

Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pyspark/worker.py", line 364, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/usr/local/lib/python2.7/dist-packages/pyspark/worker.py", line 69, in read_command
    command = serializer._read_with_length(file)
  File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 172, in _read_with_length
    return self.loads(obj)
  File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 583, in loads
    return pickle.loads(obj)
ImportError: No module named feature.user.user_feature

The script also runs fine when the cluster is started with
"sbin/start-master.sh sbin/start-slave.sh", but it hits the same
ImportError when started with "sbin/start-master.sh sbin/start-slaves.sh".
The contents of conf/slaves is 'localhost'.

What should I do to solve this import problem? Thanks!