ok, I see now what's happening - pkg.mod.test is serialized by reference, 
and since nothing actually imports pkg.mod on the executors, the 
reference is broken.
so how can I get pkg.mod imported on the executors?
thanks, Antony.
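(For the archives: the "serialized by reference" behaviour described above can be seen with plain pickle, independent of Spark - a module-level function is pickled as its module and name only, so the receiving process must be able to import that module itself. A minimal sketch, using json.loads as a stand-in for pkg.mod.test:)

```python
import pickle
import json

# Pickling a module-level function stores only a reference
# ("json", "loads"), not the function's code; unpickling
# re-imports the module and looks the name up again.
payload = pickle.dumps(json.loads)
assert b"json" in payload and b"loads" in payload

# Unpickling works here only because "json" is importable in
# this process - exactly what is missing on the executors.
restored = pickle.loads(payload)
assert restored is json.loads
```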

On Friday, 2 January 2015, 13:49, Antony Mayi <antonym...@yahoo.com> wrote:

Hi,
I am running Spark 1.1.0 on YARN. I have a custom set of modules installed under 
the same location on each executor node and I am wondering how I can pass the 
executors the PYTHONPATH so that they can use the modules.
I've tried this:

spark-env.sh:
    export PYTHONPATH=/tmp/test/

spark-defaults.conf:
    spark.executorEnv.PYTHONPATH=/tmp/test/

/tmp/test/pkg/ contains __init__.py and mod.py, where mod.py is:
    def test(x):
        return x
from the pyspark shell I can import the module pkg.mod without any issues:

>>> import pkg.mod
>>> print pkg.mod.test(1)
1

also the path is correctly set:

>>> print os.environ['PYTHONPATH']
/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip:/usr/lib/spark/python/:/tmp/test/
>>> print sys.path
['', '/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip', '/usr/lib/spark/python', '/tmp/test/', ... ]

it is even seen by the executors:

>>> sc.parallelize(range(4)).map(lambda x: os.environ['PYTHONPATH']).collect()
['/u01/yarn/local/usercache/user/filecache/24/spark-assembly-1.1.0-cdh5.2.1-hadoop2.5.0-cdh5.2.1.jar:/tmp/test/:/tmp/test/',
 '/u01/yarn/local/usercache/user/filecache/24/spark-assembly-1.1.0-cdh5.2.1-hadoop2.5.0-cdh5.2.1.jar:/tmp/test/:/tmp/test/',
 '/u01/yarn/local/usercache/user/filecache/24/spark-assembly-1.1.0-cdh5.2.1-hadoop2.5.0-cdh5.2.1.jar:/tmp/test/:/tmp/test/',
 '/u02/yarn/local/usercache/user/filecache/24/spark-assembly-1.1.0-cdh5.2.1-hadoop2.5.0-cdh5.2.1.jar:/tmp/test/:/tmp/test/']
yet it fails when actually using the module on the executor:

>>> sc.parallelize(range(4)).map(pkg.mod.test).collect()
...
ImportError: No module named mod
...
any idea how to achieve this? I don't want to use sc.addPyFile as these are big 
packages and they are installed everywhere anyway...
thank you, Antony.
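(A workaround sketch for later readers, assuming the packages really are installed at the same path on every node: defer the import into the mapped function, so each executor resolves pkg.mod from its own PYTHONPATH at call time instead of relying on the pickled by-reference lookup. The scratch-directory setup below just recreates the /tmp/test layout so the snippet runs standalone; run_test is a hypothetical wrapper name, and the final list comprehension stands in for rdd.map(run_test).collect():)

```python
import os
import sys
import tempfile

# Recreate the /tmp/test/pkg layout from the thread in a scratch dir
# (on a real cluster the package is already installed on every node).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "pkg"))
open(os.path.join(root, "pkg", "__init__.py"), "w").close()
with open(os.path.join(root, "pkg", "mod.py"), "w") as f:
    f.write("def test(x):\n    return x\n")

sys.path.insert(0, root)

def run_test(x):
    # Importing inside the function means the worker resolves
    # pkg.mod from its own sys.path when the function runs, not
    # when the closure is pickled on the driver.
    import pkg.mod
    return pkg.mod.test(x)

# stands in for sc.parallelize(range(4)).map(run_test).collect()
print([run_test(x) for x in range(4)])  # [0, 1, 2, 3]
```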