Hi all, I'm trying to switch from the pyspark interpreter to the python interpreter, and I'm running into odd py4j errors like "KeyError: 'x'" or "invalid command" when creating a Spark session.
A little digging reveals that Zeppelin stuffs its own py4j into the PYTHONPATH of the python interpreter. The value of PYTHONPATH I see from within a python interpreter notebook is:

/usr/lib/zeppelin/interpreter/python/py4j-0.9.2/src:/usr/lib/spark/python:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip

The parts under /usr/lib/spark were added to PYTHONPATH by me in zeppelin-env.sh. However, no matter how I arrange the order of paths in PYTHONPATH, Zeppelin somehow puts its own py4j first, preventing the python interpreter from finding the right py4j that ships with Spark. I suppose Zeppelin manipulates PYTHONPATH after loading zeppelin-env.sh.

The reason I prefer the python interpreter is that I would like fine control over Spark parameters per notebook (whereas the pyspark interpreter uses the same set of Spark parameters for all notebooks).

Has anyone run into the same issue, or is there a workaround? Thanks!

--
Lu Rui
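P.S. One thing I've been experimenting with as a possible workaround (just a sketch, not verified against every Zeppelin version) is reordering sys.path at the top of the notebook so Zeppelin's bundled py4j is demoted behind Spark's copy before anything imports py4j. The path string below is the Zeppelin py4j entry from my PYTHONPATH above:

```python
import sys

# The py4j copy Zeppelin injects (taken from my PYTHONPATH; adjust to
# whatever your environment shows).
ZEPPELIN_PY4J = "/usr/lib/zeppelin/interpreter/python/py4j-0.9.2/src"

# Move Zeppelin's py4j path to the end of sys.path so that Spark's
# py4j (e.g. /usr/lib/spark/python/lib/py4j-0.10.7-src.zip) is found
# first on import. Must run before the first `import py4j`.
sys.path = [p for p in sys.path if p != ZEPPELIN_PY4J] + [ZEPPELIN_PY4J]
```

This only helps if py4j hasn't already been imported by the interpreter process, so it may not work in all setups.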