This isn't going to help with submitting to a remote cluster, though. You need to explicitly include the dependencies when you submit.
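
For what it's worth, here is a minimal sketch of one way to do that, not a definitive recipe. It assumes the helper code lives in a package folder named sparkutils (as the `where` output quoted below suggests), that the folder contains an `__init__.py`, and that the script imports it as `from sparkutils import sparkstuff`. The function names `zip_package` and `build_submit_command` are made up for illustration. The only Spark-specific parts are the `--py-files` and `--jars` options of spark-submit, which both take comma-separated lists.

```python
import zipfile
from pathlib import Path


def zip_package(package_dir, zip_path):
    """Zip a Python package, keeping its top-level folder name inside the archive."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for py_file in sorted(package_dir.rglob("*.py")):
            # Store entries relative to the parent folder, e.g.
            # "sparkutils/sparkstuff.py", so the package name is importable.
            arcname = py_file.relative_to(package_dir.parent).as_posix()
            zf.write(py_file, arcname)
    return Path(zip_path)


def build_submit_command(script, py_files, jars=()):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    cmd = ["spark-submit"]
    if jars:
        # --jars takes a comma-separated list of jar paths.
        cmd += ["--jars", ",".join(str(j) for j in jars)]
    if py_files:
        # --py-files takes a comma-separated list of .py, .zip or .egg files
        # that Spark adds to the Python path of the driver and executors.
        cmd += ["--py-files", ",".join(str(p) for p in py_files)]
    cmd.append(str(script))
    return cmd
```

You would call `zip_package` on the sparkutils folder, pass the resulting zip to `build_submit_command` together with the BigQuery connector jar, and run the printed command from the terminal. If the script does a plain `import sparkstuff` instead, passing the single sparkstuff.py file to `--py-files` should be enough.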

On Fri, Jan 8, 2021 at 11:15 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Riccardo
>
> This is the env variables at runtime
>
> PYTHONUNBUFFERED=1;PYTHONPATH=C:\Users\admin\PycharmProjects\packages\;C:\Users\admin\PycharmProjects\pythonProject2\DS\;C:\Users\admin\PycharmProjects\pythonProject2\DS\conf\;C:\Users\admin\PycharmProjects\pythonProject2\DS\lib\;C:\Users\admin\PycharmProjects\pythonProject2\DS\src
>
> This is the configuration set up for analyze_house_prices_GCP
>
> [image: image.png]
>
> So like in Linux, I created a windows env variable and on PyCharm
> terminal, I can see it
>
> (venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>echo %PYTHONPATH%
>
> PYTHONPATH=C:\Users\admin\PycharmProjects\packages\;C:\Users\admin\PycharmProjects\pythonProject2\DS\;C:\Users\admin\PycharmProjects\pythonProject2\DS\conf\;C:\Users\admin\PycharmProjects\pythonProject2\DS\lib\;C:\Users\admin\PycharmProjects\pythonProject2\DS\src
>
> It picks up sparkstuff.py
>
> (venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>where sparkstuff.py
>
> C:\Users\admin\PycharmProjects\packages\sparkutils\sparkstuff.py
>
> But in spark-submit within the code it does not
>
> (venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>spark-submit --jars ..\spark-bigquery-with-dependencies_2.12-0.18.0.jar analyze_house_prices_GCP.py
> Traceback (most recent call last):
>   File "C:/Users/admin/PycharmProjects/pythonProject2/DS/src/analyze_house_prices_GCP.py", line 8, in <module>
>     import sparkstuff as s
> ModuleNotFoundError: No module named 'sparkutils'
>
> thanks
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Fri, 8 Jan 2021 at 16:38, Riccardo Ferrari <ferra...@gmail.com> wrote:
>
>> I think spark checks the python path env variable. Need to provide that.
>> Of course that works in local mode only
>>
>> On Fri, Jan 8, 2021, 5:28 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> I don't see anywhere that you provide 'sparkstuff'? how would the Spark
>>> app have this code otherwise?
>>>
>>> On Fri, Jan 8, 2021 at 10:20 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks Riccardo.
>>>>
>>>> I am well aware of the submission form
>>>>
>>>> However, my question relates to doing submission within PyCharm itself.
>>>>
>>>> This is what I do at Pycharm *terminal* to invoke the module python
>>>>
>>>> spark-submit --jars ..\lib\spark-bigquery-with-dependencies_2.12-0.18.0.jar \
>>>> --packages com.github.samelamin:spark-bigquery_2.11:0.2.6
>>>> analyze_house_prices_GCP.py
>>>>
>>>> However, at terminal run it does not pickup import dependencies in the
>>>> code!
>>>>
>>>> Traceback (most recent call last):
>>>>   File "C:/Users/admin/PycharmProjects/pythonProject2/DS/src/analyze_house_prices_GCP.py", line 8, in <module>
>>>>     import sparkstuff as s
>>>> ModuleNotFoundError: No module named 'sparkstuff'
>>>>
>>>> The python code is attached, pretty simple
>>>>
>>>> Thanks