Hello everyone,

The code in my project is organized into packages and modules, but I keep getting the error "ImportError: No module named <package.module>" when I run Spark on YARN.
My directory structure is something like this:

    project/
        package/
            module.py
            __init__.py
        bin/
        docs/
        setup.py
        main_script.py
        requirements.txt
        tests/
            package/
                module_test.py
                __init__.py
            __init__.py

When I pass `main_script.py` to spark-submit with master set to "yarn-client", the packages aren't found and I get the error above.

With a code structure like this, adding every file as a pyfile to the Spark context seems counterintuitive. I just want to organize my code as well as possible to keep it readable and maintainable. Is there a better way to achieve good code organization without running into such problems?

Best Regards,
Mo
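
For reference, one workaround that is often suggested for this kind of layout is to bundle the whole package into a single zip and ship that, rather than adding files one by one. The following is only a minimal sketch under assumptions: `zip_package` is a hypothetical helper name, the archive name `package.zip` is made up for illustration, and the sketch only handles `.py` files.

```python
import os
import zipfile


def zip_package(package_dir, archive_path):
    """Bundle one package directory into a zip archive.

    The package name is kept as the top-level folder inside the archive,
    so `import package.module` can resolve once the zip is on sys.path.
    `zip_package` is a hypothetical helper, not part of Spark.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                # Assumption: only Python sources need to be shipped.
                if name.endswith(".py"):
                    full_path = os.path.join(root, name)
                    # Store paths relative to the package's parent directory,
                    # e.g. "package/module.py".
                    zf.write(full_path, os.path.relpath(full_path, parent))
    return archive_path
```

Run from the `project/` directory, `zip_package("package", "package.zip")` would produce the archive, which could then be passed along with `spark-submit --py-files package.zip main_script.py` or registered from the driver with `sc.addPyFile("package.zip")`. Whether this resolves the ImportError on a given cluster is not something the sketch can guarantee.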