[ https://issues.apache.org/jira/browse/FLINK-30803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682412#comment-17682412 ]
Dian Fu commented on FLINK-30803: --------------------------------- [~nuafonso] Thanks for the confirmation. You could work around it using the above approach. I will figure out if there is a better way to handle this. > PyFlink mishandles script dependencies > -------------------------------------- > > Key: FLINK-30803 > URL: https://issues.apache.org/jira/browse/FLINK-30803 > Project: Flink > Issue Type: Bug > Components: API / Python > Affects Versions: 1.16.0, 1.15.2, 1.15.3 > Reporter: Nuno Afonso > Priority: Major > Attachments: word_count_split.zip > > > h2. Summary > Since Flink 1.15, PyFlink is unable to run scripts that import scripts under > other directories. For instance, if _main.py_ imports > {_}job/word_count.py{_}, PyFlink will fail due to not finding the _job_ > directory. > The issue seems to have started after a [refactoring of > _PythonDriver_|https://github.com/apache/flink/commit/330aae0c6e0811f50888d17830f10f7a29efe7d7] > to address FLINK-26847. The path to the Python script is removed, which > forces PyFlink to use the copy in its temporary directory. When files are > copied to this directory, the original directory structure is not maintained > and ends up breaking the imports. > h2. Testing > To confirm the regression, I ran the attached application in both Flink > 1.14.6 and 1.15.3 clusters. > h3. Flink 1.14.6 > Application was able to start after being submitted via CLI: > > {code:java} > % ./bin/flink run --python ~/sandbox/word_count_split/main.py > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by > org.apache.flink.api.java.ClosureCleaner > (file:/.../flink-1.14.6/lib/flink-dist_2.12-1.14.6.jar) to field > java.lang.String.value > WARNING: Please consider reporting this to the maintainers of > org.apache.flink.api.java.ClosureCleaner > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > Job has been submitted with JobID 6f7be21072384ca3a314af10860c4ba8 {code} > > h3. Flink 1.15.3 > Application did not start due to not finding the _job_ directory: > > {code:java} > % ./bin/flink run --python ~/sandbox/word_count_split/main.py > Traceback (most recent call last): > File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main > "__main__", mod_spec) > File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code > exec(code, run_globals) > File > "/tmp/pyflink/40c649c3-24af-46ef-ae27-e0019cb55769/3673dd18-adff-40e0-bb11-06a3f00ba29c/main.py", > line 5, in <module> > from job.word_count import word_count > ModuleNotFoundError: No module named 'job' > org.apache.flink.client.program.ProgramAbortException: > java.lang.RuntimeException: Python process exits with code: 1 > at > org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:140) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) > at > org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:841) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:240) > at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1085) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1163) > at > org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1163) > Caused by: java.lang.RuntimeException: Python process exits with code: 1 > at > org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:130) > ... 13 more {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)