Hi all, Currently, when running a pipeline that has the options runner=PortableRunner and job_endpoint unset, the Python SDK spins up a Dockerized Flink job server [1]. This is problematic because the PortableRunner can be used by any portable runner. So for example, a Spark runner user was recently baffled when their job ran successfully but printed a bunch of Flink log messages.
There are not too many uses of this default behavior to my knowledge, at least within Beam itself. The only example I could find was in the portableWordCount tests, which is mostly the same as portableWordCountFlinkRunner tests [2]. The default behavior is entirely superseded by the FlinkRunner class, which provides better encapsulation. I also noticed that DockerizedJobServer is only used by [3]. In FlinkRunner, we pull the job server from Maven if necessary and call Java directly. In general, I think there are already quite enough knobs in the portability framework, so we should remove it unless there is reason to prefer running the job server with Docker instead of calling Java directly. There are a couple options: A) Remove the default behavior and require job_endpoint to always be set when using PortableRunner. This would be a breaking change. B) Keep the current behavior, but warn when the user sets runner=PortableRunner without job_endpoint. This is easy to miss, but it's better than nothing. What do you think? [1] https://github.com/apache/beam/blob/33c73739cec8bc6a7c8319efa41eda7a2540bce1/sdks/python/apache_beam/runners/portability/job_server.py#L184 [2] https://github.com/apache/beam/blob/b3596b89dbc002c686bdaa7853074e757a81b6fb/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1983-L2048 [3] https://github.com/apache/beam/blob/33c73739cec8bc6a7c8319efa41eda7a2540bce1/sdks/python/apache_beam/runners/portability/job_server.py#L163
