Hi,

I am working on a local Beam Flink runner setup for faster development. I have built a Docker image that contains the required Flink and Beam dependencies, and I launch three containers via docker-compose: the Flink job manager, the task manager, and the Beam job server. I am using bridge networking (because Docker does not support the "host" network mode on Mac) and have exposed all the related ports to localhost.

The test pipeline is written in Python and runs on the portable runner, with `--environment_type` set to `LOOPBACK` so that it runs my local Python code. (Our pipeline is written in Python, but we need cross-language transforms to read data from Kafka.)

Here is my understanding of what happens:

1. I start my short Python script with the following args:

   ```
   '--streaming',
   '--runner=portableRunner',
   '--environment_type=LOOPBACK',
   '--job_endpoint=localhost:8099',
   '--artifact_endpoint=localhost:8098',
   '--defaultEnvironmentType=EXTERNAL',
   '--defaultEnvironmentConfig=host.docker.internal:5000',
   ```

2. The job launches the Beam Java expansion service in process mode, because I am using this function:

   ```
   ReadFromKafka(
       consumer_config={"bootstrap.servers": "kafka:9092", 'auto.offset.reset': 'earliest'},
       topics=["test.topic"],
       with_metadata=False,
       expansion_service=default_io_expansion_service(
           append_args=[
               '--defaultEnvironmentType=PROCESS',
               "--defaultEnvironmentConfig={\"command\":\"/opt/apache/beam/java_boot\"}",
               '--experiments=use_deprecated_read',
           ]
       )
   )
   ```

3. The job is then submitted to the Beam job server.
4. The job server submits the actual job to the Flink job manager.
5. The Flink job manager distributes the work to the task manager.
6. The task manager launches a Java worker.
7. Once the Java worker is done, it returns the processed content to the original Python process (because we are running in LOOPBACK).

However, the very last step fails: it looks like LOOPBACK opens a random port on localhost, and I have no idea how to make the Java worker talk to the host on that random port.

I know the problem could easily be fixed by setting network_mode to host. However, we use Macs for development, and host networking is not supported by Docker on Mac. Has anyone tried the same thing before, and is there a suggested workaround for Mac users?

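To illustrate the symptom I am describing, here is a minimal standard-library sketch. It is not Beam's actual code. It assumes the LOOPBACK worker behaves like any server bound to port 0, where the operating system picks a free port at startup. The helper names `bind_ephemeral_port` and `address_seen_from_container` are hypothetical and exist only for this illustration; `host.docker.internal` is the same hostname I pass in the args above.

```python
import socket


def bind_ephemeral_port(host="127.0.0.1"):
    """Bind a TCP socket to port 0 so the OS assigns a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "any free port"
    sock.listen(1)
    return sock, sock.getsockname()[1]


def address_seen_from_container(port, docker_host="host.docker.internal"):
    """Address a bridge-mode container would have to dial to reach the host."""
    return f"{docker_host}:{port}"


if __name__ == "__main__":
    sock, port = bind_ephemeral_port()
    try:
        # The port differs between runs, so it cannot be published ahead of time.
        print("host-side address:     ", f"localhost:{port}")
        print("container-side address:", address_seen_from_container(port))
    finally:
        sock.close()
```

Because the port changes on every run, I cannot list it in docker-compose in advance, and "localhost" inside the task manager container refers to the container itself rather than to my Mac.
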
I also have my script and infra in this gist [1]; hopefully that makes it easier to understand. Thanks!

[1] https://gist.github.com/lydian/0db7614652c2ccdc733884134bf67f9b

Sincerely,
Lydian Lee