That is strange. I did nothing special but cloned your repo and then:

1. Ran `docker-compose -f docker-compose.yaml up`.
2. Ran a simple t.py test both ways, and it works well.

t.py:

    import apache_beam as beam

    with beam.Pipeline() as p:
        _ = (
            p
            | beam.Create(
                [(0, "ttt"), (0, "ttt1"), (0, "ttt2"),
                 (1, "xxx"), (1, "xxx2"), (2, "yyy")]
            )
            | beam.Map(print)
        )

    python t.py \
      --topic test --group test-group --bootstrap-server host.docker.internal:9092 \
      --job_endpoint host.docker.internal:8099 \
      --artifact_endpoint host.docker.internal:8098 \
      --environment_type=EXTERNAL \
      --environment_config=host.docker.internal:50000

    python t.py \
      --topic test --group test-group --bootstrap-server localhost:9092 \
      --job_endpoint localhost:8099 \
      --artifact_endpoint localhost:8098 \
      --environment_type=EXTERNAL \
      --environment_config=localhost:50000

outputs:

    (0, 'ttt')
    (0, 'ttt1')
    (0, 'ttt2')
    (1, 'xxx')
    (1, 'xxx2')
    (2, 'yyy')

On Sun, Mar 31, 2024 at 7:15 PM Lydian Lee <tingyenlee...@gmail.com> wrote:

> Hi XQ,
>
> Sorry to bother you again, but I've tested the same thing again in a
> Linux env, and it is still not working, showing the same error in the
> Python worker harness. (Note that this doesn't fail immediately; it fails
> after the task is assigned to the task manager and the Python worker
> harness starts to work.)
>
> Wondering if you can share what you've changed (maybe a PR) so that I can
> test again on my Linux machine. Thanks so much for your help. Someone
> else has also pinged me about the same error when testing, and I do want
> to make this work for everyone. Thanks!
>
> On Mon, Mar 18, 2024 at 6:24 PM XQ Hu via user <user@beam.apache.org> wrote:
>
>> I did not do anything special but ran `docker-compose -f
>> docker-compose.yaml up` from your repo.
>>
>> On Sun, Mar 17, 2024 at 11:38 PM Lydian Lee <tingyenlee...@gmail.com> wrote:
>>
>>> Hi XQ,
>>>
>>> The code is simplified from my previous work and thus still uses the
>>> old version. But I've tested with Beam 2.54.0 and the code still works
>>> (I mean using my company's image.)
>>> If this is running well on your Linux machine, I guess it could be
>>> something related to how I build the Docker image. Curious if you could
>>> share the image you built to docker.io so that I can confirm whether
>>> the problem is related only to the image, thanks.
>>>
>>> The goal for this repo is to complete my previous talk:
>>> https://www.youtube.com/watch?v=XUz90LpGAgc&ab_channel=ApacheBeam
>>>
>>> On Sun, Mar 17, 2024 at 8:07 AM XQ Hu via user <user@beam.apache.org> wrote:
>>>
>>>> I cloned your repo on my Linux machine, which is super useful to run.
>>>> Not sure why you use Beam 2.41, but anyway, I tried this on my Linux
>>>> machine:
>>>>
>>>> python t.py \
>>>>   --topic test --group test-group --bootstrap-server localhost:9092 \
>>>>   --job_endpoint localhost:8099 \
>>>>   --artifact_endpoint localhost:8098 \
>>>>   --environment_type=EXTERNAL \
>>>>   --environment_config=localhost:50000
>>>>
>>>> Note I replaced host.docker.internal with localhost and it runs well.
>>>>
>>>> I then tried host.docker.internal and it also runs well.
>>>>
>>>> Maybe this is related to your Mac settings?
>>>>
>>>> On Sun, Mar 17, 2024 at 8:34 AM Lydian Lee <tingyenlee...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Just FYI, a similar setup works with a different image, the one I
>>>>> built using my company's image as the base image. I've only replaced
>>>>> the base image with Ubuntu. But given that the error log is completely
>>>>> unhelpful, it's really hard for me to continue debugging the issue.
>>>>>
>>>>> Docker is not required in my base image, as I've already added extra
>>>>> args to ReadFromKafka with the default environment set to PROCESS.
>>>>> This is proven to work with my company's Docker image. As for
>>>>> host.docker.internal, which is supported by Docker for Mac, the only
>>>>> thing I need to do is configure /etc/hosts so that I can submit the
>>>>> job directly from the laptop rather than from the Flink master.
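[Editor's note] The "extra args to ReadFromKafka with the default environment set to PROCESS" mentioned above could be sketched roughly as below. This is a sketch under assumptions, not code from the repo: it assumes `apache_beam.io.kafka.default_io_expansion_service` with its `append_args` parameter (present in recent Beam releases), that the expansion service accepts `--defaultEnvironmentType` / `--defaultEnvironmentConfig`, and that the boot-script path is whatever your image actually uses. Verify against your Beam version.

```python
import json


def process_env_args(boot_command):
    """Expansion-service args asking it to run the transform's SDK harness
    as a local process instead of launching a Docker container.

    boot_command is a placeholder: point it at the SDK boot script
    baked into your image."""
    return [
        "--defaultEnvironmentType=PROCESS",
        "--defaultEnvironmentConfig=" + json.dumps({"command": boot_command}),
    ]


def kafka_source(topic, bootstrap_servers, boot_command):
    """Wire the args into ReadFromKafka (imported lazily so the sketch can
    be read without apache_beam installed)."""
    from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service

    return ReadFromKafka(
        consumer_config={"bootstrap.servers": bootstrap_servers},
        topics=[topic],
        expansion_service=default_io_expansion_service(
            append_args=process_env_args(boot_command)
        ),
    )
```

With this, the cross-language Kafka transform should not need docker-in-docker on the task manager, which matches what is described above.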
>>>>>
>>>>> On Sun, Mar 17, 2024 at 2:40 AM Jaehyeon Kim <dott...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> The pipeline runs on the host, while host.docker.internal would only
>>>>>> be resolved in containers that run with the host network mode. I
>>>>>> guess the pipeline wouldn't be able to resolve host.docker.internal
>>>>>> and fails to run.
>>>>>>
>>>>>> If everything before ReadFromKafka works successfully, a Docker
>>>>>> container will be launched with the host network mode so that
>>>>>> host.docker.internal:9092 can be resolved inside the container. As
>>>>>> far as I've checked, however, it fails when I start a Flink cluster
>>>>>> on Docker, and I had to rely on a local Flink cluster. If you'd like
>>>>>> to try to use Docker, you should have Docker installed in your custom
>>>>>> Docker image and volume-map /var/run/docker.sock to the Flink task
>>>>>> manager. Otherwise, it won't be able to launch a Docker container for
>>>>>> reading Kafka messages.
>>>>>>
>>>>>> Cheers,
>>>>>> Jaehyeon
>>>>>>
>>>>>> On Sun, 17 Mar 2024 at 18:21, Lydian Lee <tingyenlee...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have an issue when setting up a POC of the Python SDK with the
>>>>>>> Flink runner running in docker-compose. The Python worker harness
>>>>>>> returned no error other than:
>>>>>>> ```
>>>>>>> python-worker-harness-1 | 2024/03/17 07:10:17 Executing: python -m apache_beam.runners.worker.sdk_worker_main
>>>>>>> python-worker-harness-1 | 2024/03/17 07:10:24 Python exited: <nil>
>>>>>>> ```
>>>>>>> and then died. The error message seems totally unhelpful, and I am
>>>>>>> wondering if there's a way to make the harness script show more
>>>>>>> debug logging.
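[Editor's note] On the "more debug logging" question just above: recent Beam Python releases expose worker log-level pipeline options (I believe `--default_sdk_harness_log_level`, defined on `apache_beam.options.pipeline_options.WorkerOptions`; the flag name is my assumption, so please verify it exists in your Beam version). A minimal sketch of tacking it onto the submit command:

```python
# Base submit command from the original post (unchanged).
submit_cmd = [
    "python", "docker/src/example.py",
    "--topic", "test", "--group", "test-group",
    "--bootstrap-server", "host.docker.internal:9092",
    "--job_endpoint", "host.docker.internal:8099",
    "--artifact_endpoint", "host.docker.internal:8098",
    "--environment_type=EXTERNAL",
    "--environment_config=host.docker.internal:50000",
]

# Assumption: this flag is provided by WorkerOptions and raises the log
# level used inside the SDK worker harness process.
debug_flags = ["--default_sdk_harness_log_level=DEBUG"]

full_cmd = submit_cmd + debug_flags
print(" ".join(full_cmd))
```

If the flag is accepted, the harness should emit DEBUG-level records before exiting, which may reveal why `sdk_worker_main` dies silently.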
>>>>>>>
>>>>>>> I started my harness via:
>>>>>>> ```
>>>>>>> /opt/apache/beam/boot --worker_pool
>>>>>>> ```
>>>>>>> and configured my script to use the harness:
>>>>>>> ```
>>>>>>> python docker/src/example.py \
>>>>>>>   --topic test --group test-group --bootstrap-server host.docker.internal:9092 \
>>>>>>>   --job_endpoint host.docker.internal:8099 \
>>>>>>>   --artifact_endpoint host.docker.internal:8098 \
>>>>>>>   --environment_type=EXTERNAL \
>>>>>>>   --environment_config=host.docker.internal:50000
>>>>>>> ```
>>>>>>> The full settings are available at:
>>>>>>> https://github.com/lydian/beam-python-flink-runner-examples
>>>>>>> Thanks for your help!
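[Editor's note] For completeness, Jaehyeon's suggestion earlier in the thread (Docker CLI installed in the task manager image, with the host's Docker socket volume-mapped in) might look roughly like this in docker-compose. This is a hypothetical fragment: the service name and image are placeholders, not taken from the linked repo.

```yaml
# Hypothetical docker-compose fragment; names and image are illustrative.
services:
  taskmanager:
    image: my-flink-with-docker-cli:latest   # assumed custom image with the docker CLI
    volumes:
      # Share the host's Docker daemon so the task manager can launch
      # the SDK harness container for the cross-language Kafka transform.
      - /var/run/docker.sock:/var/run/docker.sock
```

Without this mapping (or the PROCESS-environment approach discussed above), the task manager has no way to start the Docker-based SDK harness for reading Kafka messages.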