That is strange. I did nothing special: I cloned your repo and then
1. ran `docker-compose -f docker-compose.yaml up`
2. ran a simple t.py test both ways (commands below), and it works well
t.py:

import apache_beam as beam

with beam.Pipeline() as p:
    _ = (
        p
        | beam.Create(
            [(0, "ttt"), (0, "ttt1"), (0, "ttt2"),
             (1, "xxx"), (1, "xxx2"), (2, "yyy")]
        )
        | beam.Map(print)
    )
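For context, t.py only exercises beam.Create; the repo's real example reads from Kafka. As a hedged sketch of the PROCESS-environment wiring mentioned elsewhere in this thread (the flag names are my assumptions about Beam's Java Kafka expansion service — please verify against your Beam version), the extra args could be built like this:

```python
# Hedged sketch: these flags are assumptions about the Kafka expansion
# service's CLI; check them against your Beam version before relying on them.
# They ask the expansion service to run the Java SDK harness as a local
# process instead of launching a Docker container.
append_args = [
    "--defaultEnvironmentType=PROCESS",
    '--defaultEnvironmentConfig={"command": "/opt/apache/beam/boot"}',
]
print(append_args[0])

# They would be wired into ReadFromKafka roughly like this
# (requires apache_beam[kafka]; shown as comments, not executed here):
#
# from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service
#
# messages = p | ReadFromKafka(
#     consumer_config={"bootstrap.servers": "host.docker.internal:9092",
#                      "group.id": "test-group"},
#     topics=["test"],
#     expansion_service=default_io_expansion_service(append_args=append_args),
# )
```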

python t.py \
  --topic test --group test-group \
  --bootstrap-server host.docker.internal:9092 \
  --job_endpoint host.docker.internal:8099 \
  --artifact_endpoint host.docker.internal:8098 \
  --environment_type=EXTERNAL \
  --environment_config=host.docker.internal:50000

python t.py \
  --topic test --group test-group --bootstrap-server localhost:9092 \
  --job_endpoint localhost:8099 \
  --artifact_endpoint localhost:8098 \
  --environment_type=EXTERNAL \
  --environment_config=localhost:50000
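For reference, on Linux host.docker.internal is not defined by default. A common way to make it resolvable inside containers (an assumption about the compose setup, not something taken from the repo; the service name is hypothetical) is Docker's host-gateway alias:

```yaml
# docker-compose.yaml fragment (hypothetical service name)
services:
  taskmanager:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```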

outputs:

(0, 'ttt')
(0, 'ttt1')
(0, 'ttt2')
(1, 'xxx')
(1, 'xxx2')
(2, 'yyy')
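For anyone hitting resolution issues, a quick stdlib-only check (plain Python, no Beam needed) can confirm whether a given hostname resolves from wherever the pipeline is submitted:

```python
import socket

def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to an IP address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False

# localhost should always resolve; host.docker.internal typically only
# resolves on Docker Desktop (Mac/Windows) or where /etc/hosts maps it.
print(resolves("localhost"))
```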

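On the debug-logging question from earlier in the thread: if I remember the option name right (this is an assumption from Beam's Python worker options, not something I retested here), recent Beam Python versions accept a flag to raise the SDK harness log level, e.g.:

```shell
python t.py \
  --job_endpoint localhost:8099 \
  --artifact_endpoint localhost:8098 \
  --environment_type=EXTERNAL \
  --environment_config=localhost:50000 \
  --default_sdk_harness_log_level=DEBUG
```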

On Sun, Mar 31, 2024 at 7:15 PM Lydian Lee <tingyenlee...@gmail.com> wrote:

> Hi XQ,
>
> Sorry to bother you again, but I've tested the same thing again in a Linux
> env, and it is still not working, showing the same error in the Python
> worker harness. (Note that it doesn't fail immediately; it fails after the
> task is assigned to the task manager and the Python worker harness starts
> working.)
>
> Wondering if you can share what you've changed (maybe as a PR) so that I
> can test again on my Linux machine. Thanks so much for your help. Someone
> else has also pinged me about the same error when testing, and I do want
> to make this work for everyone. Thanks!
>
>
>
> On Mon, Mar 18, 2024 at 6:24 PM XQ Hu via user <user@beam.apache.org>
> wrote:
>
>> I did not do anything special but ran `docker-compose -f
>> docker-compose.yaml up` from your repo.
>>
>> On Sun, Mar 17, 2024 at 11:38 PM Lydian Lee <tingyenlee...@gmail.com>
>> wrote:
>>
>>> Hi XQ,
>>>
>>> The code is simplified from my previous work and thus still uses the
>>> old version. But I've tested with Beam 2.54.0 and the code still works
>>> (using my company's image). If this runs well on your Linux machine, I
>>> guess there could be something related to how I build the Docker image.
>>> Curious if you could share the image you built to docker.io so that I
>>> can confirm whether the problem is related only to the image, thanks.
>>>
>>> The goal of this repo is to complement my previous talk:
>>> https://www.youtube.com/watch?v=XUz90LpGAgc&ab_channel=ApacheBeam
>>>
>>> On Sun, Mar 17, 2024 at 8:07 AM XQ Hu via user <user@beam.apache.org>
>>> wrote:
>>>
>>>> I cloned your repo, which is super useful for reproducing this, on my
>>>> Linux machine. Not sure why you use Beam 2.41, but anyway, I tried this:
>>>>
>>>> python t.py \
>>>>   --topic test --group test-group --bootstrap-server localhost:9092 \
>>>>   --job_endpoint localhost:8099 \
>>>>   --artifact_endpoint localhost:8098 \
>>>>   --environment_type=EXTERNAL \
>>>>   --environment_config=localhost:50000
>>>>
>>>> Note I replaced host.docker.internal with localhost and it runs well.
>>>>
>>>> I then tried host.docker.internal, and it also runs well.
>>>>
>>>> Maybe this is related to your Mac settings?
>>>>
>>>> On Sun, Mar 17, 2024 at 8:34 AM Lydian Lee <tingyenlee...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Just FYI, a similar thing works with a different image: the one I
>>>>> built using my company's image as the base image. I've only replaced
>>>>> the base image with Ubuntu. But given that the error log is completely
>>>>> unhelpful, it's really hard for me to continue debugging the issue.
>>>>>
>>>>> Docker is not required in my base image, as I've already added extra
>>>>> args to ReadFromKafka with the default environment set to PROCESS.
>>>>> This is proven to work with my company's Docker image. As for
>>>>> host.docker.internal, which is also supported by Docker for Mac: the
>>>>> only thing I need to do is configure /etc/hosts so that I can submit
>>>>> the job directly from the laptop and not the Flink master.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Mar 17, 2024 at 2:40 AM Jaehyeon Kim <dott...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> The pipeline runs on the host, while host.docker.internal would only
>>>>>> be resolved in containers that run with the host network mode. I
>>>>>> guess host.docker.internal wouldn't be resolvable from the pipeline,
>>>>>> and it fails to run.
>>>>>>
>>>>>> If everything before ReadFromKafka works successfully, a Docker
>>>>>> container will be launched with the host network mode so that
>>>>>> host.docker.internal:9092 can be resolved inside the container. As
>>>>>> far as I've checked, however, it fails when I start a Flink cluster
>>>>>> on Docker, and I had to rely on a local Flink cluster. If you'd like
>>>>>> to try to use Docker, you should have Docker installed in your custom
>>>>>> Docker image and volume-map /var/run/docker.sock into the Flink task
>>>>>> manager. Otherwise, it won't be able to launch a Docker container for
>>>>>> reading Kafka messages.
>>>>>>
>>>>>> Cheers,
>>>>>> Jaehyeon
>>>>>>
>>>>>>
>>>>>> On Sun, 17 Mar 2024 at 18:21, Lydian Lee <tingyenlee...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have an issue setting up a POC of the Python SDK with the Flink
>>>>>>> runner in docker-compose. The Python worker harness did not return
>>>>>>> any error but:
>>>>>>> ```
>>>>>>> python-worker-harness-1  | 2024/03/17 07:10:17 Executing: python -m
>>>>>>> apache_beam.runners.worker.sdk_worker_main
>>>>>>> python-worker-harness-1  | 2024/03/17 07:10:24 Python exited: <nil>
>>>>>>> ```
>>>>>>> and then died. The error message is not useful at all, and I am
>>>>>>> wondering if there's a way to make the harness script show more
>>>>>>> debug logging.
>>>>>>>
>>>>>>> I started my harness via:
>>>>>>> ```
>>>>>>> /opt/apache/beam/boot --worker_pool
>>>>>>> ```
>>>>>>> and configured my script to use the harness:
>>>>>>> ```
>>>>>>> python docker/src/example.py \
>>>>>>>   --topic test --group test-group \
>>>>>>>   --bootstrap-server host.docker.internal:9092 \
>>>>>>>   --job_endpoint host.docker.internal:8099 \
>>>>>>>   --artifact_endpoint host.docker.internal:8098 \
>>>>>>>   --environment_type=EXTERNAL \
>>>>>>>   --environment_config=host.docker.internal:50000
>>>>>>> ```
>>>>>>> The full setup is available at:
>>>>>>> https://github.com/lydian/beam-python-flink-runner-examples
>>>>>>> Thanks for your help!
>>>>>>>
>>>>>>>
