Re: Error with Beam/Flink Python pipeline with Kafka

2021-05-24 Thread Kyle Weaver
The artifact staging directory (configurable via the "--artifacts_dir"
pipeline option) needs to be accessible to workers, which in this case are
containerized. There are a couple workarounds:

1. Don't stage files through the file system at all, i.e. use
--runner=FlinkRunner in Python instead of --runner=PortableRunner
(instructions on the webpage [1]).
2. Point --artifacts_dir to a distributed file system (GCS, S3, etc.).

I also filed [3] because this happens a lot and our logs aren't very
helpful.

[1] https://beam.apache.org/documentation/runners/flink/
[2] https://issues.apache.org/jira/browse/BEAM-5440
[3] https://issues.apache.org/jira/browse/BEAM-12392

On Mon, May 24, 2021 at 9:57 AM Kenneth Knowles  wrote:

> Thanks for the details in the gist. I would guess the key to the error is
> this:
>
> 2021/05/24 13:45:34 Initializing java harness: /opt/apache/beam/boot
> --id=1-2 --provision_endpoint=localhost:33957
> 2021/05/24 13:45:44 Failed to retrieve staged files: failed to
> retrieve /tmp/staged in 3 attempts: failed to retrieve chunk for
> /tmp/staged/beam-sdks-java-io-expansion-service-2.28.0-oErG3M6t3yoDeSvfFRW5UCvvM33J-4yNp7xiK9glKtI.jar
> caused by:
> rpc error: code = Unknown desc = ;
> ...
>
> There is something wrong with the artifact provisioning endpoint perhaps?
> I can't help any further than this.
>
> Kenn
>
> On Mon, May 24, 2021 at 6:58 AM Nir Gazit  wrote:
>
>> Hey,
>> I'm having issues with running a simple pipeline on a remote Flink
>> cluster. I use a separate Beam job server to which I submit the job, which
>> then goes to the Flink cluster with docker-in-docker enabled. For some
>> reason, I'm getting errors in Flink which I can't decipher (see gist
>> ).
>>
>> Can someone assist me?
>>
>> Thanks!
>> Nir
>>
>


Re: Error with Beam/Flink Python pipeline with Kafka

2021-05-24 Thread Kenneth Knowles
Thanks for the details in the gist. I would guess the key to the error is
this:

2021/05/24 13:45:34 Initializing java harness: /opt/apache/beam/boot
--id=1-2 --provision_endpoint=localhost:33957
2021/05/24 13:45:44 Failed to retrieve staged files: failed to retrieve
/tmp/staged in 3 attempts: failed to retrieve chunk for
/tmp/staged/beam-sdks-java-io-expansion-service-2.28.0-oErG3M6t3yoDeSvfFRW5UCvvM33J-4yNp7xiK9glKtI.jar
caused by:
rpc error: code = Unknown desc = ;
...

There is something wrong with the artifact provisioning endpoint perhaps? I
can't help any further than this.

Kenn

On Mon, May 24, 2021 at 6:58 AM Nir Gazit  wrote:

> Hey,
> I'm having issues with running a simple pipeline on a remote Flink
> cluster. I use a separate Beam job server to which I submit the job, which
> then goes to the Flink cluster with docker-in-docker enabled. For some
> reason, I'm getting errors in Flink which I can't decipher (see gist
> ).
>
> Can someone assist me?
>
> Thanks!
> Nir
>