Thanks! I actually managed to have my own deployment for running it locally
and it works well for local file, but I get this weird error while trying
to run the word count example and I’m trying to understand what can be the
cause of it?

If you’re referring to this error:

2021/01/14 14:11:45 Failed to retrieve staged files: failed to retrieve
/tmp/staged in 3 attempts: failed to retrieve chunk for
/tmp/staged/pickled_main_session

Then that is likely because the flink taskmanager cannot find the staged
artifacts on disk. If you’re using the FlinkRunner you use the flag
--flink_submit_uber_jar which will stuff all the artifacts into a jar file
you submit. If you’re using the PortableRunner, then you need to share the
staging volume between your artifact service (typically running wherever
your job service is) and the flink taskworker like I have configured here
<https://github.com/sambvfx/beam-flink-k8s/tree/master/k8s/with_job_server>.

BTW - is this something we may want to add to the official repo? (I can do
it ofc). It took me a while to find how to set up a local deployment for
Flink and the docs weren’t super detailed.

More documentation is certainly helpful. It took me quite some time to
figure out what I have now so I would +1 any effort to save the next person

On Thu, Jan 14, 2021 at 11:42 AM Nir Gazit <[email protected]> wrote:

> BTW - is this something we may want to add to the official repo? (I can do
> it ofc). It took me a while to find how to set up a local deployment for
> Flink and the docs weren't super detailed.
>
> On Thu, Jan 14, 2021 at 9:39 PM Nir Gazit <[email protected]> wrote:
>
>> Thanks! I actually managed to have my own deployment for running it
>> locally and it works well for local file, but I get this weird error while
>> trying to run the word count example and I'm trying to understand what can
>> be the cause of it?
>>
>> On Thu, Jan 14, 2021 at 7:29 PM Sam Bourne <[email protected]> wrote:
>>
>>> Hi Nir,
>>>
>>> I have a simple repo where I have a proof of concept deployment setup
>>> for doing this.
>>>
>>> https://github.com/sambvfx/beam-flink-k8s
>>>
>>> Depending on the type of runner you're using there are a few
>>> explanations. That repo should hopefully point you in the right direction.
>>>
>>> On Thu, Jan 14, 2021 at 6:16 AM Nir Gazit <[email protected]> wrote:
>>>
>>>> Hi,
>>>> I'm trying to deploy the word count job on a Flink cluster in
>>>> Kubernetes. However, when trying to run the job (with python workers as a
>>>> side car to the Flink task masters), I get the following error:
>>>>
>>>> 2021/01/14 14:11:36 Initializing python harness: /opt/apache/beam/boot
>>>> --id=1-1 --logging_endpoint=localhost:39233
>>>> --artifact_endpoint=localhost:34123 --provision_endpoint=localhost:33519
>>>> --control_endpoint=localhost:44987
>>>> 2021/01/14 14:11:45 Failed to retrieve staged files: failed to retrieve
>>>> /tmp/staged in 3 attempts: failed to retrieve chunk for
>>>> /tmp/staged/pickled_main_session
>>>> caused by:
>>>> rpc error: code = Unknown desc = ; failed to retrieve chunk for
>>>> /tmp/staged/pickled_main_session
>>>> caused by:
>>>> rpc error: code = Unknown desc = ; failed to retrieve chunk for
>>>> /tmp/staged/pickled_main_session
>>>> caused by:
>>>> rpc error: code = Unknown desc = ; failed to retrieve chunk for
>>>> /tmp/staged/pickled_main_session
>>>> caused by:
>>>> rpc error: code = Unknown desc =
>>>>
>>>> Anyone knows what could be the reason?
>>>>
>>>

Reply via email to