[ 
https://issues.apache.org/jira/browse/BEAM-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17548809#comment-17548809
 ] 

Danny McCormick commented on BEAM-10793:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/20424

> Incorrect Flink runner documentation
> ------------------------------------
>
>                 Key: BEAM-10793
>                 URL: https://issues.apache.org/jira/browse/BEAM-10793
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink, sdk-go
>            Reporter: Kevin Sijo Puthusseri
>            Priority: P3
>
> As per the documentation at 
> [https://beam.apache.org/documentation/runners/flink/] under _"Portable 
> (Java/Python/Go)"_, a containerized flink job server needs to be started using
> {code:bash}
> docker run --net=host apache/beam_flink1.10_job_server:latest
> {code}
> or
> {code:bash}
> docker run --net=host apache/beam_flink1.10_job_server:latest 
> --flink-master=localhost:8081
> {code}
>  If any of the SDKs are run using the DOCKER environment type, they all crash. 
> As explained by [~danoliveira] – _"This command is building and running it 
> locally on your machine. I'm not 100% sure why running it in a container is 
> causing the error, but my suspicion is that it has to do with writing the 
> manifest/artifact files to disk. One thing the job server does is write 
> artifacts to disk and then send the locations to the SDK harness so it can 
> read them. If the job server is in a container, then it's probably writing the 
> files to the container instead of your local machine, so they're inaccessible 
> to the SDK harness."_ In fact, [~lostluck] traced this to an existing, 
> still-unresolved issue, https://issues.apache.org/jira/browse/BEAM-5273, which 
> describes this exact problem. Following Daniel's advice, the Go SDK (and, I'm 
> certain, the other SDKs as well) can be run in DOCKER mode if the Flink job 
> server is started locally using Gradle as follows –
> {code:bash}
> ./gradlew :runners:flink:1.10:job-server:runShadow -Djob-host=localhost -Dflink-master=local
> {code}
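> If Daniel's diagnosis is right (the job server writes artifacts inside its own 
> container), a natural workaround to try is sharing the artifact staging 
> directory between the container and the host via a volume mount. The sketch 
> below is an unverified assumption on my part – the {{--artifacts-dir}} flag and 
> mount path would need to be checked against the job server's actual options:
> {code:bash}
> # Hedged sketch: mount a host directory as the job server's artifact
> # staging dir so files it writes are visible outside the container.
> # The --artifacts-dir flag and the paths here are assumptions.
> docker run --net=host \
>   -v ~/beam-artifacts:/tmp/beam-artifact-staging \
>   apache/beam_flink1.10_job_server:latest \
>   --artifacts-dir=/tmp/beam-artifact-staging
> {code}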
> Only if the SDK is run using the LOOPBACK flag does it manage to run against a 
> containerized Flink cluster. Moreover, since the LOOPBACK flag is explicitly 
> meant for *local development* purposes only, this makes me wonder how folks 
> are deploying their production Beam data pipelines on Flink (especially on 
> orchestration platforms like Kubernetes). Overall, the main issue (at least 
> until BEAM-5273 is resolved) is that Beam's documentation fails to mention 
> these caveats explicitly.
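> For concreteness, this is roughly how a Go pipeline is pointed at the job 
> server in LOOPBACK mode – the pipeline file name and the {{8099}} port are 
> illustrative assumptions, not taken from the report:
> {code:bash}
> # Hedged sketch: run a Go pipeline against a portable job server,
> # with the SDK harness running in-process on the host (LOOPBACK).
> # wordcount.go and localhost:8099 are example values.
> go run wordcount.go \
>   --runner=universal \
>   --endpoint=localhost:8099 \
>   --environment_type=LOOPBACK
> {code}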



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
