[jira] [Commented] (BEAM-9225) Flink uber jar job server hangs
[ https://issues.apache.org/jira/browse/BEAM-9225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044916#comment-17044916 ] Kyle Weaver commented on BEAM-9225: --- I was waiting for fix to the tests to verify this, but I'm pretty sure it's resolved. > Flink uber jar job server hangs > --- > > Key: BEAM-9225 > URL: https://issues.apache.org/jira/browse/BEAM-9225 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > This was observed on Kubernetes. I suspect this behavior might also be the > reason beam_PostCommit_PortableJar_Flink is timing out. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9225) Flink uber jar job server hangs
[ https://issues.apache.org/jira/browse/BEAM-9225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043970#comment-17043970 ] Rui Wang commented on BEAM-9225: I am going to cut 2.20.0 branch on 02/26/2020. Can this Jira be finished by that date? Will you able to push this back to 2.21.0? > Flink uber jar job server hangs > --- > > Key: BEAM-9225 > URL: https://issues.apache.org/jira/browse/BEAM-9225 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > This was observed on Kubernetes. I suspect this behavior might also be the > reason beam_PostCommit_PortableJar_Flink is timing out. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9225) Flink uber jar job server hangs
[ https://issues.apache.org/jira/browse/BEAM-9225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041248#comment-17041248 ] Kyle Weaver commented on BEAM-9225: --- The Flink uber jar job server has two streams, one for state and one for messages. They share history [1] and don't yield a message if it is a duplicate according to that history [2]. So if the message stream reads the terminal state first, the state history will never yield it, and the job will never register as complete. I avoided this in the Spark implementation by deferring de-duplication to later, so all messages are yielded regardless of whether they are duplicates or not. [1] https://github.com/apache/beam/blob/57ce5b966cfc4f549082a47f50c29d5f9caa2909/sdks/python/apache_beam/runners/portability/abstract_job_service.py#L252 [2] https://github.com/apache/beam/blob/57ce5b966cfc4f549082a47f50c29d5f9caa2909/sdks/python/apache_beam/runners/portability/flink_uber_jar_job_server.py#L205 > Flink uber jar job server hangs > --- > > Key: BEAM-9225 > URL: https://issues.apache.org/jira/browse/BEAM-9225 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Fix For: 2.20.0 > > > This was observed on Kubernetes. I suspect this behavior might also be the > reason beam_PostCommit_PortableJar_Flink is timing out. -- This message was sent by Atlassian Jira (v8.3.4#803005)