[ https://issues.apache.org/jira/browse/BEAM-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108162#comment-16108162 ]
Flavio Fiszman commented on BEAM-2702:
--------------------------------------

Hi Johann,
JIRA issues are primarily used for Beam SDK-related issues. The issue here seems to be specific to the Dataflow Runner, so a better reporting mechanism is to send that information to dataflow-feedb...@google.com .
Thanks!

> Dataflow pipeline stalls after autoscaling
> ------------------------------------------
>
>                 Key: BEAM-2702
>                 URL: https://issues.apache.org/jira/browse/BEAM-2702
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.0.0
>            Reporter: Johann Steinbrecher
>            Assignee: Thomas Groh
>
> A 4 step dataflow pipeline (Pubsubio.Read, windowing, message parsing,
> DatastoreV1.write) stalls as soon as the autoscaling algorithm increases
> the number of workers from 1 to 4.
> *Expected*:
> Throughput (elements/sec) for each pipeline step increases due to more
> workers.
> *Actual*:
> Throughput (elements/sec) goes to 0 for all steps. The number of processed
> elements in the first step equals the number of processed elements in the
> last step. The number of workers stays high.
> Runner: google-cloud-platform managed dataflow runner
> Sample dataflow job id (log level debug):
> 2017-07-27_14_51_37-4624978117098944513
> Log message after autoscaling:
> Rpc to .. completed with error DEADLINE_EXCEEDED (cause or symptom?)
> autoscaling configuration
> --autoscalingAlgorithm=THROUGHPUT_BASED
> --maxNumWorkers=4
> machine types tested:
> - n1-highmem-2
> - n1-standard-1
> zone: us-east1-d
> sdk version:
> org.apache.beam@2.0.0

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)