Sam Whittle created BEAM-9660:
---------------------------------

             Summary: StreamingDataflowWorker has confusing exception on 
commits over 2GB
                 Key: BEAM-9660
                 URL: https://issues.apache.org/jira/browse/BEAM-9660
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.19.0, 2.18.0
            Reporter: Sam Whittle
            Assignee: Sam Whittle


Commits over 2GB have a negative serialized commit size.
When not using streaming engine the max commit limit is 2GB.
https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L450
There appears to be a logging regression introduced by
https://github.com/apache/beam/pull/10013

With the new code, if the serialization overflows the estimated bytes is set to 
Integer.MAX which equals the commit limit for appliance.
Then the comparison here:
https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L1371
which uses > does not trigger and the large commit is just passed on to the 
commit queue, triggering the exception seen in #3 [2] when the weigher uses the 
negative serialized size for the semaphore acquire call. 

So previously where we would have thrown a KeyCommitTooLargeException we are 
throwing the IllegalArgumentException.
>From that exception description: 
>https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L236
          ". This may be caused by grouping a very "
              + "large amount of data in a single window without using Combine,"
              + " or by producing a large amount of data from a single input 
element."

The overflow could be remembered explicitly instead of just comparing with max.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to