[ https://issues.apache.org/jira/browse/BEAM-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Whittle closed BEAM-9660.
-----------------------------
    Fix Version/s: 2.21.0
       Resolution: Fixed

> StreamingDataflowWorker has confusing exception on commits over 2GB
> -------------------------------------------------------------------
>
>                 Key: BEAM-9660
>                 URL: https://issues.apache.org/jira/browse/BEAM-9660
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.18.0, 2.19.0
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: Minor
>             Fix For: 2.21.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Commits over 2GB have a negative serialized commit size.
> When not using Streaming Engine, the max commit limit is 2GB:
> https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L450
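>
> A minimal sketch (not the worker code) of the underlying symptom, assuming a
> plain Java int is used to track the serialized size: once the commit passes
> Integer.MAX_VALUE (~2GB) bytes, the counter wraps to a negative value.
>
>     public class OverflowSketch {
>       public static void main(String[] args) {
>         int serializedBytes = Integer.MAX_VALUE; // 2147483647, roughly the 2GB limit
>         serializedBytes += 1;                    // overflow: wraps to -2147483648
>         System.out.println(serializedBytes);     // a negative "serialized commit size"
>       }
>     }
>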
> There appears to be a logging regression introduced by
> https://github.com/apache/beam/pull/10013
> With the new code, if the serialization overflows, the estimated byte count
> is set to Integer.MAX_VALUE, which equals the commit limit for appliance.
> Then the comparison here:
> https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L1371
> which uses >, does not trigger, so the large commit is just passed on to the
> commit queue, triggering the exception seen in #3 [2] when the weigher uses
> the negative serialized size for the semaphore acquire call.
> So where we previously would have thrown a KeyCommitTooLargeException, we
> now throw an IllegalArgumentException.
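>
> A hypothetical illustration of that path (the names below are mine, not the
> worker's): with the estimate clamped to Integer.MAX_VALUE the "greater than
> limit" check is false, so the commit is queued, and the overflowed negative
> size is later passed to a Semaphore-backed weigher, which rejects negative
> permits with an IllegalArgumentException.
>
>     import java.util.concurrent.Semaphore;
>
>     public class CommitLimitSketch {
>       static final int MAX_COMMIT_BYTES = Integer.MAX_VALUE; // appliance commit limit
>
>       public static void main(String[] args) throws InterruptedException {
>         int estimatedBytes = Integer.MAX_VALUE;  // clamped because serialization overflowed
>         if (estimatedBytes > MAX_COMMIT_BYTES) { // false: MAX_VALUE is not > MAX_VALUE
>           throw new RuntimeException("KeyCommitTooLargeException would have been thrown here");
>         }
>         int actualSerializedBytes = Integer.MIN_VALUE;   // the overflowed size
>         Semaphore queuedBytes = new Semaphore(MAX_COMMIT_BYTES);
>         queuedBytes.acquire(actualSerializedBytes);      // IllegalArgumentException: negative permits
>       }
>     }
>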
> From that exception description: 
> https://github.com/apache/beam/blob/v2.19.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java#L236
>           ". This may be caused by grouping a very "
>               + "large amount of data in a single window without using Combine,"
>               + " or by producing a large amount of data from a single input element."
> The overflow could be remembered explicitly instead of just comparing with 
> max.
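>
> A sketch of that suggestion (a hypothetical helper, not the actual patch):
> count with a long and keep an explicit overflow flag, so the limit check
> treats an overflowed commit as too large even though a clamped value would
> compare equal to the limit.
>
>     class CommitSizeEstimate {
>       private long bytes = 0;          // long, so the counter itself cannot wrap
>       private boolean overflowed = false;
>
>       void add(long n) {
>         bytes += n;
>         if (bytes > Integer.MAX_VALUE) {
>           overflowed = true;           // remember the overflow explicitly
>         }
>       }
>
>       boolean exceedsCommitLimit(int limitBytes) {
>         // An overflowed estimate is always over the limit, even when the clamped
>         // Integer.MAX_VALUE estimate would not compare greater than the limit.
>         return overflowed || bytes > limitBytes;
>       }
>     }
>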



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
