[ 
https://issues.apache.org/jira/browse/BEAM-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785979#comment-16785979
 ] 

Kenneth Knowles commented on BEAM-6757:
---------------------------------------

I can explain the stack trace: this is part of a GroupByKey or CombinePerKey 
operation, where it is adding that element to the grouping for its key and 
window. 

The crash is correct: an element's event timestamp can never be greater than 
the end of the window it is associated with. So the grouping makes no sense.

The question is how did that element get assigned that timestamp and window.

> Random "IllegalStateException: TimestampCombiner moved element from X to Y" 
> seen in running topology
> ----------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-6757
>                 URL: https://issues.apache.org/jira/browse/BEAM-6757
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.10.0
>            Reporter: Steve Niemitz
>            Assignee: Kenneth Knowles
>            Priority: Major
>
> We have a streaming job that does some aggregation, it looks something like:
> {code:java}
> - Read from Pub/sub (use timestamps from pubsub)
> - Window into 4 hour fixed windows, no allowed lateness
> - CombineByKey with some aggregations
> - Write out{code}
> Today, the pipeline stalled, continuously throwing an exception I had never 
> seen before (it had been running fine for ~10 days previously)
> {code:java}
> java.lang.IllegalStateException: TimestampCombiner moved element from 
> 2019-02-28T08:00:05.000Z to earlier time 2019-02-28T07:59:59.999Z for window 
> [2019-02-28T04:00:00.000Z..2019-02-28T08:00:00.000Z)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.WatermarkHold.shift(WatermarkHold.java:117)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.WatermarkHold.addElementHold(WatermarkHold.java:154)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.WatermarkHold.addHolds(WatermarkHold.java:98)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:605)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:349)
> org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:94)
> org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42)
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:134)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:76)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1233)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:144)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:972)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745){code}
> I'm not really sure what could have happened here, it looks like the element 
> is trying to be put into the wrong window?  This is running on dataflow with 
> "enable_streaming_engine" on.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to