[ https://issues.apache.org/jira/browse/SPARK-38078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707618#comment-17707618 ]
Vindhya G commented on SPARK-38078:
-----------------------------------

https://issues.apache.org/jira/browse/SPARK-43001 is a similar bug.

> Aggregation with Watermark in AppendMode is holding data beyond watermark
> boundary.
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-38078
>                 URL: https://issues.apache.org/jira/browse/SPARK-38078
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: krishna
>            Priority: Major
>
> I am struggling with a unique issue. I am not sure if my understanding is
> wrong or this is a bug in Spark.
>
> # I am reading a stream from Event Hubs/Kafka (Extract).
> # I am pivoting and aggregating the above DataFrame (Transformation). This
> is a WATERMARKED aggregation.
> # I am writing the aggregation to the Console/a Delta table in APPEND mode
> with a Trigger.
>
> However, the message most recently published to Event Hubs is not written
> to the console/Delta table even after it falls out of the watermark time.
>
> My understanding is that the event should be inserted into the Delta table
> after event time + watermark.
>
> Moreover, all the events held in memory must be flushed out to the sink,
> irrespective of the watermark, before stopping, to mark a graceful shutdown.
>
> Please advise.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
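
For readers following the quoted report, here is a minimal sketch of how an event-time watermark interacts with append mode. It is a plain-Python toy model, not Spark code and not the reporter's job: the function name simulate, the 60-second window, the 120-second delay, and the sample timestamps are all assumptions made for illustration. The model assumes the watermark is derived from the largest event time seen so far (not from wall-clock time) and that a window is emitted once its end is at or below the watermark. Under those assumptions, the window holding the most recent event stays in state until a later event moves the watermark forward.

```python
from collections import defaultdict


def simulate(batches, window_s, delay_s):
    """Toy model of an append-mode windowed count with an event-time watermark.

    batches  : list of micro-batches, each a list of event times in seconds
    window_s : tumbling window length in seconds (illustrative value)
    delay_s  : watermark delay in seconds (illustrative value)
    Returns (rows emitted per batch, windows still held in state).
    """
    max_event_time = None
    watermark = None
    counts = defaultdict(int)  # window start -> running count
    emitted = []
    for batch in batches:
        for t in batch:
            start = t - t % window_s
            # A row whose window is already closed by the watermark is dropped.
            if watermark is not None and start + window_s <= watermark:
                continue
            counts[start] += 1
            if max_event_time is None or t > max_event_time:
                max_event_time = t
        # The watermark only moves when newer event times arrive;
        # empty batches leave it unchanged.
        if max_event_time is not None:
            watermark = max_event_time - delay_s
        closed = []
        if watermark is not None:
            closed = sorted(s for s in counts if s + window_s <= watermark)
        # Append mode: a window is output once, when it is finalized.
        emitted.append([(s, s + window_s, counts.pop(s)) for s in closed])
    return emitted, dict(counts)


# Two idle batches pass after the event at t=70, yet nothing is emitted
# until the event at t=400 advances the watermark.
out, pending = simulate([[10, 20], [70], [], [], [400]], window_s=60, delay_s=120)
for i, rows in enumerate(out):
    print("batch", i, "emitted", rows)
print("still in state:", pending)
```

In this model the windows for t=10, 20 and 70 are only emitted in the batch that contains t=400, and the window for t=400 itself remains in state at the end. Whether this fully explains the behaviour in the report depends on details of the job that the report does not include.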