HeartSaVioR opened a new pull request #33689:
URL: https://github.com/apache/spark/pull/33689


   ### What changes were proposed in this pull request?
   
   This PR proposes to prohibit update mode in streaming aggregation with 
session window.
   
   UnsupportedOperationChecker now detects and rejects this combination. As a side 
effect, the code becomes simpler, since the iterator implementation that produced 
outputs for update mode can be removed.
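
   As a rough illustration of this kind of pre-execution check, the toy model below 
rejects update mode when an aggregation groups by a session window. It is only a 
sketch: `StreamingAggregate`, `UnsupportedOperationError`, and `check_output_mode` 
are hypothetical names invented here, not Spark's classes or the code in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamingAggregate:
    """Hypothetical stand-in for a streaming aggregation node in a query plan."""
    grouping_columns: tuple
    uses_session_window: bool


class UnsupportedOperationError(Exception):
    """Hypothetical error for a query that combines incompatible features."""


def check_output_mode(aggregate, output_mode):
    """Reject update mode when the aggregation groups by a session window."""
    if output_mode == "update" and aggregate.uses_session_window:
        raise UnsupportedOperationError(
            "update output mode is not supported for session window aggregation"
        )
```

   In this model the check passes for a session window aggregation in "append" mode 
and for a non-session aggregation in "update" mode. Only the combination of the two 
raises.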
   
   This PR also cleans up the test code by removing duplication.
   
   ### Why are the changes needed?
   
   The semantics of "update" mode for session window based streaming aggregation 
are unclear.
   
   For normal streaming aggregation, Spark provides outputs that can 
be "upsert"ed based on the grouping key. This relies on the fact that the grouping 
key does not change.
   
   This does not hold for session window based streaming aggregation, where the 
session window is part of the grouping key and its bounds can change as new events 
arrive. If you upsert the output based on the grouping key, it is highly likely that 
the existing row is not updated (overwritten) and the sink ends up with multiple 
different rows for the same session.
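
   The problem can be reproduced with a small simulation in plain Python (no Spark 
involved). This is a sketch under assumptions: the gap of 5 time units, the helper 
names `sessionize` and `run_micro_batches`, and the in-memory `sink` dictionary are 
all made up for illustration, and every session of a touched user is re-emitted per 
batch, which yields the same sink contents as emitting only the changed rows.

```python
from collections import defaultdict

GAP = 5  # session gap in arbitrary time units (assumed for this sketch)


def sessionize(timestamps, gap=GAP):
    """Merge event times into sessions, returned as (start, end, count) tuples."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # The event falls inside the open session, so extend that session.
            start, end, count = sessions[-1]
            sessions[-1] = (start, max(end, ts + gap), count + 1)
        else:
            sessions.append((ts, ts + gap, 1))
    return sessions


def run_micro_batches(batches, gap=GAP):
    """Upsert session counts into a sink keyed by (user, start, end)."""
    events = defaultdict(list)  # stand-in for streaming state
    sink = {}                   # stand-in for an external table
    for batch in batches:
        touched = set()
        for user, ts in batch:
            events[user].append(ts)
            touched.add(user)
        for user in touched:
            for start, end, count in sessionize(events[user], gap):
                # Upsert on the full grouping key, session bounds included.
                sink[(user, start, end)] = count
    return sink
```

   Running `run_micro_batches([[("a", 0), ("a", 3)], [("a", 6)]])` leaves two rows in 
the sink: the stale `("a", 0, 8)` with count 2 from the first batch and the extended 
`("a", 0, 11)` with count 3 from the second. The upsert never overwrites the first 
row because the session end, and therefore the key, changed.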
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, as we haven't released this feature.
   
   ### How was this patch tested?
   
   Updated tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


