HeartSaVioR opened a new pull request, #38528:
URL: https://github.com/apache/spark/pull/38528

   ### What changes were proposed in this pull request?
   
   This PR introduces a new interface, ComparableOffset, a mix-in for the 
streaming Offset interface that enables comparison between two offset 
instances. MicroBatchExecution validates the offset range when 
the offset instance implements ComparableOffset.
   
   The new interface can be mixed-in with both DSv1 streaming Offset and DSv2 
streaming Offset.
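
A minimal sketch of the idea, not the code in this PR: the method name `compareTo`, the `CompareResult` enum, `CounterOffset`, and `OffsetRangeValidator` are all hypothetical, and `Offset` here is a local stand-in for Spark's streaming Offset class. It only illustrates how an opt-in mix-in lets the engine validate an offset range while leaving other offsets untouched.

```java
// Hypothetical sketch: every name and signature below is an assumption for
// illustration, not the actual API added by this PR.

/** Local stand-in for Spark's streaming Offset, which exposes a JSON form. */
abstract class Offset {
    abstract String json();
}

/** Assumed set of comparison outcomes. */
enum CompareResult { LESS, EQUAL, GREATER, NOT_COMPARABLE }

/** Mix-in that an Offset implementation may opt into. */
interface ComparableOffset {
    CompareResult compareTo(ComparableOffset other);
}

/** A single-counter offset that opts into comparison. */
final class CounterOffset extends Offset implements ComparableOffset {
    final long value;

    CounterOffset(long value) { this.value = value; }

    @Override
    String json() { return Long.toString(value); }

    @Override
    public CompareResult compareTo(ComparableOffset other) {
        if (!(other instanceof CounterOffset)) {
            return CompareResult.NOT_COMPARABLE;
        }
        int c = Long.compare(value, ((CounterOffset) other).value);
        if (c < 0) return CompareResult.LESS;
        if (c > 0) return CompareResult.GREATER;
        return CompareResult.EQUAL;
    }
}

/** Plays the role of the engine-side check. */
final class OffsetRangeValidator {
    /**
     * Throws if the end offset is behind (or not comparable with) the start
     * offset. Offsets that do not implement ComparableOffset are skipped.
     */
    static void validate(Offset start, Offset end) {
        if (start instanceof ComparableOffset && end instanceof ComparableOffset) {
            CompareResult r = ((ComparableOffset) end).compareTo((ComparableOffset) start);
            if (r == CompareResult.LESS || r == CompareResult.NOT_COMPARABLE) {
                throw new IllegalStateException(
                    "Invalid offset range: start=" + start.json() + ", end=" + end.json());
            }
        }
    }
}
```

With this sketch, `OffsetRangeValidator.validate(new CounterOffset(1), new CounterOffset(5))` passes, while swapping the two arguments throws `IllegalStateException`.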
   
   This PR also implements this interface for the streaming offsets of the 
built-in data sources.
   
   ### Why are the changes needed?
   
   Currently, Spark doesn't make any assertions about offsets, and the data 
source implementation is fully responsible for validating them. It seems more 
useful for Spark to provide the offset validation than to merely document 
the responsibility and leave the duty to each data source implementation.
   
   Offset validation matters even more now that we have Trigger.AvailableNow, 
which advances the offset step by step and terminates when the offset equals 
the desired offset. A bug in a data source may stall query progress or even 
cause data duplication.
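
The failure mode above can be illustrated with a small simulation. This is not Spark code: offsets are modeled as plain `long` values, `AvailableNowSketch` and `runToTarget` are hypothetical names, and the specific checks (no backwards step, no step past the target) are assumptions about what a range validation could reject.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical simulation; offsets are plain longs and nothing here is Spark API.
final class AvailableNowSketch {
    /**
     * Advances from start until the offset equals target, the way a
     * Trigger.AvailableNow-style query terminates. nextOffset plays the role
     * of the data source. Returns the number of batches that were run.
     */
    static int runToTarget(long start, long target,
                           LongUnaryOperator nextOffset, boolean validate) {
        long current = start;
        int batches = 0;
        while (current != target) {
            // Guard so the simulation itself cannot loop forever.
            if (batches >= 1000) {
                throw new IllegalStateException("Query stalled: never reached " + target);
            }
            long next = nextOffset.applyAsLong(current);
            // Assumed validation: a step must not go backwards or pass the target.
            if (validate && (next < current || next > target)) {
                throw new IllegalArgumentException(
                    "Invalid offset range: " + current + " -> " + next
                        + " (target " + target + ")");
            }
            current = next;
            batches++;
        }
        return batches;
    }
}
```

A well-behaved source such as `o -> Math.min(o + 3, 10)` reaches target 10 from 0 in 4 batches. A buggy source such as `o -> o + 3` steps from 9 to 12 and never equals 10: without validation the loop only ends at the 1000-batch guard, while with validation the bad step is rejected immediately.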
   
   ### Does this PR introduce _any_ user-facing change?
   
   No for end users.
   
   ### How was this patch tested?
   
   New unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

