[GitHub] [spark] HeartSaVioR edited a comment on pull request #28707: [SPARK-31894][SS] Introduce UnsafeRow format validation for streaming state store

GitBox Wed, 03 Jun 2020 17:02:27 -0700


HeartSaVioR edited a comment on pull request #28707:
URL: https://github.com/apache/spark/pull/28707#issuecomment-638520229



   Will this be included in Spark 3.0.0? If this is meant to unblock 
SPARK-28067 so that it can go into Spark 3.0.0, then it's OK to consider this 
first. But if this is planned for Spark 3.1, then I'm not sure about the 
priority. Are all of you aware that the PR for SPARK-27237 was submitted more 
than a year ago and is still being deferred?
   
   I still don't get why the proposal restricts its usage to streaming 
aggregation, when the mechanism is a validation of the UnsafeRow format, which 
can be applied to all stateful operations. Let's not narrow the fix to only 
the problem we've just seen.
   
   Also, from my side, the overhead of the validation logic looks trivial 
compared to the work stateful operators already do: we don't validate every 
row, we don't even sample, we check just the first one. Unless there is a 
chance of introducing a show-stopper bug in the validation logic (so that we 
would need to provide a way to disable the validation), I don't see the need 
for a new configuration.
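
   To illustrate why the cost should be negligible, here is a minimal sketch 
of "validate only the first row" (in Python, not the actual Spark code). 
Every name in it (`validate_first_row`, `word_aligned`, `InvalidRowFormat`) 
is hypothetical, and the 8-byte alignment check only stands in for whatever 
format check the PR actually performs.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical stand-in for a state store entry: (key bytes, value bytes).
Row = Tuple[bytes, bytes]


class InvalidRowFormat(Exception):
    """Raised when the first row read from state fails the format check."""


def validate_first_row(
    rows: Iterable[Row], is_valid: Callable[[Row], bool]
) -> Iterator[Row]:
    """Yield rows unchanged, running is_valid on the first row only."""
    first = True
    for row in rows:
        if first:
            first = False
            if not is_valid(row):
                raise InvalidRowFormat(f"unexpected row layout: {row!r}")
        yield row


def word_aligned(row: Row) -> bool:
    """Illustrative check only: assume key and value are sized in 8-byte words."""
    key, value = row
    return len(key) % 8 == 0 and len(value) % 8 == 0
```

   Because the check runs once per iteration rather than once per row, its 
cost does not grow with the size of the state, and nothing in it is specific 
to streaming aggregation.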


