HeartSaVioR opened a new pull request, #38880:
URL: https://github.com/apache/spark/pull/38880

   ### What changes were proposed in this pull request?
   
   This PR proposes to clear the write batch (and the corresponding prefix 
iterators) after a commit has succeeded on the RocksDB state store. As a side 
effect, this PR also fixes a test case that had been relying on the "sort of 
bug" that the write batch was not cleaned up until either rollback or load was 
called.
   
   ### Why are the changes needed?
   
   Without this, the memory usage of WriteBatch for the RocksDB state store 
"accumulates" across the partitions on the same executor. Say 10 partitions of 
a stateful operator are assigned to an executor and run sequentially. Because 
we didn't clear the write batch after commit, by the time the executor 
processes the last partition assigned to it, the 10 WriteBatch instances 
together still hold all the writes performed in this microbatch.
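   
   The accumulation described above can be illustrated with a minimal, self-contained sketch. It does not use Spark's or RocksDB's actual classes: `StoreSketch`, `WriteBatchAccumulationSketch`, and `peakBufferedWrites` are hypothetical names introduced only for illustration, and a plain list stands in for the write batch. The figure of 1000 writes per partition is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one partition's state store instance.
// It only models the write batch lifecycle (load / put / commit).
class StoreSketch {
    private final List<String> writeBatch = new ArrayList<>();
    private final boolean clearOnCommit;

    StoreSketch(boolean clearOnCommit) {
        this.clearOnCommit = clearOnCommit;
    }

    // Loading a version resets any pending writes.
    void load() {
        writeBatch.clear();
    }

    void put(String key, String value) {
        writeBatch.add(key + "=" + value);
    }

    // Applying the batch to the database is omitted; only the buffer matters here.
    void commit() {
        if (clearOnCommit) {
            writeBatch.clear();
        }
    }

    int bufferedWrites() {
        return writeBatch.size();
    }
}

class WriteBatchAccumulationSketch {
    // Runs the partitions sequentially, as one executor would within a microbatch,
    // and returns the largest number of writes buffered across all stores at once.
    static int peakBufferedWrites(int partitions, int writesPerPartition, boolean clearOnCommit) {
        List<StoreSketch> stores = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            stores.add(new StoreSketch(clearOnCommit));
        }
        int peak = 0;
        for (StoreSketch store : stores) {
            store.load();
            for (int i = 0; i < writesPerPartition; i++) {
                store.put("key" + i, "value" + i);
            }
            // Measure just before commit, when this partition's batch is full.
            int total = 0;
            for (StoreSketch s : stores) {
                total += s.bufferedWrites();
            }
            peak = Math.max(peak, total);
            store.commit();
        }
        return peak;
    }

    public static void main(String[] args) {
        // Batches kept after commit: all 10 partitions' writes are held at the end.
        System.out.println(peakBufferedWrites(10, 1000, false)); // prints 10000
        // Batches cleared after commit: only the current partition's writes are held.
        System.out.println(peakBufferedWrites(10, 1000, true));  // prints 1000
    }
}
```

   In this model, clearing on commit bounds the buffered writes by a single partition's batch instead of the sum over all partitions assigned to the executor.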
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This is a sort of bugfix.
   
   ### How was this patch tested?
   
   Existing tests, with the affected test case fixed.

