[ https://issues.apache.org/jira/browse/SPARK-48586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Riya Verma updated SPARK-48586:
-------------------------------
    Description:
Currently, when changelog checkpointing is enabled, the maintenance task acquires the lock of the *RocksDB* state store while uploading a snapshot, which causes lock contention between the query processing tasks and the state maintenance thread. To eliminate this contention, maintenance tasks should avoid acquiring the lock. To prevent race conditions between task and maintenance threads, we can ensure that *RocksDBFileManager* has a linear history by making a deep copy of *RocksDBFileManager* every time a previous version is loaded. The original file manager is then unaffected by future state updates, and the new file manager is unaffected by background tasks that attempt to upload a snapshot.

  (was: the same text, without the bold markup on RocksDB and RocksDBFileManager)
> Remove lock contention between maintenance and task threads
> -----------------------------------------------------------
>
>                 Key: SPARK-48586
>                 URL: https://issues.apache.org/jira/browse/SPARK-48586
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.4.3
>            Reporter: Riya Verma
>            Priority: Major
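
The deep-copy idea in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's implementation: `FileManager`, `StateStore`, `upload_snapshot`, and the file names are hypothetical stand-ins, and the real *RocksDBFileManager* tracks far more state. It shows that once the store swaps in a deep copy on load, a maintenance thread holding the original manager can finish its upload without taking the store lock and without disturbing the task thread's new history.

```python
import copy
import threading


class FileManager:
    """Hypothetical stand-in for RocksDBFileManager: maps version -> files."""

    def __init__(self):
        self.version_to_files = {}

    def record(self, version, files):
        self.version_to_files[version] = list(files)


class StateStore:
    """Hypothetical store; its lock is taken by task-side calls only."""

    def __init__(self):
        self.lock = threading.Lock()
        self.file_manager = FileManager()
        self.loaded_version = 0

    def load(self, version):
        # Swap in a deep copy so the previous manager is never touched again
        # by task-side updates; return the original for reference.
        with self.lock:
            original = self.file_manager
            self.file_manager = copy.deepcopy(original)
            self.loaded_version = version
            return original

    def commit(self, files):
        with self.lock:
            self.loaded_version += 1
            self.file_manager.record(self.loaded_version, files)
            return self.loaded_version


def upload_snapshot(file_manager, version, files):
    # Maintenance-side work: operates on the manager it captured earlier
    # and deliberately does not acquire the store lock.
    file_manager.record(version, files)


store = StateStore()
store.commit(["001.sst"])                 # version 1
store.commit(["002.sst"])                 # version 2
captured = store.file_manager             # maintenance thread captures this

store.load(1)                             # task thread reloads version 1
store.commit(["003.sst"])                 # new version 2 on the copied manager

worker = threading.Thread(
    target=upload_snapshot, args=(captured, 2, ["002.sst", "snapshot-2.zip"])
)
worker.start()
worker.join()
```

After this runs, `captured` and `store.file_manager` are distinct objects with different entries for version 2, which mirrors the two guarantees stated in the description: the original manager does not see later state updates, and the new manager does not see the background upload.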