[jira] [Updated] (SPARK-48586) Remove lock acquisition in doMaintenance() by making a deep copy of file mappings in RocksDBFileManager in load()

Riya Verma (Jira) Wed, 26 Jun 2024 12:19:10 -0700


     [ https://issues.apache.org/jira/browse/SPARK-48586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Riya Verma updated SPARK-48586:
-------------------------------
    Description: Currently, the lock of the *RocksDB* state store is acquired 
when uploading the snapshot inside maintenance tasks when changelog 
checkpointing is enabled, which causes lock contention between query processing 
tasks and the state maintenance thread. To eliminate this contention and 
decrease the latency of streaming queries, lock acquisition inside maintenance 
tasks should be avoided. Removing the lock, however, can introduce race 
conditions between the task and maintenance threads. By making a deep copy of 
{{versionToRocksDBFiles}} and {{localFilesToDfsFiles}} in *RocksDBFileManager*, 
we can ensure that the file manager state is not updated by the task thread 
while background snapshot upload tasks are uploading a snapshot.  (was: 
Currently the lock of the 
*RocksDB* state store is acquired when uploading the snapshot inside 
maintenance tasks when change log checkpointing is enabled, which causes lock 
contention between query processing tasks and state maintenance thread. To 
eliminate the lock contention, lock acquisition inside maintenance tasks should 
be avoided. To prevent race conditions between task and maintenance threads, we 
can ensure that *RocksDBFileManager* has a linear history by ensuring a deep 
copy of *RocksDBFileManager* every time a previous version is loaded. The 
original file manager is not affected by future state update. The new file 
manager is not affected by background snapshot uploading tasks that attempt to 
upload a snapshot.)

> Remove lock acquisition in doMaintenance() by making a deep copy of file 
> mappings in RocksDBFileManager in load()
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-48586
>                 URL: https://issues.apache.org/jira/browse/SPARK-48586
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.4.3
>            Reporter: Riya Verma
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the lock of the *RocksDB* state store is acquired when uploading 
> the snapshot inside maintenance tasks when changelog checkpointing is 
> enabled, which causes lock contention between query processing tasks and 
> the state maintenance thread. To eliminate this contention and decrease the 
> latency of streaming queries, lock acquisition inside maintenance tasks 
> should be avoided. Removing the lock, however, can introduce race conditions 
> between the task and maintenance threads. By making a deep copy of 
> {{versionToRocksDBFiles}} and {{localFilesToDfsFiles}} in 
> *RocksDBFileManager*, we can ensure that the file manager state is not 
> updated by the task thread while background snapshot upload tasks are 
> uploading a snapshot.
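
The copy-on-load idea described above can be illustrated with a minimal sketch.
This is not Spark's implementation: it is written in Python rather than Scala,
and the class names (FileManagerSketch, StateStoreSketch), the method names,
and the "dfs/" path prefix are all hypothetical. Only the two mapping names
mirror the fields named in the description. It only shows why a reference held
by a maintenance thread stays stable once load() swaps in a deep copy.

```python
import copy


class FileManagerSketch:
    """Hypothetical stand-in for RocksDBFileManager (illustration only)."""

    def __init__(self, version_to_rocksdb_files=None, local_files_to_dfs_files=None):
        # version -> list of local file names that make up that version
        self.version_to_rocksdb_files = version_to_rocksdb_files or {}
        # local file name -> DFS file name
        self.local_files_to_dfs_files = local_files_to_dfs_files or {}

    def copy_mappings(self):
        # Deep copy, so nested lists are not shared with the original.
        return FileManagerSketch(
            copy.deepcopy(self.version_to_rocksdb_files),
            copy.deepcopy(self.local_files_to_dfs_files),
        )


class StateStoreSketch:
    """Hypothetical state store that replaces its file manager on load()."""

    def __init__(self):
        self.file_manager = FileManagerSketch()

    def load(self, version):
        # Swap in a deep copy. Anyone still holding the previous file
        # manager (e.g. a pending snapshot upload) sees no later updates.
        # `version` is unused here; a real load would restore that version.
        self.file_manager = self.file_manager.copy_mappings()
        return self

    def commit(self, version, files):
        # Task-thread update: only touches the current file manager.
        self.file_manager.version_to_rocksdb_files[version] = list(files)
        for name in files:
            self.file_manager.local_files_to_dfs_files[name] = "dfs/" + name


store = StateStoreSketch().load(0)
store.commit(1, ["a.sst"])
pending = store.file_manager      # reference a maintenance task might hold
store.load(1)
store.commit(2, ["a.sst", "b.sst"])
assert 2 not in pending.version_to_rocksdb_files  # unaffected by the new commit
```

In this sketch the maintenance side never needs the state store lock, because
the object it holds is no longer mutated after the next load(). Whether the
actual change copies the mappings at exactly this point is an assumption.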



