[jira] [Updated] (CASSANDRA-18119) Handle sstable metadata stats file getting a new mtime after compaction has finished

Marcus Eriksson (Jira) Thu, 15 Dec 2022 01:35:10 -0800


     [ https://issues.apache.org/jira/browse/CASSANDRA-18119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Marcus Eriksson updated CASSANDRA-18119:
----------------------------------------
    Description: 
Due to a race between compaction finishing and compaction strategies getting 
reloaded, there is a chance that we try to add both the new sstable and the old 
compacted sstable to the compaction strategy. In the LCS case this can 
cause the old sstable to get sent to L0 to avoid overlap, which changes the 
mtime of the stats metadata file. If the node is then shut down before the 
sstable is actually deleted from disk, startup fails with the following 
exception:

{code}
.../mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb_txn_compaction_3983c030-7c5a-11ed-8c66-2f5760cb10b3.log
REMOVE:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-0-big-,1671096247000,5][4003386800]
         ***Unexpected files detected for sstable [nb-0-big-]: last update time [Thu Dec 15 10:24:09 CET 2022] (1671096249000) should have been [Thu Dec 15 10:24:07 CET 2022] (1671096247000)
ADD:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-2-big-,0,5][319189529]
{code}
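
To make the record above easier to read, here is a minimal sketch that splits an ADD/REMOVE line into its parts. The field meanings (path prefix, last update time in milliseconds, file count, checksum) are an assumption inferred from the excerpt, and the class and method names are hypothetical, not Cassandra's own:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TxnLogRecordSketch {
    // Assumed layout, inferred from the excerpt only:
    // TYPE:[path prefix,last update time (ms),file count][checksum]
    private static final Pattern RECORD =
        Pattern.compile("^(ADD|REMOVE):\\[(.*),(\\d+),(\\d+)\\]\\[(\\d+)\\]$");

    /** Hypothetical holder for one parsed log line. */
    static final class LogLine {
        final String type;
        final String pathPrefix;
        final long updateTimeMillis;
        final int numFiles;
        final long checksum;

        LogLine(String type, String pathPrefix, long updateTimeMillis, int numFiles, long checksum) {
            this.type = type;
            this.pathPrefix = pathPrefix;
            this.updateTimeMillis = updateTimeMillis;
            this.numFiles = numFiles;
            this.checksum = checksum;
        }
    }

    static LogLine parse(String line) {
        Matcher m = RECORD.matcher(line.trim());
        if (!m.matches())
            throw new IllegalArgumentException("Not an ADD/REMOVE record: " + line);
        return new LogLine(m.group(1), m.group(2),
                           Long.parseLong(m.group(3)),
                           Integer.parseInt(m.group(4)),
                           Long.parseLong(m.group(5)));
    }

    public static void main(String[] args) {
        LogLine r = parse("REMOVE:[.../data/TransactionLogsTest/"
                          + "mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-0-big-,1671096247000,5][4003386800]");
        long onDiskMillis = 1671096249000L; // "last update time" reported in the error
        // Difference between the mtime found on disk and the one recorded in the log
        System.out.println(r.type + " drift=" + (onDiskMillis - r.updateTimeMillis) + "ms");
    }
}
```

With the values from the excerpt, the on-disk mtime is 2000 ms later than the recorded one, which is the mismatch the error reports.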

A workaround for this (until we properly fix the way compaction strategies get 
notified about sstable changes) is to ignore the timestamp of the STATS 
component when cleaning up compaction leftovers on startup. 
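
The workaround can be sketched as follows. This is an illustration only, not the actual patch: it assumes the STATS component is the file whose name ends in "Statistics.db", the component file names and mtimes are made up to match the excerpt, and the class and method names are hypothetical.

```java
import java.util.Map;

public class LeftoverTimestampCheck {
    // Assumption: the STATS component is the "...-Statistics.db" file.
    static final String STATS_SUFFIX = "Statistics.db";

    /**
     * Latest mtime across an sstable's component files.
     * When ignoreStats is true, the stats file does not contribute.
     */
    static long latestMtime(Map<String, Long> mtimesByFileName, boolean ignoreStats) {
        long latest = 0L;
        for (Map.Entry<String, Long> e : mtimesByFileName.entrySet()) {
            if (ignoreStats && e.getKey().endsWith(STATS_SUFFIX))
                continue; // skip the component whose mtime may have been bumped
            latest = Math.max(latest, e.getValue());
        }
        return latest;
    }

    public static void main(String[] args) {
        long recorded = 1671096247000L; // update time stored in the REMOVE record

        // Illustrative on-disk state: only the stats file was touched after compaction.
        Map<String, Long> onDisk = Map.of(
            "nb-0-big-Data.db",       1671096247000L,
            "nb-0-big-Index.db",      1671096246000L,
            "nb-0-big-Statistics.db", 1671096249000L);

        System.out.println("strict check passes:  " + (latestMtime(onDisk, false) == recorded));
        System.out.println("lenient check passes: " + (latestMtime(onDisk, true) == recorded));
    }
}
```

In this sketch the strict comparison fails because of the stats file alone, while the comparison that skips it matches the recorded time, so the leftover sstable could still be cleaned up on startup.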

  was:
Due to a race between compaction finishing and compaction strategies getting 
reloaded there is a chance that we try to add both the new sstable and the old 
compacted sstable to the compaction strategy, and in the LCS case this can 
cause the old sstable to get sent to L0 to avoid overlap. This changes the 
mtime of the stats metadata file and if the node is shut down before the 
sstable is actually deleted from disk, we fail starting with the following 
exception:

{code}
.../mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb_txn_compaction_3983c030-7c5a-11ed-8c66-2f5760cb10b3.log
[junit-timeout]         
REMOVE:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-0-big-,1671096247000,5][4003386800]
[junit-timeout]                 ***Unexpected files detected for sstable 
[nb-0-big-]: last update time [Thu Dec 15 10:24:09 CET 2022] (1671096249000) 
should have been [Thu Dec 15 10:24:07 CET 2022] (1671096247000)
[junit-timeout]         
ADD:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-2-big-,0,5][319189529]
{code}

A workaround for this (until we properly fix the way compaction strategies get 
notified about sstable changes) is to ignore the timestamp of the STATS 
component when cleaning up compaction leftovers on startup. 


> Handle sstable metadata stats file getting a new mtime after compaction has 
> finished
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18119
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18119
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction, Local/Startup and Shutdown
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x, 4.1.x
>
>
> Due to a race between compaction finishing and compaction strategies getting 
> reloaded, there is a chance that we try to add both the new sstable and the 
> old compacted sstable to the compaction strategy. In the LCS case this 
> can cause the old sstable to get sent to L0 to avoid overlap, which changes 
> the mtime of the stats metadata file. If the node is then shut down before the 
> sstable is actually deleted from disk, startup fails with the following 
> exception:
> {code}
> .../mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb_txn_compaction_3983c030-7c5a-11ed-8c66-2f5760cb10b3.log
> REMOVE:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-0-big-,1671096247000,5][4003386800]
>          ***Unexpected files detected for sstable [nb-0-big-]: last update time [Thu Dec 15 10:24:09 CET 2022] (1671096249000) should have been [Thu Dec 15 10:24:07 CET 2022] (1671096247000)
> ADD:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-2-big-,0,5][319189529]
> {code}
> A workaround for this (until we properly fix the way compaction strategies 
> get notified about sstable changes) is to ignore the timestamp of the STATS 
> component when cleaning up compaction leftovers on startup. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
