[jira] [Updated] (CASSANDRA-19569) sstableupgrade is very slow

2024-04-18 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19569:
-
 Bug Category: Parent values: Degradation(12984)Level 1 values: Performance 
Bug/Regression(12997)
   Complexity: Normal
  Component/s: Local/Compaction
Discovered By: User Report
Fix Version/s: 4.1.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> sstableupgrade is very slow
> ---
>
> Key: CASSANDRA-19569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19569
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Norbert Schultz
>Priority: Normal
> Fix For: 4.1.x
>
> Attachments: flamegraph_ok.png, flamegraph_sstableupgrade.png
>
>
> We are in the process of migrating cassandra from 3.11.x to 4.1.4 and 
> upgrading the sstables using sstableupgrade from Cassandra V4.1.4, from `me-` 
> to `nb-` Format
> Unfortunately, the process is very very slow (less than 0.5 MB/s).
> Some observations:
> - The process is only slow on (fast) SSDs, but not on ram disks.
> - The sstables consist of many partitions (this may be unrelated)
> - The upgrade process is fast, if we use `automatic_sstable_upgrade` instead 
> of the sstableupgradetool.
> - We give enough RAM (export MAX_HEAP_SIZE=8g)
> On profiling, we found out, that sstableupgrade is burning most CPU time on 
> {{posix_fadvise}} (see flamegraph_sstableupgrade.png ).
> My naive interpretation of the whole {{maybeReopenEarly}} to 
> {{posix_fadvise}} chain is, that the process just informs the linux kernel, 
> that the written data should not be cached. If we comment out the call to 
> {{NativeLibrary.trySkipCache}}, the conversion is running at expected 10MB/s 
> (see flamegraph_ok.png )



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19569) sstableupgrade is very slow

2024-04-17 Thread Norbert Schultz (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Schultz updated CASSANDRA-19569:

   Platform: Java11,Linux  (was: All)
Description: 
We are in the process of migrating cassandra from 3.11.x to 4.1.4 and upgrading 
the sstables using sstableupgrade from Cassandra V4.1.4, from `me-` to `nb-` 
Format

Unfortunately, the process is very very slow (less than 0.5 MB/s).

Some observations:
- The process is only slow on (fast) SSDs, but not on ram disks.
- The sstables consist of many partitions (this may be unrelated)
- The upgrade process is fast, if we use `automatic_sstable_upgrade` instead of 
the sstableupgradetool.
- We give enough RAM (export MAX_HEAP_SIZE=8g)

On profiling, we found out, that sstableupgrade is burning most CPU time on 
{{posix_fadvise}} (see flamegraph_sstableupgrade.png ).

My naive interpretation of the whole {{maybeReopenEarly}} to {{posix_fadvise}} 
chain is, that the process just informs the linux kernel, that the written data 
should not be cached. If we comment out the call to 
{{NativeLibrary.trySkipCache}}, the conversion is running at expected 10MB/s 
(see flamegraph_ok.png )



  was:
We are in the process of migrating cassandra from 3.11.x to 4.1.4 and upgrading 
the sstables using sstableupgrade from Cassandra V4.1.4, from `me-` to `nb-` 
Format

Unfortunately, the process is very very slow (less than 0.5 MB/s).

Some observations:
- The process is only slow on (fast) SSDs, but not on ram disks.
- The sstables consist of many partitions (this may be unrelated)
- The upgrade process is fast, if we use `automatic_sstable_upgrade` instead of 
the sstableupgradetool.
- We give enough RAM (export MAX_HEAP_SIZE=8g)

On profiling, we found out, that sstableupgrade is burning most CPU time on 
{{posix_fadvise}} (see attached   !flamegraph_sstableupgrade.png! ).

My naive interpretation of the whole {{maybeReopenEarly}} to {{posix_fadvise}} 
chain is, that the process just informs the linux kernel, that the written data 
should not be cached. If we comment out the call to 
{{NativeLibrary.trySkipCache}}, the conversion is running at expected 10MB/s 
(see  !flamegraph_ok.png! )




> sstableupgrade is very slow
> ---
>
> Key: CASSANDRA-19569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19569
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Norbert Schultz
>Priority: Normal
> Attachments: flamegraph_ok.png, flamegraph_sstableupgrade.png
>
>
> We are in the process of migrating cassandra from 3.11.x to 4.1.4 and 
> upgrading the sstables using sstableupgrade from Cassandra V4.1.4, from `me-` 
> to `nb-` Format
> Unfortunately, the process is very very slow (less than 0.5 MB/s).
> Some observations:
> - The process is only slow on (fast) SSDs, but not on ram disks.
> - The sstables consist of many partitions (this may be unrelated)
> - The upgrade process is fast, if we use `automatic_sstable_upgrade` instead 
> of the sstableupgradetool.
> - We give enough RAM (export MAX_HEAP_SIZE=8g)
> On profiling, we found out, that sstableupgrade is burning most CPU time on 
> {{posix_fadvise}} (see flamegraph_sstableupgrade.png ).
> My naive interpretation of the whole {{maybeReopenEarly}} to 
> {{posix_fadvise}} chain is, that the process just informs the linux kernel, 
> that the written data should not be cached. If we comment out the call to 
> {{NativeLibrary.trySkipCache}}, the conversion is running at expected 10MB/s 
> (see flamegraph_ok.png )



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org