[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866332#comment-17866332
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19429 at 7/16/24 9:31 AM:
------------------------------------------------------------------------

Well ... that would indeed work, just wrapping it in logger.isTraceEnabled(). 
On the other hand, I am a little bit irritated we are not cleaning it up on a 
deeper level, but I have to say that I would see it more problematic if we were 
talking about trunk. This is "just" 4.0 / 4.1 so it is just easier to wrap it 
in isTrace and be done with it if people are OK with that and if that unblocks 
us as we are short of reviewers and their time. 

I do not know how we want to "wrap" this: (the change proposed by [~dipiets] in 
his patch)
{code:java}
         InstrumentingCache<KeyCacheKey, RowIndexEntry> maybeKeyCache = 
CacheService.instance.keyCache;
-        if (maybeKeyCache.getCapacity() > 0)
+        if (maybeKeyCache.size() > 0)
             keyCache = maybeKeyCache; {code}
I think this would need to stay there as it is. I do not think this is called 
in hot path, it is used in some "setup / isOffline" methods which seem to be 
called just once?

On the other hand, my patch I proposed is dealing with this by rewriting the 
logic of how we are resolving this value.

 

EDIT: I think that code is reached everytime we are switching the reader from 
one SSTable to the other.


was (Author: smiklosovic):
Well ... that would indeed work, just wrapping it in logger.isTraceEnabled(). 
On the other hand, I am a little bit irritated we are not cleaning it up on a 
deeper level, but I have to say that I would see it more problematic if we were 
talking about trunk. This is "just" 4.0 / 4.1 so it is just easier to wrap it 
in isTrace and be done with it if people are OK with that and if that unblocks 
us as we are short of reviewers and their time. 

I do not know how we want to "wrap" this: (the change proposed by [~dipiets] in 
his patch)
{code:java}
         InstrumentingCache<KeyCacheKey, RowIndexEntry> maybeKeyCache = 
CacheService.instance.keyCache;
-        if (maybeKeyCache.getCapacity() > 0)
+        if (maybeKeyCache.size() > 0)
             keyCache = maybeKeyCache; {code}
I think this would need to stay there as it is. I do not think this is called 
in hot path, it is used in some "setup / isOffline" methods which seem to be 
called just once?

On the other hand, my patch I proposed is dealing with this by rewriting the 
logic of how we are resolving this value.

> Remove lock contention generated by getCapacity function in SSTableReader
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19429
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Dipietro Salvatore
>            Assignee: Dipietro Salvatore
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x
>
>         Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, Screenshot 2024-03-19 at 15.22.50.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html, 
> image-2024-03-08-15-51-30-439.png, image-2024-03-08-15-52-07-902.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && 
> CASSANDRA_USE_JDK11=true ant stress-build  && rm -rf data && bin/cassandra -f 
> -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write 
> n=10000000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log 
> -graph file=cload.html && \
> bin/nodetool compact keyspace1   && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m 
> cl=ONE -rate threads=406 -node localhost -log file=result.log -graph 
> file=graph.html
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to