[
https://issues.apache.org/jira/browse/CASSANDRA-19477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830036#comment-17830036
]
Stefan Miklosovic commented on CASSANDRA-19477:
---
[CASSANDRA-19477-4.1|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19477-4.1]
{noformat}
java8_pre-commit_tests
✓ j8_build 3m 56s
✓ j8_cqlsh_dtests_py3 5m 12s
✓ j8_cqlsh_dtests_py311 8m 35s
✓ j8_cqlsh_dtests_py311_vnode 9m 4s
✓ j8_cqlsh_dtests_py38 6m 47s
✓ j8_cqlsh_dtests_py38_vnode 6m 8s
✓ j8_cqlsh_dtests_py3_vnode 9m 18s
✓ j8_cqlshlib_cython_tests 11m 55s
✓ j8_cqlshlib_tests 8m 6s
✓ j8_dtests 32m 16s
✓ j8_dtests_vnode 36m 21s
✓ j8_jvm_dtests 16m 18s
✓ j8_jvm_dtests_repeat 41m 32s
✓ j8_jvm_dtests_vnode_repeat 41m 13s
✓ j8_simulator_dtests 2m 49s
✓ j8_unit_tests_repeat 3m 59s
✓ j8_utests_system_keyspace_directory_repeat 3m 45s
✓ j11_unit_tests_repeat 0m 31s
✓ j11_jvm_dtests_vnode_repeat 38m 49s
✓ j11_jvm_dtests_vnode 12m 37s
✓ j11_jvm_dtests_repeat 39m 0s
✓ j11_jvm_dtests 16m 8s
✓ j11_dtests_vnode 35m 43s
✓ j11_dtests 33m 55s
✓ j11_cqlshlib_tests 6m 15s
✓ j11_cqlshlib_cython_tests 7m 10s
✓ j11_cqlsh_dtests_py3_vnode 5m 37s
✓ j11_cqlsh_dtests_py38_vnode 6m 10s
✓ j11_cqlsh_dtests_py38 5m 26s
✓ j11_cqlsh_dtests_py311_vnode 5m 46s
✓ j11_cqlsh_dtests_py311 5m 29s
✓ j11_cqlsh_dtests_py3 5m 20s
✕ j8_jvm_dtests_vnode 16m 54s
org.apache.cassandra.distributed.test.GossipTest nodeDownDuringMove
✕ j8_unit_tests 11m 19s
org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
✕ j8_utests_system_keyspace_directory 9m 35s
org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
✕ j11_unit_tests 8m 6s
org.apache.cassandra.db.compaction.DateTieredCompactionStrategyTest testDropExpiredSSTables
org.apache.cassandra.db.compaction.DateTieredCompactionStrategyTest testFilterOldSSTables
org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
{noformat}
[java8_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4065/workflows/3c368b8e-2cc7-4c78-afe3-62b45253e416]
> Do not go to disk to get HintsStore.getTotalFileSize
>
>
> Key: CASSANDRA-19477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19477
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Hints
> Reporter: Jon Haddad
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 4.1.x, 5.0-rc, 5.x
>
> Attachments: flamegraph.cpu.html
>
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
> When testing a cluster with more requests than it could handle, I noticed
> significant CPU time (25%) spent in HintsStore.getTotalFileSize. Here's what
> I'm seeing from profiling:
> 10% of CPU time spent in HintsDescriptor.fileName which only does this:
>
> {noformat}
> return String.format("%s-%s-%s.hints", hostId, timestamp, version);
> {noformat}
> At a bare minimum here we should create this string up front with the host
> and version and eliminate 2 of the 3 substitutions, but I think it's probably
> faster to use a StringBuilder and avoid the underlying regular expression
> altogether.
> 12% of the time is spent in org.apache.cassandra.io.util.File.length. It
> looks like this is called once for each hint file on disk for each host we're
> hinting to. In the case of an overloaded cluster, this is significant. It
> would be better if we were to track the file size in memory for each hint
> file and reference that rather than go to the filesystem.
> These fairly small changes should make Cassandra more reliable when under
> load spikes.
> CPU Flame graph attached.
> I only tested this in 4.1 but it looks like this is present up to trunk.
>
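The two suggestions in the description above (build the hints file name once without String.format, and keep hint file sizes in memory instead of asking the filesystem) could be sketched roughly as follows. This is only an illustration under assumptions: SketchDescriptor and SketchHintsStore are hypothetical stand-ins, not Cassandra's HintsDescriptor or HintsStore, and the method names onBytesWritten and onFileDeleted are invented for this example. The actual patch may look quite different.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a hints descriptor (not Cassandra's class).
final class SketchDescriptor {
    // Built once in the constructor, so repeated fileName() calls do no formatting work.
    private final String fileName;

    SketchDescriptor(UUID hostId, long timestamp, int version) {
        // Same "<hostId>-<timestamp>-<version>.hints" layout as the String.format call
        // quoted in the description, but assembled with a StringBuilder.
        this.fileName = new StringBuilder(64)
            .append(hostId).append('-')
            .append(timestamp).append('-')
            .append(version).append(".hints")
            .toString();
    }

    String fileName() {
        return fileName;
    }
}

// Hypothetical stand-in for a per-host hints store that tracks sizes in memory.
final class SketchHintsStore {
    private final Map<String, AtomicLong> sizeByFile = new ConcurrentHashMap<>();
    private final AtomicLong totalSize = new AtomicLong();

    // Assumed hook: called by the writer after it appends bytes to a hints file.
    void onBytesWritten(SketchDescriptor descriptor, long bytes) {
        sizeByFile.computeIfAbsent(descriptor.fileName(), k -> new AtomicLong()).addAndGet(bytes);
        totalSize.addAndGet(bytes);
    }

    // Assumed hook: called when a hints file is deleted after dispatch or expiry.
    // A real implementation would need to handle a write racing with a delete.
    void onFileDeleted(SketchDescriptor descriptor) {
        AtomicLong removed = sizeByFile.remove(descriptor.fileName());
        if (removed != null) {
            totalSize.addAndGet(-removed.get());
        }
    }

    // Reads a counter only; no File.length() call and no file name formatting.
    long getTotalFileSize() {
        return totalSize.get();
    }
}
```

With this shape, reading the total size is a single atomic read, at the cost of having to keep the counters in sync on every write and delete path.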