Hello,

On a 3-node cluster (each node: 48 cores, 32 GB RAM, SSD), I'm getting timeouts on UPDATEs to a counter table. One node in particular is slow and is the one generating the timeouts. It is I/O bound: iotop consistently shows about 300 MB/s of reads, while writes fluctuate around 100 kB/s.
The keys seem well distributed.

The application uses a PHP driver, token aware, and sends updates asynchronously from 11 client machines.

I don't know what the cause could be:
- too many concurrent UPDATEs in async mode?
- a problem specific to the counter type? We've given the counter cache 1 GB.
- the disks? They are SSDs in software RAID 1.
- a key hotspot?
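
To test the first hypothesis, one option is to cap the number of in-flight async updates on each client and see whether the timeouts change. The sketch below shows the idea in Python with asyncio rather than with our PHP driver; `run_bounded`, `send_update`, and `max_in_flight` are hypothetical names for illustration, and `fake_send_update` stands in for a real driver call.

```python
import asyncio


async def run_bounded(updates, send_update, max_in_flight=64):
    """Await send_update(u) for every update, with at most max_in_flight pending."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(update):
        # Each update waits here until one of the max_in_flight slots is free.
        async with semaphore:
            return await send_update(update)

    return await asyncio.gather(*(guarded(u) for u in updates))


async def demo(n=200, limit=8):
    """Drive run_bounded with a fake sender and record the peak concurrency."""
    in_flight = 0
    peak = 0

    async def fake_send_update(update):
        # Hypothetical stand-in for the driver's async execute call.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0)  # yield to the event loop, like a network round trip
        in_flight -= 1
        return update

    results = await run_bounded(range(n), fake_send_update, max_in_flight=limit)
    return peak, len(results)


if __name__ == "__main__":
    peak, count = asyncio.run(demo())
    print(f"sent {count} updates, peak in-flight = {peak}")
```

With 11 client machines, the cluster-wide ceiling would be roughly 11 times whatever per-client limit is chosen.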

I've compiled some information below. If anyone has suggestions, other checks, or lines of thought I could pursue, that would be great!

----------------------------------------

Cassandra version 3.11.0

*iostat* shows something like this on the slow node (software RAID 1 on sda and sdb):

Device: rrqm/s wrqm/s     r/s  w/s  rMB/s wMB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sda       2,00   0,00 2160,00 0,00 169,20  0,00   160,43   147,10  68,53   68,53    0,00  0,46 100,00
sdb       1,00   0,00 1289,00 0,00  87,35  0,00   138,79   148,00 109,07  109,07    0,00  0,78 100,00
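
As a sanity check on the iostat sample, the per-device numbers can be combined into a total read throughput and an average read size per request. This is only a sketch: the two device rows are copied from the sample above, the parsing handles the decimal commas of my locale, and the conversion assumes `avgrq-sz` is expressed in 512-byte sectors, as in the sysstat version of iostat.

```python
# Column names as printed in the iostat header.
COLUMNS = [
    "rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s", "avgrq-sz",
    "avgqu-sz", "await", "r_await", "w_await", "svctm", "%util",
]

# Device rows copied verbatim from the sample (decimal commas included).
ROWS = [
    "sda 2,00 0,00 2160,00 0,00 169,20 0,00 160,43 147,10 68,53 68,53 0,00 0,46 100,00",
    "sdb 1,00 0,00 1289,00 0,00 87,35 0,00 138,79 148,00 109,07 109,07 0,00 0,78 100,00",
]

SECTOR_BYTES = 512  # assumption: avgrq-sz is counted in 512-byte sectors


def parse_row(line):
    """Split one iostat device row into (device, {column: float})."""
    device, *values = line.split()
    numbers = [float(v.replace(",", ".")) for v in values]
    return device, dict(zip(COLUMNS, numbers))


stats = dict(parse_row(row) for row in ROWS)

# Combined read throughput of both RAID 1 members, in MB/s.
total_read_mb = sum(s["rMB/s"] for s in stats.values())

# Average request size per device, converted from sectors to KiB.
avg_request_kib = {
    dev: s["avgrq-sz"] * SECTOR_BYTES / 1024 for dev, s in stats.items()
}

print(f"total reads: {total_read_mb:.2f} MB/s")
for dev, kib in avg_request_kib.items():
    print(f"{dev}: {kib:.1f} KiB per request, r_await {stats[dev]['r_await']:.2f} ms")
```

Both disks are read-only in this sample and sit at 100 %util, so the node is saturated by reads rather than by the counter writes themselves.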


*nodetool status*
UN  X.X.X.X  52.15 GiB  256          66,7%
UN  X.X.X.X  54.86 GiB  256          69,3%
UN  X.X.X.X  49.18 GiB  256          64,0%

*table structure*

CREATE TABLE document_search (
    id_document bigint,
    search_type ascii,
    searchkeyword_id bigint,
    nb_click counter,
    nb_display counter,
    PRIMARY KEY ((id_document, search_type), searchkeyword_id)
) WITH CLUSTERING ORDER BY (searchkeyword_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

*2 examples of nodetool tpstats at 2 different times*

1

Pool Name                  Active  Pending  Completed  Blocked  All time blocked
Native-Transport-Requests     128     1083    1824166        0                 0
CounterMutationStage           32      338     710480        0                 0

2

Pool Name                  Active  Pending  Completed  Blocked  All time blocked
ReadStage                      32      758     418822        0                 0
CounterMutationStage            0        0      98310        0                 0

*tablestats*

nodetool tablestats document_search

Total number of tables: 43
----------------
Read Count: 0
Read Latency: NaN ms.
Write Count: 288636
Write Latency: 2.354803579595061 ms.
Pending Flushes: 0
SSTable count: 11
Space used (live): 19683318113
Space used (total): 19683318113
Space used by snapshots (total): 0
Off heap memory used (total): 39258415
SSTable Compression Ratio: 0.3099081738824526
Number of keys (estimate): 4397936
Memtable cell count: 169182
Memtable data size: 20761379
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 0
Local read latency: NaN ms
Local write count: 169182
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 61.58
Bloom filter false positives: 1
Bloom filter false ratio: 0,00000
Bloom filter space used: 26271840
Bloom filter off heap memory used: 26271752
Index summary off heap memory used: 5496319
Compression metadata off heap memory used: 7490344
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 4055269
Compacted partition mean bytes: 3206
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 19804

*nodetool info*

Gossip active          : true
Thrift active          : true
Native Transport active: true
Load                   : 53.85 GiB
Generation No          : 1503674199
Uptime (seconds)       : 194310
Heap Memory (MB)       : 4663,19 / 7774,75
Off Heap Memory (MB)   : 208,24
Exceptions             : 0
Key Cache              : entries 11987913, size 1,09 GiB, capacity 2 GiB, 129046135 hits, 144375554 requests, 0,894 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 7579853, size 1 GiB, capacity 1 GiB, 9479923 hits, 39619041 requests, 0,239 recent hit rate, 7200 save period in seconds
Chunk Cache            : entries 97792, size 5,97 GiB, capacity 5,97 GiB, 38965356 misses, 182409581 requests, 0,786 recent hit rate, 56,113 microseconds miss latency
Percent Repaired       : 46.78765116584098%
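
The cache hit rates in the nodetool info output can be recomputed from the raw hit and request counters, which makes the three caches easier to compare. A minimal sketch, with the counters copied from the output above; the dictionary layout and the `hit_rate` helper are just for illustration.

```python
# Raw counters copied from the nodetool info output.
caches = {
    "key": {"hits": 129046135, "requests": 144375554},
    "counter": {"hits": 9479923, "requests": 39619041},
    # The chunk cache line reports misses, so hits = requests - misses.
    "chunk": {"hits": 182409581 - 38965356, "requests": 182409581},
}


def hit_rate(hits, requests):
    """Return hits / requests, or None when there were no requests."""
    return hits / requests if requests else None


rates = {name: hit_rate(c["hits"], c["requests"]) for name, c in caches.items()}

# Misses are the requests that could not be served from the cache.
counter_misses = caches["counter"]["requests"] - caches["counter"]["hits"]

for name, rate in rates.items():
    print(f"{name:8s} hit rate {rate:.3f}")
print(f"counter cache misses: {counter_misses}")
```

The counter cache is full (1 GiB of 1 GiB) and only about a quarter of its lookups hit. As far as I understand, a counter update that misses this cache has to read the current value before writing, which might explain why a write-only workload produces so many reads.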
