Hello everyone, I need some advice to solve my use case problem. I have already tried some solutions, but they didn't work out. Can you help me with the following configuration, please? Any help is much appreciated.
I'm using:
- Cassandra 3.11.3
- java version "1.8.0_191"

My use case has the following constraints:
- about 1M reads per day (and rising)
- about 2M writes per day (and rising)
- a high peak of requests within less than 2 hours, during which the system receives half of the whole day's traffic (500K reads, 1M writes)
- each request is composed of 1 read and 2 writes (1 delete + 1 insert):
  * the read query selects at most 3 records by partition key:
      select * from my_keyspace.my_table where pkey = ? limit 3
  * then one record is deleted:
      delete from my_keyspace.my_table where pkey = ? and event_datetime = ? IF EXISTS
  * finally the new data is stored:
      insert into my_keyspace.my_table (event_datetime, pkey, agent, some_id, ft, ftt, ...) values (?, ?, ?, ?, ?, ?, ...)
- each row is pretty wide. I don't know the exact size, because there are 2 dynamic text columns that each store between 1MB and 50MB of data. So reads are heavy, since every request fetches 3 records of that size, and writes are expensive as well, because each row is that wide. (A sketch of the table is at the end of this post.)

Currently I have 3 nodes with the following hardware:

node1:
- Intel Core i7-3770
- 2x 3.0 TB SATA HDD
- 4x 8192 MB DDR3 RAM (32 GB)
- nominal transfer rate 175 MB/s

    # blockdev --report /dev/sd[ab]
    RO  RA   SSZ  BSZ   StartSec  Size           Device
    rw  256  512  4096  0         3000592982016  /dev/sda
    rw  256  512  4096  0         3000592982016  /dev/sdb

node2, node3:
- Intel Core i7-2600
- 2x 3.0 TB SATA HDD
- 4x 4096 MB DDR3 RAM (16 GB)
- nominal transfer rate 155 MB/s

    # blockdev --report /dev/sd[ab]
    RO  RA   SSZ  BSZ   StartSec  Size           Device
    rw  256  512  4096  0         3000592982016  /dev/sda
    rw  256  512  4096  0         3000592982016  /dev/sdb

Each node has 2 disks, but I disabled the RAID option and created a single virtual disk to get more usable space. Can this configuration create issues?

I have already tried some configurations to make it work:

1) straightforward attempt
- default Cassandra configuration (cassandra.yaml)
- RF=1
- SizeTieredCompactionStrategy (the write-oriented strategy)
- no row cache (given the size of the wide rows, it seemed better to disable it)
- gc_grace_seconds = 1 day (unfortunately, with no repair schedule at all)

results: too many timeouts, losing data

2) second attempt
- added repair schedules (the cron job is sketched at the end of this post)
- RF=3 (to increase read speed)

results:
- still too many timeouts, losing data
- high I/O consumption on every node (iostat shows 100% in %util on each node; dstat shows hundreds of MB read per interval)
- node2 frozen until I stopped the writes
- node3 almost frozen
- many pending MutationStage events in nodetool tpstats on node2
- many full GCs
- many HintsDispatchExecutor events in system.log

3) current attempt
- repair schedules
- RF=3
- set durable_writes = false to speed up writes
- increased the young generation heap size
- decreased SurvivorRatio, to leave more young-generation space for the wide row data
- increased MaxTenuringThreshold from 1 to 3, to decrease read latency
- increased Cassandra's memtable on-heap and off-heap sizes, because of the wide row data
- changed memtable_allocation_type to offheap_objects, because of the wide row data
(the exact statements and settings behind these changes are sketched at the end of this post)

results:
- better GC behaviour on node1 and node3
- still high I/O consumption on every node (iostat shows 100% in %util on each node; dstat shows hundreds of MB read per interval)
- node2 still completely frozen
- many pending MutationStage events in nodetool tpstats on node2
- many HintsDispatchExecutor events in system.log on every node

I cannot move to AWS; I can only get dedicated servers.
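For reference, the table looks roughly like this. It is only a sketch reconstructed from the queries above: the column types, the clustering order, and which columns are the 2 large ones are my assumptions.

    CREATE TABLE my_keyspace.my_table (
        pkey text,                 -- partition key (type assumed)
        event_datetime timestamp,  -- clustering column, used by the delete
        agent text,
        some_id text,
        ft text,                   -- assumed to be one of the 2 large columns (1MB-50MB)
        ftt text,                  -- assumed to be the other large column (1MB-50MB)
        -- ... further columns elided in the insert statement above
        PRIMARY KEY (pkey, event_datetime)
    ) WITH CLUSTERING ORDER BY (event_datetime DESC);  -- DESC assumed, so "limit 3" reads the newest records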
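These are the CQL statements behind the RF, durable_writes, gc_grace_seconds and compaction settings mentioned above (assuming SimpleStrategy; I am aware that durable_writes = false skips the commit log for the keyspace, trading safety for write speed):

    ALTER KEYSPACE my_keyspace
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
        AND durable_writes = false;

    ALTER TABLE my_keyspace.my_table
        WITH gc_grace_seconds = 86400
        AND compaction = {'class': 'SizeTieredCompactionStrategy'};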
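The repair schedule is a plain cron entry on each node, staggered so the nodes do not repair at the same time (the day, time, and file path below are just an example):

    # /etc/cron.d/cassandra-repair -- node1 repairs its primary ranges on Monday at 03:00
    0 3 * * 1 cassandra nodetool repair -pr my_keyspace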
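And these are the GC and memtable changes from the current attempt, as jvm.options and cassandra.yaml excerpts. The concrete numbers are examples of the direction of each change (the shipped CMS defaults are SurvivorRatio=8 and MaxTenuringThreshold=1), not my exact values:

    # jvm.options
    # larger young generation (example value)
    -Xmn2G
    -XX:SurvivorRatio=4
    -XX:MaxTenuringThreshold=3

    # cassandra.yaml (example sizes)
    memtable_heap_space_in_mb: 2048
    memtable_offheap_space_in_mb: 2048
    memtable_allocation_type: offheap_objects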
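For completeness, the observations in the results above come from these commands, run on each node during the peak (the log path is the default package location):

    iostat -x 5                 # %util pinned at 100% on the data disks
    dstat 5                     # hundreds of MB read per interval
    nodetool tpstats            # pending MutationStage on node2
    grep HintsDispatchExecutor /var/log/cassandra/system.log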
Do you have any suggestions for fine-tuning the system for this use case? Thank you, Marco