TWCS on Non TTL Data

Isaeed Mohanna Tue, 14 Sep 2021 05:41:51 -0700

Hi,
I have a table that stores time series data. The data is not TTLed, since we 
want to retain it for the foreseeable future, and there are no updates or 
deletes. (Deletes can happen in the rare case that scrambled data reaches the 
table, but this is extremely rare.)
We write incoming data to the table constantly, about 5 million writes a day. 
Most of it was generated within the past week, but occasionally we also 
receive older data that got stuck somewhere. Our reads are usually for the 
most recent data, the last one to three months, but we also fetch older data 
for specific time periods in the past.
Lately we have been facing performance trouble with this table (see the 
histogram below). When compaction is running on the table, response times 
even climb to 10-20 seconds!
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                           (micros)      (micros)         (bytes)
50%           215.00          17.08      89970.66            1916         149
75%           446.00          24.60     223875.79            2759         215
95%           535.00          35.43     464228.84            8239         642
98%           642.00          51.01     668489.53           24601        1916
99%           642.00          73.46     962624.93           42510        3311
Min             0.00           2.30      10090.81              43           0
Max           770.00        1358.10    2395318.86         5839588      454826


As you can see, we are scanning hundreds of SSTables per read. It turns out we 
are using DTCS (min: 4, max: 32). The table folder contains ~33K files, 
totalling ~130GB per node (cleanup is still pending after we increased the 
cluster size), and compaction takes a very long time to complete.
As I understand it, DTCS is deprecated, so my questions are:

  1.  Should we switch to TWCS even though our data is not TTLed? Since we do 
not delete at all, can we still use it? Will it improve performance?
  2.  If we should switch, I am thinking of using a time window of one week, 
so that reads scan tens of SSTables instead of the hundreds they scan today. 
Does that sound reasonable?
  3.  Is there a recommended size for a window bucket in terms of disk space?
  4.  If TWCS is not a good idea, should I switch to STCS instead? Could that 
yield better performance than the current situation?
  5.  What are the risks of changing the compaction strategy on a production 
system? Can it be done on the fly, or is it better to go through a full test 
and backup cycle?
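
A rough sketch of the arithmetic behind question 2, in Python. It assumes that 
each closed TWCS window ends up as roughly one SSTable, which late-arriving 
data can break, so the counts are a best case rather than a prediction. The 
30-day month is an approximation, and the keyspace and table names 
(my_keyspace, my_timeseries) are hypothetical placeholders, as are the helper 
function names.

```python
import math


def windows_touched(read_span_days: int, window_days: int) -> int:
    """Upper bound on how many time windows a read spanning
    read_span_days can overlap when windows are window_days long."""
    # A span that is not aligned to window boundaries can straddle
    # one window more than ceil(span / window).
    return math.ceil(read_span_days / window_days) + 1


def twcs_alter_statement(keyspace: str, table: str, window_days: int) -> str:
    """Build the CQL text that would switch a table to TWCS.
    Keyspace and table names are placeholders supplied by the caller."""
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = {{"
        "'class': 'TimeWindowCompactionStrategy', "
        "'compaction_window_unit': 'DAYS', "
        f"'compaction_window_size': {window_days}}};"
    )


WRITES_PER_DAY = 5_000_000  # figure quoted above
WINDOW_DAYS = 7             # the one-week window proposed in question 2

for months in (1, 3):
    span_days = months * 30  # assumption: a month is about 30 days
    print(f"{months} month(s): at most "
          f"{windows_touched(span_days, WINDOW_DAYS)} windows")

print(f"writes per window: {WRITES_PER_DAY * WINDOW_DAYS}")
print(twcs_alter_statement("my_keyspace", "my_timeseries", WINDOW_DAYS))
```

Under these assumptions a read over the last one to three months overlaps at 
most 6 to 14 weekly windows, and each window receives about 35 million writes. 
The statement is only printed here, not executed against a cluster.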

All input will be appreciated,
Thank you
