Also: the coordinator handles tracing and read repair. Make sure probabilistic 
tracing is off in production, and keep your data repaired if possible to take 
read repair out of the read path.

Use tracing to see what’s taking the time.
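
To see where a single query spends its time, you can trace just that statement 
from cqlsh (the userid value here is only an example):

    TRACING ON;
    SELECT segmentid FROM stresstest.user_to_segment WHERE userid = 'user42';
    TRACING OFF;

The trace breaks the request into coordinator and replica steps with 
microsecond elapsed times.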

-- 
Jeff Jirsa


> On Feb 5, 2018, at 6:32 AM, Jeff Jirsa <jji...@gmail.com> wrote:
> 
> There are two parts to latency on the Cassandra side:
> 
> Local and coordinator
> 
> When you read, the node you connect to coordinates the request to the node 
> that has the data (potentially itself). Long tails in coordinator latency 
> tend to be the coordinator itself GC'ing, which will happen from time to 
> time. If latency is more consistently high, it may be natural latency in 
> your cluster (i.e. your requests are going cross-WAN and the other DC is 
> 10-20 ms away).
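> 
> You can look at the two pieces separately with nodetool (table name taken 
> from later in this thread):
> 
>     nodetool proxyhistograms                           # coordinator-side latency percentiles
>     nodetool cfhistograms stresstest user_to_segment   # local replica read latency percentiles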
> 
> If the latency shows up in p99 but not p50, you can almost always 
> speculatively read from another coordinator (driver-level speculation) after 
> a millisecond or so.
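> 
> Driver-level speculation is configured in the client (for instance, the Java 
> driver's ConstantSpeculativeExecutionPolicy with a delay of a millisecond or 
> two, applied to idempotent statements). The server-side analogue is the 
> table's speculative_retry option, which the schema further down this thread 
> already sets:
> 
>     ALTER TABLE stresstest.user_to_segment
>         WITH speculative_retry = '99PERCENTILE';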
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Feb 5, 2018, at 5:41 AM, mohsin k <moshinkarova...@gmail.com> wrote:
>> 
>> Thanks for the response, @Nicolas. I was considering the total read latency 
>> from client to server (as shown in the image above), which is around 30 ms; 
>> I want to get it to around 3 ms (client and server are on the same network). 
>> I did not consider the read latency reported by the server (which I should 
>> have). I monitored CPU, memory, and the JVM lifecycle, all of which are at 
>> safe levels. I think the difference (0.030 ms vs 30 ms) might be because of 
>> low network bandwidth; correct me if I am wrong.
>> 
>> I did reduce chunk_length_in_kb to 4 KB, but I didn't see a considerable 
>> difference, possibly because there is little room left for improvement on 
>> the server side.
>> 
>> Thanks again.
>> 
>>> On Mon, Feb 5, 2018 at 6:52 PM, Nicolas Guyomar <nicolas.guyo...@gmail.com> 
>>> wrote:
>>> Your row cache hit rate is 0.971, which is already very high; IMHO there is 
>>> "nothing" left to do here if you can afford to store your entire dataset in 
>>> memory.
>>> 
>>> A local read latency of 0.030 ms already seems good to me; what makes you 
>>> think you can achieve more with the relatively "small" box you are using?
>>> 
>>> You have to keep an eye on other metrics that might be limiting factors, 
>>> like CPU usage, JVM heap lifecycle, and so on.
>>> 
>>> For read-heavy workloads it is sometimes advised to reduce 
>>> chunk_length_in_kb from the default 64 KB to 4 KB; see if it helps!
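>>> 
>>> Something along these lines (table from your earlier message; SSTables 
>>> already on disk keep the old chunk size until rewritten, e.g. with 
>>> "nodetool upgradesstables -a stresstest user_to_segment"):
>>> 
>>>     ALTER TABLE stresstest.user_to_segment
>>>         WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};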
>>> 
>>>> On 5 February 2018 at 13:09, mohsin k <moshinkarova...@gmail.com> wrote:
>>>> Hey Rahul,
>>>> 
>>>> Each partition has around 10 clustering keys. Based on nodetool, I can 
>>>> roughly estimate the partition size to be less than 1 KB.
>>>> 
>>>>> On Mon, Feb 5, 2018 at 5:37 PM, mohsin k <moshinkarova...@gmail.com> 
>>>>> wrote:
>>>>> Hey Nicolas,
>>>>> 
>>>>> My goal is to reduce latency as much as possible. I did wait for warm-up. 
>>>>> The test ran for more than 15 minutes; I am not sure why it shows 2 
>>>>> minutes, though.
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Mon, Feb 5, 2018 at 5:25 PM, Rahul Singh 
>>>>>> <rahul.xavier.si...@gmail.com> wrote:
>>>>>> What is the average size of your partitions/rows? 1 GB may not be 
>>>>>> enough.
>>>>>> 
>>>>>> Rahul
>>>>>> 
>>>>>>> On Feb 5, 2018, 6:52 AM -0500, mohsin k <moshinkarova...@gmail.com>, 
>>>>>>> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I have been looking into different configurations for tuning my 
>>>>>>> Cassandra servers. Initially I load-tested the server with the 
>>>>>>> cassandra-stress tool using default configs, then tuned one config at a 
>>>>>>> time to measure the impact of each change. The first config I tried was 
>>>>>>> setting row_cache_size_in_mb to 1000 (MB) in cassandra.yaml and adding 
>>>>>>> caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} to the table. 
>>>>>>> After changing these configs, I observed that latency increased rather 
>>>>>>> than decreased. It would be really helpful to understand why this is 
>>>>>>> the case and what steps to take to decrease the latency.
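>>>>>>> 
>>>>>>> For reference, the two changes were the cassandra.yaml line 
>>>>>>> "row_cache_size_in_mb: 1000" (which needs a node restart) and this 
>>>>>>> table option:
>>>>>>> 
>>>>>>>     ALTER TABLE stresstest.user_to_segment
>>>>>>>         WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};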
>>>>>>> 
>>>>>>> I am running a cluster with 4 nodes.
>>>>>>> 
>>>>>>> Following is my schema:
>>>>>>> 
>>>>>>> CREATE TABLE stresstest.user_to_segment (
>>>>>>>     userid text,
>>>>>>>     segmentid text,
>>>>>>>     PRIMARY KEY (userid, segmentid)
>>>>>>> ) WITH CLUSTERING ORDER BY (segmentid DESC)
>>>>>>>     AND bloom_filter_fp_chance = 0.1
>>>>>>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
>>>>>>>     AND comment = 'A table to hold blog segment user relation'
>>>>>>>     AND compaction = {'class': 
>>>>>>> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>>>>>>>     AND compression = {'chunk_length_in_kb': '64', 'class': 
>>>>>>> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>>>>>     AND crc_check_chance = 1.0
>>>>>>>     AND dclocal_read_repair_chance = 0.1
>>>>>>>     AND default_time_to_live = 0
>>>>>>>     AND gc_grace_seconds = 864000
>>>>>>>     AND max_index_interval = 2048
>>>>>>>     AND memtable_flush_period_in_ms = 0
>>>>>>>     AND min_index_interval = 128
>>>>>>>     AND read_repair_chance = 0.0
>>>>>>>     AND speculative_retry = '99PERCENTILE';
>>>>>>> 
>>>>>>> Following are the node specs:
>>>>>>> RAM: 4 GB
>>>>>>> CPU: 4 cores
>>>>>>> HDD: 250 GB
>>>>>>> 
>>>>>>> 
>>>>>>> Following is the output of 'nodetool info' after setting 
>>>>>>> row_cache_size_in_mb:
>>>>>>> 
>>>>>>> ID                     : d97dfbbf-1dc3-4d95-a1d9-c9a8d22a3d32
>>>>>>> Gossip active          : true
>>>>>>> Thrift active          : false
>>>>>>> Native Transport active: true
>>>>>>> Load                   : 10.94 MiB
>>>>>>> Generation No          : 1517571163
>>>>>>> Uptime (seconds)       : 9169
>>>>>>> Heap Memory (MB)       : 136.01 / 3932.00
>>>>>>> Off Heap Memory (MB)   : 0.10
>>>>>>> Data Center            : dc1
>>>>>>> Rack                   : rack1
>>>>>>> Exceptions             : 0
>>>>>>> Key Cache              : entries 125881, size 9.6 MiB, capacity 100 
>>>>>>> MiB, 107 hits, 126004 requests, 0.001 recent hit rate, 14400 save 
>>>>>>> period in seconds
>>>>>>> Row Cache              : entries 125861, size 31.54 MiB, capacity 1000 
>>>>>>> MiB, 4262684 hits, 4388545 requests, 0.971 recent hit rate, 0 save 
>>>>>>> period in seconds
>>>>>>> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 
>>>>>>> hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
>>>>>>> Chunk Cache            : entries 273, size 17.06 MiB, capacity 480 MiB, 
>>>>>>> 325 misses, 126623 requests, 0.997 recent hit rate, NaN microseconds 
>>>>>>> miss latency
>>>>>>> Percent Repaired       : 100.0%
>>>>>>> Token                  : (invoke with -T/--tokens to see all 256 tokens)
>>>>>>> 
>>>>>>> 
>>>>>>> Following is the output of 'nodetool cfstats':
>>>>>>> 
>>>>>>> Total number of tables: 37
>>>>>>> ----------------
>>>>>>> Keyspace : stresstest
>>>>>>> Read Count: 4398162
>>>>>>> Read Latency: 0.02184742626579012 ms.
>>>>>>> Write Count: 0
>>>>>>> Write Latency: NaN ms.
>>>>>>> Pending Flushes: 0
>>>>>>> Table: user_to_segment
>>>>>>> SSTable count: 1
>>>>>>> SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
>>>>>>> Space used (live): 11076103
>>>>>>> Space used (total): 11076103
>>>>>>> Space used by snapshots (total): 0
>>>>>>> Off heap memory used (total): 107981
>>>>>>> SSTable Compression Ratio: 0.5123353861375962
>>>>>>> Number of partitions (estimate): 125782
>>>>>>> Memtable cell count: 0
>>>>>>> Memtable data size: 0
>>>>>>> Memtable off heap memory used: 0
>>>>>>> Memtable switch count: 2
>>>>>>> Local read count: 4398162
>>>>>>> Local read latency: 0.030 ms
>>>>>>> Local write count: 0
>>>>>>> Local write latency: NaN ms
>>>>>>> Pending flushes: 0
>>>>>>> Percent repaired: 0.0
>>>>>>> Bloom filter false positives: 0
>>>>>>> Bloom filter false ratio: 0.00000
>>>>>>> Bloom filter space used: 79280
>>>>>>> Bloom filter off heap memory used: 79272
>>>>>>> Index summary off heap memory used: 26757
>>>>>>> Compression metadata off heap memory used: 1952
>>>>>>> Compacted partition minimum bytes: 43
>>>>>>> Compacted partition maximum bytes: 215
>>>>>>> Compacted partition mean bytes: 136
>>>>>>> Average live cells per slice (last five minutes): 5.719932432432432
>>>>>>> Maximum live cells per slice (last five minutes): 10
>>>>>>> Average tombstones per slice (last five minutes): 1.0
>>>>>>> Maximum tombstones per slice (last five minutes): 1
>>>>>>> Dropped Mutations: 0
>>>>>>> 
>>>>>>> Following are my results: the blue graph is before setting 
>>>>>>> row_cache_size_in_mb, the orange one after.
>>>>>>> 
>>>>>>> Thanks, 
>>>>>>> Mohsin
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
