Also: the coordinator handles tracing and read repair. Make sure tracing is off in production, and keep your data repaired if you can, so read repair isn't adding coordinator-side work.

Use tracing to see what's taking the time.

--
Jeff Jirsa
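(For anyone wanting to act on both points: probabilistic tracing is controlled per node with nodetool, and a single query can still be traced ad hoc from cqlsh. A minimal sketch; the SELECT and its userid value are placeholders based on the schema later in this thread:)

    # Turn off probabilistic tracing in production (run against each node):
    nodetool settraceprobability 0

    # Trace one query interactively to see where the time goes:
    cqlsh> TRACING ON;
    cqlsh> SELECT segmentid FROM stresstest.user_to_segment
           WHERE userid = 'user-1';   -- 'user-1' is a placeholder value
    cqlsh> TRACING OFF;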
> On Feb 5, 2018, at 6:32 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> There are two parts to latency on the Cassandra side: local and coordinator.
>
> When you read, the node to which you connect coordinates the request to the node which has the data (potentially itself). A long tail in coordinator latencies tends to be the coordinator itself GC'ing, which will happen from time to time. If it's more consistently high, it may be natural latencies in your cluster (i.e. your requests are going cross-WAN and the other DC is 10-20 ms away).
>
> If the latency is seen in p99 but not p50, you can almost always speculatively read from another coordinator (driver-level speculation) after a millisecond or so.
>
> --
> Jeff Jirsa
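(Driver-level speculation, as Jeff describes it, is configured on the client rather than the server. A minimal sketch assuming the 3.x DataStax Java driver; the contact point, query, and the 1 ms delay are illustrative placeholders, not recommendations:)

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.policies.ConstantSpeculativeExecutionPolicy;

    public class SpeculativeReadExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")   // placeholder contact point
                    .withSpeculativeExecutionPolicy(
                            // After 1 ms with no response, send the same read to
                            // another coordinator; at most 2 in-flight attempts.
                            new ConstantSpeculativeExecutionPolicy(1, 2))
                    .build();
            Session session = cluster.connect();

            // The driver only speculates on statements marked idempotent,
            // so a retried request can't be applied twice.
            Statement read = new SimpleStatement(
                    "SELECT segmentid FROM stresstest.user_to_segment WHERE userid = ?",
                    "user-1")            // placeholder key
                    .setIdempotent(true);
            session.execute(read);

            cluster.close();
        }
    }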
>> On Feb 5, 2018, at 5:41 AM, mohsin k <moshinkarova...@gmail.com> wrote:
>>
>> Thanks for the response @Nicolas. I was measuring the total read latency from client to server (as shown in the image above), which is around 30 ms; I want to get that down to around 3 ms (client and server are on the same network). I had not considered the read latency reported by the server, which I should have. I monitored CPU, memory, and the JVM heap lifecycle, and all are at safe levels. I think the difference (0.030 ms server-side vs. 30 ms end-to-end) might be because of low network bandwidth; correct me if I am wrong.
>>
>> I did reduce chunk_length_in_kb to 4 KB, but I couldn't measure a considerable difference, perhaps because there is little room left for improvement on the server side.
>>
>> Thanks again.
>>
>>> On Mon, Feb 5, 2018 at 6:52 PM, Nicolas Guyomar <nicolas.guyo...@gmail.com> wrote:
>>>
>>> Your row hit rate is 0.971, which is already very high. IMHO there is "nothing" left to do here if you can afford to store your entire dataset in memory.
>>>
>>> Local read latency: 0.030 ms already seems good to me; what makes you think you can achieve more with the relatively "small" box you are using?
>>>
>>> You have to keep an eye on other metrics which might be a limiting factor, like CPU usage, JVM heap lifecycle and so on.
>>>
>>> For read-heavy workloads it is sometimes advised to reduce chunk_length_in_kb from the default 64 KB to 4 KB; see if it helps!
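(The chunk size Nicolas mentions is a per-table compression option; a minimal sketch against the table from this thread:)

    ALTER TABLE stresstest.user_to_segment
        WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};

Existing SSTables keep the old chunk size until they are rewritten, e.g. with nodetool upgradesstables -a stresstest user_to_segment.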
>>>> On 5 February 2018 at 13:09, mohsin k <moshinkarova...@gmail.com> wrote:
>>>>
>>>> Hey Rahul,
>>>>
>>>> Each partition has around 10 clustering keys. Based on nodetool, I can roughly estimate the partition size to be less than 1 KB.
>>>>
>>>>> On Mon, Feb 5, 2018 at 5:37 PM, mohsin k <moshinkarova...@gmail.com> wrote:
>>>>>
>>>>> Hey Nicolas,
>>>>>
>>>>> My goal is to reduce latency as much as possible. I did wait for warmup; the test ran for more than 15 minutes, though I am not sure why it shows 2 minutes.
>>>>>
>>>>>> On Mon, Feb 5, 2018 at 5:25 PM, Rahul Singh <rahul.xavier.si...@gmail.com> wrote:
>>>>>>
>>>>>> What is the average size of your partitions / rows? 1 GB may not be enough.
>>>>>>
>>>>>> Rahul
>>>>>>
>>>>>>> On Feb 5, 2018, 6:52 AM -0500, mohsin k <moshinkarova...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have been looking into different configurations for tuning my Cassandra servers. I load-tested the servers with the cassandra-stress tool, first with default configs, then tuning one config at a time to measure the impact of each change. The first config I tried was setting row_cache_size_in_mb to 1000 (MB) in the yaml and adding caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} to the table. After changing these configs, I observed that latency increased rather than decreased. It would be really helpful to understand why this is the case and what steps to take to decrease the latency.
>>>>>>>
>>>>>>> I am running a cluster with 4 nodes.
>>>>>>>
>>>>>>> Following is my schema:
>>>>>>>
>>>>>>> CREATE TABLE stresstest.user_to_segment (
>>>>>>>     userid text,
>>>>>>>     segmentid text,
>>>>>>>     PRIMARY KEY (userid, segmentid)
>>>>>>> ) WITH CLUSTERING ORDER BY (segmentid DESC)
>>>>>>>     AND bloom_filter_fp_chance = 0.1
>>>>>>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
>>>>>>>     AND comment = 'A table to hold blog segment user relation'
>>>>>>>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>>>>>>>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>>>>>     AND crc_check_chance = 1.0
>>>>>>>     AND dclocal_read_repair_chance = 0.1
>>>>>>>     AND default_time_to_live = 0
>>>>>>>     AND gc_grace_seconds = 864000
>>>>>>>     AND max_index_interval = 2048
>>>>>>>     AND memtable_flush_period_in_ms = 0
>>>>>>>     AND min_index_interval = 128
>>>>>>>     AND read_repair_chance = 0.0
>>>>>>>     AND speculative_retry = '99PERCENTILE';
>>>>>>>
>>>>>>> Following are the node specs:
>>>>>>>
>>>>>>> RAM: 4 GB
>>>>>>> CPU: 4 cores
>>>>>>> HDD: 250 GB
>>>>>>>
>>>>>>> Following is the output of 'nodetool info' after setting row_cache_size_in_mb:
>>>>>>>
>>>>>>> ID                     : d97dfbbf-1dc3-4d95-a1d9-c9a8d22a3d32
>>>>>>> Gossip active          : true
>>>>>>> Thrift active          : false
>>>>>>> Native Transport active: true
>>>>>>> Load                   : 10.94 MiB
>>>>>>> Generation No          : 1517571163
>>>>>>> Uptime (seconds)       : 9169
>>>>>>> Heap Memory (MB)       : 136.01 / 3932.00
>>>>>>> Off Heap Memory (MB)   : 0.10
>>>>>>> Data Center            : dc1
>>>>>>> Rack                   : rack1
>>>>>>> Exceptions             : 0
>>>>>>> Key Cache              : entries 125881, size 9.6 MiB, capacity 100 MiB, 107 hits, 126004 requests, 0.001 recent hit rate, 14400 save period in seconds
>>>>>>> Row Cache              : entries 125861, size 31.54 MiB, capacity 1000 MiB, 4262684 hits, 4388545 requests, 0.971 recent hit rate, 0 save period in seconds
>>>>>>> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
>>>>>>> Chunk Cache            : entries 273, size 17.06 MiB, capacity 480 MiB, 325 misses, 126623 requests, 0.997 recent hit rate, NaN microseconds miss latency
>>>>>>> Percent Repaired       : 100.0%
>>>>>>> Token                  : (invoke with -T/--tokens to see all 256 tokens)
>>>>>>>
>>>>>>> Following is the output of 'nodetool cfstats':
>>>>>>>
>>>>>>> Total number of tables: 37
>>>>>>> ----------------
>>>>>>> Keyspace : stresstest
>>>>>>>     Read Count: 4398162
>>>>>>>     Read Latency: 0.02184742626579012 ms
>>>>>>>     Write Count: 0
>>>>>>>     Write Latency: NaN ms
>>>>>>>     Pending Flushes: 0
>>>>>>>         Table: user_to_segment
>>>>>>>         SSTable count: 1
>>>>>>>         SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
>>>>>>>         Space used (live): 11076103
>>>>>>>         Space used (total): 11076103
>>>>>>>         Space used by snapshots (total): 0
>>>>>>>         Off heap memory used (total): 107981
>>>>>>>         SSTable Compression Ratio: 0.5123353861375962
>>>>>>>         Number of partitions (estimate): 125782
>>>>>>>         Memtable cell count: 0
>>>>>>>         Memtable data size: 0
>>>>>>>         Memtable off heap memory used: 0
>>>>>>>         Memtable switch count: 2
>>>>>>>         Local read count: 4398162
>>>>>>>         Local read latency: 0.030 ms
>>>>>>>         Local write count: 0
>>>>>>>         Local write latency: NaN ms
>>>>>>>         Pending flushes: 0
>>>>>>>         Percent repaired: 0.0
>>>>>>>         Bloom filter false positives: 0
>>>>>>>         Bloom filter false ratio: 0.00000
>>>>>>>         Bloom filter space used: 79280
>>>>>>>         Bloom filter off heap memory used: 79272
>>>>>>>         Index summary off heap memory used: 26757
>>>>>>>         Compression metadata off heap memory used: 1952
>>>>>>>         Compacted partition minimum bytes: 43
>>>>>>>         Compacted partition maximum bytes: 215
>>>>>>>         Compacted partition mean bytes: 136
>>>>>>>         Average live cells per slice (last five minutes): 5.719932432432432
>>>>>>>         Maximum live cells per slice (last five minutes): 10
>>>>>>>         Average tombstones per slice (last five minutes): 1.0
>>>>>>>         Maximum tombstones per slice (last five minutes): 1
>>>>>>>         Dropped Mutations: 0
>>>>>>>
>>>>>>> Following are my results: the blue graph is before setting row_cache_size_in_mb, orange is after.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mohsin
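(For completeness, the two changes described in the original post amount to one cassandra.yaml setting and one table-level option; a sketch, assuming defaults everywhere else. The yaml change requires a node restart:)

    # cassandra.yaml, on each node (restart required):
    row_cache_size_in_mb: 1000

and, per table:

    ALTER TABLE stresstest.user_to_segment
        WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};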