Hi,

Thank you for the reply. I've measured LWT throughput in 4.0.
I used the cassandra-stress tool to insert rows with LWT for 3 minutes on i3.xlarge and i3.4xlarge. For 3.11, I modified the tool to support LWT. Before each measurement, I cleaned up all Cassandra data.

The throughput in 4.0 is about 3-5% higher than in 3.11. The CPU load on i3.4xlarge (16 vCPUs) only reaches 75% in both versions, and its throughput is less than 4 times that of i3.xlarge. So I think the throughput isn't bounded by CPU in 4.0 either. With non-LWT writes, the CPU load on i3.4xlarge reaches 80%.

If the messaging issue has been resolved in 4.0, I wonder what the bottleneck is for writes on a many-core machine. Is there any parameter I can change so that inserts use up the CPU?

# LWT insert

* Cassandra 3.11.3

| instance type | # of threads | concurrent_writes | Throughput [op/s] |
| i3.xlarge     | 64           | 32                | 2815              |
| i3.4xlarge    | 256          | 128               | 9506              |
| i3.4xlarge    | 512          | 256               | 10540             |

* Cassandra 4.0 (trunk)

| instance type | # of threads | concurrent_writes | Throughput [op/s] |
| i3.xlarge     | 64           | 32                | 2951              |
| i3.4xlarge    | 256          | 128               | 9816              |
| i3.4xlarge    | 512          | 256               | 11055             |

* Environment
- 3 node cluster
- Replication factor: 3
- Node instance: AWS EC2 i3.xlarge / i3.4xlarge

* C* configuration
- Apache Cassandra 3.11.3 / 4.0 (trunk)
- commitlog_sync: batch
- concurrent_writes: 32, 256
- native_transport_max_threads: 128 (default), 256 (when concurrent_writes is 256)

Thanks,
Yuji

On Mon, Nov 26, 2018 at 17:27, sankalp kohli <kohlisank...@gmail.com> wrote:

> Inter-node messaging is rewritten using Netty in 4.0. It will be better to
> test it using that as potential changes will mostly land on top of that.
>
> On Mon, Nov 26, 2018 at 7:39 AM Yuji Ito <y...@phact-columba.com> wrote:
>
>> Hi,
>>
>> I'm investigating LWT performance with C* 3.11.3.
>> It looks that the performance is bounded by messaging latency when many
>> requests are issued concurrently.
>>
>> According to the source code, the number of messaging threads per node is
>> only 1 thread for incoming and 1 thread for outbound "small" message to
>> another node.
>>
>> I guess these threads are frequently interrupted because many threads are
>> executed when many requests are issued.
>> Especially, I think it affects the LWT performance when many LWT requests
>> which need lots of inter-node messaging are issued.
>>
>> I measured that latency. It took 2.5 ms in average to enqueue a message
>> at a node and to receive the message at the **same** node with 96
>> concurrent LWT writes.
>> Is it normal? I think it is too big latency, though a message was sent to
>> the same node.
>>
>> Decreasing numbers of other threads like `concurrent_counter_writes`,
>> `concurrent_materialized_view_writes` reduced a bit the latency.
>> Can I change any other parameter to reduce the latency?
>> I've tried using message coalescing, but they didn't reduce that.
>>
>> * Environment
>> - 3 node cluster
>> - Replication factor: 3
>> - Node instance: AWS EC2 i3.xlarge
>>
>> * C* configuration
>> - Apache Cassandra 3.11.3
>> - commitlog_sync: batch
>> - concurrent_reads: 32 (default)
>> - concurrent_writes: 32 (default)
>>
>> Thanks,
>> Yuji
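
For reference, the LWT throughput tables at the top of this message can be reduced to two ratios: the gain of 4.0 over 3.11.3 for the same setup, and the scale-up from i3.xlarge to i3.4xlarge within one version. The following is only a minimal sketch of that arithmetic. It assumes i3.xlarge has 4 vCPUs, so that the ideal scale-up to the 16-vCPU i3.4xlarge would be 4x, and the function names are illustrative, not part of any Cassandra tooling.

```python
# Throughput figures [op/s] copied from the tables in this message.
# Keys: (Cassandra version, instance type, client threads).
throughput = {
    ("3.11.3", "i3.xlarge", 64): 2815,
    ("3.11.3", "i3.4xlarge", 256): 9506,
    ("3.11.3", "i3.4xlarge", 512): 10540,
    ("4.0", "i3.xlarge", 64): 2951,
    ("4.0", "i3.4xlarge", 256): 9816,
    ("4.0", "i3.4xlarge", 512): 11055,
}

# Assumption: i3.xlarge has 4 vCPUs and i3.4xlarge has 16, so ideal scale-up is 4x.
IDEAL_SCALE_UP = 16 / 4


def version_gain_percent(instance, threads):
    """Percentage throughput gain of 4.0 over 3.11.3 for the same setup."""
    old = throughput[("3.11.3", instance, threads)]
    new = throughput[("4.0", instance, threads)]
    return (new - old) / old * 100


def scale_up(version, threads):
    """Throughput of i3.4xlarge relative to i3.xlarge (64 threads) in one version."""
    small = throughput[(version, "i3.xlarge", 64)]
    large = throughput[(version, "i3.4xlarge", threads)]
    return large / small


if __name__ == "__main__":
    for instance, threads in [("i3.xlarge", 64), ("i3.4xlarge", 256), ("i3.4xlarge", 512)]:
        gain = version_gain_percent(instance, threads)
        print(f"{instance:>10} / {threads:>3} threads: 4.0 is {gain:.1f}% faster")
    for version in ("3.11.3", "4.0"):
        for threads in (256, 512):
            ratio = scale_up(version, threads)
            print(f"{version:>6} / {threads} threads: {ratio:.2f}x of i3.xlarge "
                  f"(ideal {IDEAL_SCALE_UP:.0f}x)")
```

With these figures the scale-up stays below 4x in every row, which matches the observation that the CPU on i3.4xlarge is not saturated.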