We have a Kudu cluster with 5 tablet servers, each with 28 CPU cores, 160 GB RAM, and a 2 TB SSD.
We set the RPC queue length to 500. We currently write to 10 tables at the same time. For 8 of these 10 tables, we use 10 threads per table to write (simple inserts). We have 5 tasks (each with 10 threads) that upsert the corresponding fields for the remaining two tables. For example, one of these two tables has 5 fields (a, b, c, d, e), with the `key` field as the primary key:

1 task (10 threads) runs upsert (key, a)
1 task (10 threads) runs upsert (key, b)
1 task (10 threads) runs upsert (key, c)
1 task (10 threads) runs upsert (key, d)
1 task (10 threads) runs upsert (key, e)

We observed that writes are very slow (fewer than 1,000 records/second). We also observed that with fewer writer threads, the speed is not as bad (a few thousand records/second).

Here's the CPU utilization report for the Kudu threads:

    Threads: 724 total, 15 running, 709 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 18.5 us, 8.3 sy, 0.0 ni, 67.6 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
    KiB Mem : 16488888+total, 1776956 free, 12737900 used, 15037401+buff/cache
    KiB Swap: 3145724 total, 3004924 free, 140800 used. 15048467+avail Mem

      PID USER PR NI  VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
    14000 kudu 20  0 70.0g 12.7g  1.9g R 72.8  8.1  26083:49 MaintenanceMgr
    13992 kudu 20  0 70.0g 12.7g  1.9g R 53.2  8.1   4067:40 rpc reactor-139
    13995 kudu 20  0 70.0g 12.7g  1.9g R 42.9  8.1   3996:32 rpc reactor-139
    13993 kudu 20  0 70.0g 12.7g  1.9g R 39.9  8.1   3167:19 rpc reactor-139
    14231 kudu 20  0 70.0g 12.7g  1.9g S 11.3  8.1 142:14.00 rpc worker-1423
    14242 kudu 20  0 70.0g 12.7g  1.9g S 11.3  8.1 107:12.97 rpc worker-1424
    14226 kudu 20  0 70.0g 12.7g  1.9g S 10.6  8.1 109:21.71 rpc worker-1422
    14274 kudu 20  0 70.0g 12.7g  1.9g S 10.6  8.1  95:54.12 rpc worker-1427
    14216 kudu 20  0 70.0g 12.7g  1.9g S 10.0  8.1 136:26.72 rpc worker-1421
    14221 kudu 20  0 70.0g 12.7g  1.9g S 10.0  8.1 129:04.78 rpc worker-1422
    14253 kudu 20  0 70.0g 12.7g  1.9g S  9.3  8.1 104:26.75 rpc worker-1425
    14250 kudu 20  0 70.0g 12.7g  1.9g S  8.6  8.1 145:44.18 rpc worker-1425
    14224 kudu 20  0 70.0g 12.7g  1.9g S  7.3  8.1 112:22.56 rpc worker-1422
    14255 kudu 20  0 70.0g 12.7g  1.9g S  7.3  8.1 133:47.72 rpc worker-1425
    14282 kudu 20  0 70.0g 12.7g  1.9g S  7.3  8.1 126:01.94 rpc worker-1428
    10864 kudu 20  0 70.0g 12.7g  1.9g S  7.3  8.1   0:00.45 apply [worker]-
    10932 kudu 20  0 70.0g 12.7g  1.9g S  6.6  8.1   0:00.38 apply [worker]-
    14271 kudu 20  0 70.0g 12.7g  1.9g S  6.3  8.1  98:02.94 rpc worker-1427
    11099 kudu 20  0 70.0g 12.7g  1.9g S  6.3  8.1   0:00.19 prepare [worker
    11001 kudu 20  0 70.0g 12.7g  1.9g S  6.0  8.1   0:00.29 apply [worker]-
    11103 kudu 20  0 70.0g 12.7g  1.9g S  6.0  8.1   0:00.18 prepare [worker
    11105 kudu 20  0 70.0g 12.7g  1.9g S  6.0  8.1   0:00.18 prepare [worker
    14057 kudu 20  0 70.0g 12.7g  1.9g S  5.3  8.1   1427:58 rpc worker-1405
    11004 kudu 20  0 70.0g 12.7g  1.9g S  5.3  8.1   0:00.23 apply [worker]-
    11037 kudu 20  0 70.0g 12.7g  1.9g S  5.3  8.1   0:00.20 prepare [worker
    14270 kudu 20  0 70.0g 12.7g  1.9g S  5.0  8.1 146:34.95 rpc worker-1427
    14280 kudu 20  0 70.0g 12.7g  1.9g R  5.0  8.1 133:00.90 rpc worker-1428
    10366 kudu 20  0 70.0g 12.7g  1.9g S  5.0  8.1   0:00.77 raft [worker]-1
    10749 kudu 20  0 70.0g 12.7g  1.9g S  5.0  8.1   0:00.36 raft [worker]-1
    14053 kudu 20  0 70.0g 12.7g  1.9g S  4.7  8.1   1428:13 rpc worker-1405
    14213 kudu 20  0 70.0g 12.7g  1.9g S  4.7  8.1 145:18.80 rpc worker-1421

And the memory usage info:

           total  used  free  shared  buff/cache  available
    Mem:    157G   12G  1.7G    726M        143G       143G
    Swap:   3.0G  137M  2.9G

Here are the recent logs from one of the tablet servers: https://justpaste.it/76qg2

Please advise me on how I can optimize the write performance.
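To make the write concurrency described above easier to reason about, here is a rough tally of the client-side writer threads. It is only a sketch: the function name `writer_threads` and its parameters are made up for illustration, and it assumes that each of the two upsert tables has its own 5 tasks, which the description does not state explicitly.

```python
def writer_threads(insert_tables=8, threads_per_insert_table=10,
                   upsert_tables=2, tasks_per_upsert_table=5,
                   threads_per_task=10):
    """Return (insert_threads, upsert_threads, total) for the described workload.

    Assumption: every upsert table gets its own set of tasks.
    """
    insert_threads = insert_tables * threads_per_insert_table
    upsert_threads = upsert_tables * tasks_per_upsert_table * threads_per_task
    return insert_threads, upsert_threads, insert_threads + upsert_threads


inserts, upserts, total = writer_threads()
print(f"insert threads: {inserts}, upsert threads: {upserts}, total: {total}")

# One upsert per non-key column (a, b, c, d, e) means that a fully populated
# row is produced by 5 separate write operations on the same key.
columns = ["a", "b", "c", "d", "e"]
writes_per_key = len(columns)
print(f"write operations per key in an upsert table: {writes_per_key}")
```

If the 5 tasks are instead shared across both upsert tables, calling `writer_threads(upsert_tables=1)` gives the thread count for that reading.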