Hi,

Answering my own question: the high load was related to the cpufreq kernel module. After unloading the cpufreq module, the CPU load instantly dropped and the mirroring started to work. Obviously there is a bug somewhere, but for the moment I’m just happy it works.
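For reference, a minimal sketch of how a cpufreq-related module can be found, unloaded, and kept from reloading. The module name varies by platform; `acpi_cpufreq` and the blacklist file path below are assumptions for illustration, not necessarily what was loaded on this server:

```shell
# List loaded modules whose name mentions cpufreq; the exact name
# differs per platform (acpi_cpufreq is used below as an assumed example)
lsmod | awk '$1 ~ /cpufreq/ {print $1}'

# Unload the offending module (requires root)
sudo modprobe -r acpi_cpufreq

# Optionally blacklist it so it is not loaded again on the next boot
echo "blacklist acpi_cpufreq" | sudo tee /etc/modprobe.d/blacklist-cpufreq.conf
```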
/Magnus

On Thu, 15 Nov 2018 at 15:24, Magnus Grönlund <mag...@gronlund.se> wrote:

> Hi,
>
> I’m trying to set up one-way rbd-mirroring for a Ceph cluster used by an
> OpenStack cloud, but the rbd-mirror is unable to “catch up” with the
> changes. It appears to me that this is not due to the Ceph clusters or the
> network, but due to the server running the rbd-mirror process running out
> of CPU.
>
> Is a high CPU load to be expected, or is it a symptom of something else?
> In other words, what can I check/do to get the mirroring working? 😊
>
> # rbd mirror pool status nova
> health: WARNING
> images: 596 total
>     572 starting_replay
>     24 replaying
>
> top - 13:31:36 up 79 days, 5:31, 1 user, load average: 32.27, 26.82, 25.33
> Tasks: 360 total, 17 running, 182 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 8.9 us, 70.0 sy, 0.0 ni, 18.5 id, 0.0 wa, 0.0 hi, 2.7 si, 0.0 st
> KiB Mem : 13205185+total, 12862490+free, 579508 used, 2847444 buff/cache
> KiB Swap: 0 total, 0 free, 0 used. 12948856+avail Mem
>
>     PID USER PR NI  VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
> 2336553 ceph 20  0 17.1g 178160 20344 S 417.2  0.1  21:50.61 rbd-mirror
> 2312698 root 20  0     0      0     0 I  70.2  0.0  70:11.51 kworker/12:2
> 2312851 root 20  0     0      0     0 R  69.2  0.0  62:29.69 kworker/24:1
> 2324627 root 20  0     0      0     0 I  68.4  0.0  40:36.77 kworker/14:1
> 2235817 root 20  0     0      0     0 I  68.0  0.0 469:14.08 kworker/8:0
> 2241720 root 20  0     0      0     0 R  67.3  0.0 437:46.51 kworker/9:1
> 2306648 root 20  0     0      0     0 R  66.9  0.0 109:27.44 kworker/25:0
> 2324625 root 20  0     0      0     0 R  66.9  0.0  40:37.53 kworker/13:1
> 2336318 root 20  0     0      0     0 R  66.7  0.0  14:51.96 kworker/27:3
> 2324643 root 20  0     0      0     0 I  66.5  0.0  36:21.46 kworker/15:2
> 2294989 root 20  0     0      0     0 I  66.3  0.0 134:09.89 kworker/11:1
> 2324626 root 20  0     0      0     0 I  66.3  0.0  39:44.14 kworker/28:2
> 2324019 root 20  0     0      0     0 I  65.3  0.0  44:51.80 kworker/26:1
> 2235814 root 20  0     0      0     0 R  65.1  0.0 459:14.70 kworker/29:2
> 2294174 root 20  0     0      0     0 I  64.5  0.0 220:58.50 kworker/30:1
> 2324355 root 20  0     0      0     0 R  63.3  0.0  45:04.29 kworker/10:1
> 2263800 root 20  0     0      0     0 R  62.9  0.0 353:38.48 kworker/31:1
> 2270765 root 20  0     0      0     0 R  60.2  0.0 294:46.34 kworker/0:0
> 2294798 root 20  0     0      0     0 R  59.8  0.0 148:48.23 kworker/1:2
> 2307128 root 20  0     0      0     0 R  59.8  0.0  86:15.45 kworker/6:2
> 2307129 root 20  0     0      0     0 I  59.6  0.0  85:29.66 kworker/5:0
> 2294826 root 20  0     0      0     0 R  58.2  0.0 138:53.56 kworker/7:3
> 2294575 root 20  0     0      0     0 I  57.8  0.0 155:03.74 kworker/2:3
> 2294310 root 20  0     0      0     0 I  57.2  0.0 176:10.92 kworker/4:2
> 2295000 root 20  0     0      0     0 I  57.2  0.0 132:47.28 kworker/3:2
> 2307060 root 20  0     0      0     0 I  56.6  0.0  87:46.59 kworker/23:2
> 2294931 root 20  0     0      0     0 I  56.4  0.0 133:31.47 kworker/17:2
> 2318659 root 20  0     0      0     0 I  56.2  0.0  55:01.78 kworker/16:2
> 2336304 root 20  0     0      0     0 I  56.0  0.0  11:45.92 kworker/21:2
> 2306947 root 20  0     0      0     0 R  55.6  0.0  90:45.31 kworker/22:2
> 2270628 root 20  0     0      0     0 I  53.8  0.0 273:43.31 kworker/19:3
> 2294797 root 20  0     0      0     0 R  52.3  0.0 141:13.67 kworker/18:0
> 2330537 root 20  0     0      0     0 R  52.3  0.0  25:33.25 kworker/20:2
>
> The main cluster has 12 nodes with 120 OSDs and the backup cluster has 6
> nodes with 60 OSDs (but roughly the same amount of storage); the
> rbd-mirror runs on a separate server with 2× E5-2650v2 CPUs and 128 GB of
> memory.
>
> Best regards
> /Magnus
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com