Hi folks,
I have an 8-node cluster running on an IBM BladeCenter HS21, using RHEL 5.2 and GFS (not GFS2).

The nodes are exhibiting high CPU load from two apps: aisexec and cman_tool. Both of these apps race the CPU without any other user apps doing much at all. Effectively, the user experience is dog-slow.

After I reboot one of the nodes, it clears up and these apps (aisexec and cman_tool) seem to behave for a while. Eventually they race the CPU again, days to weeks later.

Has anyone ever experienced this? Top output is below.

Thanks,
Ed

[r...@blade1]# top
top - 13:47:51 up 40 days, 22:16, 37 users,  load average: 4.17, 3.94, 3.86
Tasks: 372 total,   2 running, 369 sleeping,   1 stopped,   0 zombie
Cpu(s):  5.9%us, 32.6%sy,  0.0%ni, 61.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8311372k total,  1934844k used,  6376528k free,    76332k buffers
Swap:  8388600k total,   322976k used,  8065624k free,   443172k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
 4352 root      RT   0 37404  35m 2020 R  100  0.4  10519:34  aisexec
20806 root      16   0  1684  560  484 S   42  0.0   8324:49  cman_tool
12501 root      15   0  1680  556  484 S   31  0.0 609:38.46  cman_tool
27245 root      16   0  1688  560  484 S   30  0.0 508:14.31  cman_tool
 4635 root      34  19     0    0    0 S    2  0.0   1271:52  kipmi0
 5047 root      18   0  405m  17m 6260 S    1  0.2  21:57.04  cimserver
28975 root      15   0  2564 1296  900 R    1  0.0   0:00.05  top
    1 root      15   0  2064  576  524 S    0  0.0   0:02.91  init
    2 root      RT  -5     0    0    0 S    0  0.0   0:02.98  migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.11  ksoftirqd/0
    4 root      RT  -5     0    0    0 S    0  0.0   0:00.00  watchdog/0
    5 root      RT  -5     0    0    0 S    0  0.0   0:01.29  migration/1

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster