If "fs_locks -B" is empty, then the processes are not waiting on a cluster lock.
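
As a rough sketch of that check, the shell function below reads the output of the "fs_locks -B" query on stdin and reports whether anything was listed. The function name report_busy_locks is made up for illustration, and /dev/sda1 is simply the device used in this thread. A non-empty listing only suggests that something may be waiting on a cluster lock; it does not prove it.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocfs2-tools): reads the output of
#   debugfs.ocfs2 -R "fs_locks -B" <device>
# on stdin and reports whether any busy lock resource was listed.
report_busy_locks() {
    # grep succeeds only if at least one line has a non-whitespace character
    if grep -q '[^[:space:]]'; then
        echo "busy locks listed: a process may be waiting on a cluster lock"
    else
        echo "no busy locks: processes are not waiting on a cluster lock"
    fi
}

# Usage (device path is the one from this thread; adjust for your volume):
#   debugfs.ocfs2 -R "fs_locks -B" /dev/sda1 | report_busy_locks
```

Reading from stdin keeps the function independent of the device, so the same check can be run against saved output from either node.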
A process pegged at 100% CPU means it is actively spinning to acquire
a spinlock. Is the other process running?

Unfortunately, in EL5 there is no clean way to get the kernel stack for
a single process. "echo t >/proc/sysrq-trigger" is the only way, but it
might take down the box if there are a lot of processes. If you do that,
ensure netconsole is set up. The stack trace should tell us more.

Sunil

Jason Price wrote:
> As another note, the process that's trying to read the file is in a
> VERY busy wait state... it's taking all the CPU it can get. strace
> doesn't show any output when I try to connect to the process.
>
> --Jason
>
> On Fri, Apr 2, 2010 at 12:44 PM, Jason Price <japr...@gmail.com> wrote:
>
>> To add further information:
>>
>> 1) Node A:
>> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B Key: 0xc955c1d5
>> Thread Pid: 3869 Node: 1 State: JOINED
>> Number of Joins: 1 Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 70731 (442210)
>> MLEs: 0 (1048380)
>> Blocking: 0 (647669)
>> Mastery: 0 (400711)
>> Migration: 0 (0)
>> Lists: Dirty=Empty Purge=Empty PendingASTs=Empty PendingBASTs=Empty
>> Purge Count: 0 Refs: 70732
>> Dead Node: 255
>> Recovery Pid: 3870 Master: 255 State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> Node B:
>> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B Key: 0xc955c1d5
>> Thread Pid: 3757 Node: 2 State: JOINED
>> Number of Joins: 1 Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 48113 (50521)
>> MLEs: 0 (85510)
>> Blocking: 0 (35121)
>> Mastery: 0 (50389)
>> Migration: 0 (0)
>> Lists: Dirty=Empty Purge=Empty PendingASTs=Empty PendingBASTs=Empty
>> Purge Count: 0 Refs: 48114
>> Dead Node: 255
>> Recovery Pid: 3758 Master: 255 State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> There are no busy locks apparently, as shown by
>>
>> # debugfs.ocfs2 -R "fs_locks -B" /dev/sda1
>> #
>>
>> I am unable to kill any of these processes, even with kill -9.
>>
>> # cat /etc/ocfs2/cluster.conf
>> cluster:
>>     node_count = 2
>>     name = ocfs2ftpcluster
>>
>> node:
>>     ip_port = 7777
>>     ip_address = 192.168.0.1
>>     number = 1
>>     name = prtftp01
>>     cluster = ocfs2ftpcluster
>>
>> node:
>>     ip_port = 7777
>>     ip_address = 192.168.0.2
>>     number = 2
>>     name = prtftp02
>>     cluster = ocfs2ftpcluster
>>
>> If you'd like the output of:
>>
>> # debugfs.ocfs2 -R "fs_locks" /dev/sda1 | wc -l
>> 768681
>>
>> I can give it, but it's a lot of output.
>>
>> --Jason
>>
>> On Fri, Apr 2, 2010 at 11:38 AM, Jason Price <japr...@gmail.com> wrote:
>>
>>> I'm setting up an HA ftp server (amongst other services).
>>>
>>> When two connections happen simultaneously, and (more specifically) the
>>> same user from two IPs attempts to access the same file (one for reading,
>>> and one for writing), the processes both hang. And all subsequent attempts
>>> to either read or write the file fail.
>>>
>>> The two processes that seem to have caused the lock:
>>> user 24139 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs -
>>> ::ffff:xxx.yyy.0.253: RETR prim_wo_img_dom.obs
>>> user 24142 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs -
>>> ::ffff:xxx.yyy.103.208: STOR prim_wo_img_dom.obs
>>>
>>> (there are 49 other processes trying to do the same things, but these are the
>>> first ones.)
>>>
>>> I'm more than happy to provide any information needed on this issue:
>>>
>>> OS:
>>> CentOS release 5.4 (Final)
>>>
>>> uname -a:
>>> Linux prtftp01<omitted> 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST
>>> 2010 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> ocfs2 version 1.4.4
>>>
>>> At the moment, only one host is actively serving FTP at any time. I can
>>> fail the services back and forth as needed.
>>>
>>> --Jason
>>>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users