If "fs_locks -B" is empty, then the processes are not waiting on a 
cluster lock.
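
If you want to script that check, something like the sketch below should do. The helper name busy_lock_verdict is made up for illustration, and it assumes that any non-blank line in the "fs_locks -B" output is a busy lock.

```shell
# Hypothetical helper: reads "fs_locks -B" output on stdin and reports
# whether anything was listed. Assumes any non-blank line is a busy lock.
busy_lock_verdict() {
    if grep -q '[^[:space:]]'; then
        echo "busy cluster locks found"
    else
        echo "no busy cluster locks"
    fi
}

# Real use (as root; device name taken from the earlier mail):
#   debugfs.ocfs2 -R "fs_locks -B" /dev/sda1 | busy_lock_verdict
# Empty input stands in here for the empty output reported in the thread.
printf '' | busy_lock_verdict
```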

A process pegged at 100% CPU is actively waiting to acquire a
spinlock.
Is the other process running?
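
One rough way to answer that is to sample the task state from /proc a few times. This is only a sketch: proc_state is a made-up helper name, and the pid defaults to the current shell so the snippet runs as-is. Substitute the pids of the hung proftpd processes.

```shell
# Hypothetical helper: print the one-letter state of pid $1
# (R = running, S = sleeping, D = uninterruptible sleep).
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Defaults to this shell; pass the pid of a hung process instead.
pid="${1:-$$}"

# A task spinning in the kernel should stay in R on every sample.
for i in 1 2 3; do
    echo "pid $pid state: $(proc_state "$pid")"
    sleep 1
done

# Roughly equivalent one-off view with ps:
#   ps -o pid,stat,wchan,cmd -p <pid>
```

If one process sits in R the whole time while the other is in D or S, that would suggest only one of them is spinning.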

Unfortunately, in EL5 there is no clean way to get the kernel stack for
a process. "echo t >/proc/sysrq-trigger" is the only way, but it might
take down the box if there are a lot of processes. If you do that,
ensure netconsole is set up. The stack trace should tell us more.
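
A cautious way to run it might look like the sketch below. The wrapper name dump_task_stacks and its --run switch are made up for illustration. It only reports whether a netconsole module is loaded and does a dry run unless told otherwise.

```shell
# Hypothetical wrapper around the sysrq-t dump. Dry run by default.
dump_task_stacks() {
    mode="${1:-dry-run}"

    # A loaded netconsole module is listed in /proc/modules.
    # A built-in netconsole would not be listed, so treat "not loaded"
    # as a warning only.
    if grep -q '^netconsole ' /proc/modules 2>/dev/null; then
        echo "netconsole: loaded"
    else
        echo "netconsole: not loaded, trace may be lost if the box hangs"
    fi

    if [ "$mode" = "--run" ]; then
        # Needs root. Stacks of all tasks go to the kernel log.
        echo t > /proc/sysrq-trigger
    else
        echo "dry run, would run: echo t > /proc/sysrq-trigger"
    fi
}

dump_task_stacks
```

Call it as "dump_task_stacks --run" once netconsole is confirmed, then look for the proftpd pids in the kernel log.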

Sunil

Jason Price wrote:
> As another note, the process that's trying to read the file is in a
> VERY busy wait state... it's taking all the CPU it can get.  strace
> doesn't show any output when I attach it to the process.
>
> --Jason
>
> On Fri, Apr 2, 2010 at 12:44 PM, Jason Price <japr...@gmail.com> wrote:
>   
>> To add further information:
>>
>> 1) Node A:
>> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B  Key: 0xc955c1d5
>> Thread Pid: 3869  Node: 1  State: JOINED
>> Number of Joins: 1  Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 70731 (442210)
>> MLEs: 0 (1048380)
>>   Blocking: 0 (647669)
>>   Mastery: 0 (400711)
>>   Migration: 0 (0)
>> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
>> Purge Count: 0  Refs: 70732
>> Dead Node: 255
>> Recovery Pid: 3870  Master: 255  State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> Node B:
>> #  cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B  Key: 0xc955c1d5
>> Thread Pid: 3757  Node: 2  State: JOINED
>> Number of Joins: 1  Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 48113 (50521)
>> MLEs: 0 (85510)
>>   Blocking: 0 (35121)
>>   Mastery: 0 (50389)
>>   Migration: 0 (0)
>> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
>> Purge Count: 0  Refs: 48114
>> Dead Node: 255
>> Recovery Pid: 3758  Master: 255  State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> There are no busy locks apparently, as shown by
>>
>> # debugfs.ocfs2 -R "fs_locks -B" /dev/sda1
>> #
>>
>> I am unable to kill any of these processes, even with kill -9.
>>
>> # cat /etc/ocfs2/cluster.conf
>> cluster:
>>         node_count = 2
>>         name = ocfs2ftpcluster
>>
>> node:
>>         ip_port = 7777
>>         ip_address = 192.168.0.1
>>         number = 1
>>         name = prtftp01
>>         cluster = ocfs2ftpcluster
>>
>> node:
>>         ip_port = 7777
>>         ip_address = 192.168.0.2
>>         number = 2
>>         name = prtftp02
>>         cluster = ocfs2ftpcluster
>>
>> If you'd like the output of:
>>
>> # debugfs.ocfs2 -R "fs_locks" /dev/sda1 | wc -l
>> 768681
>>
>> I can give it, but it's a lot of output.
>>
>> --Jason
>>
>> On Fri, Apr 2, 2010 at 11:38 AM, Jason Price <japr...@gmail.com> wrote:
>>     
>>> I'm setting up an HA ftp server (amongst other services).
>>>
>>> When two connections happen simultaneously and, more specifically, the
>>> same user from two IPs attempts to access the same file (one for
>>> reading and one for writing), both processes hang, and all subsequent
>>> attempts to either read or write the file fail.
>>>
>>> The two processes that seem to have caused the lock:
>>> user  24139  1657 Thu Apr  1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.0.253: RETR prim_wo_img_dom.obs
>>> user  24142  1657 Thu Apr  1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.103.208: STOR prim_wo_img_dom.obs
>>>
>>> (There are 49 other processes trying to do the same things, but these
>>> are the first ones.)
>>>
>>> I'm more than happy to provide any information needed on this issue:
>>>
>>> OS:
>>> CentOS release 5.4 (Final)
>>>
>>> uname -a:
>>> Linux prtftp01<omitted> 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> ocfs2 version 1.4.4
>>>
>>> At the moment, only one host is actively serving FTP at any time.  I can 
>>> fail the services back and forth as needed.
>>>
>>> --Jason
>>>       
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>   


