Re: [Ocfs2-users] problem stopping o2cb service on one of nodes

Nikola Ciprich Mon, 13 Apr 2009 01:46:10 -0700

Hi Sunil,
well, I don't know the exact version that started creating them,
but the latest CentOS (based on RHEL 5.4) uses udev version 95, which
doesn't create them for me.
Version 127, which I used for tests, already creates them correctly.
BR
nik
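
To check whether a given node is affected, here is a minimal sketch. It is
a hypothetical helper, not part of ocfs2-tools: the function name
missing_dev_nodes and its two optional arguments are assumptions made for
illustration. It only mirrors the lookup described in the quoted mail below
(names from /proc/partitions, looked up directly under /dev) and does not
call mounted.ocfs2 itself.

```shell
#!/bin/sh
# Hypothetical helper: print every name listed in a partitions table that
# has no node directly under the given device directory.
#   $1  partitions file  (default: /proc/partitions)
#   $2  device directory (default: /dev)
missing_dev_nodes() {
    partitions=${1:-/proc/partitions}
    devdir=${2:-/dev}
    while read -r major minor blocks name; do
        # Skip the header line and the blank line that follows it:
        # real entries start with a numeric major number.
        case $major in
            ''|*[!0-9]*) continue ;;
        esac
        # Report names that cannot be found directly under $devdir.
        [ -e "$devdir/$name" ] || printf '%s\n' "$name"
    done < "$partitions"
}
```

Run as "missing_dev_nodes" with no arguments. If it prints dm-XX names, the
node probably has the udev behaviour discussed here, where the device mapper
nodes exist only under /dev/mapper.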


On Mon, Apr 06, 2009 at 12:08:34PM -0700, Sunil Mushran wrote:
> AFAIK, this is not an issue on (rh)el4/sles9. It could be that the
> enterprise distros never shipped that older version of udev.
>
> I could cross check. Do you know the version of udev that started
> creating the /dev/dm-xx devices?
>
> Nikola Ciprich wrote:
>> OK, I've got it...
>> the problem is that mounted.ocfs2 scans the devices listed in
>> /proc/partitions, and looks for them only directly under /dev.
>>
>> But I'm using device mapper based storage, and older versions of udev do
>> not create the device mapper devices as /dev/dm-XX, but only under
>> /dev/mapper/..., which is therefore not scanned by mounted.ocfs2.
>> The reason it was working on one of my nodes is that I had updated udev
>> there some time ago for some other tests.
>>
>> So while updating udev works around the problem, I guess it might be
>> good to fix this in mounted.ocfs2, as people using older distros
>> (especially enterprise ones) might stumble upon the problem if they use
>> device-mapper based storage...
>>
>> I can try to create a fix for this problem. Trying to open the device
>> under /dev/mapper if it's not found under /dev might be the way?
>>
>> Anyway, Sunil, thanks a lot for your help!
>>
>> On Sun, Apr 05, 2009 at 07:31:52AM -0700, Sunil Mushran wrote:
>>   
>>> Email me the output of:
>>> $ mounted.ocfs2 -d
>>>
>>> Also, does stopping the heartbeat by UUID work?
>>> $ ocfs2_hb_ctl -K -u <uuid> o2cb
>>>
>>> Lastly, what versions of the fs, tools, kernel?
>>>
>>> On Apr 4, 2009, at 1:24 AM, Nikola Ciprich <extmaill...@linuxbox.cz> wrote:
>>>
>>>     
>>>> Hi,
>>>> it says:
>>>> /sbin/ocfs2_hb_ctl
>>>> on both nodes, which is correct: the binary is there...
>>>> n.
>>>>
>>>> On Fri, Apr 03, 2009 at 02:27:34PM -0700, Sunil Mushran wrote:
>>>>       
>>>>> Do:
>>>>> $ cat /proc/sys/fs/ocfs2/nm/hb_ctl_path
>>>>>
>>>>>
>>>>> Nikola Ciprich wrote:
>>>>>         
>>>>>> Hi Sunil,
>>>>>> thanks for the reply.
>>>>>> I don't observe any segfaults.
>>>>>> Regarding the info you wanted: as I wrote, umount doesn't decrease
>>>>>> the refcount:
>>>>>>
>>>>>> [r...@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
>>>>>> 2A5D351D0A934061BBC6B5392A30187E: 1 refs
>>>>>> [r...@vbox4 ~]# umount /home/LVS
>>>>>> [r...@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
>>>>>> 2A5D351D0A934061BBC6B5392A30187E: 1 refs
>>>>>>
>>>>>> nik
>>>>>>
>>>>>> On Fri, Apr 03, 2009 at 10:21:33AM -0700, Sunil Mushran wrote:
>>>>>>
>>>>>>           
>>>>>>> umount is supposed to stop the heartbeat. In bz1053, ocfs2_hb_ctl
>>>>>>> was segfaulting.
>>>>>>> Are you seeing any segfaults or any other errors during umount?
>>>>>>>
>>>>>>> Also, run the following before and after umount:
>>>>>>> $ ocfs2_hb_ctl -I -d /dev/sdX o2cb
>>>>>>>
>>>>>>> Email me the output.
>>>>>>>
>>>>>>> Nikola Ciprich wrote:
>>>>>>>
>>>>>>>             
>>>>>>>> Hello Tao,
>>>>>>>> and thanks a lot for the reply!
>>>>>>>> It seems not to be the same bug; at least, applying the patch
>>>>>>>> didn't help.
>>>>>>>> Stopping hb using the -K parameter really helps, but why doesn't
>>>>>>>> this work automatically on umount?
>>>>>>>> It always happens on the second node...
>>>>>>>> I don't see any errors in the logs.
>>>>>>>> But the reference count always increases on mount, and doesn't
>>>>>>>> decrease on umount on this node.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Apr 03, 2009 at 10:58:18AM +0800, Tao Ma wrote:
>>>>>>>>
>>>>>>>>               
>>>>>>>>> Hi Nikola,
>>>>>>>>>
>>>>>>>>> Nikola Ciprich wrote:
>>>>>>>>>
>>>>>>>>>                 
>>>>>>>>>> Hi,
>>>>>>>>>> I'm trying ocfs2 on a RHEL5 distro with a 2.6.29 kernel and
>>>>>>>>>> ocfstools-1.4.1. I'm using DRBD in primary/primary mode
>>>>>>>>>> as shared storage...
>>>>>>>>>>
>>>>>>>>>> I've configured the service according to the quickstart
>>>>>>>>>> document, and everything works,
>>>>>>>>>> but when I umount the fs on both nodes, stopping the o2cb
>>>>>>>>>> service on one of the nodes always
>>>>>>>>>> fails with:
>>>>>>>>>>
>>>>>>>>>> [r...@vbox4 sysconfig]# /etc/rc.d/init.d/o2cb stop
>>>>>>>>>> Stopping O2CB cluster vb34: Failed
>>>>>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>>>>>
>>>>>>>>>>                   
>>>>>>>>> It looks like your disk heartbeat is still there. I don't know
>>>>>>>>> the specific reason; maybe
>>>>>>>>> http://oss.oracle.com/bugzilla/show_bug.cgi?id=1053 ?
>>>>>>>>>
>>>>>>>>> But you can stop it manually.
>>>>>>>>> 1.  ocfs2_hb_ctl -I -d <device>
>>>>>>>>>  or ocfs2_hb_ctl -I -u <uuid>
>>>>>>>>> This will tell you the reference count for the heartbeat.
>>>>>>>>> 2.  ocfs2_hb_ctl -K -d <device> <service>
>>>>>>>>>  or ocfs2_hb_ctl -K -u <uuid> <service>
>>>>>>>>> This will kill the heartbeat manually.
>>>>>>>>> <service> is the stack you use, and it should be "o2cb" in
>>>>>>>>> your case.
>>>>>>>>>
>>>>>>>>> btw, you can try ocfs2_hb_ctl -K -u <uuid> <service> to see
>>>>>>>>> whether it is the same problem as bug 1053.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tao
>>>>>>>>>
>>>>>>>>>                 
>>>>>>           
>>>> -- 
>>>> -------------------------------------
>>>> Nikola CIPRICH
>>>> LinuxBox.cz, s.r.o.
>>>> 28. rijna 168, 709 01 Ostrava
>>>>
>>>> tel.:   +420 596 603 142
>>>> fax:    +420 596 621 273
>>>> mobil:  +420 777 093 799
>>>> www.linuxbox.cz
>>>>
>>>> mobil servis: +420 737 238 656
>>>> email servis: ser...@linuxbox.cz
>>>> -------------------------------------
>>>>       
>>
>>   
>
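
For the archive, the manual check quoted above can be wrapped in a small
script. This is only a sketch under assumptions: hb_refs and hb_check are
hypothetical names, and the parser assumes that "ocfs2_hb_ctl -I" prints a
single line of the form "<uuid>: <n> refs", as in the output shown earlier
in this thread. It prints the -K command instead of running it, so nothing
is killed automatically.

```shell
#!/bin/sh
# Hypothetical helper: extract the reference count from one line of
# "ocfs2_hb_ctl -I" output, assumed to look like "<uuid>: <n> refs".
# Prints nothing if the line does not match that form.
hb_refs() {
    printf '%s\n' "$1" |
        sed -n 's/^[0-9A-Fa-f]\{1,\}: \([0-9]\{1,\}\) refs$/\1/p'
}

# Hypothetical helper: run after umount; reports whether the heartbeat
# region on the given device still has references.
hb_check() {
    device=$1
    line=$(ocfs2_hb_ctl -I -d "$device") || return 1
    refs=$(hb_refs "$line")
    [ -n "$refs" ] || { echo "could not parse: $line" >&2; return 1; }
    if [ "$refs" -gt 0 ]; then
        echo "$device: $refs refs still held; to stop manually:"
        # ${line%%:*} is the UUID part before the first colon.
        echo "  ocfs2_hb_ctl -K -u ${line%%:*} o2cb"
    else
        echo "$device: no heartbeat references"
    fi
}
```

Example, after "umount /home/LVS": hb_check /dev/vgshared/lvs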

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
