We are getting these two errors after a manual failover (fairly easy to 
recreate):

Feb 27 09:23:47 jamaica-a kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
(we know the max count is 20 now.)
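
For reference, the ext3 message is only a warning and the mount still proceeds. The counters behind it can be read with `tune2fs -l`. The sketch below assumes the file system lives on /dev/drbd0 (inferred from the second log line, not confirmed here); `mount_counts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: summarise the mount counters from `tune2fs -l` output.
# mount_counts is an illustrative helper, not part of e2fsprogs.
# It reads the tune2fs listing on stdin and prints "current/maximum".
mount_counts() {
    awk -F: '
        /^Mount count:/         { gsub(/[ \t]/, "", $2); cur = $2 }
        /^Maximum mount count:/ { gsub(/[ \t]/, "", $2); max = $2 }
        END { print cur "/" max }
    '
}

# Usage on the current primary (device name is an assumption):
#   tune2fs -l /dev/drbd0 | mount_counts
```

Running e2fsck on the unmounted file system resets the current count; whether to change the maximum is a separate policy decision.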


Feb 26 00:36:03 jamaica-a kernel: drbd0: State change failed: Device is held open by someone


Dan Phillips


From: Phillips, Dan
Sent: Wednesday, February 27, 2013 2:07 PM
To: Dan Barker; drbd List (drbd-user@lists.linbit.com)
Cc: Phillips, Dan; Felipe Gutierrez
Subject: RE: [DRBD-user] Device is held open by someone

On our system:

[root@jamaica-a ~]# lsof | grep drbd
drbd0_wor 14557      root  cwd       DIR        9,2     1024          2 /
drbd0_wor 14557      root  rtd       DIR        9,2     1024          2 /
drbd0_wor 14557      root  txt   unknown                                /proc/14557/exe
drbd0_rec 28332      root  cwd       DIR        9,2     1024          2 /
drbd0_rec 28332      root  rtd       DIR        9,2     1024          2 /
drbd0_rec 28332      root  txt   unknown                                /proc/28332/exe
drbd0_ase 28333      root  cwd       DIR        9,2     1024          2 /
drbd0_ase 28333      root  rtd       DIR        9,2     1024          2 /
drbd0_ase 28333      root  txt   unknown                                /proc/28333/exe

Here are the processes that correspond to PIDs 14557, 28332 and 28333:

[root@jamaica-a ~]# ps -aux | grep 14557
Warning: bad syntax, perhaps a bogus '-'? See 
/usr/share/doc/procps-3.2.7.2.7/FAQ
root     14557  0.0  0.0      0     0 ?        S    06:57   0:02 [drbd0_worker]
root     30919  0.0  0.0   3852   600 pts/3    S+   13:54   0:00 grep 14557

[root@jamaica-a ~]# ps -aux | grep 28332
Warning: bad syntax, perhaps a bogus '-'? See 
/usr/share/doc/procps-3.2.7.2.7/FAQ
root     28332  0.0  0.0      0     0 ?        S    07:04   0:01 [drbd0_receiver]
root     31384  0.0  0.0   3848   588 pts/3    S+   13:54   0:00 grep 28332

[root@jamaica-a ~]# ps -aux | grep 28333
Warning: bad syntax, perhaps a bogus '-'? See 
/usr/share/doc/procps-3.2.7.2.7/FAQ
root     28333  0.0  0.0      0     0 ?        S    07:04   0:02 [drbd0_asender]
root     31451  0.0  0.0   3848   592 pts/3    S+   13:54   0:00 grep 28333

So drbd0_worker, drbd0_receiver and drbd0_asender show up with open files. 
After a failover, should these processes still be holding a device open?

What does this tell us?
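
One way to read the output above: `lsof | grep drbd` matched these entries on the command name, and none of them lists the DRBD device itself, only `/` as cwd/rtd and an unknown txt. The bracketed names and zero memory sizes in the ps output suggest they are DRBD's own kernel threads, so they are probably not the "someone" from the log message. The sketch below is one way to check that, assuming Linux's convention that a kernel thread has an empty /proc/<pid>/cmdline; `is_kernel_thread` and its optional second argument are hypothetical names.

```shell
#!/bin/sh
# Sketch: tell kernel threads apart from ordinary processes.
# Assumption: on Linux a kernel thread has an empty /proc/<pid>/cmdline
# (a zombie process can look the same, so treat the result as a hint).
# is_kernel_thread PID [PROC_ROOT] is an illustrative helper.
is_kernel_thread() {
    f="${2:-/proc}/$1/cmdline"
    # No such entry: the PID is not running, so it is not a kernel thread.
    [ -e "$f" ] || return 1
    # Files under /proc report size 0, so read the content rather than
    # testing the size; NUL separators are stripped before the comparison.
    [ -z "$(tr -d '\000' < "$f")" ]
}

# Usage with the PIDs from this mail:
#   for pid in 14557 28332 28333; do
#       is_kernel_thread "$pid" && echo "$pid is a kernel thread"
#   done
```

If all three turn out to be kernel threads, the real holder is more likely to appear in `fuser -vm <mountpoint>` or `lsof <device>` than in a grep on the command name.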

Thanks,

Dan


From: drbd-user-boun...@lists.linbit.com 
[mailto:drbd-user-boun...@lists.linbit.com] On Behalf Of Dan Barker
Sent: Wednesday, February 27, 2013 12:39 PM
To: drbd List (drbd-user@lists.linbit.com)
Subject: Re: [DRBD-user] Device is held open by someone

Well, that's who's got it open: tasks 7354, 27005 and 27174. See which of them 
you may be able to stop or kill.

Dan

From: Felipe Gutierrez [mailto:felipe.o.gutier...@gmail.com]
Sent: Wednesday, February 27, 2013 11:46 AM
To: Dan Barker
Subject: Re: [DRBD-user] Device is held open by someone

root@cloud15:/home/cloud15# lsof | grep drbd
lsof: WARNING: can't stat() fuse.gvfs-fuse-daemon file system /home/cloud15/.gvfs
      Output information may be incomplete.
drbd7_wor  7354            root  cwd       DIR                8,2       4096          2 /
drbd7_wor  7354            root  rtd       DIR                8,2       4096          2 /
drbd7_wor  7354            root  txt   unknown                                          /proc/7354/exe
drbd7_rec 27005            root  cwd       DIR                8,2       4096          2 /
drbd7_rec 27005            root  rtd       DIR                8,2       4096          2 /
drbd7_rec 27005            root  txt   unknown                                          /proc/27005/exe
drbd7_ase 27174            root  cwd       DIR                8,2       4096          2 /
drbd7_ase 27174            root  rtd       DIR                8,2       4096          2 /
drbd7_ase 27174            root  txt   unknown                                          /proc/27174/exe


On Wed, Feb 27, 2013 at 1:28 PM, Dan Barker <dbar...@visioncomm.net> wrote:
And what did "lsof | grep drbd" say?

From: drbd-user-boun...@lists.linbit.com 
[mailto:drbd-user-boun...@lists.linbit.com] On Behalf Of Felipe Gutierrez
Sent: Wednesday, February 27, 2013 11:24 AM
To: Prater, James K.
Cc: drbd-user@lists.linbit.com

Subject: Re: [DRBD-user] Device is held open by someone

Hi James,

Even after stopping Xen, I couldn't unmount my file system or set the resource 
to secondary with drbdadm.

This is my output:

root@cloud15:/home/cloud15# umount /mnt/drbd7/
umount: /mnt/drbd7: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
root@cloud15:/home/cloud15# drbd-overview
  7:r7  StandAlone Primary/Unknown UpToDate/DUnknown r----- /mnt/drbd7 ext3 23G 8.3G 14G 39%
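
A possible next diagnostic step, sketched under assumptions: the backing device is taken to be /dev/drbd7 (inferred from the resource line above), and `mounts_of_device` is a hypothetical helper. It lists where a block device is still mounted according to /proc/mounts.

```shell
#!/bin/sh
# Sketch: show every mount point that still refers to a block device.
# mounts_of_device DEVICE [MOUNTS_FILE] is an illustrative helper; the
# second argument defaults to /proc/mounts and exists mainly for testing.
mounts_of_device() {
    awk -v dev="$1" '$1 == dev { print $2 }' "${2:-/proc/mounts}"
}

# Usage for the case in this mail (device name is an assumption):
#   mounts_of_device /dev/drbd7
#   fuser -vm /mnt/drbd7   # processes with files open on that mount
#   losetup -a             # loop devices; relevant only if Xen used
#                          # image files stored on this file system
```

An empty result does not prove the device is free: after a lazy unmount (`umount -l`) the entry disappears from /proc/mounts while the file system stays busy until its last user closes it.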


Any hint?
Thanks


On Wed, Feb 27, 2013 at 6:50 AM, Prater, James K. <jpra...@draper.com> wrote:
a separate system just for XEN. You are probably having some kernel-based 
conflicts that are blocking the release of the volume(s).


From: Felipe Gutierrez [mailto:felipe.o.gutier...@gmail.com]
Sent: Tuesday, February 26, 2013 04:57 PM
To: Arnold Krille <arn...@arnoldarts.de>
Cc: drbd-user@lists.linbit.com <drbd-user@lists.linbit.com>
Subject: Re: [DRBD-user] Device is held open by someone

Hi Arnold,

I will try to stop Xen.

Talking about stonith/fencing: I was working with Corosync+Pacemaker+Xen+DRBD, 
but the Pacemaker configuration failed once I put all the components 
together. That is, with Corosync+Pacemaker+DRBD the fencing worked well, but 
after I added Xen the Pacemaker configuration failed.

Now I am not using Corosync+Pacemaker anymore :(

Do you have any clue for me about this?

Thanks in advance!
Felipe
On Tue, Feb 26, 2013 at 6:47 PM, Arnold Krille <arn...@arnoldarts.de> wrote:
On Tue, 26 Feb 2013 09:43:55 -0300 Felipe Gutierrez
<felipe.o.gutier...@gmail.com> wrote:
> No, it is not mounted. That is why I used the -l option with umount:
>
> primary# umount -l /mnt/drbd7
>
> I was saving files on this partition with the Xen hypervisor.
> If I test the same thing without Xen, everything works fine.
Well, then make Xen stop when you have to switch over the primary. Or
at least make Xen stop using that directory. It could be that it's still
running VMs from there, or that it's only still looking at the directory
because it 'could' run VMs from there.
If you or your cluster manager want to fail over the resource and that
fails, it's a case for stonith/fencing. Or a case for a manual reboot if
you haven't configured fencing yet.
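
The order of operations described here can be sketched as a small script. This is a dry-run outline under assumptions, not a tested procedure: the resource name r7 and mount point /mnt/drbd7 are taken from earlier in the thread, `run` and `demote` are hypothetical helper names, and stopping the Xen guests is left out because it depends on the local setup.

```shell
#!/bin/sh
# Sketch: manual switch-over steps on the current primary.
# With DRY_RUN=1 (the default) the commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# demote RESOURCE MOUNTPOINT: unmount first, then give up the primary role.
# Everything that uses the mount point (here: Xen) must be stopped before.
demote() {
    run umount "$2" || return 1
    run drbdadm secondary "$1" || return 1
}

# Usage:
#   demote r7 /mnt/drbd7             # print the steps
#   DRY_RUN=0 demote r7 /mnt/drbd7   # actually run them
```

A plain `umount` is used on purpose: if it fails with "device is busy", the holder still has to be found, and a lazy unmount would only hide that.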

> I just need to know how to force it to become secondary. This time I
> rebooted the machine and was able to set it to secondary, but I need to
> simulate this without rebooting.
Have fun,

Arnold

_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user



--
--
-- Felipe Oliveira Gutierrez
-- felipe.o.gutier...@gmail.com
-- https://sites.google.com/site/lipe82/Home/diaadia


