Re: [Linux-HA] Antw: Re: Q: unmanaged MD-RAID & auto-recovery

Ulrich Windl Sun, 27 Nov 2011 23:42:32 -0800

>>> Lars Ellenberg <lars.ellenb...@linbit.com> wrote on 25.11.2011 at 13:29 in
message <20111125122944.GC7722@barkeeper1-xen.linbit>:
> On Fri, Nov 25, 2011 at 11:34:33AM +0100, Florian Haas wrote:
> > On 11/25/11 10:47, Ulrich Windl wrote:
> > > The resource is unmanaged:
> > > Nov 24 12:59:05 h03 pengine: [15876]: notice: LogActions: Leave   prm_c11_db_15k_raid1     (Started unmanaged)
> > > [...]
> > > LUN shrink begins:
> > > Nov 24 12:59:39 h03 kernel: [1220873.890571] sd 2:0:3:13: [sdai] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > > [...]
> > > RAID monitor runs:
> > > Nov 24 13:08:31 h03 lrmd: [15874]: info: rsc:prm_c11_db_15k_raid1 monitor[262] (pid 28046)
> > > Nov 24 13:08:31 h03 Raid1[28046]: [28061]: WARNING: /dev/md10 has at least one failed device.
> > > Nov 24 13:08:31 h03 Raid1[28046]: [28063]: INFO: Attempting recovery sequence to re-add devices on /dev/md10:
> > > Nov 24 13:08:31 h03 lrmd: [15874]: info: RA output: (prm_c11_db_15k_raid1:monitor:stderr) mdadm: re-added /dev/disk/by-id/dm-name-C11_DB_15k-E2
> > > Nov 24 13:08:31 h03 kernel: [1221405.912376] md: bind<dm-48>
> > > Nov 24 13:08:31 h03 kernel: [1221405.913568] RAID1 conf printout:
> > > Nov 24 13:08:31 h03 kernel: [1221405.913572]  --- wd:1 rd:2
> > > Nov 24 13:08:31 h03 kernel: [1221405.913576]  disk 0, wo:0, o:1, dev:dm-14
> > > Nov 24 13:08:31 h03 kernel: [1221405.913579]  disk 1, wo:1, o:1, dev:dm-48
> > > Nov 24 13:08:31 h03 kernel: [1221405.913706] md: recovery of RAID array md10
> > > [...]
> > > (The resource still is unmanaged)
> > > Nov 24 13:14:06 h03 pengine: [15876]: notice: LogActions: Leave   prm_c11_db_15k_raid1     (Started unmanaged)
> > > 
> > > So you see.
> > 
> > I agree that this clearly ought not to happen.
> >
> > From the log snippet it's
> > not entirely clear whether that's a recurring monitor (interval ==
> > whatever you configured, or 20 if default), or a probe (interval == 0).
> > 
> > A recurring monitor clearly should not happen at all when unmanaged.
> 
> That is incorrect.
> 
> is-managed=false does still monitor the resource.  It only prevents
> pacemaker from sending start/stop etc commands to that resource.
> 
> If the implementation of the monitor action in the RA does trigger
> "auto-recovery" or other things, well, then it does.


Well,

IMHO the RA should not try auto-recovery in "unmanaged" mode. I'm unsure 
whether I can set "maintenance-mode" for a single resource.

Regards,
Ulrich

> 
> If you don't want that, you'd need to either set "maintenance-mode=true"
> (which I'd recommend; it simply takes no action at all),
> or disable monitor operations on the is-managed=false resource as well.
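
For reference, a minimal sketch of the maintenance-mode route using the crm shell. The two wrapper function names are made up for illustration; only the `crm configure property` call is the real command, and it assumes the crm shell is installed on the node. Note that maintenance-mode is a cluster-wide property, so it affects all resources, not just the RAID.

```shell
#!/bin/sh
# Hypothetical wrappers (names are illustrative only).
# Nothing runs until one of them is called on a cluster node.

# Put the whole cluster into maintenance mode: Pacemaker then
# leaves all resources alone, including recurring monitors.
enter_maintenance() {
    crm configure property maintenance-mode=true
}

# Hand control back to Pacemaker once the storage work is done.
leave_maintenance() {
    crm configure property maintenance-mode=false
}
```

Typical use would be to call enter_maintenance before the LUN change and leave_maintenance after the array has been checked by hand.
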
> 
> > A probe, however, would cause the RA to skip this part of monitor. And, it
> > would be skipped altogether if OCF_CHECK_LEVEL == 0.
> 
> In this case it would be sufficient to disable only the monitor
> action(s) with OCF_CHECK_LEVEL > 0.
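
The condition discussed above (re-add only on a recurring monitor with OCF_CHECK_LEVEL > 0, never on a probe) could be written roughly like this in a resource agent. This is a sketch under assumptions, not the actual Raid1 code: the function name should_attempt_recovery is hypothetical. OCF_RESKEY_CRM_meta_interval and OCF_CHECK_LEVEL are the environment variables the cluster passes to the agent; both are treated as 0 here when unset.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the Raid1 RA.
# Returns 0 (true) only when the current call is a recurring
# monitor (interval != 0) running with OCF_CHECK_LEVEL > 0.
should_attempt_recovery() {
    action="$1"
    interval="${OCF_RESKEY_CRM_meta_interval:-0}"
    level="${OCF_CHECK_LEVEL:-0}"

    # Only the monitor action may trigger recovery.
    [ "$action" = "monitor" ] || return 1
    # An interval of 0 means this is a probe, not a recurring monitor.
    [ "$interval" -ne 0 ] || return 1
    # The basic check level (0) must not touch the array.
    [ "$level" -gt 0 ] || return 1
    return 0
}
```

With a guard like this, a monitor operation configured without OCF_CHECK_LEVEL (or with it set to 0) would only report the degraded state and leave the re-add to the admin.
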



 

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
