Re: [Linux-HA] Antw: Re: Q: unmanaged MD-RAID & auto-recovery

Lars Ellenberg Fri, 25 Nov 2011 04:30:00 -0800

On Fri, Nov 25, 2011 at 11:34:33AM +0100, Florian Haas wrote:
> On 11/25/11 10:47, Ulrich Windl wrote:
> > The resource is unmanaged:
> > Nov 24 12:59:05 h03 pengine: [15876]: notice: LogActions: Leave   
> > prm_c11_db_15k_raid1     (Started unmanaged)
> > [...]
> > LUN shrink begins:
> > Nov 24 12:59:39 h03 kernel: [1220873.890571] sd 2:0:3:13: [sdai] Result: 
> > hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > [...]
> > RAID monitor runs:
> > Nov 24 13:08:31 h03 lrmd: [15874]: info: rsc:prm_c11_db_15k_raid1 
> > monitor[262] (pid 28046)
> > Nov 24 13:08:31 h03 Raid1[28046]: [28061]: WARNING: /dev/md10 has at least 
> > one failed device.
> > Nov 24 13:08:31 h03 Raid1[28046]: [28063]: INFO: Attempting recovery 
> > sequence to re-add devices on /dev/md10:
> > Nov 24 13:08:31 h03 lrmd: [15874]: info: RA output: 
> > (prm_c11_db_15k_raid1:monitor:stderr) mdadm: re-added 
> > /dev/disk/by-id/dm-name-C11_DB_15k-E2
> > Nov 24 13:08:31 h03 kernel: [1221405.912376] md: bind<dm-48>
> > Nov 24 13:08:31 h03 kernel: [1221405.913568] RAID1 conf printout:
> > Nov 24 13:08:31 h03 kernel: [1221405.913572]  --- wd:1 rd:2
> > Nov 24 13:08:31 h03 kernel: [1221405.913576]  disk 0, wo:0, o:1, dev:dm-14
> > Nov 24 13:08:31 h03 kernel: [1221405.913579]  disk 1, wo:1, o:1, dev:dm-48
> > Nov 24 13:08:31 h03 kernel: [1221405.913706] md: recovery of RAID array md10
> > [...]
> > (The resource still is unmanaged)
> > Nov 24 13:14:06 h03 pengine: [15876]: notice: LogActions: Leave   
> > prm_c11_db_15k_raid1     (Started unmanaged)
> > 
> > So you see.
> 
> I agree that this clearly ought not to happen.
>
> From the log snippet it's
> not entirely clear whether that's a recurring monitor (interval ==
> whatever you configured, or 20 if default), or a probe (interval == 0).
> 
> A recurring monitor clearly should not happen at all when unmanaged.


That is incorrect.

Even with is-managed=false, pacemaker still monitors the resource.
It only stops sending start/stop etc. commands to that resource.

If the implementation of the monitor action in the RA does trigger
"auto-recovery" or other things, well, then it does.

If you don't want that, you need to either set "maintenance-mode=true"
(which I'd recommend; the cluster then simply takes no action at all),
or disable the monitor operations on the is-managed=false resource as well.
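
As a rough sketch of those two options in crm shell terms (the resource
name is taken from the logs above; the 60s monitor interval is an
assumption, use whatever your configuration actually has):

```sh
# Option 1: put the whole cluster into maintenance mode for the duration
# of the storage work, then switch it back off afterwards.
crm configure property maintenance-mode=true
# ... shrink the LUN, repair the array by hand ...
crm configure property maintenance-mode=false

# Option 2: leave the cluster active, unmanage only this resource, and
# also disable its recurring monitor by editing the op line in place.
crm resource unmanage prm_c11_db_15k_raid1
crm configure edit prm_c11_db_15k_raid1
#   op monitor interval="60s" enabled="false"   # interval value is assumed
```

With option 2, remember to re-enable the monitor and manage the resource
again once the work is done.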

> A
> probe, however, would cause the RA to skip this part of monitor. And, it
> would be skipped altogether if OCF_CHECK_LEVEL == 0.

In this case it would be sufficient to disable only the monitor
action(s) with OCF_CHECK_LEVEL > 0.
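
To illustrate the condition being discussed, here is a minimal sketch of
such a gate inside a monitor action. It is not the actual Raid1 agent
code; the function name should_attempt_recovery is made up for this
example. It only assumes what was described above: the recovery step is
skipped for probes (interval 0) and when OCF_CHECK_LEVEL is 0.

```sh
#!/bin/sh
# Hypothetical helper, NOT taken from the Raid1 resource agent.
# Returns 0 (true) only for a recurring monitor with OCF_CHECK_LEVEL > 0.
should_attempt_recovery() {
    # Pacemaker passes the operation interval in milliseconds;
    # an interval of 0 means the monitor is a probe.
    interval="${OCF_RESKEY_CRM_meta_interval:-0}"
    # An unset check level is treated as 0 (the basic check).
    level="${OCF_CHECK_LEVEL:-0}"
    [ "$interval" -ne 0 ] && [ "$level" -gt 0 ]
}
```

In a monitor action this would be used as
"if should_attempt_recovery; then ... re-add devices ...; fi", so a
monitor op configured without OCF_CHECK_LEVEL never reaches the
recovery step.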

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
