Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-19 Thread Sean Redmond
Hi,

I would be interested in this for the case where an MDS in standby-replay fails.

Thanks

On Wed, Oct 19, 2016 at 4:06 PM, Scottix <scot...@gmail.com> wrote:

> I would take the analogy of a RAID scenario. Basically, a standby is like a
> spare drive. If that spare drive goes down, it is good to know about the
> event, but it in no way indicates a degraded system; everything keeps
> running at top speed.
>
> If you had multiple active MDS daemons and one went down, then I would say
> that is a degraded system, but I am still waiting for that feature.
>
>
> On Tue, Oct 18, 2016 at 10:18 AM Goncalo Borges <
> goncalo.bor...@sydney.edu.au> wrote:
>
>> Hi John.
>>
>> That would be good.
>>
>> In our case we currently pick that up through Nagios and some scripts
>> that parse the dump of the MDS maps.
>>
>> Cheers
>> Goncalo
>> 
>> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John
>> Spray [jsp...@redhat.com]
>> Sent: 18 October 2016 22:46
>> To: ceph-users
>> Subject: [ceph-users] Feedback wanted: health warning when standby MDS
>> dies?
>>
>> Hi all,
>>
>> Someone asked me today how to get a list of down MDS daemons, and I
>> explained that currently the MDS simply forgets about any standby that
>> stops sending beacons.  That got me thinking about the case where a
>> standby dies while the active MDS remains up -- the cluster has gone
>> into a non-highly-available state, but we are not giving the admin any
>> indication.
>>
>> I've suggested a solution here:
>> http://tracker.ceph.com/issues/17604
>>
>> This is probably going to be a bit of a subjective thing in terms of
>> whether people find it useful or find it to be annoying noise, so I'd
>> be interested in feedback from people currently running cephfs.
>>
>> Cheers,
>> John
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-19 Thread Scottix
I would take the analogy of a RAID scenario. Basically, a standby is like a
spare drive. If that spare drive goes down, it is good to know about the
event, but it in no way indicates a degraded system; everything keeps
running at top speed.

If you had multiple active MDS daemons and one went down, then I would say
that is a degraded system, but I am still waiting for that feature.

On Tue, Oct 18, 2016 at 10:18 AM Goncalo Borges <
goncalo.bor...@sydney.edu.au> wrote:

> Hi John.
>
> That would be good.
>
> In our case we currently pick that up through Nagios and some scripts
> that parse the dump of the MDS maps.
>
> Cheers
> Goncalo
> 
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John
> Spray [jsp...@redhat.com]
> Sent: 18 October 2016 22:46
> To: ceph-users
> Subject: [ceph-users] Feedback wanted: health warning when standby MDS
> dies?
>
> Hi all,
>
> Someone asked me today how to get a list of down MDS daemons, and I
> explained that currently the MDS simply forgets about any standby that
> stops sending beacons.  That got me thinking about the case where a
> standby dies while the active MDS remains up -- the cluster has gone
> into a non-highly-available state, but we are not giving the admin any
> indication.
>
> I've suggested a solution here:
> http://tracker.ceph.com/issues/17604
>
> This is probably going to be a bit of a subjective thing in terms of
> whether people find it useful or find it to be annoying noise, so I'd
> be interested in feedback from people currently running cephfs.
>
> Cheers,
> John


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread Goncalo Borges
Hi John.

That would be good.

In our case we currently pick that up through Nagios and some scripts
that parse the dump of the MDS maps.
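
A minimal sketch of what such a check could look like, written as a
Nagios-style plugin function in Python. The JSON layout used here (a
top-level "daemons" list with "name" and "state" fields) is a simplified,
hypothetical stand-in and not the actual format of the MDS map dump; a real
script would have to adapt the parsing to whatever the cluster prints. The
function name check_mds is also made up for illustration.

```python
import json

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# MDS states counted as "standby" for this check.
STANDBY_STATES = ("up:standby", "up:standby-replay")


def check_mds(dump_text):
    """Return (exit_code, message) for a JSON dump of MDS daemons.

    ASSUMPTION: dump_text is JSON shaped like
    {"daemons": [{"name": "a", "state": "up:active"}, ...]}.
    This layout is hypothetical and only serves the example.
    """
    daemons = json.loads(dump_text)["daemons"]
    states = [d["state"] for d in daemons]
    active = states.count("up:active")
    standby = sum(1 for s in states if s in STANDBY_STATES)

    if active == 0:
        return CRITICAL, "no active MDS"
    if standby == 0:
        return WARNING, "active MDS is up but no standby is available"
    return OK, "%d active, %d standby" % (active, standby)
```

A wrapper script would print the message and exit with the returned code so
that Nagios can pick up the state.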

Cheers 
Goncalo 

From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John Spray 
[jsp...@redhat.com]
Sent: 18 October 2016 22:46
To: ceph-users
Subject: [ceph-users] Feedback wanted: health warning when standby MDS dies?

Hi all,

Someone asked me today how to get a list of down MDS daemons, and I
explained that currently the MDS simply forgets about any standby that
stops sending beacons.  That got me thinking about the case where a
standby dies while the active MDS remains up -- the cluster has gone
into a non-highly-available state, but we are not giving the admin any
indication.

I've suggested a solution here:
http://tracker.ceph.com/issues/17604

This is probably going to be a bit of a subjective thing in terms of
whether people find it useful or find it to be annoying noise, so I'd
be interested in feedback from people currently running cephfs.

Cheers,
John


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread Lars Marowsky-Bree
On 2016-10-18T12:46:48, John Spray  wrote:

> I've suggested a solution here:
> http://tracker.ceph.com/issues/17604
> 
> This is probably going to be a bit of a subjective thing in terms of
> whether people find it useful or find it to be annoying noise, so I'd
> be interested in feedback from people currently running cephfs.

I'm not adding much new value here, but, yes.

Being warned that we're in a degraded mode where the next failure might
cause an outage would be very useful. Maybe that's even worth flagging
as a separate health level.


-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread Benjeman Meekhof
+1 to this, it would be useful

On Tue, Oct 18, 2016 at 8:31 AM, Wido den Hollander  wrote:
>
>> On 18 October 2016 at 14:06, Dan van der Ster wrote:
>>
>>
>> +1 I would find this warning useful.
>>
>
> +1. It should probably be configurable: say, you want at least X standby
> MDS daemons to be available, and WARN if there are fewer. But in general,
> yes, please!
>
> Wido
>
>>
>>
>> On Tue, Oct 18, 2016 at 1:46 PM, John Spray  wrote:
>> > Hi all,
>> >
>> > Someone asked me today how to get a list of down MDS daemons, and I
>> > explained that currently the MDS simply forgets about any standby that
>> > stops sending beacons.  That got me thinking about the case where a
>> > standby dies while the active MDS remains up -- the cluster has gone
>> > into a non-highly-available state, but we are not giving the admin any
>> > indication.
>> >
>> > I've suggested a solution here:
>> > http://tracker.ceph.com/issues/17604
>> >
>> > This is probably going to be a bit of a subjective thing in terms of
>> > whether people find it useful or find it to be annoying noise, so I'd
>> > be interested in feedback from people currently running cephfs.
>> >
>> > Cheers,
>> > John


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread Wido den Hollander

> On 18 October 2016 at 14:06, Dan van der Ster wrote:
> 
> 
> +1 I would find this warning useful.
> 

+1. It should probably be configurable: say, you want at least X standby
MDS daemons to be available, and WARN if there are fewer. But in general,
yes, please!
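
To make the threshold idea concrete, here is a small Python sketch. The
function name standby_health and the parameter wanted are hypothetical and
only illustrate the "at least X standbys" rule; they are not an existing
Ceph setting or API. HEALTH_OK and HEALTH_WARN are used as the two result
labels.

```python
def standby_health(available, wanted=1):
    """Map a standby count to a health state.

    available: number of standby MDS daemons currently known (assumed input).
    wanted:    hypothetical configurable threshold X; 0 disables the warning.
    """
    if available < 0 or wanted < 0:
        raise ValueError("counts must be non-negative")
    if available < wanted:
        detail = "insufficient standby MDS daemons: have %d, want %d" % (
            available, wanted)
        return "HEALTH_WARN", detail
    return "HEALTH_OK", ""
```

With wanted=0 the check never warns, which would suit sites that run a single
MDS on purpose and would otherwise see the warning as noise.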

Wido

> 
> 
> On Tue, Oct 18, 2016 at 1:46 PM, John Spray  wrote:
> > Hi all,
> >
> > Someone asked me today how to get a list of down MDS daemons, and I
> > explained that currently the MDS simply forgets about any standby that
> > stops sending beacons.  That got me thinking about the case where a
> > standby dies while the active MDS remains up -- the cluster has gone
> > into a non-highly-available state, but we are not giving the admin any
> > indication.
> >
> > I've suggested a solution here:
> > http://tracker.ceph.com/issues/17604
> >
> > This is probably going to be a bit of a subjective thing in terms of
> > whether people find it useful or find it to be annoying noise, so I'd
> > be interested in feedback from people currently running cephfs.
> >
> > Cheers,
> > John


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread Dan van der Ster
+1 I would find this warning useful.



On Tue, Oct 18, 2016 at 1:46 PM, John Spray  wrote:
> Hi all,
>
> Someone asked me today how to get a list of down MDS daemons, and I
> explained that currently the MDS simply forgets about any standby that
> stops sending beacons.  That got me thinking about the case where a
> standby dies while the active MDS remains up -- the cluster has gone
> into a non-highly-available state, but we are not giving the admin any
> indication.
>
> I've suggested a solution here:
> http://tracker.ceph.com/issues/17604
>
> This is probably going to be a bit of a subjective thing in terms of
> whether people find it useful or find it to be annoying noise, so I'd
> be interested in feedback from people currently running cephfs.
>
> Cheers,
> John


[ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-18 Thread John Spray
Hi all,

Someone asked me today how to get a list of down MDS daemons, and I
explained that currently the MDS simply forgets about any standby that
stops sending beacons.  That got me thinking about the case where a
standby dies while the active MDS remains up -- the cluster has gone
into a non-highly-available state, but we are not giving the admin any
indication.
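
As a rough illustration of the behaviour described above, here is a toy
Python model, not Ceph code. The class StandbyTracker, its grace parameter
and the lost set are all hypothetical names. The idea is that a standby
whose beacons stop is dropped from the map; keeping a record of what was
dropped (the lost set here) is one possible basis for a warning, and not
necessarily what the tracker issue proposes.

```python
class StandbyTracker:
    """Toy model of beacon tracking for standby daemons (illustrative only)."""

    def __init__(self, grace=15.0):
        self.grace = grace        # seconds without a beacon before dropping
        self.last_beacon = {}     # daemon name -> time of last beacon
        self.lost = set()         # names dropped for missing beacons

    def beacon(self, name, now):
        """Record a beacon; a returning daemon is no longer considered lost."""
        self.last_beacon[name] = now
        self.lost.discard(name)

    def prune(self, now):
        """Drop daemons whose last beacon is older than the grace period."""
        stale = [n for n, t in self.last_beacon.items()
                 if now - t > self.grace]
        for name in stale:
            del self.last_beacon[name]   # the map "forgets" the daemon
            self.lost.add(name)          # extra bookkeeping for a warning
        return stale
```

Without the lost set, the state after prune() is indistinguishable from a
cluster that never had the standby in the first place, which is why no
indication reaches the admin.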

I've suggested a solution here:
http://tracker.ceph.com/issues/17604

This is probably going to be a bit of a subjective thing in terms of
whether people find it useful or find it to be annoying noise, so I'd
be interested in feedback from people currently running cephfs.

Cheers,
John