Hi,

I've seen this in 0.56 as well. In my case I shut down one server and then
bring it back; I have to run /etc/init.d/ceph -a restart to make the cluster
healthy again. It doesn't impact the running VM I have in that cluster,
though.
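
Not sure whether it applies to your setup, but before falling back to a
cluster-wide "-a restart" it might be enough to restart only the daemons on
the server you brought back, e.g. on that host:

    /etc/init.d/ceph restart osd.1

(osd.1 is just an example id; without a daemon name the script restarts all
local daemons.) That is only a guess from my side, though.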


On Wed, Apr 3, 2013 at 8:32 PM, Martin Mailand <mar...@tuxadero.com> wrote:

> Hi,
>
> I still have this problem in v0.60.
> If I stop one OSD, it gets marked down after 20 seconds. But even after
> 300 seconds the OSD does not get marked out, and therefore the cluster
> stays degraded forever.
> I can reproduce this with a freshly created cluster.
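> For reference, I believe the 300 seconds corresponds to the
> "mon osd down out interval" setting. One way to check its runtime value on
> a monitor (assuming the default admin socket path) would be:
>
>     ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep mon_osd_down_out_interval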
>
> root@store1:~# ceph -s
>    health HEALTH_WARN 405 pgs degraded; 405 pgs stuck unclean; recovery
> 10603/259576 degraded (4.085%); 1/24 in osds are down
>    monmap e1: 3 mons at
> {a=192.168.195.31:6789/0,b=192.168.195.33:6789/0,c=192.168.195.35:6789/0},
> election epoch 10, quorum 0,1,2 a,b,c
>    osdmap e150: 24 osds: 23 up, 24 in
>     pgmap v12028: 4800 pgs: 4395 active+clean, 405 active+degraded; 505
> GB data, 1017 GB used, 173 TB / 174 TB avail; 0B/s rd, 6303B/s wr,
> 2op/s; 10603/259576 degraded (4.085%)
>    mdsmap e1: 0/0/1 up
>
>
> -martin
>
>
> On 28.03.2013 23:45, John Wilkins wrote:
> > Martin,
> >
> > I'm just speculating: I recently rewrote the networking section, your
> > config dump shows an empty mon_host value, and I recall a chat last week
> > where mon_host was said to be a different setting now, so maybe you might
> > try specifying:
> >
> > [mon.a]
> >         mon host = store1
> >         mon addr = 192.168.195.31:6789
> >
> > etc. for monitors. I'm assuming that's not the case, but I want to
> > make sure my docs are right on this point.
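> >
> > Or, since mon_host came up empty, it might also be worth trying it as a
> > single global list; that is just a guess on my part, though:
> >
> > [global]
> >         mon host = 192.168.195.31:6789,192.168.195.33:6789,192.168.195.35:6789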
> >
> >
> > On Thu, Mar 28, 2013 at 3:24 PM, Martin Mailand <mar...@tuxadero.com>
> wrote:
> >> Hi John,
> >>
> >> my ceph.conf is a bit further down in this email.
> >>
> >> -martin
> >>
> >> Am 28.03.2013 23:21, schrieb John Wilkins:
> >>
> >>> Martin,
> >>>
> >>> Would you mind posting your Ceph configuration file too?  I don't see
> >>> any value set for "mon_host": ""
> >>>
> >>> On Thu, Mar 28, 2013 at 1:04 PM, Martin Mailand <mar...@tuxadero.com>
> >>> wrote:
> >>>>
> >>>> Hi Greg,
> >>>>
> >>>> the dump from mon.a is attached.
> >>>>
> >>>> -martin
> >>>>
> >>>> On 28.03.2013 20:55, Gregory Farnum wrote:
> >>>>>
> >>>>> Hmm. The monitor code for checking this all looks good to me. Can you
> >>>>> go to one of your monitor nodes and dump the config?
> >>>>>
> >>>>> (
> http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admin%20socket#viewing-a-configuration-at-runtime
> )
> >>>>> -Greg
> >>>>>
> >>>>> On Thu, Mar 28, 2013 at 12:33 PM, Martin Mailand <
> mar...@tuxadero.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I get the same behavior on a newly created cluster as well, with no
> >>>>>> changes to the cluster config at all.
> >>>>>> I stopped osd.1; after 20 seconds it got marked down, but it never
> >>>>>> got marked out.
> >>>>>>
> >>>>>> ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
> >>>>>>
> >>>>>> -martin
> >>>>>>
> >>>>>> On 28.03.2013 19:48, John Wilkins wrote:
> >>>>>>>
> >>>>>>> Martin,
> >>>>>>>
> >>>>>>> Greg is talking about noout. With Ceph, you can specifically
> >>>>>>> preclude OSDs from being marked out when down, to prevent
> >>>>>>> rebalancing, e.g. during upgrades, short-term maintenance, etc.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing
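> >>>>>>>
> >>>>>>> In other words, something like this before the maintenance, and the
> >>>>>>> corresponding unset afterwards:
> >>>>>>>
> >>>>>>>     ceph osd set noout
> >>>>>>>     ceph osd unset noout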
> >>>>>>>
> >>>>>>> On Thu, Mar 28, 2013 at 11:12 AM, Martin Mailand <
> mar...@tuxadero.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Greg,
> >>>>>>>>
> >>>>>>>> setting the OSD out manually triggered the recovery.
> >>>>>>>> But now the question is: why is the OSD not marked out after 300
> >>>>>>>> seconds? This is a default cluster; I use the 0.59 build from your
> >>>>>>>> site, and I didn't change any value except for the crushmap.
> >>>>>>>>
> >>>>>>>> That's my ceph.conf.
> >>>>>>>>
> >>>>>>>> -martin
> >>>>>>>>
> >>>>>>>> [global]
> >>>>>>>>          auth cluster required = none
> >>>>>>>>          auth service required = none
> >>>>>>>>          auth client required = none
> >>>>>>>> #       log file = ""
> >>>>>>>>          log_max_recent=100
> >>>>>>>>          log_max_new=100
> >>>>>>>>
> >>>>>>>> [mon]
> >>>>>>>>          mon data = /data/mon.$id
> >>>>>>>> [mon.a]
> >>>>>>>>          host = store1
> >>>>>>>>          mon addr = 192.168.195.31:6789
> >>>>>>>> [mon.b]
> >>>>>>>>          host = store3
> >>>>>>>>          mon addr = 192.168.195.33:6789
> >>>>>>>> [mon.c]
> >>>>>>>>          host = store5
> >>>>>>>>          mon addr = 192.168.195.35:6789
> >>>>>>>> [osd]
> >>>>>>>>          journal aio = true
> >>>>>>>>          osd data = /data/osd.$id
> >>>>>>>>          osd mount options btrfs = rw,noatime,nodiratime,autodefrag
> >>>>>>>>          osd mkfs options btrfs = -n 32k -l 32k
> >>>>>>>>
> >>>>>>>> [osd.0]
> >>>>>>>>          host = store1
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.1]
> >>>>>>>>          host = store1
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.2]
> >>>>>>>>          host = store1
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.3]
> >>>>>>>>          host = store1
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>> [osd.4]
> >>>>>>>>          host = store2
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.5]
> >>>>>>>>          host = store2
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.6]
> >>>>>>>>          host = store2
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.7]
> >>>>>>>>          host = store2
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>> [osd.8]
> >>>>>>>>          host = store3
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.9]
> >>>>>>>>          host = store3
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.10]
> >>>>>>>>          host = store3
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.11]
> >>>>>>>>          host = store3
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>> [osd.12]
> >>>>>>>>          host = store4
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.13]
> >>>>>>>>          host = store4
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.14]
> >>>>>>>>          host = store4
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.15]
> >>>>>>>>          host = store4
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>> [osd.16]
> >>>>>>>>          host = store5
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.17]
> >>>>>>>>          host = store5
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.18]
> >>>>>>>>          host = store5
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.19]
> >>>>>>>>          host = store5
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>> [osd.20]
> >>>>>>>>          host = store6
> >>>>>>>>          osd journal = /dev/sdg1
> >>>>>>>>          btrfs devs = /dev/sdc
> >>>>>>>> [osd.21]
> >>>>>>>>          host = store6
> >>>>>>>>          osd journal = /dev/sdh1
> >>>>>>>>          btrfs devs = /dev/sdd
> >>>>>>>> [osd.22]
> >>>>>>>>          host = store6
> >>>>>>>>          osd journal = /dev/sdi1
> >>>>>>>>          btrfs devs = /dev/sde
> >>>>>>>> [osd.23]
> >>>>>>>>          host = store6
> >>>>>>>>          osd journal = /dev/sdj1
> >>>>>>>>          btrfs devs = /dev/sdf
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 28.03.2013 19:01, Gregory Farnum wrote:
> >>>>>>>>>
> >>>>>>>>> Your crush map looks fine to me. I'm saying that your ceph -s
> >>>>>>>>> output showed the OSD still hadn't been marked out. No data will
> >>>>>>>>> be migrated until it's marked out.
> >>>>>>>>> After ten minutes it should have been marked out, but that's
> >>>>>>>>> based on a number of factors you have some control over. If you
> >>>>>>>>> just want a quick check of your crush map you can mark it out
> >>>>>>>>> manually, too.
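> >>>>>>>>> For osd.1, for example, that would be:
> >>>>>>>>>
> >>>>>>>>>     ceph osd out 1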
> >>>>>>>>> -Greg
> >>>>>>>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
