Try restarting some of the OSDs that 'ceph osd tree' shows as down, and
see what happens.
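
Something along these lines might help. It is only a rough sketch: it assumes the OSDs are managed by systemd with the usual ceph-osd@<id> units (not containers), and list_down_osds is just a helper name made up here, not a Ceph command. It picks out the IDs that the plain-text 'ceph osd tree' output marks as down, so you can restart one on its host and watch whether it leaves the list.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): read plain-text `ceph osd tree`
# output on stdin and print the numeric ID of every OSD whose status is "down".
# Assumes each OSD row contains a name of the form "osd.N" directly followed
# by its status column.
list_down_osds() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) == "down") {
                sub(/^osd\./, "", $i)   # keep only the numeric ID
                print $i
            }
    }'
}

# Possible usage on an OSD host (left commented out; osd.0 is only an example ID):
#   ceph osd tree | list_down_osds       # which OSDs are down?
#   systemctl restart ceph-osd@0         # restart one that lives on this host
#   journalctl -u ceph-osd@0 -n 50       # look at its most recent log output
#   ceph osd tree | list_down_osds       # did it drop out of the down list?
```

Restarting a single OSD first, rather than all of them, keeps the log output small enough to read and shows quickly whether the daemon can register with the monitors at all.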

nokia ceph <nokiacephus...@gmail.com> wrote on Fri, Nov 8, 2019 at 6:24 PM:
>
> Adding my official email ID.
>
> ---------- Forwarded message ---------
> From: nokia ceph <nokiacephus...@gmail.com>
> Date: Fri, Nov 8, 2019 at 3:57 PM
> Subject: OSD's not coming up in Nautilus
> To: Ceph Users <ceph-users@lists.ceph.com>
>
>
> Hi Team,
>
> We have a 5-node Ceph cluster that we upgraded from Luminous to
> Nautilus. Everything was going well until yesterday, when we noticed that
> the Ceph OSDs are marked down and are not recognized by the monitors as
> running, even though the OSD processes are running.
>
> We noticed that the admin.keyring and the mon.keyring were missing on the
> nodes, so we recreated them with the commands below.
>
> ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key 
> -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds allow
>
> ceph-authtool --create_keyring /etc/ceph/ceph.mon.keyring --gen-key -n mon. 
> --cap mon 'allow *'
>
> In the monitor logs we find the following lines.
>
> 2019-11-08 09:01:50.525 7ff61722b700  0 log_channel(audit) log [DBG] : 
> from='client.? 10.50.11.44:0/2398064782' entity='client.admin' 
> cmd=[{"prefix": "df", "format": "json"}]: dispatch
> 2019-11-08 09:02:37.686 7ff61722b700  0 log_channel(cluster) log [INF] : 
> mon.cn1 calling monitor election
> 2019-11-08 09:02:37.686 7ff61722b700  1 mon.cn1@0(electing).elector(31157) 
> init, last seen epoch 31157, mid-election, bumping
> 2019-11-08 09:02:37.688 7ff61722b700 -1 mon.cn1@0(electing) e3 failed to get 
> devid for : udev_device_new_from_subsystem_sysname failed on ''
> 2019-11-08 09:02:37.770 7ff61722b700  0 log_channel(cluster) log [INF] : 
> mon.cn1 is new leader, mons cn1,cn2,cn3,cn4,cn5 in quorum (ranks 0,1,2,3,4)
> 2019-11-08 09:02:37.857 7ff613a24700  0 log_channel(cluster) log [DBG] : 
> monmap e3: 5 mons at 
> {cn1=[v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0],cn2=[v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0],cn3=[v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0],cn4=[v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0],cn5=[v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0]}
>
>
>
> # ceph mon dump
> dumped monmap epoch 3
> epoch 3
> fsid 9dbf207a-561c-48ba-892d-3e79b86be12f
> last_changed 2019-09-03 07:53:39.031174
> created 2019-08-23 18:30:55.970279
> min_mon_release 14 (nautilus)
> 0: [v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0] mon.cn1
> 1: [v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0] mon.cn2
> 2: [v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0] mon.cn3
> 3: [v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0] mon.cn4
> 4: [v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0] mon.cn5
>
>
> # ceph -s
>   cluster:
>     id:     9dbf207a-561c-48ba-892d-3e79b86be12f
>     health: HEALTH_WARN
>             85 osds down
>             3 hosts (72 osds) down
>             1 nearfull osd(s)
>             1 pool(s) nearfull
>             Reduced data availability: 2048 pgs inactive
>             too few PGs per OSD (17 < min 30)
>             1/5 mons down, quorum cn2,cn3,cn4,cn5
>
>   services:
>     mon: 5 daemons, quorum cn2,cn3,cn4,cn5 (age 57s), out of quorum: cn1
>     mgr: cn1(active, since 73m), standbys: cn2, cn3, cn4, cn5
>     osd: 120 osds: 35 up, 120 in; 909 remapped pgs
>
>   data:
>     pools:   1 pools, 2048 pgs
>     objects: 0 objects, 0 B
>     usage:   176 TiB used, 260 TiB / 437 TiB avail
>     pgs:     100.000% pgs unknown
>              2048 unknown
>
>
> The OSD logs show the following.
>
> 2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load kvs
> 2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load lua
> 2019-11-08 09:05:33.337 7fd1a36eed80  0 _get_class not permitted to load sdk
> 2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features 
> 432629308056666112, adjusting msgr requires for clients
> 2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features 
> 432629308056666112 was 8705, adjusting msgr requires for mons
> 2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features 
> 1009090060360105984, adjusting msgr requires for osds
>
> Please let us know what might be the issue. There do not seem to be any
> network issues on the public or private interfaces of any of the servers.
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
