Hi,
Below is the status of the OSD after the restart:

# systemctl status ceph-osd@0.service
● ceph-osd@0.service - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ceph-osd@.service.d
           └─90-ExecStart_NUMA.conf
   Active: active (running) since Fri 2019-11-08 10:32:51 UTC; 1min 1s ago
  Process: 219213 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 219218 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
           └─219218 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object storage daemon osd.0...
Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object storage daemon osd.0.
Nov 08 10:33:03 cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:03.785 7f9adeed4d80 -1 osd.0 1795 log_to_monitors {default=true}
Nov 08 10:33:05 cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:05.474 7f9ad14df700 -1 osd.0 1795 set_numa_affinity unable to identify public interface 'dss-client' numa n...r directory
Hint: Some lines were ellipsized, use -l to show in full.

I have also attached to this mail the log file collected while this restart was initiated.
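The set_numa_affinity message is ellipsized above. For completeness, the full text and the NUMA setting involved can be checked roughly like this (only a sketch: osd_numa_auto_affinity is the Nautilus option I would look at, and turning it off is an assumption, not something verified on this cluster):

systemctl status -l ceph-osd@0.service                  # full, non-ellipsized unit status
journalctl -u ceph-osd@0.service --since "2019-11-08 10:32" --no-pager

ceph daemon osd.0 config get osd_numa_auto_affinity     # ask the running osd.0 over its admin socket
ceph config set osd osd_numa_auto_affinity false        # assumption: only if automatic NUMA pinning is not wanted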
On Fri, Nov 8, 2019 at 3:59 PM huang jun <hjwsm1...@gmail.com> wrote:

> Try to restart some of the down OSDs in 'ceph osd tree', and see
> what happens.
>
> nokia ceph <nokiacephus...@gmail.com> wrote on Fri, Nov 8, 2019 at 6:24 PM:
> >
> > Adding my official mail id.
> >
> > ---------- Forwarded message ---------
> > From: nokia ceph <nokiacephus...@gmail.com>
> > Date: Fri, Nov 8, 2019 at 3:57 PM
> > Subject: OSD's not coming up in Nautilus
> > To: Ceph Users <ceph-users@lists.ceph.com>
> >
> > Hi Team,
> >
> > There is a five-node Ceph cluster that we upgraded from Luminous to
> > Nautilus. Everything was going well until yesterday, when we noticed
> > that the OSDs are marked down and are not recognized by the monitors
> > as running, even though the OSD processes are running.
> >
> > We noticed that the admin keyring and the mon keyring were missing on
> > the nodes, so we recreated them with the commands below:
> >
> > ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds allow
> >
> > ceph-authtool --create_keyring /etc/ceph/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
> >
> > In the monitor logs we find the lines below:
> >
> > 2019-11-08 09:01:50.525 7ff61722b700 0 log_channel(audit) log [DBG] : from='client.? 10.50.11.44:0/2398064782' entity='client.admin' cmd=[{"prefix": "df", "format": "json"}]: dispatch
> > 2019-11-08 09:02:37.686 7ff61722b700 0 log_channel(cluster) log [INF] : mon.cn1 calling monitor election
> > 2019-11-08 09:02:37.686 7ff61722b700 1 mon.cn1@0(electing).elector(31157) init, last seen epoch 31157, mid-election, bumping
> > 2019-11-08 09:02:37.688 7ff61722b700 -1 mon.cn1@0(electing) e3 failed to get devid for : udev_device_new_from_subsystem_sysname failed on ''
> > 2019-11-08 09:02:37.770 7ff61722b700 0 log_channel(cluster) log [INF] : mon.cn1 is new leader, mons cn1,cn2,cn3,cn4,cn5 in quorum (ranks 0,1,2,3,4)
> > 2019-11-08 09:02:37.857 7ff613a24700 0 log_channel(cluster) log [DBG] : monmap e3: 5 mons at {cn1=[v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0],cn2=[v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0],cn3=[v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0],cn4=[v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0],cn5=[v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0]}
> >
> > # ceph mon dump
> > dumped monmap epoch 3
> > epoch 3
> > fsid 9dbf207a-561c-48ba-892d-3e79b86be12f
> > last_changed 2019-09-03 07:53:39.031174
> > created 2019-08-23 18:30:55.970279
> > min_mon_release 14 (nautilus)
> > 0: [v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0] mon.cn1
> > 1: [v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0] mon.cn2
> > 2: [v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0] mon.cn3
> > 3: [v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0] mon.cn4
> > 4: [v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0] mon.cn5
> >
> > # ceph -s
> >   cluster:
> >     id:     9dbf207a-561c-48ba-892d-3e79b86be12f
> >     health: HEALTH_WARN
> >             85 osds down
> >             3 hosts (72 osds) down
> >             1 nearfull osd(s)
> >             1 pool(s) nearfull
> >             Reduced data availability: 2048 pgs inactive
> >             too few PGs per OSD (17 < min 30)
> >             1/5 mons down, quorum cn2,cn3,cn4,cn5
> >
> >   services:
> >     mon: 5 daemons, quorum cn2,cn3,cn4,cn5 (age 57s), out of quorum: cn1
> >     mgr: cn1(active, since 73m), standbys: cn2, cn3, cn4, cn5
> >     osd: 120 osds: 35 up, 120 in; 909 remapped pgs
> >
> >   data:
> >     pools:   1 pools, 2048 pgs
> >     objects: 0 objects, 0 B
> >     usage:   176 TiB used, 260 TiB / 437 TiB avail
> >     pgs:     100.000% pgs unknown
> >              2048 unknown
> >
> > The OSD logs show the lines below:
> >
> > 2019-11-08 09:05:33.332 7fd1a36eed80 0 _get_class not permitted to load kvs
> > 2019-11-08 09:05:33.332 7fd1a36eed80 0 _get_class not permitted to load lua
> > 2019-11-08 09:05:33.337 7fd1a36eed80 0 _get_class not permitted to load sdk
> > 2019-11-08 09:05:33.337 7fd1a36eed80 0 osd.0 1795 crush map has features 432629308056666112, adjusting msgr requires for clients
> > 2019-11-08 09:05:33.337 7fd1a36eed80 0 osd.0 1795 crush map has features 432629308056666112 was 8705, adjusting msgr requires for mons
> > 2019-11-08 09:05:33.337 7fd1a36eed80 0 osd.0 1795 crush map has features 1009090060360105984, adjusting msgr requires for osds
> >
> > Please let us know what the issue might be. There seem to be no network
> > issues on any of the servers' public or private interfaces.
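For reference, a quick way to check whether a keyring regenerated with --gen-key (as in the quoted mail above) actually matches what the monitors hold (a sketch only, assuming the mon. keyring on the monitor hosts can still authenticate; --gen-key creates a brand-new key rather than recovering the existing one):

ceph auth get client.admin                     # key the cluster's auth database actually holds
cat /etc/ceph/ceph.client.admin.keyring        # key that was regenerated locally

# if they differ, take the cluster's copy instead of the regenerated one
ceph auth get client.admin -o /etc/ceph/ceph.client.admin.keyring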
Attachment: ceph-osd.0.log (binary data)
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com