Hi,

After upgrading my Ceph clusters from 12.2.5 to 12.2.7, I'm experiencing random
crashes of SSD OSDs (BlueStore); the HDD OSDs do not seem to be affected.
I destroyed and recreated some of the SSD OSDs, which seemed to help.
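
For completeness, the destroy/recreate steps were along these lines (the OSD ID
and device below are just placeholders):

  # take the OSD out of the cluster and stop the daemon
  ceph osd out 2
  systemctl stop ceph-osd@2

  # remove the OSD from the cluster (Luminous-style purge)
  ceph osd purge 2 --yes-i-really-mean-it

  # wipe the device and recreate the OSD with bluestore
  ceph-volume lvm zap /dev/sdX --destroy
  ceph-volume lvm create --bluestore --data /dev/sdX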

This happens on CentOS 7.5 (tested with several different kernels).

/var/log/messages: 
Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 thread_name:bstore_kv_final
Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in libtcmalloc.so.4.4.5[7f8a997a8000+46000]
Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, code=killed, status=11/SEGV
Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling restart.
Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in libtcmalloc.so.4.4.5[7f5f430cd000+46000]
Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, code=killed, status=11/SEGV
Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
Aug 29 10:24:35  systemd: ceph-osd@0.service failed.

Did I hit a known issue?
Any suggestions are highly appreciated.
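
If a backtrace from the next crash would help, I could grab one along these
lines (assuming debuginfo packages are available for CentOS 7; paths and
package names may differ):

  # install gdb and debug symbols for ceph and tcmalloc
  yum install -y gdb yum-utils
  debuginfo-install -y ceph-osd gperftools-libs

  # after enabling core dumps for the unit (LimitCORE=infinity),
  # open the core from the next segfault and dump all threads
  gdb -batch -ex 'thread apply all bt' /usr/bin/ceph-osd /path/to/core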


br
wolfgang

