[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-10 Thread Eugen Block
Please ignore my email, the PGs eventually recovered; it just took much more time than expected or than was observed for the other PGs. I'll try to be more patient next time. ;-)
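
For anyone waiting out a similar backfill, here is a small Python sketch (mine, not something from this thread) that polls "ceph status" and counts the PGs still in a remapped or backfilling state, so progress stays visible while you wait. It assumes the ceph CLI and an admin keyring are available on the host, and that the JSON status exposes pgmap.pgs_by_state the way I remember it from Nautilus; adjust if your release reports it differently.

import json
import subprocess
import time

def pg_states():
    """Return {state_name: count} from 'ceph status --format json'."""
    out = subprocess.check_output(["ceph", "status", "--format", "json"])
    pgmap = json.loads(out)["pgmap"]
    return {s["state_name"]: s["count"] for s in pgmap.get("pgs_by_state", [])}

if __name__ == "__main__":
    # Poll once a minute and report how many PGs are still remapped/backfilling.
    while True:
        states = pg_states()
        busy = sum(count for state, count in states.items()
                   if "backfill" in state or "remapped" in state)
        print(f"{time.strftime('%H:%M:%S')}  PGs still remapped/backfilling: {busy}")
        if busy == 0:
            break
        time.sleep(60)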

[ceph-users] Re: HeartbeatMap FAILED assert(0 == "hit suicide timeout")

2019-10-10 Thread Janne Johansson
On Thu, 10 Oct 2019 at 15:12, 潘东元 wrote:
> hi all,
> my OSD hit the suicide timeout.
>
> common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")
>
> ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
>
> can you give some advice on troubleshooting?
It is a very
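
As background on what that assert means, here is a rough Python sketch of the general watchdog pattern behind "hit suicide timeout". It illustrates the mechanism only, it is not Ceph's actual C++ in common/HeartbeatMap.cc, and the class name, thread name and the 5-second grace are made up for the example. Worker threads touch a shared map on every loop; a monitor aborts the whole daemon if any thread stays silent longer than its suicide grace, so a stuck daemon dies quickly and can be restarted instead of lingering half-alive. The real timeouts are configurable in Ceph (the various *_suicide_timeout options), though defaults and option names vary by release.

import os
import threading
import time

SUICIDE_GRACE = 5.0  # seconds a worker may go without a heartbeat (illustrative value)

class HeartbeatMap:
    """Toy watchdog: workers touch their entry; a monitor aborts if one stalls."""
    def __init__(self):
        self._lock = threading.Lock()
        self._last_beat = {}  # worker name -> time of its last heartbeat

    def touch(self, name):
        with self._lock:
            self._last_beat[name] = time.monotonic()

    def check(self):
        now = time.monotonic()
        with self._lock:
            for name, beat in self._last_beat.items():
                if now - beat > SUICIDE_GRACE:
                    print(f"{name} hit suicide timeout")
                    os._exit(1)  # kill the whole process, like the FAILED assert

def worker(hmap):
    for i in range(100):
        hmap.touch("op_thread")
        # Simulate a thread that blocks (bad disk, overload) after a few loops:
        time.sleep(1 if i < 3 else 60)

if __name__ == "__main__":
    hmap = HeartbeatMap()
    threading.Thread(target=worker, args=(hmap,), daemon=True).start()
    while True:  # monitor loop
        hmap.check()
        time.sleep(1)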

[ceph-users] Re: MDS rejects clients causing hanging mountpoint on linux kernel client

2019-10-10 Thread Manuel Riel
I noticed a similar issue tonight. Still looking into the details, but here are the client logs:
Oct 9 19:27:59 mon5-cx kernel: libceph: mds0 ***:6800 socket closed (con state OPEN)
Oct 9 19:28:01 mon5-cx kernel: libceph: mds0 ***:6800 connection reset
Oct 9 19:28:01 mon5-cx kernel:

[ceph-users] Re: Wrong %USED and MAX AVAIL stats for pool

2019-10-10 Thread Yordan Yordanov (Innologica)
Hi Igor, thank you for responding. In this case this looks like a breaking change. I know of two applications that are now displaying the pool usage and capacity incorrectly; it looks like they both rely on dividing the USED field by the number of replicas. One of those applications is
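
To make that assumption concrete, here is a small Python sketch (my own, not code from either application) of the calculation they appear to do: take the pool's USED value from ceph df and divide it by the pool's replica size to get the logical usage. The JSON field names (bytes_used, stored, max_avail) and the placeholder pool name "volumes" are what I believe Nautilus emits and should be treated as assumptions, not a reference.

import json
import subprocess

def ceph_json(*args):
    """Run a ceph CLI command and parse its JSON output (assumes 'ceph' is in PATH)."""
    return json.loads(subprocess.check_output(["ceph", *args, "--format", "json"]))

def pool_usage(pool_name):
    """Logical usage the way the applications seem to derive it: USED / replicas."""
    df = ceph_json("df", "detail")
    stats = next(p["stats"] for p in df["pools"] if p["name"] == pool_name)
    size = ceph_json("osd", "pool", "get", pool_name, "size")["size"]

    raw_used = stats["bytes_used"]      # USED: raw bytes incl. replication (Nautilus)
    derived = raw_used / size           # what the apps compute by dividing
    stored = stats.get("stored")        # what Nautilus itself reports as STORED
    return derived, stored, stats["max_avail"]

if __name__ == "__main__":
    derived, stored, max_avail = pool_usage("volumes")  # placeholder pool name
    print(f"USED/replicas: {derived:.0f}  STORED: {stored}  MAX AVAIL: {max_avail}")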

[ceph-users] Nautilus: PGs stuck remapped+backfilling

2019-10-10 Thread Eugen Block
Hi all, I have a strange issue with backfilling and I'm not sure what the cause is. It's an (upgraded) Nautilus cluster that has an SSD cache tier for OpenStack and CephFS metadata residing on the same SSDs; there were three SSDs in total. Today I added two new NVMe SSDs (osd.15, osd.16) to