Re: [ceph-users] OSD up takes 15 minutes after machine restarts

2020-01-19 Thread huxia...@horebdata.cn
Hi Igor, could this be causing the problem? huxia...@horebdata.cn From: Igor Fedotov Date: 2020-01-19 11:41 To: huxia...@horebdata.cn; ceph-users Subject: Re: [ceph-users] OSD up takes 15 minutes after machine restarts Hi Samuel, wondering if you have the bluestore_fsck_on_mount option set to …
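For anyone checking the same thing, the current value can be read from a running OSD over its admin socket, and the option is set per OSD in ceph.conf; the OSD id below is a placeholder, and disabling the check is a trade-off rather than a confirmed fix for this thread:

  # query the live value on one OSD (run on the OSD host)
  ceph daemon osd.0 config get bluestore_fsck_on_mount

  # ceph.conf sketch: skip the metadata fsck at mount time (faster start, less checking)
  [osd]
  bluestore_fsck_on_mount = false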

Re: [ceph-users] Issues with Nautilus 14.2.6 ceph-volume lvm batch --bluestore ?

2020-01-19 Thread Dave Hall
Nigel, Thanks.  I've never seen that.  Cool. -Dave Dave Hall Binghamton University On 1/19/2020 11:15 PM, Nigel Williams wrote: On Mon, 20 Jan 2020 at 14:15, Dave Hall wrote: BTW, I did try to search the list archives via http://lists.ceph.com/pipermail/ceph-users-ceph.com/, but that didn…

Re: [ceph-users] Issues with Nautilus 14.2.6 ceph-volume lvm batch --bluestore ?

2020-01-19 Thread Nigel Williams
On Mon, 20 Jan 2020 at 14:15, Dave Hall wrote: > BTW, I did try to search the list archives via > http://lists.ceph.com/pipermail/ceph-users-ceph.com/, but that didn't work > well for me. Is there another way to search? With your favorite search engine (say Goog / ddg), you can do this: ceph …
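The query itself is cut off in the preview; the technique being suggested is a site-restricted search, roughly along these lines (the search terms are placeholders):

  ceph <your search terms> site:lists.ceph.com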

[ceph-users] Issues with Nautilus 14.2.6 ceph-volume lvm batch --bluestore ?

2020-01-19 Thread Dave Hall
Hello, Since upgrading to Nautilus (+ Debian 10 Backports), when I issue 'ceph-volume lvm batch --bluestore …' it fails with:
  bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
I previously had Luminous + Debian 9 running on the same hardware with the same OSD layout, but I de…
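A common troubleshooting step when 'lvm batch' fails on previously used disks is to wipe leftover LVM/partition metadata and retry; this is a sketch with placeholder device paths, not a confirmed diagnosis for this report:

  ceph-volume inventory                       # show what ceph-volume detects on each device
  ceph-volume lvm zap /dev/sdb --destroy      # destroy leftover LVM metadata and data on the device
  ceph-volume lvm batch --bluestore /dev/sdb /dev/sdc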

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Stefan Priebe - Profihost AG
Hello Igor, there's absolutely nothing in the logs before that. What do these lines mean?
  Put( Prefix = O key = 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x' Value size = 480)
  Put( Prefix = O key = 0x7f8001cc45c8…
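For context (my reading, not confirmed in this thread): the 'O' prefix is BlueStore's object/onode keyspace in RocksDB, so these Put records are object-metadata writes. With the OSD stopped, that keyspace can be listed offline; the OSD id and path are placeholders:

  systemctl stop ceph-osd@NN
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-NN list O | head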

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Stefan Priebe - Profihost AG
Yes, except that this happens on 8 different clusters with different hardware but the same Ceph version and the same kernel version. Greets, Stefan > On 19.01.2020 at 11:53, Igor Fedotov wrote: > > So the intermediate summary is: > > Any OSD in the cluster can experience interim RocksDB checksum failure. …

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Igor Fedotov
So the intermediate summary is: any OSD in the cluster can experience an interim RocksDB checksum failure, which is no longer present after the OSD restarts. No HW issues observed, and no persistent artifacts (except the OSD log) afterwards. And it looks like the issue is rather specific to the cluster, as no similar…

Re: [ceph-users] OSD up takes 15 minutes after machine restarts

2020-01-19 Thread Igor Fedotov
Hi Samuel, wondering if you have the bluestore_fsck_on_mount option set to true? Can you see high read load on the OSD device(s) during startup? If so, it might be fsck running, which takes that long. Thanks, Igor On 1/19/2020 11:53 AM, huxia...@horebdata.cn wrote: Dear folks, I had a stra…
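A quick way to check for that read load is to watch the OSD's data and DB/WAL devices while the daemon starts; a minimal sketch:

  iostat -xm 5    # run on the OSD host during startup; sustained high reads on the OSD devices point to fsck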

[ceph-users] [ceph-osd ] osd can not boot

2020-01-19 Thread Wei Zhao
Hi, a server was just rebooted and the OSD can't boot. The log is the following:
  -3> 2020-01-19 17:39:25.904673 7f5b8e5e9d80 -1 bluestore(/var/lib/ceph/osd/ceph-44) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xd2acc81f, expected 0x62cf539d, device location [0xaee7c~1000…
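With _verify_csum failures blocking startup, an offline consistency check of the OSD is a reasonable first step; a sketch using the OSD id from the log ('repair' rewrites on-disk metadata, so review the fsck output before running it):

  systemctl stop ceph-osd@44
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-44
  # only if fsck reports repairable errors:
  # ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-44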

[ceph-users] OSD up takes 15 minutes after machine restarts

2020-01-19 Thread huxia...@horebdata.cn
Dear folks, I had a strange situation with a 3-node Ceph cluster on Luminous 12.2.12 with BlueStore. Each machine has 5 OSDs on HDD, and each OSD uses a 30GB DB/WAL partition on SSD. At the beginning, without much data, OSDs came up quickly when one node restarted. Then I ran 4-day-long stress tests…