Thanks for the idea. I tried it with 1 thread, and it shredded another OSD.
I've updated the tracker ticket :)

At least non-race-condition bugs are hopefully easier to spot...

I wouldn't just disable the fsck and carry on with the upgrade until the
cause has been rooted out.

-- Jonas


On 29/03/2021 14.34, Dan van der Ster wrote:
> Hi,
> 
> Saw that, looks scary!
> 
> I have no experience with that particular crash, but I was thinking
> that if you have already backfilled the degraded PGs, and can afford
> to try another OSD, you could try:
> 
>     "bluestore_fsck_quick_fix_threads": "1",  # because
> https://github.com/facebook/rocksdb/issues/5068 showed a similar crash
> and the dev said it occurs because WriteBatch is not thread safe.
> 
>     "bluestore_fsck_quick_fix_on_mount": "false", # should disable the
> fsck during upgrade. See https://github.com/ceph/ceph/pull/40198
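>
> As a rough sketch, assuming the options are set in ceph.conf on the OSD
> host before its OSDs are restarted (the [osd] section placement is an
> assumption, not something confirmed in this thread), the two settings
> might look like this. Neither is a confirmed fix for the corruption:
>
> ```ini
> [osd]
> # Assumption: run the quick-fix fsck with a single thread
> bluestore_fsck_quick_fix_threads = 1
> # Assumption: skip the quick-fix fsck when the OSD mounts
> bluestore_fsck_quick_fix_on_mount = false
> ```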
> 
> -- Dan
> 
> On Mon, Mar 29, 2021 at 2:23 PM Jonas Jelten <jel...@in.tum.de> wrote:
>>
>> Hi!
>>
>> After upgrading the MONs and MGRs successfully, I upgraded the first OSD
>> host (Ubuntu Bionic) from 14.2.16 to 15.2.10. The upgrade shredded all
>> OSDs on it by corrupting RocksDB, and they now refuse to boot.
>> RocksDB complains "Corruption: unknown WriteBatch tag".
>>
>> The initial crash/corruption occurred when the automatic fsck ran and
>> committed the changes for a lot of "zombie spanning blobs".
>>
>> Tracker issue with logs: https://tracker.ceph.com/issues/50017
>>
>>
>> Has anyone else encountered this error? I've "suspended" the upgrade for now :)
>>
>> -- Jonas
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
