[ceph-users] Behavior of EC pool when a host goes offline
Hi all,

We have a Ceph cluster (version 12.2.4) that uses EC pools, with 10 hosts for OSDs. The commands used to create the EC pool are:

ceph osd erasure-code-profile set profile_jerasure_4_3_reed_sol_van \
    plugin=jerasure \
    k=4 \
    m=3 \
    technique=reed_sol_van \
    packetsize=2048 \
    crush-device-class=hdd \
    crush-failure-domain=host

ceph osd pool create pool_jerasure_4_3_reed_sol_van 2048 2048 erasure profile_jerasure_4_3_reed_sol_van

Since the EC pool's crush-failure-domain is set to "host", we disabled the network interfaces of some hosts (using the "ifdown" command) to verify the pool's fault tolerance. Here is what we observed:

First, the IO rate (from "rados bench", which we used for benchmarking) drops to 0 immediately when one host goes offline. Second, it takes a long time (around 100 seconds) for Ceph to detect that the OSDs on that host are down. Finally, once Ceph has detected all the offline OSDs, the EC pool behaves normally again and is ready for IO operations.

So here are my questions:

1. Is it normal for the IO rate to drop to 0 immediately even though only one host went offline?
2. How can we make Ceph detect failed OSDs faster?

Thanks for any help.

Best regards,
Majia Xiao
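[For readers reproducing this test, a minimal sketch touching both questions, assuming a Luminous-era (12.2.x) cluster. The option and pool names below are real, but the values shown are illustrative assumptions, not tuning advice:

    # Question 1: check how many shards the pool needs for IO. With
    # k=4/m=3 the pool size is 7; the usual EC default of min_size = k+1
    # prints 5, so one host down still leaves 6 live shards -- enough for
    # IO once the dead OSDs are actually marked down.
    ceph osd pool get pool_jerasure_4_3_reed_sol_van min_size

    # Question 2: the settings that typically govern detection speed.
    # Defaults are roughly 6s (osd_heartbeat_interval), 20s
    # (osd_heartbeat_grace), and 2 (mon_osd_min_down_reporters).
    # Injected at runtime here for testing only; persist them in
    # ceph.conf if they work for you:
    ceph tell osd.* injectargs '--osd_heartbeat_grace 10'
    ceph tell mon.* injectargs '--osd_heartbeat_grace 10'
    ceph tell mon.* injectargs '--mon_osd_min_down_reporters 1'

The stall before detection is consistent with how Ceph behaves generally: ops to PGs with a shard on the dead host block until those OSDs are marked down and the PGs re-peer, and with 2048 PGs over 10 hosts virtually every in-flight op hits such a PG.]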
[ceph-users] Re: EC pool used space high
That seems like it. Thanks a lot Serkan!

On 26 Nov 2019 Tue at 20:08 Serkan Çoban wrote:
> Maybe the following link helps...
> https://www.spinics.net/lists/dev-ceph/msg00795.html
>
> On Tue, Nov 26, 2019 at 6:17 PM Erdem Agaoglu wrote:
> >
> > I thought of that, but it doesn't make much sense. AFAICT min_size
> > should block IO when I lose 3 OSDs, but it shouldn't affect the
> > amount of stored data. Am I missing something?
> >
> > On Tue, Nov 26, 2019 at 6:04 AM Konstantin Shalygin wrote:
> >>
> >> On 11/25/19 6:05 PM, Erdem Agaoglu wrote:
> >>
> >> What I can't find is the 138,509 G difference between
> >> ceph_cluster_total_used_bytes and ceph_pool_stored_raw. This is not
> >> static, BTW; checking the same data historically shows we have about
> >> 1.12x of what we expect. This seems to make our 1.5x EC overhead a
> >> 1.68x overhead in reality. Anyone have any ideas why this is the case?
> >>
> >> Maybe min_size related? Because you are right, 6+3 is 1.50, but 6+3
> >> (+1) is your calculated 1.67.
> >>
> >> k
> >
> > --
> > erdem agaoglu

--
erdem agaoglu
[ceph-users] Re: EC pool used space high
Maybe the following link helps...
https://www.spinics.net/lists/dev-ceph/msg00795.html

On Tue, Nov 26, 2019 at 6:17 PM Erdem Agaoglu wrote:
>
> I thought of that, but it doesn't make much sense. AFAICT min_size should
> block IO when I lose 3 OSDs, but it shouldn't affect the amount of stored
> data. Am I missing something?
>
> On Tue, Nov 26, 2019 at 6:04 AM Konstantin Shalygin wrote:
>>
>> On 11/25/19 6:05 PM, Erdem Agaoglu wrote:
>>
>> What I can't find is the 138,509 G difference between
>> ceph_cluster_total_used_bytes and ceph_pool_stored_raw. This is not
>> static, BTW; checking the same data historically shows we have about
>> 1.12x of what we expect. This seems to make our 1.5x EC overhead a
>> 1.68x overhead in reality. Anyone have any ideas why this is the case?
>>
>> Maybe min_size related? Because you are right, 6+3 is 1.50, but 6+3 (+1)
>> is your calculated 1.67.
>>
>> k
>
> --
> erdem agaoglu
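[In case the link rots: assuming that thread is the bluestore allocation-granularity discussion, the arithmetic fits the numbers reported here. A back-of-the-envelope sketch; the 1 MiB object size is an assumption chosen for illustration, and osd.0 is an arbitrary example daemon (run the command on the host carrying it):

    # Each EC chunk is stored rounded up to bluestore_min_alloc_size
    # (default 65536, i.e. 64 KiB, on hdd in releases of this era):
    ceph daemon osd.0 config get bluestore_min_alloc_size_hdd

    # For a 1 MiB object in a 6+3 pool:
    #   chunk  = 1 MiB / 6 ~= 171 KiB -> rounds up to 3 x 64 KiB = 192 KiB
    #   stored = 9 shards x 192 KiB   = 1728 KiB
    #   1728 KiB / 1024 KiB = 1.6875x -- close to the observed ~1.68x
    # The smaller the average object, the worse the amplification.]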
[ceph-users] [radosgw-admin] Unable to Unlink Bucket From UID
Hi all,

I seem to be running into an issue when attempting to unlink a bucket from a user. This is my output:

user@server ~ $ radosgw-admin bucket unlink --bucket=user_5493/LF-Store --uid=user_5493
failure: 2019-11-26 15:19:48.689 7fda1c2009c0 0 bucket entry point user mismatch, can't unlink bucket: user_5493$BRTC != user_5493
(22) Invalid argument
user@server ~ $

I did some searching around, and no one seems to have seen this before. Any ideas?

Thanks,
Mac
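[The "user_5493$BRTC" in the error looks like RGW's tenant$user syntax, so the bucket's entry point may be owned by user BRTC under tenant user_5493 rather than by user_5493 itself. A hypothetical diagnostic sketch, with the bucket and uid names taken from the output above (single quotes keep the shell from expanding the $):

    # Inspect the bucket's metadata and check its owner field:
    radosgw-admin metadata get bucket:user_5493/LF-Store

    # If the owner is indeed the tenanted user, retry the unlink with it:
    radosgw-admin bucket unlink --bucket=user_5493/LF-Store --uid='user_5493$BRTC'

Whether unlink then succeeds depends on what the owner actually is; the point of the sketch is to confirm the mismatch before retrying.]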