[ceph-users] Behavior of EC pool when a host goes offline

2019-11-26 Thread majia xiao
Hi all,

We have a Ceph (version 12.2.4) cluster that uses EC pools; it consists
of 10 OSD hosts.

The EC pool was created with the following commands:



ceph osd erasure-code-profile set profile_jerasure_4_3_reed_sol_van \
  plugin=jerasure \
  k=4 \
  m=3 \
  technique=reed_sol_van \
  packetsize=2048 \
  crush-device-class=hdd \
  crush-failure-domain=host

ceph osd pool create pool_jerasure_4_3_reed_sol_van 2048 2048 erasure \
  profile_jerasure_4_3_reed_sol_van
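
For reference, the resulting profile and the pool's parameters can be
verified with the commands below (for this profile, "size" should report
k+m = 7):

ceph osd erasure-code-profile get profile_jerasure_4_3_reed_sol_van
ceph osd pool get pool_jerasure_4_3_reed_sol_van size
ceph osd pool get pool_jerasure_4_3_reed_sol_van min_size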



Since the EC pool's crush-failure-domain is set to "host", we disabled
the network interfaces on some hosts (using the "ifdown" command) to
verify the pool's fault tolerance. Here is what we observed:


First of all, the IO rate (as reported by "rados bench", which we used
for benchmarking) drops to 0 immediately when one host goes offline.

Secondly, it takes a long time (around 100 seconds) for Ceph to detect
that the corresponding OSDs on that host are down.

Finally, once Ceph has detected all the offline OSDs, the EC pool seems
to behave normally again and accepts IO operations.

So, here are my questions:

1. Is it normal that the IO rate drops to 0 immediately when only one
host goes offline?
2. How can we reduce the time Ceph needs to detect failed OSDs? (We
sketch the settings we have been looking at below.)
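
In case it helps anyone answer: these are the heartbeat options we have
been looking at. The values shown are, as far as we know, the 12.2.x
defaults; this is only a sketch of what we suspect is relevant, not a
recommendation.

[osd]
# A peer OSD is reported as failed after this many seconds
# without a heartbeat reply (default 20).
osd heartbeat grace = 20

[mon]
# The monitor marks an OSD down once this many distinct
# reporters have flagged it (default 2).
mon osd min down reporters = 2
# When true, the grace period is stretched for OSDs with a
# history of lagginess, which can delay detection further.
mon osd adjust heartbeat grace = true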


Thanks for any help.


Best regards,
Majia Xiao


[ceph-users] Re: EC pool used space high

2019-11-26 Thread Erdem Agaoglu
That seems like it. Thanks a lot Serkan!
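
(For the record, the arithmetic: the nominal overhead of a 6+3 profile is
(k+m)/k = (6+3)/6 = 1.50x, and the "6+3 (+1)" reading below gives
(6+3+1)/6 ~= 1.67x, which matches the ~1.68x we observed.)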

On 26 Nov 2019 Tue at 20:08 Serkan Çoban wrote:

> Maybe following link helps...
> https://www.spinics.net/lists/dev-ceph/msg00795.html
>
> On Tue, Nov 26, 2019 at 6:17 PM Erdem Agaoglu wrote:
> >
> > I thought of that but it doesn't make much sense. AFAICT min_size
> > should block IO when I lose 3 OSDs, but it shouldn't affect the amount
> > of the stored data. Am I missing something?
> >
> > On Tue, Nov 26, 2019 at 6:04 AM Konstantin Shalygin wrote:
> >>
> >> On 11/25/19 6:05 PM, Erdem Agaoglu wrote:
> >>
> >>
> >> What I can't find is the 138,509 G difference between
> >> ceph_cluster_total_used_bytes and ceph_pool_stored_raw. This is not
> >> static, BTW; checking the same data historically shows we have about
> >> 1.12x of what we expect. This seems to make our 1.5x EC overhead a
> >> 1.68x overhead in reality. Anyone have any ideas why this is the case?
> >>
> >> Maybe min_size related? Because you are right, 6+3 is 1.50, but 6+3
> >> (+1) is your calculated 1.67.
> >>
> >>
> >>
> >> k
> >
> >
> >
> > --
> > erdem agaoglu
>
-- 
erdem agaoglu


[ceph-users] [radosgw-admin] Unable to Unlink Bucket From UID

2019-11-26 Thread Mac Wynkoop
Hi all,

I seem to be running into an issue when attempting to unlink a bucket from
a user; this is my output:

user@server ~ $ radosgw-admin bucket unlink --bucket=user_5493/LF-Store --uid=user_5493
failure: 2019-11-26 15:19:48.689 7fda1c2009c0  0 bucket entry point user
mismatch, can't unlink bucket: user_5493$BRTC != user_5493
(22) Invalid argument
user@server ~ $

I did some searching around, and no one seems to have seen this before.
Any ideas?
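
In case it is useful, here is how I was planning to inspect the recorded
owner next; I am not certain the metadata key syntax for tenanted buckets
is right, so treat this as a guess:

radosgw-admin metadata get bucket:user_5493/LF-Store
radosgw-admin bucket stats --bucket=user_5493/LF-Store

If the entry point owner really is "user_5493$BRTC", perhaps the unlink
has to be retried with --uid='user_5493$BRTC'?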

Thanks,

Mac