Thanks for trying to help, Igor. 

> From: "Igor Fedotov" <ifedo...@suse.de>
> To: "Andrei Mikhailovsky" <and...@arhont.com>
> Cc: "ceph-users" <ceph-users@lists.ceph.com>
> Sent: Thursday, 4 July, 2019 12:52:16
> Subject: Re: [ceph-users] troubleshooting space usage

> Yep, this looks fine..

> Hmm... sorry, but I'm out of ideas about what's happening...

> Anyway, I think the ceph reports are more trustworthy than the rgw ones. Looks like some issue with rgw reporting, or maybe some object leakage.
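
> If it does turn out to be leakage, two things that may be worth checking (a rough sketch only - the pool name and job id below are examples, not taken from this cluster):
>
> # objects rgw has deleted but not yet garbage-collected
> radosgw-admin gc list --include-all
>
> # scan the data pool for rados objects no longer referenced by any bucket index
> radosgw-admin orphans find --pool=.rgw.buckets --job-id=orphans1
>
> If the gc list is long, "radosgw-admin gc process" should reclaim that space over time; orphan scan results need reviewing before anything is deleted.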

> Regards,

> Igor

> On 7/3/2019 6:34 PM, Andrei Mikhailovsky wrote:

>> Hi Igor.

>> The numbers are identical it seems:

>> .rgw.buckets 19 15 TiB 78.22 4.3 TiB 8786934

>> # cat /root/ceph-rgw.buckets-rados-ls-all |wc -l
>> 8786934
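
>> For completeness, the rgw-side view of the same count can be taken by summing num_objects across buckets - a sketch along the lines of the size sum further down the thread, assuming the mimic-era "bucket stats" JSON layout:
>>
>> # for i in $(radosgw-admin bucket list | jq -r '.[]'); do \
>>     radosgw-admin bucket stats --bucket=$i | jq '[.usage[].num_objects] | add // 0'; \
>>   done | awk '{ SUM += $1 } END { print SUM }'
>>
>> Bear in mind that large rgw objects are striped across several rados objects, so this figure is normally lower than the rados count; a huge gap beyond that would point at shadow/multipart leftovers or orphaned objects rather than a pure accounting bug.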

>> Cheers

>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>> To: "andrei" <and...@arhont.com>
>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>> Sent: Wednesday, 3 July, 2019 13:49:02
>>> Subject: Re: [ceph-users] troubleshooting space usage

>>> Looks fine - comparing bluestore_allocated vs. bluestore_stored shows only a small difference, so allocation overhead is not the cause.

>>> What about comparing the object counts reported by the ceph and radosgw tools?

>>> Igor.

>>> On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

>>>> Thanks Igor. Here is a link to the ceph perf data for several osds.

>>>> https://paste.ee/p/IzDMy

>>>> In terms of object sizes: we use rgw to back up data from various workstations and servers, so the sizes range from a few KB to a few GB per individual file.

>>>> Cheers

>>>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>>>> To: "andrei" <and...@arhont.com>
>>>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>>>> Sent: Wednesday, 3 July, 2019 12:29:33
>>>>> Subject: Re: [ceph-users] troubleshooting space usage

>>>>> Hi Andrei,

>>>>> Additionally, I'd like to see a performance counter dump from a couple of the HDD OSDs (obtained with the 'ceph daemon osd.N perf dump' command).
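
>>>>> For reference, the two counters of interest can be pulled straight out of that dump - a sketch assuming jq is available on the OSD host and that the counters live under the "bluestore" section, as they do on mimic:
>>>>>
>>>>> # ceph daemon osd.0 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'
>>>>>
>>>>> If allocated is much larger than stored, allocation granularity is eating the space; if the two are close, the overhead lies elsewhere.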

>>>>> W.r.t. the average object size - I was thinking that you might know what objects had been uploaded... If not, then you might want to estimate it by using the "rados get" command on the pool: retrieve some random set of objects and check their sizes. But let's check the performance counters first - most probably they will show losses caused by allocation.
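
>>>>> A couple of rough ways to get at that number (sketches only, using the pool name from this thread):
>>>>>
>>>>> # quick-and-dirty mean: pool bytes used divided by object count from "ceph df"
>>>>> # (15 TiB / 8786934 objects is roughly 1.8 MiB per object, though that already includes anything leaked)
>>>>>
>>>>> # or sample a few hundred objects and average their actual on-disk sizes
>>>>> # rados -p .rgw.buckets ls | shuf -n 200 | while read -r o; do rados -p .rgw.buckets stat "$o"; done | awk '{ S += $NF } END { print S/NR " bytes average" }'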

>>>>> Also, I've just found a similar issue (still unresolved) in our internal tracker - but its root cause is definitely different from allocation overhead. Looks like some orphaned objects in the pool. Could you please compare and share the object counts for the pool as reported by "ceph (or rados) df detail" and by the radosgw tools?

>>>>> Thanks,

>>>>> Igor

>>>>> On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

>>>>>> Hi Igor,

>>>>>> Many thanks for your reply. Here are the details about the cluster:

>>>>>> 1. Ceph version - 13.2.5-1xenial (installed from the Ceph repository for Ubuntu 16.04)

>>>>>> 2. Main devices for the radosgw pool - HDD. We do use a few SSDs for the other pool, but it is not used by radosgw.

>>>>>> 3. we use BlueStore

>>>>>> 4. Average rgw object size - I have no idea how to check that, and couldn't find a simple answer on Google either. Could you please let me know how to check it?

>>>>>> 5. Ceph osd df tree:

>>>>>> 6. Other useful info on the cluster:

>>>>>> # ceph osd df tree
>>>>>> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME

>>>>>> -1 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - root uk
>>>>>> -5 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - datacenter ldex
>>>>>> -11 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - room ldex-dc3
>>>>>> -13 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - row row-a
>>>>>> -4 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - rack ldex-rack-a5
>>>>>> -2 28.04495 - 28 TiB 22 TiB 6.2 TiB 77.96 0.98 - host arh-ibstorage1-ib

>>>>>> 0 hdd 2.73000 0.79999 2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145 osd.0
>>>>>> 1 hdd 2.73000 1.00000 2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130 osd.1
>>>>>> 2 hdd 2.73000 1.00000 2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152 osd.2
>>>>>> 3 hdd 2.73000 1.00000 2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160 osd.3
>>>>>> 4 hdd 2.73000 1.00000 2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141 osd.4
>>>>>> 32 hdd 5.45999 1.00000 5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306 osd.32
>>>>>> 35 hdd 2.73000 1.00000 2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126 osd.35
>>>>>> 36 hdd 2.73000 1.00000 2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175 osd.36
>>>>>> 37 hdd 2.73000 0.89999 2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160 osd.37
>>>>>> 5 ssd 0.74500 1.00000 745 GiB 642 GiB 103 GiB 86.15 1.09 65 osd.5

>>>>>> -3 28.04495 - 28 TiB 24 TiB 4.5 TiB 84.03 1.06 - host arh-ibstorage2-ib
>>>>>> 9 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158 osd.9
>>>>>> 10 hdd 2.73000 0.89999 2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169 osd.10
>>>>>> 11 hdd 2.73000 1.00000 2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160 osd.11
>>>>>> 12 hdd 2.73000 0.84999 2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153 osd.12
>>>>>> 13 hdd 2.73000 1.00000 2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169 osd.13
>>>>>> 14 hdd 2.73000 1.00000 2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170 osd.14
>>>>>> 15 hdd 2.73000 1.00000 2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155 osd.15
>>>>>> 16 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 332 GiB 88.26 1.11 178 osd.16
>>>>>> 26 hdd 5.45999 1.00000 5.5 TiB 4.4 TiB 1.0 TiB 81.04 1.02 324 osd.26
>>>>>> 7 ssd 0.74500 1.00000 745 GiB 607 GiB 138 GiB 81.48 1.03 62 osd.7

>>>>>> -15 28.04495 - 28 TiB 22 TiB 6.4 TiB 77.40 0.98 - host arh-ibstorage3-ib
>>>>>> 18 hdd 2.73000 0.95000 2.8 TiB 2.5 TiB 312 GiB 88.96 1.12 156 osd.18
>>>>>> 19 hdd 2.73000 1.00000 2.8 TiB 2.0 TiB 771 GiB 72.68 0.92 162 osd.19
>>>>>> 20 hdd 2.73000 1.00000 2.8 TiB 2.0 TiB 733 GiB 74.04 0.93 149 osd.20
>>>>>> 21 hdd 2.73000 1.00000 2.8 TiB 2.2 TiB 533 GiB 81.12 1.02 155 osd.21
>>>>>> 22 hdd 2.73000 1.00000 2.8 TiB 2.1 TiB 692 GiB 75.48 0.95 144 osd.22
>>>>>> 23 hdd 2.73000 1.00000 2.8 TiB 1.6 TiB 1.1 TiB 58.43 0.74 130 osd.23
>>>>>> 24 hdd 2.73000 1.00000 2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146 osd.24
>>>>>> 25 hdd 2.73000 1.00000 2.8 TiB 1.9 TiB 886 GiB 68.63 0.87 147 osd.25
>>>>>> 31 hdd 5.45999 1.00000 5.5 TiB 4.7 TiB 758 GiB 86.50 1.09 326 osd.31
>>>>>> 6 ssd 0.74500 0.89999 744 GiB 640 GiB 104 GiB 86.01 1.09 61 osd.6

>>>>>> -17 28.04494 - 28 TiB 22 TiB 6.3 TiB 77.61 0.98 - host arh-ibstorage4-ib
>>>>>> 8 hdd 2.73000 1.00000 2.8 TiB 1.9 TiB 909 GiB 67.80 0.86 141 osd.8
>>>>>> 17 hdd 2.73000 1.00000 2.8 TiB 1.9 TiB 904 GiB 67.99 0.86 144 osd.17
>>>>>> 27 hdd 2.73000 1.00000 2.8 TiB 2.1 TiB 654 GiB 76.84 0.97 152 osd.27
>>>>>> 28 hdd 2.73000 1.00000 2.8 TiB 2.3 TiB 481 GiB 82.98 1.05 153 osd.28
>>>>>> 29 hdd 2.73000 1.00000 2.8 TiB 1.9 TiB 829 GiB 70.65 0.89 137 osd.29
>>>>>> 30 hdd 2.73000 1.00000 2.8 TiB 2.0 TiB 762 GiB 73.03 0.92 142 osd.30
>>>>>> 33 hdd 2.73000 1.00000 2.8 TiB 2.3 TiB 501 GiB 82.25 1.04 166 osd.33
>>>>>> 34 hdd 5.45998 1.00000 5.5 TiB 4.5 TiB 968 GiB 82.77 1.04 325 osd.34
>>>>>> 39 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 402 GiB 85.77 1.08 162 osd.39
>>>>>> 38 ssd 0.74500 1.00000 745 GiB 671 GiB 74 GiB 90.02 1.14 68 osd.38
>>>>>> TOTAL 113 TiB 90 TiB 23 TiB 79.25
>>>>>> MIN/MAX VAR: 0.74/1.14 STDDEV: 8.14

>>>>>> # for i in $(radosgw-admin bucket list | jq -r '.[]'); do \
>>>>>>     radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'; \
>>>>>>   done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
>>>>>> 6.59098
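
>>>>>> One caveat with this one-liner: it sums only the "rgw.main" usage category. A variant that sums every usage category reported by bucket stats, in case anything is being accounted outside rgw.main, would look roughly like this (a sketch, same JSON-layout assumptions as above):
>>>>>>
>>>>>> # for i in $(radosgw-admin bucket list | jq -r '.[]'); do \
>>>>>>     radosgw-admin bucket stats --bucket=$i | jq '[.usage[].size_kb] | add // 0'; \
>>>>>>   done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
>>>>>>
>>>>>> If that figure is still far below 15 TiB, the extra space is not attributed to any bucket at all.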

>>>>>> # ceph df

>>>>>> GLOBAL:
>>>>>> SIZE AVAIL RAW USED %RAW USED
>>>>>> 113 TiB 23 TiB 90 TiB 79.25

>>>>>> POOLS:
>>>>>> NAME ID USED %USED MAX AVAIL OBJECTS
>>>>>> Primary-ubuntu-1 5 27 TiB 87.56 3.9 TiB 7302534
>>>>>> .users.uid 15 6.8 KiB 0 3.9 TiB 39
>>>>>> .users 16 335 B 0 3.9 TiB 20
>>>>>> .users.swift 17 14 B 0 3.9 TiB 1
>>>>>> .rgw.buckets 19 15 TiB 79.88 3.9 TiB 8787763
>>>>>> .users.email 22 0 B 0 3.9 TiB 0
>>>>>> .log 24 109 MiB 0 3.9 TiB 102301
>>>>>> .rgw.buckets.extra 37 0 B 0 2.6 TiB 0
>>>>>> .rgw.root 44 2.9 KiB 0 2.6 TiB 16
>>>>>> .rgw.meta 45 1.7 MiB 0 2.6 TiB 6249
>>>>>> .rgw.control 46 0 B 0 2.6 TiB 8
>>>>>> .rgw.gc 47 0 B 0 2.6 TiB 32
>>>>>> .usage 52 0 B 0 2.6 TiB 0
>>>>>> .intent-log 53 0 B 0 2.6 TiB 0
>>>>>> default.rgw.buckets.non-ec 54 0 B 0 2.6 TiB 0
>>>>>> .rgw.buckets.index 55 0 B 0 2.6 TiB 11485
>>>>>> .rgw 56 491 KiB 0 2.6 TiB 1686
>>>>>> Primary-ubuntu-1-ssd 57 1.2 TiB 92.39 105 GiB 379516

>>>>>> I am not too sure the issue relates to the BlueStore overhead, as I would probably have seen the discrepancy in my Primary-ubuntu-1 pool as well. However, the data usage on the Primary-ubuntu-1 pool seems to be consistent with my expectations (precise numbers to be verified soon). The issue seems to be only with the .rgw.buckets pool, where the "ceph df" output shows 15TB of usage and the sum of all buckets in that pool shows just over 6.5TB.

>>>>>> Cheers

>>>>>> Andrei

>>>>>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>>>>>> To: "andrei" <and...@arhont.com>, "ceph-users" <ceph-users@lists.ceph.com>
>>>>>>> Sent: Tuesday, 2 July, 2019 10:58:54
>>>>>>> Subject: Re: [ceph-users] troubleshooting space usage

>>>>>>> Hi Andrei,

>>>>>>> The most obvious reason is space usage overhead caused by BlueStore allocation granularity: e.g. if bluestore_min_alloc_size is 64K and the average object size is 16K, one will waste 48K per object on average (see the rough arithmetic after the list below). This is speculation so far, as we lack the key information about your cluster:

>>>>>>> - Ceph version

>>>>>>> - What are the main devices for OSD: hdd or ssd.

>>>>>>> - BlueStore or FileStore.

>>>>>>> - average RGW object size.
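
>>>>>>> To illustrate the granularity point with rough numbers (assuming the mimic HDD default of bluestore_min_alloc_size_hdd = 64K, unless overridden): each object's final chunk is rounded up to the next 64K boundary, so a 16K object occupies 64K on disk and wastes 48K; at ~8.8M objects that would be about 8.8e6 * 48 KiB, i.e. roughly 0.4 TiB. Since each object wastes at most one allocation unit, granularity alone can only explain a multi-TiB gap if the pool is dominated by very small objects.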

>>>>>>> You might also want to collect and share performance counter dumps (ceph daemon osd.N perf dump) and "" reports from a couple of your OSDs.

>>>>>>> Thanks,

>>>>>>> Igor

>>>>>>> On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:

>>>>>>>> Bump!

>>>>>>>>> From: "Andrei Mikhailovsky" <and...@arhont.com>
>>>>>>>>> To: "ceph-users" <ceph-users@lists.ceph.com>
>>>>>>>>> Sent: Friday, 28 June, 2019 14:54:53
>>>>>>>>> Subject: [ceph-users] troubleshooting space usage

>>>>>>>>> Hi

>>>>>>>>> Could someone please explain / show how to troubleshoot the space usage in Ceph and how to reclaim the unused space?

>>>>>>>>> I have a small cluster with 40 osds and a replica count of 2, mainly used as a backend for our cloud stack as well as the S3 gateway. The used space doesn't make any sense to me, especially for the rgw pool, so I am seeking help.

>>>>>>>>> Here is what I found from the client:

>>>>>>>>> ceph -s shows:

>>>>>>>>> usage: 89 TiB used, 24 TiB / 113 TiB avail

>>>>>>>>> Ceph df shows:

>>>>>>>>> Primary-ubuntu-1 5 27 TiB 90.11 3.0 TiB 7201098
>>>>>>>>> Primary-ubuntu-1-ssd 57 1.2 TiB 89.62 143 GiB 359260
>>>>>>>>> .rgw.buckets 19 15 TiB 83.73 3.0 TiB 8742222

>>>>>>>>> The usage of Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line with my expectations. However, the .rgw.buckets pool seems to be using way too much. The usage of all rgw buckets adds up to 6.5TB (looking at the size_kb values from "radosgw-admin bucket stats"). I am trying to figure out why .rgw.buckets is using 15TB of space instead of the 6.5TB shown by the bucket usage.

>>>>>>>>> Thanks

>>>>>>>>> Andrei


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
