Hi Igor. The numbers are identical, it seems:

.rgw.buckets    19    15 TiB    78.22    4.3 TiB    8786934

# cat /root/ceph-rgw.buckets-rados-ls-all | wc -l
8786934
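A radosgw garbage-collection backlog can also hold deleted data in the pool
for a while, so that is worth ruling out as well. A rough check (a sketch -
gc runs on its own schedule, and the exact output format varies by version):

# count pending gc entries across all shards
radosgw-admin gc list --include-all | grep -c '"oid"'

# trigger a gc pass manually instead of waiting for the next scheduled run
radosgw-admin gc process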
Cheers

> From: "Igor Fedotov" <ifedo...@suse.de>
> To: "andrei" <and...@arhont.com>
> Cc: "ceph-users" <ceph-users@lists.ceph.com>
> Sent: Wednesday, 3 July, 2019 13:49:02
> Subject: Re: [ceph-users] troubleshooting space usage
>
> Looks fine - comparing bluestore_allocated vs. bluestore_stored shows only
> a little difference, so that's not allocation overhead.
>
> What about comparing the object counts reported by the ceph and radosgw
> tools?
>
> Igor.
>
> On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:
>> Thanks Igor. Here is a link to the ceph perf data from several OSDs:
>> https://paste.ee/p/IzDMy
>>
>> In terms of object sizes: we use rgw to back up data from various
>> workstations and servers, so the sizes range from a few KB to a few GB
>> per individual file.
>>
>> Cheers
>>
>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>> To: "andrei" <and...@arhont.com>
>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>> Sent: Wednesday, 3 July, 2019 12:29:33
>>> Subject: Re: [ceph-users] troubleshooting space usage
>>>
>>> Hi Andrei,
>>>
>>> Additionally, I'd like to see a performance counter dump from a couple
>>> of HDD OSDs (obtained with the 'ceph daemon osd.N perf dump' command).
>>>
>>> W.r.t. average object size - I was thinking that you might know what
>>> objects had been uploaded... If not, you can estimate it by using
>>> "rados get" on the pool: retrieve a random set of objects and check
>>> their sizes. But let's check the performance counters first - most
>>> probably they will show losses caused by allocation.
>>>
>>> Also, I've just found a similar (still unresolved) issue in our internal
>>> tracker, but its root cause is definitely different from allocation
>>> overhead - it looks like orphaned objects in the pool. Could you please
>>> compare and share the object counts for the pool reported by "ceph (or
>>> rados) df detail" and by the radosgw tools?
>>>
>>> Thanks,
>>> Igor
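A possible way to sample object sizes without downloading anything, along the
lines of the "rados get" suggestion above - a sketch using "rados stat"
(which prints each object's size in bytes) on a random sample of 100 names;
it assumes object names contain no newlines:

# rados -p .rgw.buckets ls | shuf -n 100 | while IFS= read -r obj; do
      rados -p .rgw.buckets stat "$obj"
  done

Averaging the reported sizes gives a feel for whether allocation granularity
(bluestore_min_alloc_size) could plausibly account for the overhead.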
>>> On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:
>>>> Hi Igor,
>>>>
>>>> Many thanks for your reply. Here are the details about the cluster:
>>>>
>>>> 1. Ceph version - 13.2.5-1xenial (installed from the Ceph repository
>>>>    for Ubuntu 16.04)
>>>> 2. Main devices for the radosgw pool - hdd. We do use a few ssds for
>>>>    the other pool, but it is not used by radosgw.
>>>> 3. We use BlueStore.
>>>> 4. Average rgw object size - I have no idea how to check that, and
>>>>    couldn't find a simple answer on Google either. Could you please
>>>>    let me know how to check it?
>>>> 5. Ceph osd df tree:
>>>> 6. Other useful info on the cluster:
>>>>
>>>> # ceph osd df tree
>>>> ID  CLASS  WEIGHT     REWEIGHT  SIZE     USE      AVAIL    %USE   VAR   PGS  TYPE NAME
>>>> -1         112.17979  -         113 TiB  90 TiB   23 TiB   79.25  1.00  -    root uk
>>>> -5         112.17979  -         113 TiB  90 TiB   23 TiB   79.25  1.00  -    datacenter ldex
>>>> -11        112.17979  -         113 TiB  90 TiB   23 TiB   79.25  1.00  -    room ldex-dc3
>>>> -13        112.17979  -         113 TiB  90 TiB   23 TiB   79.25  1.00  -    row row-a
>>>> -4         112.17979  -         113 TiB  90 TiB   23 TiB   79.25  1.00  -    rack ldex-rack-a5
>>>> -2         28.04495   -         28 TiB   22 TiB   6.2 TiB  77.96  0.98  -    host arh-ibstorage1-ib
>>>> 0   hdd    2.73000    0.79999   2.8 TiB  2.3 TiB  519 GiB  81.61  1.03  145  osd.0
>>>> 1   hdd    2.73000    1.00000   2.8 TiB  1.9 TiB  847 GiB  70.00  0.88  130  osd.1
>>>> 2   hdd    2.73000    1.00000   2.8 TiB  2.2 TiB  561 GiB  80.12  1.01  152  osd.2
>>>> 3   hdd    2.73000    1.00000   2.8 TiB  2.3 TiB  469 GiB  83.41  1.05  160  osd.3
>>>> 4   hdd    2.73000    1.00000   2.8 TiB  1.8 TiB  983 GiB  65.18  0.82  141  osd.4
>>>> 32  hdd    5.45999    1.00000   5.5 TiB  4.4 TiB  1.1 TiB  80.68  1.02  306  osd.32
>>>> 35  hdd    2.73000    1.00000   2.8 TiB  1.7 TiB  1.0 TiB  62.89  0.79  126  osd.35
>>>> 36  hdd    2.73000    1.00000   2.8 TiB  2.3 TiB  464 GiB  83.58  1.05  175  osd.36
>>>> 37  hdd    2.73000    0.89999   2.8 TiB  2.5 TiB  301 GiB  89.34  1.13  160  osd.37
>>>> 5   ssd    0.74500    1.00000   745 GiB  642 GiB  103 GiB  86.15  1.09  65   osd.5
>>>> -3         28.04495   -         28 TiB   24 TiB   4.5 TiB  84.03  1.06  -    host arh-ibstorage2-ib
>>>> 9   hdd    2.73000    0.95000   2.8 TiB  2.4 TiB  405 GiB  85.65  1.08  158  osd.9
>>>> 10  hdd    2.73000    0.89999   2.8 TiB  2.4 TiB  352 GiB  87.52  1.10  169  osd.10
>>>> 11  hdd    2.73000    1.00000   2.8 TiB  2.0 TiB  783 GiB  72.28  0.91  160  osd.11
>>>> 12  hdd    2.73000    0.84999   2.8 TiB  2.4 TiB  359 GiB  87.27  1.10  153  osd.12
>>>> 13  hdd    2.73000    1.00000   2.8 TiB  2.4 TiB  348 GiB  87.69  1.11  169  osd.13
>>>> 14  hdd    2.73000    1.00000   2.8 TiB  2.5 TiB  283 GiB  89.97  1.14  170  osd.14
>>>> 15  hdd    2.73000    1.00000   2.8 TiB  2.2 TiB  560 GiB  80.18  1.01  155  osd.15
>>>> 16  hdd    2.73000    0.95000   2.8 TiB  2.4 TiB  332 GiB  88.26  1.11  178  osd.16
>>>> 26  hdd    5.45999    1.00000   5.5 TiB  4.4 TiB  1.0 TiB  81.04  1.02  324  osd.26
>>>> 7   ssd    0.74500    1.00000   745 GiB  607 GiB  138 GiB  81.48  1.03  62   osd.7
>>>> -15        28.04495   -         28 TiB   22 TiB   6.4 TiB  77.40  0.98  -    host arh-ibstorage3-ib
>>>> 18  hdd    2.73000    0.95000   2.8 TiB  2.5 TiB  312 GiB  88.96  1.12  156  osd.18
>>>> 19  hdd    2.73000    1.00000   2.8 TiB  2.0 TiB  771 GiB  72.68  0.92  162  osd.19
>>>> 20  hdd    2.73000    1.00000   2.8 TiB  2.0 TiB  733 GiB  74.04  0.93  149  osd.20
>>>> 21  hdd    2.73000    1.00000   2.8 TiB  2.2 TiB  533 GiB  81.12  1.02  155  osd.21
>>>> 22  hdd    2.73000    1.00000   2.8 TiB  2.1 TiB  692 GiB  75.48  0.95  144  osd.22
>>>> 23  hdd    2.73000    1.00000   2.8 TiB  1.6 TiB  1.1 TiB  58.43  0.74  130  osd.23
>>>> 24  hdd    2.73000    1.00000   2.8 TiB  2.2 TiB  579 GiB  79.51  1.00  146  osd.24
>>>> 25  hdd    2.73000    1.00000   2.8 TiB  1.9 TiB  886 GiB  68.63  0.87  147  osd.25
>>>> 31  hdd    5.45999    1.00000   5.5 TiB  4.7 TiB  758 GiB  86.50  1.09  326  osd.31
>>>> 6   ssd    0.74500    0.89999   744 GiB  640 GiB  104 GiB  86.01  1.09  61   osd.6
>>>> -17        28.04494   -         28 TiB   22 TiB   6.3 TiB  77.61  0.98  -    host arh-ibstorage4-ib
>>>> 8   hdd    2.73000    1.00000   2.8 TiB  1.9 TiB  909 GiB  67.80  0.86  141  osd.8
>>>> 17  hdd    2.73000    1.00000   2.8 TiB  1.9 TiB  904 GiB  67.99  0.86  144  osd.17
>>>> 27  hdd    2.73000    1.00000   2.8 TiB  2.1 TiB  654 GiB  76.84  0.97  152  osd.27
>>>> 28  hdd    2.73000    1.00000   2.8 TiB  2.3 TiB  481 GiB  82.98  1.05  153  osd.28
>>>> 29  hdd    2.73000    1.00000   2.8 TiB  1.9 TiB  829 GiB  70.65  0.89  137  osd.29
>>>> 30  hdd    2.73000    1.00000   2.8 TiB  2.0 TiB  762 GiB  73.03  0.92  142  osd.30
>>>> 33  hdd    2.73000    1.00000   2.8 TiB  2.3 TiB  501 GiB  82.25  1.04  166  osd.33
>>>> 34  hdd    5.45998    1.00000   5.5 TiB  4.5 TiB  968 GiB  82.77  1.04  325  osd.34
>>>> 39  hdd    2.73000    0.95000   2.8 TiB  2.4 TiB  402 GiB  85.77  1.08  162  osd.39
>>>> 38  ssd    0.74500    1.00000   745 GiB  671 GiB  74 GiB   90.02  1.14  68   osd.38
>>>> TOTAL                           113 TiB  90 TiB   23 TiB   79.25
>>>> MIN/MAX VAR: 0.74/1.14  STDDEV: 8.14
>>>>
>>>> # for i in $(radosgw-admin bucket list | jq -r '.[]'); do
>>>>       radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'
>>>>   done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
>>>> 6.59098
>>>>
>>>> # ceph df
>>>> GLOBAL:
>>>>     SIZE     AVAIL   RAW USED  %RAW USED
>>>>     113 TiB  23 TiB  90 TiB    79.25
>>>> POOLS:
>>>>     NAME                        ID  USED     %USED  MAX AVAIL  OBJECTS
>>>>     Primary-ubuntu-1            5   27 TiB   87.56  3.9 TiB    7302534
>>>>     .users.uid                  15  6.8 KiB  0      3.9 TiB    39
>>>>     .users                      16  335 B    0      3.9 TiB    20
>>>>     .users.swift                17  14 B     0      3.9 TiB    1
>>>>     .rgw.buckets                19  15 TiB   79.88  3.9 TiB    8787763
>>>>     .users.email                22  0 B      0      3.9 TiB    0
>>>>     .log                        24  109 MiB  0      3.9 TiB    102301
>>>>     .rgw.buckets.extra          37  0 B      0      2.6 TiB    0
>>>>     .rgw.root                   44  2.9 KiB  0      2.6 TiB    16
>>>>     .rgw.meta                   45  1.7 MiB  0      2.6 TiB    6249
>>>>     .rgw.control                46  0 B      0      2.6 TiB    8
>>>>     .rgw.gc                     47  0 B      0      2.6 TiB    32
>>>>     .usage                      52  0 B      0      2.6 TiB    0
>>>>     .intent-log                 53  0 B      0      2.6 TiB    0
>>>>     default.rgw.buckets.non-ec  54  0 B      0      2.6 TiB    0
>>>>     .rgw.buckets.index          55  0 B      0      2.6 TiB    11485
>>>>     .rgw                        56  491 KiB  0      2.6 TiB    1686
>>>>     Primary-ubuntu-1-ssd        57  1.2 TiB  92.39  105 GiB    379516
>>>>
>>>> I am not too sure the issue relates to BlueStore overhead, as I would
>>>> probably have seen the discrepancy in my Primary-ubuntu-1 pool as well.
>>>> The data usage on the Primary-ubuntu-1 pool seems consistent with my
>>>> expectations (precise numbers to be verified soon). The issue seems to
>>>> affect only the .rgw.buckets pool, where the "ceph df" output shows
>>>> 15 TB of usage while the sum over all buckets in that pool comes to
>>>> just over 6.5 TB.
>>>>
>>>> Cheers
>>>> Andrei
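To test the orphaned-objects theory, the two object counts can be compared
directly. A sketch, mirroring the size_kb loop above (num_objects is the
per-bucket object count in "radosgw-admin bucket stats"):

# objects actually present in the pool
rados -p .rgw.buckets ls | wc -l

# sum of per-bucket object counts as radosgw sees them
for i in $(radosgw-admin bucket list | jq -r '.[]'); do
    radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .num_objects'
done | awk '{ SUM += $1 } END { print SUM }'

A large gap between the two numbers would point at orphaned rados objects
rather than allocation overhead.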
>>>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>>>> To: "andrei" <and...@arhont.com>, "ceph-users" <ceph-users@lists.ceph.com>
>>>>> Sent: Tuesday, 2 July, 2019 10:58:54
>>>>> Subject: Re: [ceph-users] troubleshooting space usage
>>>>>
>>>>> Hi Andrei,
>>>>>
>>>>> The most obvious reason is space usage overhead caused by BlueStore
>>>>> allocation granularity: e.g. if bluestore_min_alloc_size is 64K and
>>>>> the average object size is 16K, one wastes 48K per object on average.
>>>>> This is rather speculation so far, as we lack key information about
>>>>> your cluster:
>>>>>
>>>>> - Ceph version
>>>>> - What are the main devices for OSD: hdd or ssd
>>>>> - BlueStore or FileStore
>>>>> - average RGW object size
>>>>>
>>>>> You might also want to collect and share performance counter dumps
>>>>> (ceph daemon osd.N perf dump) and " " reports from a couple of your
>>>>> OSDs.
>>>>>
>>>>> Thanks,
>>>>> Igor
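The allocation overhead described here can be estimated on a live OSD. A
sketch (both perf counters are byte values in the "bluestore" section of the
dump):

# effective allocation unit for an HDD-backed OSD
ceph daemon osd.0 config get bluestore_min_alloc_size_hdd

# disk space allocated vs. logical data stored; a large ratio between the
# two indicates allocation-granularity waste
ceph daemon osd.0 perf dump | jq '.bluestore | {allocated: .bluestore_allocated, stored: .bluestore_stored}'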
>>>>> On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:
>>>>>> Bump!
>>>>>>
>>>>>>> From: "Andrei Mikhailovsky" <and...@arhont.com>
>>>>>>> To: "ceph-users" <ceph-users@lists.ceph.com>
>>>>>>> Sent: Friday, 28 June, 2019 14:54:53
>>>>>>> Subject: [ceph-users] troubleshooting space usage
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Could someone please explain / show how to troubleshoot space usage
>>>>>>> in Ceph and how to reclaim unused space?
>>>>>>>
>>>>>>> I have a small cluster with 40 osds, replica of 2, mainly used as a
>>>>>>> backend for CloudStack as well as the S3 gateway. The used space
>>>>>>> doesn't make any sense to me, especially for the rgw pool, so I am
>>>>>>> seeking help. Here is what I found from the client:
>>>>>>>
>>>>>>> "ceph -s" shows the usage: 89 TiB used, 24 TiB / 113 TiB avail
>>>>>>>
>>>>>>> "ceph df" shows:
>>>>>>>
>>>>>>> Primary-ubuntu-1      5   27 TiB   90.11  3.0 TiB  7201098
>>>>>>> Primary-ubuntu-1-ssd  57  1.2 TiB  89.62  143 GiB  359260
>>>>>>> .rgw.buckets          19  15 TiB   83.73  3.0 TiB  8742222
>>>>>>>
>>>>>>> The usage of Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line
>>>>>>> with my expectations. However, the .rgw.buckets pool seems to be
>>>>>>> using far too much. The usage of all rgw buckets comes to 6.5 TB
>>>>>>> (looking at the size_kb values from "radosgw-admin bucket stats"),
>>>>>>> so I am trying to figure out why .rgw.buckets is using 15 TB of
>>>>>>> space instead of the 6.5 TB shown by the bucket usage.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Andrei
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list
>>>>>>> ceph-users@lists.ceph.com
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com