Yep, this looks fine.

Hmm... sorry, but I'm out of ideas as to what's happening.

Anyway, I think the ceph reports are more trustworthy than the rgw ones. It looks like some issue with rgw reporting, or perhaps some object leakage.
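
If it does turn out to be leaked objects, one possible next step - just a sketch, assuming the data pool is .rgw.buckets and your radosgw-admin still ships the orphans sub-commands - would be to let rgw scan for rados objects that no bucket index references:

# radosgw-admin orphans find --pool=.rgw.buckets --job-id=orphan-scan-1
# radosgw-admin orphans list-jobs
# radosgw-admin orphans finish --job-id=orphan-scan-1

The scan only reports suspected orphans and "finish" just cleans up the job's own bookkeeping; nothing is deleted automatically, so the output can be reviewed before any cleanup.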


Regards,

Igor


On 7/3/2019 6:34 PM, Andrei Mikhailovsky wrote:
Hi Igor.

The numbers are identical, it seems:

    .rgw.buckets   19      15 TiB     78.22       4.3 TiB *8786934*

# cat /root/ceph-rgw.buckets-rados-ls-all |wc -l
*8786934*

Cheers
------------------------------------------------------------------------

    *From: *"Igor Fedotov" <ifedo...@suse.de>
    *To: *"andrei" <and...@arhont.com>
    *Cc: *"ceph-users" <ceph-users@lists.ceph.com>
    *Sent: *Wednesday, 3 July, 2019 13:49:02
    *Subject: *Re: [ceph-users] troubleshooting space usage

    Looks fine - comparing bluestore_allocated vs. bluestore_stored
    shows little difference, so that's not the allocation overhead.

    What about comparing the object counts reported by the ceph and
    radosgw tools?


    Igor.


    On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

        Thanks Igor. Here is a link to the ceph perf data from several OSDs.

        https://paste.ee/p/IzDMy

        In terms of object sizes: we use rgw to back up data from
        various workstations and servers, so the sizes range from a
        few KB to a few GB per individual file.

        Cheers



        ------------------------------------------------------------------------

            *From: *"Igor Fedotov" <ifedo...@suse.de>
            *To: *"andrei" <and...@arhont.com>
            *Cc: *"ceph-users" <ceph-users@lists.ceph.com>
            *Sent: *Wednesday, 3 July, 2019 12:29:33
            *Subject: *Re: [ceph-users] troubleshooting space usage

            Hi Andrei,

            Additionally, I'd like to see performance counter dumps for
            a couple of HDD OSDs (obtained via the 'ceph daemon osd.N
            perf dump' command).
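
            In case it helps, here is a minimal sketch of pulling such a
            dump and trimming it down to the BlueStore space counters
            (assuming jq is available on the OSD node; osd.0 is just an
            example ID):

            # ceph daemon osd.0 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'

            Comparing bluestore_allocated against bluestore_stored shows
            roughly how much space that OSD loses to allocation
            granularity.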

            W.r.t. average object size - I was thinking that you might
            know what objects had been uploaded... If not, then you
            might want to estimate it by using the "rados get" command
            on the pool: retrieve a random set of objects and check
            their sizes. But let's check the performance counters first
            - most probably they will show losses caused by allocation.
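
            A rough sketch of that sampling - it uses "rados stat" to
            read the sizes rather than downloading each object with
            "rados get"; the pool name and the sample size of 100 are
            just assumptions:

            # rados -p .rgw.buckets ls | shuf -n 100 > /tmp/sample
            # while IFS= read -r obj; do rados -p .rgw.buckets stat "$obj"; done < /tmp/sample

            Keep in mind that rgw stripes larger uploads across many
            rados objects (typically in 4 MiB chunks), so the rados-level
            sizes won't map one-to-one onto the file sizes your backup
            clients see.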


            Also, I've just found a similar issue (still unresolved) in
            our internal tracker - but its root cause is definitely
            different from allocation overhead. It looks like some
            orphaned objects in the pool. Could you please compare and
            share the number of objects in the pool reported by "ceph
            (or rados) df detail" and by the radosgw tools?
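
            A sketch of one way to make that comparison (the jq path
            assumes bucket stats report num_objects under
            usage."rgw.main"; buckets without that section are counted
            as 0):

            # rados df | grep rgw.buckets
            # radosgw-admin bucket stats | jq '[.[].usage."rgw.main".num_objects] | map(. // 0) | add'

            If the rados-level object count is noticeably higher than
            the bucket-level sum, that would point at objects no longer
            referenced by any bucket index.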


            Thanks,

            Igor


            On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

                Hi Igor,

                Many thanks for your reply. Here are the details about
                the cluster:

                1. Ceph version - 13.2.5-1xenial (installed from the
                Ceph repository for Ubuntu 16.04)

                2. Main devices for the radosgw pool - HDD. We do use a
                few SSDs for the other pool, but it is not used by radosgw.

                3. We use BlueStore.

                4. Average rgw object size - I have no idea how to
                check that, and couldn't find a simple answer on Google
                either. Could you please let me know how to check it?

                5. Ceph osd df tree - see the output below.

                6. Other useful info on the cluster - see the bucket
                usage sum and "ceph df" output further below.

                # ceph osd df tree
                ID  CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME
                 -1       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   - root uk
                 -5       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -     datacenter ldex
                -11       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -         room ldex-dc3
                -13       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -             row row-a
                 -4       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -                 rack ldex-rack-a5
                 -2        28.04495        -  28 TiB  22 TiB 6.2 TiB 77.96 0.98   -                     host arh-ibstorage1-ib
                  0   hdd   2.73000  0.79999 2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145                         osd.0
                  1   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130                         osd.1
                  2   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152                         osd.2
                  3   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160                         osd.3
                  4   hdd   2.73000  1.00000 2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141                         osd.4
                 32   hdd   5.45999  1.00000 5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306                         osd.32
                 35   hdd   2.73000  1.00000 2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126                         osd.35
                 36   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175                         osd.36
                 37   hdd   2.73000  0.89999 2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160                         osd.37
                  5   ssd   0.74500  1.00000 745 GiB 642 GiB 103 GiB 86.15 1.09  65                         osd.5

                 -3        28.04495        -  28 TiB  24 TiB 4.5 TiB 84.03 1.06   -                     host arh-ibstorage2-ib
                  9   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158                         osd.9
                 10   hdd   2.73000  0.89999 2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169                         osd.10
                 11   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160                         osd.11
                 12   hdd   2.73000  0.84999 2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153                         osd.12
                 13   hdd   2.73000  1.00000 2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169                         osd.13
                 14   hdd   2.73000  1.00000 2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170                         osd.14
                 15   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155                         osd.15
                 16   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 332 GiB 88.26 1.11 178                         osd.16
                 26   hdd   5.45999  1.00000 5.5 TiB 4.4 TiB 1.0 TiB 81.04 1.02 324                         osd.26
                  7   ssd   0.74500  1.00000 745 GiB 607 GiB 138 GiB 81.48 1.03  62                         osd.7

                -15        28.04495        -  28 TiB  22 TiB 6.4 TiB 77.40 0.98   -                     host arh-ibstorage3-ib
                 18   hdd   2.73000  0.95000 2.8 TiB 2.5 TiB 312 GiB 88.96 1.12 156                         osd.18
                 19   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 771 GiB 72.68 0.92 162                         osd.19
                 20   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 733 GiB 74.04 0.93 149                         osd.20
                 21   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 533 GiB 81.12 1.02 155                         osd.21
                 22   hdd   2.73000  1.00000 2.8 TiB 2.1 TiB 692 GiB 75.48 0.95 144                         osd.22
                 23   hdd   2.73000  1.00000 2.8 TiB 1.6 TiB 1.1 TiB 58.43 0.74 130                         osd.23
                 24   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146                         osd.24
                 25   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 886 GiB 68.63 0.87 147                         osd.25
                 31   hdd   5.45999  1.00000 5.5 TiB 4.7 TiB 758 GiB 86.50 1.09 326                         osd.31
                  6   ssd   0.74500  0.89999 744 GiB 640 GiB 104 GiB 86.01 1.09  61                         osd.6

                -17        28.04494        -  28 TiB  22 TiB 6.3 TiB 77.61 0.98   -                     host arh-ibstorage4-ib
                  8   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 909 GiB 67.80 0.86 141                         osd.8
                 17   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 904 GiB 67.99 0.86 144                         osd.17
                 27   hdd   2.73000  1.00000 2.8 TiB 2.1 TiB 654 GiB 76.84 0.97 152                         osd.27
                 28   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 481 GiB 82.98 1.05 153                         osd.28
                 29   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 829 GiB 70.65 0.89 137                         osd.29
                 30   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 762 GiB 73.03 0.92 142                         osd.30
                 33   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 501 GiB 82.25 1.04 166                         osd.33
                 34   hdd   5.45998  1.00000 5.5 TiB 4.5 TiB 968 GiB 82.77 1.04 325                         osd.34
                 39   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 402 GiB 85.77 1.08 162                         osd.39
                 38   ssd   0.74500  1.00000 745 GiB 671 GiB  74 GiB 90.02 1.14  68                         osd.38

                                             TOTAL 113 TiB  90 TiB  23 TiB 79.25
                MIN/MAX VAR: 0.74/1.14  STDDEV: 8.14



                # for i in $(radosgw-admin bucket list | jq -r '.[]'); do radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'; done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
                6.59098
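
                For what it's worth, a variation of the same loop using
                size_kb_actual (if your radosgw-admin reports it next to
                size_kb) would account for rgw's own 4K rounding and rule
                that out as the source of the gap:

                # for i in $(radosgw-admin bucket list | jq -r '.[]'); do radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb_actual'; done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'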


                # ceph df


                GLOBAL:
                    SIZE        AVAIL      RAW USED     %RAW USED
                    113 TiB     23 TiB     90 TiB       79.25

                POOLS:
                    NAME                           ID     USED        %USED     MAX AVAIL     OBJECTS
                    Primary-ubuntu-1               5      27 TiB      87.56     3.9 TiB       7302534
                    .users.uid                     15     6.8 KiB     0         3.9 TiB       39
                    .users                         16     335 B       0         3.9 TiB       20
                    .users.swift                   17     14 B        0         3.9 TiB       1
                    *.rgw.buckets                  19     15 TiB      79.88     3.9 TiB       8787763*
                    .users.email                   22     0 B         0         3.9 TiB       0
                    .log                           24     109 MiB     0         3.9 TiB       102301
                    .rgw.buckets.extra             37     0 B         0         2.6 TiB       0
                    .rgw.root                      44     2.9 KiB     0         2.6 TiB       16
                    .rgw.meta                      45     1.7 MiB     0         2.6 TiB       6249
                    .rgw.control                   46     0 B         0         2.6 TiB       8
                    .rgw.gc                        47     0 B         0         2.6 TiB       32
                    .usage                         52     0 B         0         2.6 TiB       0
                    .intent-log                    53     0 B         0         2.6 TiB       0
                    default.rgw.buckets.non-ec     54     0 B         0         2.6 TiB       0
                    .rgw.buckets.index             55     0 B         0         2.6 TiB       11485
                    .rgw                           56     491 KiB     0         2.6 TiB       1686
                    Primary-ubuntu-1-ssd           57     1.2 TiB     92.39     105 GiB       379516


                I am not too sure the issue relates to BlueStore
                overhead, as I would probably have seen the discrepancy
                in my Primary-ubuntu-1 pool as well. The data usage on
                the Primary-ubuntu-1 pool seems to be consistent with my
                expectations (precise numbers to be verified soon). The
                issue seems to be only with the .rgw.buckets pool, where
                the "ceph df" output shows 15TB of usage while the sum
                of all buckets in that pool shows just over 6.5TB.
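
                One more thing that might be worth checking here - just
                a guess - is whether rgw's garbage collector is still
                holding on to deleted or aborted-multipart data in that
                pool; the pending list should normally be close to empty:

                # radosgw-admin gc list --include-all | jq length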

                Cheers

                Andrei


                
                ------------------------------------------------------------------------

                    *From: *"Igor Fedotov" <ifedo...@suse.de>
                    *To: *"andrei" <and...@arhont.com>, "ceph-users"
                    <ceph-users@lists.ceph.com>
                    *Sent: *Tuesday, 2 July, 2019 10:58:54
                    *Subject: *Re: [ceph-users] troubleshooting space
                    usage

                    Hi Andrei,

                    The most obvious reason is space usage overhead
                    caused by BlueStore allocation granularity: e.g.
                    if bluestore_min_alloc_size is 64K and the average
                    object size is 16K, one will waste 48K per object
                    on average. This is speculation so far, as we lack
                    key information about your cluster (a small sketch
                    of how to verify the allocation size follows the
                    list below):

                    - Ceph version

                    - What are the main devices for OSD: hdd or ssd.

                    - BlueStore or FileStore.

                    - Average RGW object size.
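
                    As a sketch of that allocation-size check (assuming
                    BlueStore and that the OSD admin socket is reachable
                    on its host; osd.0 is just an example):

                    # ceph daemon osd.0 config get bluestore_min_alloc_size_hdd
                    # ceph daemon osd.0 config get bluestore_min_alloc_size_ssd

                    With a 64K allocation unit the waste per object is
                    roughly 64K minus the remainder of the object size
                    divided by 64K, so pools full of small objects suffer
                    the most.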

                    You might also want to collect and share performance
                    counter dumps (ceph daemon osd.N perf dump) and
                    " " reports from a couple of your OSDs.


                    Thanks,

                    Igor


                    On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:

                        Bump!


                        
                        ------------------------------------------------------------------------

                            *From: *"Andrei Mikhailovsky"
                            <and...@arhont.com>
                            *To: *"ceph-users" <ceph-users@lists.ceph.com>
                            *Sent: *Friday, 28 June, 2019 14:54:53
                            *Subject: *[ceph-users] troubleshooting
                            space usage

                            Hi

                            Could someone please explain / show how to
                            troubleshoot space usage in Ceph and how to
                            reclaim unused space?

                            I have a small cluster with 40 OSDs and a
                            replica count of 2, mainly used as a backend
                            for CloudStack as well as the S3 gateway.
                            The used space doesn't make any sense to me,
                            especially in the rgw pool, so I am seeking
                            help.

                            Here is what I found from the client:

                            ceph -s shows:

                                usage:   89 TiB used, 24 TiB / 113 TiB avail

                            ceph df shows:

                                Primary-ubuntu-1       5     27 TiB     90.11     3.0 TiB     7201098
                                Primary-ubuntu-1-ssd   57    1.2 TiB    89.62     143 GiB     359260
                                .rgw.buckets           19    15 TiB     83.73     3.0 TiB     8742222

                            The usage of Primary-ubuntu-1 and
                            Primary-ubuntu-1-ssd is in line with my
                            expectations. However, the .rgw.buckets
                            pool seems to be using far too much. Summing
                            the usage of all rgw buckets gives 6.5TB
                            (looking at the size_kb values from the
                            "radosgw-admin bucket stats" output). I am
                            trying to figure out why .rgw.buckets is
                            using 15TB of space instead of the 6.5TB
                            shown by the bucket usage.

                            Thanks

                            Andrei







_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
