Re: [ceph-users] troubleshooting space usage

2019-07-04 Thread Andrei Mikhailovsky
Thanks for trying to help, Igor. 

> From: "Igor Fedotov" 
> To: "Andrei Mikhailovsky" 
> Cc: "ceph-users" 
> Sent: Thursday, 4 July, 2019 12:52:16
> Subject: Re: [ceph-users] troubleshooting space usage

> Yep, this looks fine..

> hmm... sorry, but I'm out of ideas what's happening..

> Anyway I think ceph reports are more trustworthy than rgw ones. Looks like
> some issue with rgw reporting or maybe some object leakage.

> Regards,

> Igor

> On 7/3/2019 6:34 PM, Andrei Mikhailovsky wrote:

>> Hi Igor.

>> The numbers are identical it seems:

>> .rgw.buckets 19 15 TiB 78.22 4.3 TiB 8786934

>> # cat /root/ceph-rgw.buckets-rados-ls-all |wc -l
>> 8786934

>> Cheers

>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>> To: "andrei" <and...@arhont.com>
>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>> Sent: Wednesday, 3 July, 2019 13:49:02
>>> Subject: Re: [ceph-users] troubleshooting space usage

>>> Looks fine - comparing bluestore_allocated vs. bluestore_stored shows only a
>>> small difference. So that's not the allocation overhead.

>>> What about comparing object counts reported by ceph and radosgw tools?

>>> Igor.

>>> On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

>>>> Thanks Igor, Here is a link to the ceph perf data on several osds.

>>>> [ https://paste.ee/p/IzDMy | https://paste.ee/p/IzDMy ]

>>>> In terms of the object sizes: we use rgw to back up the data from various
>>>> workstations and servers, so the sizes would range from a few KB to a few
>>>> GB per individual file.

>>>> Cheers

>>>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>>>> To: "andrei" <and...@arhont.com>
>>>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>>>> Sent: Wednesday, 3 July, 2019 12:29:33
>>>>> Subject: Re: [ceph-users] troubleshooting space usage

>>>>> Hi Andrei,

>>>>> Additionally I'd like to see performance counter dumps for a couple of
>>>>> HDD OSDs (obtained through the 'ceph daemon osd.N perf dump' command).

>>>>> W.r.t. average object size - I was thinking that you might know what
>>>>> objects had been uploaded... If not then you might want to estimate it by
>>>>> using the "rados get" command on the pool: retrieve some random object
>>>>> set and check their sizes. But let's check performance counters first -
>>>>> most probably they will show losses caused by allocation.

>>>>> Also I've just found a similar issue (still unresolved) in our internal
>>>>> tracker - but its root cause is definitely different from allocation
>>>>> overhead. Looks like some orphaned objects in the pool. Could you please
>>>>> compare and share the number of objects in the pool reported by "ceph (or
>>>>> rados) df detail" and radosgw tools?

>>>>> Thanks,

>>>>> Igor

>>>>> On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

>>>>>> Hi Igor,

>>>>>> Many thanks for your reply. Here are the details about the cluster:

>>>>>> 1. Ceph version - 13.2.5-1xenial (installed from Ceph repository for
>>>>>> ubuntu 16.04)

>>>>>> 2. main devices for radosgw pool - hdd. we do use a few ssds for the
>>>>>> other pool, but it is not used by radosgw

>>>>>> 3. we use BlueStore

>>>>>> 4. Average rgw object size - I have no idea how to check that. Couldn't
>>>>>> find a simple answer from google either. Could you please let me know
>>>>>> how to check that?

>>>>>> 5. Ceph osd df tree:

>>>>>> 6. Other useful info on the cluster:

>>>>>> # ceph osd df tree
>>>>>> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME

>>>>>> -1 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - root uk
>>>>>> -5 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - datacenter ldex
>>>>>

Re: [ceph-users] troubleshooting space usage

2019-07-04 Thread Igor Fedotov

Yep, this looks fine..

hmm... sorry, but I'm out of ideas what's happening..

Anyway I think ceph reports are more trustworthy than rgw ones. Looks
like some issue with rgw reporting or maybe some object leakage.
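
If leakage is the culprit, one way to hunt for it on a cluster of this vintage
is radosgw-admin's orphan scan - a sketch only, since neither this command nor
its flags are discussed in the thread, and the scan is I/O-heavy (pool name
taken from the thread, job id arbitrary):

    # Find RGW data objects that no bucket index references (Mimic-era tooling;
    # later releases move this to the rgw-orphan-list script).
    radosgw-admin orphans find --pool=.rgw.buckets --job-id=orphan-scan-1
    radosgw-admin orphans list-jobs                        # check scan progress/results
    radosgw-admin orphans finish --job-id=orphan-scan-1    # clean up the scan data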



Regards,

Igor


On 7/3/2019 6:34 PM, Andrei Mikhailovsky wrote:

Hi Igor.

The numbers are identical it seems:

    .rgw.buckets   19      15 TiB     78.22       4.3 TiB     8786934

# cat /root/ceph-rgw.buckets-rados-ls-all |wc -l
8786934

Cheers


From: "Igor Fedotov"
To: "andrei"
Cc: "ceph-users"
Sent: Wednesday, 3 July, 2019 13:49:02
Subject: Re: [ceph-users] troubleshooting space usage

Looks fine - comparing bluestore_allocated vs. bluestore_stored
shows only a small difference. So that's not the allocation overhead.

What about comparing object counts reported by ceph and radosgw
tools?


Igor.


On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

Thanks Igor, Here is a link to the ceph perf data on several osds.

https://paste.ee/p/IzDMy

In terms of the object sizes: we use rgw to back up the data
from various workstations and servers, so the sizes would range
from a few KB to a few GB per individual file.

Cheers





From: "Igor Fedotov"
To: "andrei"
Cc: "ceph-users"
Sent: Wednesday, 3 July, 2019 12:29:33
Subject: Re: [ceph-users] troubleshooting space usage

Hi Andrei,

Additionally I'd like to see performance counter dumps for
a couple of HDD OSDs (obtained through the 'ceph daemon osd.N
perf dump' command).

W.r.t. average object size - I was thinking that you might
know what objects had been uploaded... If not then you
might want to estimate it by using the "rados get" command on
the pool: retrieve some random object set and check their
sizes. But let's check performance counters first - most
probably they will show losses caused by allocation.


Also I've just found a similar issue (still unresolved) in
our internal tracker - but its root cause is definitely
different from allocation overhead. Looks like some
orphaned objects in the pool. Could you please compare and
share the number of objects in the pool reported by "ceph
(or rados) df detail" and radosgw tools?


Thanks,

Igor


On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

Hi Igor,

Many thanks for your reply. Here are the details about
the cluster:

1. Ceph version - 13.2.5-1xenial (installed from Ceph
repository for ubuntu 16.04)

2. main devices for radosgw pool - hdd. we do use a
few ssds for the other pool, but it is not used by radosgw

3. we use BlueStore

4. Average rgw object size - I have no idea how to
check that. Couldn't find a simple answer from google
either. Could you please let me know how to check that?

5. Ceph osd df tree:

6. Other useful info on the cluster:

# ceph osd df tree
ID  CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME

 -1       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   - root uk
 -5       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -   datacenter ldex
-11       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -     room ldex-dc3
-13       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -       row row-a
 -4       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -         rack ldex-rack-a5
 -2        28.04495        -  28 TiB  22 TiB 6.2 TiB 77.96 0.98   -           host arh-ibstorage1-ib

  0   hdd   2.73000  0.7     2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145             osd.0
  1   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130             osd.1
  2   hdd   2.73000  1.0     2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152             osd.2
  3   hdd   2.73000  1.0     2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160             osd.3
  4   hdd   2.73000  1.0     2.8 TiB 1.8 TiB 983 GiB

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
Hi Igor. 

The numbers are identical it seems: 

.rgw.buckets 19 15 TiB 78.22 4.3 TiB 8786934 

# cat /root/ceph-rgw.buckets-rados-ls-all |wc -l 
8786934 

Cheers 

> From: "Igor Fedotov" 
> To: "andrei" 
> Cc: "ceph-users" 
> Sent: Wednesday, 3 July, 2019 13:49:02
> Subject: Re: [ceph-users] troubleshooting space usage

> Looks fine - comparing bluestore_allocated vs. bluestore_stored shows only a
> small difference. So that's not the allocation overhead.

> What about comparing object counts reported by ceph and radosgw tools?

> Igor.

> On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

>> Thanks Igor, Here is a link to the ceph perf data on several osds.

>> [ https://paste.ee/p/IzDMy | https://paste.ee/p/IzDMy ]

>> In terms of the object sizes: we use rgw to back up the data from various
>> workstations and servers, so the sizes would range from a few KB to a few
>> GB per individual file.

>> Cheers

>>> From: "Igor Fedotov" <ifedo...@suse.de>
>>> To: "andrei" <and...@arhont.com>
>>> Cc: "ceph-users" <ceph-users@lists.ceph.com>
>>> Sent: Wednesday, 3 July, 2019 12:29:33
>>> Subject: Re: [ceph-users] troubleshooting space usage

>>> Hi Andrei,

>>> Additionally I'd like to see performance counter dumps for a couple of HDD
>>> OSDs (obtained through the 'ceph daemon osd.N perf dump' command).

>>> W.r.t. average object size - I was thinking that you might know what objects
>>> had been uploaded... If not then you might want to estimate it by using the
>>> "rados get" command on the pool: retrieve some random object set and check
>>> their sizes. But let's check performance counters first - most probably they
>>> will show losses caused by allocation.

>>> Also I've just found a similar issue (still unresolved) in our internal
>>> tracker - but its root cause is definitely different from allocation
>>> overhead. Looks like some orphaned objects in the pool. Could you please
>>> compare and share the number of objects in the pool reported by "ceph (or
>>> rados) df detail" and radosgw tools?

>>> Thanks,

>>> Igor

>>> On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

>>>> Hi Igor,

>>>> Many thanks for your reply. Here are the details about the cluster:

>>>> 1. Ceph version - 13.2.5-1xenial (installed from Ceph repository for
>>>> ubuntu 16.04)

>>>> 2. main devices for radosgw pool - hdd. we do use a few ssds for the
>>>> other pool, but it is not used by radosgw

>>>> 3. we use BlueStore

>>>> 4. Average rgw object size - I have no idea how to check that. Couldn't
>>>> find a simple answer from google either. Could you please let me know
>>>> how to check that?

>>>> 5. Ceph osd df tree:

>>>> 6. Other useful info on the cluster:

>>>> # ceph osd df tree
>>>> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME

>>>> -1 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - root uk
>>>> -5 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - datacenter ldex
>>>> -11 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - room ldex-dc3
>>>> -13 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - row row-a
>>>> -4 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - rack ldex-rack-a5
>>>> -2 28.04495 - 28 TiB 22 TiB 6.2 TiB 77.96 0.98 - host arh-ibstorage1-ib

>>>> 0 hdd 2.73000 0.7 2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145 osd.0
>>>> 1 hdd 2.73000 1.0 2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130 osd.1
>>>> 2 hdd 2.73000 1.0 2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152 osd.2
>>>> 3 hdd 2.73000 1.0 2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160 osd.3
>>>> 4 hdd 2.73000 1.0 2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141 osd.4
>>>> 32 hdd 5.45999 1.0 5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306 osd.32
>>>> 35 hdd 2.73000 1.0 2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126 osd.35
>>>> 36 hdd 2.73000 1.0 2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175 osd.36
>>>> 37 hdd 2.73000 0.8 2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160 osd.37
>>>> 5 ssd 0.74500 1.0 745 GiB 642 GiB 103 GiB 86.15 1.09 65 osd.5

>>>> -3 28.04495 - 28 TiB 24 TiB 4.5 TiB 84.03 1.06 - host arh-ibstorage2-ib
>>>> 9 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158 osd.9

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Igor Fedotov
Looks fine - comparing bluestore_allocated vs. bluestore_stored shows only a
small difference. So that's not the allocation overhead.
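
A quick way to pull just those two counters from a running OSD - a sketch,
assuming they sit under the "bluestore" section of the perf dump as on Mimic,
and using osd.0 only as an example:

    # Run on the host that owns the OSD; compares bytes allocated on disk
    # with bytes of user data stored.
    ceph daemon osd.0 perf dump | \
        jq '.bluestore | {bluestore_allocated, bluestore_stored}'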


What about comparing object counts reported by ceph and radosgw tools?
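
For reference, a minimal sketch of that comparison (pool name from the thread;
the num_objects field name is assumed from standard radosgw-admin output):

    # Objects in the data pool as RADOS sees them
    rados -p .rgw.buckets ls | wc -l

    # Per-pool object count as ceph reports it
    rados df | grep '\.rgw\.buckets '

    # Objects RGW believes it stores (sum of rgw.main entries across buckets)
    radosgw-admin bucket stats | jq '[.[].usage."rgw.main".num_objects // 0] | add'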


Igor.


On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:

Thanks Igor, Here is a link to the ceph perf data on several osds.

https://paste.ee/p/IzDMy

In terms of the object sizes: we use rgw to back up the data from
various workstations and servers, so the sizes would range from a few
KB to a few GB per individual file.


Cheers





From: "Igor Fedotov"
To: "andrei"
Cc: "ceph-users"
Sent: Wednesday, 3 July, 2019 12:29:33
Subject: Re: [ceph-users] troubleshooting space usage

Hi Andrei,

Additionally I'd like to see performance counter dumps for a
couple of HDD OSDs (obtained through the 'ceph daemon osd.N perf
dump' command).

W.r.t. average object size - I was thinking that you might know
what objects had been uploaded... If not then you might want to
estimate it by using the "rados get" command on the pool: retrieve
some random object set and check their sizes. But let's check
performance counters first - most probably they will show losses
caused by allocation.
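
A small sketch of that sampling which avoids downloading any data by using
"rados stat" instead of "rados get" (pool name from the thread, sample size
arbitrary):

    # Print the sizes of ~200 randomly picked objects from the bucket data pool
    rados -p .rgw.buckets ls | shuf -n 200 | while IFS= read -r obj; do
        rados -p .rgw.buckets stat "$obj"    # last field of the output is the size in bytes
    done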


Also I've just found a similar issue (still unresolved) in our
internal tracker - but its root cause is definitely different from
allocation overhead. Looks like some orphaned objects in the pool.
Could you please compare and share the number of objects in the
pool reported by "ceph (or rados) df detail" and radosgw tools?


Thanks,

Igor


On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

Hi Igor,

Many thanks for your reply. Here are the details about the
cluster:

1. Ceph version - 13.2.5-1xenial (installed from Ceph
repository for ubuntu 16.04)

2. main devices for radosgw pool - hdd. we do use a few ssds
for the other pool, but it is not used by radosgw

3. we use BlueStore

4. Average rgw object size - I have no idea how to check that.
Couldn't find a simple answer from google either. Could you
please let me know how to check that?

5. Ceph osd df tree:

6. Other useful info on the cluster:

# ceph osd df tree
ID  CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME

 -1       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   - root uk
 -5       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -   datacenter ldex
-11       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -     room ldex-dc3
-13       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -       row row-a
 -4       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -         rack ldex-rack-a5
 -2        28.04495        -  28 TiB  22 TiB 6.2 TiB 77.96 0.98   -           host arh-ibstorage1-ib
  0   hdd   2.73000  0.7     2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145             osd.0
  1   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130             osd.1
  2   hdd   2.73000  1.0     2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152             osd.2
  3   hdd   2.73000  1.0     2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160             osd.3
  4   hdd   2.73000  1.0     2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141             osd.4
 32   hdd   5.45999  1.0     5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306             osd.32
 35   hdd   2.73000  1.0     2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126             osd.35
 36   hdd   2.73000  1.0     2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175             osd.36
 37   hdd   2.73000  0.8     2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160             osd.37
  5   ssd   0.74500  1.0     745 GiB 642 GiB 103 GiB 86.15 1.09  65             osd.5

 -3        28.04495        -  28 TiB  24 TiB 4.5 TiB 84.03 1.06   -           host arh-ibstorage2-ib
  9   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158             osd.9
 10   hdd   2.73000  0.8     2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169             osd.10
 11   hdd   2.73000  1.0     2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160             osd.11
 12   hdd   2.73000  0.84999 2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153             osd.12
 13   hdd   2.73000  1.0     2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169             osd.13
 14   hdd   2.73000  1.0     2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170             osd.14
 15   hdd   2.73000  1.0     2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155             osd.15
 16   hdd

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
Thanks Igor, Here is a link to the ceph perf data on several osds. 

https://paste.ee/p/IzDMy 

In terms of the object sizes: we use rgw to back up the data from various
workstations and servers, so the sizes would range from a few KB to a few
GB per individual file.

Cheers 

> From: "Igor Fedotov" 
> To: "andrei" 
> Cc: "ceph-users" 
> Sent: Wednesday, 3 July, 2019 12:29:33
> Subject: Re: [ceph-users] troubleshooting space usage

> Hi Andrei,

> Additionally I'd like to see performance counter dumps for a couple of HDD
> OSDs (obtained through the 'ceph daemon osd.N perf dump' command).

> W.r.t. average object size - I was thinking that you might know what objects
> had been uploaded... If not then you might want to estimate it by using the
> "rados get" command on the pool: retrieve some random object set and check
> their sizes. But let's check performance counters first - most probably they
> will show losses caused by allocation.

> Also I've just found a similar issue (still unresolved) in our internal
> tracker - but its root cause is definitely different from allocation
> overhead. Looks like some orphaned objects in the pool. Could you please
> compare and share the number of objects in the pool reported by "ceph (or
> rados) df detail" and radosgw tools?

> Thanks,

> Igor

> On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

>> Hi Igor,

>> Many thanks for your reply. Here are the details about the cluster:

>> 1. Ceph version - 13.2.5-1xenial (installed from Ceph repository for
>> ubuntu 16.04)

>> 2. main devices for radosgw pool - hdd. we do use a few ssds for the
>> other pool, but it is not used by radosgw

>> 3. we use BlueStore

>> 4. Average rgw object size - I have no idea how to check that. Couldn't
>> find a simple answer from google either. Could you please let me know how
>> to check that?

>> 5. Ceph osd df tree:

>> 6. Other useful info on the cluster:

>> # ceph osd df tree
>> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME

>> -1 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - root uk
>> -5 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - datacenter ldex
>> -11 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - room ldex-dc3
>> -13 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - row row-a
>> -4 112.17979 - 113 TiB 90 TiB 23 TiB 79.25 1.00 - rack ldex-rack-a5
>> -2 28.04495 - 28 TiB 22 TiB 6.2 TiB 77.96 0.98 - host arh-ibstorage1-ib

>> 0 hdd 2.73000 0.7 2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145 osd.0
>> 1 hdd 2.73000 1.0 2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130 osd.1
>> 2 hdd 2.73000 1.0 2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152 osd.2
>> 3 hdd 2.73000 1.0 2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160 osd.3
>> 4 hdd 2.73000 1.0 2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141 osd.4
>> 32 hdd 5.45999 1.0 5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306 osd.32
>> 35 hdd 2.73000 1.0 2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126 osd.35
>> 36 hdd 2.73000 1.0 2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175 osd.36
>> 37 hdd 2.73000 0.8 2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160 osd.37
>> 5 ssd 0.74500 1.0 745 GiB 642 GiB 103 GiB 86.15 1.09 65 osd.5

>> -3 28.04495 - 28 TiB 24 TiB 4.5 TiB 84.03 1.06 - host arh-ibstorage2-ib
>> 9 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158 osd.9
>> 10 hdd 2.73000 0.8 2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169 osd.10
>> 11 hdd 2.73000 1.0 2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160 osd.11
>> 12 hdd 2.73000 0.84999 2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153 osd.12
>> 13 hdd 2.73000 1.0 2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169 osd.13
>> 14 hdd 2.73000 1.0 2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170 osd.14
>> 15 hdd 2.73000 1.0 2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155 osd.15
>> 16 hdd 2.73000 0.95000 2.8 TiB 2.4 TiB 332 GiB 88.26 1.11 178 osd.16
>> 26 hdd 5.45999 1.0 5.5 TiB 4.4 TiB 1.0 TiB 81.04 1.02 324 osd.26
>> 7 ssd 0.74500 1.0 745 GiB 607 GiB 138 GiB 81.48 1.03 62 osd.7

>> -15 28.04495 - 28 TiB 22 TiB 6.4 TiB 77.40 0.98 - host arh-ibstorage3-ib
>> 18 hdd 2.73000 0.95000 2.8 TiB 2.5 TiB 312 GiB 88.96 1.12 156 osd.18
>> 19 hdd 2.73000 1.0 2.8 TiB 2.0 TiB 771 GiB 72.68 0.92 162 osd.19
>> 20 hdd 2.73000 1.0 2.8 TiB 2.0 TiB 733 GiB 74.04 0.93 149 osd.20
>> 21 hdd 2.73000 1.0 2.8 TiB 2.2 TiB 533 GiB 81.12 1.02 155 osd.21
>> 22 hdd 2.73000 1.0 2.8 TiB 2.1 TiB 692 GiB 75.48 0.95 144 osd.22
>> 23 hdd 2.73000 1.0 2.8 TiB 1.6 TiB 1.1 TiB 58.43 0.74 130 osd.23
>> 24 hdd 2.73000 1.0 2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146 osd.24
>> 25 hdd 2.73000 1

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Igor Fedotov
 24   hdd   2.73000  1.0     2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146   osd.24
 25   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 886 GiB 68.63 0.87 147   osd.25
 31   hdd   5.45999  1.0     5.5 TiB 4.7 TiB 758 GiB 86.50 1.09 326   osd.31
  6   ssd   0.74500  0.8     744 GiB 640 GiB 104 GiB 86.01 1.09  61   osd.6

-17        28.04494        -  28 TiB  22 TiB 6.3 TiB 77.61 0.98   -   host arh-ibstorage4-ib
  8   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 909 GiB 67.80 0.86 141   osd.8
 17   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 904 GiB 67.99 0.86 144   osd.17
 27   hdd   2.73000  1.0     2.8 TiB 2.1 TiB 654 GiB 76.84 0.97 152   osd.27
 28   hdd   2.73000  1.0     2.8 TiB 2.3 TiB 481 GiB 82.98 1.05 153   osd.28
 29   hdd   2.73000  1.0     2.8 TiB 1.9 TiB 829 GiB 70.65 0.89 137   osd.29
 30   hdd   2.73000  1.0     2.8 TiB 2.0 TiB 762 GiB 73.03 0.92 142   osd.30
 33   hdd   2.73000  1.0     2.8 TiB 2.3 TiB 501 GiB 82.25 1.04 166   osd.33
 34   hdd   5.45998  1.0     5.5 TiB 4.5 TiB 968 GiB 82.77 1.04 325   osd.34
 39   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 402 GiB 85.77 1.08 162   osd.39
 38   ssd   0.74500  1.0     745 GiB 671 GiB  74 GiB 90.02 1.14  68   osd.38

                       TOTAL 113 TiB  90 TiB  23 TiB 79.25
MIN/MAX VAR: 0.74/1.14  STDDEV: 8.14



# for i in $(radosgw-admin bucket list | jq -r '.[]'); do
      radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'
  done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'

6.59098


# ceph df

GLOBAL:
    SIZE        AVAIL      RAW USED     %RAW USED
    113 TiB     23 TiB       90 TiB         79.25

POOLS:
    NAME                           ID     USED        %USED     MAX AVAIL     OBJECTS
    Primary-ubuntu-1               5       27 TiB     87.56       3.9 TiB     7302534
    .users.uid                     15     6.8 KiB         0       3.9 TiB          39
    .users                         16       335 B         0       3.9 TiB          20
    .users.swift                   17        14 B         0       3.9 TiB           1
    .rgw.buckets                   19      15 TiB     79.88       3.9 TiB     8787763
    .users.email                   22         0 B         0       3.9 TiB           0
    .log                           24     109 MiB         0       3.9 TiB      102301
    .rgw.buckets.extra             37         0 B         0       2.6 TiB           0
    .rgw.root                      44     2.9 KiB         0       2.6 TiB          16
    .rgw.meta                      45     1.7 MiB         0       2.6 TiB        6249
    .rgw.control                   46         0 B         0       2.6 TiB           8
    .rgw.gc                        47         0 B         0       2.6 TiB          32
    .usage                         52         0 B         0       2.6 TiB           0
    .intent-log                    53         0 B         0       2.6 TiB           0
    default.rgw.buckets.non-ec     54         0 B         0       2.6 TiB           0
    .rgw.buckets.index             55         0 B         0       2.6 TiB       11485
    .rgw                           56     491 KiB         0       2.6 TiB        1686
    Primary-ubuntu-1-ssd           57     1.2 TiB     92.39       105 GiB      379516



I am not too sure the issue relates to BlueStore overhead, as I
would probably have seen the discrepancy in my Primary-ubuntu-1 pool
as well. However, the data usage on the Primary-ubuntu-1 pool seems to
be consistent with my expectations (precise numbers to be verified soon).
The issue seems to be only with the .rgw.buckets pool, where the "ceph
df" output shows 15TB of usage and the sum of all buckets in that
pool shows just over 6.5TB.


Cheers

Andrei




From: "Igor Fedotov"
To: "andrei", "ceph-users"
Sent: Tuesday, 2 July, 2019 10:58:54
Subject: Re: [ceph-users] troubleshooting space usage

Hi Andrei,

The most obvious reason is space usage overhead caused by
BlueStore allocation granularity, e.g. if bluestore_min_alloc_size
is 64K and the average object size is 16K, one will waste 48K per
object on average. This is rather a speculation so far, as we lack
the key information about your cluster:

- Ceph version

- What are the main devices for OSD: hdd or ssd.

- BlueStore or FileStore.

- average RGW object size.

You might also want to collect and share performance counter dumps
(ceph daemon osd.N perf dump) and "ceph osd df tree" reports from a
couple of your OSDs.


Thanks,

Igor


On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:

Bump!


---

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
.6 TiB 0 
.rgw.buckets.index 55 0 B 0 2.6 TiB 11485 
.rgw 56 491 KiB 0 2.6 TiB 1686 
Primary-ubuntu-1-ssd 57 1.2 TiB 92.39 105 GiB 379516 

I am not too sure the issue relates to BlueStore overhead, as I would
probably have seen the discrepancy in my Primary-ubuntu-1 pool as well.
However, the data usage on the Primary-ubuntu-1 pool seems to be consistent with
my expectations (precise numbers to be verified soon). The issue seems to be only
with the .rgw.buckets pool, where the "ceph df" output shows 15TB of usage and
the sum of all buckets in that pool shows just over 6.5TB.

Cheers 

Andrei 

> From: "Igor Fedotov" 
> To: "andrei" , "ceph-users" 
> Sent: Tuesday, 2 July, 2019 10:58:54
> Subject: Re: [ceph-users] troubleshooting space usage

> Hi Andrei,

> The most obvious reason is space usage overhead caused by BlueStore allocation
> granularity, e.g. if bluestore_min_alloc_size is 64K and the average object
> size is 16K, one will waste 48K per object on average. This is rather a
> speculation so far, as we lack the key information about your cluster:

> - Ceph version

> - What are the main devices for OSD: hdd or ssd.

> - BlueStore or FileStore.

> - average RGW object size.

> You might also want to collect and share performance counter dumps (ceph
> daemon osd.N perf dump) and "ceph osd df tree" reports from a couple of
> your OSDs.

> Thanks,

> Igor

> On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:

>> Bump!

>>> From: "Andrei Mikhailovsky" <and...@arhont.com>
>>> To: "ceph-users" <ceph-users@lists.ceph.com>
>>> Sent: Friday, 28 June, 2019 14:54:53
>>> Subject: [ceph-users] troubleshooting space usage

>>> Hi

>>> Could someone please explain / show how to troubleshoot the space usage in
>>> Ceph and how to reclaim the unused space?

>>> I have a small cluster with 40 osds, replica of 2, mainly used as a backend
>>> for cloud stack as well as the S3 gateway. The used space doesn't make any
>>> sense to me, especially the rgw pool, so I am seeking help.

>>> Here is what I found from the client:

>>> Ceph -s shows the

>>> usage: 89 TiB used, 24 TiB / 113 TiB avail

>>> Ceph df shows:

>>> Primary-ubuntu-1 5 27 TiB 90.11 3.0 TiB 7201098
>>> Primary-ubuntu-1-ssd 57 1.2 TiB 89.62 143 GiB 359260
>>> .rgw.buckets 19 15 TiB 83.73 3.0 TiB 874

>>> the usage of the Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line with
>>> my expectations. However, the .rgw.buckets pool seems to be using way too
>>> much. The usage of all rgw buckets shows 6.5TB usage (looking at the size_kb
>>> values from the "radosgw-admin bucket stats"). I am trying to figure out why
>>> .rgw.buckets is using 15TB of space instead of the 6.5TB as shown from the
>>> bucket usage.

>>> Thanks

>>> Andrei

>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] troubleshooting space usage

2019-07-02 Thread Igor Fedotov

Hi Andrei,

The most obvious reason is space usage overhead caused by BlueStore
allocation granularity, e.g. if bluestore_min_alloc_size is 64K and the
average object size is 16K, one will waste 48K per object on average.
This is rather a speculation so far, as we lack the key information about
your cluster:


- Ceph version

- What are the main devices for OSD: hdd or ssd.

- BlueStore or FileStore.

- average RGW object size.
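
To make the allocation-overhead arithmetic above concrete, a back-of-the-envelope
sketch (the option name and its 64K HDD default are assumptions about a
BlueStore/Mimic setup, and the 16K average is only the illustrative figure used
above, not a measured value):

    # What allocation unit is an HDD-backed BlueStore OSD actually using?
    ceph daemon osd.0 config get bluestore_min_alloc_size_hdd

    # Worst-case padding if every object really were 16K:
    # (64K - 16K) per object, times the ~8.79M objects in .rgw.buckets
    echo $(( (65536 - 16384) * 8786934 / 1024 / 1024 / 1024 )) GiB   # ~402 GiB per copy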

You might also want to collect and share performance counter dumps (ceph 
daemon osd.N perf dump) and "ceph osd df tree" reports from a couple of 
your OSDs.
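
A minimal way to gather both of those (OSD ids are placeholders; the perf dump
has to be taken on the host that owns each OSD, the df tree on any admin node):

    for id in 0 1; do
        ceph daemon osd.$id perf dump > /tmp/osd.$id-perf.json   # one JSON dump per OSD
    done
    ceph osd df tree > /tmp/ceph-osd-df-tree.txt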



Thanks,

Igor


On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:

Bump!




From: "Andrei Mikhailovsky"
To: "ceph-users"
Sent: Friday, 28 June, 2019 14:54:53
Subject: [ceph-users] troubleshooting space usage

Hi

Could someone please explain / show how to troubleshoot the space
usage in Ceph and how to reclaim the unused space?

I have a small cluster with 40 osds, replica of 2, mainly used as
a backend for cloud stack as well as the S3 gateway. The used
space doesn't make any sense to me, especially the rgw pool, so I
am seeking help.

Here is what I found from the client:

Ceph -s shows the

 usage:   89 TiB used, 24 TiB / 113 TiB avail

Ceph df shows:

Primary-ubuntu-1               5       27 TiB     90.11       3.0 TiB     7201098
Primary-ubuntu-1-ssd           57     1.2 TiB     89.62       143 GiB      359260
.rgw.buckets                   19      15 TiB     83.73       3.0 TiB         874

the usage of the Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in
line with my expectations. However, the .rgw.buckets pool seems to
be using way too much. The usage of all rgw buckets shows 6.5TB
usage (looking at the size_kb values from the "radosgw-admin
bucket stats"). I am trying to figure out why .rgw.buckets is
using 15TB of space instead of the 6.5TB as shown from the bucket
usage.

Thanks

Andrei

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] troubleshooting space usage

2019-07-02 Thread Andrei Mikhailovsky
Bump! 

> From: "Andrei Mikhailovsky" 
> To: "ceph-users" 
> Sent: Friday, 28 June, 2019 14:54:53
> Subject: [ceph-users] troubleshooting space usage

> Hi

> Could someone please explain / show how to troubleshoot the space usage in
> Ceph and how to reclaim the unused space?

> I have a small cluster with 40 osds, replica of 2, mainly used as a backend
> for cloud stack as well as the S3 gateway. The used space doesn't make any
> sense to me, especially the rgw pool, so I am seeking help.

> Here is what I found from the client:

> Ceph -s shows the

> usage: 89 TiB used, 24 TiB / 113 TiB avail

> Ceph df shows:

> Primary-ubuntu-1 5 27 TiB 90.11 3.0 TiB 7201098
> Primary-ubuntu-1-ssd 57 1.2 TiB 89.62 143 GiB 359260
> .rgw.buckets 19 15 TiB 83.73 3.0 TiB 874

> the usage of the Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line with my
> expectations. However, the .rgw.buckets pool seems to be using way too much.
> The usage of all rgw buckets shows 6.5TB usage (looking at the size_kb values
> from the "radosgw-admin bucket stats"). I am trying to figure out why
> .rgw.buckets is using 15TB of space instead of the 6.5TB as shown from the
> bucket usage.

> Thanks

> Andrei

> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] troubleshooting space usage

2019-06-28 Thread Andrei Mikhailovsky
Hi 

Could someone please explain / show how to troubleshoot the space usage in Ceph 
and how to reclaim the unused space? 

I have a small cluster with 40 osds, replica of 2, mainly used as a backend for 
cloud stack as well as the S3 gateway. The used space doesn't make any sense to 
me, especially the rgw pool, so I am seeking help. 

Here is what I found from the client: 

Ceph -s shows the 

usage: 89 TiB used, 24 TiB / 113 TiB avail 

Ceph df shows: 

Primary-ubuntu-1 5 27 TiB 90.11 3.0 TiB 7201098 
Primary-ubuntu-1-ssd 57 1.2 TiB 89.62 143 GiB 359260 
.rgw.buckets 19 15 TiB 83.73 3.0 TiB 874 

the usage of the Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line with my 
expectations. However, the .rgw.buckets pool seems to be using way too much. 
The usage of all rgw buckets shows 6.5TB usage (looking at the size_kb values 
from the "radosgw-admin bucket stats"). I am trying to figure out why 
.rgw.buckets is using 15TB of space instead of the 6.5TB as shown from the 
bucket usage. 
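
For reference, a compact way to produce that 6.5TB figure - a sketch equivalent
to the jq/awk pipeline quoted elsewhere in the thread, assuming "radosgw-admin
bucket stats" emits a JSON array with usage."rgw.main".size_kb per bucket:

    # Sum the per-bucket size_kb values and print the total in TiB
    radosgw-admin bucket stats | \
        jq '[.[].usage."rgw.main".size_kb // 0] | add / 1024 / 1024 / 1024'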

Thanks 

Andrei 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com