Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-12 Thread Wido den Hollander



On 6/11/19 9:48 PM, J. Eric Ivancich wrote:
> Hi Wido,
> 
> Interleaving below
> 
> On 6/11/19 3:10 AM, Wido den Hollander wrote:
>>
>> I thought it was resolved, but it isn't.
>>
>> I counted all the OMAP values for the GC objects and I got back:
>>
>> gc.0: 0
>> gc.11: 0
>> gc.14: 0
>> gc.15: 0
>> gc.16: 0
>> gc.18: 0
>> gc.19: 0
>> gc.1: 0
>> gc.20: 0
>> gc.21: 0
>> gc.22: 0
>> gc.23: 0
>> gc.24: 0
>> gc.25: 0
>> gc.27: 0
>> gc.29: 0
>> gc.2: 0
>> gc.30: 0
>> gc.3: 0
>> gc.4: 0
>> gc.5: 0
>> gc.6: 0
>> gc.7: 0
>> gc.8: 0
>> gc.9: 0
>> gc.13: 110996
>> gc.10: 04
>> gc.26: 42
>> gc.28: 111292
>> gc.17: 111314
>> gc.12: 111534
>> gc.31: 111956
> 
> Casey Bodley mentioned to me that he's seen similar behavior to what
> you're describing when RGWs are upgraded but not all OSDs are upgraded
> as well. Is it possible that the OSDs hosting gc.13, gc.10, and so forth
> are running a different version of ceph?
> 

Yes, the OSDs are still on 13.2.5. As this is a big (2500 OSD)
production environment, we only created a temporary machine with 13.2.6
(just a few hours before its release) to run the GC.

We did not upgrade the cluster itself, as we first have to validate the
release on our testing cluster.
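
For reference, a quick way to double-check which versions the daemons
report, and which OSDs actually host one of the problematic gc objects
(a sketch using standard ceph CLI commands; gc.13 and the .rgw.gc pool
name are taken from the listing above):

$ ceph versions                 # per-daemon version summary
$ ceph osd map .rgw.gc gc.13    # PG and acting OSD set for gc.13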

Wido

> Eric
> 


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-11 Thread J. Eric Ivancich
Hi Wido,

Interleaving below

On 6/11/19 3:10 AM, Wido den Hollander wrote:
> 
> I thought it was resolved, but it isn't.
> 
> I counted all the OMAP values for the GC objects and I got back:
> 
> gc.0: 0
> gc.11: 0
> gc.14: 0
> gc.15: 0
> gc.16: 0
> gc.18: 0
> gc.19: 0
> gc.1: 0
> gc.20: 0
> gc.21: 0
> gc.22: 0
> gc.23: 0
> gc.24: 0
> gc.25: 0
> gc.27: 0
> gc.29: 0
> gc.2: 0
> gc.30: 0
> gc.3: 0
> gc.4: 0
> gc.5: 0
> gc.6: 0
> gc.7: 0
> gc.8: 0
> gc.9: 0
> gc.13: 110996
> gc.10: 04
> gc.26: 42
> gc.28: 111292
> gc.17: 111314
> gc.12: 111534
> gc.31: 111956

Casey Bodley mentioned to me that he's seen similar behavior to what
you're describing when RGWs are upgraded but not all OSDs are upgraded
as well. Is it possible that the OSDs hosting gc.13, gc.10, and so forth
are running a different version of ceph?

Eric

-- 
J. Eric Ivancich
he/him/his
Red Hat Storage
Ann Arbor, Michigan, USA


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-11 Thread Wido den Hollander



On 6/4/19 8:00 PM, J. Eric Ivancich wrote:
> On 6/4/19 7:37 AM, Wido den Hollander wrote:
>> I've set up a temporary machine next to the 13.2.5 cluster with the
>> 13.2.6 packages from Shaman.
>>
>> On that machine I'm running:
>>
>> $ radosgw-admin gc process
>>
>> That seems to work as intended! So the PR seems to have fixed it.
>>
>> Should be fixed permanently when 13.2.6 is officially released.
>>
>> Wido
> 
> Thank you, Wido, for sharing the results of your experiment. I'm happy
> to learn that it was successful. And v13.2.6 was just released about 2
> hours ago.
> 

I thought it was resolved, but it isn't.

I counted all the OMAP values for the GC objects and I got back:

gc.0: 0
gc.11: 0
gc.14: 0
gc.15: 0
gc.16: 0
gc.18: 0
gc.19: 0
gc.1: 0
gc.20: 0
gc.21: 0
gc.22: 0
gc.23: 0
gc.24: 0
gc.25: 0
gc.27: 0
gc.29: 0
gc.2: 0
gc.30: 0
gc.3: 0
gc.4: 0
gc.5: 0
gc.6: 0
gc.7: 0
gc.8: 0
gc.9: 0
gc.13: 110996
gc.10: 04
gc.26: 42
gc.28: 111292
gc.17: 111314
gc.12: 111534
gc.31: 111956

So as you can see a few remain.
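
(For reference, the counts above were produced with a loop along the
lines of the one from my original post, printing each shard name and
its OMAP key count on one line:)

for OBJ in $(rados -p .rgw.gc ls); do
  # count the OMAP keys per GC shard object
  echo "$OBJ: $(rados -p .rgw.gc listomapkeys $OBJ | wc -l)"
done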

I ran:

$ radosgw-admin gc process --debug-rados=10

That finishes within 10 seconds. Then I tried:

$ radosgw-admin gc process --debug-rados=10 --include-all

That also finishes within 10 seconds.

What I noticed in the logs was this:

2019-06-11 09:06:58.711 7f8ffb876240 10 librados: call oid=gc.17 nspace=
2019-06-11 09:06:58.717 7f8ffb876240 10 librados: Objecter returned from call r=-16

The return value is '-16' (EBUSY) for gc.17, whereas for gc.18 or any
other object with 0 OMAP values it is:

2019-06-11 09:06:58.717 7f8ffb876240 10 librados: call oid=gc.18 nspace=
2019-06-11 09:06:58.720 7f8ffb876240 10 librados: Objecter returned from call r=0
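
A quick way to spot which gc shards return an error in that output is
to filter the rados debug lines on the return code (a sketch; it
assumes the debug output goes to stderr and that the 'call oid=...'
line sits just above the matching return line, as in the excerpts
above):

$ radosgw-admin gc process --include-all --debug-rados=10 2>&1 \
    | grep -B2 'call r=-'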

So I set --debug-rgw=10, which showed:

RGWGC::process failed to acquire lock on gc.17

I haven't tried stopping all the RGWs yet, as that would impact the
service, but could that be the root cause here?
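
If a stale lock is indeed the problem, the advisory lock on the shard
can be inspected directly with the rados CLI before taking any RGWs
down (a sketch; 'gc_process' is my assumption for the lock name RGWGC
uses, which is why 'lock list' is the safer first step):

$ rados -p .rgw.gc lock list gc.17                # list all advisory locks on gc.17
$ rados -p .rgw.gc lock info gc.17 gc_process     # holder/expiration for that lock (name assumed)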

Wido

> Eric
> 


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-04 Thread J. Eric Ivancich
On 6/4/19 7:37 AM, Wido den Hollander wrote:
> I've set up a temporary machine next to the 13.2.5 cluster with the
> 13.2.6 packages from Shaman.
> 
> On that machine I'm running:
> 
> $ radosgw-admin gc process
> 
> That seems to work as intended! So the PR seems to have fixed it.
> 
> Should be fixed permanently when 13.2.6 is officially released.
> 
> Wido

Thank you, Wido, for sharing the results of your experiment. I'm happy
to learn that it was successful. And v13.2.6 was just released about 2
hours ago.

Eric


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-04 Thread Wido den Hollander



On 5/30/19 2:45 PM, Wido den Hollander wrote:
> 
> 
> On 5/29/19 11:22 PM, J. Eric Ivancich wrote:
>> Hi Wido,
>>
>> When you run `radosgw-admin gc list`, I assume you are *not* using the
>> "--include-all" flag, right? If you're not using that flag, then
>> everything listed should be expired and be ready for clean-up. If after
>> running `radosgw-admin gc process` the same entries appear in
>> `radosgw-admin gc list` then gc apparently stalled.
>>
> 
> Not using --include-all in either case.
> 
> GC seems to stall and doesn't do anything when looking at it with
> --debug-rados=10
> 
>> There were a few bugs within gc processing that could prevent it from
>> making forward progress. They were resolved with a PR (master:
>> https://github.com/ceph/ceph/pull/26601 ; mimic backport:
>> https://github.com/ceph/ceph/pull/27796). Unfortunately that code was
>> backported after the 13.2.5 release, but it is in place for the 13.2.6
>> release of mimic.
>>
> 
> Thanks! I might grab some packages from Shaman to give GC a try.
> 

I've set up a temporary machine next to the 13.2.5 cluster with the
13.2.6 packages from Shaman.

On that machine I'm running:

$ radosgw-admin gc process

That seems to work as intended! So the PR seems to have fixed it.

Should be fixed permanently when 13.2.6 is officially released.

Wido

> Wido
> 
>> Eric
>>
>>
>> On 5/29/19 3:19 AM, Wido den Hollander wrote:
>>> Hi,
>>>
>>> I've got a Ceph cluster with this status:
>>>
>>> health: HEALTH_WARN
>>> 3 large omap objects
>>>
>>> After looking into it I see that the issue comes from objects in the
>>> '.rgw.gc' pool.
>>>
>>> Investigating it I found that the gc.* objects have a lot of OMAP keys:
>>>
>>> for OBJ in $(rados -p .rgw.gc ls); do
>>>   echo $OBJ
>>>   rados -p .rgw.gc listomapkeys $OBJ|wc -l
>>> done
>>>
>>> I then found out that on average these objects have about 100k OMAP
>>> keys each, but two stand out and have about 3M OMAP keys.
>>>
>>> I can list the GC with 'radosgw-admin gc list' and this yields a JSON
>>> which is a couple of MB in size.
>>>
>>> I ran:
>>>
>>> $ radosgw-admin gc process
>>>
>>> That runs for hours and then finishes, but the large list of OMAP keys
>>> stays.
>>>
>>> Running Mimic 13.2.5 on this cluster.
>>>
>>> Has anybody seen this before?
>>>
>>> Wido


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-05-30 Thread Wido den Hollander



On 5/29/19 11:22 PM, J. Eric Ivancich wrote:
> Hi Wido,
> 
> When you run `radosgw-admin gc list`, I assume you are *not* using the
> "--include-all" flag, right? If you're not using that flag, then
> everything listed should be expired and be ready for clean-up. If after
> running `radosgw-admin gc process` the same entries appear in
> `radosgw-admin gc list` then gc apparently stalled.
> 

Not using --include-all in either case.

GC seems to stall and doesn't do anything when looking at it with
--debug-rados=10

> There were a few bugs within gc processing that could prevent it from
> making forward progress. They were resolved with a PR (master:
> https://github.com/ceph/ceph/pull/26601 ; mimic backport:
> https://github.com/ceph/ceph/pull/27796). Unfortunately that code was
> backported after the 13.2.5 release, but it is in place for the 13.2.6
> release of mimic.
> 

Thanks! I might grab some packages from Shaman to give GC a try.

Wido

> Eric
> 
> 
> On 5/29/19 3:19 AM, Wido den Hollander wrote:
>> Hi,
>>
>> I've got a Ceph cluster with this status:
>>
>> health: HEALTH_WARN
>> 3 large omap objects
>>
>> After looking into it I see that the issue comes from objects in the
>> '.rgw.gc' pool.
>>
>> Investigating it I found that the gc.* objects have a lot of OMAP keys:
>>
>> for OBJ in $(rados -p .rgw.gc ls); do
>>   echo $OBJ
>>   rados -p .rgw.gc listomapkeys $OBJ|wc -l
>> done
>>
>> I then found out that on average these objects have about 100k OMAP
>> keys each, but two stand out and have about 3M OMAP keys.
>>
>> I can list the GC with 'radosgw-admin gc list' and this yields a JSON
>> which is a couple of MB in size.
>>
>> I ran:
>>
>> $ radosgw-admin gc process
>>
>> That runs for hours and then finishes, but the large list of OMAP keys
>> stays.
>>
>> Running Mimic 13.2.5 on this cluster.
>>
>> Has anybody seen this before?
>>
>> Wido


Re: [ceph-users] Large OMAP object in RGW GC pool

2019-05-29 Thread J. Eric Ivancich
Hi Wido,

When you run `radosgw-admin gc list`, I assume you are *not* using the
"--include-all" flag, right? If you're not using that flag, then
everything listed should be expired and be ready for clean-up. If after
running `radosgw-admin gc process` the same entries appear in
`radosgw-admin gc list` then gc apparently stalled.
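
To make the distinction concrete, the two variants can be compared side
by side (a sketch; both are just the commands discussed above):

$ radosgw-admin gc list               # only expired entries, eligible for clean-up
$ radosgw-admin gc list --include-all # every queued entry, expired or not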

There were a few bugs within gc processing that could prevent it from
making forward progress. They were resolved with a PR (master:
https://github.com/ceph/ceph/pull/26601 ; mimic backport:
https://github.com/ceph/ceph/pull/27796). Unfortunately that code was
backported after the 13.2.5 release, but it is in place for the 13.2.6
release of mimic.

Eric


On 5/29/19 3:19 AM, Wido den Hollander wrote:
> Hi,
> 
> I've got a Ceph cluster with this status:
> 
> health: HEALTH_WARN
> 3 large omap objects
> 
> After looking into it I see that the issue comes from objects in the
> '.rgw.gc' pool.
> 
> Investigating it I found that the gc.* objects have a lot of OMAP keys:
> 
> for OBJ in $(rados -p .rgw.gc ls); do
>   echo $OBJ
>   rados -p .rgw.gc listomapkeys $OBJ|wc -l
> done
> 
> I then found out that on average these objects have about 100k OMAP
> keys each, but two stand out and have about 3M OMAP keys.
> 
> I can list the GC with 'radosgw-admin gc list' and this yields a JSON
> which is a couple of MB in size.
> 
> I ran:
> 
> $ radosgw-admin gc process
> 
> That runs for hours and then finishes, but the large list of OMAP keys
> stays.
> 
> Running Mimic 13.2.5 on this cluster.
> 
> Has anybody seen this before?
> 
> Wido