Re: [ceph-users] RadosGW slow gc

2015-01-02 Thread Gregory Farnum
You can store radosgw data in a regular EC pool without any caching in
front. I suspect this will work better for you, as part of the slowness is
probably the OSDs trying to look up all the objects in the ec pool before
deleting them. You should be able to check if that's the case by looking at
the osd perfcounters over time. (We've discussed cache counters before;
check the docs or the list).
-Greg
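
One way to watch the OSD perf counters over time, as a minimal sketch: take two JSON snapshots of an OSD's counters a known interval apart (for example from the admin socket's `perf dump` command) and print the per-second rate of every counter that changed. The counter names and values in the sample snapshots are illustrative placeholders, not output from this cluster, and `flatten` and `counter_rates` are hypothetical helper names.

```python
import json


def flatten(tree, prefix=""):
    """Flatten nested counter sections into {"section.counter": value}."""
    flat = {}
    for key, value in tree.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat


def counter_rates(before, after, seconds):
    """Per-second rate of change for every counter that moved between snapshots."""
    old = flatten(before)
    new = flatten(after)
    return {
        name: (new[name] - old[name]) / seconds
        for name in new
        if name in old and new[name] != old[name]
    }


# Illustrative snapshots taken 60 seconds apart (made-up values).
before = json.loads('{"osd": {"op_w": 1000, "op_r": 500, "tier_promote": 40}}')
after = json.loads('{"osd": {"op_w": 1180, "op_r": 500, "tier_promote": 640}}')

rates = counter_rates(before, after, seconds=60)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.1f}/s")
```

If the cache-related counters climb much faster than the write or delete counters while gc is running, that would be consistent with the OSDs pulling objects up from the EC pool before deleting them.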
On Thu, Jan 1, 2015 at 1:01 PM Aaron Bassett aa...@five3genomics.com
wrote:

 I’m doing some load testing on radosgw to get ready for production and I
 had a problem with it stalling out. I had 100 cores from several nodes
 doing multipart uploads in parallel. This ran great for about two days,
 managing to upload about 2000 objects with an average size of 100GB. Then
 it stalled out and stopped. Ever since then, the gw has been gc’ing very
 slowly. During the upload run, it was creating objects at ~100/s; now it’s
 cleaning them at ~3/s. At this rate it won’t be done for nearly a year, and
 this is only a fraction of the data I need to put in.

 The pool I’m writing to is a cache pool at size 2 with an EC pool at 10+2
 behind it. (This data is not mission critical so we are trying to save
 space). I don’t know if this will affect the slow gc or not.

 I tried turning up rgw gc max objs to 256, but it didn’t seem to make a
 difference.

 I’m working under the assumption that my uploads started stalling because
 too many un-gc’ed parts accumulated, but I may be way off base there.

 Any thoughts would be much appreciated,
 Aaron
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



[ceph-users] RadosGW slow gc

2015-01-01 Thread Aaron Bassett
I’m doing some load testing on radosgw to get ready for production and I had a 
problem with it stalling out. I had 100 cores from several nodes doing 
multipart uploads in parallel. This ran great for about two days, managing to 
upload about 2000 objects with an average size of 100GB. Then it stalled out 
and stopped. Ever since then, the gw has been gc’ing very slowly. During the 
upload run, it was creating objects at ~100/s; now it’s cleaning them at ~3/s. 
At this rate it won’t be done for nearly a year, and this is only a fraction of 
the data I need to put in. 
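
A back-of-the-envelope sketch of the estimate above: at ~3 objects/s, "nearly a year" implies a backlog on the order of 90 million objects. The exact backlog is not stated here, so the 365-day figure is an assumption used only to show the arithmetic, and `drain_days` and `implied_backlog` are hypothetical helper names.

```python
SECONDS_PER_DAY = 86_400


def drain_days(backlog_objects, gc_rate_per_s):
    """Days needed to clear a backlog at a constant gc rate."""
    return backlog_objects / gc_rate_per_s / SECONDS_PER_DAY


def implied_backlog(days, gc_rate_per_s):
    """Backlog size implied by a drain time at a constant gc rate."""
    return days * SECONDS_PER_DAY * gc_rate_per_s


# Assumption: "nearly a year" is taken as 365 days at the observed ~3/s.
backlog = implied_backlog(365, 3)

days_at_3 = drain_days(backlog, 3)      # observed gc rate
days_at_100 = drain_days(backlog, 100)  # rate seen during the upload run

print(f"implied backlog: {backlog:,} objects")
print(f"at 3/s:   {days_at_3:.0f} days")
print(f"at 100/s: {days_at_100:.1f} days")
```

The same backlog would clear in under two weeks if gc ran at the ~100/s seen during the upload run.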

The pool I’m writing to is a cache pool at size 2 with an EC pool at 10+2 
behind it. (This data is not mission critical so we are trying to save space). 
I don’t know if this will affect the slow gc or not. 

I tried turning up rgw gc max objs to 256, but it didn’t seem to make a 
difference.

I’m working under the assumption that my uploads started stalling because too 
many un-gc’ed parts accumulated, but I may be way off base there. 

Any thoughts would be much appreciated,
Aaron