Re: [ceph-users] S3 objects deleted but storage doesn't free space

2017-12-15 Thread David Turner
You can check how backed up your GC is with `radosgw-admin gc list |
wc -l`.  In one of our clusters, we realized that early testing and
re-configuring of the realm had completely messed up the GC, and that realm
had never actually deleted an object in all the time it had been running in
production.  The way we got out of that mess was to create a new realm,
manually copy bucket contents from one to the other (for any data
that couldn't be lost), and blast away the rest.  We now no longer have
any GC problems in RGW with the fresh realm under an identical workload.

It took about two weeks for the data pool of the bad realm to finish deleting
from the cluster (60% full 10TB drives).
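To make that count easy to repeat, the grep/wc step can be wrapped in a small
helper; the `--include-all` flag (which also lists entries still inside their
rgw_gc_obj_min_wait deferral window) is worth knowing about, but everything
else here is plain POSIX shell:

```shell
#!/bin/sh
# Count pending GC entries: the JSON from `radosgw-admin gc list`
# carries one "oid" field per object awaiting deletion, so counting
# those lines approximates the backlog size.
count_gc_oids() {
    grep -c '"oid"'
}

# Typical usage against a live cluster:
#   radosgw-admin gc list --include-all | count_gc_oids
```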

On Thu, Dec 14, 2017 at 7:41 PM Jan-Willem Michels wrote:

>
> Hi there all,
> Perhaps someone can help.
>
> We tried to free some storage, so we deleted a lot of S3 objects. The bucket
> also holds valuable data, so we can't delete the whole bucket.
> Everything went fine, but the used storage space doesn't decrease. We are
> expecting several TB of data to be freed.
>
> We then learned about garbage collection, so we thought let's wait. But
> even days later there was no real change.
> We started "radosgw-admin gc process", which never finished or
> displayed any error or any other output.
> We couldn't find anything like a --verbose or debug option for this command,
> or a log file showing what radosgw-admin is doing while it works.
>
> We tried to change the default settings, which we got from an old posting.
> We put them in [global] and also tried [client.rgw..]:
> rgw_gc_max_objs = 7877 (but also rgw_gc_max_objs = 200 and
> rgw_gc_max_objs = 1000)
> rgw_lc_max_objs = 7877
> rgw_gc_obj_min_wait = 300
> rgw_gc_processor_period = 600
> rgw_gc_processor_max_time = 600
>
> We restarted the ceph-radosgw daemons several times, and the machines too,
> all over a period of days, and tried radosgw-admin gc process a few times.
> We did not find any references in the radosgw logs like gc:: delete, but we
> don't know what to look for.
> The system is healthy, no errors or warnings. But the system is in use (we
> are loading up data) -> will GC only run when idle?
>
> When we count them with "radosgw-admin gc list | grep oid | wc -l" we get:
> 11:00  18.086.665 objects
> 13:00  18.086.665 objects
> 15:00  18.086.665 objects
> so no change in the object count after hours.
>
> When we run "radosgw-admin gc list" we get entries like:
>   radosgw-admin gc list | more
> [
>   {
>     "tag": "b5687590-473f-4386-903f-d91a77b8d5cd.7354141.21122\u",
>     "time": "2017-12-06 11:04:56.0.459704s",
>     "objs": [
>       {
>         "pool": "default.rgw.buckets.data",
>         "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_1",
>         "key": "",
>         "instance": ""
>       },
>       {
>         "pool": "default.rgw.buckets.data",
>         "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_2",
>         "key": "",
>         "instance": ""
>       },
>       {
>         "pool": "default.rgw.buckets.data",
>         "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_3",
>         "key": "",
>         "instance": ""
>       },
>   A few questions ->
>
> Who purges the gc list? Is it done on the radosgw machines, or is it
> distributed across the OSDs?
> Where do I have to change the default "rgw_gc_max_objs = 1000"? We tried
> everywhere. We have used "tell" to change them on the OSD and MON systems,
> and also on the RGW endpoints, which we restarted.
>
> We have two radosgw endpoints. Is there a lock so that only one will act,
> or will they both try to delete? Can we free or display such a lock?
>
> How can I debug the radosgw-admin application? Which log files should we
> look in, and what would an example message look like?
>
> If I know an oid like the ones above, can I manually delete such an oid?
>
> Suppose we deleted the complete bucket with "radosgw-admin bucket
> rm --bucket=mybucket --purge-objects --inconsistent-index"; would that
> also get rid of the GC entries that are already there?
>
> Thanks  ahead for your time,
>
> JW Michels
>
>
>
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] S3 objects deleted but storage doesn't free space

2017-12-14 Thread Jan-Willem Michels


Hi there all,
Perhaps someone can help.

We tried to free some storage, so we deleted a lot of S3 objects. The bucket
also holds valuable data, so we can't delete the whole bucket.
Everything went fine, but the used storage space doesn't decrease. We are
expecting several TB of data to be freed.


We then learned about garbage collection, so we thought let's wait. But
even days later there was no real change.
We started "radosgw-admin gc process", which never finished or
displayed any error or any other output.
We couldn't find anything like a --verbose or debug option for this command,
or a log file showing what radosgw-admin is doing while it works.


We tried to change the default settings, which we got from an old posting.
We put them in [global] and also tried [client.rgw..]:
rgw_gc_max_objs = 7877 (but also rgw_gc_max_objs = 200 and
rgw_gc_max_objs = 1000)
rgw_lc_max_objs = 7877
rgw_gc_obj_min_wait = 300
rgw_gc_processor_period = 600
rgw_gc_processor_max_time = 600
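For reference, these GC tunables are read by the radosgw daemon itself, not by
the OSDs or MONs, so they belong in [global] or in the gateway's own section on
the host running radosgw, followed by a restart of that daemon. A sketch, where
the instance name "gateway1" is only a placeholder for whatever
[client.rgw.<name>] a given deployment actually uses:

```
[client.rgw.gateway1]
    rgw_gc_max_objs = 1000
    rgw_gc_obj_min_wait = 300
    rgw_gc_processor_period = 600
    rgw_gc_processor_max_time = 600
```

One caveat that has come up on this list before: shrinking rgw_gc_max_objs
after entries already exist may leave entries stranded in shards that are no
longer scanned, so it is safest to pick a value once and keep it.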

We restarted the ceph-radosgw daemons several times, and the machines too,
all over a period of days, and tried radosgw-admin gc process a few times.
We did not find any references in the radosgw logs like gc:: delete, but we
don't know what to look for.
The system is healthy, no errors or warnings. But the system is in use (we
are loading up data) -> will GC only run when idle?


When we count them with "radosgw-admin gc list | grep oid | wc -l" we get:
11:00  18.086.665 objects
13:00  18.086.665 objects
15:00  18.086.665 objects
so no change in the object count after hours.

When we run "radosgw-admin gc list" we get entries like:
 radosgw-admin gc list | more
[
  {
    "tag": "b5687590-473f-4386-903f-d91a77b8d5cd.7354141.21122\u",
    "time": "2017-12-06 11:04:56.0.459704s",
    "objs": [
      {
        "pool": "default.rgw.buckets.data",
        "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_1",
        "key": "",
        "instance": ""
      },
      {
        "pool": "default.rgw.buckets.data",
        "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_2",
        "key": "",
        "instance": ""
      },
      {
        "pool": "default.rgw.buckets.data",
        "oid": "b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_3",
        "key": "",
        "instance": ""
      },

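The per-object entries in that listing can be pulled out with a small filter
for further scripting; a sketch using sed (jq would work just as well),
assuming each "oid" key and its value share a line, as they do in unwrapped
gc list output:

```shell
#!/bin/sh
# extract_gc_oids: print the raw RADOS object names ("oid" values)
# from `radosgw-admin gc list` JSON read on stdin.
extract_gc_oids() {
    sed -n 's/.*"oid": *"\([^"]*\)".*/\1/p'
}

# Typical usage against a live cluster:
#   radosgw-admin gc list | extract_gc_oids | head
```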
 A few questions ->

Who purges the gc list? Is it done on the radosgw machines, or is it
distributed across the OSDs?
Where do I have to change the default "rgw_gc_max_objs = 1000"? We tried
everywhere. We have used "tell" to change them on the OSD and MON systems,
and also on the RGW endpoints, which we restarted.


We have two radosgw endpoints. Is there a lock so that only one will act,
or will they both try to delete? Can we free or display such a lock?


How can I debug the radosgw-admin application? Which log files should we
look in, and what would an example message look like?
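On the debugging question: radosgw-admin accepts the generic Ceph debug
options on its command line, e.g. `radosgw-admin gc process --debug-rgw=20
--debug-ms=1`, which sends a detailed trace to stderr. The same knobs can be
raised for the gateway daemon via ceph.conf so GC activity shows up in its
log file; a sketch, with the instance name again only a placeholder:

```
[client.rgw.gateway1]
    debug rgw = 20
    debug ms = 1
```

A restart of the radosgw daemon is needed afterwards, or the values can be
injected at runtime through the daemon's admin socket with `ceph daemon
<name> config set`.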


If I know an oid like the ones above, can I manually delete such an oid?
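On the manual-deletion question: the entries in the gc list are ordinary
RADOS objects, so `rados rm` can remove one directly; note that, as far as I
know, this frees the data but leaves the gc list entry itself in place until
GC eventually processes it. A deliberately dry-run sketch reusing the first
oid from the listing above:

```shell
#!/bin/sh
# Dry run of deleting a single shadow object by hand.
# Remove the echo only once you are certain the oid is safe to delete.
POOL=default.rgw.buckets.data
OID='b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_1'
echo "would run: rados -p $POOL rm $OID"
```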

Suppose we deleted the complete bucket with "radosgw-admin bucket
rm --bucket=mybucket --purge-objects --inconsistent-index"; would that
also get rid of the GC entries that are already there?


Thanks  ahead for your time,

JW Michels







___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com