Thanks Yehuda for your response, much appreciated.

Using the "radosgw-admin object stat" option I was able to reconcile the 
objects on master and slave.  There are 10 objects on the master that have 
replicated to the slave, for these 10 objects I was able to confirm by pulling 
the tag prefix from "object stat", verifying size, name, etc.  There are still 
a large number of "shadow" files in .region-1.zone-2.rgw.buckets pool which 
have no corresponding object to cross reference using "object stat" command.  
These files are taking up several hundred GB from OSD's on the region-2 
cluster.  What would be the correct way to remove these "shadow" files that no 
longer have objects associated?  Is there a process that will clean these 
orphaned objects?  Any steps anyone can provide to remove these files would 
greatly appreciated.
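
In case it's useful, this is the rough approach I've been sketching out to 
identify orphan candidates (untested so far; the pool name is from my setup, 
<bucket-name> is a placeholder, the grep/cut JSON parsing is crude, and the 
exact "object stat" output format varies between releases):

POOL=.region-1.zone-2.rgw.buckets
BUCKET=<bucket-name>

# 1. Every shadow object currently stored in the slave bucket pool.
rados -p "$POOL" ls | grep __shadow_ | sort > shadow_objects.txt

# 2. Every shadow name/tag still referenced by an object in the bucket,
#    pulled out of the "radosgw-admin object stat" output for each object
#    listed by "radosgw-admin bucket list".
radosgw-admin bucket list --bucket="$BUCKET" |
  grep '"name"' | cut -d'"' -f4 |
  while read obj; do
    radosgw-admin object stat --bucket="$BUCKET" --object="$obj"
  done | grep -o '__shadow_[A-Za-z0-9._-]*' | sort -u > referenced.txt

# 3. Anything in the pool that nothing references is an orphan candidate.
#    (Sanity-check that referenced.txt is not empty first, and review the
#    list by hand before removing anything with "rados -p $POOL rm <object>".)
grep -Fvf referenced.txt shadow_objects.txt > orphan_candidates.txt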

BTW - Since my original post, several objects have been copied via an s3 client 
to the master and everything appears to be replicating without issue.  Objects 
have been deleted as well and the sync looks fine; objects are being removed 
from both the master and the slave.  I'm pretty sure the large number of 
orphaned "shadow" files currently in the .region-1.zone-2.rgw.buckets pool 
comes from the original sync performed back on Sept. 15.
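
One way I plan to sanity-check that assumption (the object name below is just 
one of the suspect names from the quoted listing further down): "rados stat" 
reports an mtime, so spot-checking a few of the candidates should show whether 
they date from the Sept. 15 sync, e.g.:

$ rados -p .region-1.zone-2.rgw.buckets stat alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_1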

Thanks in advance,
MLM

-----Original Message-----
From: yehud...@gmail.com [mailto:yehud...@gmail.com] On Behalf Of Yehuda Sadeh
Sent: Tuesday, September 23, 2014 5:30 PM
To: lyn_mitch...@bellsouth.net
Cc: ceph-users; ceph-commun...@lists.ceph.com
Subject: Re: [ceph-users] Any way to remove possible orphaned files in a 
federated gateway configuration

On Tue, Sep 23, 2014 at 3:05 PM, Lyn Mitchell <mitc...@bellsouth.net> wrote:
> Is anyone aware of a way to either reconcile or remove possible 
> orphaned “shadow” files in a federated gateway configuration?  The 
> issue we’re seeing is that the slave has many more chunk/“shadow” files 
> than the master; the breakdown is as follows:
>
> master zone:
>
> .region-1.zone-1.rgw.buckets = 1737 “shadow” files, of which there are 
> 10 distinct sets of tags; an example of 1 distinct set is:
>
> alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_1 through
> alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_516
>
>
>
> slave zone:
>
> .region-1.zone-2.rgw.buckets = 331961 “shadow” files, of which there 
> are 652 distinct sets of tags; examples:
>
> 1 set having 516 “shadow” files:
>
> alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_1 through
> alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_516
>
>
>
> 236 sets having 515 “shadow” files apiece:
>
> alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_1 through
> alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_515
>
> alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_1 through
> alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_515

These are all part of the same bucket (prefixed by alph-1.80907.1).

>
> ….
>
>
>
> The number of shadow files in zone-2 is taking quite a bit of space on the
> OSDs in the cluster.  Without being able to trace back to the original
> file name from an s3 or rados tag, I have no way of knowing which 
> files these are.  Is it possible that the same file may have been 
> replicated multiple times due to network or connectivity issues?
>
>
>
> I can provide any logs or other information that may provide some 
> help; however, at this point we’re not seeing any real errors.
>
>
>
> Thanks in advance for any help that can be provided,

You can also run the following command on the existing objects within that 
specific bucket:

$ radosgw-admin object stat --bucket=<bucket> --object=<object>

This will show the mapping from the rgw object to the rados objects that 
construct it.
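
For example, to pull just the rados object names for a single rgw object out 
of that mapping (same placeholders as above; the exact manifest fields vary a 
bit between versions, so the grep may need adjusting):

$ radosgw-admin object stat --bucket=<bucket> --object=<object> | grep __shadow_

Any rados object in the bucket's data pool that never shows up in that output 
for any existing rgw object would be an orphan candidate.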


Yehuda

