On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander <w...@42on.com> wrote:
>
>> Op 7 december 2016 om 16:38 schreef John Spray <jsp...@redhat.com>:
>>
>>
>> On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander <w...@42on.com> wrote:
>> > (I think John knows the answer, but sending to ceph-users for archival 
>> > purposes)
>> >
>> > Hi John,
>> >
>> > A Ceph cluster lost a PG with CephFS metadata in there and it is currently 
>> > doing a CephFS disaster recovery as described here: 
>> > http://docs.ceph.com/docs/master/cephfs/disaster-recovery/
>>
>> I wonder if this has any relation to your thread about size=2 pools ;-)
>
> Yes, it does!
>
>>
>> > This data pool has 1.4B objects and currently has 16 concurrent 
>> > scan_extents scans running:
>> >
>> > # cephfs-data-scan --debug-rados=10 scan_extents --worker_n 0 --worker_m 
>> > 16 cephfs_metadata
>> > # cephfs-data-scan --debug-rados=10 scan_extents --worker_n 1 --worker_m 
>> > 16 cephfs_metadata
>> > ..
>> > ..
>> > # cephfs-data-scan --debug-rados=10 scan_extents --worker_n 15 --worker_m 
>> > 16 cephfs_metadata
>> >
>> > According to the source in DataScan.cc:
>> > * worker_n: Worker number
>> > * worker_m: Worker count
>> >
>> > So with the commands above I have 16 workers running, correct? For the 
>> > scan_inodes I want to scale out to 32 workers to speed up the process even 
>> > more.
>> >
>> > Just to double-check before I send a new PR to update the docs, this is 
>> > the right way to run the tool, correct?
>>
>> It looks like you're targeting cephfs_metadata instead of your data pool.
>>
>> scan_extents and scan_inodes operate on data pools, even if your goal
>> is to rebuild your metadata pool (the argument is what you are
>> scanning, not what you are writing to).
>
> That was a typo on my part when typing this e-mail. It is scanning the *data*
> pool at the moment.
>
> Can you confirm that the worker_n and worker_m arguments are the correct ones?

Yep, they look right to me.
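
For the 32-worker scan_inodes pass the same pattern applies. As a
sketch, assuming your data pool is named "cephfs_data" (substitute your
actual data pool name):

# cephfs-data-scan --debug-rados=10 scan_inodes --worker_n 0 --worker_m 32 cephfs_data
# cephfs-data-scan --debug-rados=10 scan_inodes --worker_n 1 --worker_m 32 cephfs_data
..
..
# cephfs-data-scan --debug-rados=10 scan_inodes --worker_n 31 --worker_m 32 cephfs_data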

>>
>> There is also a "scan_frags" command that operates on a metadata pool.
>
> Didn't know that. In this case the metadata pool is missing objects due to 
> that lost PG.
>
> I think running scan_extents and scan_inodes on the *data* pool is the correct
> way to rebuild the metadata pool if it is missing objects, right?

In general you'd use both scan_frags (to re-link any directories that
might have been orphaned because an ancestor dirfrag was in the lost
PG) and then scan_extents+scan_inodes (to re-link any files that might
have been orphaned because their immediate parent dirfrag was in the
lost PG).

However, scan_extents+scan_inodes generally does the lion's share of
the work: anything that scan_frags would have caught would probably
also have appeared somewhere in a backtrace path and been linked in by
scan_inodes as a result, so you should probably just skip scan_frags
in this instance.
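
As a minimal shell sketch of driving both passes with 32 workers
(again assuming a data pool named "cephfs_data"), keeping in mind that
every scan_extents worker must finish before any scan_inodes worker is
started:

for i in $(seq 0 31); do
    cephfs-data-scan scan_extents --worker_n $i --worker_m 32 cephfs_data &
done
wait
for i in $(seq 0 31); do
    cephfs-data-scan scan_inodes --worker_n $i --worker_m 32 cephfs_data &
done
wait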

BTW, you've probably already realised this, but be *very* cautious
about using the recovered filesystem: our testing of these tools is
mostly verifying that after recovery we can see and read the files
(i.e. well enough to extract them somewhere else), not that the
filesystem is necessarily working well for writes etc. after being
recovered.  If it's possible, it's always better to recover your
files to a separate location and then rebuild your filesystem with
fresh pools -- that way you're not risking that there is anything
strange left behind by the recovery process.
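
If you do go the extract-and-rebuild route, a rough sketch (the monitor
address, mount point and credentials below are placeholders) would be
to mount the recovered filesystem read-only and copy everything out:

# mount -t ceph mon1:6789:/ /mnt/recovered -o ro,name=admin,secretfile=/etc/ceph/admin.secret
# rsync -a /mnt/recovered/ /some/other/storage/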

John

> Wido
>
>>
>> John
>>
>> > If not, before sending the PR and starting scan_inodes on this cluster, 
>> > what is the correct way to invoke the tool?
>> >
>> > Thanks!
>> >
>> > Wido
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
