Gotcha. Yeah, I think we are going to continue the scanning to build a new
metadata pool. I am making some progress on a script to extract files from
the data pool. I just need to find the exact format of the xattrs and the
object hierarchy for large files. If I end up taking the script to the
finish line, this will be something I post for the community. So I am
reading the C source code at the moment to see what CephFS is doing.
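
For reference, here is the rough shape of what I have so far, using the
Python rados bindings. It is untested, and the object naming (inode number
in hex plus an 8-hex-digit chunk index) and the assumption that chunks are
contiguous are exactly the details I still need to confirm against the
source:

    #!/usr/bin/env python
    # Sketch: pull one file's objects out of the data pool by inode number.
    # ASSUMPTION to verify: data objects are named "<inode hex>.<8-hex chunk
    # index>", e.g. 10000000000.00000000, 10000000000.00000001, ...
    import sys
    import rados

    POOL = 'cephfs_data'   # adjust to your data pool name

    def extract(ino_hex, out_path, conf='/etc/ceph/ceph.conf'):
        cluster = rados.Rados(conffile=conf)
        cluster.connect()
        ioctx = cluster.open_ioctx(POOL)
        try:
            with open(out_path, 'wb') as out:
                chunk = 0
                while True:
                    obj = '%s.%08x' % (ino_hex, chunk)
                    try:
                        size, _mtime = ioctx.stat(obj)
                    except rados.ObjectNotFound:
                        break   # no more chunks (sparse files break this)
                    out.write(ioctx.read(obj, size))
                    chunk += 1
        finally:
            ioctx.close()
            cluster.shutdown()

    if __name__ == '__main__':
        extract(sys.argv[1], sys.argv[2])

Sparse files and non-default layouts are where this falls apart, and the
path/filename mapping still has to come from the metadata side, which is
the xattr and omap format I am digging through now.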


On Mon, Nov 5, 2018 at 8:10 PM Sergey Malinin <h...@newmail.com> wrote:

> With cppool you got a bunch of useless zero-sized objects because, unlike
> "export", cppool does not copy the omap data, which is what actually holds
> all the inode info.
> I suggest truncating the journals only as a way to reduce downtime,
> followed by an immediate backup of the available files to a fresh fs.
> After resetting the journals, the part of your fs covered by unflushed
> "UPDATE" entries *will* become inconsistent. The MDS may start to segfault
> occasionally, but that can be avoided by forcing read-only mode (in this
> mode the MDS journal is not flushed, so you will need extra disk space).
> If you want to get the original fs recovered and fully functional, you
> need to somehow replay the journal (I'm unsure whether the
> cephfs-data-scan tool operates on journal entries).
>
>
>
> On 6.11.2018, at 03:43, Rhian Resnick <xan...@sepiidae.com> wrote:
>
> Workload is mixed.
>
> We ran a rados cppool to back up the metadata pool.
>
> So you're thinking that truncating the journal and purge queue (we are on
> Luminous), along with a reset, could bring us back online missing just the
> data from that day (mostly from when the issue started)?
>
> If so, we could continue our scan into our recovery partition and give it
> a try tomorrow after discussing it with our recovery team.
>
>
>
>
> On Mon, Nov 5, 2018 at 7:40 PM Sergey Malinin <h...@newmail.com> wrote:
>
>> What was your recent workload? There is a chance you won't lose much if
>> it was mostly read ops. If so, you *must back up your metadata pool via
>> "rados export" in order to preserve the omap data*, then try truncating
>> the journals (along with the purge queue, if supported by your Ceph
>> version), wiping the session table, and resetting the fs.
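>>
>> Roughly, per the disaster recovery docs (I'm going from memory, so
>> double-check the exact commands against your version before running
>> anything):
>>
>>     rados -p <metadata pool> export <backup file>
>>     cephfs-journal-tool journal reset
>>     cephfs-journal-tool --journal=purge_queue journal reset
>>     cephfs-table-tool all reset session
>>     ceph fs reset <fs name> --yes-i-really-mean-it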
>>
>>
>> On 6.11.2018, at 03:26, Rhian Resnick <xan...@sepiidae.com> wrote:
>>
>> That was our original plan. We migrated to bigger disks and have space,
>> but "recover dentries" uses up all our memory (128 GB) and crashes out.
>>
>> On Mon, Nov 5, 2018 at 7:23 PM Sergey Malinin <h...@newmail.com> wrote:
>>
>>> I had the same problem with multi-MDS. I solved it by freeing up a
>>> little space on the OSDs, doing "recover dentries", truncating the
>>> journal, and then doing an "fs reset". After that I was able to revert
>>> to a single active MDS and kept running for a year until it failed on
>>> the 13.2.2 upgrade :))
>>>
>>>
>>> On 6.11.2018, at 03:18, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>
>>> Our metadata pool went from 700 MB to 1 TB in size in a few hours. It
>>> used up all the space on the OSDs, and now 2 ranks report damage. The
>>> recovery tools fail on the journal because they run out of memory,
>>> leaving us with the choice of truncating the journal and losing data, or
>>> recovering using the scan tools.
>>>
>>> Any ideas on solutions are welcome. I posted all the logs and the
>>> cluster design previously but am happy to do so again. We are not
>>> desperate, but we are hurting with this long downtime.
>>>
>>> On Mon, Nov 5, 2018 at 7:05 PM Sergey Malinin <h...@newmail.com> wrote:
>>>
>>>> What kind of damage have you had? Maybe it is worth trying to get the
>>>> MDS to start and back up the valuable data instead of doing a
>>>> long-running recovery?
>>>>
>>>>
>>>> On 6.11.2018, at 02:59, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>>
>>>> Sounds like I get to have some fun tonight.
>>>>
>>>> On Mon, Nov 5, 2018, 6:39 PM Sergey Malinin <h...@newmail.com> wrote:
>>>>
>>>>> Inode linkage (i.e. the folder hierarchy) and file names are stored in
>>>>> the omap data of objects in the metadata pool. You can write a script
>>>>> that traverses the whole metadata pool to find out which file names
>>>>> correspond to which objects in the data pool, and then fetch the
>>>>> required files via the 'rados get' command.
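>>>>>
>>>>> As a rough illustration with the Python rados bindings (untested; the
>>>>> omap values are encoded inode structures, so decoding the inode number
>>>>> out of them is the part that needs the struct layout from the source):
>>>>>
>>>>>     import rados
>>>>>
>>>>>     cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>>>>>     cluster.connect()
>>>>>     meta = cluster.open_ioctx('cephfs_metadata')   # adjust pool name
>>>>>     # Directory objects are named '<dir inode hex>.<fragment>'; their
>>>>>     # omap keys are the dentry (file/subdir) names with a '_head'
>>>>>     # suffix, and the values hold the encoded inode.
>>>>>     for obj in meta.list_objects():
>>>>>         with rados.ReadOpCtx() as op:
>>>>>             entries, rc = meta.get_omap_vals(op, "", "", 1000)
>>>>>             meta.operate_read_op(op, obj.key)
>>>>>             for name, val in entries:   # first 1000 entries only here
>>>>>                 print(obj.key, name, len(val))
>>>>>     meta.close()
>>>>>     cluster.shutdown()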
>>>>>
>>>>> > On 6.11.2018, at 02:26, Sergey Malinin <h...@newmail.com> wrote:
>>>>> >
>>>>> > Yes, 'rados -h'.
>>>>> >
>>>>> >
>>>>> >> On 6.11.2018, at 02:25, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>>> >>
>>>>> >> Does a tool exist to recover files from a CephFS data pool? We are
>>>>> >> rebuilding the metadata but have a user who needs data ASAP.
>>>>> >
>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
