Gotcha. Yeah, I think we are going to continue the scanning to build a new metadata pool. I am making some progress on a script to extract files from the data store; I just need to find the exact format of the xattrs and the object hierarchy for large files. If I end up taking the script to the finish line, this will be something I post for the community. So I am reading the C source code at the moment to see what CephFS is doing.
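For the large-file part, here is what I've pieced together so far (treat it as a sketch I still need to verify against the source): with the default layout the data pool stores a file as one RADOS object per 4 MiB chunk, named <inode hex>.<chunk index as 8 hex digits>, and the first chunk (".00000000") also carries a binary "parent" xattr (the backtrace) plus a "layout" xattr when the layout is non-default. Something along these lines is what I'm testing to stitch a single inode back together; the pool name and inode are just example values, it assumes stripe_count=1, and it will stop early on sparse files whose intermediate chunk objects are missing:

#!/usr/bin/env python3
# Reassemble one file from the CephFS data pool by concatenating its chunk
# objects. Assumes the default file layout (stripe_count=1, 4 MiB objects),
# so object N is simply bytes [N*4MiB, (N+1)*4MiB) of the file.
# Sketch only -- pool name and inode are example/placeholder values.
import subprocess
import sys

DATA_POOL = "cephfs_data"   # placeholder: substitute the real data pool name

def reassemble(ino_hex, out_path):
    """Fetch <ino>.00000000, <ino>.00000001, ... until an object is missing."""
    idx = 0
    with open(out_path, "wb") as out:
        while True:
            obj = "%s.%08x" % (ino_hex, idx)
            # 'rados get' writes the object to a local file; append it.
            ret = subprocess.run(
                ["rados", "-p", DATA_POOL, "get", obj, "/tmp/chunk"],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            if ret.returncode != 0:
                # No such object -- assume we hit the end of the file.
                # (A sparse file with holes would get truncated here.)
                break
            with open("/tmp/chunk", "rb") as chunk:
                out.write(chunk.read())
            idx += 1
    return idx

if __name__ == "__main__":
    ino = sys.argv[1]   # inode number in hex, e.g. 10000000001
    n = reassemble(ino, "recovered_" + ino)
    print("wrote %d chunk(s) for inode %s" % (n, ino))

If I'm reading the source right, the "parent" xattr is an encoded inode_backtrace_t, so something like 'rados -p cephfs_data getxattr 10000000001.00000000 parent > /tmp/bt' followed by 'ceph-dencoder type inode_backtrace_t import /tmp/bt decode dump_json' should give back the original path, but I still need to confirm that on our cluster.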
On Mon, Nov 5, 2018 at 8:10 PM Sergey Malinin <h...@newmail.com> wrote:

> With cppool you got a bunch of useless zero-sized objects because, unlike "export", cppool does not copy the omap data which actually holds all the inode info.
> I suggest truncating journals only as an effort to reduce downtime, followed by an immediate backup of available files to a fresh fs. After resetting the journals, the part of your fs covered by unflushed "UPDATE" entries *will* become inconsistent. The MDS may start to occasionally segfault, but that can be avoided by setting forced read-only mode (in this mode the MDS journal will not flush, so you will need extra disk space).
> If you want to get the original fs recovered and fully functional, you need to somehow replay the journal (I'm unsure whether the cephfs-data-scan tool operates on journal entries).
>
> > On 6.11.2018, at 03:43, Rhian Resnick <xan...@sepiidae.com> wrote:
> >
> > Workload is mixed.
> >
> > We ran a rados cppool to back up the metadata pool.
> >
> > So you're thinking that truncating the journal and purge queue (we are on Luminous) with a reset could bring us online missing just the data from that day (most of it from when the issue started)?
> >
> > If so, we could continue our scan into our recovery partition and give it a try tomorrow after discussions with our recovery team.
> >
> > On Mon, Nov 5, 2018 at 7:40 PM Sergey Malinin <h...@newmail.com> wrote:
> >
>> What was your recent workload? There are chances not to lose much if it was mostly read ops. If so, you *must back up your metadata pool via "rados export" in order to preserve omap data*, then try truncating the journals (along with the purge queue if supported by your ceph version), wiping the session table, and resetting the fs.
>>
>> On 6.11.2018, at 03:26, Rhian Resnick <xan...@sepiidae.com> wrote:
>>
>> That was our original plan. So we migrated to bigger disks and have space, but "recover dentries" uses up all our memory (128 GB) and crashes out.
>>
>> On Mon, Nov 5, 2018 at 7:23 PM Sergey Malinin <h...@newmail.com> wrote:
>>
>>> I had the same problem with multi-MDS. I solved it by freeing up a little space on the OSDs, doing "recover dentries", truncating the journal, and then "fs reset". After that I was able to revert to a single-active MDS and kept on running for a year until it failed on the 13.2.2 upgrade :))
>>>
>>> On 6.11.2018, at 03:18, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>
>>> Our metadata pool went from 700 MB to 1 TB in size in a few hours. It used all the space on the OSD, and now 2 ranks report damage. The recovery tools on the journal fail as they run out of memory, leaving us with the option of truncating the journal and losing data, or recovering using the scan tools.
>>>
>>> Any ideas on solutions are welcome. I posted all the logs and cluster design previously but am happy to do so again. We are not desperate, but we are hurting with this long downtime.
>>>
>>> On Mon, Nov 5, 2018 at 7:05 PM Sergey Malinin <h...@newmail.com> wrote:
>>>
>>>> What kind of damage have you had? Maybe it is worth trying to get the MDS to start and back up the valuable data instead of doing a long-running recovery?
>>>>
>>>> On 6.11.2018, at 02:59, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>>
>>>> Sounds like I get to have some fun tonight.
>>>>
>>>> On Mon, Nov 5, 2018, 6:39 PM Sergey Malinin <h...@newmail.com> wrote:
>>>>
>>>>> Inode linkage (i.e. folder hierarchy) and file names are stored in the omap data of objects in the metadata pool. You can write a script that traverses the whole metadata pool to find out which file names correspond to which objects in the data pool, and fetches the required files via the 'rados get' command.
>>>>>
>>>>> > On 6.11.2018, at 02:26, Sergey Malinin <h...@newmail.com> wrote:
>>>>> >
>>>>> > Yes, 'rados -h'.
>>>>> >
>>>>> >> On 6.11.2018, at 02:25, Rhian Resnick <xan...@sepiidae.com> wrote:
>>>>> >>
>>>>> >> Does a tool exist to recover files from a cephfs data partition? We are rebuilding the metadata but have a user who needs data asap.
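Following up on the metadata-pool traversal Sergey describes above, in case it helps anyone searching the archives later: the directory objects in the metadata pool appear to be named <directory inode hex>.<fragment id hex>, and each omap key is a dentry name with a "_head" suffix; the omap value is a binary-encoded dentry holding the child's inode number, which is what you would feed back into the data pool. Here is the enumeration half of what I'm putting together (the pool name is a placeholder; it only lists names and deliberately does not decode the omap values, since that encoding is the part I'm still digging out of the C++ source):

#!/usr/bin/env python3
# Walk the CephFS metadata pool and print the file/directory names stored in
# each directory object's omap. Directory objects are named
# <directory inode hex>.<fragment id hex> and each omap key is "<name>_head".
# Sketch only -- it lists names but does not decode the omap values
# (the encoded dentries that hold the child inode numbers).
import subprocess

META_POOL = "cephfs_metadata"   # placeholder: substitute the real metadata pool name

def rados(*args):
    """Run a rados CLI command against the metadata pool and return its stdout."""
    proc = subprocess.run(["rados", "-p", META_POOL] + list(args),
                          stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return proc.stdout.decode("utf-8", "replace")

def walk_metadata():
    for obj in rados("ls").splitlines():
        if "." not in obj:
            continue   # skip mds tables, session maps, etc.
        names = [key[:-len("_head")]
                 for key in rados("listomapkeys", obj).splitlines()
                 if key.endswith("_head")]
        if names:
            print("dirfrag %s holds %d dentries: %s"
                  % (obj, len(names), ", ".join(sorted(names))))

if __name__ == "__main__":
    walk_metadata()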