Thanks Andreas. The orphan data is scattered throughout the array, although it's primarily on one OST (30), which seems to have been hit particularly hard by this outage:

[r...@iliadaccess04 lfsck2]# grep ERROR lfsck2.out
lfsck: ost_idx 5: pass2 ERROR: 3817 dangling inodes found (654297 files total)
lfsck: ost_idx 6: pass2 ERROR: 2 dangling inodes found (670416 files total)
lfsck: ost_idx 11: pass2 ERROR: 942 dangling inodes found (673425 files total)
lfsck: ost_idx 12: pass2 ERROR: 64 dangling inodes found (678878 files total)
[lfsck: ost_idx 13: pass3 ERROR: 24.2109MB of orphan data (6 of 776532 files total)
lfsck: ost_idx 14: pass3 ERROR: 5.34375MB of orphan data (2 of 725085 files total)
lfsck: ost_idx 15: pass2 ERROR: 1 dangling inodes found (672942 files total)
lfsck: ost_idx 15: pass3 ERROR: 58.4688MB of orphan data (6 of 739995 files total)
lfsck: ost_idx 16: pass2 ERROR: 1 dangling inodes found (671379 files total)
lfsck: ost_idx 18: pass2 ERROR: 3371 dangling inodes found (692018 files total)
lfsck: ost_idx 18: pass3 ERROR: 5499.86MB of orphan data (620 of 688965 files total)
lfsck: ost_idx 19: pass3 ERROR: 21.375MB of orphan data (8 of 775964 files total)
[20] lfsck: ost_idx 20: pass3 ERROR: 3433.61MB of orphan data (16 of 843328 files total)
[22] zero-length orphan objilfsck: ost_idx 22: pass3 ERROR: 1.21094MB of orphan data (16 of 859527 files total)
lfsck: ost_idx 23: pass2 ERROR: 1 dangling inodes found (663492 files total)
[23] zero-length orphan oblfsck: ost_idx 23: pass3 ERROR: 8571.68MB of orphan data (20 of 838490 files total)
[24] zero-length orphan objid 83735lfsck: ost_idx 24: pass3 ERROR: 4367.45MB of orphan data (16 of 837371 files total)
[25] zero-length orphan objid lfsck: ost_idx 25: pass3 ERROR: 121.996MB of orphan data (16 of 858679 files total)
lfsck: ost_idx 30: pass2 ERROR: 46700 dangling inodes found (682467 files total)
lfsck: ost_idx 30: pass3 ERROR: 45313.4MB of orphan data (7648 of 668343 files total)
[r...@iliadaccess04 lfsck2]#

Thanks again,
Chris

On 11/12/10 3:05 AM, Andreas Dilger wrote:
> On 2010-11-11, at 19:53, Christopher Walker wrote:
>> Thanks very much for your reply. I've tried remaking the mdsdb and all
>> of the ostdb's, but I still get the same error -- it checks the first 34
>> osts without a problem, but can't find the ostdb file for the 35th
>> (which has ost_idx 42):
>>
>> with the filesystem up I can see files on this OST:
>>
>> [cwal...@iliadaccess04 P-Gadget3.3.1]$ lfs getstripe predict.c
>> OBDS:
>> 0: aegalfs-OST0000_UUID ACTIVE
>> ...
>> 33: aegalfs-OST0021_UUID ACTIVE
>> 42: aegalfs-OST002a_UUID ACTIVE
>> predict.c
>> obdidx objid objid group
>> 42 10 0xa 0
>>
>>
>> lfsck identifies several hundred GB of orphan data that we'd like to
>> recover, so we'd really like to run lfsck on this array. We're willing
>> to forgo the recovery on the 35th ost, but I want to make sure that
>> running lfsck -l with the current configuration won't make things worse.
> I'm not sure that what lfsck is reporting in this case is correct. Is the
> orphan data all on the same OST, or spread around separate OSTs? My concern
> is that if lfsck thinks the in-use objects on your estranged OST is actually
> orphan data it will destroy that data.
>
> If there are a small number of very large objects on other OSTs that are
> making up the bulk of the orphan space usage, you could mount those OSTs as
> type ldiskfs and delete the objects by hand to free up the space.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
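
P.S. For anyone who wants to tally the lfsck ERROR lines per OST (to see whether the orphan space is concentrated on one OST or spread around), here is a minimal Python sketch. It is not part of lfsck; the function name `summarize` and the `SAMPLE` text are only illustrative, and the sample holds just four of the lines from the grep output in this message. The regexes assume the "pass2 ... dangling inodes" and "pass3 ... MB of orphan data" wording shown in that output, and they search inside each line so that stray prefixes such as "[23] zero-length orphan ob" are ignored.

```python
import re
from collections import defaultdict

# Four lines copied from the grep output in this message (not the full list).
SAMPLE = """\
lfsck: ost_idx 18: pass3 ERROR: 5499.86MB of orphan data (620 of 688965 files total)
[23] zero-length orphan oblfsck: ost_idx 23: pass3 ERROR: 8571.68MB of orphan data (20 of 838490 files total)
lfsck: ost_idx 30: pass2 ERROR: 46700 dangling inodes found (682467 files total)
lfsck: ost_idx 30: pass3 ERROR: 45313.4MB of orphan data (7648 of 668343 files total)
"""

# Patterns assume the message wording seen in the output above.
ORPHAN_RE = re.compile(
    r"ost_idx (\d+): pass3 ERROR: ([\d.]+)MB of orphan data \((\d+) of \d+ files total\)"
)
DANGLING_RE = re.compile(
    r"ost_idx (\d+): pass2 ERROR: (\d+) dangling inodes found"
)


def summarize(text):
    """Return per-OST totals: orphan MB, orphan object count, dangling inodes."""
    orphan_mb = defaultdict(float)
    orphan_objs = defaultdict(int)
    dangling = defaultdict(int)
    for m in ORPHAN_RE.finditer(text):
        ost = int(m.group(1))
        orphan_mb[ost] += float(m.group(2))
        orphan_objs[ost] += int(m.group(3))
    for m in DANGLING_RE.finditer(text):
        dangling[int(m.group(1))] += int(m.group(2))
    return dict(orphan_mb), dict(orphan_objs), dict(dangling)


if __name__ == "__main__":
    orphan_mb, orphan_objs, dangling = summarize(SAMPLE)
    # List OSTs with the largest orphan space first.
    for ost in sorted(orphan_mb, key=orphan_mb.get, reverse=True):
        print(f"ost_idx {ost}: {orphan_mb[ost]:.2f} MB in {orphan_objs[ost]} orphan objects, "
              f"{dangling.get(ost, 0)} dangling inodes")
```

To run it on the real output, read lfsck2.out (or the grep result) into a string and pass it to `summarize` in place of `SAMPLE`. Dividing the MB by the object count for each OST also gives a quick idea of whether a few very large objects account for most of the space.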