On Friday, September 03, 2010, Bernd Schubert wrote:
> On Friday, September 03, 2010, Bob Ball wrote:
> > We added a new OSS to our 1.8.4 Lustre installation. It has 6 OST of
> > 8.9TB each. Within a day of having these on-line, one OST stopped
> > accepting new files. I cannot get it to activate. The other 5 seem
> > fine.
> >
> > On the MDS "lctl dl" shows it IN, but not UP, and files can be read from
> > it:
> > 33 IN osc umt3-OST001d-osc umt3-mdtlov_UUID 5
> >
> > However, I cannot get it to re-activate:
> > lctl --device umt3-OST001d-osc activate
> >
> > [...]
> >
> > > LustreError: 4697:0:(filter.c:3172:filter_handle_precreate())
> > umt3-OST001d: ignoring bogus orphan destroy request: obdid
> > 11309489156331498430 last_id 0
> >
> > Can anyone tell me what must be done to recover this disk volume?
>
> Check out section 23.3.9 in the Lustre manual ("How to Fix a Bad LAST_ID on
> an OST").
>
> It is on my TODO list to write a tool to automatically correct the
> "lov_objid", but as of now I don't have it yet. Somehow your lov_objid
> file has a completely wrong value for this OST.
> Now, when you say "files can be read from it", are you sure there are
> already files on that OST? Because the error message says that the last_id
> is zero and so you should not have a single file on it. If that is also
> wrong, you will need to correct it as well. You can do that manually, or
> you can use a patched e2fsprogs version that will do that for you.
>
> Patches are here:
> https://bugzilla.lustre.org/show_bug.cgi?id=22734
>
> Packages can be found on my home page:
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/e2fsprogs/
>
>
> If you want to do it automatically, you will need to create an lfsck mdsdb
> file (the hdr file is sufficient, see the lfsck section in the manual) and
> then you will need to run e2fsck for that OST as if you want to create an
> OSTDB file. That will start pass6, and if you then run e2fsck *without*
> "-n", so in correcting mode, it will correct the LAST_ID file to what it
> finds on disk. With "-v" it will also tell you the old and the new value
> and then you will need to put that value properly coded into the MDS
> lov_objid file.
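
To illustrate what "properly coded" in the quoted text could look like, here is a minimal Python sketch. It rests on an assumption that is not stated in this thread: that lov_objid is a flat array of little-endian unsigned 64-bit object IDs, one slot per OST index. The function names are made up for illustration, and the code only works on an in-memory copy of a dumped file, never on the live MDT.

```python
import re
import struct

# Assumption: each lov_objid slot is one little-endian unsigned 64-bit object ID.
ENTRY = struct.Struct("<Q")


def ost_index_from_name(name: str) -> int:
    """Extract the OST index from a name such as 'umt3-OST001d' (hex suffix)."""
    match = re.search(r"OST([0-9a-fA-F]{4})", name)
    if match is None:
        raise ValueError(f"no OST index found in {name!r}")
    return int(match.group(1), 16)


def read_lov_objid(raw: bytes) -> list[int]:
    """Decode a dumped lov_objid image into a list indexed by OST index."""
    usable = len(raw) - len(raw) % ENTRY.size  # ignore a trailing partial slot
    return [ENTRY.unpack_from(raw, off)[0] for off in range(0, usable, ENTRY.size)]


def set_last_id(raw: bytes, ost_index: int, last_id: int) -> bytes:
    """Return a copy of the image with the slot for ost_index set to last_id."""
    buf = bytearray(raw)
    end = (ost_index + 1) * ENTRY.size
    if len(buf) < end:
        # Pad with zeroed slots if the image is shorter than the target index.
        buf.extend(b"\x00" * (end - len(buf)))
    ENTRY.pack_into(buf, ost_index * ENTRY.size, last_id)
    return bytes(buf)
```

With these helpers, `ost_index_from_name("umt3-OST001d")` gives slot 29, and `set_last_id(image, 29, value)` returns a patched copy of the image that can be compared against the original before anything is written back.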

Update for the lov_objid file: actually, if you rename or delete it (please
rename it, so that you have a backup), the MDS should be able to re-create it
from the OST LAST_ID data. So if the troublesome OST has no data yet, this
will be very easy. If it already has data, you will need to correct the
LAST_ID on that OST first.

Cheers,
Bernd

--
Bernd Schubert
DataDirect Networks