Lustre users,

We've had some hardware and Lustre crashes recently, and we are trying some 
new hardware configurations to improve performance and, hopefully, stability.  
To make the changes, we first migrate the data off the affected OSTs, then 
reconfigure them and put them back online.

I'm guessing that the MDS simply doesn't know about the dangling objects on 
the OST.  On one OST that I've examined, I noticed that most of the orphaned 
objects are from Oct 27, and somewhere between Oct 26 and 27 the metadata 
server had a software crash (it was running 1.8.1.1 at the time and has since 
been upgraded to 1.8.4).  My guess is that the clients pushed data directly to 
the OST but could not update the MDT, thus leaving the stray files on the OST.
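
For reference, the per-day tally mentioned above could be gathered with 
something like the rough Python sketch below.  It only reads the regular UNIX 
metadata of each object file and groups the files by modification date.  The 
mount point /mnt/ost and the function name summarize_objects are hypothetical, 
and it assumes that the OST's backing filesystem is readable at that path and 
that each object's file name is its object ID.

```python
import os
import stat
import sys
import time
from collections import Counter


def summarize_objects(root):
    """Walk root and collect UNIX metadata for every regular file.

    Returns (records, per_day): a list of dicts, one per file, and a
    Counter mapping local modification date (YYYY-MM-DD) to file count.
    """
    records = []
    per_day = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, etc.
            day = time.strftime("%Y-%m-%d", time.localtime(st.st_mtime))
            records.append({
                "object": name,  # assumed to be the object ID
                "path": path,
                "size": st.st_size,
                "uid": st.st_uid,
                "gid": st.st_gid,
                "mtime_day": day,
            })
            per_day[day] += 1
    return records, per_day


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python summarize.py /mnt/ost   (hypothetical mount point)
    _records, _per_day = summarize_objects(sys.argv[1])
    for day, count in sorted(_per_day.items()):
        print(day, count)
```

A cluster of objects on a single day, as with Oct 27 here, would be consistent 
with the MDS crash theory, but it does not prove it.  The owner and size 
columns may help decide which users to ask about failed writes.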

Is there any way to get more information about the files, or possibly to clean 
up these files while Lustre is active?  All I have are the object IDs and the 
regular UNIX metadata stored on the object files.  When these errors occurred, 
did the write fail on the client side, so that the users are not expecting the 
data to be there?

Unfortunately, a full lfsck does not seem to be an option because of the 
amount of time it takes.  We are more than OK with losing the files, all the 
more so if the users got a failed write.

I've seen too many of these stray files for me to ignore them anymore.  If 
anyone has any tips on how to deal with these orphaned objects, please let me 
know.

Thanks,

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------




_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss