You may want to try "The Dilger Procedure". See http://wiki.hpc.ufl.edu/index.php/Lustre
This has saved us a number of times.

Charlie Taylor
UF HPC center

On Fri, 2009-09-04 at 08:45 -0400, Dave Johnson wrote:
> Our lustre filesystem is unable to run because the MDS host
> crashes immediately while mounting the metadata file system.
> It is accessing an invalid address (deadbeef) in the routine
> mds_free_client. The Lustre version is 1.6.0.1. Copying the
> crash log from the console by hand (lost the password to the
> management processors so we can't do serial console anymore):
>
> mount.lustre Cannot handle kernel paging request mds_client_free+612
> Trace:
> mds_destroy_export
> obdclass:class_export_destroy
> obdclass:obd_zombie_impexp_call
> obdclass:class_detach
> obdclass:class_process_config
> obdclass:class_manual_cleanup
> obdclass:lustre_fill_super
>
> I found messages in the mailing list about removing CATALOGS and OBJECTS/*
> and mounting using -o abort_recov. I tried these things, in addition to
> removing PENDING/* (all empty files). This last crash trace was done
> (accidentally) without the -o abort_recov mount option, but the outcome
> did not improve on the earlier attempts.
>
> Any help in this would be greatly appreciated.
>
> Thanks,
>
> -- ddj
>
> Dave Johnson
> Brown University CCV
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
