"ASSERTION(old_inode->i_state & I_FREEING)" is the infamous bug17485. You will need to run lfsck to fix it.
On Saturday 10 October 2009, Wojciech Turek wrote:
> Hi,
>
> Did you get to the bottom of this?
>
> We are having exactly the same problem with our lustre-1.6.6 (rhel4) file
> systems. Recently it got worse and the MDS crashes quite frequently; when we
> run e2fsck there are errors that get fixed, but after some time we still see
> the same errors in the logs about missing objects, and files get corrupted
> (?-----------). Clients also LBUG quite frequently with this message:
> (osc_request.c:2904:osc_set_data_with_check()) LBUG
> This looks like a serious Lustre problem, but so far I haven't found any
> clues on it, even after a long search through Lustre bugzilla.
>
> Our MDSs and OSSs are on UPSes, the RAID is behaving OK, and we don't see
> any errors in the syslog.
>
> I would be grateful for some hints on this one.
>
> Wojciech
>
> 2009/8/24 rishi pathak <mailmaverick...@gmail.com>
>
> > Hi,
> >
> > Our lustre fs comprises 15 OSTs/OSSs and 1 MDS with no failover. Clients
> > as well as servers run lustre-1.6 and kernel 2.6.9-18.
> >
> > Doing an ls -ltr on a directory in the lustre fs throws the following
> > errors (as taken from the lustre logs) on the client:
> >
> > 00000008:00020000:0:1251099455.304622:0:724:0:(osc_request.c:2898:osc_set_data_with_check()) ### inconsistent l_ast_data found ns: scratch-OST0005-osc-ffff81201e8dd800 lock: ffff811f9af04000/0xec0d1c36da6992fd lrc: 3/1,0 mode: PR/PR res: 570622/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 100000 remote: 0xb79b445e381bc9e6 expref: -99 pid: 22878
> > 00000008:00040000:0:1251099455.337868:0:724:0:(osc_request.c:2904:osc_set_data_with_check()) ASSERTION(old_inode->i_state & I_FREEING) failed:Found existing inode ffff811f2cf693b8/197272544/1895600178 state 0 in lock: setting data to ffff8118ef8ed5f8/207519777/1771835328
> > 00000000:00040000:0:1251099455.360090:0:724:0:(osc_request.c:2904:osc_set_data_with_check()) LBUG
> >
> > On the scratch-OST0005 OST it shows:
> >
> > Aug 24 10:22:53 yn266 kernel: LustreError: 3023:0:(ldlm_resource.c:851:ldlm_resource_add()) lvbo_init failed for resource 569204: rc -2
> > Aug 24 10:22:53 yn266 kernel: LustreError: 3023:0:(ldlm_resource.c:851:ldlm_resource_add()) Skipped 19 previous similar messages
> > Aug 24 12:40:43 yn266 kernel: LustreError: 2737:0:(ldlm_resource.c:851:ldlm_resource_add()) lvbo_init failed for resource 569195: rc -2
> > Aug 24 12:44:59 yn266 kernel: LustreError: 2835:0:(ldlm_resource.c:851:ldlm_resource_add()) lvbo_init failed for resource 569198: rc -2
> >
> > We are getting these kinds of errors for many clients.
> >
> > ## History ##
> > Prior to these occurrences, our MDS showed signs of failure in that the
> > cpu load was shooting above 100 (on a quad-core, quad-socket system) and
> > users were complaining about slow storage performance. We took it offline
> > and ran fsck on the unmounted MDS and OSTs. fsck on the OSTs went fine,
> > but it showed some errors, which were fixed. For a data integrity check,
> > the mdsdb and ostdb databases were built and lfsck was run on a client
> > (the client was mounted with abort_recov).
> >
> > lfsck was run in the following order:
> > lfsck with no fix - reported dangling inodes and orphaned objects
> > lfsck with -l (back up orphaned objects)
> > lfsck with -d and -c (delete orphaned objects and create missing OST
> > objects referenced by the MDS)
> >
> > After the above operations, on clients we were seeing files shown in red
> > and blinking.
> > Doing a stat on them came back with the error 'No such file or directory'.
> >
> > My question is whether the order in which lfsck was run (and whether
> > lfsck should be run multiple times) is related to the errors we are
> > getting.
> >
> > --
> > Regards--
> > Rishi Pathak
> > National PARAM Supercomputing Facility
> > Center for Development of Advanced Computing (C-DAC)
> > Pune University Campus, Ganesh Khind Road
> > Pune-Maharastra
>

--
Bernd Schubert
DataDirect Networks

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss