On 2010-01-18, at 23:09, Wojciech Turek wrote:
> Thanks Andreas for the quick answer. So upgrading to a newer version of
> collectl should fix it?

No, it is a Lustre bug, not a collectl bug. I think a newer version of
Lustre has fixes in lprocfs to avoid such races.

> 2010/1/18 Andreas Dilger <adil...@sun.com>:
>> On 2010-01-18, at 19:59, Wojciech Turek wrote:
>>>
>>> RHEL4 Lustre-1.6.6
>>>
>>> Does the kernel panic below ring a bell to anyone?
>>>
>>> RIP: 0010:[<ffffffff801af8f0>] <ffffffff801af8f0>{proc_pid_status+534}
>>> Process collectl (pid: 13546, threadinfo 0000010416fc8000, task
>>> Call Trace:<ffffffff801acfd3>{proc_info_read+85}
>>>        <ffffffff80178dac>{vfs_read+207}
>>>        <ffffffff80179008>{sys_read+69}
>>>        <ffffffff80110236>{system_call+126}
>>
>>
>> This looks like collectl reading from a /proc entry after it was
>> cleaned up. I think several such bugs were already fixed.
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>
> --
> Wojciech Turek
>
> Assistant System Manager
>
> High Performance Computing Service
> University of Cambridge
> Email: wj...@cam.ac.uk
> Tel: (+)44 1223 763517

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.