On 2010-01-18, at 23:09, Wojciech Turek wrote:
> Thanks, Andreas, for the quick answer. So upgrading to a newer version of
> collectl should fix it?

No, it is a Lustre bug, not a collectl bug. I think a newer version of
Lustre has fixes in lprocfs to avoid such races.
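
To illustrate the general class of race described here (a reader using an
entry while it is being cleaned up), below is a minimal user-space sketch.
It is not Lustre or kernel code, and it is not the actual lprocfs fix; the
EntryTable class and its method names are hypothetical and exist only for
this illustration. The idea it shows is that lookup and use of an entry
happen under the same lock that removal takes, so a read either sees a
live entry or fails cleanly.

```python
import threading


class EntryTable:
    """Toy stand-in for a table of /proc-style entries (hypothetical, not Lustre code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # name -> callable that produces the entry's text

    def register(self, name, show):
        with self._lock:
            self._entries[name] = show

    def remove(self, name):
        # Removal takes the same lock as read(), so it cannot overlap
        # with a read that is already in progress.
        with self._lock:
            self._entries.pop(name, None)

    def read(self, name):
        # Lookup and use happen under one lock, so the entry cannot be
        # torn down between finding it and calling it.
        with self._lock:
            show = self._entries.get(name)
            if show is None:
                return None  # entry already cleaned up: fail the read cleanly
            return show()


table = EntryTable()
table.register("stats", lambda: "reads 42\n")
first = table.read("stats")   # entry is live
table.remove("stats")
second = table.read("stats")  # entry is gone, read returns None
```

In the unsafe variant of this pattern, the reader would look up the entry,
drop the lock, and only then use it, leaving a window in which removal can
free what the reader is about to touch.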

> 2010/1/18 Andreas Dilger <adil...@sun.com>:
>> On 2010-01-18, at 19:59, Wojciech Turek wrote:
>>>
>>> RHEL4 Lustre-1.6.6
>>>
>>> Does the kernel panic below ring a bell with anyone?
>>>
>>> RIP: 0010:[<ffffffff801af8f0>] <ffffffff801af8f0>{proc_pid_status+534}
>>> Process collectl (pid: 13546, threadinfo 0000010416fc8000, task
>>> Call Trace:<ffffffff801acfd3>{proc_info_read+85}
>>>           <ffffffff80178dac>{vfs_read+207}
>>>           <ffffffff80179008>{sys_read+69}
>>>           <ffffffff80110236>{system_call+126}
>>
>>
>> This looks like collectl reading from a /proc entry after it was cleaned
>> up.  I think several such bugs were already fixed.
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>>
>
>
>
> -- 
> Wojciech Turek
>
> Assistant System Manager
>
> High Performance Computing Service
> University of Cambridge
> Email: wj...@cam.ac.uk
> Tel: (+)44 1223 763517


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
