I don't recommend using "scan_op_timeout" unless it is combined with
"exit_on_timeout = yes;".
On its own, it can leave hung threads in a buggy state.

If robinhood is stuck on a filesystem call, I think it's better to leave it
as it is
and identify the problematic syscall using the "crash" utility
or by dumping your FS client node.
If the hang is transient, RH will resume scanning once the filesystem is
back.

You can check in dmesg whether there was a Lustre client eviction or something similar.
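
As a rough sketch of that dmesg check, the helper below keeps only the kernel log lines that mention both Lustre and an eviction. The function name "lustre_evictions" is made up for illustration, and the exact wording of eviction messages varies between Lustre versions, so treat the patterns as a starting point rather than a complete filter.

```shell
# Hypothetical helper (not part of robinhood or Lustre): print log lines
# that mention Lustre and an eviction. It reads the log text on stdin, so
# it works with live dmesg output or with a saved log file.
lustre_evictions() {
    grep -i 'lustre' | grep -i 'evict'
}

# Usage on the client node (may require root):
#   dmesg | lustre_evictions
```

If this prints nothing, the hang is probably not caused by an eviction that the client logged, and the syscall-level investigation above is the next step.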

Also, disabling Lustre statahead avoids some Lustre bugs:
echo 0 > /proc/fs/lustre/llite/*/statahead_max
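
If the client mounts more than one Lustre filesystem, the glob above matches several files, and some shells refuse to redirect to more than one target. A minimal sketch of a loop that handles that case is shown below. The function name and its optional directory argument are assumptions added for illustration; the default path is the one from the command above.

```shell
# Hypothetical helper: write 0 to every statahead_max file found under the
# llite directory. The directory argument is an illustrative addition and
# defaults to the path used in the one-line command.
disable_statahead() {
    llite_dir=${1:-/proc/fs/lustre/llite}
    for f in "$llite_dir"/*/statahead_max; do
        [ -e "$f" ] || continue   # the glob matched nothing
        echo 0 > "$f"
    done
}

# Usage on the client node (as root):
#   disable_statahead
```

Where lctl is available, "lctl set_param llite.*.statahead_max=0" should have the same effect. Either way the setting is not persistent and has to be reapplied after a remount.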

I hope this helps
-Thomas

On 10/21/13 16:43, Crowe, Tom wrote:
> My apologies. Our Robinhood version is 2.4.3. We have the tmpfs, adm and
> webgui RPMs installed.
>
> Our version of Lustre is 2.1.6.
>
> Thank You.
>
> -Tom
>
> On 10/21/13 10:36 AM, "Aurélien Degrémont" <[email protected]>
> wrote:
>
>> Hello
>>
>> Could you give us more details on the Lustre version you are running?
>> By the way, if you just started using RBH, the latest version is 2.4.3 :)
>>
>>
>> Aurélien
>>
>> On 21/10/2013 15:51, Crowe, Tom wrote:
>>> We have installed Robinhood 2.4.1 via RPMs, and are trying to
>>> establish a baseline scan of our Lustre MDS.
>>>
>>> We continue to experience threads hanging when scanning/stat'ing files,
>>> and have rolled the timeout down to 1m (scan_op_timeout) in an attempt
>>> to let the initial scan complete.
>>>
>>> When we see the hangs in the log, we can run ls -l and stat on the
>>> files the hung threads were processing, and the output returns immediately.
>>>
>>> Looking for any advice on tuning/parameter settings to limit the hangs.
>>>
>>> Thank You
>>>
>>> -Tom
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> October Webinars: Code for Performance
>>> Free Intel webinars can help you accelerate application performance.
>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
>>> the latest Intel processors and coprocessors. See abstracts and register >
>>> http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> robinhood-support mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/robinhood-support


