Re: [lustre-discuss] High MDS load

2020-06-08 Thread BOUYER, QUENTIN

Hi,

Maybe, you can also try this :

https://github.com/quentinbouyer/topmdt

Le 28/05/2020 à 18:32, Chad DeWitt a écrit :

Hi Heath,

Hope you're doing well!

Your mileage may vary (and quite frankly, there may be better
approaches), but this is a quick and dirty set of steps to find which
client is issuing a large number of metadata operations:


  * Log into the affected MDS.

  * Change into the exports directory.

cd /proc/fs/lustre/mdt/<MDT name>/exports/

  * OPTIONAL: Set all your stats to zero and clear out stale
clients. (If you don't want to do this step, you don't really
have to, but it does make it easier to see the stats if you
are starting with a clean slate. In fact, you may want to skip
this the first time through and just look for high numbers. If
a particular client is the source of the issue, the stats
should clearly be higher for that client when compared to the
others.)

echo "C" > clear

  * Wait for a few seconds and dump the stats.

for client in $( ls -d */ ) ; do echo && echo && echo ${client} && cat ${client}/stats && echo ; done


You'll get a listing of stats for each mounted client like so:

open  278676 samples [reqs]
close 278629 samples [reqs]
mknod 2320 samples [reqs]
unlink  495 samples [reqs]
mkdir 575 samples [reqs]
rename  1534 samples [reqs]
getattr 277552 samples [reqs]
setattr 550 samples [reqs]
getxattr  2742 samples [reqs]
statfs  350058 samples [reqs]
samedir_rename  1534 samples [reqs]


(Don't worry if some of the clients give back what appears to be empty
stats. That just means they are mounted, but have not yet performed
any metadata operations.) From this data, you are looking for any
"high" samples.  The client with the high samples is usually the
culprit.  For the example client stats above, I would look to see what
process(es) on this client are listing, opening, and then closing files
in Lustre... The advantage of this method is that you see exactly
which metadata operations are occurring. (I know there are also
various utilities included with Lustre that may give this information
as well, but I just go to the source.)
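
If many clients are mounted, it can help to rank the exports by a single
counter instead of eyeballing each one. A minimal sketch, run from the same
exports directory (it assumes the stats format shown above; swap getattr for
open, statfs, etc. as needed):

for client in */ ; do
    n=$(grep -w getattr "${client}stats" 2>/dev/null | awk '{print $2}')
    echo "${n:-0} ${client%/}"
done | sort -n

The export directories are named after client NIDs, so once you have a
suspect such as 10.1.1.5@o2ib (a made-up address), something like
getent hosts 10.1.1.5 will usually map it back to a hostname.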


Once you find the client, you can use various commands, such as mount
and lsof, to get a better understanding of what may be hitting Lustre.
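
For instance, on the suspect client (the /lustre mount point below is only an
assumption; substitute your own):

mount -t lustre                  # confirm which Lustre file systems are mounted
lsof /lustre 2>/dev/null         # processes holding files open on that mount point
ps -eo pid,user,comm,args --sort=-%cpu | head    # what is busy right now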


Some of the more common issues I've found that can cause a high MDS load:

  * Listing a directory containing a large number of files. (Instead,
unalias ls or, better yet, use lfs find; a sketch follows this list.)
  * Removing many files.
  * Opening and closing many files. (It may be better to move such data over
to another file system, such as XFS, etc.  We keep some of our deep
learning off Lustre, because of the sheer number of small files.)
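
As a rough example of the ls replacement (paths hypothetical, and the option
spelling may differ between Lustre versions, so check lfs help find):

/bin/ls -U /lustre/bigdir                # bypass ls aliases; no sorting, no per-file stat
lfs find /lustre/bigdir --maxdepth 1     # name listing driven by Lustre itself

The point of both is to avoid issuing a getattr for every entry just to print
names.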

Of course the actual mitigation of the load depends on what the user 
is attempting to do...


I hope this helps...

Cheers,
Chad



Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

ccdew...@uncc.edu | www.uncc.edu




If you are not the intended recipient of this transmission or a person 
responsible for delivering it to the intended recipient, any 
disclosure, copying, distribution, or other use of any of the 
information in this transmission is strictly prohibited. If you have 
received this transmission in error, please notify me immediately by 
reply email or by telephone at 704-687-7802. Thank you.




On Thu, May 28, 2020 at 11:37 AM Peeples, Heath <hea...@hpc.msstate.edu> wrote:


I have 2 MDSs, and periodically the load on one of them (either at one time
or another) peaks above 300, causing the file system to basically
stop.  This lasts for a few minutes and then goes away.  We can’t
identify any one user running jobs at the times we see this, so
it’s hard to pinpoint this on a user doing something to cause it.
Could anyone point me in the direction of how to begin debugging
this?  Any help is greatly appreciated.

Heath

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] High MDS load

2020-05-28 Thread Carlson, Timothy S
Since some mailers don't like attachments, I'll just paste in the script we use 
here.  

I call the script with

./parse.sh | sort -k3 -n

You just need to change out the name of your MDT in two places.

#!/bin/bash
set -e
SLEEP=10
stats_clear()
{
cd $1
echo clear >clear
}

stats_print()
{
cd $1
echo "= $1 "
for i in *; do 
[ -d $i ] || continue
out=`cat ${i}/stats | grep -v "snapshot_time" | grep -v "ping" || true`
[ -n "$out" ] || continue
echo $i $out
done
echo "="
echo
}

for i in /proc/fs/lustre/mdt/lzfs-MDT /proc/fs/lustre/obdfilter/*OST*; do
dir="${i}/exports"
[ -d "$dir" ] || continue
stats_clear "$dir"
done
echo "Waiting ${SLEEP}s after clearing stats"
sleep $SLEEP

for i in /proc/fs/lustre/mdt/lzfs-MDT/ /proc/fs/lustre/obdfilter/*OST*; do
dir="${i}/exports"
[ -d "$dir" ] || continue
stats_print "$dir"
done
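
On newer Lustre releases the same counters can also be read through lctl,
which avoids hard-coding the /proc paths; a rough equivalent of the print
half of the script (assuming the parameter names mirror the /proc layout)
would be:

lctl get_param mdt.*.exports.*.stats obdfilter.*.exports.*.stats 2>/dev/null \
    | grep -v snapshot_time | grep -v ping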




On 5/28/20, 9:28 AM, "lustre-discuss on behalf of Bernd Melchers" wrote:

>I have 2 MDSs, and periodically the load on one of them (either at one time or
>another) peaks above 300, causing the file system to basically stop.
>This lasts for a few minutes and then goes away.  We can't identify any
>one user running jobs at the times we see this, so it's hard to
>pinpoint this on a user doing something to cause it.   Could anyone
>point me in the direction of how to begin debugging this?  Any help is
>greatly appreciated.

I am not able to solve this problem, but...
We saw this behaviour (Lustre 2.12.3 and 2.12.4) together with Lustre kernel
thread BUG messages in the kernel log (dmesg output); if I remember correctly,
they came from ll_ost_io threads on the OSSs, but with other messages on the
MDS. At that time the Omni-Path interface was no longer pingable. We were not
able to say which crashed first, Omni-Path or the Lustre parts in the kernel.
Perhaps you can check whether your MDSs are still pingable from your clients
(using the network interface of your Lustre installation). Otherwise it is
expected that you get a high load, because your Lustre I/O threads cannot
satisfy requests.

Mit freundlichen Grüßen
Bernd Melchers

-- 
Archiv- und Backup-Service | fab-serv...@zedat.fu-berlin.de
Freie Universität Berlin   | Tel. +49-30-838-55905
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] High MDS load

2020-05-28 Thread Cameron Harr

  
Are you using any Lustre monitoring tools? We use ltop from the
LMT package (https://github.com/LLNL/lmt), and during that time of
high load you could see if there are bursts of IOPS coming in.
Running iotop or iostat might also provide some insight into the
load if it is based on I/O.
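
For example, during a spike something as simple as this on the MDS is often
enough to show whether the load is I/O-driven:

iostat -xm 5 3     # extended per-device statistics, three 5-second samples
iotop -o -b -n 3   # batch-mode snapshots of only the processes actually doing I/O
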
Cameron

On 5/28/20 8:37 AM, Peeples, Heath wrote:

I have 2 MDSs, and periodically the load on one of them (either at one time
or another) peaks above 300, causing the file system to basically stop.  This
lasts for a few minutes and then goes away.  We can’t identify any one user
running jobs at the times we see this, so it’s hard to pinpoint this on a
user doing something to cause it.  Could anyone point me in the direction of
how to begin debugging this?  Any help is greatly appreciated.

Heath

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] High MDS load

2020-05-28 Thread Chad DeWitt
Hi Heath,

Hope you're doing well!

Your mileage may vary (and quite frankly, there may be better approaches),
but this is a quick and dirty set of steps to find which client is issuing
a large number of metadata operations:


   - Log into the affected MDS.


   - Change into the exports directory.

cd /proc/fs/lustre/mdt/<MDT name>/exports/


   - OPTIONAL: Set all your stats to zero and clear out stale clients. (If
   you don't want to do this step, you don't really have to, but it does make
   it easier to see the stats if you are starting with a clean slate. In fact,
   you may want to skip this the first time through and just look for high
   numbers. If a particular client is the source of the issue, the stats
   should clearly be higher for that client when compared to the others.)

echo "C" > clear


   - Wait for a few seconds and dump the stats.

for client in $( ls -d */ ) ; do echo && echo && echo ${client} && cat ${client}/stats && echo ; done


You'll get a listing of stats for each mounted client like so:

open            278676 samples [reqs]
close           278629 samples [reqs]
mknod           2320 samples [reqs]
unlink          495 samples [reqs]
mkdir           575 samples [reqs]
rename          1534 samples [reqs]
getattr         277552 samples [reqs]
setattr         550 samples [reqs]
getxattr        2742 samples [reqs]
statfs          350058 samples [reqs]
samedir_rename  1534 samples [reqs]


(Don't worry if some of the clients give back what appears to be empty
stats. That just means they are mounted, but have not yet performed any
metadata operations.) From this data, you are looking for any "high"
samples.  The client with the high samples is usually the culprit.  For the
example client stats above, I would look to see what process(es) on this
client are listing, opening, and then closing files in Lustre... The
advantage of this method is that you see exactly which metadata
operations are occurring. (I know there are also various utilities included
with Lustre that may give this information as well, but I just go to the
source.)

Once you find the client, you can use various commands, such as mount and
lsof, to get a better understanding of what may be hitting Lustre.

Some of the more common issues I've found that can cause a high MDS load:

   - Listing a directory containing a large number of files. (Instead, unalias
   ls or, better yet, use lfs find.)
   - Removing many files.
   - Opening and closing many files. (It may be better to move such data over
   to another file system, such as XFS, etc.  We keep some of our deep learning
   off Lustre, because of the sheer number of small files.)

Of course the actual mitigation of the load depends on what the user is
attempting to do...

I hope this helps...

Cheers,
Chad



Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

ccdew...@uncc.edu | www.uncc.edu




If you are not the intended recipient of this transmission or a person
responsible for delivering it to the intended recipient, any disclosure,
copying, distribution, or other use of any of the information in this
transmission is strictly prohibited. If you have received this transmission
in error, please notify me immediately by reply email or by telephone at
704-687-7802. Thank you.


On Thu, May 28, 2020 at 11:37 AM Peeples, Heath wrote:

> I have 2 MDSs, and periodically the load on one of them (either at one time or
> another) peaks above 300, causing the file system to basically stop.  This
> lasts for a few minutes and then goes away.  We can’t identify any one user
> running jobs at the times we see this, so it’s hard to pinpoint this on a
> user doing something to cause it.   Could anyone point me in the direction
> of how to begin debugging this?  Any help is greatly appreciated.
>
>
>
> Heath
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] High MDS load

2020-05-28 Thread Bernd Melchers
>I have 2 MDSs, and periodically the load on one of them (either at one time or
>another) peaks above 300, causing the file system to basically stop.
>This lasts for a few minutes and then goes away.  We can't identify any
>one user running jobs at the times we see this, so it's hard to
>pinpoint this on a user doing something to cause it.   Could anyone
>point me in the direction of how to begin debugging this?  Any help is
>greatly appreciated.

I am not able to solve this problem, but...
We saw this behaviour (Lustre 2.12.3 and 2.12.4) together with Lustre kernel
thread BUG messages in the kernel log (dmesg output); if I remember correctly,
they came from ll_ost_io threads on the OSSs, but with other messages on the
MDS. At that time the Omni-Path interface was no longer pingable. We were not
able to say which crashed first, Omni-Path or the Lustre parts in the kernel.
Perhaps you can check whether your MDSs are still pingable from your clients
(using the network interface of your Lustre installation). Otherwise it is
expected that you get a high load, because your Lustre I/O threads cannot
satisfy requests.
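
A minimal way to check that from a client at the LNet level (the NID below is
made up; run lctl list_nids on the MDS to find the real one):

lctl list_nids                 # NIDs configured on this node
lctl ping 192.168.1.10@o2ib    # LNet-level ping of the MDS NID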

Mit freundlichen Grüßen
Bernd Melchers

-- 
Archiv- und Backup-Service | fab-serv...@zedat.fu-berlin.de
Freie Universität Berlin   | Tel. +49-30-838-55905
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] High MDS load

2020-05-28 Thread Peeples, Heath
I have 2 MDSs, and periodically the load on one of them (either at one time or another)
peaks above 300, causing the file system to basically stop.  This lasts for a
few minutes and then goes away.  We can't identify any one user running jobs at 
the times we see this, so it's hard to pinpoint this on a user doing something 
to cause it.   Could anyone point me in the direction of how to begin debugging 
this?  Any help is greatly appreciated.

Heath
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] High MDS load, but no activity

2017-07-27 Thread Robin Humble
Hi Kevin,

On Thu, Jul 27, 2017 at 08:18:04AM -0400, Kevin M. Hildebrand wrote:
>We recently updated to Lustre 2.8 on our cluster, and have started seeing
>some unusual load issues.
>Last night our MDS load climbed to well over 100, and client performance
>dropped to almost zero.
>Initially this appeared to be related to a number of jobs that were doing
>large numbers of opens/closes, but even after killing those jobs, the MDS
>load did not recover.
>
>Looking at stats in /proc/fs/lustre/mdt/scratch-MDT/exports showed
>little to no activity on the MDS.  Looking at iostat showed almost no disk
>activity to the MDT (or to any device, for that matter), and minimal IO wait.
>Memory usage (the machine has 128GB) showed over half of that memory free.

sounds like VM spinning to me. check /proc/zoneinfo, /proc/vmstat etc.

do you have zone_reclaim_mode=0? that's an oldie, but important to have
set to zero.
 sysctl vm.zone_reclaim_mode

failing that (and assuming you have a 2 or more numa zone server) I
would guess it's all the zone affinity stuff in lustre these days.
you can turn most of it off with a modprobe option
  options libcfs cpu_npartitions=1

what happens by default is that a bunch of lustre threads are bound to
numa zones and preferentially and aggressively allocate kernel ram in
those zones. in practice this usually means that the zone where the IB card
is physically attached fills up, and then the machine is (essentially)
out of ram and spinning hard trying to reclaim, even though the ram
in the other zone(s) is almost all unused.
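
A sketch of applying both suggestions persistently (the file names are just
conventions picked for this example, not anything Lustre requires):

# clear zone reclaim now and across reboots
sysctl -w vm.zone_reclaim_mode=0
echo 'vm.zone_reclaim_mode = 0' > /etc/sysctl.d/90-vm-zone-reclaim.conf

# single CPU partition for the Lustre threads; takes effect once the
# libcfs/lustre modules are reloaded or the server is rebooted
echo 'options libcfs cpu_npartitions=1' > /etc/modprobe.d/libcfs.conf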

I tried to talk folks out of having affinity on by default in
  https://jira.hpdd.intel.com/browse/LU-5050
but didn't succeed.

even if it wasn't unstable to have affinity on, IMHO having 2x the ram
available for caching on the MDS and OSS's is #1, and tiny performance
increases from having that ram next to the IB card is a distant #2.

cheers,
robin

>I eventually ended up unmounting the MDT and failing it over to a backup
>MDS, which promptly recovered and now has a load of near zero.
>
>Has anyone seen this before?  Any suggestions for what I should look at if
>this happens again?
>
>Thanks!
>Kevin
>
>--
>Kevin Hildebrand
>University of Maryland, College Park
>Division of IT

>___
>lustre-discuss mailing list
>lustre-discuss@lists.lustre.org
>http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] High MDS load, but no activity

2017-07-27 Thread Kevin M. Hildebrand
We recently updated to Lustre 2.8 on our cluster, and have started seeing
some unusual load issues.
Last night our MDS load climbed to well over 100, and client performance
dropped to almost zero.
Initially this appeared to be related to a number of jobs that were doing
large numbers of opens/closes, but even after killing those jobs, the MDS
load did not recover.

Looking at stats in /proc/fs/lustre/mdt/scratch-MDT/exports showed
little to no activity on the MDS.  Looking at iostat showed almost no disk
activity to the MDT (or to any device, for that matter), and minimal IO
wait.
Memory usage (the machine has 128GB) showed over half of that memory free.

I eventually ended up unmounting the MDT and failing it over to a backup
MDS, which promptly recovered and now has a load of near zero.

Has anyone seen this before?  Any suggestions for what I should look at if
this happens again?

Thanks!
Kevin

--
Kevin Hildebrand
University of Maryland, College Park
Division of IT
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org