Since some mailers don't like attachments, I'll just paste in the script we use
here.
I call the script with
./parse.sh | sort -k3 -n
You just need to change out the name of your MDT in two places.
#!/bin/bash
set -e
SLEEP=10
stats_clear()
{
    cd "$1"
    echo clear > clear
}
stats_print "$dir"
done
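In case the paste gets mangled, a minimal sketch of the overall flow looks roughly like this. It is not the original parse.sh: the exports path and the awk post-processing are assumptions, and the MDT name is pulled into one variable here instead of the two hard-coded spots mentioned above.

#!/bin/bash
# Sketch only: clear the per-client MDT stats, wait, then dump them in a form
# that "sort -k3 -n" can rank by operation count.
set -e
SLEEP=10
EXPORTS=/proc/fs/lustre/mdt/testfs-MDT0000/exports   # substitute your MDT name

stats_clear()
{
    cd "$1"
    echo clear > clear    # the exports/clear entry zeroes the per-client stats
}

stats_print()
{
    # Emit "<client NID> <operation> <count>" lines
    local nid
    nid=$(basename "$1")
    awk -v nid="$nid" '$2 ~ /^[0-9]+$/ {print nid, $1, $2}' "$1/stats"
}

stats_clear "$EXPORTS"
sleep "$SLEEP"
for dir in "$EXPORTS"/*/; do
    stats_print "$dir"
done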
From: Moreno Diego (ID SIS)
Sent: Tuesday, October 29, 2019 10:08 AM
To: Louis Allen ; Oral, H. ; Carlson,
Timothy S ; lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] [EXTERNAL] Re: Lustre Timeouts/Filesystem Hanging
Hi Louis,
If you don’t hav
In my experience, this is almost always related to some code doing really bad
I/O. Let's say you have a 1000 rank MPI code doing open/read 4k/close on a few
specific files on that OST. That will make for a bad day.
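For illustration, a crude single-node stand-in for that pattern (hypothetical script and path, not from the original message): every dd invocation reopens the file, reads 4 KiB, and closes it again, so the OST sees a storm of tiny opens and reads instead of streaming I/O.

#!/bin/bash
# Simulate open/read-4k/close from many workers against one shared file
FILE=/lustre/scratch/shared_input.dat   # hypothetical file living on one OST
for _ in $(seq 1 32); do
    (
        for i in $(seq 1 10000); do
            dd if="$FILE" of=/dev/null bs=4k count=1 skip="$i" 2>/dev/null
        done
    ) &
done
wait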
The other place you can see this, and this isn't your case, is when ZFS
I've been running 100-200 TB OSTs making up small petabyte file systems for the
last 4 or 5 years with no pain, on Lustre 2.5.x through the current generation.
Plenty of ZFS rebuilds when I ran across a set of bad disks, and they all went fine.
From: lustre-discuss On Behalf Of
w...@umich.edu
Sent: Tuesday,
+1 on
options zfs zfs_prefetch_disable=1
Might not be as critical now, but that was a must-have on Lustre 2.5.x
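Where it lives, for reference (module option plus the runtime knob):

# Persistent, e.g. in /etc/modprobe.d/zfs.conf so it applies at module load:
options zfs zfs_prefetch_disable=1

# Or flip and check it on a running OSS:
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
cat /sys/module/zfs/parameters/zfs_prefetch_disable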
Tim
From: lustre-discuss On Behalf Of
Riccardo Veraldi
Sent: Wednesday, March 13, 2019 3:00 PM
To: Kurt Strosahl ; lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] ZFS
I will say YMMV. I've rebooted storage nodes and had mixed results, where we
land in one of three buckets:
1) Codes breeze through, having just been stuck in D state while the OSSes reboot
2) RPCs get stuck somewhere and when the OSS comes back I eventually have to
force an abort_recovery
3)
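For case 2, the abort looks roughly like this on the OSS (the device number is just an example taken from the lctl dl output):

lctl dl | grep obdfilter          # find the OST device number
lctl --device 12 abort_recovery   # 12 is an example device number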
I would work on fixing your NFS server before moving to Lustre. That being
said, I have no idea of how big an installation you have. How many nodes you
have for NFS clients, how much data you are talking about moving around, etc.
As others will point out, even with improvements in Lustre
I'll just add +1 to this thread. /home on NFS for software builds, small
files, lots of metadata operations. Lustre for the rest. Users will do the
wrong thing even after education.
Tim
-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On
Isilon is truly an enterprise solution. We have one (about a dozen bricks'
worth) and use it for home directories on our supercomputers; it allows easy
access via CIFS for users on Windows/Mac.
It is highly configurable with “smart pools” and policies to move data around
based on
FWIW, we have successfully been running 2.9 clients (RHEL 7.3) with 2.5.3
servers (RHEL 6.6) at a small scale. About 40 OSSes and dozens of 2.9 clients
with hundreds of 2.5.3 clients mixed in.
Tim
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf
Of E.S.
Does your new MDS server have all the UIDs of these people in /etc/passwd?
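A quick sanity check on the MDS (the UID is a made-up example):

getent passwd 1234 || echo "UID 1234 does not resolve on the MDS"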
Tim
-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf
Of Phill Harvey-Smith
Sent: Monday, December 12, 2016 9:16 AM
To: lustre-discuss@lists.lustre.org
Subject:
Looks like I will be upgrading to 2.5.4 soon, as I really
need to be able to deactivate OSTs and have the algorithm on the MDS still be
able to choose new OSTs to write to.
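The deactivation I mean is the usual one on the MDS, roughly (target name and device number are examples):

lctl dl | grep OST0004        # find the device for the OST being retired
lctl --device 15 deactivate   # 15 is the device number from the line above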
Tim
-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf
Of Carlson,
Running Lustre 2.5.3(ish) backed with ZFS.
We’ve added a few OSTs and they show as being “UP” but aren’t taking any data
[root@lzfs01a ~]# lctl dl
0 UP osd-zfs MGS-osd MGS-osd_UUID 5
1 UP mgs MGS MGS 1085
2 UP mgc MGC172.17.210.11@o2ib9 77cf08da-86a4-7824-1878-84b540993c6d 5
3 UP
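Things I'd look at first (mount point and target names are examples; the qos parameter lives under lod on 2.4+ and lov on older releases):

# From a client: are the new OSTs visible and marked active?
lfs df -h /lustre
lctl get_param -n lov.*.target_obd | tail

# On the MDS: allocator settings worth checking when empty OSTs get no objects
lctl get_param lod.*.qos_threshold_rr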
Is this a ZFS backed Lustre with compression? If so, then that is not at all
surprising if that is a compressible file. I have a 1G file of zeros that shows
up as 512 bytes
[root@pic-admin03 tim]# ls -sh 1G
512 1G
[root@pic-admin03 tim]# ls -l 1G
-rw-r--r-- 1 tim users 1073741824 Dec 2 2015 1G
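Easy to reproduce on a compressed ZFS-backed mount (the path is an example):

dd if=/dev/zero of=/lustre/tmp/1G bs=1M count=1024
ls -l /lustre/tmp/1G    # apparent size: 1073741824 bytes
ls -sh /lustre/tmp/1G   # allocated size: next to nothing once the zeros compress away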
Folks,
I've done my fair share of googling and run across some good information on ZFS
backed Lustre tuning including this:
http://lustre.ornl.gov/ecosystem-2016/documents/tutorials/Stearman-LLNL-ZFS.pdf
and various discussions around how to limit (or not) the ARC and clear it if
needed.
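For reference, the knobs I mean (the 64 GiB value is only an example; size it to your OSS memory):

# Cap the ARC at module load time, e.g. in /etc/modprobe.d/zfs.conf:
options zfs zfs_arc_max=68719476736

# Or inspect/adjust on a running OSS:
cat /sys/module/zfs/parameters/zfs_arc_max
echo 68719476736 > /sys/module/zfs/parameters/zfs_arc_max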
Lustre is not (yet) part of the mainline kernel, so you are not going to find
Lustre by digging through the Linux kernel build process. Hence the link below
from Thomas to some Lustre packages.
Tim
-Original Message-
From: lustre-discuss-boun...@lists.lustre.org
-Original Message-
From: Dilger, Andreas [mailto:andreas.dil...@intel.com]
Sent: Wednesday, September 25, 2013 10:03 AM
To: Carlson, Timothy S
Cc: lustre-discuss@lists.lustre.org; hpdd-disc...@lists.01.org
Subject: Re: [Lustre-discuss] Can't increase effective client read cache
I've got an odd situation that I can't seem to fix.
My setup is Lustre 1.8.8-wc1 clients on RHEL 6 talking to 1.8.6 servers on RHEL
5.
My compute nodes have 64 GB of memory and I have a use case where an
application has very low memory usage and needs to access a few thousand files
in Lustre
FWIW, we have seen the same issues with Lustre 1.8.x and slightly older RHEL6
kernel. We do the echo as part of our slurm prolog/epilog scripts. Not a fix
but a workaround before/after jobs run. No swap activity, but very large
buffer cache in use.
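Presumably the echo in question is the standard page-cache drop; a minimal epilog sketch under that assumption:

#!/bin/bash
# Slurm epilog sketch: flush dirty data and drop the page cache between jobs
sync
echo 3 > /proc/sys/vm/drop_caches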
Tim
-Original Message-
From:
way too much in the past day to
boot back into working kernels. :)
Tim
-Original Message-
From: Kevin Van Maren [mailto:kvanma...@fusionio.com]
Sent: Saturday, October 22, 2011 8:24 AM
To: Carlson, Timothy S
Cc: Lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss
to be working so far.
Thanks
Tim
-Original Message-
From: lustre-discuss-boun...@lists.lustre.org [mailto:lustre-discuss-
boun...@lists.lustre.org] On Behalf Of Carlson, Timothy S
Sent: Sunday, October 23, 2011 3:19 PM
To: 'Kevin Van Maren'
Cc: Lustre-discuss@lists.lustre.org
Subject: Re
Folks,
I've got a need to run a 2.6.37 or later kernel on client machines in order to
properly support AMD Interlagos CPUs. My other option is to switch from RHEL
5.x to RHEL 6.x and use the whamcloud 1.8.6-wc1 patchless client (the latest
RHEL 6 kernel also supports Interlagos). But I would
On May 19, 2011, at 10:28, Kevin Van Maren wrote:
Dardo D Kleiner - CONTRACTOR wrote:
As for putting the entire filesystem on flash, sure that would be pretty nifty,
but expensive. Not being able to do failover, with storage on internal PCIe
cards, is a downside.
[Andreas added this
Folks,
I know that flash based technology gets talked about from time to time on the
list, but I was wondering if anybody has actually implemented FusionIO devices
for metadata. The last thread I can find on the mailing list that relates to
this topic dates from 3 years ago. The software