clients
to treat all the targets as read-only. But if there is such a parameter, I am
not familiar with it.
--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu
___
lustre-discuss mailing list
I don’t think we need to have PFL working immediately, and since we have plans
to upgrade the client at some point, I will just wait and see what happens
after the upgrade.
lcme_flags: 0
lcme_extent.e_start: 4194304
lcme_extent.e_end: 67108864
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 65535
lmm_stripe_offset: -1
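For reference, a composite (PFL) layout with components like the one above can be created with the `-E` options to `lfs setstripe`; a minimal sketch, where the file path is hypothetical:

```shell
# Sketch: create a PFL layout similar to the component shown above.
# First component covers [0, 4M); the second covers [4M, 64M) with
# 4 stripes of 1 MiB each. /mnt/lustre/testfile is a hypothetical path.
lfs setstripe \
  -E 4M  -c 1 \
  -E 64M -c 4 -S 1M \
  /mnt/lustre/testfile

# Inspect the resulting components:
lfs getstripe /mnt/lustre/testfile
```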
hould be 1. Fix? yes
>
> In fact I think that the e2fsck ran so slowly because all the mdt inodes were
> corrected.
You may already be doing this, but just in case, make sure that you are using
the latest version of Whamcloud’s e2fsprogs
(https://downloads.whamcloud.com/public/e2fsprogs/). Did you use the RPMs
found alongside the lustre client RPM?
> On Apr 16, 2019, at 11:27 AM, Pharthiphan Asokan wrote:
>
>
> Hello,
>
> unable to install lustre
ects for that file. This is necessary to
ensure that quota information reported by Lustre is accurate, but I don’t
believe it is meant to fix any corruption in the quota files themselves.
sure
if/how you can regenerate quota info for ZFS.)
10157 26% /share/lfs02
Again, there are already 19,222,318 files on the file system, so
IUsed=19222318. All the OSTs together only have 18,175,092 + 18,300,779 +
18,134,286 = 54,610,157 inodes available, so IFree=54610157. And Inodes =
IUsed + IFree = 73832475.
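The accounting can be checked with a couple of lines of shell arithmetic using the numbers quoted above:

```shell
# Per-OST free inode counts quoted above, summed to get IFree
ifree=$((18175092 + 18300779 + 18134286))
iused=19222318                       # files already on the file system

echo "IFree  = $ifree"               # 54610157
echo "Inodes = $((iused + ifree))"   # 73832475
```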
some kind of bad data or corruption in
the config logs on the MGS (so you use the writeconf process to blow away the
bad config logs and regenerate them).
> On Apr 4, 2019, a
migrate a MDT to new
hardware.
> On Mar 29, 2019, at 6:03 PM, Hans Henrik Happe wrote:
>
> Hi Kurt,
>
> Haven't got much experience with the comple
appreciated.
> On Mar 20, 2019, at 1:24 PM, Peter Jones wrote:
>
> If it's not in the manual then it should be. Could you please open an LUDOC
> ticket to track getting this corrected if need be?
Done.
https://jira.whamcloud.com/browse/LUDOC-435
> On Mar 18, 2019, at 5:31 PM, Peter Jones wrote:
>
> You need the patched kernel for that feature
I suppose that should be documented in the manual somewhere. I thought project
quota support was determined based on ldiskfs vs zfs, and not patched vs
unpatched.
documentation?
ml#dbdoclet.lfsckadmin
> On Jan 16, 2019, at 4:18 AM, Jae-Hyuck Kwak wrote:
>
> How can I force --writeconf option? It seems that mkfs.lustre doesn't support
> --writeconf option.
You will need to use the tunefs.lustre command to do a writeconf.
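As a rough sketch of the procedure (the device path is hypothetical; the file system must be fully unmounted, and every MDT and OST needs the same treatment):

```shell
# Sketch of a writeconf. Unmount all targets and clients first.
umount /mnt/mdt
tunefs.lustre --writeconf /dev/sdb   # config logs regenerate on next mount
# Then remount in order: MGT, MDT(s), OST(s), and finally the clients.
```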
Is it possible you have some incompatible ko2iblnd module parameters between
the 2.8 servers and the 2.10 clients? If there was something causing LNet
issues, that could possibly explain some of the symptoms you are seeing.
like the “lfs” command has a built-in “lfs migrate”
subcommand which supports a “--block” option to prevent file access while the
migration is happening. So it might be safe to use.
Perhaps someone else on the list with more experience using this command could
chime in.
out what the "lctl
> set_param osp..max_create_count=0” command would do?
The Lustre manual has a section on removing MDTs/OSTs:
http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.deactivating_mdt_ost
"lctl conf_param .osc.active=0”. This will notify all
Lustre clients to deactivate the OST, which I believe causes the hangs you were
seeing when any client tries to remove or stat a file on that OST.
allocating any new files to the OST, but still allow clients to read and delete
files on that OST.
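One way to get that behavior is to zero out the object precreation count for that OST on the MDS; a sketch, assuming a hypothetical file system named "lustre" and OST index 0:

```shell
# On the MDS: stop new object creation on one OST while leaving it
# readable/deletable for clients. Adjust fsname and OST index to match
# your system.
lctl set_param osp.lustre-OST0000-osc-MDT0000.max_create_count=0

# To resume allocations later, restore the previous value
# (the default is commonly 20000):
lctl set_param osp.lustre-OST0000-osc-MDT0000.max_create_count=20000
```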
> On Nov 9, 2018, at 11:28 AM, Mohr Jr, Richard Frank (Rick Mohr)
> wrote:
>
>
>> On Nov 8, 2018, at 11:44 AM, Ms. Megan Larko wrote:
>>
>> I have been attempting this command on a directory on a Lustre-2.10.4
>> storage from a Lustre 2.10.1 client a
ioctl 0x4008669a for 'custTest' (3): Invalid argument
> error: setstripe: create striped file 'custTest' filed: Invalid argument
Do you get the same error if you try to run this on a file instead of a
directory? Also, don’t you typically need to add the “-d” option when setting
nd zfs receive it on mds2
> • zfs send the MGT partition from mds1 and zfs receive it on mds2
> • mount lustre on mds2
> should it work ?
I think that should work, except that in the first step you don’t need to
create a lustre FS on the new pools.
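The send/receive step might look something like this sketch (pool/dataset names and the mds2 hostname are hypothetical; run it with the targets unmounted so the snapshot is consistent):

```shell
# Snapshot the MDT dataset on mds1, then stream it to mds2 over SSH.
zfs snapshot mdt0pool/mdt0@migrate
zfs send mdt0pool/mdt0@migrate | ssh mds2 zfs receive mdt0pool/mdt0
```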
le ZIL, change the
> redundant_metadata to "most" atime off.
>
> I could send you a list of parameters that in my case work well.
Riccardo,
Would you mind sharing your ZFS parameters with the mailing list? I would be
interested to see which options you have changed.
had good luck
with most of the applications continuing without issues. Sometimes there are a
few jobs that abort, but overall this is better than having to stop all jobs
and remount lustre on all the compute nodes.
the mgs & mdt
> servernodes to support both LNET nids after mounting the OSTs, the command
> succeeds, but the file system is not mountable from the client.
You can’t use mkfs.lustre to update service node NIDs once the file system is
formatted. You would need to perform a writeconf operation.
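A sketch of what that update might look like (device path and NIDs are hypothetical; run while the target is unmounted, and repeat for each target):

```shell
# Rewrite the service node NIDs as part of a writeconf.
tunefs.lustre --writeconf \
  --erase-params \
  --servicenode=10.0.0.1@tcp,10.0.0.1@o2ib \
  --servicenode=10.0.0.2@tcp,10.0.0.2@o2ib \
  /dev/mapper/mdt0
```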
a good idea to do
this.
So that is why I was thinking that resizing the MDT might be the simplest
approach. Of course, I might be misunderstanding something about DNE2, and
if that is the case, someone can correct me. Or if there are options I am not
considering, I would welcome those too.
remember is not available for
ZFS at the moment .)
) require info from the OSS
servers, so those operations would hang.
> On Sep 8, 2018, at 8:33 AM, fırat yılmaz wrote:
>
> Hi There,
>
> OS=Centos 7.4
> Lust
2iblnd-opa are intended for OmniPath hardware. Since you are
using IB, you will want to just set your options like this:
options ko2iblnd peer_credits=…, etc.
Have you verified that the firewall is not running? It’s possible a firewall
might be allowing ping traffic but blocking the ports needed by Lustre.
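For reference, a minimal /etc/modprobe.d sketch for plain IB (the specific credit values here are illustrative only, and the settings must match across clients, routers, and servers):

```shell
# Sketch: set ko2iblnd options for plain InfiniBand hardware.
cat > /etc/modprobe.d/ko2iblnd.conf <<'EOF'
options ko2iblnd peer_credits=8 credits=256
EOF
# The module must be reloaded (or the node rebooted) for this to take effect.
```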
> On Aug 22, 2018, at 8:10 PM, Riccardo Veraldi
> wrote:
>
> On 8/22/18 3:13 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>>> On Aug 22, 2018, at 3:31 PM, Riccardo Veraldi
>>> wrote:
>>> I would like to migrate this virtual machine to another infras
corruption of data ?
> May I simply use zfs send and zfs receive thru SSH ?
> what is the best way to move a MDS based virtual machine ?
I don’t have much experience with VMs, but I have used zfs send/receive to
migrate a MDT from one server to another. It worked quite well.
e size discrepancy.
your failover
config or not. (Maybe it doesn’t matter.)
onnection.
> Maybe I'm missing something obvious. Do you see any typo in the command?
What mount command are you using on the client?
> On Jul 27, 2018, at 1:56 PM, Andreas Dilger wrote:
>
>> On Jul 27, 2018, at 10:24, Mohr Jr, Richard Frank (Rick Mohr)
>> wrote:
>>
>> I am working on upgrading some Lustre servers. The servers currently run
>> lustre 2.8.0 with zfs 0.6.5, and I am
need to run “zfs
upgrade” on the underlying pools before upgrading the lustre version?
> On Jun 27, 2018, at 4:44 PM, Mohr Jr, Richard Frank (Rick Mohr)
> wrote:
>
>
>> On Jun 27, 2018, at 3:12 AM, yu sun wrote:
>>
>> client:
>> root@ml-gpu-ser200.nmg01:~$ mount -t lustre
>> node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data
>&g
something change in the
meantime?
d line : lnetctl lnet configure --all to make my static lnet
> configuration take effect. but i still can't ping node28 from my client
> ml-gpu-ser200.nmg01. I can mount as well as access lustre on client
> ml-gpu-ser200.nmg01.
What options did you use when mounting the
> On May 2, 2018, at 10:37 AM, Mohr Jr, Richard Frank (Rick Mohr)
> wrote:
>
>
>> On May 2, 2018, at 9:59 AM, Mark Miller wrote:
>>
>> Since I have the Lustre source code, I can start looking through it to see
>> if I can find where the Lustre mount sy
options stored in the file system that
gets retrieved with one of the e2fsprogs tools (maybe debugfs) which is then
used when performing the actual mount. So I could easily see something trying
to query a ZFS attribute to retrieve similar information before doing the mount.
aim_mode parameter (or have
an admin do it for you), then it might be worth looking at.
your 4 OSTs, but it
might explain why the cache for some OSTs decrease when others increase.
> On Apr 2, 2018, at 8:06 PM, John Bauer wrote:
>
> I am running dd
r.node= /dev/
The first one is the preferred method. Keep in mind that the
“--servicenode=nid,nid” syntax is intended for specifying multiple nids that
belong to the same host. To specify multiple hosts for failover, you will want
to add a --servicenode option for each host.
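For example, formatting an OST with two failover hosts might look like this sketch (fsname, index, NIDs, and device are all hypothetical):

```shell
# One --servicenode per failover host; both hosts can then mount the OST.
mkfs.lustre --ost --fsname=lustre --index=0 \
  --mgsnode=10.0.0.10@o2ib \
  --servicenode=10.0.0.1@o2ib \
  --servicenode=10.0.0.2@o2ib \
  /dev/mapper/ost0
```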
stered. I also waited a while in
case it just took some time to clear the entries, but after several hours, they
were still there.
Am I misunderstanding what is supposed to happen when a userid is deregistered?
Or did I mess up a command somewhere? Or is this a bug?
end/receive to move data using incremental snapshots. This was
much easier than trying to tar up the contents of a ldiskfs-backed MDT and
untar it to the new storage.
> On Nov 29, 2017, at 8:35 PM, Dilger, Andreas wrote:
>
> Would you be able to open a ticket for this, and possibly submit a patch to
> fix the build?
I can certainly open a ticket, but I’m afraid I don’t know what needs to be
fixed so I can’t provide a patch.
** No rule to make target `fld.ko', needed by `all-am'. Stop.
When I removed the “--disable-client” option, the error went away.
blocking any traffic, that could cause a problem.
ge will be. If the applications
create lots of small files (like some biomed programs), then a larger MDT would
result in more inodes allowing more Lustre files to be created.
o2iblnd/parameters.
It might be worthwhile to compare those values on the lnet routers to the
values on the servers to see if maybe there is a difference that could affect
the behavior.
and create files or folders.
Are you using automount for /home?
stre servers. You could also use LDAP or even just
/etc/passwd. You’ll probably just want to choose whatever mechanism is used
on your other systems.
For the purposes of testing, you could always just create the luser1 locally on
each lustre server to see if things start to work.
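That local test might look like the following sketch (the UID/GID values are hypothetical; they must match whatever the client uses for luser1):

```shell
# On each Lustre server: create the user locally so identity lookups succeed.
groupadd -g 5001 luser1
useradd -u 5001 -g 5001 -M luser1
id luser1   # verify the UID/GID mapping the server will see
```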
might have some experience with
this so they can share their wisdom with me.
what I did and comment out all the lines in ko2iblnd.conf and add
your own lines.
formance impact, if any, from using quotas?
The last time we did performance testing, I think we only saw a performance hit
of around 10%. But this was several years ago (i.e. - Lustre 1.8 days), so I
don’t know how much things have changed since then.
e can dramatically reduce load.
So in summary:
Q: Is it a problem to have a high load on my OSS servers?
A: It depends….
(Wish it could be a little more clear cut than that)
You might want to start by looking at these online tutorials:
http://lustre.ornl.gov/lustre101-courses/
> On May 21, 2017, at 6:19 AM, Ravi Konila wrote:
>
>
> On May 4, 2017, at 11:03 AM, Steve Barnet wrote:
>
> On 5/4/17 10:01 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>> Did you try doing a writeconf to regenerate the config logs for the file
>> system?
>
>
> Not yet, but quick enough to try. Do this for the
Did you try doing a writeconf to regenerate the config logs for the file system?
> On May 4, 2017, at 10:03 AM, Steve Barnet wrote:
>
> Hi all,
>
> This
m (%install)
>
> this is the command line I used to build:
>
> rpmbuild --without servers --without lustre-tests --with zfs --with
> lustre_modules -bb lustre-2.9.0.spec
This probably has nothing to do with the errors you are seeing, but just for
reference, you shouldn’t need
than repeat it here.
>
> https://jira.hpdd.intel.com/browse/LU-8658
Ah, that is good to know. Thanks for the explanation.
complaining about errors
to the same MDS server, then my first guess would be that there is some wrong
on the server side of things.
> On May 2, 2017, at 4:52 AM, Lydia Heck wr
This might be a long shot, but have you checked for possible firewall rules
that might be causing the issue? I’m wondering if there is a chance that some
rules were added after the nodes were up to allow Lustre access, and when a
node got rebooted, it lost the rules.
out for?
I have enabled flock on all my Lustre file systems (2.4.3 and 2.8), and I have
not yet encountered any issues.
Alex,
Were you ever able to get more details about this problem? Thanks.
> On Feb 9, 2017, at 10:27 PM, Alexander I Kulyavtsev wrote:
>
> Yes, in lustre 2.5.3 af
y verified.
Have you tried mounting the file system on different nodes? This could help
determine if the problem is always the same or if it might be affected by the
type of node (MDS vs OSS) that is being used for the client.
Has anyone else encountered this “off by 21” problem before? I didn’t see
anything online, but perhaps I missed something.
Glad you were able to get it up and running.
> On Jan 12, 2017, at 9:52 PM, Jeff Slapp wrote:
>
> That was the solution! Thank you for your suppor
I noticed that you appear to have formatted the MDT with the file system name
“mgsZFS” while the OST was formatted with the file system name “ossZFS”. The
same name needs to be used on all MDTs/OSTs in the same file system. Until
that is fixed, your file system won’t work properly.
but is there any reason why you are choosing IB over Ethernet? I think
> I'd prefer to try over the Ethernet is we are going to pick one.
I just figured that if you had Infiniband, then you would prefer to run with
the higher performance interconnect. But you can try ethernet ju
up the file system to only
use Infiniband, then that would eliminate any complications from having two
fabrics active at the same time. Then you could see if the problem still
persists.
ore specialized storage hardware (like DDN or
NetApp), but that is not required for Lustre.
--servicenode options if you wanted.
> On Jan 8, 2017, at 11:58 PM, Vicker, Darby (JSC-EG311)
> wrote:
>
> We have a new set of hardware we are configuring as a
e OST
usage drops, then you can use “lctl enable” to reenable it.
r, which I installed from RPMs I suspect there is one that I
> have not installed :)
It looks like that command may have been removed in more recent versions of
Lustre.
t the OSTs (these are referred to as the OSS nodes). However,
it is possible to have a single server that mounts the MDT/MGT as well as the
OSTs.
If you are interested in some entry-level Lustre tutorials, check out
http://lustre.ornl.gov/lustre101-courses.
tures
(like snapshots) that are useful for the MDT so some folks are willing to
accept a performance hit in order to take advantage of those features.
Did you check to make sure there are no firewalls running that could be
blocking traffic?
> On Sep 27, 2016, at 10:12 AM, Phill Harvey-Smith
> wrote:
>
> Hi
e (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol
> zap_curs:
Often time those types of error messages indicate some sort of version mismatch
between kernel modules. Did you just download the lustre RPMs from the web
site?
> As I said in previous messages, the client connected when the primary was ok
> it can use the service MDS without problems.
>
> Any suggestion?
Unfortunately, no. Did you ever mention which Lustre version you are running?
I don’t recall seeing that.
Alfonso,
Are you still having problems with this, or were you able to get it resolved?
> On Sep 1, 2016, at 12:43 PM, Pardo Diaz, Alfonso
> wrote:
>
> Hi!
>
corresponds to mds1, so when it is down, there is no
second host for the client to try. Try specifying IP addresses instead of
hostnames and see if that make a difference.
Some of the information should still be relevant. In my opinion, it is still
worthwhile reading to get a better idea on what is happening inside of Lustre
(even if some of the details are out of date).
here should be any problem with the OSTs that would
require a fsck.
the server that has the OST mounted writes a bit of info
to the disk periodically to indicate that it is in use. If another host tries
to mount the OST, it looks at the MMP info. If it is recent, it assumes the
OST is in use and won’t mount it. I suspect that tune2fs is doing something
similar.
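For reference, the MMP state of an ldiskfs target can be inspected with dumpe2fs (the device path is hypothetical):

```shell
# Show the multi-mount protection (MMP) fields from the superblock.
dumpe2fs -h /dev/sdb 2>/dev/null | grep -i mmp
# If a node crashed while holding the target, the stale MMP block can be
# cleared with e2fsck or "tune2fs -f -E clear_mmp", but only when you are
# certain no other node has the device mounted.
```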
or newer versions of Lustre. So
it’s possible I am mistaken about the defaults.
bottleneck as Oliver
suggested.) Do your OSS nodes have a lot of memory? Do you know what your
typical memory usage is on the OSS nodes?
> On Jul 28, 2016, at 10:19 PM, Ricca
gt;
> 0 edi-vf-1-5:~#
Is the device currently mounted? If so, that would explain the error.
don’t
know if that could be used to prefer one interface over another.)
Is the client supposed to have an IB interface configured, or is it just
supposed to mount over ethernet?
> On Jul 20, 2016, at 2:09 PM, sohamm wrote:
>
> Hi
ou are at it. The clients can then be upgraded later
like you listed in your plan.
r restriping will mostly be done by a script, then you
don’t necessarily need a simple formula.
t to see several OST at 81%, a few at 82%, and maybe one or
two at 83%. Instead, I see two OSTs at 85% and 86% which fall outside the
norm. Since the default stripe count for my file system is 2, this is an
excellent indication that someone has a misstriped file.
Which means that I need to stop typ
Have you tried doing a writeconf to regenerate the config logs?
> On May 17, 2016, at 12:08 PM, Randall Radmer wrote:
>
> We've been working with lustre sys
. IIRC, SDSC also
had an issue with LU-5726 but the symptoms they saw were not identical to mine.
So maybe you are seeing the same problem manifest itself in a different way.
g else, reading through the bug report might be useful. It
details some of the MDS OOM problems I had and mentions setting
vm.zone_reclaim_mode=0. It also has Robin Humble’s suggestion of setting
"options libcfs cpu_npartitions=1” (which is something that I started doing as
well).
top" showing nearly all time spent in spinlock_irq. iirc.)
>
> might your system have had a *lot* of memory? ours tend to be fairly modest
> (32-64G, dual-socket intel.)
I have 64 GB on my servers.