Re: [lustre-discuss] OSS node crash/high CPU latency when deleting 100's of emty test files

2021-03-01 Thread Colin Faber via lustre-discuss
Hi Sid, What version of lustre? -cf On Mon, Mar 1, 2021, 6:37 PM Sid Young via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > G'Day all, > > I've been doing some file create/delete testing on our new Lustre storage > which results in the OSS nodes crashing and rebooting due to high

Re: [lustre-discuss] Performance over 100G ethernet

2021-03-08 Thread Colin Faber via lustre-discuss
Hi Sid, Start here https://wiki.lustre.org/LNET_Selftest That should get you near wire speed numbers and you can use it to play around with your network tunables. After that, obdfilter-survey, mdtest, IOR and fio amongst others will help you determine hero and average numbers for your whole I/O

Re: [lustre-discuss] MDT mount failing due to unknown param quota

2021-04-02 Thread Colin Faber via lustre-discuss
Hi Brad, Looks like you're suffering from an old bug. You'll need to run tunefs.lustre against the MDT target, capture the current params, and then --erase-params, --param mdd.quota_type=ug and anything else (other than mdt.quota_type=ug) and --writeconf. And because of other behavioral issues around --writeco

Re: [lustre-discuss] problems to mount MDS and MDT

2021-05-17 Thread Colin Faber via lustre-discuss
Firewall rules dealing with localhost? On Mon, May 17, 2021 at 11:33 AM Abdeslam Tahari via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > Hello > > i have a problem to mount the mds/mdt luster, it wont mount at all and > there is no message errors at the console > > -it does not show

Re: [lustre-discuss] problems to mount MDS and MDT

2021-05-17 Thread Colin Faber via lustre-discuss
It appears part of the debug data is missing (the part before you posted it). Can you try again? Run lctl dk > /dev/null to clear it, then try your mount and grab the debug again. On Mon, May 17, 2021 at 1:35 PM Abdeslam Tahari wrote: > Thank you Colin > > No i don't have iptables or rules > > firewa

Re: [lustre-discuss] MDT filling up

2021-06-26 Thread Colin Faber via lustre-discuss
Are changelogs enabled? On Sat, Jun 26, 2021 at 12:58 PM Thomas Roth wrote: > > Dear all, > > we have one of three MDTs filling up its disk space. It is running 2.12.5 > on ldiskfs, but no data-on-metadata. > The inode usage is just 54%, corresponding to 477 M inodes. > > There is a large directo

Re: [lustre-discuss] Why reads are slower than writes on lustre file system?

2021-09-27 Thread Colin Faber via lustre-discuss
Depending on your IO workloads reads can be slower than writes in cases where the writes may be sequential and optimized for the backing storage hardware, and the reads random, or semi-random. This also can be greatly affected by block allocator efficiency. Typically on well tuned modern lustre fil

Re: [lustre-discuss] Why reads are slower than writes in lustre

2021-09-28 Thread Colin Faber via lustre-discuss
The amount of data you're testing is far too small. Try upping it 10x so you have a longer run time and you achieve a more reasonable average. On Tue, Sep 28, 2021 at 6:02 PM Nagmat Nazarov wrote: > Dear Engineers, > > I have started working on a lustre file system. I have done a couple of > exp
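A rough way to see why a larger test size gives a more representative average is sketched below. It assumes a deliberately simple model that is not from the thread: every run pays a fixed startup cost and then transfers data at a constant steady-state rate. The function name and all numbers are hypothetical.

```python
def measured_throughput(data_mb, steady_mb_s, fixed_overhead_s):
    """Average MB/s a benchmark would report under the assumed model.

    Assumption (not from the thread): elapsed time is a fixed startup
    cost plus the time to move the data at a constant steady-state rate.
    """
    elapsed_s = fixed_overhead_s + data_mb / steady_mb_s
    return data_mb / elapsed_s


# Hypothetical numbers: 1000 MB/s steady state, 0.5 s fixed overhead.
for size_mb in (100, 1000, 10000):
    avg = measured_throughput(size_mb, steady_mb_s=1000.0, fixed_overhead_s=0.5)
    print(f"{size_mb:>6} MB -> {avg:.1f} MB/s average")
```

Under this model the reported average moves toward the steady-state rate as the data size grows, so a short run mostly measures the overhead rather than the file system.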

Re: [lustre-discuss] No soace left on device Error

2021-10-08 Thread Colin Faber via lustre-discuss
There are many reasons why you might hit this error. Can you provide a bit more information? What's the state of your OSS systems? Are all your OSTs online and available? Any error logging from these nodes you can share? -cf On Thu, Oct 7, 2021 at 10:04 AM Dilip Sathaye via lustre-discuss < lustre

Re: [lustre-discuss] OST's wating fro client on a pcs cluster

2021-11-18 Thread Colin Faber via lustre-discuss
Hi Koos, First thing -- it's generally a bad idea to run newer server versions with older clients (the opposite isn't true). Second -- do you have any logging that you can share from the client itself? (dmesg, syslog, etc) A quick test may be to run 2.12.7 clients against your cluster to verify

Re: [lustre-discuss] Lustre and server upgrade

2021-11-18 Thread Colin Faber via lustre-discuss
Hi, I believe that sometime in 2.10 (someone correct me if I'm wrong) the index parameter became required and needs to be specified. On an existing system this should already be set, but can you check the parameters line with tunefs.lustre for correct index=N values across your storage nodes? Also,

Re: [lustre-discuss] Lustre and server upgrade

2021-11-18 Thread Colin Faber via lustre-discuss
Hm.. If you install the test suite does llmount.sh succeed? This should set up a single node cluster on whatever node you're running lustre on. I believe it will load modules as needed (IIRC). If this test succeeds, then you know that lustre is installed correctly (or correctly enough), if not, I'd

Re: [lustre-discuss] Lustre and server upgrade

2021-11-18 Thread Colin Faber via lustre-discuss
This would be part of the lustre-tests RPM package and will install llmount.sh to /usr/lib/lustre/tests/llmount.sh I believe. On Thu, Nov 18, 2021 at 1:45 PM STEPHENS, DEAN - US wrote: > Not sure what you mean by “If you install the test suite”. I am not seeing > a llmount.sh file on the server

Re: [lustre-discuss] Lustre and server upgrade

2021-11-18 Thread Colin Faber via lustre-discuss
So that indicates that your installation is incomplete or something else is preventing lustre, ldiskfs, and possibly other modules from loading. Have you been able to reproduce this behavior on a fresh rhel install with lustre 2.12.7? (i.e. llmount.sh failing)? -cf On Thu, Nov 18, 2021 at 2:20

Re: [lustre-discuss] Lustre and server upgrade

2021-11-18 Thread Colin Faber via lustre-discuss
The VM will need a full install of all server packages, as well as the tests package to allow for this test. On Thu, Nov 18, 2021 at 2:26 PM STEPHENS, DEAN - US wrote: > I have not tried that but I can do that on a new VM that I can create. I > assume that is all that I need is the lustre-tests

Re: [lustre-discuss] Lustre and server upgrade

2021-11-19 Thread Colin Faber via lustre-discuss
Hi Dean, Glad to hear you were able to clean up, sounds like you've also been successful in your vm trial, I would suggest at this point that you take a close look at your installation and verify that all of the needed packages are installed correctly. The fact that it's complaining about missing

Re: [lustre-discuss] OST's wating fro client on a pcs cluster

2021-11-19 Thread Colin Faber via lustre-discuss
Hi Koos, One thing you mentioned that I should have picked up on sooner, was "The servers are connected in a multirail network, because some clients are in IB and the other clients are on ethernet" Can you describe your topology? How are the various elements connected to each other? -cf On Fri

Re: [lustre-discuss] Lustre and server upgrade

2021-11-24 Thread Colin Faber via lustre-discuss
what does tune2fs report for /dev/sdb on the MDS? (Also sorry, this somehow got lost in my inbox) On Mon, Nov 22, 2021 at 8:57 AM STEPHENS, DEAN - US wrote: > Colin and Andreas, so to clarify some points for you, This is what I am > seeing: > > > > Rpm -qa | grep lustre > > Kmod_lustre-2.12.6-1

Re: [lustre-discuss] ost_connect to node failed

2021-11-25 Thread Colin Faber via lustre-discuss
-114 == operation already in progress (EALREADY on Linux), what's the logging look like on both sides of the connection? -cf On Thu, Nov 25, 2021 at 5:18 AM Hallstein Løhre < hallstein.lo...@alphasystem.no> wrote: > > > Hi, > > > > After some trouble with runaway processes yesterday, I had to reboot > several Lustre clien
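Negative return codes like the one above are negated errno values, so they can be looked up rather than memorized. The sketch below uses Python's standard errno and os modules; the helper name describe_rc is hypothetical, and the mapping it prints depends on the platform's errno table (on Linux, 114 is EALREADY).

```python
import errno
import os


def describe_rc(rc):
    """Map a negative kernel-style return code to (symbolic name, message).

    The result comes from the local platform's errno table, so run this
    on the same OS family as the node that logged the code.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


print(describe_rc(-114))
```

Running this on a Linux node avoids guessing from memory when a log line shows only the numeric code.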

Re: [lustre-discuss] Lustre and server upgrade

2021-11-29 Thread Colin Faber via lustre-discuss
Hi, tune2fs and tunefs.lustre are different tools which yield different information about the block device. I'd like to be sure that we're working with the right type of device here and basic ext4/ldiskfs data is present (nevermind if the lustre configuration data is present) -cf On Mon, Nov 29,

Re: [lustre-discuss] Lustre and server upgrade

2021-11-29 Thread Colin Faber via lustre-discuss
Well, all signs indicate that this target has not been prepared for lustre. Can you post the output of your original formatting command? On Mon, Nov 29, 2021 at 8:26 AM STEPHENS, DEAN - US wrote: > That was my fault. I did not use the correct command. > > > > The output of the lsblk command show

Re: [lustre-discuss] darshan-discuss

2022-04-28 Thread Colin Faber via lustre-discuss
https://lists.mcs.anl.gov/mailman/listinfo/darshan-users ? On Thu, Apr 28, 2022 at 9:42 AM John Bauer wrote: > Since there seems to be considerable overlap between lustre and darshan > users I thought I would ask here: Is there an email list for darshan > discussion analogous to lustre-discuss?

Re: [lustre-discuss] missing option mgsnode

2022-07-20 Thread Colin Faber via lustre-discuss
Can you mount the target directly with -t ldiskfs ? Also what does e2fsck report? On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > We have a filesystem that we have running Lustre 2.10.4 in HA mode using > IML. One of our OST's had some di

Re: [lustre-discuss] missing option mgsnode

2022-07-20 Thread Colin Faber via lustre-discuss
raid check? On Wed, Jul 20, 2022, 12:41 PM Paul Edmon wrote: > [root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd > /mnt/holylfs2-OST001f > mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd, >missing codepage or helper program, or other error > >In

Re: [lustre-discuss] Old IP of OSS still appearing in client

2022-08-17 Thread Colin Faber via lustre-discuss
See: https://manpages.org/tunefslustre/8 On Wed, Aug 17, 2022 at 4:27 AM Zeeshan Ali Shah via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > Dear All , i am getting strange issue the old OSS is decommissioned and > all of its OST moved to new OSS to new IP . suddenly in client side we

Re: [lustre-discuss] Old IP of OSS still appearing in client

2022-08-18 Thread Colin Faber via lustre-discuss
Running the command with no arguments against the targets will dump all configuration info you need On Thu, Aug 18, 2022, 1:55 AM Zeeshan Ali Shah wrote: > Thanks Colin , but how to get the current param of OSS or OST ? i wan to > know which failovernode ip or servicenode IP used . > > On Wed, A

Re: [lustre-discuss] Lustre 2.15.1 change log

2022-09-30 Thread Colin Faber via lustre-discuss
It was also fixed in 2.12.8 On Fri, Sep 30, 2022, 1:28 PM Simon Guilbault < simon.guilba...@calculquebec.ca> wrote: > Hi, the grant_shrink bug was fixed in 2.15.0 according to this JIRA: > https://jira.whamcloud.com/browse/LU-14124 > > On Fri, Sep 30, 2022 at 3:59 AM Tung-Han Hsieh < > thhs...@tw

Re: [lustre-discuss] Accessing files with bad PFL causing MDS kernel panics

2022-10-25 Thread Colin Faber via lustre-discuss
Hi Nathan, looks like you're hitting https://jira.whamcloud.com/browse/LU-16152 -cf On Tue, Oct 25, 2022 at 2:43 PM Nathan Crawford wrote: > Hi All, > > I'm looking for possible work-arounds to recover data from some > mis-migrated files (as seen in LU-16152). Basically, there's a bug in "l

Re: [lustre-discuss] Question about lustre2.15.2 between server and client instal

2023-02-12 Thread Colin Faber via lustre-discuss
The server packages contain everything needed to mount as a client. The client packages are a limited subset which only provide the client itself. On Sun, Feb 12, 2023 at 7:51 PM 王烁斌 via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > Hi~ > When the version 2.15.2 Lustre server is comp

Re: [lustre-discuss] Lustre crash and now lockup on ls -la /lustre

2023-02-23 Thread Colin Faber via lustre-discuss
What errors are indicated in the kernel ring buffer on the client (dmesg) ? On Wed, Feb 22, 2023 at 10:56 PM Sid Young via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > Hi all, > > I've been running lustre 2.12.6 and (clients are 2.12.7) on HP gear for > nearly 2 years and had an odd

Re: [lustre-discuss] How lustre uses the replication

2023-02-27 Thread Colin Faber via lustre-discuss
I think you're after "lfs mirror" and related commands. On Mon, Feb 27, 2023, 12:53 AM yuehui gan via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > But I can't find the relevant implementation, Is there a third-party > lustre replication implementation? > any help would be appreciate

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-15 Thread Colin Faber via lustre-discuss
Have you tried resilvering the pool? On Wed, Mar 15, 2023, 11:57 AM Mountford, Christopher J. (Dr.) via lustre-discuss wrote: > I'm hoping someone offer some suggestions. > > We have a problem on our production Lustre/ZFS filesystem (CentOS 7, ZFS > 0.7.13, Lustre 2.12.9), so far I've drawn a bl

Re: [lustre-discuss] question mark when listing file after the upgrade

2023-05-03 Thread Colin Faber via lustre-discuss
Hi, What does your client log indicate? (dmesg / syslog) On Wed, May 3, 2023, 7:32 AM Jane Liu via lustre-discuss < lustre-discuss@lists.lustre.org> wrote: > Hello, > > I'm writing to ask for your help on one issue we observed after a major > upgrade of a large Lustre system from RHEL7 + 2.12.9