[lustre-discuss] Eternally Invalid Lock?

2024-04-24 Thread Ellis Wilson via lustre-discuss
Hi all, (This is on 2.15.4 with very limited modifications, none to speak of in ldlm or similar) Very rarely, when attempting to perform an lctl barrier_freeze, we run into a situation where it fails with EINVAL. At that point all future lctl barrier operations (including rescan) return

[lustre-discuss] Full List of Required Open Lustre Ports?

2023-02-01 Thread Ellis Wilson via lustre-discuss
Hi folks, We've seen some weird stuff recently with UFW/iptables dropping packets on our OSS and MDS nodes. We are running 2.15.1. Example: [ 69.472030] [UFW BLOCK] IN=eth0 OUT= MAC= SRC= DST= LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=58224 DF PROTO=TCP SPT=1022 DPT=988 WINDOW=510 RES=0x00 ACK

[lustre-discuss] Intent of resize in mkfs.lustre

2022-11-09 Thread Ellis Wilson via lustre-discuss
Hi all, I ran into an issue against drives just shy of 16TiB with mkfs.lustre, which appears to relate to how resize is employed by mkfs.lustre when it calls mke2fs. I've opened this: https://jira.whamcloud.com/browse/LU-16305 Side note: how do I assign something to myself? I have a fix but

Re: [lustre-discuss] lproc stats changed snapshot_time from unix-epoch to uptime/monotonic in 2.15

2022-08-25 Thread Ellis Wilson via lustre-discuss
e original patch, which was commit ea2cd3af7b). Cheers, Andreas > On Aug 24, 2022, at 14:50, Ellis Wilson via lustre-discuss > wrote: > > Hi all, > > One of my colleagues noticed that in testing 2.15.1 out the stats returned > include snapshot_time showing up in

[lustre-discuss] lproc stats changed snapshot_time from unix-epoch to uptime/monotonic in 2.15

2022-08-24 Thread Ellis Wilson via lustre-discuss
Hi all, One of my colleagues noticed that in testing 2.15.1 out the stats returned include snapshot_time showing up in a different fashion than before. Previously, ktime_get_real_ts64 was used to get the current timestamp and that was presented when stats were printed, whereas now uptime is

Re: [lustre-discuss] [EXTERNAL] Limiting Lustre memory use?

2022-02-22 Thread Ellis Wilson via lustre-discuss
Hi Bill, I just ran into a similar issue. See: https://jira.whamcloud.com/browse/LU-15468 Lustre definitely caches data in the pagecache, and as far as I have seen metadata in slab. I'd start by running slabtop on a client machine if you can stably reproduce the OOM situation, or creating a

[lustre-discuss] Appropriate Umount Ordering

2022-02-17 Thread Ellis Wilson via lustre-discuss
Hi all, Two (hopefully) simple questions this time around. This is for 2.14.0, and my cluster is set up with no failovers for MDTs or OSTs. OBD timeouts have not been altered from the defaults. Question 1: I read on the Lustre Wiki that the appropriate ordering to umount the various

Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-02-14 Thread Ellis Wilson via lustre-discuss
Ellis, JIRA accounts can be requested from i...@whamcloud.com Peter From: lustre-discuss <lustre-discuss-boun...@lists.lustre.o

Re: [lustre-discuss] [EXTERNAL] Re: Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-20 Thread Ellis Wilson via lustre-discuss
.@lists.lustre.org> on behalf of Ellis Wilson via lustre-discuss <lustre-discuss@lists.lustre.org> Reply-To: Ellis Wilson <elliswil...@microsoft.com> Date: Thursday, January 20, 2022 at 6:20 AM To: Raj <rajgau...@gmail.com>, Patrick Farrell mai

Re: [lustre-discuss] [EXTERNAL] Re: Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-20 Thread Ellis Wilson via lustre-discuss
attach if compressed. Regards, Patrick From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Ellis Wilson via lustre-discuss <lustre-discuss@lists.lustre.org> Sent: Wednesday, January 19, 2022 8:32 AM To: Andreas Di

[lustre-discuss] Memory Management in Lustre

2022-01-19 Thread Ellis Wilson via lustre-discuss
Hi folks, Broader (but related) question than my current malaise with OOM issues on 2.14/2.15: Is there any documentation or can somebody point me at some code that explains memory management within Lustre? I've hunted through Lustre manuals, the Lustre internals doc, and a bunch of code,

Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-19 Thread Ellis Wilson via lustre-discuss
On Jan 18, 2022, at 13:40, Ellis Wilson via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Recently we've switched from using ZFS to ldiskfs as the backing filesystem to work around some performance issues and I'm finding that when I put the cluster un

[lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-18 Thread Ellis Wilson via lustre-discuss
Hi all, Recently we've switched from using ZFS to ldiskfs as the backing filesystem to work around some performance issues, and I'm finding that when I put the cluster under load (with as little as a single client) I can almost completely lock up the client. SSH (even existing sessions) stall,

[lustre-discuss] Configuring File Layout Questions

2021-07-13 Thread Ellis Wilson via lustre-discuss
Hi Lustre folks, A few questions about configuring file layouts, specifically progressive file layouts: 1. In a freshly stood-up Lustre cluster, if there are no clients yet mounted, are there any Lustre utilities (I've not found one) that allow one to perform the equivalent of "lfs