Re: [lustre-discuss] Joining files

2023-03-30 Thread Patrick Farrell via lustre-discuss
them together correctly to get the desired result. I will defer to Andreas about what those primitives would be. -Patrick From: Sven Willner Sent: Thursday, March 30, 2023 1:47 AM To: Andreas Dilger Cc: Patrick Farrell; lustre-discuss@lists.lustre.org Subject: Re

Re: [lustre-discuss] Joining files

2023-03-29 Thread Patrick Farrell via lustre-discuss
Sven, The "combining layouts without any data movement" part isn't currently possible. It's probably possible in theory, but it's never been implemented. (I'm curious what your use case is?) Even allowing for data movement, there's no tool to do this for you. Depending what you mean by

Re: [lustre-discuss] Mounting lustre on block device

2023-03-16 Thread Patrick Farrell via lustre-discuss
Lustre doesn't show up in lsblk on the client because it isn't a block device on the client. NFS and other network file systems also don't show up lsblk, for the same reason. -Patrick From: lustre-discuss on behalf of Shambhu Raje via lustre-discuss Sent:

Re: [lustre-discuss] Question regarding user access during recovery and journal replay

2023-03-14 Thread Patrick Farrell via lustre-discuss
Marc, [Re-posting to the list...] No, it’s fine to have interaction during those times. The system is designed to do that work online. Depending what you’re trying to do and what you’re accessing, some client operations will experience delays, but that’s it. For example, during

Re: [lustre-discuss] Avoiding system cache when using ssd pfl extent

2022-05-19 Thread Patrick Farrell via lustre-discuss
is not of value as that would apply to all extents, whether on SSD or HDD. O_DIRECT on Lustre has been problematic for me in the past, performance-wise. John On 5/19/22 13:05, Patrick Farrell wrote: No, and I'm not sure I agree with you at first glance. Is this just generally an idea

Re: [lustre-discuss] Avoiding system cache when using ssd pfl extent

2022-05-19 Thread Patrick Farrell via lustre-discuss
No, and I'm not sure I agree with you at first glance. Is this just generally an idea that data stored on SSD should not be in RAM? If so, there's no mechanism for that other than using direct I/O. -Patrick From: lustre-discuss on behalf of John Bauer Sent:
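As a concrete sketch of the direct I/O route mentioned above (the mount point, file name, and sizes below are placeholders, not from the thread):

```shell
# Buffered write: data passes through the client page cache (RAM).
dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=64

# Direct write: bypasses the client page cache entirely. Note that
# O_DIRECT requires page-aligned offsets and transfer sizes.
dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=64 oflag=direct
```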

Re: [lustre-discuss] Write Performance is Abnormal for max_dirty_mb Value of 2047

2022-03-27 Thread Patrick Farrell via lustre-discuss
Hasan, Historically, there have been several bugs related to write grant when max_dirty_mb is set to large values (depending on a few other details of system setup). Write grant allows the client to write data into memory and write it out asynchronously. When write grant is not available to

Re: [lustre-discuss] 2.14 against mofed 5.5.1.0.3.2-rhel7.9

2022-03-07 Thread Patrick Farrell via lustre-discuss
Michael, Perhaps more importantly, Lustre 2.15 hasn't been released yet. (In general, the recommended matrix is maintenance release to maintenance release - So 2.15 clients and 2.12 servers will be a recommended configuration, once 2.15 is released.) -Patrick

Re: [lustre-discuss] RE-Fortran IO problem

2022-02-03 Thread Patrick Farrell via lustre-discuss
Denis, FYI, the git link you provided seems to be non-public - it asks for a GSI login. Fortran is widely used for applications on Lustre, so it's unlikely to be a Fortran-specific issue. If you're seeing I/O rates drop suddenly during activity, rather than being reliably low for some

Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-19 Thread Patrick Farrell via lustre-discuss
Ellis, As you may have guessed, that function just set looks like a node which is doing buffered I/O and thrashing for memory. No particular insight available from the count of functions there. Would you consider opening a bug report in the Whamcloud JIRA? You should have enough for a good

Re: [lustre-discuss] CPU soft lockup on mkfs.lustre

2019-09-11 Thread Patrick Farrell
Tamas, Aurélien, Would one of you mind opening an LU on this? Thanks, - Patrick From: lustre-discuss on behalf of Tamas Kazinczy Sent: Wednesday, September 11, 2019 1:32:09 AM To: Degremont, Aurelien ; lustre-discuss@lists.lustre.org Subject: Re:

Re: [lustre-discuss] Group and Project quota enforcement semantics

2019-08-05 Thread Patrick Farrell
Steve, The Lustre quota behavior is the standard Linux file system quota behavior - All data written by a user/group/in a project directory counts against all applicable quotas. You'll see the same if using quotas on EXT4, XFS, etc. Additionally (you didn't ask, but this is a common related
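The standard quota behavior described above can be checked with lfs quota; the mount point, user, group, and project ID below are placeholders:

```shell
# Usage and limits counted against a user, a group, and a project ID.
# Data written by this user in this group/project counts against all
# applicable quotas at once, as with EXT4 or XFS quotas.
lfs quota -u alice /mnt/lustre
lfs quota -g hpcgroup /mnt/lustre
lfs quota -p 1001 /mnt/lustre
```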

Re: [lustre-discuss] lnet.service reporting failure on start

2019-06-24 Thread Patrick Farrell
This is correct, and normal. It’s not really a failure (in the sense of a being a problem), it’s just that you’re using modules that aren’t signed with the key used by your kernel vendor. In general, if you’re getting third party modules (ie not from your kernel vendor), this happens, because

Re: [lustre-discuss] Stop writes for users

2019-05-14 Thread Patrick Farrell
Connections on demand has been done and is not relevant here – It just idles unused connections to save resources, no impact on ability to write, etc. - Patrick From: lustre-discuss on behalf of Alexander I Kulyavtsev Date: Tuesday, May 14, 2019 at 11:42 AM To:

Re: [lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-05-03 Thread Patrick Farrell
Thomas, As a general rule, Lustre only supports mixing versions on servers for rolling upgrades. - Patrick From: lustre-discuss on behalf of Thomas Roth Sent: Wednesday, April 24, 2019 3:54:09 AM To: lustre-discuss@lists.lustre.org Subject:

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-29 Thread Patrick Farrell
Neil, My understanding is marking the inode cache reclaimable would make Lustre unusual/unique among Linux file systems. Is that incorrect? - Patrick From: lustre-discuss on behalf of NeilBrown Sent: Monday, April 29, 2019 8:53:43 PM To: Jacek Tomaka Cc:

Re: [lustre-discuss] lfs find

2019-04-26 Thread Patrick Farrell
Would you mind listing current lfs find options to help kickstart discussion? It seems like I might want it for lots of them, maybe close to all - For example, stripe size seems at first (to me) it wouldn't be useful, but what if I want to check to see if anyone is using a weird stripe size? I

Re: [lustre-discuss] State of arm client?

2019-04-25 Thread Patrick Farrell
Also, you’ll need (I think?) fairly new Pis - Lustre only supports ARM64 and older ones were 32 bit. - Patrick From: lustre-discuss on behalf of Peter Jones Sent: Wednesday, April 24, 2019 11:08:38 PM To: Andrew Elwell; lustre-discuss@lists.lustre.org Subject:

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-14 Thread Patrick Farrell
:10:32 PM To: Patrick Farrell Cc: NeilBrown; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre client memory and MemoryAvailable Thanks Patrick for getting the ball rolling! >1/ w.r.t drop_caches, "2" is *not* "inode and dentry". The '2' bit > cause

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-14 Thread Patrick Farrell
should instead have encouraged Jacek to use lustre-devel :) - Patrick From: NeilBrown Sent: Sunday, April 14, 2019 6:38:47 PM To: Patrick Farrell; Jacek Tomaka; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre client memory and MemoryAvailable

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-14 Thread Patrick Farrell
echo 1 > drop_caches does not generate memory pressure - it requests that the page cache be cleared. It would not be expected to affect slab caches much. You could try 3 (1+2 in this case, where 2 is inode and dentry). That might do a bit more because some (maybe many?) of those objects
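The drop_caches values being discussed, as a sketch (requires root; this only asks the kernel to drop clean cached objects, it does not generate memory pressure):

```shell
# 1 = page cache, 2 = reclaimable slab (dentries and inodes), 3 = both.
sync                                 # write out dirty pages first
echo 1 > /proc/sys/vm/drop_caches    # page cache only
echo 3 > /proc/sys/vm/drop_caches    # page cache plus dentry/inode slabs
```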

Re: [lustre-discuss] EINVAL error when writing to a PFL file (lustre 2.12.0)

2019-03-29 Thread Patrick Farrell
mind, I'm just playing around with PFL. > > If this is currently not properly supported, a quick fix could be to prevent the user from creating such incomplete layouts? > > Regards, > Thomas > > On 2/22/19 5:33 PM, Patrick Farrell wrote: >&

Re: [lustre-discuss] Compiling lustre-2.10.6

2019-03-12 Thread Patrick Farrell
Hsieh, We have instructions for compiling from source here on our Wiki: https://wiki.whamcloud.com/display/PUB/Building+Lustre+from+Source Are you following those? If not, I'd suggest it - Your problem looks likely to be an error in the build process. We also have prebuilt 2.10.6 packages

Re: [lustre-discuss] Lustre 2.12.0 and locking problems

2019-03-05 Thread Patrick Farrell
Riccardo, Since 2.12 is still a relatively new maintenance release, it would be helpful if you could open an LU and provide more detail there - Such as what clients were doing, if you were using any new features (like DoM or FLR), and full dmesg from the clients and servers involved in these

Re: [lustre-discuss] Data migration from one OST to anther

2019-03-03 Thread Patrick Farrell
Hsieh, This sounds similar to a bug with pre-2.5 servers and 2.7 (or newer) clients. The client and server have a disagreement about which does the delete, and the delete doesn’t happen. Since you’re running 2.5, I don’t think you should see this, but the symptoms are the same. You can

Re: [lustre-discuss] Lustre client 2.11.0 with Lustre server lustre-2.12.0-ib

2019-03-02 Thread Patrick Farrell
Parag, I would be interested to know more about the application compatibility issues, but you should be OK with 2.11 clients and a 2.12 server. In general, newer clients are tested with older servers, more than the other way around, but especially with adjacent major releases you should be fine.

Re: [lustre-discuss] Draining and replacing OSTs with larger volumes

2019-02-28 Thread Patrick Farrell
: Scott Wood Sent: Thursday, February 28, 2019 6:15:54 PM To: Patrick Farrell; Jongwoo Han Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Draining and replacing OSTs with larger volumes My Thanks to both Jongwoo and Patrick for your responses. Great advice to do a practice run

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-28 Thread Patrick Farrell
This is very good advice, and you can also vary it to aid in removing old OSTs (thinking of the previous message) - simply take the old ones you wish to remove out of the pool, then new files will not be created there. Makes migration easier. One thing though: Setting a default layout

Re: [lustre-discuss] Draining and replacing OSTs with larger volumes

2019-02-28 Thread Patrick Farrell
Scott, I’d like to strongly second all of Jongwoo’s advice, particularly that about adding new OSTs rather than replacing existing ones, if possible. That procedure is so much simpler and involves a lot less messing around “under the hood”. It takes you from a complex procedure with many

Re: [lustre-discuss] Which release to use?

2019-02-22 Thread Patrick Farrell
Please file bugs if anything does! - Patrick From: lustre-discuss on behalf of Nathan R Crawford Sent: Friday, February 22, 2019 7:04:24 PM To: Peter Jones Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Which release to use? Thanks for the

Re: [lustre-discuss] EINVAL error when writing to a PFL file (lustre 2.12.0)

2019-02-22 Thread Patrick Farrell
to enforce a file size limit, but that's about it. - Patrick From: LEIBOVICI Thomas Sent: Friday, February 22, 2019 11:09:03 AM To: Patrick Farrell; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] EINVAL error when writing to a PFL file (lustre 2.12.0

Re: [lustre-discuss] Upgrade without losing data from 1.8.6 to 2.5.2 and back if necessary

2019-02-04 Thread Patrick Farrell
Wow, 1.8 is pretty old these days, as is 2.5 (first release 6 years ago!). I hope you're planning on upgrading past 2.5 once you've upgraded to it. (Honestly, this is all so old at this point you might consider letting your existing system reach EOL on 1.8.x and building a new file system

Re: [lustre-discuss] LFS tuning hierarchy question

2019-01-24 Thread Patrick Farrell
with the MDT, then that is, I believe, a maximum and not a default. From: Ms. Megan Larko Sent: Thursday, January 24, 2019 8:24:31 PM To: Lustre User Discussion Mailing List; Patrick Farrell Subject: [lustre-discuss] LFS tuning hierarchy question Thank you

Re: [lustre-discuss] LFS tuning hierarchy question

2019-01-24 Thread Patrick Farrell
It varies by value. If the server has a value set (with lctl set_param -P on the MGS), it will override the client value. Otherwise you'll get the default value. (Max pages per RPC is a bit of an exception in that the client and server will negotiate to "highest mutually supported" value for
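The override behavior described here can be sketched as follows (parameter and value are examples only):

```shell
# On the MGS: a persistent setting pushed to all clients; this
# overrides any value set locally on a client.
lctl set_param -P osc.*.max_rpcs_in_flight=16

# On a client: a local, non-persistent setting; lost on remount and
# superseded by any -P setting from the MGS.
lctl set_param osc.*.max_rpcs_in_flight=8
```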

Re: [lustre-discuss] ldiskfs performance degradation due to kernel swap hugging cpu

2018-12-28 Thread Patrick Farrell
Abe, You gave some general info, but unless I missed something, nothing specific to show any involvement by swap. How did you determine that? Can you share that data? And what performance are you expecting here? - Patrick From: lustre-devel on behalf of

[lustre-discuss] LU-8964/pio feature usage

2018-12-23 Thread Patrick Farrell
Good afternoon, There was a recent discussion on the lustre-devel mailing list in which I floated removing the 'pio' feature from Lustre. This is a client side i/o parallelization feature (splitting i/o in kernel space & using multiple worker threads) which is off by default and must be

Re: [lustre-discuss] no more free slots in catalog

2018-12-17 Thread Patrick Farrell
Julien, Could you share the details (LBUG plus full back trace, primarily) with the list? It would be good to know if it’s a known problem or not. Thanks! From: lustre-discuss on behalf of Julien Rey Sent: Monday, December 17, 2018 3:40:56 AM To:

Re: [lustre-discuss] How to solve issue when OSS is turned off?

2018-11-11 Thread Patrick Farrell
Default Lustre striping is just straight RAID0, so the data on (say) OST0 is not anywhere else. You can still access data and files on other OSTs, and you can create files that live on other OSTs, so I don’t think the MDS is useless. But this is the reason for failover - to ensure you can

Re: [lustre-discuss] Usage for lfs setstripe -o ost_indices

2018-11-09 Thread Patrick Farrell
“I am not able to specify -o to an existing file.” Yes, that’s expected - As with any other setstripe command, you cannot apply it to existing files which already have stripe information. (The exception is files created with LOV_DELAY_CREATE or mknod(), which do not have striping information
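A sketch of the behavior described above, with lfs migrate as the usual way to change the layout of a file that already has one (paths and OST indices are placeholders):

```shell
# Striping is fixed at create time: setstripe works on a new file...
lfs setstripe -o 0,1 /mnt/lustre/newfile     # creates file with this layout

# ...but fails on a file that already has stripe information.
lfs setstripe -o 0,1 /mnt/lustre/existing    # error: file exists

# To restripe existing data, lfs migrate copies it into a new layout
# (it accepts the same layout options as setstripe).
lfs migrate -o 0,1 /mnt/lustre/existing
```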

Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O)

2018-10-30 Thread Patrick Farrell
Andreas, An interesting thought on this, as the same limitation came up recently in discussions with a Cray customer. Strictly honoring the direct I/O expectations around data copying is apparently optional. GPFS is a notable example – It allows non page-aligned/page-size direct I/O, but it

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-23 Thread Patrick Farrell
this in the context of a bug. - Patrick From: Andreas Dilger Sent: Monday, October 22, 2018 8:55:57 PM To: Marion Hakanson Cc: Patrick Farrell; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5 On Oct 23, 2018, at 09

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-19 Thread Patrick Farrell
There is a somewhat hidden danger with eviction: You can get silent data loss. The simplest example is buffered (ie, any that aren't direct I/O) writes - Lustre reports completion (ie your write() syscall completes) once the data is in the page cache on the client (like any modern file system,
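One way to narrow the window described above is to force writeback before relying on the data; a minimal sketch using dd's conv=fsync (the output path is a placeholder):

```shell
# Buffered writes complete once the data is in the client's page cache;
# an eviction before writeback can silently discard them. conv=fsync
# makes dd call fsync before exiting, so the data has reached the
# servers by the time the command returns.
dd if=/dev/zero of=./out.bin bs=1M count=2 conv=fsync
```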

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-19 Thread Patrick Farrell
Marion, You note the deadlock reoccurs on server reboot, so you’re really stuck. This is most likely due to recovery where operations from the clients are replayed. If you’re fine with letting any pending I/O fail in order to get the system back up, I would suggest a client side action:

Re: [lustre-discuss] limit on number of oss/ost's?

2018-10-11 Thread Patrick Farrell
The 160 limit has been raised. I don't know what the new one is, but it is *quite* large. I'm pretty sure it's beyond practical interest today. There are a few issues with having extremely large numbers of OSTs, especially if you are explicitly trading off 1 vs many OSTs. There are no

Re: [lustre-discuss] lfs mirror create directory

2018-10-01 Thread Patrick Farrell
George, Your mirror is stale - look at the output. Mirroring in Lustre is currently a manual process, you have to manually resync a file after writing to it. lfs mirror resync is the lfs command. If your mirror is in sync, you should get the behavior you’re looking for. - Patrick
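The manual resync step mentioned above, sketched with a placeholder path:

```shell
# Inspect the mirror state; a replica written to since the last sync
# shows stale components in the layout output.
lfs getstripe /mnt/lustre/mirrored_file

# Manually resynchronize the stale replicas.
lfs mirror resync /mnt/lustre/mirrored_file
```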

Re: [lustre-discuss] Experience with resizing MDT

2018-09-27 Thread Patrick Farrell
Andreas, Take a closer look. It doesn't look to be connected to anything (this is current master). This is all the instances of it I see: C symbol: mdt_enable_remote_dir File Function Line 0 mdt_internal.h251 mdt_enable_remote_dir:1, 1

Re: [lustre-discuss] Second read or write performance

2018-09-21 Thread Patrick Farrell
yılmaz Sent: Friday, September 21, 2018 7:50:51 PM To: Patrick Farrell Cc: adil...@whamcloud.com; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Second read or write performance The problem solved by adding lustre fine tuning parameter to oss servers lctl set_param obdfilter.lı

Re: [lustre-discuss] separate SSD only filesystem including HDD

2018-08-28 Thread Patrick Farrell
: Tuesday, August 28, 2018 at 9:52 AM To: Patrick Farrell Cc: "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] separate SSD only filesystem including HDD 1) fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=20G --numjobs=4 --r

Re: [lustre-discuss] separate SSD only filesystem including HDD

2018-08-28 Thread Patrick Farrell
How are you measuring write speed? From: lustre-discuss on behalf of Zeeshan Ali Shah Sent: Tuesday, August 28, 2018 1:30:03 AM To: lustre-discuss@lists.lustre.org Subject: [lustre-discuss] separate SSD only filesystem including HDD Dear All, I recently

Re: [lustre-discuss] oldest lustre deployment?

2018-08-15 Thread Patrick Farrell
10:25:54 AM To: Patrick Farrell; Latham, Robert J.; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] oldest lustre deployment? I agree that there are still sites running Lustre 1.8.x in production, but I don’t think that it is a reasonable assumption that 2.7 or newer isn’t safe yet

Re: [lustre-discuss] oldest lustre deployment?

2018-08-15 Thread Patrick Farrell
Oh, yes. Absolutely. Many sites are running 2.5, a few are even running 1.8. It's not "officially supported", but that's all those matrices indicate. Sorry, assuming 2.7 or newer isn't safe yet. 2.5 may still be the largest single release by usage. Check these slides for an update

Re: [lustre-discuss] [lustre-devel] MDT test in rel2.11

2018-07-18 Thread Patrick Farrell
- couldn’t today) use for mdtest would very much be “writing to the benchmark” and defeating the intent. From: John Bent Sent: Tuesday, July 17, 2018 11:54:32 PM To: Patrick Farrell Cc: Abe Asraoui; lustre-de...@lists.lustre.org; lustre-discuss@lists.lustre.org

Re: [lustre-discuss] [lustre-devel] MDT test in rel2.11

2018-07-17 Thread Patrick Farrell
To be clear in case I sound too down on it - Lazy SoM is a very nice feature that will speed up important use cases. It’s just not going to jazz up mdtest #s. From: Patrick Farrell Sent: Tuesday, July 17, 2018 11:49:48 PM To: John Bent Cc: Abe Asraoui; lustre

Re: [lustre-discuss] [lustre-devel] MDT test in rel2.11

2018-07-17 Thread Patrick Farrell
it (accessed via an ioctl) and can accept information that may be stale. The intended use case is scanning the FS for policy application. From: John Bent Sent: Tuesday, July 17, 2018 10:55:24 PM To: Patrick Farrell Cc: Abe Asraoui; lustre-de...@lists.lustre.org

Re: [lustre-discuss] MDT test in rel2.11

2018-07-17 Thread Patrick Farrell
Abe, Any benchmarking would be highly dependent on hardware, both client and server. Is there a particular comparison (say, between versions) you’re interested in or something you’re concerned about? - Patrick From: lustre-devel on behalf of Abe Asraoui

Re: [lustre-discuss] Not able to load lustre modules on Luster client

2018-06-29 Thread Patrick Farrell
I am not certain, but I believe insmod does not attempt to fulfill dependencies. What happens when you try modprobe and what are the errors in dmesg then? From: lustre-discuss on behalf of vaibhav pol Sent: Thursday, June 28, 2018 11:44:10 PM To: Andreas
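The distinction being made here, as a sketch (requires root on a node with the Lustre client packages installed):

```shell
# insmod loads exactly one .ko and fails on unresolved symbols;
# modprobe resolves and loads module dependencies first.
modprobe lustre

# If loading fails, the specific reason usually appears in the kernel log.
dmesg | tail -n 50
```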

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-28 Thread Patrick Farrell
, June 27, 2018 11:26:52 PM To: Patrick Farrell Cc: adil...@whamcloud.com; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error yes, drbd will mirror the content of block devices between hosts synchronously or asynchronously. this will provide

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Patrick Farrell
I’m a little puzzled - it can switch, but isn’t the data on the failed disk lost...? That’s why Andreas is suggesting RAID. Or is drbd doing syncing of the disk? That seems like a really expensive way to get redundancy, since it would have to be full online mirroring with all the costs in

Re: [lustre-discuss] Lustre 2.11 File Level Replication

2018-06-22 Thread Patrick Farrell
Mark, Hmm. I’m adding the list back on here, because that *seems* like it’s wrong. Don’t have time to check right now, but I’m curious if others can weigh in. - Patrick From: Mark Roper Date: Friday, June 22, 2018 at 2:29 PM To: Patrick Farrell Subject: Re: [lustre-discuss] Lustre 2.11

Re: [lustre-discuss] Lustre 2.11 File Level Replication

2018-06-21 Thread Patrick Farrell
Mark, I haven’t played specifically with FLR and inheritance/templates, but if you want to set a default layout on a directory, you’ll want to look at lfs setstripe. Mirror extend is specifically for modifying individual, existing files. - Patrick From:
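The directory-default versus per-file distinction above, sketched with placeholder paths and example layout options (hedged: exact FLR option support varies by Lustre version):

```shell
# A default layout set on a directory is inherited by files created in
# it; here a two-replica mirrored (FLR) template as an example.
lfs setstripe -N -c 1 -N -c 2 /mnt/lustre/dir

# lfs mirror extend, by contrast, adds a replica to one existing file.
lfs mirror extend -N -c 1 /mnt/lustre/dir/existing_file
```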

Re: [lustre-discuss] Do I need Lustre?

2018-04-27 Thread Patrick Farrell
One factor is probably budget - Lustre is probably a higher budget option, in terms of hardware and time investment. I would guess at the 6-8 node range you probably don't need its speed, though you might need at least one other trick it has: One thing Lustre gives that NFS does not is the

Re: [lustre-discuss] Upgrade to 2.11: unrecognized mount option

2018-04-11 Thread Patrick Farrell
I think you missed it – It came out a few days ago, and Peter Jones announced it in what I assume was the usual manner. Maybe there’s a “which lists were sent to” issue? - Patrick From: lustre-discuss on behalf of "E.S. Rosenberg"

Re: [lustre-discuss] latest kernel version supported by Lustre ?

2018-04-09 Thread Patrick Farrell
Peter, Unfortunately, Riccardo was asking about server support. - Patrick From: lustre-discuss on behalf of Jones, Peter A Sent: Monday, April 9, 2018 7:24:52 AM To: Dilger, Andreas; Riccardo

Re: [lustre-discuss] Lustre 2.10.3 client fails to compile on centos 6.5

2018-04-08 Thread Patrick Farrell
Are you not able to move to a newer version even of CentOS6? 6.5 is no longer supported and it looks like you would have to revert some Lustre patches to get the newest client to build. From: lustre-discuss on behalf

Re: [lustre-discuss] varying sequential read performance.

2018-04-03 Thread Patrick Farrell
John, There’s a simple explanation for that lack of top line performance benefit - you’re not reading 16 GB then 16 GB then 16 GB etc. It’s interleaved. Read ahead will do large reads, much larger than your 1 MiB i/o size, so it’s all interleaved from four sources on every actual read

Re: [lustre-discuss] Static lfs?

2018-03-23 Thread Patrick Farrell
_ From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Patrick Farrell <p...@cray.com> Sent: Friday, March 23, 2018 3:17:14 PM To: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Static lfs? Ah, interesting – I got a question off list about th

Re: [lustre-discuss] Static lfs?

2018-03-23 Thread Patrick Farrell
: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Patrick Farrell <p...@cray.com> Date: Friday, March 23, 2018 at 3:03 PM To: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org> Subject: [lustre-discuss] Static lfs? Good afternoon, I

[lustre-discuss] Static lfs?

2018-03-23 Thread Patrick Farrell
, regardless of whether or not they’re already installed in the normal system library locations.) Regards, Patrick Farrell

Re: [lustre-discuss] File locking errors.

2018-02-20 Thread Patrick Farrell
that difference to be measurable in the context of real use of the flock code. On 2/20/18, 9:00 AM, "Prentice Bisbal" <pbis...@pppl.gov> wrote: On 02/20/2018 08:58 AM, Patrick Farrell wrote: > There is almost NO overhead to this locking unless you’re using it to

Re: [lustre-discuss] File locking errors.

2018-02-20 Thread Patrick Farrell
8 6:47:16 AM To: Prentice Bisbal Cc: lustre-discuss Subject: Re: [lustre-discuss] File locking errors. On Fri, Feb 16, 2018 at 11:52 AM, Prentice Bisbal <pbis...@pppl.gov> wrote: > On 02/15/2018 06:30 PM, Patrick Farrell wrote: >> Localflock will only provide flock between threads on

Re: [lustre-discuss] File locking errors.

2018-02-15 Thread Patrick Farrell
sense, and it would mean localflock would be safe, unless you had some other application which looked for flocks before accessing a file. From: Arman Khalatyan <arm2...@gmail.com> Sent: Thursday, February 15, 2018 5:38:39 PM To: Patrick Farrell Cc: E.S. Ros

Re: [lustre-discuss] File locking errors.

2018-02-15 Thread Patrick Farrell
Localflock will only provide flock between threads on the same node. I would describe it as “likely to result in data corruption unless used with extreme care”. Are you sure HDF only ever uses flocks between threads on the same node? That seems extremely unlikely or maybe impossible for

Re: [lustre-discuss] Are there any performance hits with the https://access.redhat.com/security/vulnerabilities/speculativeexecution?

2018-01-08 Thread Patrick Farrell
Note though that since the servers live in kernel space they are also going to be affected only minimally. The Lustre server code itself will see zero effect, since it’s entirely kernel code. Other things running on those servers may see impact, and if there’s enough user space stuff,

Re: [lustre-discuss] Lustre Client in a container

2018-01-03 Thread Patrick Farrell
. From: Dilger, Andreas <andreas.dil...@intel.com> Sent: Wednesday, January 3, 2018 4:20:56 AM To: David Cohen Cc: Patrick Farrell; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre Client in a container On Dec 31, 2017, at 01:50, David Cohen <cda...@physics.technion.ac.

Re: [lustre-discuss] Lustre Client in a container

2017-12-31 Thread Patrick Farrell
ieb David Cohen <cda...@physics.technion.ac.il>: > > Patrick, > Thanks for your response. > I'm looking for a way to migrate from a 1.8.9 system to 2.10.2, stable enough to > run the several weeks or more that it might take. > > > David > > On Sun, Dec 31, 2017 at 12:12

Re: [lustre-discuss] Lustre Client in a container

2017-12-31 Thread Patrick Farrell
. From: David Cohen <cda...@physics.technion.ac.il> Sent: Sunday, December 31, 2017 2:50:05 AM To: Patrick Farrell Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre Client in a container Patrick, Thanks for your response. I'm looking for a way to migrate from a 1.8.9

Re: [lustre-discuss] Designing a new Lustre system

2017-12-20 Thread Patrick Farrell
I won’t try to answer all your questions (I’m not really qualified to opine), but a quick one on ZFS: ZFS today is still much slower for the MDT. It’s competitive on OSTs, arguably better, depending on your needs and hardware. So a strong choice for a config today would be ldiskfs MDTs and

Re: [lustre-discuss] BAD CHECKSUM

2017-12-07 Thread Patrick Farrell
I would think it's possible if the application is doing direct I/O. This should be impossible for buffered I/O, since the checksums are all calculated after the copies in to kernel memory (the page cache) are complete, so it doesn't matter what userspace does to its memory (at least, it doesn't

Re: [lustre-discuss] Lustre and Elasticsearch

2017-11-26 Thread Patrick Farrell
They more or less don't. They only come in to play for applications that explicitly ask for them and the implementation is fast and efficient (it's tied in to the standard Lustre locking mechanisms) - Patrick From: lustre-discuss

Re: [lustre-discuss] ZFS-OST layout, number of OSTs

2017-10-24 Thread Patrick Farrell
It can be pretty easily inferred from the nature of the feature. If a decent policy is written and applied to all files (starting with few stripes and going to many as size increases), then it will resolve the problem of large files on single OSTs. If the policy is not universally applied or

Re: [lustre-discuss] FW: Lustre 2.10.1 released

2017-10-24 Thread Patrick Farrell
Peter, Not mine - Elis. (I knew that one. :) ) - Patrick From: Jones, Peter A <peter.a.jo...@intel.com> Sent: Tuesday, October 24, 2017 11:43:30 AM To: E.S. Rosenberg; Patrick Farrell Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discu

Re: [lustre-discuss] ZFS-OST layout, number of OSTs

2017-10-22 Thread Patrick Farrell
Thomas, This is likely a reflection of an older issue, since resolved. For a long time, Lustre reserved max_rpcs_in_flight*max_pages_per_rpc for each OST (on the client). This was a huge memory commitment in larger setups, but was resolved a few versions back, and now per OST memory usage on

Re: [lustre-discuss] Acceptable thresholds

2017-10-19 Thread Patrick Farrell
Several processes per CPU core, probably? It’s a lot. But there’s a lot of environmental and configuration dependence here too. Why not look at how many you have running currently when Lustre is set up and set the limit to double that? Watching process count isn’t a good way to measure load

Re: [lustre-discuss] Lustre clients and Lustre servers (MDS/OSS) operating system requirements?

2017-10-10 Thread Patrick Farrell
Amjad, To answer your question more directly… Operating system differences between client and server don’t matter – it’s very common to deploy clients and servers using different kernels and/or different distributions. However, Lustre versions do matter. There is probably no client version

Re: [lustre-discuss] FW: Lustre 2.10.1 released

2017-10-07 Thread Patrick Farrell
Michael, In general, yes, and if the kernel versions match, then almost certainly specifically too. I’ve certainly done it on a Fedora box in the past. - Patrick From: lustre-discuss on behalf of Michael Watters

Re: [lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load

2017-09-27 Thread Patrick Farrell
A guess for you to consider: A very common cause of ksoftirqd load is a hypervisor putting memory pressure on a VM. At least VMWare, and I think KVM and others, use IRQs to implement some of their memory management and it can show up like this. That would of course mean it's not really the

Re: [lustre-discuss] Running IBM Power boxes as OSSs?

2017-09-01 Thread Patrick Farrell
While I can't speak to Intel's intentions, I can say this: Lustre clients work well on Power architectures (see, for example, LLNL), so a lot of the work is done and there shouldn't be any endianness issues in the network part of the code. I suspect the server code would also build for such an

Re: [lustre-discuss] Bandwidth bottleneck at socket?

2017-08-30 Thread Patrick Farrell
r, with each worker reading in its portion of the file. Hmm. I shall try doing multiple copies at the same time to see what happens. That, I hadn't tested. We are using Lustre 2.10.51-1 under CentOS 7 kernel 3.10.0-514.26.2 Brian On 8/30/2017 9:32 AM, Patrick Farrell wrote: Brian, I'm n

Re: [lustre-discuss] Bandwidth bottleneck at socket?

2017-08-30 Thread Patrick Farrell
Brian, I'm not sure what you mean by "socket level". A starter question: How fast are your OSTs? Are you sure the limit isn't the OST? (Easy way to test - Multiple files on that OST from multiple clients, see how that performs) (lfs setstripe -i [index] to set the OST for a singly striped
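A sketch of the single-OST test described above (the mount point and OST index are hypothetical; `lfs setstripe -c 1 -i <index>` pins a new file to one OST):

```shell
# Hypothetical mount point and OST index -- adjust for your system.
MNT=/mnt/lustre
IDX=3
# Pin a new file to a single OST: stripe count 1, starting at index $IDX.
CMD="lfs setstripe -c 1 -i $IDX $MNT/ost${IDX}_test"
echo "$CMD"
# On a real Lustre client, run it, then write to copies of this file from
# several clients and compare the aggregate bandwidth to one client's:
# $CMD
# dd if=/dev/zero of=$MNT/ost${IDX}_test bs=1M count=4096 oflag=direct
```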

Re: [lustre-discuss] Best way to run serverside 2.8 w. MOFED 4.1 on Centos 7.2

2017-08-18 Thread Patrick Farrell
I would strongly suggest make -j something for parallelism, unless you want to have time to go out for your coffee. From: lustre-discuss on behalf of Christopher Johnston Sent: Friday, August 18,
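For example (assuming GNU make and the coreutils `nproc` utility; the configured Lustre build tree itself is not shown):

```shell
# One make job per CPU core; nproc reports the number of online cores.
JOBS=$(nproc)
echo "would run: make -j$JOBS"
# In the configured Lustre source tree:
# make -j"$JOBS"
```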

Re: [lustre-discuss] nodes crash during ior test

2017-08-04 Thread Patrick Farrell
Brian, What is the actual crash? Null pointer, failed assertion/LBUG...? Probably just a few more lines back in the log would show that. Also, Lustre 2.10 has been released, you might benefit from switching to that. There are almost certainly more bugs in this pre-2.10 development version

Re: [lustre-discuss] Problem with raising osc.*.max_rpcs_in_flight

2017-07-03 Thread Patrick Farrell
It definitely is limited to 32 buckets. We've toyed with raising that limit (and Cray did so internally), but it does use some memory, etc. So that's almost certainly the issue you're seeing, Reinoud. RPCs larger than the largest size appear as the largest size. - Patrick

Re: [lustre-discuss] Per-client I/O Operation Counters

2017-06-01 Thread Patrick Farrell
rpc_stats on the clients may be helpful here, as a first step. They are found in /proc/fs/lustre/osc/[target]/rpc_stats on the client. Contents should be mostly self-explanatory. Look for lots of small RPCs. - Patrick From: lustre-discuss
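As a sketch of what "lots of small RPCs" looks like, here is a hypothetical excerpt in the rpc_stats histogram format plus an awk one-liner that counts RPCs below 16 pages (the real file lives at the /proc path above and contains more sections and columns):

```shell
# Hypothetical "pages per rpc" histogram; on a real client, read
# /proc/fs/lustre/osc/<target>/rpc_stats instead of this sample.
cat > /tmp/rpc_stats.sample <<'EOF'
pages per rpc         rpcs   % cum %
1:                     120   40   40
16:                     30   10   50
256:                   150   50  100
EOF
# Sum the rpcs column for buckets below 16 pages (< 64 KiB with 4 KiB
# pages); a large count here points at small, inefficient I/O.
SMALL=$(awk 'NR > 1 && $1+0 < 16 { small += $2 } END { print small+0 }' /tmp/rpc_stats.sample)
echo "$SMALL small RPCs"
```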

Re: [lustre-discuss] client complains about server version

2017-05-06 Thread Patrick Farrell
2.8 as well?) - Patrick From: Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> Sent: Saturday, May 6, 2017 10:03:17 PM To: Patrick Farrell; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] client complains about server version thanks for the

Re: [lustre-discuss] client complains about server version

2017-05-06 Thread Patrick Farrell
Riccardo, You may be unable to free space on the OSTs when deleting files. I can't remember if 2.4 has the required support for delete-from-MDS (not the real feature name, sorry). I think it does, but I'm not sure. It's easy to check - just delete a large file and see if the space shows up
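A quick way to run that check (mount point and file name are hypothetical; OST object destruction is asynchronous, so allow a few seconds between the delete and the second look):

```shell
# Hypothetical client mount point and test file.
MNT=/mnt/lustre
F=$MNT/large_delete_test
echo "checking OST free space on $MNT before and after removing $F"
# On a real Lustre client:
# lfs df -h "$MNT"      # note OST free space
# rm "$F"
# sleep 10              # object destroys happen asynchronously
# lfs df -h "$MNT"      # free space should have grown
```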

Re: [lustre-discuss] Lustre 2.8.0 - MDT/MGT failing to mount

2017-05-04 Thread Patrick Farrell
Hm, I'm not sure everyone here is talking about the same ordering... As I understand it: The writeconf process is to unmount everything, then writeconf all your targets (order doesn't matter, pretty sure - Someone will correct me if not...), then mount in the order Colin gave - MGS/MDT, then
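A sketch of that sequence with hypothetical device names (`tunefs.lustre --writeconf` regenerates each target's configuration logs; everything must be unmounted first):

```shell
# The writeconf step can touch targets in any order, but remounting must
# go MGS/MDT first, then OSTs, then clients.
ORDER="MGS/MDT OSTs clients"
echo "mount order: $ORDER"
# With all targets unmounted (hypothetical devices):
# tunefs.lustre --writeconf /dev/mdt_dev            # repeat for every target
# tunefs.lustre --writeconf /dev/ost0_dev
# mount -t lustre /dev/mdt_dev /mnt/mdt             # MGS/MDT first
# mount -t lustre /dev/ost0_dev /mnt/ost0           # then OSTs
# mount -t lustre mgsnode@tcp:/fsname /mnt/lustre   # clients last
```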

Re: [lustre-discuss] operation ldlm_queue failed with -11

2017-05-03 Thread Patrick Farrell
Rick, Lydia, That reasoning is sound, but this is a special case. -11 (-EAGAIN) on ldlm_enqueue is generally OK... LU-8658 explains the situation (it's POSIX flocks), so I'm going to reference that rather than repeat it here. https://jira.hpdd.intel.com/browse/LU-8658 - Patrick

Re: [lustre-discuss] Compile a C++ app. using the Lustre API

2017-03-15 Thread Patrick Farrell
It looks like your compiler is being fussier than the C compiler. Specifically, the problem appears to be with the enum type. The C compiler is happy to let a short (cr_flags) pass where an enum is called for (the argument to changelog_rec_offset). In C, I think an enum is an int (so

Re: [lustre-discuss] LNET Self-test

2017-02-05 Thread Patrick Farrell
Doug, It seems to me that's not true any more, with larger RPC sizes available. Is there some reason that's not true? - Patrick From: lustre-discuss on behalf of Oucharek, Doug S Sent:

Re: [lustre-discuss] Status of LU-8703 for Knights Landing

2017-02-01 Thread Patrick Farrell
Andrew, Are they really just not working? I didn't see that with KNL (the default CPT generated without the fixes from LU-8703 is very weird, but didn't affect performance much - the real NUMA-ness of KNL processors seems to be minimal, despite the various NUMA related configuration
