Re: [lustre-discuss] Lustre 2.12.6 on RHEL 7.9 not able to mount disks after reboot

2022-08-09 Thread Crowder, Jonathan via lustre-discuss
This is fantastic output; you guessed correctly. We use an md RAID on ldiskfs. We're getting ext4 I/O errors on boot and were able to reboot enough to get into a state where we can read the mount, so we are moving data off that mount onto another and rebuilding (which is the intended design of thi

Re: [lustre-discuss] Lustre 2.12.6 on RHEL 7.9 not able to mount disks after reboot

2022-08-09 Thread Cameron Harr via lustre-discuss
JC, the message where it asks if the MGS is running is a pretty common error that you'll see when something isn't right. There's not a lot of detail in your message, but the first step is to make sure your OST device is present on the OSS server. You mentioned remounting the RAID directories; is t

Re: [lustre-discuss] Changing default recovery window time settings

2022-08-09 Thread Spitz, Cory James via lustre-discuss
The classical way to put a limit on recovery is to use the recovery_time_soft and recovery_time_hard mount options. See the mount.lustre options: https://doc.lustre.org/lustre_manual.xhtml#idm139974521647280 recovery_time_soft=timeout Allows timeout seconds for clients to reconnect for recovery
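
As a rough illustration of the two mount options named above, here is a minimal Python sketch that only assembles a mount command line carrying both of them. The helper name lustre_mount_command, the device path, the mount point, and the timeout values are all hypothetical placeholders, not taken from this thread, and the check that the soft limit does not exceed the hard limit is an assumption of this sketch, not documented Lustre behavior.

```python
def lustre_mount_command(device, mount_point, soft_secs, hard_secs):
    """Build (but do not run) a mount command with both recovery limits.

    All arguments are illustrative; substitute values for your own target.
    """
    if soft_secs <= 0 or hard_secs <= 0:
        raise ValueError("recovery timeouts must be positive")
    # Assumption of this sketch: the soft limit should not exceed the hard one.
    if soft_secs > hard_secs:
        raise ValueError("soft timeout should not exceed hard timeout")
    options = "recovery_time_soft={},recovery_time_hard={}".format(
        soft_secs, hard_secs
    )
    return ["mount", "-t", "lustre", "-o", options, device, mount_point]


if __name__ == "__main__":
    # Hypothetical device and mount point, used purely for illustration.
    cmd = lustre_mount_command("/dev/example_ost", "/mnt/example_ost", 300, 900)
    print(" ".join(cmd))
```

The sketch prints the command instead of executing it, so the result can be reviewed before anything is mounted on a server.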

[lustre-discuss] Lustre 2.12.6 on RHEL 7.9 not able to mount disks after reboot

2022-08-09 Thread Crowder, Jonathan via lustre-discuss
Hello, this is my first post here so I may need some guidance on the function of this system. I am in a small team supporting some 36TB lustre servers for a business unit. Our configuration per mount point is one lustre master node and 3 lustre object stores. We had one of the object stores los