Re: [lustre-discuss] Changing default recovery window time settings

2022-08-09 Thread Spitz, Cory James via lustre-discuss
The classical way to put a limit on recovery is to use the recovery_time_soft and recovery_time_hard mount options. See the mount.lustre options: https://doc.lustre.org/lustre_manual.xhtml#idm139974521647280 recovery_time_soft=timeout Allows timeout seconds for clients to reconnect for
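
As a rough sketch of how the two mount options named above might be combined, the snippet below builds the option string and prints the resulting mount command instead of running it. The device path, mount point, and timeout values are hypothetical placeholders, not values taken from the thread.

```shell
#!/bin/sh
# Hypothetical target device and mount point; substitute your own.
DEV=/dev/sdX
MNT=/mnt/lustre-ost0

# Illustrative values only (in seconds); pick limits that suit your site.
SOFT=150
HARD=300

# Build the option string from the two recovery mount options.
OPTS="recovery_time_soft=${SOFT},recovery_time_hard=${HARD}"

# Print the command for review rather than executing it.
echo mount -t lustre -o "$OPTS" "$DEV" "$MNT"
```

Printing the command first makes it easy to check the option string before it is applied to a real target.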

Re: [lustre-discuss] Changing default recovery window time settings

2022-08-06 Thread Andreas Dilger via lustre-discuss
The maximum amount of time that recovery will run is controlled by "at_max". The default is 600s (10 mins), but on my 2-client home cluster (with a relatively light load) recovery usually finishes in 10s or less. You can reduce the timeout based on your typical recovery time. Note that
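
A minimal sketch of inspecting and lowering "at_max" with lctl, assuming the parameter is set on the relevant server nodes. The value of 300 is a hypothetical example, not a recommendation from the thread, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical new upper bound in seconds (the message cites a 600s default).
AT_MAX=300

# Command to read the current value.
GET_CMD="lctl get_param at_max"

# Command to lower the limit; this form is not persistent across restarts.
SET_CMD="lctl set_param at_max=${AT_MAX}"

# Print both commands for review instead of running them.
echo "$GET_CMD"
echo "$SET_CMD"
```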

[lustre-discuss] Changing default recovery window time settings

2022-08-04 Thread Christian Kuntz
Hello all, I'm wondering if there is any way to tune the maximum amount of time that Lustre will use for a recovery window in the event that imperative recovery fails due to the failover of an MGS. On MGS failover, we appear to hit a default timeout of around 6 minutes that seems to be