On May 11, 2022, at 08:25, Nathan Dauchy wrote:
> 
> Greetings!

Hello Nathan,

> During the helpful LUG tutorial from Rick Mohr on advanced lustre file 
> layouts, it was mentioned that “lfs mirror” could be used to improve read 
> performance.  And the manual supports this, stating “files that are 
> concurrently read by many clients (e.g. input decks, shared libraries, or 
> executables) the aggregate parallel read performance of a single file can be 
> improved by creating multiple mirrors of the file data”.
>  
> What method does Lustre use to ensure that multiple clients balance their 
> read workloads from the multiple mirrors?

Currently (2.15.0), if no mirror copy is marked "prefer", the client picks the 
mirror with the most stripes on flash devices (vs. mirrors on HDDs).  If 
several mirrors are still tied, it uses the hash of a client memory pointer 
address modulo the mirror count.  This should be relatively random for each 
client, which distributes the read workload across mirrors. 
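
The selection order described above can be sketched roughly as follows.  This 
is only an illustration in Python, not the actual client code: the Mirror 
class, the pick_mirror() name, and the client_hash argument (standing in for 
the hashed pointer address) are all hypothetical, and breaking ties between 
several "prefer" mirrors with the same hash is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    """Hypothetical, simplified view of one mirror of a file."""
    mirror_id: int
    prefer: bool = False      # "prefer" flag set on this mirror
    flash_stripes: int = 0    # number of stripes on flash (non-rotational) OSTs


def pick_mirror(mirrors, client_hash):
    """Pick a mirror to read from, following the order described above.

    client_hash stands in for the hash of a client memory pointer address;
    any per-client pseudo-random non-negative integer works for this sketch.
    """
    if not mirrors:
        raise ValueError("file has no mirrors")

    preferred = [m for m in mirrors if m.prefer]
    if preferred:
        # Assumption: several preferred mirrors are tie-broken by the hash too.
        candidates = preferred
    else:
        # Keep only the mirrors with the most stripes on flash devices.
        most_flash = max(m.flash_stripes for m in mirrors)
        candidates = [m for m in mirrors if m.flash_stripes == most_flash]

    # Still more than one candidate: hash modulo candidate count.
    return candidates[client_hash % len(candidates)]


mirrors = [Mirror(1, flash_stripes=0),
           Mirror(2, flash_stripes=4),
           Mirror(3, flash_stripes=4)]
# Two clients with different hash values land on different flash mirrors.
print(pick_mirror(mirrors, 0).mirror_id, pick_mirror(mirrors, 1).mirror_id)
```

With identical mirrors on the same class of storage, only the last step 
matters, so the balance across mirrors depends entirely on how evenly the 
hash values are spread over the clients.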

I'm not totally sure why the "hash of the pointer address" mechanism was 
implemented, as clients typically use the client NID as the basis for 
"autonomous" load distribution (modulo mirror count in this case) so that the 
workload is "ideally" distributed across copies without any added 
communication.  The latter is what is described in LU-10158 "FLR: Define a 
replica choosing policy function", but this is not fully implemented.
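
For comparison, a NID-based choice could look something like the sketch 
below.  This is only an illustration of the "NID modulo mirror count" idea, 
not the LU-10158 design: the mirror_for_nid() name is hypothetical, the NIDs 
are made up, and zlib.crc32 is an arbitrary stand-in for whatever hash a real 
implementation would use.

```python
import zlib
from collections import Counter


def mirror_for_nid(nid: str, mirror_count: int) -> int:
    """Return a mirror index in [0, mirror_count) derived from a client NID.

    Each client computes this on its own, so no communication is needed,
    and the same client always gets the same mirror for a given count.
    """
    if mirror_count <= 0:
        raise ValueError("mirror_count must be positive")
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(nid.encode("ascii")) % mirror_count


# Made-up NIDs for 100 clients, reading a file that has 3 mirrors.
nids = [f"10.0.0.{i}@tcp" for i in range(1, 101)]
spread = Counter(mirror_for_nid(nid, 3) for nid in nids)
print(dict(spread))  # number of clients assigned to each mirror index
```

How even the spread is depends on the hash and on the set of NIDs, so it 
would need to be checked against the NIDs of a real cluster.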

> Are there any tuning parameters that should be considered, other than making 
> sure the “preferred” flag is NOT set on a single mirror, to help even out the 
> read workload among the OSTs?
>  
> Has anyone tested this and quantified the performance improvement?

I don't recall seeing any benchmarks to verify this behavior for reads, but I'd 
be interested to learn of any results you find.

The typical FLR uses that I'm aware of are mainly HDD and NVMe mirror copies, 
not multiple copies on the same class of storage.  Those sites either set the 
"prefer" flag on the flash mirror or, with LU-14996, rely on the client 
checking the OS_STATFS_NONROT flag from the OSTs (to see whether this is 
reported, check "lfs df -v" for the 'f' (flash) flag).

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud






