Ah, OK. It appears I didn't read through the discussion carefully enough...

________________________________________
From: Andreas Dilger <[email protected]>
Sent: Tuesday, December 23, 2025 8:09
To: Åke Sandgren
Cc: [email protected]
Subject: Re: [lustre-discuss] Overstriping setting

Hi Åke,
I'm not arguing against overstriping itself. Definitely for shared file 
workloads, having more objects/locks can improve performance.

The question is whether, e.g., 2 stripes on each of 100 OSTs is faster than 1 
stripe on each of 200 OSTs, not whether it is faster than 1 stripe on each of 
100 OSTs...

Cheers, Andreas

> On Dec 22, 2025, at 23:37, Åke Sandgren via lustre-discuss 
> <[email protected]> wrote:
>
> Hi!
>
> That logic only applies when the OSTs are made up of single disks. If they 
> are LUNs behind a RAID controller or otherwise consist of multiple physical 
> disks, then overstriping can indeed result in higher performance. We've seen 
> this when overstriping on our DDN-based Lustre: up to 4x overstriping gave 
> a more or less linear increase. Those OSTs are 8+2 RAID6-ish. I never 
> tried 8x overstriping because 4x was enough for our purposes.
> Also, when testing we used 4 stripes per OST across all 8 OSTs, i.e. 32 
> stripes on 8 OSTs.
>
> ________________________________________
> From: lustre-discuss <[email protected]> on behalf of 
> Andreas Dilger via lustre-discuss <[email protected]>
> Sent: Tuesday, December 23, 2025 1:47
> To: Wei-Keng Liao
> Cc: [email protected]
> Subject: Re: [lustre-discuss] Overstriping setting
>
> I don't think that using 3 stripes per OST is ever going to be
> faster than using 3 separate OSTs, especially if the OSTs are
> HDD-based instead of flash.  Even with NVMe OSTs, there is still
> contention on the block device queue (elevator, queue depth, etc.).
>
> With separate OSTs, there are more resources available that
> can be leveraged with less contention.  Consider DLM lock server
> resources such as the DLM lock hash, or OST filesystem resources
> like block allocators.  With separate OSTs, those can be used
> with less contention compared to having 3 objects sharing the
> same resources.
>
> Also, using more OSTs (when warranted) will distribute space
> usage more evenly across devices.
>
> That said, there is some benefit to potentially leaving a few
> OSTs out of the allocation, if that aligns with the application.
> That allows the MDS to skip OSTs that are full or busy, instead
> of trying to always allocate objects from all of the OSTs.
>
> That said, there isn't an easy way to overstripe, say, 900 stripes
> evenly across 300 of the 370 OSTs, instead of 3 stripes on 160 of
> the 370 OSTs and 2 stripes on 210 of the OSTs.  It _might_ be good
> to do this if it shows better performance, but I think even then
> the uneven loading would still be better than only using 300 OSTs.
>
> Cheers, Andreas
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
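
The stripe counts in the quoted message (900 stripes over 370 OSTs giving
3 stripes on 160 OSTs and 2 stripes on 210) can be checked with a few lines
of arithmetic. This is only a sketch: spread_stripes is a hypothetical
helper written for this note, not part of Lustre or lfs, and it assumes the
stripes are spread as evenly as possible, which may not match how the MDS
actually places objects.

```python
def spread_stripes(total_stripes, ost_count):
    """Return {stripes_per_ost: number_of_osts} for an even-as-possible spread.

    Hypothetical helper for illustration only; it models plain integer
    division, not the real Lustre object allocator.
    """
    base, extra = divmod(total_stripes, ost_count)
    layout = {}
    if extra:
        # 'extra' OSTs carry one stripe more than the rest.
        layout[base + 1] = extra
    if base and ost_count - extra:
        # The remaining OSTs carry the base number of stripes.
        layout[base] = ost_count - extra
    return layout


if __name__ == "__main__":
    print(spread_stripes(900, 370))  # uneven: a mix of 3 and 2 stripes per OST
    print(spread_stripes(900, 300))  # even: 3 stripes on each of 300 OSTs
    print(spread_stripes(32, 8))     # the 4x test: 4 stripes on each of 8 OSTs
```

The first call reproduces the uneven 160/210 split, the second the even
layout on a 300-OST subset, and the third the 32-stripes-on-8-OSTs test
described earlier in the thread.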