On 08/12/2015 21:17, Jim Harris wrote:
On Tue, Dec 8, 2015 at 1:48 PM, Steven Hartland
<ste...@multiplay.co.uk> wrote:
Hi Jim, could you let me know the use case for exposing the
controller stripe size as the disk stripe size, as done by this commit?
I ask as it actually causes problems for ZFS which has checks to
ensure zpools perform optimally by correctly configuring ashift to
match the stripesize if reported.
This is usually fine, as stripesize typically reports the physical
block size of the device, whereas sectorsize is the logical block
size. Unfortunately ZFS currently limits ashift to 13 (8KB), so
when nvme reports 128KB it is capped at 8KB, and hence every
subsequent zpool status reports a warning about optimal performance.
Before I look to fix one or the other, I wanted to fully
understand the reasoning behind how nvme behaves here.
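The ashift arithmetic described above can be sketched as follows. This is only an illustration, not the actual ZFS code: the function name `ashift_for`, the constant `ASHIFT_MAX`, and the "capped" flag are hypothetical names introduced here, and treating a stripesize of 0 as "not reported" is an assumption.

```python
ASHIFT_MAX = 13  # 8KB cap mentioned in this thread (hypothetical constant name)


def ashift_for(sectorsize, stripesize):
    """Return (ashift, capped) for a device.

    ashift is log2 of the block size the pool would use. The stripesize
    is preferred when it is reported (non-zero, an assumption here) and
    larger than the logical sector size.
    """
    size = max(sectorsize, stripesize)
    # Smallest exponent e such that 2**e >= size.
    wanted = (size - 1).bit_length()
    ashift = min(wanted, ASHIFT_MAX)
    # capped is True when the device asked for more than the cap allows,
    # which is the situation that triggers the warning discussed above.
    return ashift, wanted > ASHIFT_MAX
```

With a 512-byte sector size and a reported stripesize of 128KB, the wanted ashift would be 17, so the result is clamped to 13 and flagged as capped; with a 4KB stripesize it is 12 and not capped.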
Some Intel NVMe controllers have a slow path for I/Os that span a
128KB stripe boundary. The FreeBSD NVMe driver checks for this
condition, and will split the I/O inside of the NVMe driver in these
cases, to ensure we do not hit this slow path.
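As a rough illustration of the splitting described above, the sketch below cuts a byte range into pieces so that no piece spans a 128KB boundary. It is not the driver's code: the real driver works on its own request structures, and the names `STRIPE_SIZE` and `split_at_stripe` are hypothetical.

```python
STRIPE_SIZE = 128 * 1024  # 128KB stripe size from this thread


def split_at_stripe(offset, length, stripe=STRIPE_SIZE):
    """Split the byte range [offset, offset + length) into
    (offset, length) pieces, none of which crosses a stripe boundary."""
    pieces = []
    end = offset + length
    while offset < end:
        # First stripe boundary strictly after the current offset.
        boundary = (offset // stripe + 1) * stripe
        chunk_end = min(end, boundary)
        pieces.append((offset, chunk_end - offset))
        offset = chunk_end
    return pieces
```

An 8KB I/O starting at 124KB would come back as two 4KB pieces, one on each side of the 128KB boundary, while an I/O that stays inside one stripe is returned unchanged as a single piece.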
The idea behind reporting the stripe size up through GEOM was to
provide a hint to upper layers, especially for file system layout - in
hopes of reducing the number of I/Os that need to be split.
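To show why a layout hint could reduce splitting, here is a small sketch that counts how many I/Os in a sequence cross a stripe boundary. The helper names and the 64KB I/O size are assumptions chosen purely for illustration.

```python
def crosses_stripe(offset, length, stripe=128 * 1024):
    """True if the byte range [offset, offset + length) spans a stripe boundary."""
    if length <= 0:
        return False
    return offset // stripe != (offset + length - 1) // stripe


def count_crossing(start, io_size, count, stripe=128 * 1024):
    """Count how many of `count` back-to-back I/Os of `io_size` bytes,
    beginning at byte offset `start`, would need to be split."""
    return sum(
        crosses_stripe(start + i * io_size, io_size, stripe)
        for i in range(count)
    )
```

Under these assumptions, sixteen back-to-back 64KB I/Os starting at offset 0 never cross a boundary, whereas the same sequence shifted by 4KB crosses one on every second I/O.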
Based on your findings, limiting the stripe size reported up through
GEOM to 4KB would be OK. This may cause a small number of additional
I/Os to require splitting, but the NVMe I/O path is very efficient,
so these additional I/Os would make very little (if any) difference
in performance or CPU utilization.
Thanks for the fast reply, Jim, most appreciated. I've created a review
for the change here: https://reviews.freebsd.org/D4446
If you're happy I'll get that committed.
Regards
Steve
_______________________________________________
svn-src-head@freebsd.org mailing list