On Wed, Oct 10, 2012 at 08:22:05AM -0400, Thor Lancelot Simon wrote:
> On Wed, Oct 10, 2012 at 12:02:21PM +0200, Manuel Bouyer wrote:
> >
> > There's another case, which I think is worse: a raidframe volume, with
> > underlying disks with different maxphys. If it's a raid-1 you can't predict
> > from which disk a read will come from, so you don't know the maxphys.
>
> Sure you do. The maxphys computation for RAIDframe is simple and
> consistent: it is the smallest maxphys of any component disk, multiplied
> by the number of data (not parity) elements in the set.
>
> It works the same for RAID1 as for any other RAID level: a RAID1 set has
> 1 data element (so does a RAID0 set) so it is simply the smallest maxphys
> of any disk in the set.

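To make the rule quoted above concrete, here is a minimal sketch in C. It is not RAIDframe code: the function name raid_set_maxphys() and its parameters are hypothetical, and it only assumes what the quoted text states (smallest component maxphys, multiplied by the number of data elements).

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of RAIDframe: compute a set's maxphys
 * as the smallest component maxphys multiplied by the number of data
 * (not parity) elements.  Returns 0 if there is nothing to compute.
 */
static size_t
raid_set_maxphys(const size_t *component_maxphys, size_t ncomponents,
    size_t ndata)
{
	size_t smallest;
	size_t i;

	if (ncomponents == 0 || ndata == 0)
		return 0;

	/* Find the smallest maxphys among the component disks. */
	smallest = component_maxphys[0];
	for (i = 1; i < ncomponents; i++) {
		if (component_maxphys[i] < smallest)
			smallest = component_maxphys[i];
	}

	/* Scale by the number of data elements in the set. */
	return smallest * ndata;
}
```

With one data element, as in the RAID1 case, a mirror of a 64 KB disk and a 128 KB disk comes out at 64 KB. The result is only valid for the component list passed in, so it would have to be recomputed whenever a component is added or removed.
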
But this assumes you know the disks at config time. The problem is that
you can add or remove drives from a raid volume, which may change the
maxphys.

>
> > I wouldn't expect this case (a volume composed of multiple disks with
> > different maxphys) to be that common, so I'm not sure we should optimise
> > for this. The volume's maxphys would be the lowest of all the devices'
> > maxphys.
> >
> > A tricky case is when the new maxphys would be smaller. You would need
> > to suspend filesystem operations before changing it, but I'm not sure
> > all filesystems support this. Maybe support for splitting the request in
> > VOP_STRATEGY() or another appropriate place would be better?
>
> That is my thinking: a device driver whose maxphys can change should split
> (and potentially even combine, though this is a matter of performance not
> correctness -- xbdback can already do this, though) requests as
> necessary, since there is no real atomicity guarantee for a request
> larger than a single sector anyhow.

This is one solution, but I think it should be centralised. There is no
need to replicate it in every driver that needs it.

>
> I really would like to avoid reaching out through multiple layers of
> indirect reference (or via several stacked up function pointers,
> defeating branch prediction etc.) every time we schedule an I/O or
> even consider which pages to push or pull on a single I/O.

I agree.

-- 
Manuel Bouyer <bou...@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--
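
As an illustration of the centralised splitting discussed above, here is a minimal sketch in C of one shared helper that cuts a request into pieces no larger than the current maxphys. Every name in it (split_request(), issue_fn, struct split_stats, record_chunk()) is hypothetical and none of it is existing NetBSD code. It also assumes sub-requests are issued one after another and that the first error aborts the rest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: issue one sub-request, return 0 on success. */
typedef int (*issue_fn)(uint64_t offset, size_t len, void *cookie);

/*
 * Hypothetical shared helper: split [offset, offset + len) into
 * sub-requests of at most maxphys bytes and issue them in order.
 * Returns 0 on success, -1 if maxphys is 0, or the first error
 * returned by the callback.
 */
static int
split_request(uint64_t offset, size_t len, size_t maxphys,
    issue_fn issue, void *cookie)
{
	int error;

	if (maxphys == 0)
		return -1;

	while (len > 0) {
		/* The last piece may be shorter than maxphys. */
		size_t chunk = len < maxphys ? len : maxphys;

		error = issue(offset, chunk, cookie);
		if (error != 0)
			return error;

		offset += chunk;
		len -= chunk;
	}
	return 0;
}

/* Example callback that only records what it was asked to issue. */
struct split_stats {
	size_t nchunks;	/* number of sub-requests seen */
	size_t total;	/* total bytes across all sub-requests */
	size_t largest;	/* size of the largest sub-request */
};

static int
record_chunk(uint64_t offset, size_t len, void *cookie)
{
	struct split_stats *st = cookie;

	(void)offset;	/* unused in this example */
	st->nchunks++;
	st->total += len;
	if (len > st->largest)
		st->largest = len;
	return 0;
}
```

A driver whose maxphys can shrink would call the one helper with its current limit instead of carrying its own splitting loop. Combining small requests, which the thread describes as a performance matter rather than a correctness one, is left out of this sketch.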