On Mon, 28 Nov 2022 16:56:39 -0400 Jason Gunthorpe <j...@nvidia.com> wrote:
> On Mon, Nov 28, 2022 at 01:36:30PM -0700, Alex Williamson wrote:
> > On Mon, 28 Nov 2022 15:40:23 -0400
> > Jason Gunthorpe <j...@nvidia.com> wrote:
> > 
> > > On Mon, Nov 28, 2022 at 11:50:03AM -0700, Alex Williamson wrote:
> > > 
> > > > There's a claim here about added complexity that I'm not really seeing.
> > > > It looks like we simply make an ioctl call here and scale our buffer
> > > > based on the minimum of the returned device estimate or our upper
> > > > bound.
> > > 
> > > I'm not keen on this, for something like mlx5 that has a small precopy
> > > size and large post-copy size it risks running with an under allocated
> > > buffer, which is harmful to performance.
> > 
> > I'm trying to weed out whether there are device assumptions in the
> > implementation, seems like maybe we found one.
> 
> I don't think there are assumptions. Any correct kernel driver should
> be able to do this transfer out of the FD byte-at-a-time.
> 
> This buffer size is just a random selection for now until we get
> multi-fd and can sit down, benchmark and optimize this properly.

We can certainly still do that, but I'm still failing to see how
buffer_size = min(MIG_DATA_SIZE, 1MB) is such an imposition on the
complexity or over-eager optimization.  Thanks,

Alex
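
For reference, roughly what I have in mind is the sketch below. It assumes
the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE feature from this series (with the
stop_copy_length field) and a hypothetical 1MB cap; names and fallback
behavior are illustrative, not the actual QEMU patch:

/*
 * Hypothetical sketch: query the device's stop-copy size estimate via
 * VFIO_DEVICE_FEATURE_MIG_DATA_SIZE and clamp the migration buffer to a
 * 1MB upper bound, i.e. buffer_size = min(MIG_DATA_SIZE, 1MB).
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

#define VFIO_MIG_BUFFER_MAX (1024 * 1024)   /* assumed 1MB upper bound */

static uint64_t vfio_mig_buffer_size(int device_fd)
{
    /*
     * struct vfio_device_feature ends in a flexible array, so build the
     * GET request in an aligned raw buffer and overlay the structs.
     */
    uint64_t buf[(sizeof(struct vfio_device_feature) +
                  sizeof(struct vfio_device_feature_mig_data_size) + 7) / 8];
    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
    struct vfio_device_feature_mig_data_size *data_size =
        (struct vfio_device_feature_mig_data_size *)feature->data;

    feature->argsz = sizeof(struct vfio_device_feature) +
                     sizeof(struct vfio_device_feature_mig_data_size);
    feature->flags = VFIO_DEVICE_FEATURE_GET |
                     VFIO_DEVICE_FEATURE_MIG_DATA_SIZE;

    /* Kernels without the feature fall back to the fixed buffer size. */
    if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature)) {
        return VFIO_MIG_BUFFER_MAX;
    }

    /* Scale to the device's estimate, capped at the upper bound. */
    return data_size->stop_copy_length < VFIO_MIG_BUFFER_MAX ?
           data_size->stop_copy_length : VFIO_MIG_BUFFER_MAX;
}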