On Mon, 28 Nov 2022 16:56:39 -0400
Jason Gunthorpe <j...@nvidia.com> wrote:

> On Mon, Nov 28, 2022 at 01:36:30PM -0700, Alex Williamson wrote:
> > On Mon, 28 Nov 2022 15:40:23 -0400
> > Jason Gunthorpe <j...@nvidia.com> wrote:
> >   
> > > On Mon, Nov 28, 2022 at 11:50:03AM -0700, Alex Williamson wrote:
> > >   
> > > > There's a claim here about added complexity that I'm not really
> > > > seeing.  It looks like we simply make an ioctl call and size our
> > > > buffer to the minimum of the returned device estimate and our
> > > > upper bound.
> > > 
> > > I'm not keen on this: for something like mlx5, which has a small
> > > pre-copy size and a large post-copy size, it risks running with an
> > > under-allocated buffer, which is harmful to performance.
> > 
> > I'm trying to weed out whether there are device assumptions in the
> > implementation; it seems like maybe we found one.
> 
> I don't think there are assumptions. Any correct kernel driver should
> be able to do this transfer out of the FD byte-at-a-time.
> 
> This buffer size is just a random selection for now until we get
> multi-fd and can sit down, benchmark and optimize this properly.

We can certainly still do that, but I'm failing to see how
buffer_size = min(MIG_DATA_SIZE, 1MB) imposes meaningful complexity or
amounts to over-eager optimization.  Thanks,

Alex
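
For reference, below is a minimal userspace sketch of the approach under
discussion, assuming the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE interface
added by this series is available in <linux/vfio.h>.  The helper names,
the fallback behavior, and the 1MB cap are illustrative only, not the
actual QEMU patch:

#include <errno.h>
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* The 1MB upper bound discussed above (illustrative constant name). */
#define MIG_DATA_BUFFER_CAP (1024 * 1024)

/*
 * buffer_size = min(MIG_DATA_SIZE, 1MB): ask the device for its
 * estimated stop-copy size and clamp it to the fixed cap.  Falls back
 * to the cap on kernels/drivers that do not implement the feature.
 */
static uint64_t vfio_mig_buffer_size(int device_fd)
{
    uint64_t buf[(sizeof(struct vfio_device_feature) +
                  sizeof(struct vfio_device_feature_mig_data_size) +
                  sizeof(uint64_t) - 1) / sizeof(uint64_t)] = {};
    struct vfio_device_feature *feature = (void *)buf;
    struct vfio_device_feature_mig_data_size *mig_size =
        (void *)feature->data;

    feature->argsz = sizeof(buf);
    feature->flags = VFIO_DEVICE_FEATURE_GET |
                     VFIO_DEVICE_FEATURE_MIG_DATA_SIZE;

    if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature)) {
        return MIG_DATA_BUFFER_CAP;
    }

    return mig_size->stop_copy_length < MIG_DATA_BUFFER_CAP ?
           mig_size->stop_copy_length : MIG_DATA_BUFFER_CAP;
}

/*
 * The buffer size is only a throughput knob: a correct driver must let
 * userspace drain the migration data FD at any granularity, so a small
 * buffer is slower but never incorrect.
 */
static ssize_t vfio_mig_drain(int data_fd, int out_fd, void *buf, size_t len)
{
    ssize_t n;

    while ((n = read(data_fd, buf, len)) > 0) {
        if (write(out_fd, buf, n) != n) {
            return -EIO;
        }
    }

    return n; /* 0 on end of stream, -1 with errno set on error */
}

The second helper only illustrates the point made above that the buffer
size is a performance tunable: a conforming driver has to stream its
state through the data FD at whatever read granularity userspace picks.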

