On Tue, 2015-06-02 at 13:04 -0600, Jason Gunthorpe wrote:
> On Tue, Jun 02, 2015 at 02:51:23PM -0400, Doug Ledford wrote:
> > On Tue, 2015-06-02 at 12:08 -0600, Jason Gunthorpe wrote:
> > > On Tue, Jun 02, 2015 at 10:35:24AM -0400, Doug Ledford wrote:
> > > 
> > > > So, just so everyone is clear on this point: the current user space
> > > > implementation of this feature creates an unversioned, newly named
> > > > ibv_wc_ex struct that is ibv_wc with a 64bit timestamp tacked on at the
> > > > end (not 64bit aligned either).  If we ever wanted to have a different
> > > > extension to our ibv_wc struct, there is no good way to do that.
> > > 
> > > No, if they followed (I didn't check yet) the extension scheme then the
> > > poll call is
> > > 
> > >  struct ibv_wc_ex wcs[num_wcs]
> > >  ibv_poll_wc_ex(&wcs,num_wcs,sizeof(wcs[0]));
> > > 
> > > And the drivers decide what to do based on the 3rd argument, which is
> > > essentially the ABI version.
> > 
> > Ick.  OK.  I would *much* prefer something done akin to the routines in
> > packer.c of the kernel, but that's not my call to make, the decision on
> > the ABI/API extension mechanism was made long ago.  It does, however,
> > mean that extensions are serial and not modular, and that's a shame.
> 
> All verbs extensions are essentially serial, each extension requires a
> fixed allocation of structure bytes, made by upstream.
> 
> This is also why no vendor may ship an extension that is not upstream
> and continue to use the same soname as upstream. Similarly for the
> kernel.
> 
> This is fairly performance neutral, while a packer.c scheme would be
> unacceptably expensive, IMHO. poll_wc is one of the most performance
> sensitive routines in the library.

I disagree.  I haven't run them in a tight loop to confirm, but I looked
at the mthca, mlx4, and cxgb4 user libraries, and all of them have
complex *_poll_one routines that convert their internal CQEs to WCs.
Packer-style routines aren't necessarily any more complex or any slower
than that; it all depends on the particular transformation needed.  The
packer routines are just hard to read.

And, as Christoph pointed out, we can keep our wc in a single cache line
right now.  However, it would only take a few extensions to outgrow it.
If some extension comes along that gets allocated past the 64-byte cache
line boundary, and that extension is used far more frequently than, say,
this timestamp, then we will have forced a cache line break on a
frequently used item for the sake of a less frequently used one.  So
there would be benefits to a modular approach: the user could select the
items they want and keep their important items in that single cache
line.
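
A small sketch of the layout problem, under stated assumptions: the
structs below are hypothetical stand-ins (cl_wc is a 48-byte struct of
fixed-width fields, not the real ibv_wc definition), and ext_b and
ext_c are invented future extensions.  With serial allocation the third
8-byte extension starts at offset 64; a layout containing only the
extension the user asked for keeps it inside the first 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CL_LINE_BYTES 64u

/* Hypothetical base completion: 48 bytes of fixed-width fields. */
struct cl_wc {
    uint64_t wr_id;
    uint32_t status;
    uint32_t opcode;
    uint32_t vendor_err;
    uint32_t byte_len;
    uint32_t imm_data;
    uint32_t qp_num;
    uint32_t src_qp;
    uint32_t wc_flags;
    uint16_t pkey_index;
    uint16_t slid;
    uint8_t  sl;
    uint8_t  dlid_path_bits;
    uint8_t  pad[2];
};

/* Serial scheme: every extension is appended in allocation order. */
struct cl_wc_serial {
    struct cl_wc base;   /* bytes  0..47 */
    uint64_t timestamp;  /* bytes 48..55 */
    uint64_t ext_b;      /* bytes 56..63 */
    uint64_t ext_c;      /* bytes 64..71, past the first cache line */
};

/* Modular scheme: the user asked for ext_c only. */
struct cl_wc_pick_c {
    struct cl_wc base;   /* bytes  0..47 */
    uint64_t ext_c;      /* bytes 48..55 */
};

/* True if a field at 'off' of 'size' bytes lies entirely within the
 * first cache line of the structure. */
static int cl_in_first_line(size_t off, size_t size)
{
    return off + size <= CL_LINE_BYTES;
}

/* Sanity checks on the layouts described in the comments above. */
static void cl_check_layout(void)
{
    assert(sizeof(struct cl_wc) == 48);
    assert(offsetof(struct cl_wc_serial, ext_c) == 64);
    assert(offsetof(struct cl_wc_pick_c, ext_c) == 48);
}
```

This assumes the array of completions is itself cache line aligned and
says nothing about how a 56-byte element would pack in an array; it
only illustrates where a late-allocated field ends up in each scheme.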

-- 
Doug Ledford <dledf...@redhat.com>
              GPG KeyID: 0E572FDD
