On Fri, 29 May 2015, Doug Ledford wrote:

> > Well this is a kernel bypass API and a lot of raw hardware issues will
> > have to be handled since you do go directly to the device.
>
> No, that's not entirely true, and it *certainly* is not the correct way
> to think about verbs extensions. Is it kernel bypass? Yes. Does it go
> direct to the hardware? Not as far as the user application is
> concerned. The direct hardware access is abstracted away in the verbs
There is a compromise here: the kernel serves as an administrative
function (setup and configuration of QPs), while the data path runs on
bare metal. The structures modified for send/receive are directly
understood and handled by the hardware. That is the core benefit of the
RDMA API, and it is what delivers the desired performance and latency.

The administrative function / bare metal separation is also reflected in
this patchset. The admin function allows the determination of the cycle
counter frequency and size. The bare metal cycle counter exists in the
fastpath.

> library. Because the verbs library is a hardware abstraction layer, any
> extensions to it need to be well thought out. And by that I mean if it
> is of general use, then it should be added in a general, abstract way
> that any hardware can implement. If it is specific to just one vendor's
> hardware, then it can be added in a means that is specific to that
> vendor's hardware.

What is particular here to the vendor?

> Now, as a general rule, I would call timestamps general. They should be
> added in a fashion that anyone can implement. They should also be well
> defined. Sean's questions raise a very valid point. Exactly what is
> being timestamped, and do we care about different timestamp options? Is
> it completion of message, start of message, transfer from HCA to main
> system memory completion, etc. The 00/10 header to this patch series
> was probably answering Sean's question, but just based on the name of
> the TIMESTAMP flag to the CQ creation attr struct it isn't clear that
> this is the case.

Ok then let's answer that.

> > Right but then we are not at the comfortable sockets API here but at the
> > bare metal level.
>
> That's not entirely true. We still hold to our abstractions, they are
> just intentionally kept very thin and high performing.

Well there is a distinction here. We provide the comfort of setup and
administrative functions through the kernel API.
We still try to isolate the application as much as possible when we go to
the data paths, but we need to hit bare metal in the fastpath in order to
accomplish our mission of maximum performance and minimum latency. This
cannot be accomplished with kernel calls, and therefore it is an example
of kernel bypass. We want this to be as comfy as possible, of course, and
well defined.
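To illustrate the admin / fastpath split for the cycle counter, here is a
rough sketch of what an application might do with the two pieces of
information the admin path provides (counter frequency and counter size).
The struct and function names are hypothetical and are not the interface
proposed in the patchset. The sketch assumes the frequency is reported in
kHz, the counter is free running, and at most one wraparound occurs
between two readings.

```c
#include <stdint.h>

/* Hypothetical: values obtained once through the administrative (kernel)
 * path at setup time. Not the patchset's actual structure. */
struct clock_info {
    uint64_t freq_khz;    /* cycle counter frequency in kHz (assumed unit) */
    unsigned width_bits;  /* number of valid counter bits, 1..64 */
};

/* Mask covering the valid bits of the raw counter. */
static uint64_t counter_mask(const struct clock_info *ci)
{
    return ci->width_bits >= 64 ? UINT64_MAX
                                : ((UINT64_C(1) << ci->width_bits) - 1);
}

/* Cycles elapsed between two raw fastpath readings. Unsigned subtraction
 * followed by masking tolerates a single counter wraparound. */
static uint64_t cycles_elapsed(const struct clock_info *ci,
                               uint64_t start, uint64_t end)
{
    return (end - start) & counter_mask(ci);
}

/* Convert cycles to nanoseconds: ns = cycles * 10^6 / freq_khz.
 * Quotient and remainder are scaled separately so the multiplication
 * does not overflow for large cycle counts. */
static uint64_t cycles_to_ns(const struct clock_info *ci, uint64_t cycles)
{
    uint64_t q = cycles / ci->freq_khz;
    uint64_t r = cycles % ci->freq_khz;
    return q * UINT64_C(1000000) + (r * UINT64_C(1000000)) / ci->freq_khz;
}
```

The conversion needs no kernel call: the frequency and width are fetched
once at setup, and the fastpath only does a subtract, a mask, and some
integer arithmetic on the raw values.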