On Tue, 29 Mar 2016 10:31:19 +0100 Bruce Richardson <bruce.richardson at intel.com> wrote:
> On Mon, Mar 28, 2016 at 06:45:26PM -0700, Mohammad El-Shabani wrote:
> > Hi,
> > Looking into why it hurts performance, I see that ixgbe_dev_rx_queue_count
> > is implemented as a scan of elements of rx descriptors, which is very
> > expensive. I am wondering why it's implemented the way it is. Could it not
> > just read the head location from the driver?
> >
> > Thanks!
> > Mohammad El-Shabani
>
> It's likely that reading the head location from the driver will be even slower
> than scanning the descriptor rings in memory. Access to PCI is very much slower
> than accessing memory - especially since on platforms with DDIO, many memory
> accesses will actually be cache reads.
>
> That being said, I haven't actually written a test to prove this out, so feel
> free to try out the head pointer read method instead and see if it improves
> things. The results may vary depending on how far ahead needs to be scanned,
> but certainly for the empty ring case, the descriptor scan method will be far
> faster than a head read.
>
> Regards,
> /Bruce

Also, the most common use case is "are there any more packets ready before I go
to sleep on epoll", and the descriptor done API tells more than is needed.
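
To make the trade-off concrete, here is a rough sketch of the two kinds of
query over a simulated ring held in ordinary memory. It is not the ixgbe
driver code: the struct layout, the SKETCH_DD_BIT flag, and the sketch_*
function names are all made up for illustration. The only assumption is that
each descriptor carries a "done" bit that is set once a packet has been
written into it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "descriptor done" flag; not the real ixgbe bit layout. */
#define SKETCH_DD_BIT 0x1u

struct sketch_desc {
    uint32_t status;            /* written by the (simulated) NIC */
};

struct sketch_ring {
    struct sketch_desc *desc;   /* descriptor array in host memory */
    uint16_t size;              /* number of descriptors in the ring */
    uint16_t next;              /* next descriptor software will read */
};

/*
 * Count completed descriptors by walking forward from 'next' until one
 * without the done bit is found. Cost grows with the number of ready
 * packets, but an empty ring costs a single memory read.
 */
static uint16_t sketch_rx_count(const struct sketch_ring *r)
{
    assert(r != NULL && r->desc != NULL && r->size > 0);

    uint16_t n = 0;
    uint16_t idx = r->next;

    while (n < r->size && (r->desc[idx].status & SKETCH_DD_BIT)) {
        n++;
        idx = (uint16_t)((idx + 1) % r->size);
    }
    return n;
}

/*
 * Answer only "is at least one packet ready?" by looking at a single
 * descriptor. This is all a caller needs before deciding to sleep.
 */
static bool sketch_rx_pending(const struct sketch_ring *r)
{
    assert(r != NULL && r->desc != NULL && r->size > 0);

    return (r->desc[r->next].status & SKETCH_DD_BIT) != 0;
}
```

A caller that only wants to know whether it can go to sleep would use
sketch_rx_pending() and never pay for the full walk. A head pointer read is
not modelled here, since its cost comes from the PCI access rather than from
anything visible in the code.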

