Hi Honnappa,

Inline comments...

> -----Original Message-----
> From: Honnappa Nagarahalli <[email protected]>
> Sent: Saturday, September 19, 2020 12:49 AM
> To: Phil Yang <[email protected]>; Jakub Grajciar -X (jgrajcia - PANTHEON
> TECH SRO at Cisco) <[email protected]>; [email protected]
> Cc: Ruifeng Wang <[email protected]>; nd <[email protected]>; Honnappa
> Nagarahalli <[email protected]>; nd <[email protected]>
> Subject: RE: [PATCH] net/memif: relax barrier for zero copy path
>
> Hi Jakub,
> I am trying to review this patch. I am having difficulty in
> understanding
> the implementation for the queue/ring, appreciate if you could help me
> understand the logic.

'ring' refers to a ring buffer holding packet descriptors. These
descriptors hold metadata about the packet (packet buffer address,
length, etc.).
'queues' are a representation of rings and buffers (+ some metadata).
In more detail, one ring (S2M) and the packet buffers allocated for this
ring would be represented as a 'tx queue' for the slave and an 'rx queue'
for the master.

>
> 1) The S2M queues - are used to send packets from slave to master. My
> understanding is that, the slave thread would call 'eth_memif_tx_zc' and the
> master thread would call 'eth_memif_rx_zc'. Is this correct?
> 2) The M2S queues - are used to send packets from master to slave. Here the
> slave thread would call 'eth_memif_rx_zc' and the master thread would call
> 'eth_memif_tx_zc'. Is this correct?

This is indeed correct.

>
> Thank you,
> Honnappa
>
> > -----Original Message-----
> > From: Phil Yang <[email protected]>
> > Sent: Friday, September 11, 2020 12:38 AM
> > To: [email protected]; [email protected]
> > Cc: Honnappa Nagarahalli <[email protected]>; Ruifeng Wang
> > <[email protected]>; nd <[email protected]>
> > Subject: [PATCH] net/memif: relax barrier for zero copy path
> >
> > Using 'rte_mb' to synchronize the shared ring head/tail between
> > producer and consumer will stall the pipeline and damage performance
> > on the weak memory model platforms, such like aarch64.
> >
> > Relax the expensive barrier with c11 atomic with explicit memory
> > ordering can improve 3.6% performance on throughput.

My question here is: `rte_mb` is supposed to make sure that the head/tail
pointers are not updated before the packets are written into shared
memory. Do the C11 atomics ensure that the packets are written into
shared memory before the head/tail pointers are updated?
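
For reference, the ordering in question can be sketched with plain C11
atomics. This is only a minimal illustration, not the memif code or the
patch itself: the names `demo_ring`, `demo_desc`, `demo_tx`, `demo_rx` and
`RING_SIZE` are hypothetical, and it assumes one producer and one consumer
per ring. A release store to the head index guarantees that the descriptor
writes before it are visible to a consumer that reads that head value with
an acquire load, which is the property the full barrier was used for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring size for this sketch; must be a power of two. */
#define RING_SIZE 8u
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two");

/* Hypothetical descriptor and ring layout, not memif's real structures. */
struct demo_desc {
    uint32_t buf_id;
    uint32_t length;
};

struct demo_ring {
    _Atomic uint16_t head;            /* written by the producer only */
    _Atomic uint16_t tail;            /* written by the consumer only */
    struct demo_desc desc[RING_SIZE];
};

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int demo_tx(struct demo_ring *r, uint32_t buf_id, uint32_t length)
{
    uint16_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire pairs with the consumer's release store to tail, so the
     * consumer has finished reading a slot before it is reused. */
    uint16_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if ((uint16_t)(head - tail) == RING_SIZE)
        return 0;

    /* Plain writes to the descriptor in shared memory. */
    struct demo_desc *d = &r->desc[head & (RING_SIZE - 1u)];
    d->buf_id = buf_id;
    d->length = length;

    /* Release: the descriptor writes above cannot be reordered after
     * this store, so they are visible once the new head is observed. */
    atomic_store_explicit(&r->head, (uint16_t)(head + 1u),
                          memory_order_release);
    return 1;
}

/* Consumer side: returns 1 and fills *out, or 0 if the ring is empty. */
static int demo_rx(struct demo_ring *r, struct demo_desc *out)
{
    uint16_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire pairs with the producer's release store to head. */
    uint16_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return 0;

    *out = r->desc[tail & (RING_SIZE - 1u)];

    /* Release: the read above completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, (uint16_t)(tail + 1u),
                          memory_order_release);
    return 1;
}
```

In this sketch the ordering only holds if both sides use the pairing: a
release store on the index by the writer and an acquire load of the same
index by the reader. A relaxed load on the reader side would not give the
guarantee.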

