On Thu, Nov 05, 2020 at 02:04:39AM +0100, Andrew Lunn wrote: > > +static int dpaa2_switch_build_single_fd(struct ethsw_core *ethsw, > > + struct sk_buff *skb, > > + struct dpaa2_fd *fd) > > +{ > > + struct device *dev = ethsw->dev; > > + struct sk_buff **skbh; > > + dma_addr_t addr; > > + u8 *buff_start; > > + void *hwa; > > + > > + buff_start = PTR_ALIGN(skb->data - DPAA2_SWITCH_TX_DATA_OFFSET - > > + DPAA2_SWITCH_TX_BUF_ALIGN, > > + DPAA2_SWITCH_TX_BUF_ALIGN); > > + > > + /* Clear FAS to have consistent values for TX confirmation. It is > > + * located in the first 8 bytes of the buffer's hardware annotation > > + * area > > + */ > > + hwa = buff_start + DPAA2_SWITCH_SWA_SIZE; > > + memset(hwa, 0, 8); > > + > > + /* Store a backpointer to the skb at the beginning of the buffer > > + * (in the private data area) such that we can release it > > + * on Tx confirm > > + */ > > + skbh = (struct sk_buff **)buff_start; > > + *skbh = skb; > > Where is the TX confirm which uses this stored pointer. I don't see it > in this file. >
The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9. > It can be expensive to store pointer like this in buffers used for > DMA. Yes, it is. But the hardware does not give us any other indication that a packet was actually sent so that we can move ahead with consuming the initial skb. > It has to be flushed out of the cache here as part of the > send. Then the TX complete needs to invalidate and then read it back > into the cache. Or you use coherent memory which is just slow. > > It can be cheaper to keep a parallel ring in cacheable memory which > never gets flushed. I'm afraid I don't really understand your suggestion. In this parallel ring I would keep the skb pointers of all frames which are in-flight? Then, when a packet is received on the Tx confirmation queue I would have to loop over the parallel ring and determine somehow which skb was this packet initially associated to. Isn't this even more expensive? Ioana