On Thu, May 12, 2016 at 10:07:09AM +0000, Ananyev, Konstantin wrote: > Hi Jerrin, > > > > > Hi All, > > > > I would like align mbuff rearm_data field to 8 byte aligned so that > > write to mbuf->rearm_data with uint64_t* will be naturally aligned. > > I am not sure about IA but some other architecture/implementation has > > overhead > > in non-naturally aligned stores. > > > > Proposed patch is something like this below, But open for any change to > > make fit for all other architectures/platform. > > > > Any thoughts ? > > > > ? [master] [dpdk-master] $ git diff > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > > index 529debb..5a917d0 100644 > > --- a/lib/librte_mbuf/rte_mbuf.h > > +++ b/lib/librte_mbuf/rte_mbuf.h > > @@ -733,10 +733,8 @@ struct rte_mbuf { > > void *buf_addr; /**< Virtual address of segment > > buffer. */ > > phys_addr_t buf_physaddr; /**< Physical address of segment > > buffer. */ > > > > - uint16_t buf_len; /**< Length of segment buffer. */ > > - > > > There is no need to move buf_len itself, I think. > Just move rearm_data marker prior to buf_len is enough. > Though how do you suggest to deal with the fact, that right now we blindly > update the whole 64bits pointed by rearm_data: > > drivers/net/ixgbe/ixgbe_rxtx_vec.c: > /* > * Flush mbuf with pkt template. > * Data to be rearmed is 6 bytes long. > * Though, RX will overwrite ol_flags that are coming next > * anyway. So overwrite whole 8 bytes with one load: > * 6 bytes of rearm_data plus first 2 bytes of ol_flags. > */ > p0 = (uintptr_t)&mb0->rearm_data; > *(uint64_t *)p0 = rxq->mbuf_initializer; > > ? > > If buf_len will be inside these 64bits, we can't do it anymore. > > Are you suggesting something like: > > uint64_t *p0, v0; > > p0 = &mb0->rearm_data; > v0 = *p0 & REARM_MASK; > *p0 = v0 | rxq->mbuf_initializer; > ?
Due to unaligned rearm_data issue, In ThunderX platform, we need to write multiple half word of aligned stores(so masking was better us). But I think, if we can put 16bit hole between port and ol_flags then we may not need the masking stuff in ixgbe. Right? OR Even better, if we can fill in a uint16_t variable which will replaced later in the flow like "data_len"? and move buf_len at end the first cache line? or any other thoughts to fix unaligned rearm_data issue? Jerin > > If so I wonder what would be the performance impact of that change. > Konstantin > > > > /* next 6 bytes are initialised on RX descriptor rearm */ > > - MARKER8 rearm_data; > > + MARKER64 rearm_data; > > uint16_t data_off; > > > > /** > > @@ -754,6 +752,7 @@ struct rte_mbuf { > > uint8_t nb_segs; /**< Number of segments. */ > > uint8_t port; /**< Input port. */ > > > > + uint16_t buf_len; /**< Length of segment buffer. */ > > uint64_t ol_flags; /**< Offload features. */ > > > > /* remaining bytes are set on RX when pulling packet from > > * descriptor > > > > /Jerin