Hi Olivier,

Apart from the performance impact, one more concern: as far as I know, the theoretical limit for the physical address (PA) on Intel is 52 bits. I understand that no one uses more than 48 bits these days, and that will probably stay true for the next few years. Still, if we occupy these (MAXPHYADDR - 48) bits now, it could become a problem in the future.

After all, the savings from that change are not that big - only 2 bytes. As I understand it, you already save an extra 7 bytes with the other proposed modifications of the mbuf, which is enough to add the TSO-related information into the mbuf.

So my suggestion would be to keep phys_addr 64 bits long.

Thanks
Konstantin
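[For readers following the thread, a minimal sketch of the kind of layout being discussed; the struct and field names here are illustrative, not taken from the actual patch. It shows how packing the length into the bits above 48 leaves no headroom if physical addresses ever grow past 48 bits.]

    #include <stdint.h>

    /* Current layout: a full 64-bit physical address plus a separate length,
     * so any MAXPHYADDR up to 52 bits fits without change. */
    struct mbuf_addr_today {
        uint64_t buf_physaddr;
        uint16_t buf_len;
    };

    /* Proposed packed layout (64-bit bitfields as accepted by GCC/clang):
     * the length occupies the bits above 48, so a physical address wider
     * than 48 bits would no longer fit. */
    struct mbuf_addr_packed {
        uint64_t buf_physaddr:48;
        uint64_t buf_len:16;
    };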
-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Shaw, Jeffrey B
Sent: Friday, May 09, 2014 5:12 PM
To: Olivier MATZ; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 05/11] mbuf: merge physaddr and buf_len in a bitfield

I agree, we should wait for comments, then test the performance once the patches have settled.

-----Original Message-----
From: Olivier MATZ [mailto:olivier.m...@6wind.com]
Sent: Friday, May 09, 2014 9:06 AM
To: Shaw, Jeffrey B; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 05/11] mbuf: merge physaddr and buf_len in a bitfield

Hi Jeff,

Thank you for your comment.

On 05/09/2014 05:39 PM, Shaw, Jeffrey B wrote:
> have you tested this patch to see if there is a negative impact to
> performance?

Yes, but not with testpmd. I ran our internal non-regression performance tests and they show no difference (or one below the error margin), even with low-overhead processing like plain forwarding, whatever the number of cores I use.

> Wouldn't the processor have to mask the high bytes of the physical
> address when it is used, for example, to populate descriptors with
> buffer addresses? When compute bound, this could steal CPU cycles
> away from packet processing. I think we should understand the
> performance trade-off in order to save these 2 bytes.

I would naively say that the cost is negligible: accessing the length is the same as before (it is a 16-bit field), and accessing the physical address is just a mask or a shift, which should not take long on an Intel processor (1 cycle?). This is to be compared with the number of cycles per packet in io-fwd mode, which is probably around 150 or 200.

> It would be interesting to see how throughput is impacted when the
> workload is core-bound. This could be accomplished by running testpmd
> in io-fwd mode across 4x 10G ports.

I agree, this is something we could check. If you agree, let's first wait for some other comments and see if we find a consensus on the patches.

Regards,
Olivier
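[A rough sketch of the access cost being discussed above; the names are made up for the example and are not from the patch. Reading the packed address bitfield makes the compiler emit roughly one mask or shift before the value can be written into a descriptor, which is the per-packet overhead Olivier estimates at about one cycle.]

    #include <stdint.h>

    /* Illustrative packed layout, mirroring the sketch earlier in the thread. */
    struct mbuf_addr_packed {
        uint64_t buf_physaddr:48;
        uint64_t buf_len:16;
    };

    /* Filling a hypothetical descriptor address field: reading the 48-bit
     * bitfield implicitly masks off the upper 16 bits (addr & 0xFFFFFFFFFFFF),
     * typically one AND or shift instruction. */
    static inline uint64_t desc_addr(const struct mbuf_addr_packed *m)
    {
        return (uint64_t)m->buf_physaddr;
    }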