Hi Oliver,

Apart from performance impact, one more concern:
As far as I know, the theoretical limit for physical addresses on Intel is 52 bits.
I understand that these days no one uses more than 48 bits, and it will probably
stay that way for the next few years.
However, if we occupy these (MAXPHYADDR - 48) bits now, it could become a
problem in the future.
After all, the savings from that change are not that big: only 2 bytes.
As I understand it, you already save an extra 7 bytes with the other proposed
modifications of the mbuf.
That is enough to add the TSO-related information to the mbuf.
So my suggestion would be to keep phys_addr 64 bits long.
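For illustration only, here is a minimal sketch of the two layouts under
discussion; the struct and field names are mine, not taken from the RFC patch:

    #include <stdint.h>

    /* Sketch only: names and exact widths are assumptions, not the patch. */

    /* Merged layout proposed by the patch: the physical address is limited
     * to 48 bits so that buf_len can share the same 64-bit word. This is
     * the part that collides with a future MAXPHYADDR above 48 bits
     * (the architectural limit being 52). */
    struct mbuf_fields_merged {
            uint64_t buf_physaddr:48;
            uint64_t buf_len:16;
    };

    /* Alternative suggested above: keep the full 64-bit physical address,
     * which covers any MAXPHYADDR value, at the cost of 2 extra bytes. */
    struct mbuf_fields_separate {
            uint64_t buf_physaddr;
            uint16_t buf_len;
    };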
Thanks
Konstantin  

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Shaw, Jeffrey B
Sent: Friday, May 09, 2014 5:12 PM
To: Olivier MATZ; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 05/11] mbuf: merge physaddr and buf_len in a 
bitfield

I agree, we should wait for comments and then test the performance once the
patches have settled.


-----Original Message-----
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Friday, May 09, 2014 9:06 AM
To: Shaw, Jeffrey B; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 05/11] mbuf: merge physaddr and buf_len in a 
bitfield

Hi Jeff,

Thank you for your comment.

On 05/09/2014 05:39 PM, Shaw, Jeffrey B wrote:
> have you tested this patch to see if there is a negative impact to 
> performance?

Yes, but not with testpmd. I ran it through our internal non-regression
performance tests, and they show no difference (or a difference below the error
margin), even with low-overhead processing such as plain forwarding, whatever
the number of cores I use.

> Wouldn't the processor have to mask the high bytes of the physical 
> address when it is used, for example, to populate descriptors with 
> buffer addresses?  When compute bound, this could steal CPU cycles 
> away from packet processing.  I think we should understand the 
> performance trade-off in order to save these 2 bytes.

I would naively say that the cost is negligible: accessing the length is the
same as before (it's a 16-bit field), and accessing the physical address is just
a mask or a shift, which should not take long on an Intel processor (1 cycle?).
This is to be compared with the number of cycles per packet in io-fwd mode,
which is probably around 150 or 200.
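As a rough illustration of that cost (the accessor names and the exact bit
split are my assumptions, not the code from the patch), reading either field
from the merged 64-bit word is a single mask or shift:

    #include <stdint.h>

    /* Sketch only: assumes the low 48 bits hold the physical address and
     * the top 16 bits hold the buffer length. */
    #define MBUF_PHYSADDR_MASK ((UINT64_C(1) << 48) - 1)

    static inline uint64_t
    mbuf_buf_physaddr(uint64_t merged)
    {
            /* one AND instruction to recover the 48-bit physical address */
            return merged & MBUF_PHYSADDR_MASK;
    }

    static inline uint16_t
    mbuf_buf_len(uint64_t merged)
    {
            /* one shift to recover the 16-bit length */
            return (uint16_t)(merged >> 48);
    }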

> It would be interesting to see how throughput is impacted when the 
> workload is core-bound.  This could be accomplished by running testpmd 
> in io-fwd mode across 4x 10G ports.

I agree, this is something we could check. If you agree, let's first wait for 
some other comments and see if we find a consensus on the patches.

Regards,
Olivier
