[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-10 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, May 03, 2016 11:20 AM
> To: Ivan Boule
> Cc: Jerin Jacob; dev at dpdk.org; De Lara Guarch, Pablo
> Subject: Re: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in 
> macswap loop
> 
> On Tue, May 03, 2016 at 11:50:31AM +0200, Ivan Boule wrote:
> > On 05/03/2016 11:45 AM, Bruce Richardson wrote:
> > >On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
> > >>prefetch the next packet data address in advance in macswap loop
> > >>for performance improvement.
> > >>
> > >>...
> > >>  for (i = 0; i < nb_rx; i++) {
> > >>+ if (likely(i < nb_rx - 1))
> > >>+ rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> > >>+void *));
> > >
> > >At least on IA platforms, there is no issue with prefetching beyond the 
> > >end of
> > >the array, since it's only a hint to the cpu. If this is true for other 
> > >platforms,
> > >then I suggest we just drop the conditional and just always prefetch.
> >
> > This is an interesting point.
> > Bruce, are you suggesting that prefetching at an invalid [virtual] address
> > won't trigger a CPU exception?
> >
> 
> Yep. For example, adding "rte_prefetch0(NULL)" at the start of main in testpmd
> causes no ill effects when running the app.
> 

One correction - while on IA prefetch(inval_addr) wouldn't cause any functional 
problems,
it still might cause DTLB miss and can be a source of noticeable performance 
degradation.
So it is better to avoid such constructions for performance critical code.
Konstantin





[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Jerin Jacob
On Mon, May 02, 2016 at 05:48:02PM +, De Lara Guarch, Pablo wrote:
> Hi Jerin,
> 
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > Sent: Monday, May 02, 2016 1:00 PM
> > To: dev at dpdk.org
> > Cc: De Lara Guarch, Pablo; Jerin Jacob
> > Subject: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in
> > macswap loop
> > 
> > prefetch the next packet data address in advance in macswap loop
> > for performance improvement.
> > 
> > Signed-off-by: Jerin Jacob 
> > ---
> >  app/test-pmd/macswap.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > index 154889d..c10f4b5 100644
> > --- a/app/test-pmd/macswap.c
> > +++ b/app/test-pmd/macswap.c
> > @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
> > ol_flags |= PKT_TX_QINQ_PKT;
> > for (i = 0; i < nb_rx; i++) {
> > +   if (likely(i < nb_rx - 1))
> > +   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> > +  void *));
> > mb = pkts_burst[i];
> > eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
> > 
> > --
> > 2.1.0
> 
> This looks good. Could you also add it in the other forwarding modes (the 
> ones that make changes in the packets)?

OK Pablo.

I will add the similar logic in the following forwarding modes in
testpmd.

macswap
macfwd
macfwd-retry
csumonly
icmpecho

/Jerin

> 
> Thanks,
> Pablo


[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Ivan Boule
On 05/03/2016 11:45 AM, Bruce Richardson wrote:
> On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
>> prefetch the next packet data address in advance in macswap loop
>> for performance improvement.
>>
>> ...
>>  for (i = 0; i < nb_rx; i++) {
>> +if (likely(i < nb_rx - 1))
>> +rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
>> +   void *));
>
> At least on IA platforms, there is no issue with prefetching beyond the end of
> the array, since it's only a hint to the cpu. If this is true for other 
> platforms,
> then I suggest we just drop the conditional and just always prefetch.

This is an interesting point.
Bruce, are you suggesting that prefetching at an invalid [virtual] 
address won't trigger a CPU exception?

Ivan

>
> /Bruce
>
>>  mb = pkts_burst[i];
>>  eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);



[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Bruce Richardson
On Tue, May 03, 2016 at 11:50:31AM +0200, Ivan Boule wrote:
> On 05/03/2016 11:45 AM, Bruce Richardson wrote:
> >On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
> >>prefetch the next packet data address in advance in macswap loop
> >>for performance improvement.
> >>
> >>...
> >>for (i = 0; i < nb_rx; i++) {
> >>+   if (likely(i < nb_rx - 1))
> >>+   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> >>+  void *));
> >
> >At least on IA platforms, there is no issue with prefetching beyond the end 
> >of
> >the array, since it's only a hint to the cpu. If this is true for other 
> >platforms,
> >then I suggest we just drop the conditional and just always prefetch.
> 
> This is an interesting point.
> Bruce, are you suggesting that prefetching at an invalid [virtual] address
> won't trigger a CPU exception?
> 

Yep. For example, adding "rte_prefetch0(NULL)" at the start of main in testpmd
causes no ill effects when running the app.

/Bruce




[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Bruce Richardson
On Tue, May 03, 2016 at 10:48:34AM +0100, De Lara Guarch, Pablo wrote:
> 
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Tuesday, May 03, 2016 10:45 AM
> > To: Jerin Jacob
> > Cc: dev at dpdk.org; De Lara Guarch, Pablo
> > Subject: Re: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in
> > macswap loop
> > 
> > On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
> > > prefetch the next packet data address in advance in macswap loop
> > > for performance improvement.
> > >
> > > Signed-off-by: Jerin Jacob 
> > > ---
> > >  app/test-pmd/macswap.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > > index 154889d..c10f4b5 100644
> > > --- a/app/test-pmd/macswap.c
> > > +++ b/app/test-pmd/macswap.c
> > > @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > >   if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
> > >   ol_flags |= PKT_TX_QINQ_PKT;
> > >   for (i = 0; i < nb_rx; i++) {
> > > + if (likely(i < nb_rx - 1))
> > > + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> > > +void *));
> > 
> > At least on IA platforms, there is no issue with prefetching beyond the end 
> > of
> > the array, since it's only a hint to the cpu. If this is true for other 
> > platforms,
> > then I suggest we just drop the conditional and just always prefetch.
> 
> That's what I thought when I saw this patch, but the problem is that the 
> prefetch is not for pkts_burst,
> but for rte_pktmbuf_mtod(pkts_burst...), so there must be a limit in nb_rx - 
> 2, or there will be a seg fault.
> 
> Pablo
Good point, Pablo, I missed that subtlety here.

/Bruce
> > 
> > /Bruce
> > 
> > >   mb = pkts_burst[i];
> > >   eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
> > >
> > > --
> > > 2.1.0
> > >


[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Bruce Richardson
On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
> prefetch the next packet data address in advance in macswap loop
> for performance improvement.
> 
> Signed-off-by: Jerin Jacob 
> ---
>  app/test-pmd/macswap.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> index 154889d..c10f4b5 100644
> --- a/app/test-pmd/macswap.c
> +++ b/app/test-pmd/macswap.c
> @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
>   if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
>   ol_flags |= PKT_TX_QINQ_PKT;
>   for (i = 0; i < nb_rx; i++) {
> + if (likely(i < nb_rx - 1))
> + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> +void *));

At least on IA platforms, there is no issue with prefetching beyond the end of
the array, since it's only a hint to the cpu. If this is true for other 
platforms,
then I suggest we just drop the conditional and just always prefetch.

/Bruce

>   mb = pkts_burst[i];
>   eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
>  
> -- 
> 2.1.0
> 


[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Richardson, Bruce
> Sent: Tuesday, May 03, 2016 10:45 AM
> To: Jerin Jacob
> Cc: dev at dpdk.org; De Lara Guarch, Pablo
> Subject: Re: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in
> macswap loop
> 
> On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
> > prefetch the next packet data address in advance in macswap loop
> > for performance improvement.
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> >  app/test-pmd/macswap.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > index 154889d..c10f4b5 100644
> > --- a/app/test-pmd/macswap.c
> > +++ b/app/test-pmd/macswap.c
> > @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
> > ol_flags |= PKT_TX_QINQ_PKT;
> > for (i = 0; i < nb_rx; i++) {
> > +   if (likely(i < nb_rx - 1))
> > +   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> > +  void *));
> 
> At least on IA platforms, there is no issue with prefetching beyond the end of
> the array, since it's only a hint to the cpu. If this is true for other 
> platforms,
> then I suggest we just drop the conditional and just always prefetch.

That's what I thought when I saw this patch, but the problem is that the 
prefetch is not for pkts_burst,
but for rte_pktmbuf_mtod(pkts_burst...), so there must be a limit in nb_rx - 2, 
or there will be a seg fault.

Pablo
> 
> /Bruce
> 
> > mb = pkts_burst[i];
> > eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
> >
> > --
> > 2.1.0
> >


[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-02 Thread De Lara Guarch, Pablo
Hi Jerin,

> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Monday, May 02, 2016 1:00 PM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo; Jerin Jacob
> Subject: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in
> macswap loop
> 
> prefetch the next packet data address in advance in macswap loop
> for performance improvement.
> 
> Signed-off-by: Jerin Jacob 
> ---
>  app/test-pmd/macswap.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> index 154889d..c10f4b5 100644
> --- a/app/test-pmd/macswap.c
> +++ b/app/test-pmd/macswap.c
> @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
>   if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
>   ol_flags |= PKT_TX_QINQ_PKT;
>   for (i = 0; i < nb_rx; i++) {
> + if (likely(i < nb_rx - 1))
> + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> +void *));
>   mb = pkts_burst[i];
>   eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
> 
> --
> 2.1.0

This looks good. Could you also add it in the other forwarding modes (the ones 
that make changes in the packets)?

Thanks,
Pablo


[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-02 Thread Jerin Jacob
prefetch the next packet data address in advance in macswap loop
for performance improvement.

Signed-off-by: Jerin Jacob 
---
 app/test-pmd/macswap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 154889d..c10f4b5 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
ol_flags |= PKT_TX_QINQ_PKT;
for (i = 0; i < nb_rx; i++) {
+   if (likely(i < nb_rx - 1))
+   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
+  void *));
mb = pkts_burst[i];
eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);

-- 
2.1.0