Re: svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-06 Thread Qing Li
On Sun, Jun 5, 2016 at 11:06 PM, Alexander V. Chernikov <
melif...@freebsd.org> wrote:

> 06.06.2016, 04:40, "George Neville-Neil" :
> > On 4 Jun 2016, at 15:05, Alexander V. Chernikov wrote:
> >
> >>  02.06.2016, 20:51, "George V. Neville-Neil" :
> >>>  Author: gnn
> >>>  Date: Thu Jun 2 17:51:29 2016
> >>>  New Revision: 301217
> >>>  URL: https://svnweb.freebsd.org/changeset/base/301217
> >>>
> >>>  Log:
> >>>This change re-adds L2 caching for TCP and UDP, as originally
> >>>  added in D4306
> >>>but removed due to other changes in the system. Restore the
> >>>  llentry pointer
> >>>to the "struct route", and use it to cache the L2 lookup (ARP or
> >>>  ND6) as
> >>>appropriate.
> >>
> >>  I have several comments regarding this commit.
> >>
> >>  1 Architecturally, there was quite a lot of efforts to eliminate
> >>  layering violation between lltable and other places in network stack.
> >>  It ended by committing D4102, which allowed both to cleanup lower
> >>  level, provide abstract “prepend” framework which could be used to
> >>  provide cached data to _otuput() functions.
> >>  This change brings these violations back in a really invasive way.
> >>
> >>  Additionally, implementing L2 PCB caching at the other subsystem
> >>  expense is really a bad idea.
> >>  If one wants caching in some subsystem, it should be implemented in
> >>  that subsystem not polluting other things.
> >>  Current implementation permits this by filling in “ro_prepend” /
> >>  ro_plen fields.
> >>
> >>  In general, this change looks more like a local hack and not like the
> >>  code that should be included in the tree.
> >>
> >>  2 There was no benchmarks proving the effectiveness of this change.
> >>  (For example, it is not obvious if it could significantly improve TCP
> >>  since we still have per-session TCP wlock + (typically) per-ring
> >>  mutex, so removing lltable rock might not help things here). Given
> >>  that the patch complicates existing code, there should be adequate
> >>  benefits to consider whether this change is worth implementing.
> >>
> >>  3 The “network” group was not included to the review despite the
> >>  fact that most of the changes were related to the L2 layer which is
> >>  not “transport”, so some people might have missed this review.
> >>
> >>  4 This change DOES NOT WORK. really.
> >>  (which raises questions on both review and benchmarking process).
> >>
> >>  The reason is that “plle” argument is filled only in “heavy”
> >>  lltable lookup functions (e.g. when we don’t have neighbour
> >>  adjacency). 99.9% of the time arpresolve/nd6_resolve() returns the
> >>  result w/o calling their heavy versions, and the returned “plle”
> >>  is NULL.
> >>
> >>  This can be easily verified by calling something like
> >>  dtrace -n 'fbt:kernel:ether_output:entry /arg3!=NULL&&((struct route
> >>  *)arg3)->ro_lle != NULL/ { stack(); }'
> >>
> >>  Given that, I kindly ask you to backout this change.
> >
> > Hi,
> >
> > I'm going to limit the scope of this reply to just you, me and Mike
> > Karels, from whom this originated.
> No, please keep the discussion open. The decision on having that
> particular L2 caching implementation (and L2 caching in general) is quite
> important, so it would be great if all technical arguments were saved so
> other people can (now or later) understand the decision details.
>
> Thanks for understanding.
> >
> > Best,
> > George
>
> This commit does seem to undo the non-trivial layer separation efforts
invested previously.
The flow-table construction was meant to help accelerate TCP/UDP route
lookups. The various
aspects of the routing code took flow-table into consideration, and those
code are still present
even after this change.
--Qing
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

RE: svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-06 Thread Qing

-Original Message-
From: owner-src-committ...@freebsd.org 
[mailto:owner-src-committ...@freebsd.org] On Behalf Of Alexander V. Chernikov
Sent: Sunday, June 05, 2016 11:07 PM
To: George Neville-Neil <g...@freebsd.org>
Cc: Mike Karels <m...@karels.net>; src-committers <src-committ...@freebsd.org>; 
svn-src-all <svn-src-...@freebsd.org>; svn-src-head <svn-src-head@freebsd.org>
Subject: Re: svn commit: r301217 - in head/sys: net netinet netinet6

06.06.2016, 04:40, "George Neville-Neil" <g...@freebsd.org>:
> On 4 Jun 2016, at 15:05, Alexander V. Chernikov wrote:
>
>>  02.06.2016, 20:51, "George V. Neville-Neil" <g...@freebsd.org>:
>>>  Author: gnn
>>>  Date: Thu Jun 2 17:51:29 2016
>>>  New Revision: 301217
>>>  URL: https://svnweb.freebsd.org/changeset/base/301217
>>>
>>>  Log:
>>>This change re-adds L2 caching for TCP and UDP, as originally
>>>  added in D4306
>>>but removed due to other changes in the system. Restore the
>>>  llentry pointer
>>>to the "struct route", and use it to cache the L2 lookup (ARP or
>>>  ND6) as
>>>appropriate.
>>
>>  I have several comments regarding this commit.
>>
>>  1 Architecturally, there was quite a lot of efforts to eliminate
>>  layering violation between lltable and other places in network stack.
>>  It ended by committing D4102, which allowed both to cleanup lower
>>  level, provide abstract “prepend” framework which could be used to
>>  provide cached data to _otuput() functions.
>>  This change brings these violations back in a really invasive way.
>>
>>  Additionally, implementing L2 PCB caching at the other subsystem
>>  expense is really a bad idea.
>>  If one wants caching in some subsystem, it should be implemented in
>>  that subsystem not polluting other things.
>>  Current implementation permits this by filling in “ro_prepend” /
>>  ro_plen fields.
>>
>>  In general, this change looks more like a local hack and not like 
>> the
>>  code that should be included in the tree.
>>
>>  2 There was no benchmarks proving the effectiveness of this change.
>>  (For example, it is not obvious if it could significantly improve 
>> TCP
>>  since we still have per-session TCP wlock + (typically) per-ring
>>  mutex, so removing lltable rock might not help things here). Given
>>  that the patch complicates existing code, there should be adequate
>>  benefits to consider whether this change is worth implementing.
>>
>>  3 The “network” group was not included to the review despite the
>>  fact that most of the changes were related to the L2 layer which is
>>  not “transport”, so some people might have missed this review.
>>
>>  4 This change DOES NOT WORK. really.
>>  (which raises questions on both review and benchmarking process).
>>
>>  The reason is that “plle” argument is filled only in “heavy”
>>  lltable lookup functions (e.g. when we don’t have neighbour
>>  adjacency). 99.9% of the time arpresolve/nd6_resolve() returns the
>>  result w/o calling their heavy versions, and the returned “plle”
>>  is NULL.
>>
>>  This can be easily verified by calling something like
>>  dtrace -n 'fbt:kernel:ether_output:entry /arg3!=NULL&&((struct route
>>  *)arg3)->ro_lle != NULL/ { stack(); }'
>>
>>  Given that, I kindly ask you to backout this change.
>
> Hi,
>
> I'm going to limit the scope of this reply to just you, me and Mike 
> Karels, from whom this originated.

>>No, please keep the discussion open. The decision on having that particular 
>>L2 caching 
>>implementation (and L2 caching in general) is quite important, so it would be 
>>great if 
>>all technical arguments were saved so other people can 
>>(now or later) understand the decision details.
>>Thanks for understanding.

This commit does seem to undo the non-trivial layer separation efforts invested 
previously. 
The flow-table construction was meant to help accelerate TCP/UDP route lookups. 
The various
aspects of the routing code took flow-table into consideration, and those code 
are still present
even after this change.

--Qing




___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Re: svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-04 Thread Alexander V . Chernikov
02.06.2016, 20:51, "George V. Neville-Neil" :
> Author: gnn
> Date: Thu Jun 2 17:51:29 2016
> New Revision: 301217
> URL: https://svnweb.freebsd.org/changeset/base/301217
>
> Log:
>   This change re-adds L2 caching for TCP and UDP, as originally added in D4306
>   but removed due to other changes in the system. Restore the llentry pointer
>   to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as
>   appropriate.

I have several comments regarding this commit.

1 Architecturally, there was quite a lot of efforts to eliminate layering 
violation between lltable and other places in network stack.
It ended by committing D4102, which allowed both to cleanup lower level, 
provide abstract “prepend” framework which could be used to provide cached data 
to _otuput() functions.
This change brings these violations back in a really invasive way.

Additionally, implementing L2 PCB caching at the other subsystem expense is 
really a bad idea.
If one wants caching in some subsystem, it should be implemented in that 
subsystem not polluting other things.
Current implementation permits this by filling in “ro_prepend” / ro_plen fields.

In general, this change looks more like a local hack and not like the code that 
should be included in the tree.

2 There was no benchmarks proving the effectiveness of this change. (For 
example, it is not obvious if it could significantly improve TCP since we still 
have per-session TCP wlock + (typically) per-ring mutex, so removing lltable 
rock might not help things here). Given that the patch complicates existing 
code, there should be adequate benefits to consider whether this change is 
worth implementing.

3 The “network” group was not included to the review despite the fact that most 
of the changes were related to the L2 layer which is not “transport”, so some 
people might have missed this review.

4 This change DOES NOT WORK. really.
(which raises questions on both review and benchmarking process).

The reason is that “plle” argument is filled only in “heavy” lltable lookup 
functions (e.g. when we don’t have neighbour adjacency). 99.9% of the time 
arpresolve/nd6_resolve() returns the result w/o calling their heavy versions, 
and the returned “plle” is NULL.

This can be easily verified by calling something like
dtrace -n 'fbt:kernel:ether_output:entry /arg3!=NULL&&((struct route 
*)arg3)->ro_lle != NULL/ { stack(); }'

Given that, I kindly ask you to backout this change.

>
>   Submitted by: Mike Karels
>   Differential Revision: https://reviews.freebsd.org/D6262
>
> Modified:
>   head/sys/net/flowtable.c
>   head/sys/net/if_arcsubr.c
>   head/sys/net/if_ethersubr.c
>   head/sys/net/if_fddisubr.c
>   head/sys/net/if_fwsubr.c
>   head/sys/net/if_iso88025subr.c
>   head/sys/net/if_llatbl.h
>   head/sys/net/route.c
>   head/sys/net/route.h
>   head/sys/netinet/if_ether.c
>   head/sys/netinet/if_ether.h
>   head/sys/netinet/in_pcb.c
>   head/sys/netinet/ip_output.c
>   head/sys/netinet/toecore.c
>   head/sys/netinet6/in6.h
>   head/sys/netinet6/in6_pcb.c
>   head/sys/netinet6/ip6_output.c
>   head/sys/netinet6/nd6.c
>   head/sys/netinet6/nd6.h
>
> Modified: head/sys/net/flowtable.c
> ==
> --- head/sys/net/flowtable.c Thu Jun 2 17:31:37 2016 (r301216)
> +++ head/sys/net/flowtable.c Thu Jun 2 17:51:29 2016 (r301217)
> @@ -696,13 +696,8 @@ flowtable_lookup(sa_family_t sa, struct
>  ro->ro_rt = fle->f_rt;
>  ro->ro_flags |= RT_NORTREF;
>  lle = fle->f_lle;
> - if (lle != NULL && (lle->la_flags & LLE_VALID)) {
> - ro->ro_prepend = lle->r_linkdata;
> - ro->ro_plen = lle->r_hdrlen;
> - ro->ro_flags |= RT_MAY_LOOP;
> - if (lle->la_flags & LLE_IFADDR)
> - ro->ro_flags |= RT_L2_ME;
> - }
> + if (lle != NULL && (lle->la_flags & LLE_VALID))
> + ro->ro_lle = lle; /* share ref with fle->f_lle */
>
>  return (0);
>  }
>
> Modified: head/sys/net/if_arcsubr.c
> ==
> --- head/sys/net/if_arcsubr.c Thu Jun 2 17:31:37 2016 (r301216)
> +++ head/sys/net/if_arcsubr.c Thu Jun 2 17:51:29 2016 (r301217)
> @@ -129,7 +129,8 @@ arc_output(struct ifnet *ifp, struct mbu
>  else if (ifp->if_flags & IFF_NOARP)
>  adst = ntohl(SIN(dst)->sin_addr.s_addr) & 0xFF;
>  else {
> - error = arpresolve(ifp, is_gw, m, dst, , NULL);
> + error = arpresolve(ifp, is_gw, m, dst, , NULL,
> + NULL);
>  if (error)
>  return (error == EWOULDBLOCK ? 0 : error);
>  }
> @@ -170,7 +171,8 @@ arc_output(struct ifnet *ifp, struct mbu
>  if ((m->m_flags & M_MCAST) != 0)
>  adst = arcbroadcastaddr; /* ARCnet broadcast address 
> */
>  else {
> - error = nd6_resolve(ifp, is_gw, m, dst, , NULL);
> + error = nd6_resolve(ifp, is_gw, m, dst, , 

Re: svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-02 Thread George Neville-Neil



On 2 Jun 2016, at 16:21, Ngie Cooper wrote:

On Thu, Jun 2, 2016 at 10:51 AM, George V. Neville-Neil 
 wrote:

Author: gnn
Date: Thu Jun  2 17:51:29 2016
New Revision: 301217
URL: https://svnweb.freebsd.org/changeset/base/301217

Log:
  This change re-adds L2 caching for TCP and UDP, as originally added 
in D4306
  but removed due to other changes in the system. Restore the llentry 
pointer
  to the "struct route", and use it to cache the L2 lookup (ARP or 
ND6) as

  appropriate.

  Submitted by: Mike Karels
  Differential Revision:https://reviews.freebsd.org/D6262


This broke ibcore:
https://lists.freebsd.org/pipermail/freebsd-current/2016-June/061640.html
Thanks,
-Ngie


About to commit a fix.

Best,
George
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


Re: svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-02 Thread Ngie Cooper
On Thu, Jun 2, 2016 at 10:51 AM, George V. Neville-Neil  
wrote:
> Author: gnn
> Date: Thu Jun  2 17:51:29 2016
> New Revision: 301217
> URL: https://svnweb.freebsd.org/changeset/base/301217
>
> Log:
>   This change re-adds L2 caching for TCP and UDP, as originally added in D4306
>   but removed due to other changes in the system. Restore the llentry pointer
>   to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as
>   appropriate.
>
>   Submitted by: Mike Karels
>   Differential Revision:https://reviews.freebsd.org/D6262

This broke ibcore:
https://lists.freebsd.org/pipermail/freebsd-current/2016-June/061640.html
Thanks,
-Ngie
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


svn commit: r301217 - in head/sys: net netinet netinet6

2016-06-02 Thread George V. Neville-Neil
Author: gnn
Date: Thu Jun  2 17:51:29 2016
New Revision: 301217
URL: https://svnweb.freebsd.org/changeset/base/301217

Log:
  This change re-adds L2 caching for TCP and UDP, as originally added in D4306
  but removed due to other changes in the system. Restore the llentry pointer
  to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as
  appropriate.
  
  Submitted by: Mike Karels
  Differential Revision:https://reviews.freebsd.org/D6262

Modified:
  head/sys/net/flowtable.c
  head/sys/net/if_arcsubr.c
  head/sys/net/if_ethersubr.c
  head/sys/net/if_fddisubr.c
  head/sys/net/if_fwsubr.c
  head/sys/net/if_iso88025subr.c
  head/sys/net/if_llatbl.h
  head/sys/net/route.c
  head/sys/net/route.h
  head/sys/netinet/if_ether.c
  head/sys/netinet/if_ether.h
  head/sys/netinet/in_pcb.c
  head/sys/netinet/ip_output.c
  head/sys/netinet/toecore.c
  head/sys/netinet6/in6.h
  head/sys/netinet6/in6_pcb.c
  head/sys/netinet6/ip6_output.c
  head/sys/netinet6/nd6.c
  head/sys/netinet6/nd6.h

Modified: head/sys/net/flowtable.c
==
--- head/sys/net/flowtable.cThu Jun  2 17:31:37 2016(r301216)
+++ head/sys/net/flowtable.cThu Jun  2 17:51:29 2016(r301217)
@@ -696,13 +696,8 @@ flowtable_lookup(sa_family_t sa, struct 
ro->ro_rt = fle->f_rt;
ro->ro_flags |= RT_NORTREF;
lle = fle->f_lle;
-   if (lle != NULL && (lle->la_flags & LLE_VALID)) {
-   ro->ro_prepend = lle->r_linkdata;
-   ro->ro_plen = lle->r_hdrlen;
-   ro->ro_flags |= RT_MAY_LOOP;
-   if (lle->la_flags & LLE_IFADDR)
-   ro->ro_flags |= RT_L2_ME;
-   }
+   if (lle != NULL && (lle->la_flags & LLE_VALID))
+   ro->ro_lle = lle;   /* share ref with fle->f_lle */
 
return (0);
 }

Modified: head/sys/net/if_arcsubr.c
==
--- head/sys/net/if_arcsubr.c   Thu Jun  2 17:31:37 2016(r301216)
+++ head/sys/net/if_arcsubr.c   Thu Jun  2 17:51:29 2016(r301217)
@@ -129,7 +129,8 @@ arc_output(struct ifnet *ifp, struct mbu
else if (ifp->if_flags & IFF_NOARP)
adst = ntohl(SIN(dst)->sin_addr.s_addr) & 0xFF;
else {
-   error = arpresolve(ifp, is_gw, m, dst, , NULL);
+   error = arpresolve(ifp, is_gw, m, dst, , NULL,
+   NULL);
if (error)
return (error == EWOULDBLOCK ? 0 : error);
}
@@ -170,7 +171,8 @@ arc_output(struct ifnet *ifp, struct mbu
if ((m->m_flags & M_MCAST) != 0)
adst = arcbroadcastaddr; /* ARCnet broadcast address */
else {
-   error = nd6_resolve(ifp, is_gw, m, dst, , NULL);
+   error = nd6_resolve(ifp, is_gw, m, dst, , NULL,
+   NULL);
if (error != 0)
return (error == EWOULDBLOCK ? 0 : error);
}

Modified: head/sys/net/if_ethersubr.c
==
--- head/sys/net/if_ethersubr.c Thu Jun  2 17:31:37 2016(r301216)
+++ head/sys/net/if_ethersubr.c Thu Jun  2 17:51:29 2016(r301217)
@@ -199,7 +199,7 @@ ether_requestencap(struct ifnet *ifp, st
 static int
 ether_resolve_addr(struct ifnet *ifp, struct mbuf *m,
const struct sockaddr *dst, struct route *ro, u_char *phdr,
-   uint32_t *pflags)
+   uint32_t *pflags, struct llentry **plle)
 {
struct ether_header *eh;
uint32_t lleflags = 0;
@@ -208,13 +208,16 @@ ether_resolve_addr(struct ifnet *ifp, st
uint16_t etype;
 #endif
 
+   if (plle)
+   *plle = NULL;
eh = (struct ether_header *)phdr;
 
switch (dst->sa_family) {
 #ifdef INET
case AF_INET:
if ((m->m_flags & (M_BCAST | M_MCAST)) == 0)
-   error = arpresolve(ifp, 0, m, dst, phdr, );
+   error = arpresolve(ifp, 0, m, dst, phdr, ,
+   plle);
else {
if (m->m_flags & M_BCAST)
memcpy(eh->ether_dhost, ifp->if_broadcastaddr,
@@ -233,7 +236,8 @@ ether_resolve_addr(struct ifnet *ifp, st
 #ifdef INET6
case AF_INET6:
if ((m->m_flags & M_MCAST) == 0)
-   error = nd6_resolve(ifp, 0, m, dst, phdr, );
+   error = nd6_resolve(ifp, 0, m, dst, phdr, ,
+   plle);
else {
const struct in6_addr *a6;
a6 = &(((const struct sockaddr_in6 *)dst)->sin6_addr);
@@ -283,14 +287,40 @@ ether_output(struct ifnet *ifp, struct m
int loop_copy