Re: svn commit: r238092 - in head/sys: net netinet netinet6

2012-07-09 Thread Gleb Smirnoff
On Sat, Jul 07, 2012 at 09:36:11PM +, Bjoern A. Zeeb wrote:
B  Author: glebius
B  Date: Wed Jul  4 07:37:53 2012
B  New Revision: 238092
B  URL: http://svn.freebsd.org/changeset/base/238092
B  
B  Log:
B   When ip_output()/ip6_output() is supplied a struct route *ro argument,
B   it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning
B   here: it may be supplied to provide route, and it may be supplied to
B   store and return to caller the route that ip_output()/ip6_output()
B   finds. In the latter case skipping FLOWTABLE lookup is pessimisation.
B  
B   The difference between struct route filled by FLOWTABLE and filled
B   by rtalloc() family is that the former doesn't hold a reference on
B   its rtentry. Reference is hold by flow entry, and it is about to
B   be released in future. Thus, route filled by FLOWTABLE shouldn't
B   be passed to RTFREE() macro.
B  
B   - Introduce new flag for struct route/route_in6, that marks route
B not holding a reference on rtentry.
B   - Introduce new macro RO_RTFREE() that cleans up a struct route
B depending on its kind.
B   - All callers to ip_output()/ip6_output() that do supply non-NULL
B but empty route should use RO_RTFREE() to free results of
B lookup.
B   - ip_output()/ip6_output() now do FLOWTABLE lookup always when
B ro-ro_rt == NULL.
B 
B 
B Just read the description but you realize that the proper fix is to
B make flowtable code a bit slower and do proper reference counting?
B 
B Currently a cache flowtable entry might still be releases while a packet
B in being processed using it, right?

I think we should connect Kip to the discussion. Flowtable is designed to
be fast and lockless, and not refcounting rtentries per packet. Yes, it is
theoretically racy. But my patch didn't bring in anything new to
this pecularity of flowtable.

-- 
Totus tuus, Glebius.
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r238092 - in head/sys: net netinet netinet6

2012-07-07 Thread Bjoern A. Zeeb

On 4. Jul 2012, at 07:37 , Gleb Smirnoff wrote:

 Author: glebius
 Date: Wed Jul  4 07:37:53 2012
 New Revision: 238092
 URL: http://svn.freebsd.org/changeset/base/238092
 
 Log:
  When ip_output()/ip6_output() is supplied a struct route *ro argument,
  it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning
  here: it may be supplied to provide route, and it may be supplied to
  store and return to caller the route that ip_output()/ip6_output()
  finds. In the latter case skipping FLOWTABLE lookup is pessimisation.
 
  The difference between struct route filled by FLOWTABLE and filled
  by rtalloc() family is that the former doesn't hold a reference on
  its rtentry. Reference is hold by flow entry, and it is about to
  be released in future. Thus, route filled by FLOWTABLE shouldn't
  be passed to RTFREE() macro.
 
  - Introduce new flag for struct route/route_in6, that marks route
not holding a reference on rtentry.
  - Introduce new macro RO_RTFREE() that cleans up a struct route
depending on its kind.
  - All callers to ip_output()/ip6_output() that do supply non-NULL
but empty route should use RO_RTFREE() to free results of
lookup.
  - ip_output()/ip6_output() now do FLOWTABLE lookup always when
ro-ro_rt == NULL.


Just read the description but you realize that the proper fix is to
make flowtable code a bit slower and do proper reference counting?

Currently a cache flowtable entry might still be releases while a packet
in being processed using it, right?

If my memory serves me right the same is true for llentries cached from
flowtable.

Once that is fixed as well the proper way to free an lltable should be
that the table does not go away while it still holds entry but if it's
marked for deletion the cleanup of the last llentry should also free
the table.  This would revert parts of Kip's last commit and restore
a proper teardown world order.

/bz

-- 
Bjoern A. Zeeb You have to have visions!
   It does not matter how good you are. It matters what good you do!

___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


svn commit: r238092 - in head/sys: net netinet netinet6

2012-07-04 Thread Gleb Smirnoff
Author: glebius
Date: Wed Jul  4 07:37:53 2012
New Revision: 238092
URL: http://svn.freebsd.org/changeset/base/238092

Log:
  When ip_output()/ip6_output() is supplied a struct route *ro argument,
  it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning
  here: it may be supplied to provide route, and it may be supplied to
  store and return to caller the route that ip_output()/ip6_output()
  finds. In the latter case skipping FLOWTABLE lookup is pessimisation.
  
  The difference between struct route filled by FLOWTABLE and filled
  by rtalloc() family is that the former doesn't hold a reference on
  its rtentry. Reference is hold by flow entry, and it is about to
  be released in future. Thus, route filled by FLOWTABLE shouldn't
  be passed to RTFREE() macro.
  
  - Introduce new flag for struct route/route_in6, that marks route
not holding a reference on rtentry.
  - Introduce new macro RO_RTFREE() that cleans up a struct route
depending on its kind.
  - All callers to ip_output()/ip6_output() that do supply non-NULL
but empty route should use RO_RTFREE() to free results of
lookup.
  - ip_output()/ip6_output() now do FLOWTABLE lookup always when
ro-ro_rt == NULL.
  
  Tested by:tuexen (SCTP part)

Modified:
  head/sys/net/flowtable.c
  head/sys/net/route.h
  head/sys/netinet/ip_input.c
  head/sys/netinet/ip_output.c
  head/sys/netinet/sctp_output.c
  head/sys/netinet6/ip6_output.c
  head/sys/netinet6/nd6_nbr.c

Modified: head/sys/net/flowtable.c
==
--- head/sys/net/flowtable.cWed Jul  4 06:29:56 2012(r238091)
+++ head/sys/net/flowtable.cWed Jul  4 07:37:53 2012(r238092)
@@ -619,6 +619,7 @@ flow_to_route(struct flentry *fle, struc
sin-sin_addr.s_addr = hashkey[2];
ro-ro_rt = __DEVOLATILE(struct rtentry *, fle-f_rt);
ro-ro_lle = __DEVOLATILE(struct llentry *, fle-f_lle);
+   ro-ro_flags |= RT_NORTREF;
 }
 #endif /* INET */
 
@@ -826,7 +827,7 @@ flow_to_route_in6(struct flentry *fle, s
memcpy(sin6-sin6_addr, hashkey[5], sizeof (struct in6_addr));
ro-ro_rt = __DEVOLATILE(struct rtentry *, fle-f_rt);
ro-ro_lle = __DEVOLATILE(struct llentry *, fle-f_lle);
-
+   ro-ro_flags |= RT_NORTREF;
 }
 #endif /* INET6 */
 

Modified: head/sys/net/route.h
==
--- head/sys/net/route.hWed Jul  4 06:29:56 2012(r238091)
+++ head/sys/net/route.hWed Jul  4 07:37:53 2012(r238092)
@@ -54,7 +54,8 @@ struct route {
struct  sockaddr ro_dst;
 };
 
-#define RT_CACHING_CONTEXT 0x1
+#defineRT_CACHING_CONTEXT  0x1 /* XXX: not used anywhere */
+#defineRT_NORTREF  0x2 /* doesn't hold reference on 
ro_rt */
 
 /*
  * These numbers are used by reliable protocols for determining
@@ -341,6 +342,18 @@ struct rt_addrinfo {
RTFREE_LOCKED(_rt); \
 } while (0)
 
+#defineRO_RTFREE(_ro) do { \
+   if ((_ro)-ro_rt) { \
+   if ((_ro)-ro_flags  RT_NORTREF) { \
+   (_ro)-ro_flags = ~RT_NORTREF; \
+   (_ro)-ro_rt = NULL;\
+   } else {\
+   RT_LOCK((_ro)-ro_rt);  \
+   RTFREE_LOCKED((_ro)-ro_rt);\
+   }   \
+   }   \
+} while (0)
+
 struct radix_node_head *rt_tables_get_rnh(int, int);
 
 struct ifmultiaddr;

Modified: head/sys/netinet/ip_input.c
==
--- head/sys/netinet/ip_input.c Wed Jul  4 06:29:56 2012(r238091)
+++ head/sys/netinet/ip_input.c Wed Jul  4 07:37:53 2012(r238092)
@@ -1495,8 +1495,7 @@ ip_forward(struct mbuf *m, int srcrt)
 
if (error == EMSGSIZE  ro.ro_rt)
mtu = ro.ro_rt-rt_rmx.rmx_mtu;
-   if (ro.ro_rt)
-   RTFREE(ro.ro_rt);
+   RO_RTFREE(ro);
 
if (error)
IPSTAT_INC(ips_cantforward);

Modified: head/sys/netinet/ip_output.c
==
--- head/sys/netinet/ip_output.cWed Jul  4 06:29:56 2012
(r238091)
+++ head/sys/netinet/ip_output.cWed Jul  4 07:37:53 2012
(r238092)
@@ -105,6 +105,10 @@ extern struct protosw inetsw[];
  * ip_len and ip_off are in host format.
  * The mbuf chain containing the packet will be freed.
  * The mbuf opt, if present, will not be freed.
+ * If route ro is present and has ro_rt initialized, route lookup would be
+ * skipped and ro-ro_rt would be used. If ro is