[head tinderbox] failure on powerpc/powerpc

2012-04-24 Thread FreeBSD Tinderbox
TB --- 2012-04-24 03:49:41 - tinderbox 2.9 running on freebsd-current.sentex.ca
TB --- 2012-04-24 03:49:41 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE 
FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 
d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64
TB --- 2012-04-24 03:49:41 - starting HEAD tinderbox run for powerpc/powerpc
TB --- 2012-04-24 03:49:41 - cleaning the object tree
TB --- 2012-04-24 03:50:43 - cvsupping the source tree
TB --- 2012-04-24 03:50:43 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
/tinderbox/HEAD/powerpc/powerpc/supfile
TB --- 2012-04-24 03:51:23 - building world
TB --- 2012-04-24 03:51:23 - CROSS_BUILD_TESTING=YES
TB --- 2012-04-24 03:51:23 - MAKEOBJDIRPREFIX=/obj
TB --- 2012-04-24 03:51:23 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2012-04-24 03:51:23 - SRCCONF=/dev/null
TB --- 2012-04-24 03:51:23 - TARGET=powerpc
TB --- 2012-04-24 03:51:23 - TARGET_ARCH=powerpc
TB --- 2012-04-24 03:51:23 - TZ=UTC
TB --- 2012-04-24 03:51:23 - __MAKE_CONF=/dev/null
TB --- 2012-04-24 03:51:23 - cd /src
TB --- 2012-04-24 03:51:23 - /usr/bin/make -B buildworld
>>> World build started on Tue Apr 24 03:51:24 UTC 2012
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
[...]
cc1_main.cpp:(.text+0x78): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_mutex_lock@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x88): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_once@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x90): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_mutex_unlock@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.o: In function `__static_initialization_and_destruction_0(int, int)':
cc1_main.cpp:(.text+0x130): relocation truncated to fit: R_PPC_REL24 against 
symbol `getenv@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x31c): relocation truncated to fit: R_PPC_REL24 against
symbol `std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::basic_string(char const*, std::allocator<char> const&)@@GLIBCXX_3.4' defined
in .plt section in /obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x358): relocation truncated to fit: R_PPC_REL24 against
symbol `std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::basic_string(char const*, std::allocator<char> const&)@@GLIBCXX_3.4' defined
in .plt section in /obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x3b8): additional relocation overflows omitted from the 
output
*** Error code 1

Stop in /src/usr.bin/clang/clang.
*** Error code 1

Stop in /src/usr.bin/clang.
*** Error code 1

Stop in /src/usr.bin.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2012-04-24 05:59:33 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2012-04-24 05:59:33 - ERROR: failed to build world
TB --- 2012-04-24 05:59:33 - 6463.66 user 812.23 system 7792.54 real


http://tinderbox.freebsd.org/tinderbox-head-HEAD-powerpc-powerpc.full
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: about /sys/dev/netmap/head.diff

2012-04-24 Thread rizzo
On Mon, Apr 23, 2012 at 08:17:40PM +0800, r...@9du.org wrote:
> I think this head.diff may be old!

It is; definitely time to remove it, since the code
has been merged.

cheers
luigi

>
> in diff
> +#ifdef DEV_NETMAP
> +    if (slot) {
> +        netmap_load_map(txr->txtag, txbuf->map,
> +            NMB(slot), adapter->rx_mbuf_sz);
> +        slot++;
> +    }
> +#endif /* DEV_NETMAP */
>
> in netmap_kern.h
>
> static inline void
> netmap_load_map(bus_dma_tag_t tag, bus_dmamap_t map, void *buf)
> {
>     if (map)
>         bus_dmamap_load(tag, map, buf, NETMAP_BUF_SIZE,
>             netmap_dmamap_cb, NULL, BUS_DMA_NOWAIT);
> }
>
> yong
> r...@9du.org


Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Andre Oppermann

On 19.04.2012 22:46, Luigi Rizzo wrote:
> On Thu, Apr 19, 2012 at 10:05:37PM +0200, Andre Oppermann wrote:
>> On 19.04.2012 15:30, Luigi Rizzo wrote:
>>> I have been running some performance tests on UDP sockets,
>>> using the netsend program in tools/tools/netrate/netsend
>>> and instrumenting the source code and the kernel to return at
>>> various points of the path. Here are some results which
>>> I hope you find interesting.
>>> - another big bottleneck is the route lookup in ip_output()
>>>   (between entries 51 and 56). Not only does it eat another
>>>   100ns+ on an empty routing table, but it also
>>>   causes heavy contention when multiple cores
>>>   are involved.
>>
>> This is indeed a big problem.  I'm working (rough edges remain) on
>> changing the routing table locking to an rmlock (read-mostly) which
>
> i was wondering, is there a way (and/or any advantage) to use the
> fastforward code to look up the route for locally sourced packets ?


I've completed the updating of the routing table rmlock patch.  There
are two steps.  Step one is just changing the rwlock to an rmlock.
Step two streamlines the route lookup in ip_output and ip_fastfwd by
copying out the relevant data while only holding the rmlock instead
of obtaining a reference to the route.

Would be very interesting to see how your benchmark/profiling changes
with these patches applied.

http://svn.freebsd.org/changeset/base/234649
Log:
  Change the radix head lock to an rmlock (read mostly lock).

  There is some header pollution going on because rmlock's are
  not entirely abstracted and need per-CPU structures.

  A comment in _rmlock.h says this can be hidden if there were
  per-cpu linker magic/support.  I don't know if we have that
  already.

http://svn.freebsd.org/changeset/base/234650
Log:
  Add a function rtlookup() that copies out the relevant information
  from an rtentry instead of returning the rtentry.  This avoids the
  need to lock the rtentry and to increase the refcount on it.

  Convert ip_output() to use rtlookup() in a simplistic way.  Certain
  seldom used functionality may not work anymore and the flowtable
  isn't available at the moment.

  Convert ip_fastfwd() to use rtlookup().

  This code is meant to be used for profiling and to be experimented
  with further to determine which locking strategy returns the best
  results.

Make sure to apply this one as well:
http://svn.freebsd.org/changeset/base/234648
Log:
  Add INVARIANT and WITNESS support to rm_lock locks and optimize the
  synchronization path by replacing a LIST of active readers with a
  TAILQ.

  Obtained from:Isilon
  Submitted by: mlaier

--
Andre


Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Luigi Rizzo
On Tue, Apr 24, 2012 at 03:16:48PM +0200, Andre Oppermann wrote:
> On 19.04.2012 22:46, Luigi Rizzo wrote:
>> On Thu, Apr 19, 2012 at 10:05:37PM +0200, Andre Oppermann wrote:
>>> On 19.04.2012 15:30, Luigi Rizzo wrote:
>>>> I have been running some performance tests on UDP sockets,
>>>> using the netsend program in tools/tools/netrate/netsend
>>>> and instrumenting the source code and the kernel to return at
>>>> various points of the path. Here are some results which
>>>> I hope you find interesting.
>>>> - another big bottleneck is the route lookup in ip_output()
>>>>   (between entries 51 and 56). Not only does it eat another
>>>>   100ns+ on an empty routing table, but it also
>>>>   causes heavy contention when multiple cores
>>>>   are involved.
>>>
>>> This is indeed a big problem.  I'm working (rough edges remain) on
>>> changing the routing table locking to an rmlock (read-mostly) which
>>
>> i was wondering, is there a way (and/or any advantage) to use the
>> fastforward code to look up the route for locally sourced packets ?
>
> I've completed the updating of the routing table rmlock patch.  There
> are two steps.  Step one is just changing the rwlock to an rmlock.
> Step two streamlines the route lookup in ip_output and ip_fastfwd by
> copying out the relevant data while only holding the rmlock instead
> of obtaining a reference to the route.
>
> Would be very interesting to see how your benchmark/profiling changes
> with these patches applied.

If you want to give it a try yourself, the high level benchmark is
just the 'netsend' program from tools/tools/netrate/netsend -- i
am running something like

for i in $X ; do
netsend 10.0.0.2  18 0 5 
done

and the cardinality of $X can be used to test contention on the
low layers (routing tables and interface/queues).

From previous tests, the difference between flowtable and
routing table was small with a single process (about 5% or 50ns
in the total packet processing time, if i remember well),
but there was a large gain with multiple concurrent processes.

Probably the change in throughput between HEAD and your
branch is all you need. The info below shows that your
gain is something around 100-200 ns, depending on how good
the info you return is (see below).

My profiling changes were mostly aimed at charging the costs to the
various layers. With my current setting (single process i7-870 @2933
MHz+Turboboost, ixgbe, FreeBSD HEAD, FLOWTABLE enabled, UDP) i see
the following:

File             Function/description      Total/delta
                                           nanoseconds
user program     sendto()                      8    96
                 system call

uipc_syscalls.c  sys_sendto                  104
uipc_syscalls.c  sendit                      111
uipc_syscalls.c  kern_sendit                 118
uipc_socket.c    sosend
uipc_socket.c    sosend_dgram                146   137
                   sockbuf locking, mbuf alloc, copyin

udp_usrreq.c     udp_send                    273
udp_usrreq.c     udp_output                  273    57

ip_output.c      ip_output                   330   198
                   route lookup, ip header setup

if_ethersubr.c   ether_output                528   162
                   MAC header lookup and construction,
                   loopback checks
if_ethersubr.c   ether_output_frame          690

ixgbe.c          ixgbe_mq_start              698
ixgbe.c          ixgbe_mq_start_locked       720
ixgbe.c          ixgbe_xmit                  730   220
                   mbuf mangling, device programming

--               packet on the wire          950

Removing flowtable increases the cost in ip_output()
(obviously) but also in ether_output() (because the
route does not have a lle entry so you need to call
arpresolve on each packet). It also causes trouble
in the device driver because the mbuf does not have a
flowid set, so the ixgbe device driver puts the
packet on the queue corresponding to the current CPU.
If the process (as in my case) floats, one flow might end
up on multiple queues.
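
The queue selection behaviour described above can be sketched as follows. This is a simplified illustration, not the driver source: struct pkt and select_tx_queue are hypothetical names standing in for the mbuf header fields and the driver's transmit path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of interest in an mbuf header. */
struct pkt {
    bool     has_flowid;  /* was a flow id attached to the packet? */
    uint32_t flowid;      /* flow id, meaningful only if has_flowid */
};

/*
 * Pick a transmit queue. With a flow id, every packet of the flow maps
 * to the same queue. Without one, the choice follows the CPU the sender
 * happens to be running on, so a migrating process spreads one flow
 * over several queues.
 */
unsigned select_tx_queue(const struct pkt *p, unsigned curcpu,
                         unsigned nqueues)
{
    if (p->has_flowid)
        return p->flowid % nqueues;
    return curcpu % nqueues;
}
```

With four queues, a packet carrying flow id 42 lands on queue 2 whichever CPU sends it, while an untagged packet lands on queue 0 from CPU 0 and on queue 3 from CPU 3.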

So in revising the route lookup i believe it would be good
if we could also get at once most of the info that
ether_output() is computing again and again.

cheers
luigi


> http://svn.freebsd.org/changeset/base/234649
> Log:
>   Change the radix head lock to an rmlock (read mostly lock).
>
>   There is some header pollution going on because rmlock's are
>   not entirely abstracted and need per-CPU structures.
>
>   A comment in _rmlock.h says this can be hidden if there were
>   per-cpu linker magic/support.  I don't know if we have that
>   already.
>
> http://svn.freebsd.org/changeset/base/234650
> Log:
>   Add a function rtlookup() that copies out the relevant information
>   from an rtentry instead of returning the rtentry.  This avoids the
>   need to lock the rtentry and to increase the refcount on it.
>
>   Convert ip_output() to use rtlookup() in a simplistic way.  Certain
>   seldom used functionality may not work anymore and the flowtable
>   isn't available at the moment.

RE: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Li, Qing

> From previous tests, the difference between flowtable and
> routing table was small with a single process (about 5% or 50ns
> in the total packet processing time, if i remember well),
> but there was a large gain with multiple concurrent processes.

Yes, that sounds about right from when we did the tests a long while ago.

> Removing flowtable increases the cost in ip_output()
> (obviously) but also in ether_output() (because the
> route does not have a lle entry so you need to call
> arpresolve on each packet).

Yup.

> So in revising the route lookup i believe it would be good
> if we could also get at once most of the info that
> ether_output() is computing again and again.

Well, the routing table no longer maintains any lle info, so there
isn't much to copy out of the rtentry at the completion of route
lookup.

If I understood you correctly, you do believe there is a lot of value
in the Flowtable caching concept, but you are not suggesting we revert
to having the routing table maintain L2 entries, are you ?

--Qing


Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread K. Macy
On Tue, Apr 24, 2012 at 4:16 PM, Li, Qing <qing...@bluecoat.com> wrote:
>> From previous tests, the difference between flowtable and
>> routing table was small with a single process (about 5% or 50ns
>> in the total packet processing time, if i remember well),
>> but there was a large gain with multiple concurrent processes.
>
> Yes, that sounds about right from when we did the tests a long while ago.
>
>> Removing flowtable increases the cost in ip_output()
>> (obviously) but also in ether_output() (because the
>> route does not have a lle entry so you need to call
>> arpresolve on each packet).
>
> Yup.
>
>> So in revising the route lookup i believe it would be good
>> if we could also get at once most of the info that
>> ether_output() is computing again and again.
>
> Well, the routing table no longer maintains any lle info, so there
> isn't much to copy out of the rtentry at the completion of route
> lookup.
>
> If I understood you correctly, you do believe there is a lot of value
> in the Flowtable caching concept, but you are not suggesting we revert
> to having the routing table maintain L2 entries, are you ?



One could try a similar conversion of the L2 table to an rmlock
without copy while lock is held.

-Kip




Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread K. Macy
On Tue, Apr 24, 2012 at 5:03 PM, K. Macy <km...@freebsd.org> wrote:
> On Tue, Apr 24, 2012 at 4:16 PM, Li, Qing <qing...@bluecoat.com> wrote:
>>> From previous tests, the difference between flowtable and
>>> routing table was small with a single process (about 5% or 50ns
>>> in the total packet processing time, if i remember well),
>>> but there was a large gain with multiple concurrent processes.
>>
>> Yes, that sounds about right from when we did the tests a long while ago.
>>
>>> Removing flowtable increases the cost in ip_output()
>>> (obviously) but also in ether_output() (because the
>>> route does not have a lle entry so you need to call
>>> arpresolve on each packet).
>>
>> Yup.
>>
>>> So in revising the route lookup i believe it would be good
>>> if we could also get at once most of the info that
>>> ether_output() is computing again and again.
>>
>> Well, the routing table no longer maintains any lle info, so there
>> isn't much to copy out of the rtentry at the completion of route
>> lookup.
>>
>> If I understood you correctly, you do believe there is a lot of value
>> in the Flowtable caching concept, but you are not suggesting we revert
>> to having the routing table maintain L2 entries, are you ?
>
> One could try a similar conversion of the L2 table to an rmlock
> without copy while lock is held.

Odd .. *with* copy while lock is held.

-Kip


[head tinderbox] failure on powerpc/powerpc

2012-04-24 Thread FreeBSD Tinderbox
TB --- 2012-04-24 13:01:51 - tinderbox 2.9 running on freebsd-current.sentex.ca
TB --- 2012-04-24 13:01:51 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE 
FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 
d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64
TB --- 2012-04-24 13:01:51 - starting HEAD tinderbox run for powerpc/powerpc
TB --- 2012-04-24 13:01:51 - cleaning the object tree
TB --- 2012-04-24 13:03:09 - cvsupping the source tree
TB --- 2012-04-24 13:03:09 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
/tinderbox/HEAD/powerpc/powerpc/supfile
TB --- 2012-04-24 13:04:14 - building world
TB --- 2012-04-24 13:04:14 - CROSS_BUILD_TESTING=YES
TB --- 2012-04-24 13:04:14 - MAKEOBJDIRPREFIX=/obj
TB --- 2012-04-24 13:04:14 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2012-04-24 13:04:14 - SRCCONF=/dev/null
TB --- 2012-04-24 13:04:14 - TARGET=powerpc
TB --- 2012-04-24 13:04:14 - TARGET_ARCH=powerpc
TB --- 2012-04-24 13:04:14 - TZ=UTC
TB --- 2012-04-24 13:04:14 - __MAKE_CONF=/dev/null
TB --- 2012-04-24 13:04:14 - cd /src
TB --- 2012-04-24 13:04:14 - /usr/bin/make -B buildworld
>>> World build started on Tue Apr 24 13:04:15 UTC 2012
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
[...]
cc1_main.cpp:(.text+0x78): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_mutex_lock@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x88): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_once@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x90): relocation truncated to fit: R_PPC_REL24 against 
symbol `pthread_mutex_unlock@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.o: In function `__static_initialization_and_destruction_0(int, int)':
cc1_main.cpp:(.text+0x130): relocation truncated to fit: R_PPC_REL24 against 
symbol `getenv@@FBSD_1.0' defined in .plt section in 
/obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x31c): relocation truncated to fit: R_PPC_REL24 against
symbol `std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::basic_string(char const*, std::allocator<char> const&)@@GLIBCXX_3.4' defined
in .plt section in /obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x358): relocation truncated to fit: R_PPC_REL24 against
symbol `std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::basic_string(char const*, std::allocator<char> const&)@@GLIBCXX_3.4' defined
in .plt section in /obj/powerpc.powerpc/src/tmp/usr/lib/crt1.o
cc1_main.cpp:(.text+0x3b8): additional relocation overflows omitted from the 
output
*** Error code 1

Stop in /src/usr.bin/clang/clang.
*** Error code 1

Stop in /src/usr.bin/clang.
*** Error code 1

Stop in /src/usr.bin.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2012-04-24 15:09:40 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2012-04-24 15:09:40 - ERROR: failed to build world
TB --- 2012-04-24 15:09:40 - 6324.95 user 799.24 system 7669.12 real


http://tinderbox.freebsd.org/tinderbox-head-HEAD-powerpc-powerpc.full


Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Luigi Rizzo
On Tue, Apr 24, 2012 at 02:16:18PM +, Li, Qing wrote:
 
>> From previous tests, the difference between flowtable and
>> routing table was small with a single process (about 5% or 50ns
>> in the total packet processing time, if i remember well),
>> but there was a large gain with multiple concurrent processes.
>
> Yes, that sounds about right from when we did the tests a long while ago.
>
>> Removing flowtable increases the cost in ip_output()
>> (obviously) but also in ether_output() (because the
>> route does not have a lle entry so you need to call
>> arpresolve on each packet).
>
> Yup.
>
>> So in revising the route lookup i believe it would be good
>> if we could also get at once most of the info that
>> ether_output() is computing again and again.
>
> Well, the routing table no longer maintains any lle info, so there
> isn't much to copy out of the rtentry at the completion of route
> lookup.
>
> If I understood you correctly, you do believe there is a lot of value
> in the Flowtable caching concept, but you are not suggesting we revert
> to having the routing table maintain L2 entries, are you ?

I see a lot of value in caching in general.

Especially for a bound socket it seems pointless to look up the
route, iface and MAC address(es) on every single packet instead of
caching them. Routes and MAC addresses are volatile anyway,
so making sure that we do the lookup 1us closer to the actual use
gives no additional guarantee.

How often this info (routes and MAC addresses)
changes clearly influences the mechanism used to validate the cache.
I suppose we have the following options:

- direct notification: a failure in a direct chain of calls
  can be used to invalidate the info cached in the socket.
  Similarly, some incoming traffic (e.g. TCP RST, FIN,
  ICMP messages) that reaches a socket can invalidate the cached values
- assume a minimum lifetime for the info (i think this is what
  happens in the flowtable) and flush it unconditionally
  every such interval (say 10ms).
- if some info changes infrequently (e.g. MAC addresses) one could
  put a version number in the cached value and use it to validate
  the cache.
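
To make the second option concrete, here is a minimal sketch of a per-socket cache that is flushed unconditionally once its lifetime has passed. Everything in it is an assumption for illustration: the names (path_cache, path_cache_fill, path_cache_usable) are hypothetical, the 10 ms lifetime is just the example figure above, and the caller supplies timestamps in microseconds so the logic does not depend on any particular clock.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LIFETIME_US 10000u  /* 10 ms, the example interval above */

/* Hypothetical per-socket cache of the output path. */
struct path_cache {
    bool     valid;
    uint64_t filled_at_us;  /* time of the last full lookup */
    int      ifindex;       /* cached outgoing interface */
    uint8_t  dst_mac[6];    /* cached destination MAC address */
};

/* Store the result of a full route + L2 lookup. */
void path_cache_fill(struct path_cache *c, uint64_t now_us,
                     int ifindex, const uint8_t mac[6])
{
    c->ifindex = ifindex;
    memcpy(c->dst_mac, mac, 6);
    c->filled_at_us = now_us;
    c->valid = true;
}

/*
 * Return true if the cached values may be used for this packet.
 * Once the lifetime has elapsed the entry is flushed unconditionally
 * and the caller has to repeat the full lookup.
 */
bool path_cache_usable(struct path_cache *c, uint64_t now_us)
{
    if (c->valid && now_us - c->filled_at_us >= CACHE_LIFETIME_US)
        c->valid = false;
    return c->valid;
}
```

The per-packet cost is one comparison, and stale info survives for at most one interval.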

cheers
luigi


Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread K. Macy
On Tue, Apr 24, 2012 at 6:34 PM, Luigi Rizzo <ri...@iet.unipi.it> wrote:
> On Tue, Apr 24, 2012 at 02:16:18PM +, Li, Qing wrote:
>>> From previous tests, the difference between flowtable and
>>> routing table was small with a single process (about 5% or 50ns
>>> in the total packet processing time, if i remember well),
>>> but there was a large gain with multiple concurrent processes.
>>
>> Yes, that sounds about right from when we did the tests a long while ago.
>>
>>> Removing flowtable increases the cost in ip_output()
>>> (obviously) but also in ether_output() (because the
>>> route does not have a lle entry so you need to call
>>> arpresolve on each packet).
>>
>> Yup.
>>
>>> So in revising the route lookup i believe it would be good
>>> if we could also get at once most of the info that
>>> ether_output() is computing again and again.
>>
>> Well, the routing table no longer maintains any lle info, so there
>> isn't much to copy out of the rtentry at the completion of route
>> lookup.
>>
>> If I understood you correctly, you do believe there is a lot of value
>> in the Flowtable caching concept, but you are not suggesting we revert
>> to having the routing table maintain L2 entries, are you ?
>
> I see a lot of value in caching in general.
>
> Especially for a bound socket it seems pointless to look up the
> route, iface and MAC address(es) on every single packet instead of
> caching them. Routes and MAC addresses are volatile anyway,
> so making sure that we do the lookup 1us closer to the actual use
> gives no additional guarantee.
>
> How often this info (routes and MAC addresses)
> changes clearly influences the mechanism used to validate the cache.
> I suppose we have the following options:
>
> - direct notification: a failure in a direct chain of calls
>   can be used to invalidate the info cached in the socket.
>   Similarly, some incoming traffic (e.g. TCP RST, FIN,
>   ICMP messages) that reaches a socket can invalidate the cached values
> - assume a minimum lifetime for the info (i think this is what
>   happens in the flowtable) and flush it unconditionally
>   every such interval (say 10ms).
> - if some info changes infrequently (e.g. MAC addresses) one could
>   put a version number in the cached value and use it to validate
>   the cache.

I have a patch that has been sitting around for a long time due to
review cycle latency that caches a pointer to the rtentry (and
llentry) in the inpcb. Before each use the rtentry is checked
against a generation number in the routing tree that is incremented on
every routing table update.
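
The generation check described above can be sketched roughly as follows. This is not the patch itself: the types and function names (rt_table, pcb_route_cache, pcb_cache_get, and so on) are hypothetical, and the locking and reference counting that real kernel code would need are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing table header; only the generation matters here. */
struct rt_table {
    uint64_t generation;  /* incremented on every table update */
};

/* Hypothetical route entry. */
struct rt_entry {
    int ifindex;
};

/* Hypothetical per-connection cache, as would live in the inpcb. */
struct pcb_route_cache {
    const struct rt_entry *rt;
    uint64_t               generation;  /* table generation when cached */
};

/* Called by any code path that modifies the routing table. */
void rt_table_updated(struct rt_table *t)
{
    t->generation++;
}

/* Remember a looked-up entry together with the current generation. */
void pcb_cache_store(struct pcb_route_cache *c, const struct rt_table *t,
                     const struct rt_entry *rt)
{
    c->rt = rt;
    c->generation = t->generation;
}

/*
 * Return the cached entry if the table has not changed since it was
 * stored, otherwise drop it and return NULL so that the caller falls
 * back to a full lookup.
 */
const struct rt_entry *pcb_cache_get(struct pcb_route_cache *c,
                                     const struct rt_table *t)
{
    if (c->rt != NULL && c->generation != t->generation)
        c->rt = NULL;
    return c->rt;
}
```

A single table-wide counter keeps the per-packet check down to one comparison, at the price of invalidating every cached pointer on any update, including updates to unrelated routes.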


make -j4 buildworld error

2012-04-24 Thread r...@9du.org
make -j4 buildworld error
--
>>> World build started on Tue Apr 24 21:32:26 CST 2012
--
--
>>> Rebuilding the temporary build tree
--
rm -rf /usr/obj/usr/src/tmp
rm -rf /usr/obj/usr/src/lib32
mkdir -p /usr/obj/usr/src/tmp/lib
mkdir -p /usr/obj/usr/src/tmp/usr
mkdir -p /usr/obj/usr/src/tmp/legacy/usr
mtree -deU -f /usr/src/etc/mtree/BSD.usr.dist  -p 
/usr/obj/usr/src/tmp/legacy/usr >/dev/null
mtree -deU -f /usr/src/etc/mtree/BSD.groff.dist  -p 
/usr/obj/usr/src/tmp/legacy/usr >/dev/null
mtree -deU -f /usr/src/etc/mtree/BSD.usr.dist  -p /usr/obj/usr/src/tmp/usr 
>/dev/null
mtree -deU -f /usr/src/etc/mtree/BSD.include.dist  -p 
/usr/obj/usr/src/tmp/usr/include >/dev/null

.


/usr/obj/usr/src/tmp/usr/src/usr.bin/clang/tblgen created for 
/usr/src/usr.bin/clang/tblgen
rm -f .depend
mkdep -f .depend -a
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/include 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/tools/clang/include 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen -I. 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/../../lib/clang/include 
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd10.0\" 
-I/usr/obj/usr/src/tmp/legacy/usr/include  
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/AsmMatcherEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/AsmWriterEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/AsmWriterInst.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CallingConvEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CodeEmitterGen.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CodeGenDAGPatterns.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CodeGenInstruction.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CodeGenRegisters.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/CodeGenTarget.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DAGISelEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DAGISelMatcher.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DAGISelMatcherGen.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DAGISelMatcherOpt.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DFAPacketizerEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/DisassemblerEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/EDEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/FastISelEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/InstrInfoEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/IntrinsicEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/PseudoLoweringEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/RegisterInfoEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/SetTheory.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/StringMatcher.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/SubtargetEmitter.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/TGValueTypes.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/TableGen.cpp 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/X86DisassemblerTables.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/X86ModRMFilters.cpp
 
/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen/X86RecognizableInstr.cpp
echo tblgen: /usr/lib/libc.a  
/usr/obj/usr/src/tmp/usr/src/usr.bin/clang/tblgen/../../../lib/clang/libllvmtablegen/libllvmtablegen.a
 
/usr/obj/usr/src/tmp/usr/src/usr.bin/clang/tblgen/../../../lib/clang/libllvmsupport/libllvmsupport.a
 /usr/obj/usr/src/tmp/legacy/usr/lib/libegacy.a >> .depend
echo tblgen: /usr/lib/libstdc++.a >> .depend
c++ -O2 -pipe -I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/include 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/tools/clang/include 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen -I. 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/../../lib/clang/include 
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD 

Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Fabien Thomas
 
 
> I have a patch that has been sitting around for a long time due to
> review cycle latency that caches a pointer to the rtentry (and
> llentry) in the inpcb. Before each use the rtentry is checked
> against a generation number in the routing tree that is incremented on
> every routing table update.

Hi Kip,

Is there a public location for the patch ?
What can be done to speedup the commit: testing ?

Fabien



RE: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Li, Qing
Yup, all good points. In fact we considered all of these while doing
the work. In case you haven't seen it already, we wrote about these
issues in our paper and how we tried to address them; the flowtable was one
of the solutions.

http://dl.acm.org/citation.cfm?id=1592641

--Qing


 
  Well, the routing table no longer maintains any lle info, so there
  isn't much to copy out of the rtentry at the completion of route
  lookup.
 
  If I understood you correctly, you do believe there is a lot of value
  in the Flowtable caching concept, but you are not suggesting we revert
  to having the routing table maintain L2 entries, are you?
 
 I see a lot of value in caching in general.
 
 Especially for a bound socket it seems pointless to lookup the
 route, iface and mac address(es) on every single packet instead of
 caching them. And, routes and MAC addresses are volatile anyways
 so making sure that we do the lookup 1us closer to the actual use
 gives no additional guarantee.
 
 The frequency with which this info (routes and MAC addresses)
 changes clearly influences the mechanism to validate the cache.
 I suppose we have the following options:
 
 - direct notification: a failure in a direct chain of calls
   can be used to invalidate the info cached in the socket.
   Similarly, some incoming traffic (e.g. TCP RST, FIN,
   ICMP messages) that reach a socket can invalidate the cached values
 - assume a minimum lifetime for the info (i think this is what
   happens in the flowtable) and flush it unconditionally
   every such interval (say 10ms).
 - if some info changes infrequently (e.g. MAC addresses) one could
   put a version number in the cached value and use it to validate
   the cache.
 
 cheers
 luigi
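
A minimal sketch of the version-number option in the list above, which is also the shape of the generation check described earlier in the thread. It assumes a single table-wide counter. All names here (route_table, pcb_cache, route_update, cached_next_hop) are hypothetical stand-ins and not the kernel's rtentry/llentry/inpcb structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy routing table: one route plus a generation counter. */
struct route_table {
	unsigned int gen;	/* incremented on every table update */
	int next_hop;		/* stand-in for the rtentry/llentry data */
};

/* Hypothetical per-socket cache, analogous to state kept in an inpcb. */
struct pcb_cache {
	int valid;
	unsigned int gen;	/* table generation when the entry was cached */
	int next_hop;
	unsigned int lookups;	/* number of slow-path lookups performed */
};

/* Any routing change bumps the generation, invalidating all caches. */
static void
route_update(struct route_table *rt, int next_hop)
{
	rt->next_hop = next_hop;
	rt->gen++;
}

/* Return the next hop, redoing the lookup only if the cache is stale. */
static int
cached_next_hop(struct pcb_cache *c, const struct route_table *rt)
{
	assert(c != NULL && rt != NULL);
	if (!c->valid || c->gen != rt->gen) {
		c->next_hop = rt->next_hop;	/* the "full lookup" */
		c->gen = rt->gen;
		c->valid = 1;
		c->lookups++;
	}
	return (c->next_hop);
}
```

The per-packet cost in this sketch is one integer compare. The trade-off is that any table update invalidates every cached entry, whether or not the change affects that socket.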
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


RE: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Li, Qing
 
  I have a patch that has been sitting around for a long time due to
  review cycle latency that caches a pointer to the rtentry (and
  llentry) in the inpcb. Before each use the rtentry is checked
  against a generation number in the routing tree that is incremented
  on every routing table update.
 
 Hi Kip,
 
 Is there a public location for the patch ?
 What can be done to speedup the commit: testing ?
 
 Fabien

I performed an extensive review of this patch from Kip, and it was
ready to go. Really good work.

Not sure what is stopping its commit into the tree.

--Qing





Re: buildworld fails on FreeBSD 7.x for HEAD from 19.04.2012

2012-04-24 Thread David O'Brien
On Sun, Apr 22, 2012 at 09:06:18AM -0700, Garrett Cooper wrote:
  On 4/20/2012 5:16 AM, Jan Sieka wrote:
  I can't build world from recent sources (HEAD as of 2012.04.19 11:06:48
  UTC) on a machine running FreeBSD 7.3.
...
 Ugh. The usecase (that's now broken) is that Jan from Semihalf might
 have been running CURRENT builds on an older (stable) build machine.

Let's not guess.  If you've found that any version of 10-CURRENT cannot
build HEAD post r234449 please let me know.

I've verified that I can build HEAD on 8.3-PRERELEASE (r231882).

-- 
-- David  (obr...@freebsd.org)


Re: buildworld fails on FreeBSD 7.x for HEAD from 19.04.2012

2012-04-24 Thread Vladimir Sharun



=== usr.bin/file (all)
/usr/bin/clang -O2 -pipe  -DMAGIC='/usr/share/misc/magic'
-DHAVE_CONFIG_H -I/usr/src/usr.bin/file/../../lib/libmagic -std=gnu99
-Qunused-arguments -fstack-protector -Wsystem-headers -Wall
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual
-Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align
-Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls
-Wold-style-definition -Wno-pointer-sign -Wno-empty-body
-Wno-string-plus-int  -o file file.o -lmagic -lz
file.o: In function `main':
/usr/src/usr.bin/file/../../contrib/file/file.c:(.text+0x717): undefined
reference to `magic_getpath'
/usr/src/usr.bin/file/../../contrib/file/file.c:(.text+0x7df): undefined
reference to `magic_list'
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
*** [file] Error code 1

r234657

clang -v:
FreeBSD clang version 3.1 (trunk 154661) 20120413
Target: x86_64-unknown-freebsd10.0
Thread model: posix

FreeBSD 10.0-CURRENT #6: Thu Apr 12 08:56:05 EEST 2012 amd64

make buildworld without j's

On Sun, Apr 22, 2012 at 09:06:18AM -0700, Garrett Cooper wrote:
  On 4/20/2012 5:16 AM, Jan Sieka wrote:
  I can't build world from recent sources (HEAD as of 2012.04.19 11:06:48
  UTC) on a machine running FreeBSD 7.3.
...
 Ugh. The usecase (that's now broken) is that Jan from Semihalf might
 have been running CURRENT builds on an older (stable) build machine.

Let's not guess.  If you've found that any version of 10-CURRENT cannot
build HEAD post r234449 please let me know.

I've verified that I can build HEAD on 8.3-PRERELEASE (r231882).

-- 
-- David  (obr...@freebsd.org)


pmap and mtx scalability problem

2012-04-24 Thread Slawa Olhovchenkov
I tried make -j 30 build{world,kernel} (latest -CURRENT) on a 24-core machine
and see poor scalability of pmap/mtx -- more than 50% of CPU is spent on
system time.

pmcstat:

@ CPU_CLK_UNHALTED_CORE [194841 samples]

42.65%  [83102]_mtx_lock_sleep @ /boot/kernel/kernel
 40.97%  [34051] pmap_enter
  100.0%  [34051]  vm_fault_hold
   100.0%  [34051]   trap_pfault
 30.40%  [25262] vm_page_activate
  100.0%  [25262]  vm_fault_hold
   100.0%  [25262]   trap_pfault
 18.41%  [15300] vm_pageq_remove
  73.63%  [11266]  vm_page_free_toq
   70.69%  [7964]vm_object_terminate
   29.31%  [3302]vm_object_page_remove
  26.37%  [4034]   vm_fault_hold
   100.0%  [4034]trap_pfault

make -j 8:

15.44%  [10740]_mtx_lock_sleep @ /boot/kernel/kernel
 38.77%  [4164]  pmap_enter
  99.93%  [4161]   vm_fault_hold
   100.0%  [4161]trap_pfault
  00.07%  [3]  kmem_back
   100.0%  [3]   kmem_malloc
 27.98%  [3005]  vm_page_activate
  100.0%  [3005]   vm_fault_hold
   100.0%  [3005]trap_pfault
 20.64%  [2217]  vm_pageq_remove
  64.73%  [1435]   vm_page_free_toq
   63.41%  [910] vm_object_terminate

make -j 4

06.86%  [4222] pagezero @ /boot/kernel/kernel
 98.39%  [4154]  trap_pfault
  100.0%  [4154]   trap
 01.11%  [47]vm_fault_hold
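
In the callgraph output above, each nested percentage is relative to its parent row rather than to the total sample count (for example, 34051 of 83102 is 40.97%). A small helper, given here only as an illustrative sketch with a hypothetical name (pct), converts the bracketed sample counts into a share of whatever base is wanted.

```c
#include <assert.h>

/* Share of `total` represented by `samples`, in percent. */
static double
pct(unsigned long samples, unsigned long total)
{
	assert(total != 0);
	return (100.0 * (double)samples / (double)total);
}
```

For the -j 30 run, pct(34051, 194841) gives the share of all samples spent in _mtx_lock_sleep reached from pmap_enter, roughly 17.5%.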



Re: pmap and mtx scalability problem

2012-04-24 Thread K. Macy
Known problem. There is an open disagreement about how to improve the
granularity of locking in pmap.

-Kip

On Tue, Apr 24, 2012 at 9:14 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 I tried make -j 30 build{world,kernel} (latest -CURRENT) on a 24-core machine
 and see poor scalability of pmap/mtx -- more than 50% of CPU is spent on
 system time.

 pmcstat:

 @ CPU_CLK_UNHALTED_CORE [194841 samples]

 42.65%  [83102]    _mtx_lock_sleep @ /boot/kernel/kernel
  40.97%  [34051]     pmap_enter
  100.0%  [34051]      vm_fault_hold
   100.0%  [34051]       trap_pfault
  30.40%  [25262]     vm_page_activate
  100.0%  [25262]      vm_fault_hold
   100.0%  [25262]       trap_pfault
  18.41%  [15300]     vm_pageq_remove
  73.63%  [11266]      vm_page_free_toq
   70.69%  [7964]        vm_object_terminate
   29.31%  [3302]        vm_object_page_remove
  26.37%  [4034]       vm_fault_hold
   100.0%  [4034]        trap_pfault

 make -j 8:

 15.44%  [10740]    _mtx_lock_sleep @ /boot/kernel/kernel
  38.77%  [4164]      pmap_enter
  99.93%  [4161]       vm_fault_hold
   100.0%  [4161]        trap_pfault
  00.07%  [3]          kmem_back
   100.0%  [3]           kmem_malloc
  27.98%  [3005]      vm_page_activate
  100.0%  [3005]       vm_fault_hold
   100.0%  [3005]        trap_pfault
  20.64%  [2217]      vm_pageq_remove
  64.73%  [1435]       vm_page_free_toq
   63.41%  [910]         vm_object_terminate

 make -j 4

 06.86%  [4222]     pagezero @ /boot/kernel/kernel
  98.39%  [4154]      trap_pfault
  100.0%  [4154]       trap
  01.11%  [47]        vm_fault_hold

 ___
 freebsd-performa...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-performance
 To unsubscribe, send any mail to freebsd-performance-unsubscr...@freebsd.org



-- 
   “The real damage is done by those millions who want to 'get by.'
The ordinary men who just want to be left in peace. Those who don’t
want their little lives disturbed by anything bigger than themselves.
Those with no sides and no causes. Those who won’t take measure of
their own strength, for fear of antagonizing their own weakness. Those
who don’t like to make waves—or enemies.

   Those for whom freedom, honour, truth, and principles are only
literature. Those who live small, love small, die small. It’s the
reductionist approach to life: if you keep it small, you’ll keep it
under control. If you don’t make any noise, the bogeyman won’t find
you.

   But it’s all an illusion, because they die too, those people who
roll up their spirits into tiny little balls so as to be safe. Safe?!
From what? Life is always on the edge of death; narrow streets lead to
the same place as wide avenues, and a little candle burns itself out
just like a flaming torch does.

   I choose my own way to burn.”

   Sophie Scholl


segfault in vfscanf(3): clang and __restrict usage

2012-04-24 Thread Jean-Sébastien Pédron

Hi everyone,

vfscanf(3) in HEAD (r234606) segfaults when compiled with clang. For
instance, here is a call made in cmake which crashes:
    fscanf(f, "%*[^\n]\n");

The same libc, compiled with GCC, doesn't segfault.

When it encounters a character class, __svfscanf() calls convert_ccl():

static const int suppress;
#define SUPPRESS_PTR    ((void *)&suppress)

static __inline int
convert_ccl(FILE *fp, char * __restrict p, [...])
{
    [...]

    if (p == SUPPRESS_PTR) {
        [...]
    } else {
        [...]
    }

    [...]
}

In this case, there's no argument following the format string, and
convert_ccl is called with p = SUPPRESS_PTR. Therefore, we should
enter the if{} block. But when compiled with clang, we enter the
else{} block (causing the segfault).

I made a small program that shows the problem (attached): it seems to
be related to the __restrict qualifier.

Compiled with GCC:
./ptr-comp
p=0x600ac8 vs. SUPPRESS_PTR=0x600ac8
p == SUPPRESS_PTR

Compiled with clang:
./ptr-comp
p=0x4007dc vs. SUPPRESS_PTR=0x4007dc
p != SUPPRESS_PTR - WRONG

From what I understand about __restrict, it indicates that the pointer
is the only one pointing to a resource. In vfscanf.c, suppress may
be pointed to by several pointers at a time, so I think __restrict here
is incorrect. But I'm really not sure I got it right. And I don't know
either whether clang's behavior is expected.
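
For illustration only: one way to make such a helper independent of how a compiler treats comparisons on a __restrict pointer is to pass the suppression request as an explicit flag instead of a sentinel address. This is a hypothetical sketch, not the libc code. The name scan_until_newline and its parameters are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: consume characters from `in` up to (not including)
 * the first newline. When `suppress` is nonzero nothing is stored and
 * `out` is never touched; otherwise the span is copied to `out`,
 * truncated to fit and NUL-terminated. Returns the number of input
 * characters consumed.
 */
static size_t
scan_until_newline(const char *in, char * __restrict out, size_t outlen,
    int suppress)
{
	size_t n = strcspn(in, "\n");

	if (!suppress) {
		size_t ncopy;

		assert(out != NULL && outlen > 0);
		ncopy = (n < outlen - 1) ? n : outlen - 1;
		memcpy(out, in, ncopy);
		out[ncopy] = '\0';
	}
	return (n);
}
```

With the flag, the suppressed case never inspects the pointer value at all, so no pointer comparison is needed to pick the branch.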

What do you think?

-- 
Jean-Sébastien Pédron

#include <stdio.h>

static const int suppress;
#define SUPPRESS_PTR    ((void *)&suppress)

void
func(char * __restrict p)
{

    printf("p=%p vs. SUPPRESS_PTR=%p\n", p, SUPPRESS_PTR);

    if (p == SUPPRESS_PTR)
        printf("p == SUPPRESS_PTR\n");
    else
        printf("p != SUPPRESS_PTR - WRONG\n");
}

int
main(int argc, char *argv [])
{
    char *p;

    p = SUPPRESS_PTR;
    func(p);

    return (0);
}

PROG = ptr-comp

.include <bsd.prog.mk>

Re: pmap and mtx scalability problem

2012-04-24 Thread Slawa Olhovchenkov
On Tue, Apr 24, 2012 at 09:27:30PM +0200, K. Macy wrote:

 Known problem. There is an open disagreement about how to improve the
 granularity of locking in pmap.

Split the locking into process-specific information and global information?
Use lock-free lists (I see TAILQ_INSERT_TAIL in pmap_enter)?

sorry for stupidity, if any.


Re: pmap and mtx scalability problem

2012-04-24 Thread K. Macy
No. I developed a patch from Jeffr that pushed the vm_page_lock array
down into the machine-dependent code, replacing most of the uses of
the single vm_page_queue_lock. However, alc doesn't like the design
and has not proposed an alternative.

-Kip
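
The general idea of replacing one global lock with an array of locks selected by page can be sketched in userland as follows. This is not the patch discussed above: the names (page_locks, page_lock_index, NPAGE_LOCKS), the 4 KB page size, and the array size are assumptions made for illustration, and C11 atomic_flag spinlocks stand in for kernel mutexes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define	PAGE_SHIFT	12	/* assumes 4 KB pages */
#define	NPAGE_LOCKS	64	/* hypothetical array size */

static_assert((NPAGE_LOCKS & (NPAGE_LOCKS - 1)) == 0,
    "NPAGE_LOCKS must be a power of two");

static atomic_flag page_locks[NPAGE_LOCKS];

/* Put every lock in the array into the unlocked state. */
static void
page_locks_init(void)
{
	for (int i = 0; i < NPAGE_LOCKS; i++)
		atomic_flag_clear(&page_locks[i]);
}

/* Map a physical address to the index of the lock covering its page. */
static unsigned int
page_lock_index(uint64_t pa)
{
	return ((unsigned int)((pa >> PAGE_SHIFT) & (NPAGE_LOCKS - 1)));
}

/* Spin until the lock covering `pa` is acquired. */
static void
page_lock(uint64_t pa)
{
	atomic_flag *l = &page_locks[page_lock_index(pa)];

	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin */
}

/* Try once; returns nonzero if the lock covering `pa` was acquired. */
static int
page_trylock(uint64_t pa)
{
	atomic_flag *l = &page_locks[page_lock_index(pa)];

	return (!atomic_flag_test_and_set_explicit(l, memory_order_acquire));
}

/* Release the lock covering `pa`. */
static void
page_unlock(uint64_t pa)
{
	atomic_flag_clear_explicit(&page_locks[page_lock_index(pa)],
	    memory_order_release);
}
```

Faults on pages that map to different array slots no longer serialize on one mutex. Two pages whose indices collide still share a lock, so the array size bounds the achievable parallelism.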

On Tue, Apr 24, 2012 at 10:36 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 On Tue, Apr 24, 2012 at 09:27:30PM +0200, K. Macy wrote:

 Known problem. There is an open disagreement about how to improve the
 granularity of locking in pmap.

 split locking to process-specific information and global information?
 use lock-free lists (i see TAILQ_INSERT_TAIL in pmap_enter)?

 sorry for stupidity, if any.





Re: pmap and mtx scalability problem

2012-04-24 Thread Slawa Olhovchenkov
On Tue, Apr 24, 2012 at 10:43:08PM +0200, K. Macy wrote:

 No. I developed a patch from Jeffr that pushed the vm_page_lock array
 down in to the machine dependent code, replacing most of the uses of
 the single vm_page_queue_lock. However, alc doesn't like the design
 and has not proposed an alternative.

can i test this?

 -Kip
 
 On Tue, Apr 24, 2012 at 10:36 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
  On Tue, Apr 24, 2012 at 09:27:30PM +0200, K. Macy wrote:
 
  Known problem. There is an open disagreement about how to improve the
  granularity of locking in pmap.
 
  split locking to process-specific information and global information?
  use lock-free lists (i see TAILQ_INSERT_TAIL in pmap_enter)?
 
  sorry for stupidity, if any.
 
 
 


Re: pmap and mtx scalability problem

2012-04-24 Thread K. Macy
It's a bit dated at this point. Nonetheless, when gitorious is able to
give something other than 503 to my search queries I'll post it.

On Tue, Apr 24, 2012 at 10:45 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 On Tue, Apr 24, 2012 at 10:43:08PM +0200, K. Macy wrote:

 No. I developed a patch from Jeffr that pushed the vm_page_lock array
 down in to the machine dependent code, replacing most of the uses of
 the single vm_page_queue_lock. However, alc doesn't like the design
 and has not proposed an alternative.

 can i test this?

 -Kip

 On Tue, Apr 24, 2012 at 10:36 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
  On Tue, Apr 24, 2012 at 09:27:30PM +0200, K. Macy wrote:
 
  Known problem. There is an open disagreement about how to improve the
  granularity of locking in pmap.
 
  split locking to process-specific information and global information?
  use lock-free lists (i see TAILQ_INSERT_TAIL in pmap_enter)?
 
  sorry for stupidity, if any.





Re: pmap and mtx scalability problem

2012-04-24 Thread K. Macy
You can try these. Your mileage *will* vary.


https://gitorious.org/~kmm/freebsd/kmm-sandbox/commits/work/svn_release_8_1_0_page_lock

https://gitorious.org/~kmm/freebsd/kmm-sandbox/commits/work/svn_trunk_page_lock

On Tue, Apr 24, 2012 at 10:51 PM, K. Macy km...@freebsd.org wrote:
 It's a bit dated at this point. Nonetheless, when gitorious is able to
 give something other than 503 to my search queries I'll post it.

 On Tue, Apr 24, 2012 at 10:45 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 On Tue, Apr 24, 2012 at 10:43:08PM +0200, K. Macy wrote:

 No. I developed a patch from Jeffr that pushed the vm_page_lock array
 down in to the machine dependent code, replacing most of the uses of
 the single vm_page_queue_lock. However, alc doesn't like the design
 and has not proposed an alternative.

 can i test this?

 -Kip

 On Tue, Apr 24, 2012 at 10:36 PM, Slawa Olhovchenkov s...@zxy.spb.ru 
 wrote:
  On Tue, Apr 24, 2012 at 09:27:30PM +0200, K. Macy wrote:
 
  Known problem. There is an open disagreement about how to improve the
  granularity of locking in pmap.
 
  split locking to process-specific information and global information?
  use lock-free lists (i see TAILQ_INSERT_TAIL in pmap_enter)?
 
  sorry for stupidity, if any.





Re: Some performance measurements on the FreeBSD network stack

2012-04-24 Thread Bjoern A. Zeeb

On 24. Apr 2012, at 17:42 , Li, Qing wrote:

 
 I have a patch that has been sitting around for a long time due to
 review cycle latency that caches a pointer to the rtentry (and
 llentry) in the inpcb. Before each use the rtentry is checked
 against a generation number in the routing tree that is incremented
 on every routing table update.
 
 Hi Kip,
 
 Is there a public location for the patch ?
 What can be done to speedup the commit: testing ?
 
 Fabien
 
 I performed extensive review of this patch from Kip, and it was
 ready to go. Really good work. 
 
 Not sure what is stopping its commit into the tree.

Because there were leaks, there were 100% panics for IPv6, ... at least on
the version I had seen in autumn last year.

There is certainly no one more interested than me in these, esp. for v6
where the removal of route caching a long time ago made nd6_nud_hint() a NOP
with dst and rt being passed down as NULL only, and where we are doing up to
three route lookups in the output path if no cached rt is passed down along
from the ULP.

If there is an updated patch, I'd love to see it.

/bz

-- 
Bjoern A. Zeeb You have to have visions!
   It does not matter how good you are. It matters what good you do!
