On Fri, 3 Oct 2008, Danny Braniss wrote:

gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful.

The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the defaults work fine most of the time, so just use them. Turn the enable sysctl on just before you begin a run, and turn it off immediately afterwards. Make sure to reset between reruns (rebooting to a new kernel is fine too!).
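
As a rough sketch of that procedure (not something from the man page verbatim): the sysctl names below are the ones I believe LOCK_PROFILING(9) documents (debug.lock.prof.enable, debug.lock.prof.reset, debug.lock.prof.stats), while the function name profile_run, the SYSCTL override and the output-file argument are made up for illustration. It assumes a kernel built with "options LOCK_PROFILING" and that you run it as root.

```shell
# Sketch only: wrap one benchmark run with lock profiling.
# Assumes a kernel built with "options LOCK_PROFILING" and root privileges.
# SYSCTL may be overridden (e.g. SYSCTL=echo) to dry-run the sequence.
SYSCTL=${SYSCTL:-sysctl}

profile_run() {
    out=$1; shift                               # $1: stats file, rest: benchmark command
    $SYSCTL debug.lock.prof.reset=1             # clear counters left from earlier runs
    $SYSCTL debug.lock.prof.enable=1            # start collecting just before the run
    "$@"                                        # run the benchmark itself
    rc=$?
    $SYSCTL debug.lock.prof.enable=0            # stop collecting right afterwards
    $SYSCTL -n debug.lock.prof.stats > "$out"   # save the table for post-processing
    return $rc                                  # pass the benchmark's exit status through
}
```

Something like "profile_run 7.1-1000 ./mybench" (with ./mybench standing in for the real benchmark) would then leave the statistics in a file ready for post-processing.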

in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof
there are 3 files:
        7.1-100         host connected at 100 running -prerelease
        7.1-1000        same but connected at 1000
        7.0-1000        -stable with your 'patch'
at 100 my benchmark didn't suffer from the profiling, average was about 9.
at 1000 the benchmark got really hit, average was around 12 for the patched,
and 4 for the unpatched (less than at 100).

Interesting.  A bit of post-processing:

[EMAIL PROTECTED]:/tmp> cat 7.1-1000 | awk -F' ' '{print $3" "$9}' | sort -n | tail -10
2413283 /r+d/7/sys/kern/kern_mutex.c:141
2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
2676282 /r+d/7/sys/net/route.c:293
2754866 /r+d/7/sys/kern/vfs_bio.c:1468
3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
3318742 /r+d/7/sys/net/route.c:1584
3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
3753518 /r+d/7/sys/net/if_ethersubr.c:405
3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
[EMAIL PROTECTED]:/tmp> cat 7.0-1000 | awk -F' ' '{print $3" "$9}' | sort -n | tail -10
468631 /r+d/hunt/src/sys/nfsclient/nfs_nfsiod.c:286
501989 /r+d/hunt/src/sys/nfsclient/nfs_vnops.c:1148
631587 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1198
701155 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1258
718211 /r+d/hunt/src/sys/kern/kern_mutex.c:141
1118711 /r+d/hunt/src/sys/nfsclient/nfs_bio.c:1664
1169125 /r+d/hunt/src/sys/nfsclient/nfs_subs.c:1066
1222867 /r+d/hunt/src/sys/kern/vfs_bio.c:1468
3876072 /r+d/hunt/src/sys/netinet/udp_usrreq.c:545
5198927 /r+d/hunt/src/sys/netinet/udp_usrreq.c:864

The first set above is with the unmodified 7-STABLE tree, the second with a reversion of read locking on the UDP inpcb. The big blinking sign of interest is that the bge interface lock is massively contended in the first set of output, and basically doesn't appear in the second. There are various reasons bge could stand out quite so much -- one possibility is that previously, the udp lock serialized all access to the interface from the send code, preventing the send and receive paths from contending.
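
To make that comparison less eyeball-driven, the two lists can be lined up by source location. This is only a sketch: compare_hot is a made-up helper name, it expects two files in the "value path:line" form printed by the pipelines above, and it assumes that stripping everything up to the last "/sys/" is enough to make the two tree prefixes (/r+d/7 and /r+d/hunt/src) comparable. Line numbers that shifted between the two trees will not match up.

```shell
# Sketch only: line up two "value path:line" lists by source location,
# ignoring the differing source-tree prefixes.
compare_hot() {
    awk '
        { loc = $2; sub(/^.*\/sys\//, "sys/", loc) }   # drop the prefix up to /sys/
        FNR == NR { a[loc] = $1; next }                # rows from the first file
        { b[loc] = $1 }                                # rows from the second file
        END {
            # locations in the first list, with the second value or "-" if absent
            for (l in a) printf "%s %s %s\n", l, a[l], ((l in b) ? b[l] : "-")
            # locations that only show up in the second list
            for (l in b) if (!(l in a)) printf "%s - %s\n", l, b[l]
        }
    ' "$1" "$2" | LC_ALL=C sort
}
```

Each output line is "location first-value second-value", with "-" marking a location missing from that list, so a lock point such as the bge one that is hot in only one run shows up with a "-" in the other column.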

A few things to try:

- Let's compare the context switch rates on the two benchmarks.  Could
  you run vmstat and look at the cs column during the benchmarks and see how
  similar the two are as the benchmarks run?  You'll want to run it with
  vmstat -w 1 and collect several samples per benchmark, since we're really
  interested in the distribution rather than an individual sample.

- Is there any chance you could drop an if_em card into the same box and run
  the identical benchmarks with and without LOCK_PROFILING to see whether it
  behaves differently than bge when the patch is applied?  if_em's interrupt
  handling is quite different, and may significantly affect lock use, and
  hence contention.
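
For the vmstat suggestion, one way to look at the distribution rather than single samples is to save the "vmstat -w 1" output of each run to a file and summarise the cs column afterwards. A minimal sketch, with cs_summary being a made-up helper name; it assumes the header row and the data rows have the same number of whitespace-separated fields, and it reports the lower median when the sample count is even.

```shell
# Sketch only: summarise the "cs" column of saved "vmstat -w 1" output.
cs_summary() {
    awk '
        $1 !~ /^[0-9]+$/ {                 # header lines (vmstat repeats them)
            for (i = 1; i <= NF; i++) if ($i == "cs") col = i
            next
        }
        col { print $col }                 # data line: emit the cs value
    ' "$@" | sort -n | awk '
        { v[NR] = $1; sum += $1 }          # values arrive sorted ascending
        END {
            if (NR == 0) exit 1            # no samples found
            printf "n=%d min=%d median=%d max=%d mean=%.0f\n",
                   NR, v[1], v[int((NR + 1) / 2)], v[NR], sum / NR
        }'
}
```

Running it once per saved file (say, one for the patched and one for the unpatched kernel; the file names are up to you) gives two one-line summaries that can be compared directly.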

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable