I now have updated versions of these patches, which correct some inconsistencies in approach (universal use of curcpu now, for example), remove some debugging code, and so on. I've received relatively little performance feedback on them, and would appreciate some. :-) I'm especially interested in whether they impact disk-I/O-related workloads, useful macrobenchmarks, etc. The latest patch is at:

http://www.watson.org/~robert/freebsd/netperf/20050425-uma-mbuf-malloc-critical.diff

The changes in the following files in the combined patch are intended to be broken out into separate patches, as desired, as follows:

kern_malloc.c           malloc.diff
kern_mbuf.c             mbuf.diff
uipc_mbuf.c             mbuf.diff
uipc_syscalls.c         mbuf.diff
malloc.h                malloc.diff
mbuf.h                  mbuf.diff
pcpu.h                  malloc.diff, mbuf.diff, uma.diff
uma_core.c              uma.diff
uma_int.h               uma.diff

I.e., the pcpu.h changes are a dependency for all of the remaining changes. As before, I'm interested both in the impact of the individual patches and in the net effect of all of the patches applied together.

Because this diff was generated by p4, patch may need some help in identifying the targets of each part of the diff.

Robert N M Watson

On Sun, 17 Apr 2005, Robert Watson wrote:


Attached please find three patches:

(1) uma.diff, which modifies the UMA slab allocator to use critical
   sections instead of mutexes to protect per-CPU caches.

(2) malloc.diff, which modifies the malloc memory allocator to use
   critical sections and per-CPU data instead of mutexes to store
   per-malloc-type statistics, coalescing the per-CPU values in the
   sysctl handler used to generate vmstat -m output (a sketch of this
   coalescing follows the list).

(3) mbuf.diff, which modifies the mbuf allocator to use per-CPU data and
   critical sections for statistics, instead of synchronization-free
   statistics which could result in substantial inconsistency on SMP
   systems.
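
To illustrate the coalescing idea in (2), here is a minimal sketch (not
code from the patch; the counter and handler names are hypothetical):
each CPU updates its own slot, and the values are only summed when the
sysctl observer reads them.

    #include <sys/param.h>
    #include <sys/smp.h>
    #include <sys/sysctl.h>

    /* Hypothetical per-CPU counter, one slot per possible CPU. */
    static u_long example_count[MAXCPU];

    /*
     * Sum the per-CPU values only when the observer kicks in, rather
     * than maintaining a single mutex-protected counter on every
     * allocation and free.
     */
    static int
    sysctl_example_count(SYSCTL_HANDLER_ARGS)
    {
            u_long total;
            int i;

            total = 0;
            for (i = 0; i <= mp_maxid; i++)
                    total += example_count[i];
            return (sysctl_handle_long(oidp, &total, 0, req));
    }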

These changes are facilitated by John Baldwin's recent re-introduction of critical section optimizations that permit critical sections to be implemented "in software", rather than using the hardware interrupt disable mechanism, which is quite expensive on modern processors (especially Xeon P4 CPUs). While not identical, this is similar to the softspl behavior in 4.x, and Linux's preemption disable mechanisms (and various other post-Vax systems :-)).
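
As a minimal sketch of the pattern this enables (hypothetical names, not
code from the patches), a per-CPU statistic can be updated with only a
software critical section pinning the thread to its CPU:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/pcpu.h>

    /* Hypothetical per-CPU counter, one slot per possible CPU. */
    static u_long example_count[MAXCPU];

    static void
    example_count_inc(void)
    {
            /*
             * critical_enter() suppresses preemption in software, so
             * the thread can neither migrate to another CPU nor be
             * preempted by another thread using the same slot; no
             * interrupt-disable instruction and no mutex are needed.
             */
            critical_enter();
            example_count[curcpu]++;
            critical_exit();
    }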

The reason this is interesting is that it allows synchronization of per-CPU data to be performed at a much lower cost than previously, and consistently across UP and SMP systems. Prior to these changes, the use of critical sections and per-CPU data as an alternative to mutexes would lead to an improvement on SMP, but not on UP. So, that said, here's what I'd like us to look at:

- Patches (1) and (2) are intended to improve performance by reducing the
 overhead of maintaining cache consistency and statistics for UMA and
 malloc(9), and may universally impact performance (in a small way) due
 to the breadth of their use through the kernel.

- Patch (3) is intended to restore consistency to statistics in the
 presence of SMP and preemption, at the possible cost of some
 performance.

I'd like to confirm that for the first two patches, for interesting workloads, performance generally improves, and that stability doesn't degrade. For the third patch, I'd like to quantify the cost of the changes for interesting workloads, and likewise confirm no loss of stability.

Because these will have a relatively small impact, a fair amount of caution is required in testing. We may be talking about a percent or two, maybe four, difference in benchmark performance, and many benchmarks have a higher variance than that.

A couple of observations for those interested:

- The INVARIANTS panic with UMA seen in some earlier patch versions is
 believed to be corrected.

- Right now, because I use arrays of foo[MAXCPU], I'm concerned that
 different CPUs will be writing to the same cache line, as the entries
 for different CPUs are adjacent in memory (false sharing).  Moving to
 per-CPU chunks of memory to hold this stuff is desirable, but I think
 first we need to identify a model by which to do that cleanly.  I'm not
 currently enamored of the 'struct pcpu' model, since it makes us very
 sensitive to ABI changes, as well as not offering a model by which
 modules can register new per-CPU data cleanly.  I'm also inconsistent
 about how I dereference into the arrays, and intend to move to using
 'curcpu' throughout.  (A padding-based mitigation is sketched after
 this list.)

- Because mutexes are no longer used in UMA, nor in the other two
 patches, stats that are read across different CPUs and then coalesced
 may be slightly inconsistent.  I'm not all that concerned about it, but
 it's worth thinking on.

- Malloc stats for realloc() are still broken if you apply this patch.

- High watermarks are no longer maintained for malloc, since they
 require a global notion of "high" that is tracked continuously (i.e.,
 at each change), and there's no longer a global view except when the
 observer kicks in (sysctl).  You can imagine various models to restore
 some notion of a high watermark, but I'm not currently sure which is
 best.  The high watermark notion is desirable, though.  (One possible
 approximation is sketched after this list.)
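
On the cache line concern above, one common mitigation is to pad each
CPU's slot out to a cache line so that no two CPUs ever write the same
line.  A sketch follows; the struct and the CACHE_LINE_SIZE constant are
assumptions for illustration, not part of the patches:

    #include <sys/param.h>

    /*
     * Hypothetical padded per-CPU statistics slot.  Aligning the
     * struct to the cache line size also rounds its size up to a
     * multiple of it, so array entries land on distinct lines.
     */
    struct example_pcpu_stats {
            u_long  eps_count;
    } __aligned(CACHE_LINE_SIZE);

    static struct example_pcpu_stats example_stats[MAXCPU];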
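
And on the high watermark point, one imaginable model (purely
illustrative, with hypothetical names) is to let the observer itself
ratchet the watermark: coalesce the per-CPU counts in the sysctl handler
and record the largest total seen.  Peaks between observations are
missed, so this yields only a lower bound on the true watermark:

    #include <sys/param.h>
    #include <sys/smp.h>
    #include <sys/sysctl.h>

    static long example_count[MAXCPU];  /* per-CPU; may go negative locally */
    static u_long example_high;         /* ratcheted only at observation time */

    static int
    sysctl_example_high(SYSCTL_HANDLER_ARGS)
    {
            long total;
            int i;

            total = 0;
            for (i = 0; i <= mp_maxid; i++)
                    total += example_count[i];
            /* Racy ratchet; tolerable for an illustrative sketch. */
            if (total > 0 && (u_long)total > example_high)
                    example_high = (u_long)total;
            return (sysctl_handle_long(oidp, &example_high, 0, req));
    }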

So this is a request for:

(1) Stability testing of these patches.  Put them on a machine, make them
   hurt.  If things go south, try applying the patches one by one until
   it's clear which is the source.

(2) Performance testing of these patches, subject to the testing
   challenges described above.  If you are interested, please test each
   patch separately to evaluate its impact on your system, then apply
   them all together and see how it evens out.  You may find that the
   cost of the mbuf allocator patch outweighs the benefits of the other
   two patches; if so, that is interesting and something to work on!

I've done some micro-benchmarking using tools like netblast, syscall_timing, etc., but I'm particularly interested in the impact on macrobenchmarks.

Thanks!

Robert N M Watson