On 5/17/05, Hubert Chan <[EMAIL PROTECTED]> wrote:
> >>>>> "Michael" == Michael K Edwards <[EMAIL PROTECTED]> writes:
>
> Michael> I'd be surprised if it's that bad under NPTL, and if it is, I'd
> Michael> be surprised if it can't be substantially improved (at least on
> Michael> x86) with a little bit of oprofile work. When was that
> Michael> performance comparison done?
>
> The libgc dependency was only added recently, so upstream's performance
> comparison was done within the last couple of weeks. I just tried it
> out myself, and my own informal tests seem to agree with upstream's
> numbers. (I'm running sid, last updated a couple of weeks ago, on a
> 2.6.10 kernel.)
This is going to sound stupid, but have you tried it with either glibc
2.3.4 or Ubuntu's modified glibc? There are some threading-related
issues that I know were addressed very late in the hoary cycle -- as far
as I know, primarily pthread_cancel semantics, but they may have
performance implications too -- and the fix may not be in sid. If it's
inconvenient to compare, don't worry about it; glibc 2.3.4 will hit sid
not long after sarge releases, right?

Just to check: you are using the same compilation and linking scheme
for both builds you are benchmarking, right? -fPIC and dynamic library
thunks can add more than a little overhead to a slab-ish,
usually-available-from-free-list malloc(). Oh, and are there things in
the header files that change from macros / inline functions to real
function calls when you switch on threading?

Note also the tuning issues discussed in
http://www.hpl.hp.com/personal/Hans_Boehm/gc/scale.html , which
includes benchmarks done on a now ancient kernel (2.2.12). If you are
using -DPARALLEL_MARK on an SMP (or hyperthreaded) machine, and the
scheduler gets the processor affinity wrong for the dedicated marking
threads, I could see that having unfortunate performance consequences.
If you have compiled with -DTHREAD_LOCAL_ALLOC but are not using the
API in gc_local_alloc.h, you will be hurting.

In general, I think that the tool you need for this purpose is
oprofile. You want to see where threads are sleeping (if at all), how
much time is spent fiddling with spinlocks, etc. I'm a novice with
oprofile but I'll be needing to learn about it Real Soon Now. Ryan,
have you had occasion to throw this or other profiling tools at libgc?

Cheers,
- Michael
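
P.S. To make the "same compilation and linking scheme" point concrete,
here is a rough sketch of the kind of allocation timing loop that could
be built identically for both configurations. It is only an
illustration, not what upstream ran: time_alloc_loop, alloc_fn and
free_fn are hypothetical names made up for this example, and the sizes
and iteration counts are arbitrary.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical hooks (not part of any library): pass malloc/free for the
 * baseline build, or GC_malloc plus a no-op release for the libgc build. */
typedef void *(*alloc_fn)(size_t);
typedef void (*free_fn)(void *);

/* Time `iterations` allocate/release pairs of `size` bytes.
 * Returns the average cost per pair in nanoseconds, or -1.0 on failure. */
double time_alloc_loop(alloc_fn alloc, free_fn release,
                       long iterations, size_t size)
{
    struct timespec start, end;
    volatile unsigned char sink = 0;

    assert(iterations > 0 && size > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (long i = 0; i < iterations; i++) {
        unsigned char *p = alloc(size);
        if (p == NULL)
            return -1.0;
        p[0] = (unsigned char)i;  /* touch the block so the work is kept */
        sink ^= p[0];
        release(p);
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
              + (double)(end.tv_nsec - start.tv_nsec);
    return ns / (double)iterations;
}
```

Calling time_alloc_loop(malloc, free, 1000000, 64) from a small main()
in each build, with identical -fPIC and linking options, should at
least show whether the flags alone account for part of the gap. On
older glibc you may need -lrt for clock_gettime.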

