On 5/26/05, Hubert Chan <[EMAIL PROTECTED]> wrote:
> My upstream reports that according to his measurements, disabling
> THREAD_LOCAL_ALLOC gives only a 5% performance hit instead of 15%. So
> it's much closer to the single-threaded case, but is still a bit
> slower.

Perhaps upstream is on a hyperthreaded CPU? If so, could you ask him to
try the same conversion to thread-local allocation with and without
-DPARALLEL_MARK? If he also sees that PARALLEL_MARK helps (or doesn't
hurt, anyway), perhaps Ryan should consider enabling it in the libgc
package.

Does the libgc-dev package contain a static library? If so, and if you
think the dynamic linking overhead matters, you could always link that
library statically at build time. Comparing profiling results between
the two is a fine way to find candidates for inlining.

Cheers,
- Michael