On Thu, 15 Sep 2005, Tom Lane wrote:

> One thing that did seem to help a little bit was padding the LWLocks
> to 32 bytes (by default they are 24 bytes each on x86_64) and ensuring
> the array starts on a 32-byte boundary. This ensures that we won't have
> any LWLocks crossing cache lines --- contended access to such an LWLock
> would probably incur the sort of large penalty seen above, because you'd
> be trading two cache lines back and forth not one. It seems that the
> important locks are not split that way in CVS tip, because the gain
> wasn't much, but I wonder whether some effect like this might explain
> some of the unexplainable performance changes we've noticed in the past
> (eg, in the dbt2 results). A seemingly unrelated small change in the
> size of other data structures in shared memory might move things around
> enough to make a performance-critical lock cross a cache line boundary.
What about padding the LWLock to 64 bytes on these architectures? Both the P4 and the Opteron have 64-byte cache lines, IIRC. This would ensure that a cache line never holds two LWLocks.

Gavin