On Sun, 11 Jun 2006, Kris Kennaway wrote:

* The postgres processes seem to change their proctitle hundreds or thousands of times per second. This is currently done via a Giant-locked sysctl (kern.proc.args) so there is enormous contention for Giant. Even when this is fixed (thanks to a patch from csjp@), each of them requires a syscall and syscalls ain't free. This is not a clever thing to be doing from a performance standpoint.

You might consider disabling setproctitle() entirely to see what impact that has?

* pgsql uses select() and this seems to be a major choke point. I bet you'd see fairly impressive performance gains (especially on SMP) if it was modified to use kqueue instead of select.
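
To make the select()/kqueue difference concrete, here is a minimal sketch, not PostgreSQL code. The names (struct waiter, waiter_init, waiter_add, waiter_wait) are hypothetical. With kqueue, interest in a descriptor is registered once and the kernel keeps the set; with select(), the whole descriptor set is handed to the kernel on every call. A select() fallback is included only so the sketch also builds on systems without kqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define HAVE_KQUEUE 1
#endif

/* Hypothetical helper: remembers which descriptors we wait on. */
struct waiter {
#ifdef HAVE_KQUEUE
    int kq;        /* kernel event queue; the kernel holds the interest set */
#else
    fd_set fds;    /* select() fallback: set is copied in on every wait */
    int maxfd;
#endif
};

int waiter_init(struct waiter *w)
{
    assert(w != NULL);
#ifdef HAVE_KQUEUE
    w->kq = kqueue();
    return w->kq < 0 ? -1 : 0;
#else
    FD_ZERO(&w->fds);
    w->maxfd = -1;
    return 0;
#endif
}

/* Register read interest in fd. Done once per descriptor. */
int waiter_add(struct waiter *w, int fd)
{
#ifdef HAVE_KQUEUE
    struct kevent change;
    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    return kevent(w->kq, &change, 1, NULL, 0, NULL) < 0 ? -1 : 0;
#else
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, &w->fds);
    if (fd > w->maxfd)
        w->maxfd = fd;
    return 0;
#endif
}

/* Returns one readable descriptor, or -1 on timeout or error. */
int waiter_wait(struct waiter *w, int timeout_ms)
{
#ifdef HAVE_KQUEUE
    struct kevent ev;
    struct timespec ts = { .tv_sec = timeout_ms / 1000,
                           .tv_nsec = (timeout_ms % 1000) * 1000000L };
    int n = kevent(w->kq, NULL, 0, &ev, 1, &ts);  /* no set is passed in */
    return n > 0 ? (int)ev.ident : -1;
#else
    fd_set ready = w->fds;                        /* rebuilt every call */
    struct timeval tv = { .tv_sec = timeout_ms / 1000,
                          .tv_usec = (timeout_ms % 1000) * 1000 };
    if (select(w->maxfd + 1, &ready, NULL, NULL, &tv) <= 0)
        return -1;
    for (int fd = 0; fd <= w->maxfd; fd++)
        if (FD_ISSET(fd, &ready))
            return fd;
    return -1;
#endif
}
```

A caller would run waiter_init() and waiter_add() once at startup and then call only waiter_wait() in its main loop. Whether this helps pgsql in practice would have to be measured.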

* You really want to avoid using IPv6 for transport (since it's Giant-locked). This was an issue at first since I was running against localhost, which maps to ::1 by default. We should reconsider the preference for IPv6 over IPv4 until IPv6 is Giant-free - there are probably many other situations where IPv6 is being secretly used "because it is there" and costing performance.

FYI, for purely loopback traffic, it's probably safe to mark the IPv6 netisr as MPSAFE. Add NETISR_MPSAFE as a flag to the following line in ip6_input.c:

ip6_input.c:    netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, 0);
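
With the flag added in place of the final 0, the registration would presumably read as follows. This is a sketch of the one-line change described above, not a tested patch:

```c
/* sys/netinet6/ip6_input.c: mark the IPv6 netisr handler as MPSAFE */
netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, NETISR_MPSAFE);
```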

If you have non-loopback traffic, however, you may put yourself at greater risk of a panic in the IPv6 multicast and neighbor discovery code, so this should be done with caution. It might be an interesting exercise, though.
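
On the application side, the IPv6 path can be avoided without touching the kernel, either by connecting to 127.0.0.1 instead of localhost or by asking the resolver for IPv4 results only. A minimal sketch of the latter, assuming a hypothetical helper named resolve_ipv4 (not a PostgreSQL function) and 5432, the PostgreSQL default port, as the example service:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * Resolve host:port to an IPv4 address only.
 * Returns 0 and fills *out on success, -1 if no IPv4 address is found.
 */
int resolve_ipv4(const char *host, const char *port, struct sockaddr_in *out)
{
    struct addrinfo hints, *res = NULL;

    assert(host != NULL && port != NULL && out != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* never return AF_INET6 results */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0 || res == NULL)
        return -1;

    /* With ai_family = AF_INET, every result is a struct sockaddr_in. */
    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

Calling resolve_ipv4("localhost", "5432", &sa) should then never hand back ::1, whatever the resolver's default preference is.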

* The sysv IPC code is still giant-locked. pgsql makes a lot of semop() calls which grab Giant, and it also msleep()s on the Giant lock in the semwait channel.

It is likely quite easy to put subsystem locks around the System V IPC subsystems. I'm a bit surprised no one has done it already. sysvshm is a bit more tricky, but sysvsem and sysvmsg should be quite straightforward.

* When semop() wants to wake up some sleeping processes because semaphores have been released, it does a wakeup() and wakes them all up. This means a thundering herd (I see up to 11 CPUs being woken here). Since we know exactly how many resources are available, it would be better to only wakeup_one() that number of times instead.

Should be easy to experiment with.
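
For anyone who wants to get a feel for the effect before touching the kernel, here is a rough userland analogue using pthreads. It is not the sysv_sem.c code: pthread_cond_broadcast() stands in for wakeup(), pthread_cond_signal() for wakeup_one(), and every name (struct pool, pool_acquire, pool_release, demo) is hypothetical. The wakeups counter records how often a blocked thread was woken, so the two strategies can be compared.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical userland stand-in for the semaphore state. */
struct pool {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    int available;  /* free resources */
    int waiting;    /* threads that entered pthread_cond_wait() */
    long wakeups;   /* times a blocked thread was woken */
};

void pool_acquire(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->available == 0) {
        p->waiting++;
        pthread_cond_wait(&p->cv, &p->lock);
        p->waiting--;
        p->wakeups++;   /* woken; may still find nothing available */
    }
    p->available--;
    pthread_mutex_unlock(&p->lock);
}

/* herd != 0 mimics wakeup(); herd == 0 mimics n calls to wakeup_one(). */
void pool_release(struct pool *p, int n, int herd)
{
    assert(n > 0);
    pthread_mutex_lock(&p->lock);
    p->available += n;
    if (herd) {
        pthread_cond_broadcast(&p->cv);     /* wake every sleeper */
    } else {
        for (int i = 0; i < n; i++)
            pthread_cond_signal(&p->cv);    /* wake one per resource */
    }
    pthread_mutex_unlock(&p->lock);
}

static void *worker(void *arg)
{
    pool_acquire(arg);
    return NULL;
}

/* Block nthreads waiters, free resources in batches, return total wakeups. */
long demo(int nthreads, int batch, int herd)
{
    struct pool p = { .available = 0, .waiting = 0, .wakeups = 0 };
    pthread_t tid[64];

    assert(nthreads > 0 && nthreads <= 64 && batch > 0);
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cv, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &p);

    for (int left = nthreads; left > 0; ) {
        int n = batch < left ? batch : left;
        for (;;) {      /* wait until the remaining threads are asleep */
            pthread_mutex_lock(&p.lock);
            int blocked = p.waiting;
            pthread_mutex_unlock(&p.lock);
            if (blocked == left)
                break;
            sched_yield();
        }
        pool_release(&p, n, herd);
        left -= n;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&p.cv);
    pthread_mutex_destroy(&p.lock);
    return p.wakeups;
}
```

Comparing demo(8, 2, 0) against demo(8, 2, 1) should show the broadcast variant waking threads that immediately go back to sleep. The exact counts depend on scheduling, so treat them as indicative only.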

Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-performance@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance