I've looked into Olivier Hubaut's recent reports of 'Too many open files' errors on OS X. What I find is that on Darwin, where we are using Posix semaphores rather than SysV semaphores, each Posix semaphore is treated as an open file: it shows up in "lsof" output, and more to the point it appears to count against a process's ulimit -n limit.

This means that if you are running with, say, max_connections = 100, that's 100+ open files in the postmaster and every active backend. And it's 100+ open files that aren't accounted for in fd.c's estimate of how many files it can open. Since the ulimit -n setting is by default only 256 on this platform, it doesn't take much at all for us to bump up against that limit.

fd.c copes fine, since it automatically closes other open files any time it gets an EMFILE error. But code outside fd.c is likely to fail hard ... which is exactly the symptom we saw in Olivier's report.
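
To make the "close something else and retry" behaviour concrete, here is a minimal standalone sketch of the idea. It is not fd.c's actual code: the pool, the function names, and the limit of 32 used in the demo are all assumptions made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for fd.c's set of files that may be closed on
 * demand. None of these names are the real fd.c API.
 */
#define POOL_SIZE 64

static int pool[POOL_SIZE];
static int pool_used = 0;

/* Remember a descriptor that may be closed later; returns 0 if pool is full. */
static int
pool_add(int fd)
{
    assert(fd >= 0);
    if (pool_used >= POOL_SIZE)
        return 0;
    pool[pool_used++] = fd;
    return 1;
}

/* Close the oldest pooled descriptor; returns 0 if the pool was empty. */
static int
pool_release_one(void)
{
    if (pool_used == 0)
        return 0;
    close(pool[0]);
    for (int i = 1; i < pool_used; i++)
        pool[i - 1] = pool[i];
    pool_used--;
    return 1;
}

/* open() that answers EMFILE/ENFILE by closing a pooled file and retrying. */
static int
open_with_retry(const char *path, int flags)
{
    for (;;)
    {
        int fd = open(path, flags);
        int save_errno = errno;

        if (fd >= 0)
            return fd;
        if ((save_errno != EMFILE && save_errno != ENFILE) ||
            !pool_release_one())
        {
            errno = save_errno;     /* report the original failure */
            return -1;
        }
    }
}

/*
 * Demo: lower the soft limit (32 is an arbitrary choice), exhaust it with
 * pooled files, then check that open_with_retry() still succeeds.
 * Returns 1 if the recovery worked, 0 otherwise.
 */
static int
demo_emfile_recovery(void)
{
    struct rlimit old_rl, rl;
    int fd, ok;

    if (getrlimit(RLIMIT_NOFILE, &old_rl) != 0)
        return 0;
    rl = old_rl;
    if (rl.rlim_cur > 32)
        rl.rlim_cur = 32;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;

    /* Fill the pool until the kernel refuses to hand out more descriptors. */
    while ((fd = open("/dev/null", O_RDONLY)) >= 0)
    {
        if (!pool_add(fd))
        {
            close(fd);
            break;
        }
    }
    ok = (fd < 0 && errno == EMFILE);

    fd = open_with_retry("/dev/null", O_RDONLY);
    ok = ok && (fd >= 0);
    if (fd >= 0)
        close(fd);

    /* Tidy up: drop the remaining pooled files and restore the old limit. */
    while (pool_release_one())
        ;
    setrlimit(RLIMIT_NOFILE, &old_rl);
    return ok;
}
```

Calling demo_emfile_recovery() from a small main() should return 1 on a typical Unix system. Code that calls open() or opendir() directly has no such pool to fall back on, which is why it fails hard when the limit is reached.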
I plan to apply some band-aid fixes to make that code more robust; for instance we can push all calls to opendir() into fd.c so that EMFILE can be handled by closing other open files. (And why does MoveOfflineLogs PANIC on this anyway? It's not critical code...)

However, it seems that the real problem here is that we are so far off base about how many files we can open. I wonder whether we should stop relying on sysconf() and instead try to make some direct probe of the number of files we can open. I'm imagining calling open() repeatedly until it fails at some point during postmaster startup, and then saving that result as the number-of-openable-files limit.

I also notice that OS X 10.3 seems to have working SysV semaphore support. I am tempted to change template/darwin to use SysV where available, instead of Posix semaphores. I wonder whether inheriting 100-or-so open file descriptors every time we launch a backend isn't in itself a nasty performance hit, quite aside from its effect on how many normal files we can open.

Comments anyone? There are a lot of unknowns here...

			regards, tom lane
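
For concreteness, the direct-probe idea above could look roughly like the following. This is only a sketch under assumptions: the function name, the use of /dev/null as the file to open, and the cap on how far to probe are all invented here, and none of it has been tried on Darwin.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Count how many more descriptors this process can open right now, giving
 * up after max_to_probe so that a very large limit does not slow startup.
 * Returns the count (possibly max_to_probe), or -1 on an unexpected error.
 * The name and the cap are assumptions made for this sketch.
 */
int
probe_openable_files(int max_to_probe)
{
    int *fds;
    int used = 0;
    int result;

    assert(max_to_probe > 0);
    fds = malloc(sizeof(int) * (size_t) max_to_probe);
    if (fds == NULL)
        return -1;

    while (used < max_to_probe)
    {
        int fd = open("/dev/null", O_RDONLY);

        if (fd < 0)
            break;
        fds[used++] = fd;
    }

    /* Hitting the cap or running out of descriptors are the expected exits. */
    if (used == max_to_probe || errno == EMFILE || errno == ENFILE)
        result = used;
    else
        result = -1;

    /* Release everything we grabbed; the probe must leave no trace. */
    while (used > 0)
        close(fds[--used]);
    free(fds);
    return result;
}
```

Because the probe counts only what is still available, anything the process already holds at that point (including semaphores that the platform counts as open files) is excluded automatically, which is the advantage over trusting sysconf().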