On Sat, May 1, 2010 at 12:01 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The thread here
> http://archives.postgresql.org/pgsql-admin/2010-04/msg00358.php
> shows that current OS X contains the same issue that was complained of
> a year or so ago with respect to NetBSD.  Namely, that if shmget finds
> an existing shared memory segment that is smaller than the current
> request, it will return EINVAL, rather than EEXIST which is what
> InternalIpcMemoryCreate is expecting to get for a collision.  This
> leads to an unnecessary startup failure with a completely misleading
> error message.  It's easy to reproduce on a Mac:
>
> 1. kill -9 an existing postmaster.
> 2. edit postgresql.conf to increase max_connections by 1.
> 3. try to start postmaster.
>
> You get
>
> FATAL:  could not create shared memory segment: Invalid argument
> DETAIL:  Failed system call was shmget(key=5432001, size=29622272, 03600).
> HINT:  This error usually means that PostgreSQL's request for a shared memory 
> segment exceeded your kernel's SHMMAX parameter.  You can either reduce the 
> request size or reconfigure the kernel with larger SHMMAX.  To reduce the 
> request size (currently 29622272 bytes), reduce PostgreSQL's shared_buffers 
> parameter (currently 3072) and/or its max_connections parameter (currently 
> 105).
>        If the request size is already small, it's possible that it is less 
> than your kernel's SHMMIN parameter, in which case raising the request size 
> or reconfiguring SHMMIN is called for.
>        The PostgreSQL documentation contains more information about shared 
> memory configuration.
>
> In the previous go-round, the misleading errno was reported to NetBSD
> as a kernel bug.  I see from their CVS that they did fix it: see 1.113 in
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/sysv_shm.c
>
> But what now seems clear to me is that this behavior probably exists
> in *every* BSD-derived kernel.  It's unlikely that we can get them all
> fixed, especially in view of the POSIX standard's wording saying that a
> kernel's order of error checking is not guaranteed.  It'd be smarter for
> us to install a workaround.
>
> The workaround I'm thinking of is, when we see EINVAL, to try another
> shmget with the same key and flags, and size zero.  If this results in
> EEXIST or EACCES then handle it as a collision.  Otherwise clean up the
> new segment (if we managed to make one, which is unlikely) and report
> the original EINVAL.  This depends on the knowledge that these kernels
> don't check the size against shmmin/shmmax in the code path where
> there's an existing segment, so we will not get an EINVAL on the basis
> of the size and will instead see an errno that reflects the collision,
> if there is one.
>
> Comments?

It seems reasonable, though I couldn't speak to whether it's going to
fully solve the problem.
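
For concreteness, here is a minimal sketch of the retry described above, as
I read it. It is an illustration only, not the actual
InternalIpcMemoryCreate code: the names create_segment and shm_result are
made up for this example, and it assumes the segment is requested with
IPC_CREAT | IPC_EXCL | 0600 (the 03600 shown in the error detail).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical result codes for this sketch; not PostgreSQL identifiers. */
typedef enum
{
    SHM_CREATED,    /* new segment created, id stored in *shmid_out */
    SHM_COLLISION,  /* key already in use; caller should try another key */
    SHM_FAILED      /* genuine failure; errno holds the original error */
} shm_result;

/*
 * Try to create a segment.  On EINVAL, probe again with size zero to find
 * out whether the key is already taken, since BSD-derived kernels may
 * report EINVAL instead of EEXIST when the existing segment is smaller
 * than the request.
 */
static shm_result
create_segment(key_t key, size_t size, int *shmid_out)
{
    int flags = IPC_CREAT | IPC_EXCL | 0600;
    int shmid;
    int saved_errno;

    assert(shmid_out != NULL);

    shmid = shmget(key, size, flags);
    if (shmid >= 0)
    {
        *shmid_out = shmid;
        return SHM_CREATED;
    }

    /* The errnos a well-behaved kernel gives for a key collision. */
    if (errno == EEXIST || errno == EACCES)
        return SHM_COLLISION;

    if (errno == EINVAL)
    {
        saved_errno = errno;

        /* Same key and flags, size zero: the size can no longer be at fault. */
        shmid = shmget(key, 0, flags);
        if (shmid < 0 && (errno == EEXIST || errno == EACCES))
            return SHM_COLLISION;

        /* Unlikely: the probe made a segment, so remove it again. */
        if (shmid >= 0)
            shmctl(shmid, IPC_RMID, NULL);

        /* Report the original EINVAL, not whatever the probe returned. */
        errno = saved_errno;
    }

    return SHM_FAILED;
}
```

A caller would treat SHM_COLLISION exactly as it treats EEXIST today (inspect
the old segment or move on to the next key), and would only emit the
SHMMAX/SHMMIN hint when the result is SHM_FAILED with errno still EINVAL.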

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
