[PATCHES] Feature: POSIX Shared memory support
On Mac OS X and other BSDs, the default System V shared memory limits are often very low and require adjustment for acceptable performance. In particular, when Postgres is included as part of larger end-user-friendly software products, these kernel settings are often difficult to change, for two reasons:

1. The (arbitrarily) limited resources must be shared by all programs that use System V shared memory. For example, on my Mac OS X computer I have Postgres running a standalone database, but also as part of Apple Remote Desktop. Without manual adjustment, running both simultaneously causes one of them to fail. Correcting this in any robust way is challenging to automate for consumer-style (i.e. Mac) installers.

2. On these BSDs, System V shared memory is wired down and cannot be swapped out for any reason. If Postgres is running as part of another software program or is a lower priority, other programs cannot use the potentially limited memory.

This places the user or developer in the tricky position of having to minimize overall system impact while permitting enough shared memory for Postgres to perform well.

To this end, I have ported the sysv_shmem.c layer to use the POSIX calls (which are in some ways more robust w.r.t. reducing collisions, since they use strings as shared memory IDs instead of ints).

In principle, this should not have any significant effect on performance. Running pgbench on a few different load types gives very similar results (-3%/+1%), with differences that are not statistically significant. Of course, on an untuned Mac OS X machine (where the original SysV version is limited to the default 4MB), the POSIX version performs significantly better (+250%).

Using the POSIX calls helps minimize the kernel side of the tuning, which is a big plus for integrated uses of Postgres, but also for other amateur installations (e.g. Fink).
If this is appropriate for the distribution, it could become a 'contrib' add-on, or it could be an autoconf custom build option until it reaches greater maturity.

Any thoughts? Suggestions? I would also appreciate any advice on more sophisticated ways to measure the performance impact of a change like this.

Thanks,
Chris Marcellino
Apple Computer, Inc.

posix_shmem.c
Description: Binary data

src/backend/port/posix_shmem.c
===================================================================
/*-------------------------------------------------------------------------
 *
 * posix_shmem.c
 *	  Implement shared memory using POSIX facilities
 *
 * These routines represent a fairly thin layer on top of POSIX shared
 * memory functionality.
 *
 * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include <signal.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#ifdef HAVE_KERNEL_OS_H
#include <kernel/OS.h>
#endif

#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"

#define IPCProtection	(0600)	/* access/modify by user only */
#define IPCNameLength	32		/* must be long enough to contain all possible
								 * format strings; see GenerateIPCName */

unsigned long UsedShmemSegID = 0;
void	   *UsedShmemSegAddr = NULL;

static void GenerateIPCName(int memKey, char *dest);
static void *InternalIpcMemoryCreate(int memKey, Size size);
static void IpcMemoryDetach(int status, Datum shmaddr);
static void IpcMemoryDelete(int status, Datum memKey);
static PGShmemHeader *PGSharedMemoryAttach(int key);

/*
 *	GenerateIPCName(key, dest)
 *
 * Generate a shared memory object key name using the argument key.
 * This uses the magic number and text to prevent collisions from other
 * apps.
 */
static void
GenerateIPCName(int memKey, char *dest)
{
	/* This must be 31 characters or less for portability (i.e. Mac OS X) */
	sprintf(dest, "PostgreSQL.%lx.%lx", PGShmemMagic, memKey);
}

/*
 *	InternalIpcMemoryCreate(memKey, size)
 *
 * Attempt to create a new shared memory segment with the specified key.
 * Will fail (return NULL) if such a segment already exists.  If successful,
 * attach the segment to the current process and return its attached address.
 * On success, callbacks are registered with on_shmem_exit to detach and
 * delete the segment when on_shmem_exit is called.
 *
 * If we fail with a failure code other than collision-with-existing-segment,
 * print out an error and abort.  Other types of errors are not recoverable.
 */
static void *
InternalIpcMemoryCreate(int memKey, Size size)
{
	int			fd;
	void	   *memAddress;
	char		keyName[IPCNameLength];
	struct stat statbuf;
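The "create, but fail if the segment already exists" behaviour described in the InternalIpcMemoryCreate comment can be sketched with shm_open(). This is only a minimal illustration under stated assumptions, not the code from the attached patch (which is cut off in this archive): the helper name create_segment_exclusive is hypothetical, and error handling is reduced to returning NULL with errno set.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a brand-new POSIX shared
 * memory object of the given size and map it into this process.  Returns
 * NULL with errno == EEXIST if an object of that name already exists.
 */
static void *
create_segment_exclusive(const char *name, size_t size)
{
	int			fd;
	int			save_errno;
	void	   *addr;

	/* O_EXCL makes creation fail instead of silently reusing an old object */
	fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
	if (fd < 0)
		return NULL;

	/* A new object has length zero; give it the requested size */
	if (ftruncate(fd, (off_t) size) < 0)
	{
		save_errno = errno;
		close(fd);
		shm_unlink(name);
		errno = save_errno;
		return NULL;
	}

	addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	save_errno = errno;
	close(fd);					/* the mapping stays valid after close */
	if (addr == MAP_FAILED)
	{
		shm_unlink(name);
		errno = save_errno;
		return NULL;
	}
	return addr;
}
```

A caller would pass a name such as the one GenerateIPCName builds, treat EEXIST as a collision, and treat any other failure as fatal. Note that nothing in this sketch says whether another process still has an existing object mapped. On some older C libraries, shm_open() lives in librt and needs -lrt at link time.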
Re: [PATCHES] Feature: POSIX Shared memory support
Chris Marcellino writes:
> To this end, I have ported the svsv_shmem.c layer to use the POSIX calls (which are some ways more robust w.r.t reducing collision by using strings as shared memory id's, instead of ints).

This has been suggested before, and rejected before, on the grounds that the POSIX API provides no way to detect whether anyone else is attached to the segment. Not being able to tell that is a tremendous robustness hit for us. We are not going to risk destroying someone's database (or in the alternative, failing to restart after most crashes, which it looks like your patch would do) in order to make installation fractionally easier.

I read through your patch in the hopes that you had a solution for this, but all I find is a copied-and-pasted comment

/*
 * We detect whether a shared memory segment is in use by seeing whether
 * it (a) exists and (b) has any processes are attached to it.
 */

followed by code that does no such thing.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Feature: POSIX Shared memory support
Tom Lane wrote:
> This has been suggested before, and rejected before, on the grounds that the POSIX API provides no way to detect whether anyone else is attached to the segment. Not being able to tell that is a tremendous robustness hit for us.

Just an idea, but would it be possible to have a small SysV area as an advisory lock (using the existing semantics) to protect the POSIX segment?

Best Regards
Michael Paesold
Re: [PATCHES] Feature: POSIX Shared memory support
Tom, that is definitely a valid point, and thanks for the feedback. I assumed that the 'more modern' string segment naming gave the POSIX methods an edge in avoiding collisions with other apps.

As far as detecting a) whether anyone else is currently attached to that segment and b) whether an earlier instance of the current backend is still attached to a segment, I presumed that checking the PIDs of the backend that owns the shared memory segment and checking the data directory (both of which the SysV code already does) would suffice. What am I forgetting?

Michael, that is an interesting idea. That might be an avenue to explore if there isn't a simpler way.

Thanks,
Chris Marcellino

On Feb 6, 2007, at 7:51 AM, Michael Paesold wrote:
> Just an idea, but would it be possible to have a small SysV area as an advisory lock (using the existing semantics) to protect the POSIX segment.
Re: [PATCHES] Feature: POSIX Shared memory support
Chris Marcellino wrote:
> As far as detecting a) whether anyone else is currently attached to that segment and b) whether an earlier existence of the current backend was still attached to a segment, I presumed that checking the pid's of the backend that owns the shared memory segment and checking the data directory (both which the SysV code already does) would suffice?

Is there an API call to list all PIDs that are connected to a particular segment?

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PATCHES] Feature: POSIX Shared memory support
To my knowledge, there is unfortunately no portable call that does that. I was actually referring to the check that the current SysV code does on the PID that is stored in the shmem header. I presume that if the backend is dead, kill(hdr->creatorPID, 0) returning zero would suffice for confirming the existence of the other backend process.

Chris Marcellino

On Feb 6, 2007, at 10:32 AM, Alvaro Herrera wrote:
> Is there an API call to list all PIDs that are connected to a particular segment?
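For readers unfamiliar with the kill(pid, 0) idiom mentioned above, here is a minimal sketch of a liveness probe. The function name process_exists is hypothetical, and this is not the check as written in sysv_shmem.c; it only shows how signal 0 tests for the existence of a process without delivering anything to it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if a process with this PID exists,
 * 0 if it does not, and -1 on an unexpected error.
 */
static int
process_exists(pid_t pid)
{
	/* Signal 0 performs the existence and permission checks only */
	if (kill(pid, 0) == 0)
		return 1;
	if (errno == EPERM)
		return 1;				/* exists, but belongs to another user */
	if (errno == ESRCH)
		return 0;				/* no such process */
	return -1;
}
```

The probe answers the question for exactly one PID, so it cannot reveal other processes that may still be attached to a segment, and a recycled PID can produce a false positive.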
Re: [PATCHES] Feature: POSIX Shared memory support
Chris Marcellino writes:
> I was actually referring to the check that the current SysV code does on the pid that is stored in the shmem header. I presume that if the backend is dead, the kill(hdr->creatorPID, 0) returning zero would suffice for confirming the existence of the other backend process.

No, that's not relevant, because only the postmaster's PID will be there --- that test is actually more or less redundant with the existing postmaster.pid lockfile checks. The thing that the SysV attachment count is useful for is detecting whether there are orphaned backends still alive in the database (and potentially changing it, hence the danger).

We've speculated on occasion about using file locking in some form as a substitute mechanism for detecting this, but that seems to just bring its own set of not-too-portable assumptions.

regards, tom lane
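The attachment count referred to above is the shm_nattch field that System V reports through shmctl(IPC_STAT). The following is a minimal sketch of reading it; the helper name segment_attach_count is hypothetical, and it is not taken from the PostgreSQL source.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: return the number of current attachments to a
 * System V shared memory segment, or -1 if the segment cannot be queried.
 */
static long
segment_attach_count(int shmid)
{
	struct shmid_ds ds;

	if (shmctl(shmid, IPC_STAT, &ds) < 0)
		return -1;
	/* The kernel maintains this count across attach, detach, and exit */
	return (long) ds.shm_nattch;
}
```

Because the kernel keeps the count up to date even when a process dies without cleaning up, a nonzero value at startup means some process, such as an orphaned backend, is still attached. POSIX shared memory objects have no equivalent field, which is the gap this thread is discussing.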
[PATCHES] WIP: Recursive Queries
Earlier I started working on recursive queries. Unfortunately, due to other work, I haven't had a chance to look at it in about a week, and I probably won't be back on it for another week.

Here's a work-in-progress patch for review. If anyone else wants to hack on it while I'm distracted, that's fine. Hopefully there'll still be some work for me to do when I get back to it :)

I haven't written up a plan for the next step yet, so there's some planning work to be done first. I think the first thing is to force the subqueries to be planned individually as Initplans instead of being inlined directly into the query, to avoid duplicated plans (though a cost-based heuristic is needed to determine when inlining is advantageous). Once we have that, the patch might actually be worth applying, as it represents good support for non-recursive WITH clauses.

If not, I would love to get some feedback on whether I'm on the wrong track anywhere. In particular, the issues I'm nervous about are whether I'm storing the right kinds of information in the right places, such as when it comes to using the pstate versus nodes of the parse tree, and whether I'm doing work at the right phase when it comes to parse versus analysis/transformation versus postponing work for later in the optimizer and executor.

One thing that isn't working the way I expected: I thought storing the list of common table expression names in scope in the pstate would have the right semantics, and it doesn't. It has to build up a list of CTE names that are in scope from all parent scopes. pstates are inherited when parsing subqueries and reset to saved copies afterwards, which seemed like the right place.
However:

postgres=# with a(x) as (select 1) select * from a;
 x
---
 1
(1 row)

postgres=# with a(x) as (select 1) select * from (select * from a) as x;
ERROR:  relation "a" does not exist

Index: src/backend/nodes/copyfuncs.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/nodes/copyfuncs.c,v
retrieving revision 1.360
diff -c -r1.360 copyfuncs.c
*** src/backend/nodes/copyfuncs.c	9 Jan 2007 02:14:11 -0000	1.360
--- src/backend/nodes/copyfuncs.c	30 Jan 2007 17:35:15 -0000
***************
*** 1808,1813 ****
--- 1808,1814 ----
  	COPY_NODE_FIELD(limitOffset);
  	COPY_NODE_FIELD(limitCount);
  	COPY_NODE_FIELD(lockingClause);
+ 	COPY_NODE_FIELD(with_cte_list);
  	COPY_SCALAR_FIELD(op);
  	COPY_SCALAR_FIELD(all);
  	COPY_NODE_FIELD(larg);
Index: src/backend/nodes/equalfuncs.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/nodes/equalfuncs.c,v
retrieving revision 1.294
diff -c -r1.294 equalfuncs.c
*** src/backend/nodes/equalfuncs.c	9 Jan 2007 02:14:12 -0000	1.294
--- src/backend/nodes/equalfuncs.c	30 Jan 2007 17:35:31 -0000
***************
*** 746,751 ****
--- 746,752 ----
  	COMPARE_NODE_FIELD(limitOffset);
  	COMPARE_NODE_FIELD(limitCount);
  	COMPARE_NODE_FIELD(lockingClause);
+ 	COMPARE_NODE_FIELD(with_cte_list);
  	COMPARE_SCALAR_FIELD(op);
  	COMPARE_SCALAR_FIELD(all);
  	COMPARE_NODE_FIELD(larg);
Index: src/backend/nodes/outfuncs.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/nodes/outfuncs.c,v
retrieving revision 1.292
diff -c -r1.292 outfuncs.c
*** src/backend/nodes/outfuncs.c	9 Jan 2007 02:14:12 -0000	1.292
--- src/backend/nodes/outfuncs.c	30 Jan 2007 17:36:05 -0000
***************
*** 1419,1424 ****
--- 1419,1425 ----
  	WRITE_NODE_FIELD(limitOffset);
  	WRITE_NODE_FIELD(limitCount);
  	WRITE_NODE_FIELD(lockingClause);
+ 	WRITE_NODE_FIELD(with_cte_list);
  	WRITE_ENUM_FIELD(op, SetOperation);
  	WRITE_BOOL_FIELD(all);
  	WRITE_NODE_FIELD(larg);
Index: src/backend/parser/analyze.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/parser/analyze.c,v
retrieving revision 1.355
diff -c -r1.355 analyze.c
*** src/backend/parser/analyze.c	9 Jan 2007 02:14:13 -0000	1.355
--- src/backend/parser/analyze.c	30 Jan 2007 16:03:59 -0000
***************
*** 2097,2102 ****
--- 2097,2105 ----
  	/* make FOR UPDATE/FOR SHARE info available to addRangeTableEntry */
  	pstate->p_locking_clause = stmt->lockingClause;
  
+ 	/* process the WITH clause (pull ctes into the pstate's ctenamespace) */
+ 	transformWithClause(pstate, stmt->with_cte_list);
+ 
  	/* process the FROM clause */
  	transformFromClause(pstate, stmt->fromClause);
  
Index: src/backend/parser/gram.y
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.573
diff -c -r2.573 gram.y
*** src/backend/parser/gram.y	9 Jan 2007 02:14:14 -0000	2.573
--- src/backend/parser/gram.y	30 Jan 2007 17:45:46 -0000
***************
*** 102,108 ****
  static SelectStmt *findLeftmostSelect(SelectStmt *node);
  static void
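The scoping behaviour the message says it wants (a CTE name visible in its own query level and in every nested subquery) can be sketched as a lookup that walks a chain of parent scopes. This is a standalone illustration only: the Scope struct and cte_in_scope function are hypothetical stand-ins, not PostgreSQL's ParseState or anything from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parse state; not PostgreSQL's ParseState. */
typedef struct Scope
{
	struct Scope *parent;		/* enclosing query level, or NULL */
	const char **cte_names;		/* names introduced by this level's WITH */
	int			n_cte_names;
} Scope;

/*
 * Return 1 if name is a CTE visible from scope.  The innermost level is
 * searched first, then each enclosing level in turn.
 */
static int
cte_in_scope(const Scope *scope, const char *name)
{
	for (; scope != NULL; scope = scope->parent)
	{
		for (int i = 0; i < scope->n_cte_names; i++)
		{
			if (strcmp(scope->cte_names[i], name) == 0)
				return 1;
		}
	}
	return 0;
}
```

With a lookup shaped like this, the failing query above would resolve "a" from the subquery's scope by following the parent link to the level that carries the WITH clause, without copying the name list into each child.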
Re: [PATCHES] [pgsql-patches] pg_get_domaindef
On Thu, 25 Jan 2007 02:25, Tom Lane wrote:
> Andrew Dunstan writes:
>> FAST PostgreSQL wrote:
>>> Please find attached the patch with modifications
>> are you proposing to implement the other functions in this TODO item (pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(), pg_get_tabledef(), pg_get_functiondef() ) ?
>
> I haven't entirely understood the use case for any of these. It's not pg_dump, for a number of reasons: one being that pg_dump still has to support older backend versions, and another being that every time we let backend SnapshotNow functions get involved, we take another hit to pg_dump's claim to produce a consistent MVCC snapshot.
>
> But my real objection is: do we really want to support duplicative code in both pg_dump and the backend? Updating pg_dump is already a major PITA whenever one adds a new feature; doubling that work isn't attractive. (And it'd be double, not just a copy-and-paste, because of the large difference in the operating environment.) So I want to hear a seriously convincing use-case that will justify the maintenance load we are setting up for ourselves. "Somebody might want this" is not adequate.
>
> Perhaps a better area of work would be the often-proposed refactoring of pg_dump into a library and driver program, wherein the library could expose individual functions such as "fetch the SQL definition of this object". Unfortunately, that'll be a huge project with no payoff until the end...
>
> regards, tom lane

Any consensus on these functions? If we decide against having these, then it's better to remove them from the TODO list temporarily/permanently.

Rgds,
Arul Shaji
Re: [PATCHES] Feature: POSIX Shared memory support
From: Chris Marcellino
> To this end, I have ported the svsv_shmem.c layer to use the POSIX calls (which are some ways more robust w.r.t reducing collision by using strings as shared memory id's, instead of ints).

I hope your work will be accepted. Setting IPC parameters is tedious for normal users, and they sometimes miss the manual article and hit the IPC resource shortage problem, particularly when the system developers run multiple instances on a single machine at the same time.

Then, how about semaphores? When I just do configure, PostgreSQL seems to use SysV semaphores. But POSIX semaphore implementation is prepared in src/backend/port/posix_sema.c. Why isn't it used by default? Does it have any problem?

# Windows is good in this point, isn't it?

I'm sorry to ask you a question even though I've not read your patch well. Does mmap(MAP_SHARED) need msync() to make the change by one process visible to other processes? I found the following in the manual page of mmap on Linux:

    MAP_SHARED
        Share this mapping with all other processes that map this object. Storing to the region is equivalent to writing to the file. The file may not actually be updated until msync(2) or munmap(2) are called.

BTW, is the number of semaphores for dummy backends (eg bgwriter, autovacuum) counted in PostgreSQL manual?

From: Tom Lane
> the POSIX API provides no way to detect whether anyone else is attached to the segment. Not being able to tell that is a tremendous robustness hit for us. We are not going to risk destroying someone's database (or in the alternative, failing to restart after most crashes, which it looks like your patch would do) in order to make installation fractionally easier.

How is this done on Windows? Is it possible to count the number of processes that attach a shared memory?
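On the msync() question above: as far as the POSIX semantics go, stores through a MAP_SHARED mapping are visible to other processes that map the same object without any msync() call; the man page caveat is about when the backing file itself gets updated. The following sketch illustrates that under stated assumptions: the function name value_seen_by_parent is hypothetical, it maps a temporary file under /tmp rather than a shm_open() object, and it uses fork() plus waitpid() as the only synchronization.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: map one int from a temporary file with MAP_SHARED,
 * let a child process store `value` into it, and return what the parent
 * reads afterwards.  Neither process calls msync().  Returns -1 on any
 * setup failure.
 */
static int
value_seen_by_parent(int value)
{
	char		path[] = "/tmp/mapdemoXXXXXX";
	int			fd = mkstemp(path);
	int		   *shared;
	int			seen;
	pid_t		child;

	if (fd < 0)
		return -1;
	unlink(path);				/* the file now lives only via fd/mapping */
	if (ftruncate(fd, sizeof(int)) < 0)
	{
		close(fd);
		return -1;
	}
	shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
				  MAP_SHARED, fd, 0);
	close(fd);
	if (shared == MAP_FAILED)
		return -1;
	*shared = 0;

	child = fork();
	if (child == 0)
	{
		*shared = value;		/* plain store, no msync() */
		_exit(0);
	}
	if (child < 0 || waitpid(child, NULL, 0) < 0)
	{
		munmap(shared, sizeof(int));
		return -1;
	}
	seen = *shared;				/* read after the child has exited */
	munmap(shared, sizeof(int));
	return seen;
}
```

Calling value_seen_by_parent(42) should return 42 on a conforming system. Concurrent access from live processes still needs proper locking or memory barriers; this sketch sidesteps that by reading only after the child has exited.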
Re: [PATCHES] Feature: POSIX Shared memory support
Responses inline.

On Feb 6, 2007, at 7:05 PM, Takayuki Tsunakawa wrote:
> I hope your work will be accepted. Setting IPC parameters is tedious for normal users, and they sometimes miss the manual article and hit the IPC resource shortage problem, particularly when the system developers run multiple instances on a single machine at the same time.

As Tom pointed out, the code I posted yesterday is not robust enough for general consumption. I'm working on a better solution, which will likely involve using a very small SysV shmem segment as a mutex of sorts (as Michael Paesold suggested).

> Then, how about semaphores? When I just do configure, PostgreSQL seems to use SysV semaphores. But POSIX semaphore implementation is prepared in src/backend/port/posix_sema.c. Why isn't it used by default? Does it have any problem?

In this case, semaphore usage is unrelated to shared memory shortages. Also, on many platforms the posix_sema code is used. Either way, essentially no one is running out of shared memory due to semaphores.

> # Windows is good in this point, isn't it?

From what I can tell, if you look at the Windows SysV shmem emulation code in src/backend/port/win32/shmem.c, you will see in the shmctl() function that the 'other process detection' code is not implemented, since there is no corresponding Win32 API to implement this. There is only so much you can do in that case. As far as the other platforms go, any replacement for the SysV shmem code should be as reliable as what preceded it.
Re: [PATCHES] Feature: POSIX Shared memory support
Takayuki Tsunakawa writes:
> From: Tom Lane
>> the POSIX API provides no way to detect whether anyone else is attached to the segment. Not being able to tell that is a tremendous robustness hit for us.
> How is this done on Windows? Is it possible to count the number of processes that attach a shared memory?

AFAIK the Windows port is simply wrong/insecure on this point --- it's one of the reasons you'll never see me recommending Windows as the OS for a production Postgres server.

regards, tom lane
Re: [PATCHES] Feature: POSIX Shared memory support
Chris Marcellino writes:
> As Tom pointed out, the code I posted yesterday is not robust enough for general consumption. I'm working on a better solution, which will likely involve using a very small SysV shmem segment as a mutex of sorts (as Michael Paesold suggested).

One problem with Michael's idea is that it gives up one of the better arguments for having a POSIX option, namely to allow us to run on platforms where SysV shmem support is not there at all. I'm not sure whether the idea can be implemented without creating new failure modes; that will have to wait on seeing a patch. But the strength of the coupling between the SysV and POSIX segments is certainly going to be a red-flag item to look at.

>> Then, how about semaphores? When I just do configure, PostgreSQL seems to use SysV semaphores. But POSIX semaphore implementation is prepared in src/backend/port/posix_sema.c. Why isn't it used by default? Does it have any problem?
> In this case, semaphore usage is unrelated to shared memory shortages. Also, on many platforms the posix_sema's code is used. Either way, Essentially, no one is running out of shared memory due to semaphores.

AFAIK the only platform where the POSIX sema code is really used is Darwin (OS X), and it is not something I'd use there if I had a choice. The problem with it is that *every* semaphore corresponds to an open file handle in the postmaster that has to be inherited by *every* forked child. So N backend slots cost you O(N^2) in kernel filehandles and process fork overhead, plus if N is big you're taking a serious hit in the number of disk files any one backend can have open. This problem may be specific to Darwin's implementation of the POSIX spec, but it's real enough there. If you trawl the archives you'll probably notice a lack of people running big Postgres installations on Darwin, and this is why.

regards, tom lane
Re: [PATCHES] Feature: POSIX Shared memory support
>> Then, how about semaphores? When I just do configure, PostgreSQL seems to use SysV semaphores. But POSIX semaphore implementation is prepared in src/backend/port/posix_sema.c. Why isn't it used by default? Does it have any problem?
> Either way, Essentially, no one is running out of shared memory due to semaphores.
> In this case, semaphore usage is unrelated to shared memory shortages.

Yes, of course, shared memory is not related to semaphores.

> Also, on many platforms the posix_sema's code is used.

Really? When I run 'configure' without any parameter on Red Hat Enterprise Linux 4.0 (kernel 2.6.x), PostgreSQL uses SysV semaphores. I confirmed that by seeing the result of 'ipcs -u'. On what platforms is the POSIX sema code used by PostgreSQL by default?
Re: [PATCHES] Feature: POSIX Shared memory support
Yes, as Tom pointed out. Sorry, I misread the autoconf file. I've gotten quite used to Darwin == BSD. I've added a note to my todo list to look into the POSIX semaphore performance on the Darwin side.

--Chris

On Feb 6, 2007, at 8:32 PM, Takayuki Tsunakawa wrote:
>> Also, on many platforms the posix_sema's code is used.
> Really? When I run 'configure' without any parameter on Red Hat Enterprise Linux 4.0 (kernel 2.6.x), PostgreSQL uses SysV semaphores. I confirmed that by seeing the result of 'ipcs -u'. What platforms is POSIX sema used by PostgreSQL by default?
Re: [PATCHES] Feature: POSIX Shared memory support
Attached is a beta of the POSIX shared memory layer. It is 75% the original sysv_shmem.c code. I'm looking for ways to refactor it down a bit, while changing as little of the tried-and-tested code as possible. I thought I'd put it out there for comments.

Unfortunately, it is of course more complicated than the original, as it uses both sets of APIs. Also, I haven't tested the crash recovery thoroughly.

The POSIX code could be used Windows-style (i.e. no crash recovery) if one ifdef'd out the SysV calls properly, for anyone who needs to run Postgres on such a POSIX-only platform.

Using both APIs is certainly not ideal. You mentioned,

> We've speculated on occasion about using file locking in some form as a substitute mechanism for detecting this, but that seems to just bring its own set of not-too-portable assumptions

What sort of file locking did you have in mind? Do you think this might be worth me trying?

Thanks for your help,
Chris Marcellino

posix_shmem.c
Description: Binary data
Re: [PATCHES] Feature: POSIX Shared memory support
Tom Lane wrote:
> We've speculated on occasion about using file locking in some form as a substitute mechanism for detecting this, but that seems to just bring its own set of not-too-portable assumptions.

Maybe we should look some more at that. Use of file locking was one thought I had today after I saw Tom's earlier comments. Perl provides a moderately portable flock(), which we in fact use in buildfarm to stop it from running more than one at a time on a given repo copy. The Perl description starts thus:

    Calls flock(2), or an emulation of it, on FILEHANDLE. Returns true for success, false on failure. Produces a fatal error if used on a machine that doesn't implement flock(2), fcntl(2) locking, or lockf(3). flock is Perl's portable file locking interface, although it locks only entire files, not records.

Note that this means it works on every platform that has ever reported on buildfarm. Maybe we can borrow some code.

cheers

andrew
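To make the file-locking idea concrete, here is a minimal sketch of how a lock file could stand in for an attachment count. It is an assumption-laden illustration, not a proposal from the thread: the helper names hold_shared_lock and lock_file_in_use are hypothetical, it uses flock(2) (a BSD interface, not POSIX), and behaviour on network filesystems varies, which is the kind of portability concern Tom raises.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: take a shared lock on path.  Returns a file
 * descriptor that must stay open for as long as the lock is wanted,
 * or -1 on failure.  The kernel drops the lock once every descriptor
 * referring to this open file is closed, including at process exit.
 */
static int
hold_shared_lock(const char *path)
{
	int			fd = open(path, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_SH | LOCK_NB) < 0)
	{
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: returns 1 if some holder still has a lock on path,
 * 0 if nobody does, and -1 on error.
 */
static int
lock_file_in_use(const char *path)
{
	int			fd = open(path, O_RDWR | O_CREAT, 0600);
	int			in_use;

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		in_use = 0;				/* got the exclusive lock: no holders */
	else if (errno == EWOULDBLOCK)
		in_use = 1;				/* a shared or exclusive lock is held */
	else
		in_use = -1;
	close(fd);					/* closing releases our own lock */
	return in_use;
}
```

In this scheme each backend would hold the shared lock for its lifetime, and a starting postmaster would call lock_file_in_use() on the data directory's lock file to detect surviving backends. Whether this is robust enough across platforms is exactly the open question in the thread.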