Re: [HACKERS] win32 socket definition
Tom Lane wrote: There's another copy of ListenSocket[] in the BackendParameters struct. I also wonder about postmaster.c's habit of using -1 for empty slots in ListenSocket ... how safe is that for Win64?

On Windows, it should be INVALID_SOCKET.

James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
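A way to keep both platforms honest is to hide the sentinel behind a typedef and a macro. A minimal sketch; the pgsocket/PGINVALID_SOCKET names are illustrative, not necessarily what the tree uses:

```c
/*
 * Sketch of a portable "empty slot" marker for socket arrays. On Win64,
 * SOCKET is a 64-bit unsigned type (UINT_PTR), so storing -1 and comparing
 * against it is fragile; INVALID_SOCKET is the documented sentinel there.
 */
#ifdef WIN32
#include <winsock2.h>
typedef SOCKET pgsocket;
#define PGINVALID_SOCKET INVALID_SOCKET
#else
typedef int pgsocket;
#define PGINVALID_SOCKET (-1)
#endif

#define MAXLISTEN 64
static pgsocket ListenSocket[MAXLISTEN];

/* mark all listen slots empty, on either platform */
static void
reset_listen_sockets(void)
{
    for (int i = 0; i < MAXLISTEN; i++)
        ListenSocket[i] = PGINVALID_SOCKET;
}
```

Any copy of the array (such as the one in BackendParameters) would then use the same type and sentinel.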
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Marko Kreen wrote: Note - my proposal would be to get rid of HAVE_INLINE, which means we are already using inline functions unconditionally on platforms that matter (gcc). Keeping duplicate code for obsolete compilers is pointless.

Microsoft C doesn't matter? I seem to remember that when the Win32 version became available it actually increased the number of people trying postgres rather dramatically. Did that count for nothing? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify rewrite
Josh Berkus wrote: Payloads are also quite useful even in a lossy environment, where you understand that LISTEN is not a queue. For example, I'd like to be using LISTEN/NOTIFY for cache invalidation for some applications; if it misses a few, or double-counts them, it's not an issue. However, I'd

Not sure how missing a few can fail to be an issue for cache invalidation, but never mind. In the use-cases I've considered, I've used a trigger to write a change notification to a table and a NOTIFY to indicate that notification record(s) have changed. The notifications contain copies of the primary keys and the action, so the cache processor knows what's changed and the NOTIFY is just a wakeup signal.

If this is in fact the most common use case, perhaps an alternative approach would be to automate it directly, so that writing the triggers (and using the trigger processing engine) would be unnecessary:

- the queue definition can be automated with reference to the parent table, by DDL stating that one is required
- the notification 'name' is effectively the queue name, and the subscription says 'tell me when a change note is placed in the queue'

Doing this in the database engine core allows a number of potential optimisations:

- the mechanism does not require general trigger execution
- the queue does not have to be a real table, and can have custom semantics: it may not actually be necessary to store copies of the primary key data if it can refer to the rows, so the data can be retrieved as the queue is consumed
- if there are no subscribers to the queue then the insertion into it can be elided
- if the server crashes, connected consumers should assume caches are invalid, so there is no ACID requirement for the queue data
- if the queue fills then slow consumer(s) can be dropped and can receive a data-loss indicator

[Josh, continuing:] like to be able to send message like players_updated|45321 where 45321 is the ID of the player updated.

Indeed. Just a thought, anyway.
(FWIW I was initially concerned about the lack of payload, but with any sort of lossy compression I figured it wasn't an issue, actually, and that what I needed was a notification queue) James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Could postgres be much cleaner if a future release skipped backward compatibility?
Tom Lane wrote: Actually, I think any attempt to do that would result in a fork, and a consequent splintering of the community. We can get away Perhaps the answer might be a pre-emptive simplifying fork to postgres-NG, perhaps taking a lead from MySQL and Samba. I'm not sure that you would necessarily lose too much: if the postgres-NG system implements a functional subset (perhaps on a subset of platforms, eg ones with C++ and threading support) with some implementation advantages, then it might allow users who are interested in the potential advantages of -NG to check that they are using the subset while still remaining PostgreSQL users for serious use. Suppose, for example, that you severely limit the ability of individual connections to load extensions, and make it a dbo task - and have an architecture that is not process-per-connection but process-per-db. This might allow some changes in the cache and read-ahead/write-behind that use large memory on modern systems more easily - perhaps in the way that the falcon store for MySQL planned to. The core query planner and execution engine can remain common, even if the process structure that hosts it and the underlying cache page management is quite different. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote: the function time and the commit time a lot better. But I'm not sure if the use-case is popular enough to deserve such a hack.

For some OLTP workloads, it makes a lot of sense to spool tuples of primary key plus new fields into a temp table and then do a single update or delete operation referencing the temp table. Perhaps not so much for code designed for postgres, where there is some extra flexibility with array params and the like, but for code that targets other systems as well. Having temp tables be as fast as possible is quite a big win in this case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idea: global temp tables
Kevin Grittner wrote: contexts. I don't think the updates to the system tables have the same magnitude of performance hit as creating these tables, especially if write barriers are on.

Wouldn't it be cleaner just to defer creation of the real files that back a temp table until it is necessary to spill the data from the backend's RAM? This data doesn't need to be in shared memory, and the tables and data aren't visible to any other session, so can't they operate out of RAM most of the time (or all the time, if the data in them is short-lived)? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
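The deferred-creation idea can be sketched as a buffer that only acquires a backing file on overflow. Everything here (the names, the spill policy, tmpfile() standing in for real relation files) is illustrative rather than PostgreSQL's actual temp-table storage:

```c
/*
 * Sketch: keep a temp table's data in backend-local memory and only
 * create a backing file once the data outgrows the buffer, so a small,
 * short-lived temp table never touches the filesystem at all.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct TempStore
{
    char   *buf;        /* in-memory data, used until it overflows */
    size_t  used;       /* total bytes written so far */
    size_t  limit;      /* spill threshold (think temp_buffers) */
    FILE   *spill;      /* NULL until a file is actually needed */
} TempStore;

static void
temp_store_write(TempStore *ts, const void *data, size_t len)
{
    if (ts->spill == NULL && ts->used + len > ts->limit)
    {
        /* first overflow: only now pay the cost of file creation */
        ts->spill = tmpfile();
        fwrite(ts->buf, 1, ts->used, ts->spill);
    }

    if (ts->spill)
        fwrite(data, 1, len, ts->spill);
    else
        memcpy(ts->buf + ts->used, data, len);

    ts->used += len;
}
```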
Re: [HACKERS] libpq WSACleanup is not needed
Andrew Chernow wrote: The only problem is how to detect the first connection. In a threaded environment you'd have to perform locking in connectdb, which is probably not going to fly.

Well, if you do an atomic test for a flag being zero, and if so then enter a critsec, do the init iff you're the first in, and then set the flag on the way out, then:

- most times, you'll just have the atomic test
- other times, you have a short critsec

I can't see that being a big deal considering you're about to resolve the server hostname and then do a TCP/IP connect. My understanding is that if you do WSAStartup and WSACleanup scoped to each connection then:

- the internal counting means that only the 0->1 and 1->0 transitions are expensive
- libpq will only incur the cost if the application didn't do it already

So it seems that the cost is incurred by an application that:

- makes no other use of winsock (or also does startup/cleanup often)
- does not retain a connection (or pool) but creates and closes a single connection often

How many applications are there that match this pattern? Isn't it enough just to tell the user to do WSAStartup and WSACleanup in main() if they find they have a performance problem? Surely most Windows programs effectively do that anyway, often as a side effect of using a framework.

James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
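The test-then-lock pattern described above, sketched with C11 atomics standing in for the Win32 InterlockedCompareExchange/EnterCriticalSection primitives; the function name and the counter are hypothetical stand-ins for the real WSAStartup call:

```c
#include <stdatomic.h>

static atomic_int  wsa_ready = 0;               /* the flag tested on the fast path */
static atomic_flag wsa_lock  = ATOMIC_FLAG_INIT; /* stands in for a critsec */
static int         startup_calls = 0;           /* would be the real WSAStartup() */

static void
wsa_startup_once(void)
{
    /* most times: just the atomic test */
    if (atomic_load(&wsa_ready))
        return;

    /* other times: a short critical section */
    while (atomic_flag_test_and_set(&wsa_lock))
        ;                       /* spin; EnterCriticalSection in real code */

    if (!atomic_load(&wsa_ready))
    {
        startup_calls++;                /* do the init iff we're first in */
        atomic_store(&wsa_ready, 1);    /* set the flag on the way out */
    }

    atomic_flag_clear(&wsa_lock);
}
```

Every connectdb would call wsa_startup_once(); only the very first caller pays for the init, and after that it is a single atomic read.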
Re: [HACKERS] libpq WSACleanup is not needed
Magnus Hagander wrote: The use-case of rapidly creating and dropping connections isn't particularly common, I think. And there is a perfectly functioning workaround - something that we should perhaps document in the FAQ or somewhere in the documentation?

Would it be acceptable to initialise when the number of connections changes from 0, and tidy up when the number goes back to 0? Applications that retain a connection would not suffer the cost on subsequent connect/disconnect. The init/term is the tidiest way to do it, but the above might help - perhaps init could just add a phantom use count and work the same way. If you have a DLL for libpq, could you do it in process attach and detach? Wouldn't that be the most common case anyway? James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
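The connection-count scheme might look roughly like this; locking and the real winsock calls are elided, and all names are illustrative. A "phantom" use count, as suggested above, would simply start the refcount at 1 so cleanup never fires:

```c
/*
 * Sketch: scope WSAStartup/WSACleanup to the live connection count,
 * so only the 0->1 and 1->0 transitions pay the cost.
 */
static int conn_refcount = 0;
static int winsock_active = 0;

static void
conn_opened(void)
{
    if (conn_refcount++ == 0)
        winsock_active = 1;     /* would call WSAStartup() here */
}

static void
conn_closed(void)
{
    if (--conn_refcount == 0)
        winsock_active = 0;     /* would call WSACleanup() here */
}
```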
Re: [HACKERS] libpq WSACleanup is not needed
Andrew Chernow wrote: m$ docs indicate that wsastartup can't be called from dllmain :(

OK, fair cop. It says so in the MSDN online version but not in the SDK 6.1 version. :-( Some helper(s) must start threads, I guess. Re the counting and doing it on first/last socket - of course WSAStartup counts internally. Presumably it's only slow when the count is actually going to zero? Is there a need for a new API to control this - can't you just interpret another parameter keyword in PQconnectdb (or the pgoptions string in PQsetdbLogin, I guess)? (Having said that, how do you control send and receive buffer sizes? Presumably Nagle is always disabled, but those features could be controlled the same way? That would presumably allow PGOPTIONS to be parsed too, though docs section 30.12 says that sets runtime options for the server, though PQconnectdb in 30.1 suggests that it might be parsed for the client connect too. Maybe.) James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Significantly larger toast tables on 8.4?
Peter Eisentraut wrote: c. Are there any well-known pitfalls/objections which would prevent me from changing the algorithm to something more efficient (read: IO-bound)? copyright licenses and patents Would it be possible to have a plugin facility? I guess the most likely candidate is the LZJB mechanism in ZFS which is CDDL licensed. Would that be compatible in contrib, if not in the main source? James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About CMake
Andrew Dunstan wrote: Quite so. CMake outputs MSVC Project files, as I understand it. If you know of another cross-platform build tool that will do that then speak up. I think the wxWidgets team have one, and I think scons has some support for doing that, though I haven't tried that part of scons. The first uses Perl, scons uses Python. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Elide null updates
I saw in a summary for 8.4 that there's a generic function for use as a before-row trigger that will elide null updates (ie replacement with an identical row). I can see that this is great - but I was wondering if it should be more integrated and turned on in the db schema. Trivially, doing so would mean that there is less of an issue trying to integrate with a custom user before trigger, although that's really no more than an inconvenience. I understand that there is an argument for not making it the default behaviour, given that trigger execution and locking are both affected if we do elide the update completely, but it occurred to me that while I might want the trigger and locking behaviour, I probably never want the actual database image copy to happen. Doing so will needlessly bloat the database file and give the vacuum procedure work to do - and it seems to interfere with the new optimisations relating to pages that are all-visible in all transactions. Would it be possible to determine a null update cheaply and retain the locking and trigger execution, but elide the actual row copy - and in particular the associated impact in terms of setting status flags etc? I guess this would need to be handled at a lower level than the trigger approach - and would need an option that is integrated into the schema, so we can elide the copy, and optionally the trigger execution, and optionally the lock. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Elide null updates
Andrew Dunstan wrote: I don't follow what you're saying. If an update is skipped by a trigger, nothing new is written to disk, and there should be nothing to vacuum from it. That's why this trigger can speed up certain update queries enormously.

OK, I'll try again. Suppose we do an update. This will:

- write the new image and do the MVCC housekeeping
- logically lock the updated row
- fire update triggers

Suppose the pre-update trigger elides the update - ALL of the above is removed. Now, one of the objections to making it the default behaviour is that the side effects (such as the lock and the trigger) might be desirable, or at least that removing them is a change in behaviour. I'm wondering whether it would be possible to remove the physical update but retain the logical side effects, so this argument about changed semantics is removed, and the only issue is whether the cost of identifying the no-op update is worthwhile given the savings achieved, which will be application dependent.

James

(I suspect that if you step back from the implementation of the SQL engine as a series of procedural steps on rows - and think of it in terms of relational set algebra - then it is entirely defensible to elide such an update as a matter of course, and that it SHOULD happen; but then there is always fun and games around inserting duplicates too, and I suspect most of us don't think in algebra terms) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
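As a toy model of the distinction being argued, assuming a byte-comparable row image; none of this is PostgreSQL's actual executor, it just separates the logical side effects from the physical write:

```c
/*
 * Sketch: detect a no-op update and skip only the physical row copy,
 * while still taking the row lock and firing the update trigger.
 */
#include <string.h>

static int physical_writes = 0;
static int triggers_fired = 0;
static int locks_taken = 0;

static void
do_update(const char *oldrow, const char *newrow, size_t len)
{
    locks_taken++;          /* logical side effect: keep the row lock */
    triggers_fired++;       /* logical side effect: keep firing triggers */

    /* elide only the physical copy (and its MVCC housekeeping) */
    if (memcmp(oldrow, newrow, len) == 0)
        return;

    physical_writes++;      /* real code writes the new tuple version here */
}
```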
Re: [HACKERS] Mostly Harmless: Welcoming our C++ friends
Kurt Harriman wrote: B) let the build farm do a nightly build with a C++ compiler merely as a test to verify that no C++ compilation errors are introduced, but continue to use C 'officially' for builds and releases; or

This was the intent of my suggestion. There can be advantages, in that you can use a lot of C99 (and still port to non-C99 envs, eg MSVC) if you have a few ifdefs to use std::vector instead of dynamic arrays, but the bigger issue (for me) has always been that the name mangling means you find out pretty quickly if you have a mismatch between declaration and definition of functions. Attempting the link with C++ mangling can put this to rest, even if you never try running it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mostly Harmless: Welcoming our C++ friends
Kurt Harriman wrote: The foremost opposing argument seems to have been that there should be no attempt to alleviate the existing reserved word problem without automatic enforcement to guarantee that never in the future can new occurrences be introduced.

Is there anything in the source that would necessarily preclude using the C++ compiler to build *all* the code? I'd guess that this would be quite a big patch in any places where we have implicit conversions from void* to char* etc, but the result is valid as both C and C++, and arguably better documented. C++ is picky about a few things you can do in C, but most of them are things I'd rather not do anyway. Run such a build on the build farm each night, and it will be obvious as soon as C++-unfriendly code sneaks in. And who knows, maybe eventually we could use C++ properly in the code. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
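For example, the sort of mechanical change such a patch would mostly consist of: C converts void * to other pointer types implicitly, C++ does not, so allocations grow an explicit cast. This function compiles as both C and C++:

```c
#include <stdlib.h>
#include <string.h>

char *
copy_string(const char *s)
{
    size_t  len = strlen(s) + 1;
    char   *p = (char *) malloc(len);   /* cast optional in C, required by C++ */

    if (p != NULL)
        memcpy(p, s, len);
    return p;
}
```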
Re: [HACKERS] Core team statement on replication in PostgreSQL
David Fetter wrote: This part is a deal-killer. It's a giant up-hill slog to sell warm standby to those in charge of making resources available because the warm standby machine consumes SA time, bandwidth, power, rack space, etc., but provides no tangible benefit, and this feature would have exactly the same problem. IMHO, without the ability to do read-only queries on slaves, it's not worth doing this feature at all.

That's not something that squares with my experience *at all*, which admittedly is entirely in investment banks. Business continuity is king, and in some places the warm standby rep from the database vendor is trusted more than block-level rep from the SAN vendor (though that may be changing to some extent in favour of the SAN). James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] replication hooks
Marko Kreen wrote: There is this tiny matter of replicating schema changes asynchronously, but I suspect nobody actually cares. Few random points about that: I'm not sure I follow you - the Sybase 'warm standby' replication of everything is really useful for business continuity. The per-table rep is more effective for publishing reference data, but is painful to maintain. Not having something that automagically reps a complete copy including DDL (except for temp tables) is a major weakness IMO. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Core team statement on replication in PostgreSQL
Aidan Van Dyk wrote: The whole single-threaded WAL replay problem is going to rear its ugly head here too, and mean that a slave *won't* be able to keep up with a busy master if it's actually trying to apply all the changes in real-time.

Is there a reason to commit at the same points that the master committed? Wouldn't relaxing that mean that at least you would get 'big' commits and some economy of scale? It might not be too bad. All I can say is that Sybase warm standby is useful, even though the rep for an update that changes a hundred rows is a hundred updates keyed on primary key, which is pretty sucky in terms of T-SQL performance. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] triggers on prepare, commit, rollback... ?
Tom Lane wrote: * Trigger on rollback: what's that supposed to do? The current transaction is already aborted, so the trigger has no hope of making any database changes that will ever be visible to anyone.

It can however affect state in the backend doing the rollback, which can be useful.

[Tom also wrote:] * Trigger on commit: what do you do if the transaction fails after calling the trigger? The reductio ad absurdum for this is to consider having two on-commit triggers, where obviously the second one could fail.

Ditto - this is effectively at the point where messaging for NOTIFY happens, and if it fails then that's tough. If you need to implement a custom NOTIFY, this is where to do it.

[Tom also wrote:] Another response I've heard is 'but I don't want to make inside-the-database changes, I want to propagate the state to someplace external'. Of course that's completely broken too, because there is

You really are being absurdly judgemental here. _You_ may not have a use case, but that does not mean that no-one else does. Some things are idempotent and are effectively hints - that they are not transacted can be well understood and accommodated. Is 'Tom doesn't need it' an adequate reason to take such a hard line? James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgkill on win32
Magnus Hagander wrote: The problem is when winsock operations are interrupted by APCs. See: http://archives.postgresql.org/pgsql-hackers-win32/2004-04/msg00013.php

Whoa! Indeed, that's a bit sucky, because they really aren't documented as interruptible. In this case, though, I see no material problem with going back to APCs. At the moment you deliver to a pipe server thread with a pipe RPC. I can't see why you cannot deliver to a signal handling thread with the APC - the published {thread-id, function} tuple does not need to refer to the main thread of the process. This would de-synchronize the delivery but make a relatively small change, since that background thread could deliver to the main thread the same way it does now. If there were any desire to provide an MT-aware postmaster, the same technique of masking signals except on a signal thread might apply. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgkill on win32
Magnus Hagander wrote: You interested in trying to code up a patch to verify that? ;)

Practical reality says that I won't get to this before the next version of Windows is released. I don't want to promise something I can't deliver.

[Quoting my earlier mail:] If there were any desire to provide a MT-aware postmaster, the same technique of masking signals except on a signal thread might apply.

[Magnus replied:] Define MT-aware :-) It's certainly MT-aware in the fact that it's already MT... But there is no interest in making the actual backends launch as threads in the postmaster - at least not currently.

I seem to remember being slapped about for daring to suggest using a threaded embedded language, even if only one thread calls into the core, on the grounds that the signals might not go to the right thread. So I'm assuming that a thread-aware build would generally mask async signals and wait for them in a specific thread in sigwait, which would effectively match the Win32 model (for a threaded build). On the other hand, I'd normally regard signals as the work of the devil and prefer to send a wakeup via some other IPC, for pretty much that reason, but there you go. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgkill on win32
I'm wondering if the mechanism used for sending signals between postmaster processes on Win32 is much more heavyweight than is necessary. Is there a reason not to call OpenThread on the target postmaster's thread id, and then use QueueUserAPC to execute a 'signal handler' method on it? (Or TerminateThread for 'extreme' cases.) I don't think it's entirely trivial, because it would be better to cache the handles for a short time rather than Open/Queue/Close each time, so it may still be necessary to have a background thread that checks the handles are not signalled and closes them if they are. But signal delivery could be somewhat faster. Haven't tried it - but I can't help thinking that the named pipe service is a bit heavyweight. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgkill on win32
Magnus Hagander wrote: Yes. We used to use APCs, but touching anything remotely related to Winsock from an APC is not supported... We had a lot of trouble with it

By implication you'd be doing socket'y stuff from the signal handler on UNIX? Scary. I was assuming it would be used to signal an event that would release the main thread and then do the real work there. I suppose by implication you can't knock a thread out of select from the APC? Though, presumably, you could hand over the request to a waiting application thread and *that* would have full WinSock access. I can't help feeling that the WSAEventSelect mechanism is the one to use, not the crappy BSD select emulation.

[Magnus also wrote:] initially, and it took a long support case with Microsoft PSS to figure out what was broken, because this being unsupported was not properly documented.

Indeed. And it's a very odd limitation given the way that APCs are used with completion ports. Did any sort of reference get written to the knowledge base for this, do you know?

[Magnus also wrote:] It certainly is ;-) We could probably find something more efficient, but APCs are not the one.

The concern I have isn't so much that the mechanism might be a bit ugly, but the synchronous delivery and the scheduling gymnastics implied. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cached Query Plans
Would it be possible to store plans with an indication of the search path that was used to find tables, and for temp tables some snapshot of the statistics for the table, if any? My suspicions are that:

* where you have a lot of short-lived connections, they will often use the default search path - or a similar one
* if a temp table is in use then normally these will be small or contain 'similar' data

There is a danger that these heuristics will be poor if long-running connections are in play - but they have no excuse not to do their own preparation. Perhaps you could have named cache segments, and connections could 'opt in' to a cache segment if they want such sharing? James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] notify with payload (pgkill, notify)
Is the intent to replace most uses of (pg)kill with a general purpose messaging system between the processes, or (just) to address notify per se? (Presumably with 'fire-and-forget' and also RPC semantics? And pub-sub? And some sort of write to an fd protected by an atomic flag, to elide multiple writes when the process hasn't woken and acknowledged the ATTN yet?) If pgkill is not used for signalling, could this reduce the reliance on signals (except for trying to kill off processes) to the point where it becomes much less scary to link to libraries that use signals themselves and/or threaded runtimes? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] notify with payload (pgkill, notify)
Andrew Dunstan wrote: If you have an alternative suggestion then you need to make it now.

Well, I'm not sure about the non-collapsing business, but no matter.

[Andrew also wrote:] We are not intending to use FDs for message passing. They will be stored in shared memory. See previous discussions for details:

I'm more interested in replacing the pgkill wakeup mechanism with something that is less sucky on Win32. If you just want to send a wakeup then making a pipe RPC is crazy, since it introduces a lot of scheduling. A unix domain datagram socket to which processes can send one byte is enough to toggle the readability of the socket without doing much transfer, and using an atomic flag in shm means that subsequent wakeups can be elided until the process acknowledges the signal, which it should do before scanning the shm queue. I don't see any reason to pass the data through the kernel if you have the shm handling routines, particularly if you have ones that handle spilling. If you're happy with using signals on UNIX (and Tom suggested that SIGUSR usage and signal handling might be one of the concerns about allowing threaded addins) then at least on Win32 it should be worth adding a variant of kill which has a void return and does not try to return information about the validity of the target process id. Personally I try to avoid signals and would prefer an fd level-based signal. But YMMV. I was just wondering if what you're doing could be generalised as a 'post to backend' system that can be used for other duties in future, so you'd need a message type and payload blob rather than assuming that the type is 'notify' and the payload is the notify payload. If you had a pgsql extension that was 'wait for notification' or 'wait for row in message table' then the wakeup could essentially be directed at code in the target backend, rather than using it as a conduit to target code in the client.
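The one-byte-datagram-plus-atomic-flag scheme can be sketched like this; in real code the flag would live in shared memory, and the names are illustrative:

```c
#include <stdatomic.h>
#include <sys/socket.h>

static atomic_int notify_pending = 0;

/* sender: only the 0->1 transition actually writes to the socket */
static void
send_wakeup(int fd)
{
    int expected = 0;

    if (atomic_compare_exchange_strong(&notify_pending, &expected, 1))
    {
        char b = 1;

        (void) send(fd, &b, 1, 0);
    }
}

/* receiver: acknowledge (clear the flag) before scanning the shm queue */
static int
consume_wakeup(int fd)
{
    char b;
    int  got = (recv(fd, &b, 1, MSG_DONTWAIT) == 1);

    atomic_store(&notify_pending, 0);
    return got;
}
```

However many notifies arrive while the target backend is busy, only one byte crosses the kernel; the actual messages travel through the shm queue.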
James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgkill
I was looking at the notify processing in async.c and I noticed that kill is called whether or not the process has been signalled already, and whether or not 'this' process has signalled the process. It seems unnecessary to me - especially if we are on Win32 and the pgkill is implemented as a CallNamedPipe. My understanding is that signal is normally a fairly expensive operation at the best of times, particularly so when it's turned from a fire-and-forget into an RPC with scheduling. I appreciate that the signaller wishes to determine whether a process is dead, but it must be questionable whether this is necessarily something that should be done by peers, when the information is immediately out of date and we can definitively determine a crash in the master process anyway. So:

1) Why do the RPC, rather than detect death from the master process?

2) Why not use the existing compare-and-set atomic infrastructure to maintain a 'pending signal' flag (or flags) in struct PGPROC, and elide signals that are flagged and not yet indicated as processed by the target process?

3) If we do both of the above, would it not be cleaner to use an fd with a local datagram socket than a signal on nearly all systems? And a semaphore on Win32? So it's all picked up in select or WaitForMultipleObjects?

I know the comment in async.c is: 'but we do still send a SIGUSR2 signal, just in case that backend missed the earlier signal for some reason.' But that seems somewhat lame - we might have multiple signals compressed, but does any system actually *lose* them? It also occurred to me that we should not kill as we go, but accumulate a set of pids to signal and then signal each after the iteration is complete, so we do as little processing as possible with the pg_notify resources held, and certainly no system calls if we can help it. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] first time hacker ;) messing with prepared statements
Tom Lane wrote: PFC [EMAIL PROTECTED] writes: Do the parse tree store fully qualified schema.table or schema.function ? They store OIDs. So, what happens if we reference a temporary table or something else that requires resolution down a search path? I believe Microsoft and Sybase have to defer some optimisation because of this. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory support
Tom Lane wrote: Yeah, I would be far more interested in this patch if it avoided needing SysV shmem at all. The problem is to find an adequate substitute for the nattch-based interlock against live children of a dead postmaster.

(confused) Why can't you use mmap of /dev/zero and inherit the fd into child processes? (simple enough to do something similar on Win32, even if the mechanism isn't identical) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
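A minimal sketch of that arrangement, using an anonymous shared mapping (the modern equivalent of mmap'ing /dev/zero) inherited across fork():

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on strict libcs */
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

static int *
make_shared_int(void)
{
    /* anonymous shared mapping: no SysV shmget/shmat involved */
    return mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}

static int
demo_inherited_mapping(void)
{
    int   *shared = make_shared_int();
    pid_t  pid;

    *shared = 0;
    pid = fork();
    if (pid == 0)
    {
        *shared = 42;           /* child writes through the inherited mapping */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return *shared;             /* parent sees the child's write */
}
```

Note this gives the sharing, but not by itself the nattch-style liveness interlock Tom mentions; that would still need something extra, such as an inherited pipe whose EOF signals postmaster death.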
Re: [HACKERS] first time hacker ;) messing with prepared statements
PFC wrote: Hello, So, I embarked (yesterday) on a weekend project to add a new feature to Postgres... I use PHP with persistent connections and always have been bothered that those very small AJAX queries (usually simple selects returning 1 row) take more CPU in postgres to parse and plan than to actually execute.

Microsoft's answer to this issue with SQL Server appears to have been to introduce a smart cache for all statement plans. It seems to be very effective. I guess you're doing much the same thing, but with more user intervention, in effect. Are you sure that your application wouldn't benefit more from a MoM solution with persisted database connections? Have you looked at http://safmq.sourceforge.net/? James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Noob Hints on testing and debugging?
Tom Lane wrote: (eg how to run it up and feed it SQL ideally without running a postmaster and execing a back end) Why would you consider that ideal? Such a scenario would have approximately zip to do with the real-world environment your patch would face.

Your point being what? If I'm fiddling with the language parser and transaction engine, that's hardly relevant. It's necessary to debug in a full environment too, of course, but an ability to run a standalone engine which does its own private shmem setup would be handy. If it's not there, too bad. It makes the compile/test cycle much faster, since you don't have to go through the rigmarole of attaching to a process, and it's one of the aspects of using VS that's a whole pile nicer than the make/start/gdb-to-process stuff I have to use at work. (And at least I have TotalView there, even if the admins can't organise KDevelop or Anjuta or Eclipse.)

Is there any sanity at all in a trigger-on-rollback? Exactly what would you expect it to be able to accomplish that anyone else could see after the transaction has rolled back? (And no, trigger on commit isn't very much saner.)

Why do you say this? I'm interested in side effects external to the db. Logging to a custom logger is an obvious trivial example - in fact I want to set flags in the data triggers and then process them. I'm currently thinking that rollback can be dispensed with in favour of a setup phase at commit start. I'm looking to do things that would otherwise be handled with the event system for some use cases but not all (eg a final consistency checkpoint before commit). Since that seems to have issues and this can be more general, it seems worth looking at. I'll post a straw man. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal for db level triggers
Background: Firebird 2.1 allows:

CREATE TRIGGER name ON ( CONNECT | DISCONNECT | TRANSACTION START | TRANSACTION COMMIT | TRANSACTION ROLLBACK )

I want to allow the syntax: CREATE TRIGGER name event, with three variations of event (an application can specify any number of them): AFTER CONNECT, AFTER TRANSACTION START, BEFORE TRANSACTION COMMIT.

In usage: AFTER START clears counters and flags. UPDATE triggers on data set counters and flags. BEFORE COMMIT examines the counters and flags and performs any final validation or adjustments (or external events such as sending a MoM message).

Some of this can be done with the event mechanism, but: - it can be handy to have a place to check the consistency of all changes as a whole (assuming they've been recorded by other triggers) - the event mechanism is a bit sucky at the moment anyway, and with this I can go straight to my MoM of choice instead of using a gateway

Storage: in pg_trigger, with tgrelid = (oid)0 (and constraint data set to 'f',0,0), using additional bits in the tgtype column (smallint). Check: enough bits remain in the tgtype column. Currently it stores (B|A)(I|U|D), but we may be able to reuse the I, U and D bits for CONNECT, START and COMMIT respectively.

Caching: backends should cache the set of database-level triggers and refresh it at the start of a transaction. The transaction triggers should be stored in two lists for ease of access. A shared memory location should store a timestamp (or a sequence value) reflecting the last change to the set of database-level triggers. The timestamp/sequencer should be updated atomically as we make changes to pg_trigger (either for all such changes, or only for database triggers): such changes are quite rare.

Execution: triggers in the same class run in alphabetical order. Database triggers are NOT run for the account that is dbo. This ensures that dbo is never prevented from connecting to a database or from listing or dropping such triggers.
(Consider allowing them to be turned on using a 'set' command: the connect triggers would then fire as soon as that completes. The default will be 'off'.)

The AFTER CONNECT triggers run immediately after authentication, in an implicit transaction. If the transaction is rolled back by any trigger then we disconnect immediately. The AFTER START triggers are run when we enter a new transaction context, before any changes are made. If there is a rollback then we stop processing triggers. The BEFORE COMMIT triggers are run just before normal commit processing starts. They may elect to roll back. If there is a rollback then we stop processing triggers.

Implementation: (I made this up - I don't know what I'm doing really; this is from a very cursory look at the code. Help!) AFTER CONNECT needs to be inserted at the end of the authorization processing, and we need to detect a rollback there so we can disconnect. AFTER START can be handled by a new state inserted in front of TRANS_INPROGRESS (or handled inline in xact.c:StartTransaction?). BEFORE COMMIT is handled in xact.c:CommitTransaction. We already allow triggers to be deferred to the commit and run in AfterTriggerFireDeferred. We need to allow those triggers to run, then run the BEFORE COMMIT triggers, and then perform a second round of pending deferred triggers. So, insert code before AfterTriggerEndXact(true). This needs to be in PrepareTransaction too. Unknown: how do autocommit statements work? They need to operate 'as if' we have a start and a commit.

The usage is: AFTER CONNECT provides a way to enhance security and permissioning, and allows us to initialise state. AFTER START is used to re-initialise state. We enter START immediately we try to do anything, so we defer recovery from ROLLBACK to this point - we can tell we have rolled back logically because we have state left around. BEFORE COMMIT is run just before we perform the actual commit: it can still make changes to data, and can still roll back.
It needs to do side effects and clear out state. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for db level triggers
James Mansion wrote: In usage: AFTER START clears counters and flags. UPDATE triggers on data set counters and flags. BEFORE COMMIT examines the counters and flags and performs any final validation or adjustments (or external events such as sending a MoM message)

I'd like to point out also that AFTER CONNECT is a good opportunity to CREATE TEMP TABLE (it would be nice if a global temp table definition could be persisted and automatically duplicated into each session, but never mind). And if we use data update triggers to insert into a DELETE ROWS temp table or an in-memory data structure, the BEFORE COMMIT trigger is the place to do a bulk copy into real table(s) or combine rows into a BLOB in an audit table. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Build problem with VS2008 Express Edition
Hi, I managed to get most of 8.3 built with VS2008EE last night. I had to change some references to msbuild to vsbuild, which I guess is expected, but one compile issue surprised me. I had to change #ifdef IPV6_V6ONLY at backend/libpq/pqcomm.c:386 to:

#if defined(IPV6_V6ONLY) && (!defined(WIN32) || (_WIN32_WINNT >= 0x0501))

because IPV6_V6ONLY is defined in ws2ipdef.h, but the IPPROTO_IPV6 enum value isn't defined in ws2def.h unless you set the target version up appropriately. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Noob Hints on testing and debugging?
Are there any hints and tips anywhere on practical issues for testing and debugging changes to a backend? (eg how to run it up and feed it SQL, ideally without running a postmaster and execing a back end) I'm using VS2008 on XP by choice, and I'd like to investigate supporting something closely related to the Firebird 2.1 triggers on transaction commit and rollback. James -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
Kenneth Marshall wrote: conversion process themselves. Accepting random input puts a performance hit on everybody following the standard. Why is that necessarily the case? Why not have a liberal parser and a configurable switch that determines whether non-standard forms are liberally accepted, accepted with a logged warning, or rejected? James ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
Mark Mielke wrote: I recall there being a measurable performance difference between the most liberal parser and the most optimized parser, back when I wrote one for PostgreSQL. I don't know how good the one in use for PostgreSQL 8.3 is. As to whether the cost is noticeable to people or not - that depends on what they are doing. The problem is that a UUID is pretty big, and parsing it liberally means a loop.

It just seems odd - I would have thought one would use re2c or ragel to generate something, and the performance would essentially be O(n) on the input length in characters - using either a collection of allowed forms or an engine that normalises case and discards the '-' characters between any hex pairs. So yes, these would have a control loop. Is that so bad? Either way it's hard to imagine how parsing a string of this length could create a measurable performance issue compared to what will happen with the value post-parse. James ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
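Even without a re2c/ragel generator, the liberal scheme described here - normalise case, allow '-' between any hex pairs - is a single O(n) pass. A hand-rolled sketch (function names invented, not PostgreSQL's actual uuid code):

```c
#include <ctype.h>
#include <string.h>

/* Map one hex digit to its value, or -1 if not a hex digit. */
int
hexval(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char) tolower((unsigned char) c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Liberal single-pass UUID parser: accepts upper or lower case and
 * optional '-' separators between any byte pairs, writing 16 raw
 * bytes into out.  Returns 0 on success, -1 on malformed input. */
int
uuid_parse_liberal(const char *s, unsigned char *out)
{
    int nbytes = 0;

    while (*s != '\0')
    {
        if (*s == '-')          /* separator allowed anywhere between pairs */
        {
            s++;
            continue;
        }
        if (nbytes >= 16)
            return -1;          /* too many digits */
        int hi = hexval(s[0]);
        if (hi < 0)
            return -1;
        int lo = hexval(s[1]);  /* s[1] may be '\0'; hexval then returns -1 */
        if (lo < 0)
            return -1;
        out[nbytes++] = (unsigned char) ((hi << 4) | lo);
        s += 2;
    }
    return (nbytes == 16) ? 0 : -1;
}
```

The loop runs once per input character, so the standard 8-4-4-4-12 form, the bare 32-digit form, and anything in between all parse at essentially the same cost.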
Re: [HACKERS] wishlist for 8.4
Richard Huxton wrote: It would be nice to have a more dynamic language built-in. I'm not aware of any BSD-licensed dynamic languages though. Lua is MIT. And I believe there's even a pl/Lua around. The small size of the interpreter and ease of C interfacing would make it ideal, or at least it seems that way to me. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PostgreSQL 8.4 development plan
Tom Lane wrote: There is, although I think a large fraction of it will get bounced as needs-more-work, which should reduce the pressure. We'll just be trying to give feedback to let the patch authors move forward, which will not take as much time as actually committing would take. The current queue is http://momjian.postgresql.org/cgi-bin/pgpatches_hold Note that a lot of the bulk is discussion of things that aren't anywhere near committable anyway.

Wouldn't it make sense to try to catch up a bit and fix as much of this as is feasible (including return-and-resubmit) for things where the desire is uncontroversial even if the implementation is flawed, accept the other things that are fully acceptable in the time it takes to do that, and then call it a wrap? The current *plan* is for a 14-month cycle. And it will probably slip. Some of the queued items are going to be very old by the time you go to 8.4 on this program, which seems a shame. That sort of plan looks to me more like a 'major refactoring to get to 9.0' sort of plan than an incremental release. James ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Postgresql Materialized views
Mark Mielke wrote: Joshua D. Drake wrote: Unless you are going to *pay* for it - you do realize that the best way to get it implemented would be to open up the source code and give it a try yourself? Because users possibly want to do that - use it?

Some of us have better things to do than go through the learning curve of how the internals of a non-trivial system work. Does that really mean it's unreasonable to voice an opinion on what would make the system more useful? Offensive is relative. I find it offensive when people demand things on one of the many mailing lists I read without providing anything to the community. If your view of the community is that it should be insular and closed to those who can't or won't be developers, then fine. But taking that attitude will pretty much guarantee that your system never amounts to more than a hill of beans. One of the major problems with open source as a whole is that you get this 'fix it yourself or pay for it' business, which provides no way to spread the cost over many users who would all have something to gain - but none of whom can justify footing the bill for the entire development. Most of us are in that position as users, even if we do have skills that would enable us to help - we have our own issues to deal with. Buying support isn't the answer - it's not support that's needed, after all, so much as an ability to buy a share of influence over a roadmap. Do you want to ensure that only the very rich in cash or time can have any influence? You're going the right way about it with your attitude, which appears deliberately user-hostile. What do you want? James ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Sorting Improvements for 8.4
Ron Mayer wrote: Or do you mean being able to perform parts of the query plan fully in parallel? If this, then one would need a lot more than ParallelSort... I wouldn't recommend that - it seems like a Hard Problem.

Isn't it the case that the implicit unions from processing partitioned data provide a more-or-less ideal opportunity here? I certainly have sympathy for parallelising expensive queries to bring the best response time down, even if the average under full load goes up slightly, since any implied locks (including pinning of read-ahead pages) will be released sooner. And when load is light, users who are online get more of the hardware they paid for. James ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] .NET or Mono functions in PG
Tom Lane wrote: It is also pretty well established that if pltcl or plperl cause the backend to become multithreaded, things break horribly. I strongly

Isn't that something that needs to be fixed? It's one thing not to allow multiple threads into your own code, but not being able to have a threaded library linked in is going to become more and more of a problem. James ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] .NET or Mono functions in PG
Magnus Hagander wrote: I did look at this at some earlier point as well. One big problem at that time was that once you embedded mono, you had all sorts of threads running in your backend ;-)

Is that necessarily a problem? You have to compile with a thread-capable libc and take some care that the heap implementation is well tuned, but there's no reason why the mono housekeeping threads should cause you any problem, is there? It should be much like embedded Java.

Another way to do it is the PL/J way (I think). Which is starting up a separate process with the VM in it and then do RPC of some kind to it. Which has more overhead per call, but lower per backend etc. And a lot less dangerous.

Given that one would want to embed to get very low latency both on trigger invocation and on calls back into the data engine, I don't really see the point personally. I'm not sure how important it is to make the embedded interface look like a standard interface (in the way that the embedded CLR in MSSQL does - mostly) or whether it can be a thin wrapper over the C API. I think there's good mileage in starting with the thin wrapper; then at least some common business logic code can be used. James ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Win32 shared memory speed
Magnus Hagander wrote: IIRC, there hasn't been any direct benchmark for it (though I've wanted to do that but had no time), but it's been the only real explanation put forward for the behaviour we've seen. And it does make sense given the thread-centric view of the Windows memory manager. /Magnus

How is it supposed to be slow, once it's mapped into your process? There's no OS interaction at all then. If you are suggesting that the inter-process synch objects are slow, then that may be so: just use an interlocked increment and a spin lock in place of a mutex, and use an associated event to wake up if necessary. You don't have to use a named kernel mutex, though it may be handy while setting up the shared memory. If you are repeatedly changing the mappings - well, that may be something that needs optimisation. James ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Proposal: real procedures again (8.4)
Josh Berkus wrote: Not only would they be generally useful for SP programming, but multisets would eliminate one of the big hurdles in re-writing T-SQL stored procedures in PG, and thus make it easier to port from SQL Server. You don't hear a lot of demand for multisets on the mailing lists because we're not getting those SQL Server / Sybase crossovers now.

It's true that multiple result sets are a big deal in T-SQL programming, but I think you'll also need to provide a way for the locking model to behave in a similar way and, very importantly, to emulate the after-statement triggers' view of the new and old images. James ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] INSERT/SELECT and excessive foreign key checks
Andrew Dunstan wrote: Could we achieve the same thing in a more general way by having a per-FK tiny (say 10?) LRU cache of values checked. Then it wouldn't only be restricted to constant expressions. Of course, then the trigger would need to keep state, so it might well be too complex (e.g. what if there are concurrent inserts?)

I was wondering whether one could try to identify what might be termed 'enum tables' that exist to provide lookups. There are perhaps three main types of table that are the target of a foreign key lookup:

1) tables that map to program language enumerations: typically small (fewer than a hundred rows) and changing very infrequently.

2) tables that hold quasi-static reference data where rows are 'never' deleted (they may be amended, perhaps to identify that they are logically inactivated, but are still needed for reference lookup from existing rows elsewhere) - typically customer definitions, product definitions, site definitions and the sort of thing that is often regarded as 'static data' by a user application session but which may change.

3) master records in master/detail relationships, such as order/orderline.

If you can have mechanisms that reflect the likelihood of an update and optimise accordingly, then hopefully performance in real-world applications can be improved. In the case of 1), for example, we might reasonably have a single logical read/write lock that controls access to ALL such tables in a schema, and a single 'update generation count'. The lock would effectively provide repeatable-read stability across all of the tables (a multi-table table lock) while in place, and the generation count (which can be a tran id) indicates to caching processes when the cache is stale. This means that, unlike normal MVCC, the readers will block a writer, but in this case we expect the write to happen only during application release.
In the case of 2), we can't use the cross-table lock, and the tables may be large, so the suggested LRU cache per table (with a table-level read/write lock again) may be most effective, though we may elect to regard a read lock as allowing any operation that doesn't invalidate the primary key. And in the case of 3) we don't do anything special at all. I certainly think that anything which can materially reduce lookups in case 1), and hopefully 2), will encourage good database design and declarative referential integrity - in some classes of high-performance application the cost of doing the checks inline with an update is too high, which is a shame. James ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
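The tiny per-FK LRU cache Andrew suggests could be as small as the sketch below - a move-to-front array of the last ten key values already verified to exist in the referenced table. All names are invented, and the hard part he raises (discarding the cache when the referenced table changes concurrently) is deliberately not shown:

```c
#include <string.h>
#include <stdbool.h>

#define FK_CACHE_SLOTS 10

/* Hypothetical per-trigger cache of recently verified FK values,
 * most recently used first. */
typedef struct
{
    int  nused;
    long keys[FK_CACHE_SLOTS];
} FkCheckCache;

/* Returns true if key was already verified (skip the index probe);
 * otherwise records it as verified and returns false. */
bool
fk_cache_lookup_or_add(FkCheckCache *cache, long key)
{
    for (int i = 0; i < cache->nused; i++)
    {
        if (cache->keys[i] == key)
        {
            /* Hit: move to front so hot keys stay resident. */
            memmove(&cache->keys[1], &cache->keys[0], i * sizeof(long));
            cache->keys[0] = key;
            return true;
        }
    }
    /* Miss: shift everything down one slot, dropping the least
     * recently used entry if the cache is full. */
    if (cache->nused < FK_CACHE_SLOTS)
        cache->nused++;
    memmove(&cache->keys[1], &cache->keys[0],
            (cache->nused - 1) * sizeof(long));
    cache->keys[0] = key;
    return false;
}
```

For the 'enum table' case 1), the hit rate of even ten slots should be close to 100%, which is why the distinction between table types matters.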
Re: [HACKERS] Another idea for index-only scans
Decibel! wrote: The advantage to Bruce's idea is that it sounds pretty simple to implement. While it wouldn't be of use for many general cases, it *would* be useful for read-only tables, ie: old partitions.

Wouldn't the most common case be foreign key checks against tables that essentially map application enums to display strings? This is a rather common scenario. It would be nice if such tables (which are typically small) could be retained in each backend process, with a simple check that the cached data is still valid. James ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
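The 'simple check that the cached data is still valid' could be a shared generation counter that writers bump on any change to the table, so a backend only reloads its private copy when the counter has moved. A sketch with invented names (in reality the counter would live in shared memory and the reload would read the real table):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a counter in shared memory, bumped on every change
 * to the cached table. */
atomic_long shared_generation;

/* Per-backend private cache state. */
typedef struct
{
    long cached_generation;
    bool loaded;
} BackendTableCache;

/* Returns true if the backend's cached rows may be used as-is;
 * false means the cache was (re)loaded on this call. */
bool
cache_is_fresh(BackendTableCache *cache)
{
    long now = atomic_load(&shared_generation);

    if (!cache->loaded || cache->cached_generation != now)
    {
        /* ... reload rows from the real table here ... */
        cache->cached_generation = now;
        cache->loaded = true;
        return false;
    }
    return true;
}

/* Any writer that modifies the cached table calls this. */
void
table_changed(void)
{
    atomic_fetch_add(&shared_generation, 1);
}
```

The check is a single shared-memory read per use, so the common case (nothing changed) costs almost nothing.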
[HACKERS] Connection and transaction triggers
I've been looking at the new trigger features planned for Firebird 2.1 and I'm a bit jealous. I'm interested in 'stateful' proc packages that can benefit from accumulating data from updates during a transaction and then performing an action at the end - perhaps doing a bulk insert in a pre-commit trigger, or communicating with an external system in a post-commit trigger, and using a begin tran or rollback trigger to tidy up the accumulator. Would this be a difficult thing to add? I quite like their temporary table treatment too. James ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match