Re: [HACKERS] changing MyDatabaseId
Robert,

On 11/15/2010 05:39 AM, Robert Haas wrote:
> I've spent a few hours poring over the source code with a
> coarse-toothed comb, trying to figure out just exactly what might
> break if we changed MyDatabaseId after backend startup time, or in
> other words, allowed a backend to unbind from the database to which it
> was originally bound and rebind to a new one. This is related to the
> periodic conversations we've had about a built-in connection pooler,
> and/or maintaining a pool of worker threads that could be used to
> service parallel query, replication sets, etc. What follows is not
> meant to be a concrete design proposal; it's basic research that may
> lead to a proposal at some time in the future. Still, comments are
> welcome.

Thanks a lot for doing that; it saved me a couple of hours (presumably
more than it cost you :-)

> Thoughts?

The question obviously is whether or not this is faster than just
terminating one backend and starting a new one, which basically costs
an additional termination and re-creation of a process (i.e. fork())
AFAICS. Or what other savings do you envision? If that's it, it
certainly seems like a huge amount of work for very little benefit. Or
does this feature enable something that's impossible to do otherwise?

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wednesday 17 November 2010 11:04:04 Markus Wanner wrote:
> On 11/15/2010 05:39 AM, Robert Haas wrote:
>> I've spent a few hours poring over the source code with a
>> coarse-toothed comb, trying to figure out just exactly what might
>> break if we changed MyDatabaseId after backend startup time [...]
>
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one, which basically costs
> an additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?

Well, one could optimize most of the resetting away if the old
MyDatabaseId and the new one are the same - an optimization which is
hardly possible with forking new backends.

Also I think it could lower the impact due to locking the procarray and
related areas.

Andres
Re: [HACKERS] multi-platform, multi-locale regression tests
Kevin,

On 11/13/2010 01:28 AM, Kevin Grittner wrote:
> Should anyone else run into this, it's controlled by this in the test
> scheduling definitions (the tdef values):
>
>     'xfail': True
>
> There are other test flags you can override here, like 'skip' to skip
> a test.

Correct. Looks like dtester urgently needs documentation...

Regards

Markus Wanner
Re: [HACKERS] changing MyDatabaseId
Andres,

On 11/17/2010 11:38 AM, Andres Freund wrote:
> Well, one could optimize most of the resetting away if the old
> MyDatabaseId and the new one are the same - an optimization which is
> hardly possible with forking new backends.

Uh? Why not simply re-use the same backend, then? Or do you think of
re-connecting to the same database as a way to reset your connection?

> Also I think it could lower the impact due to locking the procarray
> and related areas.

That may be, yes. But as pointed out by Tom and Alvaro, you'd have to
adhere to a compatible sequence of changes to shared memory to avoid
race conditions. That possibly requires using a very similar locking
sequence as the combination of a detaching and a newly starting backend
would use.

Regards

Markus Wanner
Re: [HACKERS] changing MyDatabaseId
On Wednesday 17 November 2010 11:58:33 Markus Wanner wrote:
> On 11/17/2010 11:38 AM, Andres Freund wrote:
>> Well, one could optimize most of the resetting away if the old
>> MyDatabaseId and the new one are the same - an optimization which is
>> hardly possible with forking new backends.
>
> Uh? Why not simply re-use the same backend, then? Or do you think of
> re-connecting to the same database as a way to reset your connection?

I am thinking of a connection-pooler like setup. Quite often your main
load goes towards a single database - in that situation you don't have
to reset the database id most of the time.

Andres
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 12:09 PM, Andres Freund wrote:
> I am thinking of a connection-pooler like setup. Quite often your main
> load goes towards a single database - in that situation you don't have
> to reset the database id most of the time.

Okay, so that's what I'd call a connection reset, or probably a
re-authentication feature which allows switching users (and thus
permissions on that same database). Could make sense for a connection
pooler, yes. Not having to flush caches in that case could even turn
out to be a good win. And I think it's a lot simpler to just switch the
user than to switch the database. Such a feature looks more promising
to me. (Do we have other connection-related and unchangeable state?)

As side notes: for the bgworkers in Postgres-R, I'm currently re-using
existing backends. As they only do change set application, the amount
of connection resetting required is minimal (well, there isn't a client
connection for these kinds of backends in the first place, but that's
another story). Plus they are always acting as superuser, so no
authentication or user switching is required in that case.

For parallel querying as well as async transactions, it might make
sense to be able to switch users but remain connected to the same
database.

Regards

Markus Wanner
[HACKERS] Overlap operators for ARRAY and NULLs
There might be an inconsistency in the overlap operators for array
types (@>, <@, and &&) when the argument arrays contain NULLs.

- SELECT 2 = ANY (ARRAY[1, NULL])    => NULL
- SELECT ARRAY[2] <@ ARRAY[1, NULL]  => false

NULL means unknown by definition, so should it return NULL rather than
false?

I found the issue when I read the spec of MULTISET. In the SQL
standard, the following expression returns NULL. I was thinking to make
SUBMULTISET OF an alias of the <@ operator, but they seem to be
incompatible.

- MULTISET[2] SUBMULTISET OF MULTISET[1, NULL]  => NULL

Will we change the operator's behavior? It would be more consistent,
but incompatible with older server versions. If that is impossible,
I'll add a separate submultiset_of() function instead of the <@
operator for MULTISET support.

--
Itagaki Takahiro
Re: [HACKERS] changing MyDatabaseId
Excerpts from Markus Wanner's message of mié nov 17 07:04:04 -0300 2010:
>> Thoughts?
>
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one. Which basically costs
> an additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?

I don't think it's a speed thing only. It would be a great thing to
have in autovacuum, for example, where we have constant problem reports
because the system failed to fork a new backend. If we could simply
reuse an already existing one, it would be a lot more robust.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
> I don't think it's a speed thing only. It would be a great thing to
> have in autovacuum, for example, where we have constant problem
> reports because the system failed to fork a new backend. If we could
> simply reuse an already existing one, it would be a lot more robust.

Hm, that's an interesting point. To actually increase robustness, it
would have to be a failure scenario that (temporarily) prevents
forking, but still allows an existing backend to continue to do work
(i.e. the ability to allocate memory or open files comes to mind).

Any idea about what's usually causing these fork() failures? I'm asking
because I'm afraid that, in case of an out-of-memory condition for
example, we'd just hit an OOM error later on, without being able to
perform the VACUUM job either.

Regards

Markus Wanner
[HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Hi,

Here is my review of the 'rollback sequence reset for TRUNCATE ...
RESTART IDENTITY' patch.

- Is the patch in context diff format?
  It's in git diff format. I guess it's OK?
- Does it apply cleanly to the current git master?
  Yes
- Does it include reasonable tests, necessary doc patches, etc?
  Doc: Yes, it removes the warning about TRUNCATE ... RESTART IDENTITY,
  which is the point of the patch
  Tests: There is a new regression test added for restart identity. And
  'make check' passes (tested on linux).

- Usability review (skills needed: test-fu, ability to find and read spec)
  - Read what the patch is supposed to do, and consider:
    - Does the patch actually implement that? Yes.
    - Do we want that? I think so, it removes a trap limitation of
      truncate
    - Do we already have it? No
    - Does it follow SQL spec, or the community-agreed behavior? I
      think so
    - Does it include pg_dump support (if applicable)? Not applicable
    - Are there dangers? Not that I think of
    - Have all the bases been covered? I think so

- Feature test (skills needed: patch, configure, make, pipe errors to log)
  - Apply the patch, compile it and test:
    - Does the feature work as advertised? Yes. It works consistently,
      isn't fooled by savepoints or multiple serials in a table, or
      concurrent transactions
    - Are there corner cases the author has failed to consider? I don't
      think so
    - Are there any assertion failures or crashes? No

Performance review (skills
Re: [HACKERS] changing MyDatabaseId
Excerpts from Markus Wanner's message of mié nov 17 09:57:18 -0300 2010:
> On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
>> I don't think it's a speed thing only. It would be a great thing to
>> have in autovacuum, for example, where we have constant problem
>> reports because the system failed to fork a new backend. If we could
>> simply reuse an already existing one, it would be a lot more robust.
>
> Hm, that's an interesting point. To actually increase robustness, it
> would have to be a failure scenario that (temporarily) prevents
> forking, but allows an existing backend to continue to do work (i.e.
> the ability to allocate memory or open files come to mind).

Well, the autovacuum mechanism involves a lot of back-and-forth between
launcher and postmaster, which includes some signals, a fork() and
backend initialization. The failure possibilities are endless. Fork
failure communication is similarly brittle.

> Any idea about what's usually causing these fork() failures? I'm
> asking because I'm afraid that for example, in case of an out of
> memory condition, we'd just hit an OOM error later on, without being
> able to perform the VACUUM job, either.

To be honest I have no idea. Sometimes the server is just too loaded.
Right now we have this delay: if the process is not up and running in
60 seconds then we have to assume that something happened, and we no
longer wait for it. If we knew the process was already there, we could
leave it alone; we'd know it would get to its duty eventually.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] describe objects, as in pg_depend
Hi,

A customer of ours (Enova Financial) requested the ability to describe
objects in pg_depend. The wiki contains a simplistic SQL snippet that
does the task, but only for some of the object types, and it's rather
ugly. It struck me that we could fulfill this very easily by exposing
the getObjectDescription() function at the SQL level, as in the
attached module.

I propose we include this as a builtin function. Opinions?

--
Álvaro Herrera alvhe...@alvh.no-ip.org

#include "postgres.h"

#include "catalog/dependency.h"
#include "fmgr.h"
#include "utils/builtins.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum describe_object(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(describe_object);

Datum
describe_object(PG_FUNCTION_ARGS)
{
    Oid         classid = PG_GETARG_OID(0);
    Oid         objid = PG_GETARG_OID(1);
    int32       subobjid = PG_GETARG_INT32(2);
    ObjectAddress address;
    char       *description = NULL;

    if (classid != InvalidOid)
    {
        address.classId = classid;
        address.objectId = objid;
        address.objectSubId = subobjid;
        description = getObjectDescription(&address);
    }

    if (!description || description[0] == '\0')
        PG_RETURN_NULL();

    PG_RETURN_TEXT_P(cstring_to_text(description));
}
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 02:19 PM, Alvaro Herrera wrote:
> Well, the autovacuum mechanism involves a lot of back-and-forth
> between launcher and postmaster, which includes some signals, a fork()
> and backend initialization. The failure possibilities are endless.
> Fork failure communication is similarly brittle.

I certainly agree to that. However, a re-connecting mechanism wouldn't
allow us to get rid of the existing avworker startup infrastructure
entirely. And for increased robustness, we'd require a less brittle
re-connecting mechanism. Given Robert's list, that doesn't seem
trivial, either. (But still doable, yes.)

> Right now we have this delay, if the process is not up and running in
> 60 seconds then we have to assume that something happened, and we no
> longer wait for it. If we knew the process was already there, we could
> leave it alone; we'd know it would get to its duty eventually.

You are assuming the presence of a pool here. Which is fine, it's just
not something that a re-connecting feature would solve per se. (Says he
who coded the bgworkers pool thingie.)

Regards

Markus Wanner
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wed, Nov 17, 2010 at 8:13 AM, Marc Cousin cousinm...@gmail.com wrote:
> - Does the feature work as advertised? Yes. It works consistently,
>   isn't fooled by savepoints or multiple serials in a table, or
>   concurrent transactions

I haven't tested this nor read the patch, but I'm wondering what
happens in the presence of a prepared transaction (2PC). Did you try
concurrent transactions with different serialization levels?

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wednesday 17 November 2010 15:50:36, Jaime Casanova wrote:
> I haven't tested this nor read the patch, but I'm wondering what
> happens in the presence of a prepared transaction (2PC). Did you try
> concurrent transactions with different serialization levels?

I haven't tested with 2PC. I didn't check with different isolation
levels either. I just verified that locking was happening as it should:
truncate is blocked by a transaction already locking the table with an
AccessShareLock and vice versa. And that rolling back, and rolling back
to a savepoint, restore the sequence to the correct state: the sequence
isn't restored to its value at the savepoint, but to its last value
before the truncate.

I don't see a special test case with different isolation levels or 2PC.
What do you have in mind?
Re: [HACKERS] unlogged tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> fsync()ing the file at shutdown doesn't seem too bad to me from a
> performance point of view, we tolerate that for all other tables. And
> you can always truncate the table yourself before shutdown.

The objection to that was not about performance. It was about how to
find out what needs to be fsync'd.

			regards, tom lane
Re: [HACKERS] describe objects, as in pg_depend
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
> A customer of ours (Enova Financial) requested the ability to describe
> objects in pg_depend. The wiki contains a simplistic SQL snippet that
> does the task, but only for some of the object types, and it's rather
> ugly. It struck me that we could fulfill this very easily by exposing
> the getObjectDescription() function at the SQL level, as in the
> attached module.

What's the point of the InvalidOid check? It seems like you're mostly
just introducing a corner case: sometimes, but not always, the function
will return NULL instead of failing for bad input. I think it should
just fail always.

			regards, tom lane
Re: [HACKERS] Isn't HANDLE 64 bits on Win64?
Magnus Hagander mag...@hagander.net writes:
> Do you still have a reference to the page that said they will never be
> assigned that high?

http://msdn.microsoft.com/en-us/library/ms810720.aspx
which says, under the section title "USER and GDI handles are sign
extended 32b values":

    To facilitate the porting, a decision has been made that these
    system handles should stay as 32b values, sign extended to 64b on
    the 64b platform. That is, the individual handle types are still
    based on the HANDLE type, which maps to void *, and so the size of
    the handle is the size of the pointer, i.e. 4 bytes on 32b and 8
    bytes on 64b. However, the actual value of the handle on the 64b
    platform, (i.e. the meaningful bits), fits within the lower 32b,
    while the upper bits just carry the sign. This should make it easy
    to port the majority of the application code. Handling of the
    special values, like -1, should be fairly transparent. It also
    should agree nicely with all the cases where the handles had been
    remoted with the help of the IDL definitions from the public file
    wtypes.idl. However, care needs to be taken when remoting the
    handles was done via a DWORD, as the upper long should be properly
    sign extended on the 64b side. The app should use HandleToLong()
    and LongToHandle() macros (inline functions) to do the casting
    right.

What's not clear to me is whether the section title means that only
certain handles have this guarantee, and if so whether we have to worry
about running into ones that don't.

			regards, tom lane
Re: [HACKERS] changing MyDatabaseId
Markus Wanner mar...@bluegap.ch writes:
> On 11/17/2010 02:19 PM, Alvaro Herrera wrote:
>> Well, the autovacuum mechanism involves a lot of back-and-forth
>> between launcher and postmaster, which includes some signals, a
>> fork() and backend initialization. The failure possibilities are
>> endless. Fork failure communication is similarly brittle.
>
> I certainly agree to that. However, a re-connecting mechanism wouldn't
> allow us to get rid of the existing avworker startup infrastructure
> entirely.

I'm afraid that any such change would trade a visible, safe failure
mechanism (no avworker) for invisible, impossible-to-debug data
corruption scenarios (due to failure to reset some bit of cached
state). It certainly won't give me any warm fuzzy feeling that I can
trust autovacuum.

			regards, tom lane
[HACKERS] Ident authentication overloading
Currently, we overload "ident" to mean both unix socket authentication
and ident over tcp, depending on what type of connection it is. This is
quite unfortunate - one of them being one of the most secure options we
have, the other one being one of the most *insecure* ones (really?
ident over tcp? does *anybody* use that intentionally today?)

Should we not consider naming those two different things? If not now,
then at least put it on the TODO of things to do the next time we need
to break backwards compatibility with the format of pg_hba.conf?
Though if we're going to break backwards compatibility anywhere, pg_hba
is probably one of the least bad places to do it...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Ident authentication overloading
Magnus Hagander mag...@hagander.net writes:
> Currently, we overload "ident" to mean both unix socket authentication
> and ident over tcp, depending on what type of connection it is. This
> is quite unfortunate - one of them being one of the most secure
> options we have, the other one being one of the most *insecure* ones
> (really? ident over tcp? does *anybody* use that intentionally
> today?) Should we not consider naming those two different things?

Maybe, but it seems like the time to raise the objection was six or
eight years ago :-(. Renaming now will do little except to introduce
even more confusion.

			regards, tom lane
Re: [HACKERS] describe objects, as in pg_depend
Excerpts from Tom Lane's message of mié nov 17 12:20:06 -0300 2010:
> Alvaro Herrera alvhe...@alvh.no-ip.org writes:
>> A customer of ours (Enova Financial) requested the ability to
>> describe objects in pg_depend. [...]
>
> What's the point of the InvalidOid check? It seems like you're mostly
> just introducing a corner case: sometimes, but not always, the
> function will return NULL instead of failing for bad input. I think
> it should just fail always.

If the check is not there, the calling query will have to prevent the
function from being called on rows having OID=0 in pg_depend. (These
rows show up in the catalog for pinned objects.) The query becomes
either incomplete (because you don't report pinned objects) or awkward
(because you have to insert a CASE expression to avoid calling the
function in that case).

I don't think it's all that necessary anyway. If the function goes in
without that check, it will still be a huge improvement over the status
quo.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 16:39, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> Currently, we overload "ident" to mean both unix socket
>> authentication and ident over tcp, depending on what type of
>> connection it is. [...] Should we not consider naming those two
>> different things?
>
> Maybe, but it seems like the time to raise the objection was six or
> eight years ago :-(. Renaming now will do little except to introduce
> even more confusion.

For existing users, yes. For new users, no. I certainly get comments on
it pretty much every time I do training that includes explaining
pg_hba options.

The question is if it's worth confusing our existing users a little,
at the advantage of not confusing new users.

We could of course also just drop ident-over-tcp completely, but there
might be some poor guy out there who actually *uses* it :-)

And I agree it would've been much better to do it years ago. That
doesn't mean we shouldn't at least *consider* doing it at some point.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] describe objects, as in pg_depend
Alvaro Herrera alvhe...@commandprompt.com writes:
> Excerpts from Tom Lane's message of mié nov 17 12:20:06 -0300 2010:
>> What's the point of the InvalidOid check?
>
> If the check is not there, the calling query will have to prevent the
> function from being called on rows having OID=0 in pg_depend. (These
> rows show up in the catalog for pinned objects.)

Hmm. It would be good to document that motivation somewhere. Also, for
my own taste it would be better to do

    /* for pinned items in pg_depend, return null */
    if (!OidIsValid(catalogId))
        PG_RETURN_NULL();

    ... straight line code here ...

rather than leave the reader wondering whether there are any other
cases where the function is intended to return null.

Oh, one other gripe: probably better to name it pg_describe_object.

			regards, tom lane
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 04:25 PM, Tom Lane wrote:
> I'm afraid that any such change would trade a visible, safe failure
> mechanism (no avworker) for invisible, impossible-to-debug data
> corruption scenarios (due to failure to reset some bit of cached
> state). It certainly won't give me any warm fuzzy feeling that I can
> trust autovacuum.

Well, Alvaro doesn't quite seem to have a warm fuzzy feeling with the
status quo, either. And I can certainly understand his concerns.

But yes, the OS-level process separation and cache state reset
guarantee that an exit() / fork() pair provides is hard to match up
against in user space. So Alvaro's argument for robustness only stands
under the assumption that we can achieve a perfect cache state reset
mechanism. Now, how feasible is that? Are there any kind of tools that
could help us check?

Regards

Markus Wanner
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> fsync()ing the file at shutdown doesn't seem too bad to me from a
>> performance point of view, we tolerate that for all other tables.
>> And you can always truncate the table yourself before shutdown.
>
> The objection to that was not about performance. It was about how to
> find out what needs to be fsync'd.

Just a crazy brainstorming thought, but if this is a clean shutdown
then all the non-unlogged tables have been checkpointed, so they should
have no dirty pages in them anyway. So we could just fsync everything.

--
greg
Re: [HACKERS] contrib: auth_delay module
On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote:
> On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote:
>> If we don't need a PoC module for each new hook, I'm not strongly
>> motivated to push it into the contrib tree. How about your opinion?
>
> I'd say let it go, unless someone else feels strongly about it.

I would use this module (rate limit new connection attempts) as soon as
I could. Putting a cap on potential CPU usage on a production DB by
either a blackhat or a configuration mistake by a developer (leaving
the port accessible) is definitely useful, even in the face of
max_connections. My production apps already have their connections and
seldom need new ones. They all use CPU, though.

Ross

--
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer  Admin, Research Scientist    phone: 713-348-6166
Connexions http://cnx.org                      fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
Re: [HACKERS] duplicate connection failure messages
Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of sáb nov 13 22:36:31 -0300 2010:
>> OK, I found out how to get the IP address with the attached patch.
>> The problem is that only pghost is set, never pghostaddr. I am not
>> even sure how that would get set for this code because my tests show
>> it is not:
>
> This doesn't work for IPv6 addresses, though. pghostaddr is specified
> by the user on the command line as an optimization to avoid DNS
> lookups IIRC, which is why you don't see the code setting it.

OK, I doubt we want to add complexity to improve this, so I see our
options as:

    o ignore the problem
    o display IPv4/IPv6 labels
    o display only an IPv6 label
    o something else

Comments?

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 04:43:00PM +0100, Magnus Hagander wrote: On Wed, Nov 17, 2010 at 16:39, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Currently, we overload indent meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? Maybe, but it seems like the time to raise the objection was six or eight years ago :-(. Renaming now will do little except to introduce even more confusion. For existing users, yes. For new users, no. Yep. If we're to be a successful project, the vast majority of our users are future users, not current or past ones. I certainly get comments on it pretty much every time I do training that includes explaining pg_hba options. The question is if it's worth confusing our existing users a little, at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) +1 for dropping it completely. We have dropped features--automatic cast to TEXT, for example--that a good deal more of our user base relied on, for reasons less compelling than this. And I agree it would've been much better to do it years ago. That doesn't mean we shouldn't at least *consider* doing it at some point. The sooner, the better, IMHO. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Ident authentication overloading
On 2010-11-17 22:43, Magnus Hagander wrote: at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) As far as I know, companies do use it in their internal networks where they do have a reasonable shot at full control over ident connections. I don't know how easy it would be for them to switch to other methods. Jeroen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ident authentication overloading
On Wed, 2010-11-17 at 16:35 +0100, Magnus Hagander wrote: Currently, we overload ident, meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? The original patch called the Unix domain socket version peer (whereas the name ident comes from the official name of the TCP/IP protocol used). You can look it up in the archives, but I believe the argument for using the name ident for both was that ident was established and the new feature would provide the same functionality. That said, I completely agree with you. Every time I look through a pg_hba.conf I think, that's a terrible name, we should rename this. We could perhaps introduce an alternative name and slowly deprecate the original one.
Re: [HACKERS] unlogged tables
Greg Stark gsst...@mit.edu wrote: If this is a clean shutdown then all the non-unlogged tables have been checkpointed so they should have no dirty pages in them anyways. So we could just fsync everything. Or just all the unlogged tables. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 5:04 AM, Markus Wanner mar...@bluegap.ch wrote: The question obviously is whether or not this is faster than just terminating one backend and starting a new one. I agree. Which basically costs an additional termination and re-creation of a process (i.e. fork()) AFAICS. Or what other savings do you envision? I think the thing I'd like to do (or see somebody else do) is write a test harness that connects to the database, runs a single SELECT * FROM dual (one row, one column table), and then disconnects; and then oprofile the daylights out of the backend processes. In other words, I'd like to measure as exactly as we can the overhead of each part of the startup process. I think that would give us a clearer picture of where the overhead is, and then we could look more directly at which line items might be avoidable for rebinding to a new database. However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. It also occurs to me to wonder whether there's some way that we can speed up backend startup, period. One of the frequent complaints levied against PostgreSQL is that our connections are too expensive. AFAICT, this is partly because backend startup costs are high, and partly because of internal contention, especially around ProcArrayLock. Squeezing on the startup overhead will be valuable no matter what we decide to do about database switching. A variant on this theme would be to try to adjust the startup sequence in some way so that less of it needs to be redone if we switch databases, which might be possible even if a more general solution isn't. 
Unfortunately, I'm not sure how feasible this is, but maybe there's a way... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
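The measurement Robert proposes - time a full connect, run a trivial query, disconnect, and attribute where the overhead goes - can be scripted along these lines. This is a hedged sketch, not an actual harness from the thread: the `connect` callable and its `execute`/`close` interface are placeholders (in practice you would pass, say, a libpq or psycopg2 connection factory and run oprofile against the backends while the loop spins):

```python
import time

def time_connect_cycle(connect, query="SELECT * FROM dual", n=100):
    """Time n full connect -> query -> disconnect cycles and return the
    mean per-cycle wall-clock overhead in seconds.  `connect` is any
    callable returning an object with execute() and close() methods."""
    start = time.perf_counter()
    for _ in range(n):
        conn = connect()
        conn.execute(query)
        conn.close()
    return (time.perf_counter() - start) / n
```

Run against a real server, the per-cycle figure minus the steady-state query cost approximates the startup overhead being discussed.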
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 17:10, Jeroen Vermeulen j...@xs4all.nl wrote: On 2010-11-17 22:43, Magnus Hagander wrote: at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) As far as I know, companies do use it in their internal networks where they do have a reasonable shot at full control over ident connections. I don't know how easy it would be for them to switch to other methods. Yea, I think deleting it is going a bit overboard. If it was a matter of changing it for those who use ident over tcp, I really wouldn't hesitate - they're few :-) But the problem is that it's the ident-over-tcp that's correctly named, not the other one... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 11:00 AM, Greg Stark gsst...@mit.edu wrote: On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: fsync()ing the file at shutdown doesn't seem too bad to me from a performance point of view; we tolerate that for all other tables. And you can always truncate the table yourself before shutdown. The objection to that was not about performance. It was about how to find out what needs to be fsync'd. Just a crazy brainstorming thought, but if this is a clean shutdown, then all the non-unlogged tables have been checkpointed, so they should have no dirty pages in them anyway. So we could just fsync everything. Hmm, that reminds me: checkpoints should really skip writing buffers belonging to unlogged relations altogether; and any fsync against an unlogged relation should be skipped. I need to go take a look at what's required to make that happen, either as part of this patch or as a follow-on commit. It might be interesting to have a kind of semi-unlogged table where we write a special xlog record for the first access after each checkpoint but otherwise don't xlog. On redo, we truncate the tables mentioned, but not any others, since they're presumably OK. But that's not what I'm trying to design here. I'm trying to optimize it for the case where you DON'T care about durability and you just want it to be as fast as possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
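The buffer-filtering Robert describes can be modeled abstractly. This is a toy sketch with invented names, not the actual bufmgr code: a checkpoint writes and fsyncs only dirty buffers whose relation is logged, with an optional flag for flushing unlogged relations too at a clean shutdown:

```python
def checkpoint_targets(dirty_buffers, unlogged_rels, shutdown=False):
    """Return the buffers a checkpoint should write/fsync.
    dirty_buffers: iterable of (relation, block) pairs.
    unlogged_rels: set of relation names that are unlogged.
    shutdown: if True, flush unlogged relations as well, so that a
    clean shutdown/restart preserves their contents."""
    return [(rel, blk) for rel, blk in dirty_buffers
            if shutdown or rel not in unlogged_rels]
```

The `shutdown` flag captures the trade-off in this thread: skipping unlogged buffers keeps checkpoint I/O spikes down during normal operation, while a shutdown checkpoint may still choose to flush them.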
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 17:31, Peter Eisentraut pete...@gmx.net wrote: On Wed, 2010-11-17 at 16:35 +0100, Magnus Hagander wrote: Currently, we overload ident, meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? The original patch called the Unix domain socket version peer (whereas the name ident comes from the official name of the TCP/IP protocol used). You can look it up in the archives, but I believe the argument for using the name ident for both was that ident was established and the new feature would provide the same functionality. Yeah, I vaguely recall that discussion - too lazy to actually look it up :-) I think the argument was definitely wrong, but it didn't seem so at the time... That said, I completely agree with you. Every time I look through a pg_hba.conf I think, that's a terrible name, we should rename this. We could perhaps introduce an alternative name and slowly deprecate the original one. That seems reasonable. Maybe even have the server emit a warning when it sees it (since we now read/parse pg_hba.conf on server start, it would only show up once per server reload, not on every connect). Or maybe just doc-deprecate in 9.1, warning in 9.2, drop in 9.3 or something? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] contrib: auth_delay module
On Wed, Nov 17, 2010 at 10:32 AM, Ross J. Reedstrom reeds...@rice.edu wrote: On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote: On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote: If we don't need a PoC module for each new hook, I'm not strongly motivated to push it into the contrib tree. How about your opinion? I'd say let it go, unless someone else feels strongly about it. I would use this module (rate-limiting new connection attempts) as soon as I could. Putting a cap on the potential CPU usage that a blackhat, or a developer's configuration mistake (leaving the port accessible), can inflict on a production DB is definitely useful, even in the face of max_connections. My production apps already have their connections and seldom need new ones. They all use CPU though. If KaiGai updates the code per previous discussion, would you be willing to take a crack at adding documentation? P.S. Your email client seems to be setting the Reply-To address to a ridiculous value. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
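The idea behind the module - make each failed authentication attempt cost the client a fixed pause, so brute-force password guessing and runaway reconnect loops are throttled - can be sketched in a few lines. This is an illustrative model only, with invented names; the real contrib module would be C code invoked from the server's client-authentication hook:

```python
import time

def authenticate_with_delay(check_password, user, password, delay_ms=500):
    """Run the real credential check; on failure, sleep before
    returning, so each failed attempt takes at least delay_ms.  The
    delay caps how fast an attacker can cycle through guesses."""
    ok = check_password(user, password)
    if not ok:
        time.sleep(delay_ms / 1000.0)
    return ok
```

Note the delay is applied only on failure, so legitimate clients with correct credentials pay nothing.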
Re: [HACKERS] Extensible executor nodes for preparation of SQL/MED
On Wed, Nov 17, 2010 at 2:13 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Wed, Nov 17, 2010 at 10:51, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Wed, Nov 17, 2010 at 03:36, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 16, 2010 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: I am of the opinion that a run-time-extensible set of plan node types is merest fantasy. We will never have that, so putting in place 5% of the infrastructure for it is a waste of time and notational complexity. I think I agree; and moreover there's been no compelling argument made why we would need that for SQL/MED anyway. I see. I'll cut useless parts from my patch. I tested simplified version, but I cannot see measurable performance improvement at this time. So, I'll turn down the whole proposal to use function pointer calls. I'm sorry for all the fuss. Wait a minute... I'm confused. Didn't you have a measurable performance improvement with an earlier version of this patch? If taking out the useless parts removed the performance benefit, maybe they weren't useless? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HOT updates in index-less tables
On Sun, Nov 14, 2010 at 1:12 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 14.11.2010 00:29, Robert Haas wrote: On Sat, Nov 13, 2010 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hannu Krosing ha...@2ndquadrant.com writes: On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote: If a table has no indexes, we will always decide that any same-page update operation is a HOT update, since obviously it isn't modifying any indexed columns. But is there any benefit to doing so? If we do the in-page mini vacuum even without HOT, then there should be no benefit from index-less HOT updates. AFAICS we do: heap_update marks the page as prunable whether it's a HOT update or not. The only difference between treating the update as HOT vs. not-HOT is that if there was more than one HOT update, the intermediate tuples could be completely reclaimed by page pruning (ie, their line pointers go away too). With not-HOT updates, the intermediate line pointers would have to remain in DEAD state until vacuum, since page pruning wouldn't know if there were index entries pointing at them. But that seems like a pretty tiny penalty. I'm not at all convinced that's a tiny penalty. Me neither. It's a tiny penalty when you consider one update, but if you repeatedly update the same tuple, you accumulate dead line pointers until the next real vacuum runs. With HOT updates, you reach a steady state where page pruning is all you need. Then again, if you're repeatedly updating a row in a table with no indexes, presumably it's a very small table or you would create an index on it. And frequently autovacuuming a small index is quite cheap too. The case here is when you have, say, a control table that is managing a gapless sequence, or a materialization table with a very small number of records. These types of tables get updated very frequently, perhaps in every transaction.
People without detailed implementation knowledge of PostgreSQL might assume that leaving an index off the table is faster in these situations. The danger here is that if autovacuum is stalled for whatever reason, you get exponentially bad behavior as the table gets stuffed with dead records. Index-less HOT was put in intentionally. As autovacuum gets smarter and smarter, the reasoning for doing this gets weaker. merlin
Re: [HACKERS] GiST insert algorithm rewrite
Hmm, will have to do some benchmarking on that. I'm using the Consistent function when walking down to check if the downlink needs to be updated, and assumed that it would be insignificant compared to the cost of calling Penalty on all the keys on the page. Why Consistent?! That's impossible - you don't know the right strategy number, and an index whose storage type differs from the indexed type may not accept the same type as the query. An index over tsvector is an example. There should be no difference in performance here AFAICS. The children need to be updated a second time to clear the flag, but we don't release the locks on them in the middle, and we're only talking about setting a single flag, so it should make no difference. Agree with that -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 4:52 PM, Robert Haas robertmh...@gmail.com wrote: However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. Everything you said is true but there's more. A freshly created backend needs to build relcache entries and for every relation in your query. A reused connection eventually warms up the relcache and syscaches and can plan new queries using them without doing any syscalls. And of course if it's a query that's already been planned might be able to reuse the entire plan structure without replanning it. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST insert algorithm rewrite
Sorry, I missed the beginning of the discussion on GiST, so I read it in the web mail archive. You wrote: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00939.php [skip] 0. (the child page is locked) 1. The parent page is locked. 2. The child page is split. The original page becomes the left half, and new buffers are allocated for the right halves. 3. The downlink is inserted on the parent page (and the original downlink is updated to reflect only the keys that stayed on the left page). While keeping the child pages locked, the NSN field on the children is updated with the new LSN of the parent page. ... The scan checks that by comparing the LSN it saw on the parent page with the NSN on the child page. If parent LSN < NSN, we saw the parent before the downlink was inserted. Now, the problem with crash recovery is that the above algorithm depends on the split keeping the parent and child locked until the downlink is inserted in the parent. If you crash between steps 2 and 3, the locks are gone. If a later insert then updates the parent page, because of a split on some unrelated child page, that will bump the LSN of the parent above the NSN on the child. Scans will see that the parent LSN > child NSN, and will no longer follow the rightlink. [skip] I disagree with that opinion: if we crash between 2 and 3, then why would anybody update the parent before WAL replay? The WAL replay process in this case should complete the child split by inserting an invalid pointer, and the tree becomes correct again, although it needs to repair the invalid pointers. The same situation holds for b-tree: WAL replay repairs incomplete splits before any other processing. Or do I miss something important? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
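The invariant being debated can be made concrete with a toy model (hypothetical names; the real logic lives in the GiST scan code): a split stamps the children's NSN with the parent page's new LSN, and a scan follows a child's rightlink only when the parent version it saw predates that split, i.e. parent LSN < child NSN. A crash between steps 2 and 3 followed by an unrelated update that bumps the parent's LSN past the NSN silently breaks the test:

```python
def must_follow_rightlink(parent_lsn_seen, child_nsn):
    """A scan saw the parent page at LSN parent_lsn_seen.  If that
    predates the split that stamped child_nsn, the downlink for the
    right half was not on the parent yet, so the scan must chase the
    child's rightlink to find the moved tuples."""
    return parent_lsn_seen < child_nsn
```

With a split at LSN 100 (children get NSN = 100), a scan that saw the parent at LSN 90 correctly follows the rightlink; after a crash plus an unrelated parent update at LSN 110, a new scan sees 110 > 100 and wrongly skips it - the hazard Heikki describes.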
Re: [HACKERS] unlogged tables
Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ident authentication overloading
Magnus Hagander mag...@hagander.net writes: If it was a matter of changing it for those who use ident over tcp, I really wouldn't hesitate - they're few :-) But the problem is that it's the ident-over-tcp that's correctly named, not the other one... Yeah, renaming the TCP version would be quite wrong. If we're going to do something about this, I agree with Peter's suggestion: add peer as the preferred name for the Unix-socket method, and deprecate but don't remove ident. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
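Under the naming Tom endorses, a pg_hba.conf could eventually read as follows (an illustrative fragment only; "peer" is the proposed spelling and did not yet exist as a method name at the time of this thread, while "ident" would remain the name of the RFC 1413 TCP method):

```
# TYPE  DATABASE  USER  ADDRESS       METHOD
local   all       all                 peer    # Unix-socket credential check (proposed name)
host    all       all   10.0.0.0/8    ident   # ident protocol lookup over TCP
```

The deprecated spelling would keep accepting "ident" on local lines for compatibility while documentation steers new users toward "peer".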
Re: [HACKERS] GiST insert algorithm rewrite
On 17.11.2010 19:46, Teodor Sigaev wrote: I disagree with that opinion: if we crash between 2 and 3 then why will somebody update parent before WAL replay? WAL replay process in this case should complete child split by inserting invalid pointer and tree become correct again, although it needs to repair invalid pointers. The same situation with b-tree: WAL replay repairs incomplete split before any other processing. Or do I miss something important? Yeah, see the thread that started this: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php http://archives.postgresql.org/message-id/12375.1289429...@sss.pgh.pa.us The code currently relies on the end-of-recovery processing to finish the incomplete split, but I'm trying to get rid of that end-of-recovery processing altogether. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] unlogged tables
On 17.11.2010 17:11, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: fsync()ing the file at shutdown doesn't seem too bad to me from a performance point of view, we tolerate that for all other tables. And you can always truncate the table yourself before shutdown. The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 12:42 PM, Greg Stark gsst...@mit.edu wrote: On Wed, Nov 17, 2010 at 4:52 PM, Robert Haas robertmh...@gmail.com wrote: However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. Everything you said is true but there's more. A freshly created backend needs to build relcache entries and for every relation in your query. A reused connection eventually warms up the relcache and syscaches and can plan new queries using them without doing any syscalls. And of course if it's a query that's already been planned might be able to reuse the entire plan structure without replanning it. I think you're missing the point. If we switch databases, all cached relations and plans have to be flushed anyway. We're talking about what might NOT need to be flushed on switching databases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
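Tom's proposed invalidation rule - stamp each SeqTable entry with the sequence's relfilenode and throw away cached unissued values whenever the stamp no longer matches - can be modeled like this (a simplified sketch with invented names, not the actual sequence.c structures; TRUNCATE ... RESTART IDENTITY would rewrite the sequence into a new relfilenode, and a rollback would put the old relfilenode back, so a stale cache is detected either way on the next nextval):

```python
class SeqCacheEntry:
    """Per-backend cached state for one sequence, keyed by relfilenode."""
    def __init__(self, relfilenode):
        self.relfilenode = relfilenode
        self.cached_values = []   # unissued nextval values

def next_value(entry, current_relfilenode, refill):
    """Return the next value, discarding cached state that belongs to a
    rewritten (or rolled-back) version of the sequence."""
    if entry.relfilenode != current_relfilenode:
        entry.cached_values = []              # cache is for a dead version
        entry.relfilenode = current_relfilenode
    if not entry.cached_values:
        entry.cached_values = refill()        # fetch a fresh batch
    return entry.cached_values.pop(0)
```

The open question in the mail - what to do about currval - is untouched here; this only covers the unissued-value cache.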
Re: [HACKERS] unlogged tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 17.11.2010 17:11, Tom Lane wrote: The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? Hmm ... that's a good point. If we simply treat unlogged tables the same as regular for checkpointing purposes, don't we end up having flushed them all correctly during a shutdown checkpoint? I was thinking that WAL-logging had some influence on that logic, but it doesn't. Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote: Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. I haven't been; I'm not subscribed to general; it'd be useful to CC me next time. Reading through the thread in the archives, it seems like people are mostly confused. Some are confused about the current behavior of the patch (no, it really does always truncate your tables, I swear); others are confused about how WAL logging works (of course a backend crash doesn't truncate an ordinary table - that's because it's WAL LOGGED); and still others are maybe not exactly confused but hoping that unlogged table = MyISAM (try not to corrupt your data, but don't get too bent out of shape about the possibility that it may get corrupted anyway). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander wrote: I basically report the connection error string if it starts with FATAL:. I originally tried to check for an ERRCODE_INVALID_PASSWORD error field (see // comments), but it seems there is no way to access this, i.e. PQgetResult(conn) on a connection failure is always NULL. Anyway, perhaps FATAL is a better test because it will report any major failure, not just a .pgpass one. Patch attached. Bad Bruce, using C++ comments like that :P And non-context diff ;) That comment use was to highlight that those are not for commit, but there if people want to test. As for the diff, it seems git-external-diff isn't portable to non-Linux systems; I will post a separate email on that. Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] git diff script is not portable
Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Comments? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
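One portable approach is to sidestep diff -L entirely and generate the labeled context diff in the script itself, e.g. with Python's stdlib difflib. This is a hypothetical replacement sketch, not the script under discussion; git invokes a GIT_EXTERNAL_DIFF program with the arguments path, old-file, old-hex, old-mode, new-file, new-hex, new-mode:

```python
#!/usr/bin/env python
import sys
import difflib

def labeled_context_diff(old_file, new_file, label):
    """Produce a context diff carrying a/ and b/ labels without relying
    on GNU diff's non-portable -L option."""
    with open(old_file) as f:
        old = f.readlines()
    with open(new_file) as f:
        new = f.readlines()
    return "".join(difflib.context_diff(old, new,
                                        fromfile="a/" + label,
                                        tofile="b/" + label))

if __name__ == "__main__":
    # git passes: path old-file old-hex old-mode new-file new-hex new-mode
    path, old_file, new_file = sys.argv[1], sys.argv[2], sys.argv[5]
    sys.stdout.write(labeled_context_diff(old_file, new_file, path))
```

Since only the Python standard library is used, the same script runs unchanged on FreeBSD and anywhere else git and Python are available.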
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 17.11.2010 17:11, Tom Lane wrote: The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? Hmm ... that's a good point. If we simply treat unlogged tables the same as regular for checkpointing purposes, don't we end up having flushed them all correctly during a shutdown checkpoint? I was thinking that WAL-logging had some influence on that logic, but it doesn't. Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. I seriously do not want to have a conversation with someone about why their unlogged tables are exacerbating their checkpoint I/O spikes. I'd be happy to have two modes, though. We should probably revisit the syntax, though. One, it seems that CREATE UNLOGGED TABLE is not as clear as I thought it was. Two, when (not if) we add more durability levels, we don't want to create keywords for all of them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. That was another part of the discussion. Right now we report any FATAL, so it might be a password problem, or something else, and it seems doing all FATALs is the best idea because it will catch any other cases like this. Is FATAL, in general, enough to conclude the server is running? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git diff script is not portable
On Wed, Nov 17, 2010 at 19:54, Bruce Momjian br...@momjian.us wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Do it the git way - fork it and put it on your github page. Or do it the pg way - fork it and put it in src/tools. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] MergeAppend costing
See the attached test case. With that setup, this uses MergeAppend: explain select * from ma_parent order by id limit 10; But this one does not: explain select * from ma_parent order by name limit 10; ...which seems odd, because the index on ma_child1 and sorting the other two ought to still be better than appending all three and sorting the whole thing. If you drop ma_child2, you get MergeAppend again: begin; drop table ma_child2; explain select * from ma_parent order by name limit 10; rollback; ...which makes me wonder if our costing model is off? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company inh.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
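The attached inh.sql is not preserved in the archive, but the tradeoff Robert describes can be modeled outside the planner. A MergeAppend can stream the top-k rows by merging presorted child streams (stopping after k rows), while Append + Sort must materialize and sort every row first. A toy model in plain Python (not planner code; the child data is hypothetical):

```python
import heapq
from itertools import islice

def merge_append_topk(sorted_children, k):
    """The MergeAppend strategy: merge presorted child streams and
    stop after k rows, never touching the rest of the data."""
    return list(islice(heapq.merge(*sorted_children), k))

def append_sort_topk(children, k):
    """The Append + Sort strategy: concatenate all children, sort the
    whole thing, then take the first k rows."""
    rows = [r for child in children for r in child]
    return sorted(rows)[:k]

# Three presorted "child tables" standing in for ma_child1..3.
kids = [sorted(range(0, 30, 3)),
        sorted(range(1, 30, 3)),
        sorted(range(2, 30, 3))]

# Both strategies return the same answer; MergeAppend just does
# far less work when the LIMIT is small relative to the data.
assert merge_append_topk(kids, 10) == append_sort_topk(kids, 10)
```

Which plan actually costs less depends on whether each child can supply sorted output cheaply (e.g. from an index), which is exactly what the costing model has to weigh.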
Re: [HACKERS] git diff script is not portable
On Wednesday 17 November 2010 19:54:49 Bruce Momjian wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. If you have changes I am happy to change the script - but hosting it on a *.postgresql.org domain might be a good idea anyway. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. That was another part of the discussion. Right now we report any FATAL, so it might be a password problem, or something else, and it seems doing all FATALs is the best idea because it will catch any other cases like this. Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote: I think you're missing the point. If we switch databases, all cached relations and plans have to be flushed anyway. We're talking about what might NOT need to be flushed on switching databases. Oh sorry, yes, I missed that point. I will mention that your point about TCP connection establishment latency is real. TCP connection establishment adds several milliseconds of latency (for low latency connections -- obviously it's much much worse for long-haul connections) and then the congestion control slow start adds more if there's a significant amount of data to transfer. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander mag...@hagander.net writes: On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. I believe that the above argument is exactly backwards. What we want here is to check the result of postmaster.c's canAcceptConnections(), and there are only a finite number of error codes that can result from rejections there. If we get past that, there are a large number of possible failures, but all of them indicate that the postmaster is in principle willing to accept connections. Checking for password errors only is utterly wrong: any other type of auth failure would be the same for this purpose, as would no such database, no such user, too many connections, etc etc etc. What we actually want here, and don't have, is the fabled pg_ping protocol... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
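Tom's rule can be stated as a tiny classifier: only the rejection codes that canAcceptConnections() can produce mean "not ready"; any other response, including an auth failure, proves a postmaster is alive and willing in principle. A sketch of that logic (the SQLSTATE values are PostgreSQL's documented error codes; which exact set should count as "not ready" is precisely the open question in this thread):

```python
# SQLSTATEs a postmaster sends while NOT yet accepting sessions.
# 57P03 is "cannot_connect_now": starting up, shutting down, or in
# recovery -- the cases Magnus points out must not read as "running".
NOT_READY_STATES = {"57P03"}

def server_is_up(sqlstate):
    """Tom's argument: a clean connection (no error), or any error
    OTHER than a startup-phase rejection, means the server is up.
    Bad password (28P01), no such database (3D000), no such user --
    all of these prove a live postmaster answered us."""
    return sqlstate not in NOT_READY_STATES

assert server_is_up(None)          # connected and authenticated
assert server_is_up("28P01")       # invalid_password: server is up
assert server_is_up("3D000")       # invalid_catalog_name: server is up
assert not server_is_up("57P03")   # cannot_connect_now: still starting
```

Checking only for the password error, as first proposed, would misclassify every other post-startup failure as "not running".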
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. regards, tom lane I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Cheers, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 02:22 PM, Kenneth Marshall wrote: On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I also agree with Tom's sentiment above. To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. It's really pretty robust, unless you're doing crazy stuff. So that makes it all the more important that we can restart a server cleanly (say, to change a config setting) without losing the unlogged tables. If we don't allow that we'll make a laughing stock of ourselves. Honestly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 15:48:56 -0300 2010: On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote: Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. I haven't been; I'm not subscribed to general; it'd be useful to CC me next time. FWIW I've figured that being subscribed to the lists is good even if I have my mail client configured to hide these emails by default. It's a lot easier for searching stuff that someone else references. (I made the mistake of having it hide all pg-general email even though I was CC'ed, though, which is the trivial way to implement this. I don't recommend repeating this mistake.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Andrew Dunstan and...@dunslane.net writes: On 11/17/2010 02:22 PM, Kenneth Marshall wrote: I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I don't think you can fsync only in the shutdown checkpoint and assume your data is safe, if you didn't fsync a write a few moments earlier. Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
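Tom's two proposed variants can be summarized as a small behavior table. The mode names and semantics below come from his email; none of this is actual PostgreSQL code or syntax, just a toy model of the guarantees each mode could make:

```python
# Toy model of Tom's degraded-reliability table variants.
MODES = {
    #  mode:     (wal_logged, fsynced_at_checkpoints)
    "logged":    (True,  True),
    "unlogged":  (False, True),   # no xlog, still checkpoint/fsync
    "unsynced":  (False, False),  # no xlog, no fsync either
}

def survives_clean_restart(mode):
    # Per Tom: the shutdown checkpoint only makes the data durable if
    # the table's writes were also fsync'd at every earlier checkpoint;
    # you can't skip fsync all along and trust the final one.
    wal_logged, fsynced = MODES[mode]
    return fsynced

def survives_crash(mode):
    # Only WAL-logged tables survive a crash; both degraded modes get
    # truncated during crash recovery.
    wal_logged, fsynced = MODES[mode]
    return wal_logged

assert survives_clean_restart("unlogged") and not survives_crash("unlogged")
assert not survives_clean_restart("unsynced")
```

This is the "pick your poison" menu: the -general crowd's use case maps to "unlogged", while the memcached-replacement use case maps to "unsynced".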
Re: [HACKERS] unlogged tables
Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. We're assuming here that the checkpoint activity for the unlogged table causes significant load on a production system. Maybe we should do some testing before we try to make this overly complex? I wouldn't be surprised to find that on most filesystems the extra checkpointing of the unlogged tables adds only minor overhead. Shouldn't be hard to build out pgbench into something which will test this ... if only I had a suitable test machine available. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
[ forgot to comment on this part ] Andrew Dunstan and...@dunslane.net writes: To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. Well, if you mean the postmaster darn near never goes down, that's true, because we go out of our way to ensure it does as little as possible. But that has got zip to do with this discussion, because a backend crash has to be assumed to have corrupted unlogged tables. There are some folk over in -general who are wishfully thinking that only a postmaster crash would lose their unlogged data, but that's simply wrong. Backend crashes *will* truncate those tables; there is no way around that. The comment I made was that my experience as to how often backends crash might not square with production experience --- but you do have to draw the distinction between a backend crash and a postmaster crash. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. I believe that the above argument is exactly backwards. What we want here is to check the result of postmaster.c's canAcceptConnections(), and there are only a finite number of error codes that can result from rejections there. If we get past that, there are a large number of possible failures, but all of them indicate that the postmaster is in principle willing to accept connections. Checking for password errors only is utterly wrong: any other type of auth failure would be the same for this purpose, as would no such database, no such user, too many connections, etc etc etc. Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring out how to implement that with this fix, whether it is part of pg_ctl or a separate binary. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git diff script is not portable
Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:54, Bruce Momjian br...@momjian.us wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Do it the git way - fork it and put it on your github page. Or do it the pg way - fork it and put it in src/tools. src/tools is a very good idea. Objections? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 02:44 PM, Tom Lane wrote: [ forgot to comment on this part ] Andrew Dunstan and...@dunslane.net writes: To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. Well, if you mean the postmaster darn near never goes down, that's true, because we go out of our way to ensure it does as little as possible. But that has got zip to do with this discussion, because a backend crash has to be assumed to have corrupted unlogged tables. There are some folk over in -general who are wishfully thinking that only a postmaster crash would lose their unlogged data, but that's simply wrong. Backend crashes *will* truncate those tables; there is no way around that. The comment I made was that my experience as to how often backends crash might not square with production experience --- but you do have to draw the distinction between a backend crash and a postmaster crash. OK. I'd missed that. Thanks for clarifying. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan and...@dunslane.net wrote: The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. Could we slow down here a bit and talk through the ideas here in a logical fashion? The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. 
In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. Now, second use case. Consider someone who is currently running PostgreSQL in a non-durable configuration, with fsync=off, full_page_writes=off, and synchronous_commit=off. This person - who is based on someone I spoke with at PG West - is doing a large amount of data processing using PostGIS. Their typical workflow is to load a bunch of data, run a simulation, and then throw away the entire database. They don't want to pay the cost of durability because if they crash in mid-simulation they will simply rerun it. Being fast is more important. Whether or not this person will be happy with the proposed behavior is a bit harder to say. If it kills performance, they will definitely hate it. But if the performance penalty is only modest, they may enjoy the convenience of being able to shut down the database and start it up again later without losing data. Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. I also agree with Tom's sentiment above. 
To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. It's really pretty robust, unless you're doing crazy stuff. So that makes it all the more important that we can restart a server cleanly (say, to change a config setting) without losing the unlogged tables. If we don't allow that we'll make a laughing stock of ourselves. Honestly. Let's please not assume that there is only one reasonable option here, or that I have not thought about some of these issues. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
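The arithmetic behind Robert's memcached scenario is worth making explicit, since it is the crux of his first use case. A back-of-envelope check using his stated figures (60 GB of data, 25% dirtied per hour):

```python
# Robert's memcached-replacement example: data that previously never
# touched disk at all starts getting checkpointed.
data_gb = 60
dirty_fraction_per_hour = 0.25

checkpoint_gb_per_hour = data_gb * dirty_fraction_per_hour
assert checkpoint_gb_per_hour == 15.0  # the 15 GB/hour in the email

# Even spread perfectly evenly across the hour, that is a sustained
# write rate the memcached deployment never had to absorb:
mb_per_second = checkpoint_gb_per_hour * 1024 / 3600
assert round(mb_per_second, 1) == 4.3
```

In practice checkpoint writes are bursty rather than perfectly smoothed, so the instantaneous I/O spikes would be worse than the average suggests.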
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Bruce Momjian br...@momjian.us writes: Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. pg_controldata seems 100% unrelated to this problem. You cannot even tell if the postmaster is alive just by inspecting pg_control. What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring how to implement that with this fix, whether it is part of pg_ctl or a separate binary. Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
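The distinction Tom wants the ping to draw - connection refused versus any response from a live postmaster - can be sketched without libpq at all. The sketch below hand-builds a v3 protocol StartupMessage and classifies the outcome of a raw socket probe; it is an illustration of the classification, not the eventual libpq API, and the host/port/role names are placeholders:

```python
import socket
import struct

def startup_packet(user, database):
    """PostgreSQL v3 StartupMessage: int32 total length, int32
    protocol version 3.0 (196608), nul-terminated key/value pairs,
    then a terminating nul byte."""
    body = b""
    for key, val in (("user", user), ("database", database)):
        body += key.encode() + b"\0" + val.encode() + b"\0"
    body += b"\0"
    return struct.pack("!ii", len(body) + 8, 196608) + body

def pg_ping(host, port):
    """Refused connection => no postmaster at that address.  ANY bytes
    back -- even a FATAL ErrorResponse -- prove a postmaster answered,
    which is the signal pg_ctl actually needs."""
    try:
        with socket.create_connection((host, port), timeout=3) as s:
            s.sendall(startup_packet("ping", "postgres"))
            return "alive" if s.recv(1) else "down"
    except OSError:  # includes ConnectionRefusedError and timeouts
        return "down"

pkt = startup_packet("ping", "postgres")
assert struct.unpack("!i", pkt[:4])[0] == len(pkt)   # self-describing length
assert struct.unpack("!i", pkt[4:8])[0] == 196608    # protocol 3.0
```

A real implementation inside libpq would additionally parse the ErrorResponse's SQLSTATE to separate "starting up" from "up but rejecting this login", per Tom's canAcceptConnections() point upthread.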
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wednesday 17 November 2010 19:41:19, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. Yes, I completely missed testing with a non-default cache value. And it fails, of course: some values are generated a second time after a rollback We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. 
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
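Tom's proposed fix - stamp each SeqTable cache entry with the sequence's relfilenode and discard cached-but-unissued values whenever the stamp no longer matches - can be modeled in a few lines. This is a toy model of the invalidation rule only; the names are illustrative, not actual backend structures:

```python
# Toy model of relfilenode-keyed invalidation of a sequence's cache of
# preallocated-but-unissued values (Tom's SeqTable suggestion).
class SeqEntry:
    def __init__(self, relfilenode):
        self.relfilenode = relfilenode
        self.cached = []  # values fetched ahead under CACHE > 1

def nextval(entry, current_relfilenode, fetch_from_sequence):
    # If the sequence's relfilenode changed (TRUNCATE ... RESTART
    # IDENTITY gave it a new one, or a rollback reverted it), any
    # values cached under the old filenode belong to dead state.
    if entry.relfilenode != current_relfilenode:
        entry.cached = []
        entry.relfilenode = current_relfilenode
    if not entry.cached:
        entry.cached = fetch_from_sequence()  # refill from the sequence
    return entry.cached.pop(0)

e = SeqEntry(relfilenode=100)
e.cached = [4, 5, 6]                    # preallocated under filenode 100
assert nextval(e, 100, lambda: [7]) == 4   # same filenode: cache is valid
assert nextval(e, 101, lambda: [1, 2]) == 1  # filenode changed: cache dropped
```

This handles the duplicate-values-after-rollback failure Marc observed with a non-default CACHE setting; the separate currval question is a policy choice the model deliberately leaves open.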
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. pg_controldata seems 100% unrelated to this problem. You cannot even tell if the postmaster is alive just by inspecting pg_control. I was thinking of this: $ pg_controldata /u/pg/data ... Database cluster state: shut down What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring how to implement that with this fix, whether it is part of pg_ctl or a separate binary. Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. OK, so a new libpq function --- got it. Would we just pass the status from the backend or can it be done without backend modifications? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE recursion to typed tables
Peter Eisentraut pete...@gmx.net writes: Here is the patch that adds [RESTRICT|CASCADE] to ALTER TYPE ... ADD/ALTER/DROP/RENAME ATTRIBUTE, so that it recurses to typed tables. And here's my commitfest review of it: - patch applies cleanly - adds regression tests - passes them - is useful and needed, and something we don't already have - doesn't generate warnings (or I missed them) :) Code-wise, though, I wonder about the name of the recursing parameter of the renameatt_internal function in src/backend/commands/tablecmds.c, which seems to only get used to detect erroneous attempts at renaming the table column directly. Maybe it's only me not being used enough to PostgreSQL code yet, but here it distracts the code reader. Having another parameter called recurse is not helping either, but I don't see that this one needs to be changed. I'm not sure what a good name would be here; alter_type_cascade is an example that comes to mind, on the verbose side. As I think the issue is to be decided by a committer, I will go and mark this patch as ready for committer! Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:42 PM, Josh Berkus j...@agliodbs.com wrote: Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. We're assuming here that the checkpoint activity for the unlogged table causes significant load on a production system. Maybe we should do some testing before we try to make this overly complex? I wouldn't be surprised to find that on most filesystems the extra checkpointing of the unlogged tables adds only small minority overhead. Shouldn't be hard to build out pgbench into something which will test this ... if only I had a suitable test machine available. I guess the point I'd make here is that checkpoint I/O will be a problem for unlogged tables in exactly the same situations in which it is a problem for regular tables. There is some amount of I/O that your system can handle before the additional I/O caused by checkpoints starts to become a problem. If unlogged tables (or one particular variant of unlogged tables) don't need to participate in checkpoints, then you will be able to use unlogged tables, in situations where they are appropriate to the workload, to control your I/O load and hopefully keep it below the level where it causes a problem. Of course, there will also be workloads where your system has plenty of spare capacity (in which case it won't matter) or where your system is going to be overwhelmed anyway (in which case it doesn't really matter either). But if you are somewhere between those two extremes, this has to matter. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On 10-11-17 03:00 PM, Marc Cousin wrote: On Wednesday 17 November 2010 19:41:19, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions. I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. Yes, I completely missed testing with a non-default cache value. And it fails, of course: some values are generated a second time after a rollback. I will look at addressing this in an updated patch. We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. 
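Tom's suggested invalidation - remember which relfilenode the cached values came from, and discard them on the next nextval() when the sequence's file has changed (as it does when TRUNCATE ... RESTART IDENTITY rolls back) - might look roughly like the following sketch. The struct layout and names are illustrative, not the actual SeqTable code:

```c
typedef unsigned int Oid;

/* Hypothetical, simplified per-sequence cache entry */
typedef struct SeqTableEntry
{
    Oid  filenode;        /* relfilenode the cache was filled from */
    long cached[32];      /* unissued values fetched in one WAL-logged batch */
    int  ncached;         /* how many of them remain unissued */
} SeqTableEntry;

/*
 * Called at the top of a hypothetical nextval(); curnode is the sequence's
 * current relfilenode as seen in the catalogs.  If it differs from the one
 * the cache was filled from, the sequence file has been rewritten (e.g. by
 * a rolled-back TRUNCATE ... RESTART IDENTITY), so the cache is stale.
 */
static void
seq_check_cache(SeqTableEntry *entry, Oid curnode)
{
    if (entry->filenode != curnode)
    {
        entry->ncached = 0;        /* drop stale unissued values */
        entry->filenode = curnode;
    }
}
```

Note this deliberately leaves currval alone, matching the "leave it alone" position above.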
I agree applications probably shouldn't be using currval after a failed transaction, which is why having it return as if it hadn't been issued sounds like the more reasonable behaviour. If an application tries a currval following the rollback, then at least the application won't get a bogus value. It is better to return an error than to let the application continue on thinking it has a sequence value that will be (or already has been) assigned to some other session. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wednesday 17 November 2010 20:54:14 Robert Haas wrote: On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan and...@dunslane.net wrote: The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. Could we slow down here a bit and talk through the ideas in a logical fashion? The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. How can you get a buffer which was not written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. 
When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. To really support that use case we would first need to make shared_buffers properly scale to 64GB - which unfortunately, in my experience, is not yet the case. Also, see the issues in the previous paragraph - I have severe doubts you can support such a memcached scenario with pg. Either you spill to disk if your buffers overflow (fine with me) or you need to throw away data the way memcached does. I doubt there is a sensible implementation in pg for the latter. So you will have to write to disk at some point... Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. Maybe I am missing something - but why does this imply we have to write data at checkpoints? 
Just fsyncing every file belonging to a persistently-unlogged (or whatever sensible name anyone can come up with) table is not prohibitively expensive - in fact, doing that on a local $PGDATA with approx. 300GB and loads of tables takes less than 15s on a system with a hot inode/dentry cache and no dirty files. (just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files being a tiny C program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files and then fsyncing every one). The assumption of a hot inode cache is realistic I think. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 11/17/2010 02:22 PM, Kenneth Marshall wrote: I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I don't think you can fsync only in the shutdown checkpoint and assume your data is safe, if you didn't fsync a write a few moments earlier. Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). Permanent tables are WAL-logged, participate in checkpoints, and are fsync'd. They survive crashes and clean restarts, and are replicated. 2. UNLOGGED (what this patch currently implements). Unlogged tables are not WAL-logged, but they do participate in checkpoints and they are fsync'd on request. They survive clean restarts, but on a crash they are truncated. They are not replicated. 3. UNSYNCED (future work). Unsynced tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. After any sort of crash or shutdown, clean or otherwise, they are truncated. They are not replicated. 4. GLOBAL TEMPORARY (future work). Global temporary tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. The contents of each global temporary table are private to that session, so that they can use the local buffer manager rather than shared buffers. 
Multiple sessions can use a global temporary table at the same time, and each sees separate contents. At session exit, any contents inserted by the owning backend are lost; since all sessions exit on crash or shutdown, all contents are also lost at that time. 5. LOCAL TEMPORARY (our current temp tables). Local temporary tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. The table definition and all of its contents are private to the session, so that they are dropped at session exit (or at transaction end if ON COMMIT DROP is used). Since all sessions exit on crash or shutdown, all table definitions and all table contents are lost at that time. It's possible to imagine a few more stops on this hierarchy. For example, you could have an ASYNCHRONOUS table between (1) and (2) that always acts as if synchronous_commit=off, but is otherwise replicated and durable over crashes; or a MINIMALLY LOGGED table that is XLOG'd as if wal_level=minimal even when the actual value of wal_level is otherwise, and is therefore crash-safe but not replication-safe; or a level that is similar to unlogged but we XLOG the first event that dirties a page after each checkpoint, and therefore even on a crash we need only remove the tables for which such an XLOG record has been written. All of those are a bit speculative perhaps but we could jam them in there if there's demand, I suppose. I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sound a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. 
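The five levels above can be summarized as a small property table. This is just a restatement of the list for easy comparison; the struct and field names are illustrative, not anything in the backend:

```c
#include <stdbool.h>

/* One row per proposed persistence level (illustrative only) */
typedef struct PersistenceLevel
{
    const char *name;
    bool wal_logged;       /* changes written to WAL */
    bool checkpointed;     /* buffers written/fsync'd at checkpoints */
    bool survives_restart; /* contents kept across a clean shutdown */
    bool survives_crash;   /* contents kept after a crash */
    bool replicated;
} PersistenceLevel;

static const PersistenceLevel levels[] = {
    /* name                wal    ckpt   restart crash  repl */
    {"PERMANENT",          true,  true,  true,   true,  true},
    {"UNLOGGED",           false, true,  true,   false, false},
    {"UNSYNCED",           false, false, false,  false, false},
    {"GLOBAL TEMPORARY",   false, false, false,  false, false},
    {"LOCAL TEMPORARY",    false, false, false,  false, false},
};
```

The last three rows differ not in these durability flags but in visibility: UNSYNCED contents are shared, while the two TEMPORARY flavors are per-session (and LOCAL TEMPORARY additionally makes the table definition itself session-private).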
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 03:37 PM, Robert Haas wrote: I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are a reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sounds a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund and...@anarazel.de wrote: The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. How can you get a buffer which was not written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... Oh, you're right. We always have to write buffers before kicking them out of shared_buffers, but if we don't fsync them we have no guarantee they're actually on disk. For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. 
We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. To really support that use case we would first need to make shared_buffers properly scale to 64GB - which unfortunately, in my experience, is not yet the case. Well, that's something to aspire to. :-) Also, see the issues in the former paragraph - I have severe doubts you can support such a memcached scenario with pg. Either you spill to disk if your buffers overflow (fine with me) or you need to throw away data the way memcached does. I doubt there is a sensible implementation in pg for the latter. So you will have to write to disk at some point... I agree that there are difficulties, but again, doing checkpoint I/O for data that the user was willing to throw away is going in the wrong direction. Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. Maybe I am missing something - but why does this imply we have to write data at checkpoints? Just fsyncing every file belonging to a persistently-unlogged (or whatever sensible name anyone can come up with) table is not prohibitively expensive - in fact, doing that on a local $PGDATA with approx. 300GB and loads of tables takes less than 15s on a system with a hot inode/dentry cache and no dirty files. 
(just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files being a tiny C program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files and then fsyncing every one). The assumption of a hot inode cache is realistic I think. Hmm. I don't really want to try to do it in this patch because it's complicated enough already, but if people don't mind the shutdown sequence potentially being slowed down a bit, that might allow us to have the best of both worlds without needing to invent multiple durability levels. I was sort of assuming that people wouldn't want to slow down the shutdown sequence to avoid losing data they've already declared isn't that valuable, but evidently I underestimated the demand for kinda-durable tables. If the overhead of doing this isn't too severe, it might be the way to go. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 03:48:52PM -0500, Andrew Dunstan wrote: On 11/17/2010 03:37 PM, Robert Haas wrote: I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are a reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sounds a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. +1 for describing the end-user-visible behavior. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 12:48 PM, Andrew Dunstan wrote: Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. UNSAFE and EXTREMELY_UNSAFE?? :) Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Nov 17, 2010, at 4:00 PM, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear Unless one thinks that the types could be combined- perhaps a table declaration could use both UNLOGGED and UNSYNCED? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Andrew's suggestion of VOLATILE is pretty good. It's hard to come up with multiple words that express gradations of "we might decide to chuck your data if things go South", though. Then again if we go with Andres's suggestion maybe we can get by with one level. Or if we still end up with multiple levels, maybe it's best to use VOLATILE for everything between 1 and 4, and then have a subordinate clause to specify gradations. CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 04:00 PM, Kevin Grittner wrote: Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Robert's right. These names won't convey much to someone not steeped in our technology. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 17:51:37 -0300 2010: On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund and...@anarazel.de wrote: How can you get a buffer which was no written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... Oh, you're right. We always have to write buffers before kicking them out of shared_buffers, but if we don't fsync them we have no guarantee they're actually on disk. You could just open all the segments and fsync them. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 04:05:56PM -0500, Robert Haas wrote: On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Andrew's suggestion of VOLATILE is pretty good. It's hard to come up with multiple words that express gradations of "we might decide to chuck your data if things go South", though. Then again if we go with Andres's suggestion maybe we can get by with one level. Or if we still end up with multiple levels, maybe it's best to use VOLATILE for everything between 1 and 4, and then have a subordinate clause to specify gradations. CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; How about something like: OPTIONS (SYNC=no, LOG=no, ... ) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 18:05:56 -0300 2010: CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; What about some reloptions? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 11:44 AM, Tom Lane wrote: ...because a backend crash has to be assumed to have corrupted unlogged tables... So in a typical use-case, say storing session data on a web-site, one crashed backend could wreck sessions for some or all of the site? Is there a mechanism in the proposal that would allow a client to determine the state of a table (good, truncated, wrecked, etc.)? Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Steve Singer ssin...@ca.afilias.info writes: I will look at addressing this in an updated patch. I've nearly finished revising the patch already, don't worry about it. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. I agree applications probably shouldn't be using currval after a failed transaction Well, people can do that now, and it doesn't throw an error. I'm inclined to agree with Marc that just leaving it alone (ie, it returns the last value produced, whether the transaction rolls back or not) is the best thing. There's inherently going to be some inconsistency here, since there's no such thing as a transactional sequence change otherwise. I don't see the point of going way out of our way to move the inconsistencies around. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. OK, so a new libpq function --- got it. Would we just pass the status from the backend or can it be done without backend modifications? It would definitely be better to do it without backend mods, so that the functionality would work against back-branch postmasters. To my mind, the entire purpose of such a function is to classify the possible errors so that the caller doesn't have to. So I wouldn't consider that it ought to pass back the status from the backend. I think what we basically want is a function that takes a conninfo string (or one of the variants of that) and returns an enum defined more or less like this: * failed to connect to postmaster * connected, but postmaster is not accepting sessions * postmaster is up and accepting sessions I'm not sure those are exactly the categories we want, but something close to that. In particular, I don't know if there's any value in subdividing the not accepting sessions status --- pg_ctl doesn't really care, but other use-cases might want to tell the difference between the various canAcceptConnections failure states. BTW, it is annoying that we can't definitively distinguish postmaster is not running from a connectivity problem, but I can't see a way around that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
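The classification Tom proposes could be sketched as follows. In a real libpq function the raw outcome would come from the connection attempt itself; here the attempt result is modeled as an enum so that only the mapping is shown, and all names (PingStatus, classify_ping, and the ConnOutcome values) are hypothetical, not existing libpq API:

```c
/* the three results the hypothetical pg_ping function would report */
typedef enum PingStatus
{
    PING_NO_RESPONSE,  /* failed to connect to the postmaster at all */
    PING_REJECTED,     /* postmaster answered, but is not accepting sessions */
    PING_OK            /* postmaster is up and accepting sessions */
} PingStatus;

/* modeled raw outcomes of the underlying connection attempt */
typedef enum ConnOutcome
{
    CONN_REFUSED,      /* connect() failed: down, or a connectivity problem */
    CONN_SERVER_ERROR, /* server sent an error, e.g. still starting up */
    CONN_ACCEPTED      /* server was willing to proceed with a session */
} ConnOutcome;

/* collapse the raw outcome into the coarse status the caller wants */
static PingStatus
classify_ping(ConnOutcome outcome)
{
    switch (outcome)
    {
        case CONN_REFUSED:
            return PING_NO_RESPONSE;
        case CONN_SERVER_ERROR:
            return PING_REJECTED;
        default:
            return PING_OK;
    }
}
```

As Tom notes, the CONN_REFUSED case inherently conflates "postmaster not running" with a network problem, and a finer-grained version might split CONN_SERVER_ERROR by the different canAcceptConnections failure states.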
Re: [HACKERS] We need to log aborted autovacuums
Josh Berkus j...@agliodbs.com writes: I've been trying to diagnose in a production database why certain tables never get autovacuumed despite having a substantial % of updates. The obvious reason is locks blocking autovacuum from vacuuming the table ... but the trick is we don't log such blocking behavior, at all. This means that there is no possible way for a user to figure out *why* his tables aren't getting autovacuumed. And yes, this is a common problem. A quick survey on IRC found 3 active users on channel (out of 20 or so) who'd encountered it. The current case I'm looking at is a table with over 70% bloat which hasn't been autovacuumed since the database was upgraded a month ago. What I'd like to do is add some logging code to autovacuum.c so that if log_autovacuum is any value other than -1, failure to vacuum due to locks gets logged. Does this make sense? It's hard to tell, because you're just handwaving about what it is you think isn't being logged; nor is it clear whether you have any evidence that locks are the problem. Offhand I'd think it at least as likely that autovacuum thinks it doesn't need to do anything, perhaps because of a statistics issue. There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MergeAppend costing
Robert Haas robertmh...@gmail.com writes: See the attached test case. With that setup, this uses MergeAppend: explain select * from ma_parent order by id limit 10; But this one does not: explain select * from ma_parent order by name limit 10; ...which seems odd, because the index on ma_child1 and sorting the other two ought to still be better than appending all three and sorting the whole thing. Not really; what you're not accounting for is that the top-level sort is a lot cheaper than a full sort of the large child relation would be, because it gets hacked to do a top-N sort instead of a full sort. What this example suggests is that we should consider ways to pass down the top-N-ness to sorts executed as part of a MergeAppend tree. That seems a tad messy though, both in the executor and the planner. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
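For readers unfamiliar with why a top-N sort is so much cheaper than a full sort: it only ever keeps the N best rows seen so far, so memory stays bounded and most rows are rejected after a single comparison. A minimal sketch of the idea using a bounded sorted array (the executor actually uses a heap; this is illustrative, not the tuplesort code):

```c
#include <stddef.h>

/*
 * Maintain the n smallest values seen so far in best[0..*count-1],
 * kept sorted ascending.  With n fixed (the LIMIT), each incoming
 * value costs O(n) at worst and O(1) when it is too large to keep.
 */
static void
topn_add(int *best, size_t *count, size_t n, int value)
{
    size_t i;

    if (*count == n && value >= best[n - 1])
        return;                 /* worse than everything kept: reject */

    if (*count < n)
        (*count)++;             /* room left: grow the kept set */

    /* shift larger kept values right, then insert in order */
    for (i = *count - 1; i > 0 && best[i - 1] > value; i--)
        best[i] = best[i - 1];
    best[i] = value;
}
```

Passing the LIMIT down to each sort under a MergeAppend would let every child keep only N rows instead of sorting its whole input, which is the optimization Tom describes.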
Re: [HACKERS] We need to log aborted autovacuums
> It's hard to tell, because you're just handwaving about what it is you think isn't being logged; nor is it clear whether you have any evidence that locks are the problem. Offhand I'd think it at least as likely that autovacuum thinks it doesn't need to do anything, perhaps because of a statistics issue. There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible.

Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3. On a busy production server, using any of the DEBUG levels is pretty much out of the question ... they can produce up to 1GB/minute in output.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] We need to log aborted autovacuums
Josh Berkus j...@agliodbs.com writes:

>> There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible.
>
> Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3. On a busy production server, using any of the DEBUG levels is pretty much out of the question ... they can produce up to 1GB/minute in output.

Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?

regards, tom lane
Re: [HACKERS] We need to log aborted autovacuums
On Thu, Nov 18, 2010 at 08:35, Tom Lane t...@sss.pgh.pa.us wrote:

>> Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3.
>
> Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?

How about adding a special autovacuum role, and running autovacuum as that role instead of as the database owner? If we had such a role for the autovacuum processes, we could set log_min_messages to DEBUG3 for only that role via per-user configuration.

--
Itagaki Takahiro
Re: [HACKERS] We need to log aborted autovacuums
Itagaki Takahiro itagaki.takah...@gmail.com writes:

> On Thu, Nov 18, 2010 at 08:35, Tom Lane t...@sss.pgh.pa.us wrote:
>> Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?
>
> How about adding a special autovacuum role, and running autovacuum as that role instead of as the database owner? If we had such a role for the autovacuum processes, we could set log_min_messages to DEBUG3 for only that role via per-user configuration.

That seems like a major kluge ... and anyway it doesn't fix the problem, because DEBUG3 is still going to result in a lot of unwanted log output, even if it's confined to autovacuum.

regards, tom lane