Re: [HACKERS] query cancel issues in contrib/dblink
Itagaki Takahiro wrote: Stephen Frost sfr...@snowman.net wrote: Can you provide an update on this patch? Joe, you were going to review and possibly commit it. Itagaki, did you have a new version? Are there any outstanding issues? Thanks for your reviewing. Here is an updated version. I fixed some bugs: * Fix connection leak on cancel. In the previous patch, running transactions are canceled, but the temporary connection was leaked. * Discard all pending results on cancel handler. I implemented PQexec-compatible behavior with async APIs. Thanks for the update. I am planning to review, but my time will be extremely limited for the next two weeks. I might find some time this coming weekend, and will do what I can, but after that it will be 9/28 (after my son's wedding on 9/27) before I get time to look closely. Joe signature.asc Description: OpenPGP digital signature
Re: [HACKERS] dblink doesn't transfar non-error meesages
Itagaki Takahiro wrote: Hi, contrib/dblink doesn't transfar any non-error messages. Is it by design or to be a ToDo item? We can use PQsetNoticeReceiver() here. I never thought about or needed, and no one else ever asked...patch welcomed. Joe signature.asc Description: OpenPGP digital signature
Re: [HACKERS] errcontext support in PL/Perl
On Tue, 2009-07-21 at 20:54 +0300, Alexey Klyukin wrote: Attached is the updated version of the patch (the original description is here: http://archives.postgresql.org/pgsql-hackers/2009-07/msg01332.php) that doesn't use global variables. It also shows the last line of the context in the example above correctly: committed -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logging configuration changes [REVIEW]
On Wed, Sep 16, 2009 at 12:24 AM, Abhijit Menon-Sen a...@toroid.org wrote: LOG: received SIGHUP, reloading configuration files LOG: parameter log_connections reset to default value off LOG: parameter log_disconnections reset to default value off LOG: Parameter max_connections cannot be changed without restarting the server LOG: parameter log_checkpoints changed to on ok, maybe this is not the most brilliant observation but someone has to say it... keep the same case in the word parameter ;) -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logging configuration changes [REVIEW]
At 2009-09-16 01:18:10 -0500, jcasa...@systemguards.com.ec wrote: ok, maybe this is not the most brilliant observation but someone has to say it... keep the same case in the word parameter ;) Oops, thanks. Re²vised patch attached. -- ams diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l index 9e9c3f7..a290601 100644 --- a/src/backend/utils/misc/guc-file.l +++ b/src/backend/utils/misc/guc-file.l @@ -260,9 +260,8 @@ ProcessConfigFile(GucContext context) { ereport(elevel, (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM), - errmsg(attempted change of parameter \%s\ ignored, - gconf-name), - errdetail(This parameter cannot be changed after server start.))); + errmsg(parameter \%s\ cannot be changed without restarting the server, + gconf-name))); continue; } diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 2224d56..8fa9599 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -4599,18 +4599,16 @@ set_config_option(const char *name, const char *value, if (changeVal !is_newvalue_equal(record, value)) ereport(elevel, (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM), - errmsg(attempted change of parameter \%s\ ignored, - name), - errdetail(This parameter cannot be changed after server start.))); + errmsg(parameter \%s\ cannot be changed without restarting the server, + name))); return true; } if (context != PGC_POSTMASTER) { ereport(elevel, (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM), - errmsg(attempted change of parameter \%s\ ignored, - name), - errdetail(This parameter cannot be changed after server start.))); + errmsg(parameter \%s\ cannot be changed without restarting the server, + name))); return false; } break; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remove flatfiles.c
That's what I want to believe. But picture if you have, say a 1-terabyte table which is 50% dead tuples and you don't have a spare 1-terabytes to rewrite the whole table. But trying to VACUUM FULL that table is going to be horridly painful too, and you'll still have bloated indexes afterwards. You might as well just live with the 50% waste, especially since if you did a full-table update once you'll probably do it again sometime. I'm having a hard time believing that VACUUM FULL really has any interesting use-case anymore. This was almost exactly the scenario I faced recently. A production database unexpectedly filled up its partition. On investigation, we found a developer had added a component to the application that updated every row in one table each day, exhausting the free space map. Over time, most of the tables in the system had grown to contain 50-70% dead tuples. The owner of the system was understandably reluctant to dump and restore the system, and there wasn't enough space left on the system to rewrite any of the large tables. In the end, I dropped a table (the one owned by the offending developer... 8-), and this gave me just enough space to VACUUM FULL one table at a time. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
At 2009-07-30 13:37:16 -0700, prent...@cisco.com wrote: This patch changes plpgsql IN parameters so they are mutable. Makes sense, applies fine, works fine. -- ams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Tue, 2009-09-15 at 12:37 -0400, Robert Haas wrote: Instead of calling these generalized index constraints, I wonder if we oughtn't to be calling them something like don't-overlap constraints (that's a bad name, but something along those lines). They're not really general at all, except compared to uniqueness constraints (and they aren't called generalized unique-index constraints, just generalized index constraints). What they should be called is generalized unique constraints, without reference to index. Because what they generalize is the operator by which uniqueness is determined. Don't all of these have the same effect at the end? CREATE TABLE data (a int UNIQUE); CREATE TABLE data (a int); CREATE UNIQUE INDEX data_a_idx ON data USING btree (a); CREATE TABLE data (a int); CREATE INDEX data_a_idx ON data USING btree (a); ALTER TABLE data ADD CONSTRAINT a_unique (a =) USING INDEX data_a_idx; Which brings me to two concerns about the syntax. First, we have so far been fairly consistent to document that unique indexes are an implementation detail of unique constraints and should usually not be used directly. This new approach basically reverses that and forces you to define your constraints by means of implementation details rather than a logical description. There is nothing in this feature that makes it strikingly different from the existing constraint types in a way that would prevent a reasonable syntax for defining the constraint at table definition time. Another problem this would lead to is that a say dump of a table definition wouldn't actually contain all the constraints that apply to the table anymore, because there might be additional stuff such as this that can't be expressed that way. If you look at the example from the documentation, CREATE TABLE circles(c circle); CREATE INDEX circles_idx ON circles USING gist (c); ALTER TABLE circles ADD CONSTRAINT circles_idx_constr (c ) USING INDEX circles_idx; the only variable pieces of information that need to be provided to the table are the index type and the operator. So why not just write it like this CREATE TABLE circles (c circle UNIQUE ON gist ); and let that create the required index. And then traditional unique constraints would fit into this as well: CREATE TABLE data (a int UNIQUE ON btree =); The other problem I see with the syntax is that ALTER TABLE circles ADD CONSTRAINT circles_idx_constr (c ) USING INDEX circles_idx; doesn't seem very intuitive about what is actually being constrained. For a while I was thinking that it was constraining the table to values that are in the index or something. So using a word such as UNIQUE would help explain what is going on. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Tue, 2009-09-15 at 12:22 -0700, Jeff Davis wrote: I don't like using the word unique in the description, I think it only adds to the confusion. It would emphasize that a unique constraint is a common special case of the feature. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby 0.2.1
--On 15. September 2009 22:41:59 +0100 Simon Riggs si...@2ndquadrant.com wrote: http://wiki.postgresql.org/images/0/01/Hot_Standby_Recovery_Functions.pdf This doesn't work for me, it seems the correct link is http://wiki.postgresql.org/images/1/10/Hot_Standby_Recovery_Functions.pdf ? -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bulk Inserts
OK, that makes sense. I thought you had hacked either XLogInsert or the heap WAL replay code so that you could just accumulate tuples in the rdata chain and then submit them all under the cover of a single WALInsertLock. Ah, no, I did not do that. This would be difficult to do : rdata chain contains buffer pointers, and when we are in XLogInsert, we have an exclusive lock on the buffer. If XLogInsert accumulated xlog records as you say, then later, when it's time to write them to the xlog, it would no longer hold exclusive lock on the buffer, so its contents could have changed, and if XLogInsert decides to do a full page write, the contents will be wrong. Besides, the LSN needs to be set in the page at every call to heap_insert (or else WAL will not be Ahead), so if XLogInsert accumulated stuff before writing it, it would need a mechanism to assign a LSN without having written the xlog data yet. If you haven't done that, then of course doing the bulk insert doesn't help much if you still do tuple-by-tuple XLogInsert. Exactly. So in the case that it is under the limit, you first run through the tuples putting them into the block, then run through the tuples again doing the XLogInserts? Yes, exactly. This isn't really optimal... I wonder if I could build one rdata chain containing all updates to the tuples and submit it in one go. Would it work ?... Do you have an IO bottleneck even in the absence of fsyncs? Sometimes, yes : - When xlog is on a (rather slow) software RAID1, I've found that the fsyncs which happen when switching to a new xlog segment are significant, and the amount of log data written is dangerously close to the max throughput, too. - When xlog is on a much faster RAID5, there is no such problem. fsync on raid5 is horrendously slow if you have many pending small writes, but if all you need to sync is a 16MB file which nicely aligns on raid stripes, it's not a problem. So in order to benchmark the right thing, I have : - all the tables in a big RAMDISK - xlog on RAID5 - fsync=fdatasync I've also found that setting wal_buffers to a large value like 128MB gives a significant speed boost when doing COPY or INSERT INTO SELECT, probably because it allows the backends to always find space in the buffers even if the walwriter is a bit busy. My experience on multi-core machines with decent IO systems has been that the amount of WAL traffic (by volume) matters rather little, as opposed to the number WALInsertLocks taken, which matter quite a bit. Of course this depends quite a bit on your OS and hardware. Exactly : with the configuration above, iowait is extremely small (1%) , yet all processes still spend 50% of their time waiting on WALInsertLock. The lock is held for about 15% of the total time ; with 4 cores and 4 threads, if a core spends X time holding the lock, it's logical that the others spend 3*X time waiting on it. But consider this : - 4 processes spend 50% of their time waiting on the lock - therefore at any time there are average 2 processes waiting - therefore every time the lock is Released, a process is woken up = more than 200.000 context switches per second seen in vmstat Actually it does 1 context switch every two inserted rows, which is enormous... I've done a bit of profiling and found that 70% of the time spent holding this lock is... computing the header's CRC. Just for a quick check, I removed all CRC calculations. There was absolutely no performance gain. Another angle of attack would be to make wal-writing more efficient... If you mean to do this without changing the xlog interfaces, I'm not optimistic. Agree : that's why I didn't even try ;) If you have to call XLogInsert once per row that is copied (or insert...select), then my experiments show that simply taking the WALInsertLock and immediately releasing it, doing absolutely no real work while it is held, is already a substanial multi-core scalibility bottleneck. Confirmed by context switch issue above... Having all cores block on the same lock for every row can be OK if it's a spinlock protecting just a few lines of code... which is not the present case... Once we accept that this must be done, the next existing bottleneck is the memcpy of the first byte from the rdata chain into the shared wal_buffers, presumably because this copy involves fighting the cache line away from other cores. Once you've copied the first byte, the rest of them seem to be almost free. (Again, this is probably hardware and situation dependent). Since I have one 4 core CPU I probably don't see this effect (a multi-cpu case would). I've seen some suggestions that the wal_buffer block initation work be moved from being done by AdvanceXLInsert to instead be done by XLogWrite. However, I've not seen any indication that AdvanceXLInsert is a meaningful bottlneck in the first place. Except when wal_buffers is too small:
Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09
Hi, On Wed, Sep 16, 2009 at 11:37 AM, Fujii Masao masao.fu...@gmail.com wrote: I was thinking that the automatic reconnection capability is the TODO item for the later CF. The infrastructure for it has already been introduced in the current patch. Please see the macro MAX_WALRCV_RETRIES (backend/ postmaster/walreceiver.c). This is the maximum number of times to retry walreceiver. In the current version, this is the fixed value, but we can make this user-configurable (parameter of recovery.conf is suitable, I think). Also a parameter like retries_interval might be necessary. This parameter indicates the interval between each reconnection attempt. Do you think that these parameters should be introduced right now? or the later CF? I updated the TODO list on the wiki, and marked the items that I'm going to develop for the later CommitFest. http://wiki.postgresql.org/wiki/Streaming_Replication#Todo_and_Claim Do you have any other TODO item? How much is that priority? And, is there already-listed TODO item which should be developed right now (CommitFest 2009-09)? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rough draft: easier translation of psql help
On Sun, 2009-09-13 at 23:46 -0400, Alvaro Herrera wrote: Should create_help.pl be run on make dist? It is. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Abhijit Menon-Sen wrote: At 2009-07-30 13:37:16 -0700, prent...@cisco.com wrote: This patch changes plpgsql IN parameters so they are mutable. Makes sense, applies fine, works fine. How does this compare with PLSQL? I know in Ada an IN argument is in effect a constant. I understand the utility, because I occasionally knock against this restriction, but if it's incompatible with PLSQL I think we should think about it more carefully. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
At 2009-09-16 08:37:40 -0400, and...@dunslane.net wrote: How does this compare with PLSQL? I don't remember anything of PL/SQL myself, but Pavel Stehule had this to say in response to the original post: This behave is in conflict with PL/SQL, what should do some problems. I thing, so I understand well, why this behave is in PL/SQL. It hasn't sense in plpgsql, because OUT and INOUT params has little bit different syntax (calling) and nobody will do similar bugs (perhaps). What is interesting - this behave is in conformity with SQL/PSM, where parameters are mutable too. I am for it. PL/pgSQL doesn't promise compatibility with PL/SQL and this change should to help some beginners (and this limit is artificial and unnecessary). Given the existing OUT/INOUT syntax difference as noted, I don't think the patch represents a significant problem. -- ams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_hba.conf: samehost and samenet
Stef Walter wrote: [Looks like my response to this never made it to the mailing list, sending again...] Your original message did not carry a Message-Id header, and neither does this one (at least the copy I got). Are you doing something weird to the message? This worries me, because we intend for message-ids to become our primary keys into the archives, and we rely very heavily on them (for example on the commitfest webapp at http://commitfest.postgresql.org). Magnus Hagander wrote: 2009/8/25 Alvaro Herrera alvhe...@commandprompt.com: Something to keep in mind -- my getifaddrs(3) manpage says that on BSD it can return addresses that have ifa_addr set to NULL, which your code doesn't seem to check. Thanks for catching that. I've added a check, and attached a new patch. [...] -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 8:37 AM, Andrew Dunstan and...@dunslane.net wrote: Abhijit Menon-Sen wrote: At 2009-07-30 13:37:16 -0700, prent...@cisco.com wrote: This patch changes plpgsql IN parameters so they are mutable. Makes sense, applies fine, works fine. How does this compare with PLSQL? I know in Ada an IN argument is in effect a constant. I understand the utility, because I occasionally knock against this restriction, but if it's incompatible with PLSQL I think we should think about it more carefully. At worst it's an upward-compatible extension, or am I wrong? If it's useful, which I think it is, what's the harm? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Abhijit Menon-Sen wrote: At 2009-09-16 08:37:40 -0400, and...@dunslane.net wrote: How does this compare with PLSQL? I don't remember anything of PL/SQL myself, but Pavel Stehule had this to say in response to the original post: This behave is in conflict with PL/SQL, what should do some problems. I thing, so I understand well, why this behave is in PL/SQL. It hasn't sense in plpgsql, because OUT and INOUT params has little bit different syntax (calling) and nobody will do similar bugs (perhaps). What is interesting - this behave is in conformity with SQL/PSM, where parameters are mutable too. I am for it. PL/pgSQL doesn't promise compatibility with PL/SQL and this change should to help some beginners (and this limit is artificial and unnecessary). Given the existing OUT/INOUT syntax difference as noted, I don't think the patch represents a significant problem. I'm not terribly impressed by either of Pavel's arguments. SQL/PSM is irrelevant, and the existence of one inconsistency doesn't seems to me to be a good rationale to create another. If there were a major increase in utility I would be more willing, but at best this overcomes a minor inconvenience, that is easily worked around. It probably won't cause any problem with code being migrated from PLSQL, but it will affect code going the other way. The question is: do we care about that? I'm prepared to be persuaded that we shouldn't care, but I'm not quite there yet. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Tue, Sep 15, 2009 at 10:28:28AM -0700, Jeff Davis wrote: On Tue, 2009-09-15 at 13:16 -0400, Robert Haas wrote: Uhh so what happens if I create an index constraint using the +(integer, integer) operator? You can use any operator that has an index search strategy. Overlaps is probably the most useful, but you could imagine other operators, like a bi-directional containment operator (either LHS is contained in RHS, or vice-versa). You can also get creative and have a similarity operator that determines whether two tuples are too similar. As long as it is symmetric, the feature will work. One question: does the operator have to be reflexive? I.e. A op A holds for all A? I am thinking proximity or as you state above similarity. May be this is a good metaphor, leading to a good name. Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFKsOPyBcgs9XrR2kYRAhsUAJkBICYUMK0tDrycPbctiGF7YKI/9gCeLQzq DPyAAkMgZJFn8BZ7P8119/g= =lVz1 -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] errcontext support in PL/Perl
Selena Deckelmann escribió: From the patch review group this evening, courtesy of Webb Sprague and Brent Dombrowski. I'm replying on their behalf so that this response is in the correct thread. Looks like Peter also reviewed, but I'm forwarding this on anyway. (Note: Problem running regressions because make check doesn't descend automatically into the plperl -- we built with --with-perl successfully, but it doesn't change the behavior of make check. However, the tests do work with patch.) You have to use make -C src/pl installcheck. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Andrew Dunstan wrote: It probably won't cause any problem with code being migrated from PLSQL, but it will affect code going the other way. The question is: do we care about that? I'm prepared to be persuaded that we shouldn't care, but I'm not quite there yet. Anybody trying to port code from PL/pgSQL to PL/SQL is going to be a lot more inconvenienced by the OUT parameter stuff. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Andrew Dunstan and...@dunslane.net writes: It probably won't cause any problem with code being migrated from PLSQL, but it will affect code going the other way. The question is: do we care about that? I'm prepared to be persuaded that we shouldn't care, but I'm not quite there yet. IIRC the original complaint was from someone trying to migrate code from T/SQL or some other not-quite-PLSQL language. Like you, I'm on the fence about whether to accept this patch, but it does have some in-migration rationale. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
I'm not terribly impressed by either of Pavel's arguments. SQL/PSM is irrelevant, and the existence of one inconsistency doesn't seems to me to be a good rationale to create another. If there were a major increase in utility I would be more willing, but at best this overcomes a minor inconvenience, that is easily worked around. It probably won't cause any problem with code being migrated from PLSQL, but it will affect code going the other way. The question is: do we care about that? I'm prepared to be persuaded that we shouldn't care, but I'm not quite there yet. In this case I have not strong opinion. Similarity with SQL/PSM isn't my main argument. I see, so immutable IN arguments are typical problem for beginners. Internally arguments are not immutable - so mutable arguments should help to people who start with PostgreSQL. But I accept, so this increase difference between plpgsql and pl/sql what is wrong too. Regards Pavel cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 6:03 AM, Andrew Dunstan wrote: Abhijit Menon-Sen wrote: At 2009-09-16 08:37:40 -0400, and...@dunslane.net wrote: How does this compare with PLSQL? I don't remember anything of PL/SQL myself, but Pavel Stehule had this to say in response to the original post: This behave is in conflict with PL/SQL, what should do some problems. I thing, so I understand well, why this behave is in PL/SQL. It hasn't sense in plpgsql, because OUT and INOUT params has little bit different syntax (calling) and nobody will do similar bugs (perhaps). What is interesting - this behave is in conformity with SQL/PSM, where parameters are mutable too. I am for it. PL/pgSQL doesn't promise compatibility with PL/SQL and this change should to help some beginners (and this limit is artificial and unnecessary). Given the existing OUT/INOUT syntax difference as noted, I don't think the patch represents a significant problem. I'm not terribly impressed by either of Pavel's arguments. SQL/PSM is irrelevant, and the existence of one inconsistency doesn't seems to me to be a good rationale to create another. If there were a major increase in utility I would be more willing, but at best this overcomes a minor inconvenience, that is easily worked around. It probably won't cause any problem with code being migrated from PLSQL, but it will affect code going the other way. The question is: do we care about that? I'm prepared to be persuaded that we shouldn't care, but I'm not quite there yet. My motivation for submitting the patch was that it makes porting a huge collection of Informix SPL stored procedures easier. There are so many differences between plpgsql and SPL that you would think this wasn't that big of a deal, however, most of the other issues are easily taken care of with a simple sed script or something slightly more advanced (e.g. dealing with the declare/define block differences). This is one of the few compatibility issues where you really need to review and change lots of code by hand. The patch doesn't break existing code and doesn't make it any harder to port code from PL/SQL and on the flip side, this patch with the named/mixed notation patch from Pavel makes porting from Informix's SPL much easier. Thanks for everyone's consideration. -Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Tue, Sep 15, 2009 at 7:02 PM, Joshua Tolley eggyk...@gmail.com wrote: On Tue, Sep 15, 2009 at 05:52:35PM -0400, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: On Tue, 2009-09-15 at 14:42 -0700, Jeff Davis wrote: operator constraints operator exclusion constraints operator conflict constraints conflict operator constraints operator index constraints index constraints generalized index constraints something else? Just to add a couple more permutations of Robert Haas's suggestions: exclusion operator constraints exclusive operator constraints To my ear, operator exclusion constraints or exclusive operator constraints seem reasonable; the other permutations of that phrase simply aren't good English. I was having a hard time coming up with a name that was adequately short-and-sweet, and still conveyed the idea of both operator and index, which seems important so as to designate between these and the constraints we've had all along. Perhaps indexed operator constraints? Really I suppose they are indexed operator exclusion constraints, but that may be too wordy. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Linux LSB init script
OK, I have a number of comments on this proposal. What does it buy us, in general, and compared to the existing (non-LSB) script? This becomes clearer when you consider what LSB actually entails, which is: - INFO header - standardized exit codes - functions from /lib/lsb/init-functions My argument is that it might be better to just merge the INFO header and the exit codes into the existing script and forget about using the functions. The old script already has a chkconfig header, which was the moral predecessor to the LSB INFO section. So someone evidently thought this sort of thing was useful, and so we might as well update it. Differentiating the exit codes is the bulk of your code, but what about merging this part into pg_ctl itself? The init-functions then don't really buy you anything, except that you replace pg_ctl with start_daemon and killproc, and good old echo with log_failure_msg. And then you proceed in pg_initd_stop to basically reimplement half of the pg_ctl logic in shell code. And using the init-functions is the only thing here that is not backward compatible, so removing it would address the concern about support pre/non-LSB distributions. On the code itself, I think you might have gone a bit overboard with commenting and error checking. First, the is, in my mind, no need to repeat half of the LSB spec in the comments. Second, there is no need to check for the existence of every script, program, file, and command line argument separately. The existing script actually does that well enough. If something is missing, the shell itself will complain soon enough. Quite aside from anything else, I'm fairly sure a bloated script like that will scare users away when they given the alternative choice of using the small and easy-to-navigate old script in the same directory. So in summary my proposal would be to take the existing linux script, add the INFO header, and try to put all the other functionality that is missing into pg_ctl, making the init script a small wrapper around it. That would actually benefit the mainstream init scripts as well. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Wed, Sep 16, 2009 at 8:59 AM, Robert Haas robertmh...@gmail.com wrote: On Sep 16, 2009, at 8:37 AM, Andrew Dunstan and...@dunslane.net wrote: Abhijit Menon-Sen wrote: At 2009-07-30 13:37:16 -0700, prent...@cisco.com wrote: This patch changes plpgsql IN parameters so they are mutable. Makes sense, applies fine, works fine. How does this compare with PLSQL? I know in Ada an IN argument is in effect a constant. I understand the utility, because I occasionally knock against this restriction, but if it's incompatible with PLSQL I think we should think about it more carefully. At worst it's an upward-compatible extension, or am I wrong? If it's useful, which I think it is, what's the harm? are we guarding against cases like: select _foo, adjust_foo(_foo) from bar; -- adjust_foo is inout ?? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 8:49 AM, Merlin Moncure wrote: On Wed, Sep 16, 2009 at 8:59 AM, Robert Haas robertmh...@gmail.com wrote: At worst it's an upward-compatible extension, or am I wrong? If it's useful, which I think it is, what's the harm? are we guarding against cases like: select _foo, adjust_foo(_foo) from bar; -- adjust_foo is inout Two things: 1) the patch only affects IN parameters, 2) the parameter is a local copy and doesn't affect parameters/ variables outside of its scope. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Wed, 2009-09-16 at 15:11 +0200, to...@tuxteam.de wrote: One question: does the operator have to be reflexive? I.e. A op A holds for all A? I don't think that reflexivity is a strict requirement. You could make this a constraint over a boolean attribute such that false conflicts with true and true conflicts with false. That would mean that your table would have to consist of either all false or all true. I am thinking proximity or as you state above similarity. May be this is a good metaphor, leading to a good name. That's an interesting idea: proximity constraint. I like it because (a) proximity might reasonably be considered a more general form of the word unique, which might satisfy Peter's argument; (b) it conveys the use case; and (c) it sounds good. There are a couple bizarre cases where proximity doesn't quite fit, like my boolean example above, but I'm OK with that. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Linux LSB init script
Peter Eisentraut pete...@gmx.net wrote: OK, I have a number of comments on this proposal. Thanks for looking at it. What does it buy us, in general, and compared to the existing (non-LSB) script? A conforming script for use in LSB conforming implementations. what LSB actually entails, which is: - INFO header - standardized exit codes - functions from /lib/lsb/init-functions As well as a standardized set of actions and standard semantics for them, if we're only talking about the init script parts. My argument is that it might be better to just merge the INFO header and the exit codes into the existing script and forget about using the functions. That's where I started, but felt that it was of no benefit unless the OS used them, in which case I thought conforming behavior would be wanted. The old script already has a chkconfig header, which was the moral predecessor to the LSB INFO section. So someone evidently thought this sort of thing was useful, and so we might as well update it. I'm not following what you say here. Is there something wrong with the current chkconfig header in the non-LSB script? Differentiating the exit codes is the bulk of your code, but what about merging this part into pg_ctl itself? That would result in different exit codes for a lot of circumstances. I'm OK with that, if everyone agrees that we want, for example an exit code of 0 for an attempt to start a server which is already running or an attempt to stop a server which isn't running. (These are only two examples of differences between what is required of an LSB conforming init script and what pg_ctl does. Are we all in universal agreement to make such a change to pg_ctl? The init-functions then don't really buy you anything, except that you replace pg_ctl with start_daemon and killproc, and good old echo with log_failure_msg. And then you proceed in pg_initd_stop to basically reimplement half of the pg_ctl logic in shell code. Well, with differences in behavior, of course, like attempting a fast shutdown, and escalating to an immediate shutdown if that doesn't succeed in the allowed time. And using the init-functions is the only thing here that is not backward compatible, so removing it would address the concern about support pre/non-LSB distributions. Use of these functions is a requirement for LSB conforming scripts. If you don't use them, particularly for success and failure messages, you're not conforming. For one thing, using these functions causes the success and failure messages to be formatted like those for other services on the machine, and the format varies a lot between distributions. Also, consider this statement in the spec: | Error and status messages should be printed with the logging | functions (see Init Script Functions) log_success_msg(), | log_failure_msg() and log_warning_msg(). Scripts may write to | standard error or standard output, but implementations need not | present text written to standard error/output to the user or do | anything else with it. Not using the functions would cause the script to only work for some undefined subset of conforming implementations. First, the is, in my mind, no need to repeat half of the LSB spec in the comments. Well, that's a bit of hyperbole -- I did the math and it's less than 10% of what I consider to be the part of the spec fairly directly related to init scripts. Which parts of these comments do you feel should be omitted?: # Actions other than status may use these return codes: # 1 - generic or unspecified error # 2 - invalid or excess argument(s) # 3 - unimplemented feature (for example, reload) # 4 - user had insufficient privilege # 5 - program is not installed # 6 - program is not configured # 7 - program is not running # Start the service. # If already running, return success without start attempt. # Stop the service. # If not running, return success without stop attempt. # Stop and restart the service if the service is already running, # otherwise start the service. # Restart the service if the service is already running. # If stopped, return success without stop attempt. # Cause the configuration of the service to be reloaded # without actually stopping and restarting the service. # (Since PostgreSQL supports reload, force-reload should use that.) # Print the current status of the service. # Return codes differ from other actions: # 0 - program is running or service is OK # 1 - program is dead and /var/run pid file exists # 2 - program is dead and /var/lock lock file exists # 3 - program is not running # 4 - program or service status is unknown # If we don't recognize action, consider it an invalid argument. # If the standard adds actions we don't support, exit should be 3 for those. I put in as much as I found useful to myself while working on the script. I left it in on the
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
IIRC the original complaint was from someone trying to migrate code from T/SQL or some other not-quite-PLSQL language. Like you, I'm on the fence about whether to accept this patch, but it does have some in-migration rationale. As someone who writes a lot of plpgsql, I'm in favor of the patch. 1. Compatibility with PL/SQL, especially in the extra features direction, has never been a tremendous priority for us before; 2. We don't particularly care if native plpgsql procedures can be back-ported to PLSQL, and if we did there are much greater compatibility issues than this one; 3. This patch eliminates a common plpgsql beginner error and saves all of us heavy plpgsql users some typing, especially when the use of a mutable variable means that we can eliminate the DECLARE section entirely, as in: This: CREATE PROCEDURE mod ( x int, y int ) RETURNS int LANGUAGE plpgsql AS $f$ DECLARE z INT := x; BEGIN z := x % y; RETURN z; END; $f$ Becomes this: CREATE PROCEDURE mod ( x int, y int ) RETURNS int LANGUAGE plpgsql AS $f$ BEGIN x := x % y; RETURN x; END; $f$ -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Wed, 2009-09-16 at 10:14 +0300, Peter Eisentraut wrote: What they should be called is generalized unique constraints, without reference to index. Because what they generalize is the operator by which uniqueness is determined. How about GUC, for short? ;-) Do you think that Tomás's suggestion of proximity constraints would satisfy your requirement on the basis that proximity is a more general form of the word unique? First, we have so far been fairly consistent to document that unique indexes are an implementation detail of unique constraints and should usually not be used directly. This new approach basically reverses that and forces you to define your constraints by means of implementation details rather than a logical description. There is nothing in this feature that makes it strikingly different from the existing constraint types in a way that would prevent a reasonable syntax for defining the constraint at table definition time. Another problem this would lead to is that a say dump of a table definition wouldn't actually contain all the constraints that apply to the table anymore, because there might be additional stuff such as this that can't be expressed that way. Those are all good points. I think ultimately we need to support this at table creation time and make index specification optional. We do need to allow the user to specify the index, however. An important use case of this feature is defining an index on (A, B, C) and using it to enforce UNIQUE(A,B) and UNIQUE(A,C). CREATE TABLE circles (c circle UNIQUE ON gist ); Should we use USING instead of ON to be consistent with CREATE INDEX? Also, right now a UNIQUE by itself creates a btree. Let's say the user has btree_gist installed and they declare (a =, b =) without specifying the AM. Which one do we choose? I suppose we can come up with a default order, like btree, hash, gist, gin; and just choose the first one with a matching opclass. Also, we should decide whether UNIQUE(a,b) should be shorthand for (a =, b =), or whether they should be treated differently somehow. If so, we'll need to come up with some kind of rules for how it determines that UNIQUE chooses to use a btree with indisunique enforcement; and we need to come up with a way to force it to choose the new enforcement mechanism in my patch if the user wants to. CREATE TABLE data (a int UNIQUE ON btree =); I think we should provide some way for the user to specify what enforcement mechanism is used when multiple options are possible. In this example should it use the existing mechanism or the new mechanism? ALTER TABLE circles ADD CONSTRAINT circles_idx_constr (c ) USING INDEX circles_idx; doesn't seem very intuitive about what is actually being constrained. For a while I was thinking that it was constraining the table to values that are in the index or something. So using a word such as UNIQUE would help explain what is going on. I'm still uncomfortable using the word UNIQUE. I see that you are taking an approach similar to ORDER BY ... USING. However, I'm concerned because we're using a special case word to describe the general feature, and I think that's confusing. On the other hand, it's hard to argue that (a , b =) is intuitive, so I'll acquiesce if you get some agreement from someone else. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Tue, 2009-09-15 at 22:52 +1000, Brendan Jurd wrote: I'm just getting started reviewing this version now. I noticed that your patch seems to have been generated by git. Ok, I now have a public git repo on git.postgresql.org, and I rebased my patch before I pushed it. See updates in my generalized-constraints branch. I have psql support now, so it might be slightly easier to review. The language and name of the feature are going through a little turmoil right now, as you can see, so I'm trying to keep up with that. As that settles down I'll improve the docs. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Wed, Sep 16, 2009 at 3:14 AM, Peter Eisentraut pete...@gmx.net wrote: On Tue, 2009-09-15 at 12:37 -0400, Robert Haas wrote: Instead of calling these generalized index constraints, I wonder if we oughtn't to be calling them something like don't-overlap constraints (that's a bad name, but something along those lines). They're not really general at all, except compared to uniqueness constraints (and they aren't called generalized unique-index constraints, just generalized index constraints). What they should be called is generalized unique constraints, without reference to index. Because what they generalize is the operator by which uniqueness is determined. Well, it should eventually be possible to use this feature to create an index which excludes overlapping ranges in fact, unless I misunderstand, that's the principle likely use case. Which is not unique-ness at all. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Feedback on getting rid of VACUUM FULL
Hackers, Here's the feedback on replacing VACUUM FULL with VACUUM REWRITE: http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959 Of note: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Lots of people have said something hypothetical, but nobody has come forward with a I have this database X and several times Y happened, and only FULL would work This makes me think that there very likey are no actual use cases where we need to preserve FULL. b) Several people have strongly pushed for a phased removal of FULL over more than one PG version, with a warning message about depreciation. c) Vivek had some points about required implementation: However, there still must be a way to compact the tables that is mvcc safe. From what I have read and recall, cluster is not. Thus, the vacuum rewrite would be a mandatory feature (or cluster could be made mvcc safe). Is Vivek correct about this? News to me ... -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Josh Berkus j...@agliodbs.com wrote: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Would rewrite have handled this?: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01016.php -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Josh Berkus wrote: Hackers, Here's the feedback on replacing VACUUM FULL with VACUUM REWRITE: http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959 Of note: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Lots of people have said something hypothetical, but nobody has come forward with a I have this database X and several times Y happened, and only FULL would work This makes me think that there very likey are no actual use cases where we need to preserve FULL. Well, Andrew McNamara just posted today: http://archives.postgresql.org/message-id/20090916063341.0735c5ac...@longblack.object-craft.com.au Had VACUUM FULL not been available, though, I'm pretty sure he would've come up with something else instead. c) Vivek had some points about required implementation: However, there still must be a way to compact the tables that is mvcc safe. From what I have read and recall, cluster is not. Thus, the vacuum rewrite would be a mandatory feature (or cluster could be made mvcc safe). Is Vivek correct about this? News to me ... No, that was fixed in 8.3. I was just going to post that we should make a decision about this, because ISTM there's some code in Simon's hot standby patch that is only required to support VACUUM FULL. If we make the decision that we drop VACUUM FULL in 8.5, we can take that part out of the patch now. It's not a huge amount of code, but still. I'm in favor of removing VACUUM FULL in 8.5. To replace it, we should offer: 1) VACUUM REWRITE, which is like CLUSTER but doesn't use an index, and 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 13:40 , Josh Berkus wrote: 3. This patch eliminates a common plpgsql beginner error and saves all of us heavy plpgsql users some typing, especially when the use of a mutable variable means that we can eliminate the DECLARE section entirely, as in: This: CREATE PROCEDURE mod ( x int, y int ) RETURNS int LANGUAGE plpgsql AS $f$ DECLARE z INT := x; BEGIN z := x % y; RETURN z; END; $f$ This is also currently valid: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ DECLARE z INT := x % y; BEGIN RETURN z; END; $f$ As is this: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ BEGIN RETURN (x % y); END; $f$ Michael Glaesemann grzm seespotcode net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. An advantage here is that it would allow people to do a partial vacuum full to gradually move tuples from the end of the relation to the beginning. That would allow vacuums in between the updates to free the index tuples, preventing index bloat. Another thing to think about is that lazy vacuum only shrinks the heap file if it happens to be able to acquire an access exclusive lock. Because vacuum can't be run inside a transaction block, I don't think there's currently a way to ensure that the heap file actually gets shrunk. How about we provide some way to make it acquire an access exclusive lock at the beginning, but still perform a lazy vacuum? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On 9/16/09 11:20 AM, Kevin Grittner wrote: Josh Berkus j...@agliodbs.com wrote: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Would rewrite have handled this?: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01016.php Ok, that sounds like a real use case. However, given Heikki's post about FULL being an issue for Hot Standby, I'm more inclined to provide a workaround ... for example, allowing REWRITE to write to a designated tablespace, which would allow people to use a portable drive or similar for the extra disk space. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 11:10 -0700, Josh Berkus wrote: Hackers, Here's the feedback on replacing VACUUM FULL with VACUUM REWRITE: http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959 Of note: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. The only case is when you are out of disk space and can't afford to write out a full set of live rows. What I'd like to propose is VACUUM FULL CONCURRENTLY, which similar to VACUUM CONCURRENTLY, would actually do the compaction phase, that is, move simultaneously from two directions, from start, to find empty space and from end to find tuples. for each sufficiently large empty space the forward scan finds it would take one or more tuples from the reverse scan and then null update those to the empty space found by the free-space-scan beginning. it should do that in small chunks, say one page at a time, so it will minimally interfere with OLTP loads. Once these two scans meet, you can stop and either run an non full vacuum, or just continue in similar fashion to non-full vacuum and do the cleanups of indexes and heap. You may need to repeat this a few times to get actual shrinkage but it has the very real advantage of being usable on 24/7 systems, which neither VACUUM FULL nor CLUSTER possess. At some point I actually had external scripts doing similar stuff for on-line table shrinking, the only difference being that I could not move the tuple towards beginning right away (pg preferred in-page updates) and had to keep doing null updates (id=id where id) until the page number in ctid changed. Lots of people have said something hypothetical, but nobody has come forward with a I have this database X and several times Y happened, and only FULL would work This makes me think that there very likey are no actual use cases where we need to preserve FULL. b) Several people have strongly pushed for a phased removal of FULL over more than one PG version, with a warning message about depreciation. c) Vivek had some points about required implementation: However, there still must be a way to compact the tables that is mvcc safe. From what I have read and recall, cluster is not. Thus, the vacuum rewrite would be a mandatory feature (or cluster could be made mvcc safe). Is Vivek correct about this? News to me ... It used to be true at some point, probably not true any more. IIRC, the problem was, that old table was not locked during rewrite and thus some code could be updating the old heap even while the data had been muved to the new one. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: I was just going to post that we should make a decision about this, because ISTM there's some code in Simon's hot standby patch that is only required to support VACUUM FULL. If we make the decision that we drop VACUUM FULL in 8.5, we can take that part out of the patch now. It's not a huge amount of code, but still. I'm in favor of removing VACUUM FULL in 8.5. To replace it, we should offer: 1) VACUUM REWRITE, which is like CLUSTER but doesn't use an index, and 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. I have not checked, but I suspect pg_reorg may already be doing something similar http://pgfoundry.org/forum/forum.php?forum_id=1561 -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Hannu, The only case is when you are out of disk space and can't afford to write out a full set of live rows. Well, it's actually rather specific. You need to have: a) *Some* free disk space (FULL requires extra disk) but not enough to copy one entire table and its indexes. b) be already down or willing to accept the long downtime which comes with FULL more than you're willing to go out and get some extra disk or move your database to a new share. There's no question that this combination is fairly circumstantial and represents a minority of potential vacuum cases. Unfortunately, it does seem to represent some real-life ones, so we have to take those into account. What I'd like to propose is VACUUM FULL CONCURRENTLY, which similar to VACUUM CONCURRENTLY, would actually do the compaction phase, that is, move simultaneously from two directions, from start, to find empty space and from end to find tuples. for each sufficiently large empty space the forward scan finds it would take one or more tuples from the reverse scan and then null update those to the empty space found by the free-space-scan beginning. it should do that in small chunks, say one page at a time, so it will minimally interfere with OLTP loads. How would this work with HS? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 11:10 -0700, Josh Berkus wrote: Hackers, Here's the feedback on replacing VACUUM FULL with VACUUM REWRITE: http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959 Of note: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. The only case is when you are out of disk space and can't afford to write out a full set of live rows. What I'd like to propose is VACUUM FULL CONCURRENTLY, which similar to VACUUM CONCURRENTLY, would actually do the compaction phase, that is, move simultaneously from two directions, from start, to find empty space and from end to find tuples. for each sufficiently large empty space the forward scan finds it would take one or more tuples from the reverse scan and then null update those to the empty space found by the free-space-scan beginning. it should do that in small chunks, say one page at a time, so it will minimally interfere with OLTP loads. Once these two scans meet, you can stop and either run an non full vacuum, or just continue in similar fashion to non-full vacuum and do the cleanups of indexes and heap. You may need to repeat this a few times to get actual shrinkage but it has the very real advantage of being usable on 24/7 systems, which neither VACUUM FULL nor CLUSTER possess. At some point I actually had external scripts doing similar stuff for on-line table shrinking, the only difference being that I could not move the tuple towards beginning right away (pg preferred in-page updates) and had to keep doing null updates (id=id where id) until the page number in ctid changed. Lots of people have said something hypothetical, but nobody has come forward with a I have this database X and several times Y happened, and only FULL would work This makes me think that there very likey are no actual use cases where we need to preserve FULL. b) Several people have strongly pushed for a phased removal of FULL over more than one PG version, with a warning message about depreciation. c) Vivek had some points about required implementation: However, there still must be a way to compact the tables that is mvcc safe. From what I have read and recall, cluster is not. Thus, the vacuum rewrite would be a mandatory feature (or cluster could be made mvcc safe). Is Vivek correct about this? News to me ... It used to be true at some point, probably not true any more. IIRC, the problem was, that old table was not locked during rewrite and thus some code could be updating the old heap even while the data had been muved to the new one. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 13:20 -0500, Kevin Grittner wrote: Josh Berkus j...@agliodbs.com wrote: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Would rewrite have handled this?: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01016.php If REWRITE is just a CLUSTER using seqscan, then no If it is a sequence of 1. ordinary VACUUM (it can't run out of FSM anymore, no?) 2. a process moving live tuples from end (using reverse seqscan) to free space found scanning in first-to-last direction, either one tuple at a time or one page at a time, until the two scans meet 3. another ordinary VACUUM to actually reclaim the free space 4. repeat a few times so that tuples at the end of relation (for whatever reason) added while doing 1-3 are also moved towards beginning then yes, it would have taken some time, but it would have definitely helped It would still have caused index bloat, so to get full benefit of it, one should have finished it up with an equivalent of CONCURRENT REINDEX. -- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 11:48 -0700, Josh Berkus wrote: Hannu, The only case is when you are out of disk space and can't afford to write out a full set of live rows. Well, it's actually rather specific. You need to have: a) *Some* free disk space (FULL requires extra disk) but not enough to copy one entire table and its indexes. b) be already down or willing to accept the long downtime which comes with FULL more than you're willing to go out and get some extra disk or move your database to a new share. There's no question that this combination is fairly circumstantial and represents a minority of potential vacuum cases. Unfortunately, it does seem to represent some real-life ones, so we have to take those into account. Agreed. What I'd like to propose is VACUUM FULL CONCURRENTLY, which similar to VACUUM CONCURRENTLY, would actually do the compaction phase, that is, move simultaneously from two directions, from start, to find empty space and from end to find tuples. for each sufficiently large empty space the forward scan finds it would take one or more tuples from the reverse scan and then null update those to the empty space found by the free-space-scan beginning. it should do that in small chunks, say one page at a time, so it will minimally interfere with OLTP loads. How would this work with HS? Exactly the same as just doing a lot of UPDATE's which move tuples around between pages. It actually _is_ a lots of updates, just with extra condition that tuple is always moved to lowest available free slot. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Hannu, If it is a sequence of 1. ordinary VACUUM (it can't run out of FSM anymore, no?) 2. a process moving live tuples from end (using reverse seqscan) to free space found scanning in first-to-last direction, either one tuple at a time or one page at a time, until the two scans meet 3. another ordinary VACUUM to actually reclaim the free space 4. repeat a few times so that tuples at the end of relation (for whatever reason) added while doing 1-3 are also moved towards beginning Sounds good, you want to code it for 8.5? I could actually see two tools, one VACUUM FULL CONCURRENTLY and one VACUUM REWRITE. The first would be in place and the second would be fast. Both should work better with HS than current VF does. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Michael, This is also currently valid: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ DECLARE z INT := x % y; BEGIN RETURN z; END; $f$ As is this: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ BEGIN RETURN (x % y); END; $f$ Certainly. I was doing that to have a simple example; obviously you wouldn't write a mod funciton, and you wouldn't do it in plpgsql. There are other case where the lack of mutability in IN parameters causes you to create a throwaway variable. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Wed, Sep 16, 2009 at 09:45:52AM -0700, Jeff Davis wrote: On Wed, 2009-09-16 at 15:11 +0200, to...@tuxteam.de wrote: One question: does the operator have to be reflexive? I.e. A op A holds for all A? I don't think that reflexivity is a strict requirement. You could make this a constraint over a boolean attribute such that false conflicts with true and true conflicts with false. That would mean that your table would have to consist of either all false or all true. Let me see whether I've understood this: more in general, op could be not equal -- i.e. . Then, once one value was inserted into the column, all other values would have to be equal. [...] That's an interesting idea: proximity constraint. I like it because (a) proximity might reasonably be considered a more general form of the word unique, which might satisfy Peter's argument; (b) it conveys the use case; and (c) it sounds good. Yes, riding on this geometric metaphor (distance), I was trying to visualize relations which are symmetric but not refexive and came up with things like point X is at least d1 far, at most d2 far from Y i.e. Y are all points which are whithin a ring centered on Y, with inner radius d1 and outer radius d2. Special cases would be d1=0, d20 (that's conventional proximity) -- but there are weirder cases, as your example above (d2 possibly infinite, d1 small, giving a punctured space). There are a couple bizarre cases where proximity doesn't quite fit, like my boolean example above, but I'm OK with that. Hmmm. Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFKsT88Bcgs9XrR2kYRAlmcAJ9+rP7AkimXRPoKGaBoJkthX2LzggCfTWst KF/XMRouhlEbQcORaeoIbc0= =BBEF -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 15:17 , Josh Berkus wrote: Michael, This is also currently valid: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ DECLARE z INT := x % y; BEGIN RETURN z; END; $f$ As is this: CREATE FUNCTION mod (x int, y int) RETURNS int LANGUAGE plpgsql AS $f$ BEGIN RETURN (x % y); END; $f$ Certainly. I was doing that to have a simple example; obviously you wouldn't write a mod funciton, and you wouldn't do it in plpgsql. There are other case where the lack of mutability in IN parameters causes you to create a throwaway variable. Have an example at hand? I'd argue that in a case of a function of more complexity from a code clarity standpoint you'd want to assign to a new variable that describes what the new value reflects. Michael Glaesemann grzm seespotcode net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
On Sep 16, 2009, at 12:44 PM, Michael Glaesemann wrote: Certainly. I was doing that to have a simple example; obviously you wouldn't write a mod funciton, and you wouldn't do it in plpgsql. There are other case where the lack of mutability in IN parameters causes you to create a throwaway variable. Have an example at hand? I'd argue that in a case of a function of more complexity from a code clarity standpoint you'd want to assign to a new variable that describes what the new value reflects. I can't say I disagree with you from a purist standpoint, but for porting existing code sometimes it's more efficient to port what you have without rewriting it. In some of the code I'm looking at porting, this is a very simple example of a common pattern I'm seeing: create function create_some_object(pobjectid uuid, psomefkobjectid uuid) returns uuid as $$ begin if pobjectid is null then pobjectid := newid() end if if psomefkobjectid is null then select objectid into psomefkobjectid from somefktable where whatever; end if -- create the object return pobjectid end; -Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Hannu Krosing wrote: On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. I have not checked, but I suspect pg_reorg may already be doing something similar http://pgfoundry.org/forum/forum.php?forum_id=1561 Hmm, AFAICT pg_reorg is much more complex, writing stuff to a temp table and swapping relfilenodes afterwards. More like the VACUUM REWRITE that's been discussed. For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. It uses the same principle as VACUUM FULL, scans from the end, moving tuples to lower-numbered pages until it can't do it anymore. It requires a small change to heap_update(), to override the preference to store the new tuple on the same page as the old one, but other than that, it's all in the external module. To test: -- Create and populate test table CREATE TABLE foo (id int4 PRIMARY KEY); INSERT INTO foo SELECT a FROM generate_series(1,10) a; -- Delete a lot of tuples from the beginning. This creates the hole that we want to compact out. DELETE FROM foo WHERE id 9; -- Vacuum to remove the dead tuples VACUUM VERBOSE foo; -- Run the utility to move the tuples SELECT vacuumfull('foo'); -- Vacuum table again to remove the old tuple versions of the moved rows and truncate the file. VACUUM VERBOSE foo; -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/contrib/Makefile b/contrib/Makefile index 0afa149..59c9279 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -40,6 +40,7 @@ SUBDIRS = \ test_parser \ tsearch2 \ unaccent \ + vacuumfull \ vacuumlo ifeq ($(with_openssl),yes) diff --git a/contrib/vacuumfull/Makefile b/contrib/vacuumfull/Makefile new file mode 100644 index 000..925d2c4 --- /dev/null +++ b/contrib/vacuumfull/Makefile @@ -0,0 +1,24 @@ +#- +# +# vacuumfull Makefile +# +# $PostgreSQL$ +# +#- + +MODULE_big = vacuumfull +OBJS = vacuumfull.o +DATA_built = vacuumfull.sql +DATA = uninstall_vacuumfull.sql + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/vacuumfull +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + diff --git a/contrib/vacuumfull/uninstall_vacuumfull.sql b/contrib/vacuumfull/uninstall_vacuumfull.sql new file mode 100644 index 000..9ecab84 --- /dev/null +++ b/contrib/vacuumfull/uninstall_vacuumfull.sql @@ -0,0 +1,6 @@ +/* $PostgreSQL$ */ + +-- Adjust this setting to control where the objects get dropped. +SET search_path = public; + +DROP FUNCTION vacuumfull(regclass); diff --git a/contrib/vacuumfull/vacuumfull.c b/contrib/vacuumfull/vacuumfull.c new file mode 100644 index 000..07139ba --- /dev/null +++ b/contrib/vacuumfull/vacuumfull.c @@ -0,0 +1,286 @@ +/*- + * + * vacuumfull.c + * An utility to replace old VACUUM FULL + * + * XXX + * + * Copyright (c) 2007-2009, PostgreSQL Global Development Group + * + * IDENTIFICATION + * $PostgreSQL$ + * + *- + */ + +#include postgres.h + +#include access/heapam.h +#include access/xact.h +#include executor/executor.h +#include miscadmin.h +#include storage/bufmgr.h +#include storage/procarray.h +#include utils/acl.h +#include utils/tqual.h +#include utils/inval.h +#include utils/memutils.h + +PG_MODULE_MAGIC; + +Datum vacuumfull(PG_FUNCTION_ARGS); + + +/*-- + * ExecContext: + * + * As these variables always appear together, we put them into one struct + * and pull initialization and cleanup into separate routines. + * ExecContext is used by repair_frag() and move_xxx_tuple(). More + * accurately: It is *used* only in move_xxx_tuple(), but because this + * routine is called many times, we initialize the struct just once in + * repair_frag() and pass it on to move_xxx_tuple(). + */ +typedef struct ExecContextData +{ + ResultRelInfo *resultRelInfo; + EState *estate; + TupleTableSlot *slot; +} ExecContextData; + +typedef ExecContextData *ExecContext; + +static void +ExecContext_Init(ExecContext ec, Relation rel) +{ + TupleDesc tupdesc = RelationGetDescr(rel); + + /* + * We need a ResultRelInfo and an EState so we can use the regular + * executor's index-entry-making machinery. + */ + ec-estate =
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 23:53 +0300, Heikki Linnakangas wrote: Hannu Krosing wrote: On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. I have not checked, but I suspect pg_reorg may already be doing something similar http://pgfoundry.org/forum/forum.php?forum_id=1561 Hmm, AFAICT pg_reorg is much more complex, writing stuff to a temp table and swapping relfilenodes afterwards. More like the VACUUM REWRITE that's been discussed. For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. It uses the same principle as VACUUM FULL, scans from the end, moving tuples to lower-numbered pages until it can't do it anymore. It requires a small change to heap_update(), to override the preference to store the new tuple on the same page as the old one, but other than that, it's all in the external module. Exactly as I hoped :D One thing that would be harder to do, and which CLUSTER currently does is introducing empty space within pages, based on fillfactor. Doing that would need a similar, though reversed strategy. But it is probably not something that is often needed, as a an update on page with no free space would eventually do almost the same. To test: -- Create and populate test table CREATE TABLE foo (id int4 PRIMARY KEY); INSERT INTO foo SELECT a FROM generate_series(1,10) a; -- Delete a lot of tuples from the beginning. This creates the hole that we want to compact out. DELETE FROM foo WHERE id 9; -- Vacuum to remove the dead tuples VACUUM VERBOSE foo; -- Run the utility to move the tuples SELECT vacuumfull('foo'); -- Vacuum table again to remove the old tuple versions of the moved rows and truncate the file. VACUUM VERBOSE foo; Now, if you could just make vacuumfull('foo'); run in multiple transactions (say one per N tuples moved, or even per N seconds spent) to make it friendlier for OLTP workloads, which then dont have to wait for the whole thing to finish in order to proceed with update of a moved tuple (and also to deal with deadloks from trying to move an updated tuple) then I'd claim we have a much better VACUUM FULL :) -- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 23:53 +0300, Heikki Linnakangas wrote: For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. It uses the same principle as VACUUM FULL, scans from the end, moving tuples to lower-numbered pages until it can't do it anymore. It requires a small change to heap_update(), to override the preference to store the new tuple on the same page as the old one, but other than that, it's all in the external module. It fails at initdb time for me: FATAL: unrecognized heap_update status: 5 STATEMENT: REVOKE ALL on pg_authid FROM public; Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Michael, Have an example at hand? I'd argue that in a case of a function of more complexity from a code clarity standpoint you'd want to assign to a new variable that describes what the new value reflects. Depends on what programming language you're used to. For those of us who do a lot of pass-by-reference in our non-database code, reusing the IN variable is natural. I know not being able to is a longstanding annoyance for me. And I really don't think it's the place of the PostgreSQL project to try to force what some of us think is good PL coding style on people. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, Sep 16, 2009 at 4:53 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hannu Krosing wrote: On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. I have not checked, but I suspect pg_reorg may already be doing something similar http://pgfoundry.org/forum/forum.php?forum_id=1561 Hmm, AFAICT pg_reorg is much more complex, writing stuff to a temp table and swapping relfilenodes afterwards. More like the VACUUM REWRITE that's been discussed. For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. It uses the same principle as VACUUM FULL, scans from the end, moving tuples to lower-numbered pages until it can't do it anymore. It requires a small change to heap_update(), to override the preference to store the new tuple on the same page as the old one, but other than that, it's all in the external module. To test: -- Create and populate test table CREATE TABLE foo (id int4 PRIMARY KEY); INSERT INTO foo SELECT a FROM generate_series(1,10) a; -- Delete a lot of tuples from the beginning. This creates the hole that we want to compact out. DELETE FROM foo WHERE id 9; -- Vacuum to remove the dead tuples VACUUM VERBOSE foo; -- Run the utility to move the tuples SELECT vacuumfull('foo'); -- Vacuum table again to remove the old tuple versions of the moved rows and truncate the file. VACUUM VERBOSE foo; I think this should be in core, not a contrib module. I also wonder whether we should consider teaching regular VACUUM to do a little of this every time it's run. Right now, once your table gets bloated, it stays bloated forever, until you intervene. Making it slowly get better by itself would reduce the number of people who live with the problem for a month or a year before writing in to say Access to this table seems really slow ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Josh Berkus wrote: Michael, Have an example at hand? I'd argue that in a case of a function of more complexity from a code clarity standpoint you'd want to assign to a new variable that describes what the new value reflects. Depends on what programming language you're used to. For those of us who do a lot of pass-by-reference in our non-database code, reusing the IN variable is natural. I know not being able to is a longstanding annoyance for me. It's the pass by reference case that would be dangerous, in fact. The fact that in C all function parameters are passed by value (unlike, say, FORTRAN) is what makes it safe to modify them inside the function. Anyway, debates about such thigs tend to get a bit religious. getting more practical, I'm slightly inclined to say Steve Prentice has made a good enough case for doing this. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby 0.2.1
All, Now that Simon has submitted this, can some of the heavy-hitters here review it? Heikki? Nobody's name is next to it. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, Sep 16, 2009 at 1:42 PM, Josh Berkus j...@agliodbs.com wrote: On 9/16/09 11:20 AM, Kevin Grittner wrote: Josh Berkus j...@agliodbs.com wrote: a) To date, I have yet to hear a single person bring up an actual real-life use-case where VACUUM FULL was desireable and REWRITE would not be. Would rewrite have handled this?: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01016.php Ok, that sounds like a real use case. However, given Heikki's post about FULL being an issue for Hot Standby, I'm more inclined to provide a workaround ... for example, allowing REWRITE to write to a designated tablespace, which would allow people to use a portable drive or similar for the extra disk space. if you have a portable drive at hand you can create a tablespace in that dirve, move the table to that tablespace, return to the old tablespace and drop the new tblspc... ok, one command for all that could be handy but not a need... the real problem is when you *don't* have more space... i have been recently in that situation and vaccum full was a life saver but the only reason that server came to that situation was a horribly fsm configuration and a bad design that forces an incredible amount of updates... -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic copy options
Robert, Here is a new version of the patch with an updated doc and psql. I changed the name of the CSV options to prefix them with csv_ to avoid confusion with any future options. I also had to change the grammar to allow '*' as a parameter (needed for cvs_force_quote). When we decide to drop the old syntax (in 8.6?), we will be able to clean a lot especially in psql. Emmanuel On Mon, Sep 14, 2009 at 2:51 PM, Emmanuel Cecchet m...@asterdata.com wrote: This looks good. Shoud I try to elaborate on that for the patch with error logging and autopartitioning in COPY? That make sense to me. You shouldn't need to do anything else in gram.y; whatever you want to add should just involve changing copy.c. If not, please post the details. We also need to fix the psql end of this, and the docs... any interest in taking a crack at either of those? ...Robert -- Emmanuel Cecchet Aster Data Systems Web: http://www.asterdata.com ### Eclipse Workspace Patch 1.0 #P Postgres8.5-COPY Index: src/test/regress/sql/copy2.sql === RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copy2.sql,v retrieving revision 1.18 diff -u -r1.18 copy2.sql --- src/test/regress/sql/copy2.sql 25 Jul 2009 00:07:14 - 1.18 +++ src/test/regress/sql/copy2.sql 16 Sep 2009 22:37:31 - @@ -73,17 +73,17 @@ \. -- various COPY options: delimiters, oids, NULL string -COPY x (b, c, d, e) from stdin with oids delimiter ',' null 'x'; +COPY x (b, c, d, e) from stdin (oids, delimiter ',', null 'x'); 50,x,45,80,90 51,x,\x,\\x,\\\x 52,x,\,,\\\,,\\ \. -COPY x from stdin WITH DELIMITER AS ';' NULL AS ''; +COPY x from stdin (DELIMITER ';', NULL ''); 3000;;c;; \. -COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X'; +COPY x from stdin (DELIMITER ':', NULL E'\\X'); 4000:\X:C:\X:\X 4001:1:empty:: 4002:2:null:\X:\X @@ -108,13 +108,13 @@ INSERT INTO no_oids (a, b) VALUES (20, 30); -- should fail -COPY no_oids FROM stdin WITH OIDS; -COPY no_oids TO stdout WITH OIDS; +COPY no_oids FROM stdin (OIDS); +COPY no_oids TO stdout (OIDS); -- check copy out COPY x TO stdout; COPY x (c, e) TO stdout; -COPY x (b, e) TO stdout WITH NULL 'I''m null'; +COPY x (b, e) TO stdout (NULL 'I''m null'); CREATE TEMP TABLE y ( col1 text, @@ -130,11 +130,23 @@ COPY y TO stdout WITH CSV FORCE QUOTE col2 ESCAPE E'\\'; COPY y TO stdout WITH CSV FORCE QUOTE *; +-- Test new 8.5 syntax + +COPY y TO stdout (CSV); +COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|'); +COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\'); +COPY y TO stdout (CSV, CSV_FORCE_QUOTE *); + +\COPY y TO stdout (CSV) +\COPY y TO stdout (CSV, CSV_QUOTE , DELIMITER '|') +\COPY y TO stdout (CSV, CSV_FORCE_QUOTE (col2), CSV_ESCAPE E'\\') +\COPY y TO stdout (CSV, CSV_FORCE_QUOTE *) + --test that we read consecutive LFs properly CREATE TEMP TABLE testnl (a int, b text, c int); -COPY testnl FROM stdin CSV; +COPY testnl FROM stdin (CSV); 1,a field with two LFs inside,2 @@ -143,14 +155,14 @@ -- test end of copy marker CREATE TEMP TABLE testeoc (a text); -COPY testeoc FROM stdin CSV; +COPY testeoc FROM stdin (CSV); a\. \.b c\.d \. \. -COPY testeoc TO stdout CSV; +COPY testeoc TO stdout (CSV); DROP TABLE x, y; DROP FUNCTION fn_x_before(); Index: src/test/regress/sql/aggregates.sql === RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/aggregates.sql,v retrieving revision 1.15 diff -u -r1.15 aggregates.sql --- src/test/regress/sql/aggregates.sql 25 Apr 2009 16:44:56 - 1.15 +++ src/test/regress/sql/aggregates.sql 16 Sep 2009 22:37:31 - @@ -104,7 +104,7 @@ BIT_OR(i4) AS ? FROM bitwise_test; -COPY bitwise_test FROM STDIN NULL 'null'; +COPY bitwise_test FROM STDIN (NULL 'null'); 1 1 1 1 1 B0101 3 3 3 null2 B0100 7 7 7 3 4 B1100 @@ -171,7 +171,7 @@ BOOL_OR(b3)AS n FROM bool_test; -COPY bool_test FROM STDIN NULL 'null'; +COPY bool_test FROM STDIN (NULL 'null'); TRUE nullFALSE null FALSE TRUEnullnull null TRUEFALSE null Index: src/test/regress/sql/copyselect.sql === RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copyselect.sql,v retrieving revision 1.2 diff -u -r1.2 copyselect.sql --- src/test/regress/sql/copyselect.sql 7 Aug 2008 01:11:52 - 1.2 +++ src/test/regress/sql/copyselect.sql 16 Sep 2009 22:37:31 - @@ -61,7 +61,7 @@ -- -- Test headers, CSV and quotes -- -copy (select t from test1 where id = 1) to stdout csv header force quote t; +copy (select t from test1 where id = 1) to stdout (csv, csv_header, csv_force_quote (t)); -- -- Test psql builtins, plain table -- Index: src/test/regress/expected/aggregates.out
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Robert Haas robertmh...@gmail.com wrote: I think this should be in core, not a contrib module. +1 I also wonder whether we should consider teaching regular VACUUM to do a little of this every time it's run. Right now, once your table gets bloated, it stays bloated forever, until you intervene. Making it slowly get better by itself would reduce the number of people who live with the problem for a month or a year before writing in to say Access to this table seems really slow +1 if feasible. That would be a very nice feature. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby 0.2.1
On Wed, Sep 16, 2009 at 6:05 PM, Josh Berkus j...@agliodbs.com wrote: Now that Simon has submitted this, can some of the heavy-hitters here review it? Heikki? Nobody's name is next to it. I don't think anyone is planning to ignore this patch, but it wasn't included in the first round of round-robin reviewing assignments because it wasn't submitted until the following day, after the announced deadline for submissions had already passed. So most people are probably busy with with some other patch at the moment, but that's a temporary phenomenon. This is a pretty small CommitFest, so there shouldn't be any shortage of reviewers, though Heikki's time may be stretched a little thin, since Streaming Replication is also in the queue, and he is working on index-only scans. That's really for him to comment on, though. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Robert Haas wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hannu Krosing wrote: On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: 2) Another utility that does something like UPDATE ... WHERE ctid ? to I also wonder whether we should consider teaching regular VACUUM to do a little of this every time it's run. Right now, once your table gets Having it be built into VACUUM would surprise me a bit, but I wonder if autovacuum could detect when such a tuple-mover would be useful, and run one before it does a VACUUM if needed. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Josh Berkus j...@agliodbs.com writes: 3. This patch eliminates a common plpgsql beginner error With respect, that argument is one hundred percent false. I can think of maybe two complaints about the behavior that we've heard in the last ten years. The only real argument I've heard in favor of this is that it will simplify importing not-too-well-written Informix code. That might be sufficient reason, but let's not invent claims about it being a common problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. I don't think you've really thought this through; particularly not this: + rel = heap_open(relid, AccessShareLock); You can NOT modify a relation with only AccessShareLock, and frankly I doubt you should be doing this with less than exclusive lock. Which would make the thing quite unpleasant to use in practice. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pgbench: new feature allowing to launch shell commands
Hi all, Sorry for my late reply. There is no other update for this patch since the 13th of August, at least until today. The new patch is attached By the way I worked on the comments that Dan and Gabriel pointed out. I added a check on system such as to prevent an error from this side. By the way, I noticed that the way return values are managed in doCustom and in process_commands is different. Such as to make an homogeneous code, return values are managed the same way in both functions in the patch, explaining why I did not return a specific value when file commands are treated in doCustom. Here is also as wanted a simple transaction that permits to use this function: \set nbranches :scale \set naccounts 10 * :scale \setrandom aid 1 :naccounts \setrandom bid 1 :nbranches \setrandom delta -5000 5000 \setrandom txidrand 0 1 START TRANSACTION; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; PREPARE TRANSACTION ':txidrand'; \shell ls -ll $PGDATA/pg_twophase COMMIT PREPARED ':txidrand'; The shell command I use here permits to scan the 2PC state files written in pg_twophase. You will notice that in this case files have a size of 540B. Also please do not forget to set PGDATA. The new functionnality proposed here is just for analysis purposes. Even if it decreased the performance of pgbench, it is interesting to have a simple benchmark that permits to analyse precisely the system while transaction are being run. Regards, --- postgresql-8.4.0.orig/contrib/pgbench/pgbench.c 2009-09-17 09:07:24.0 +0900 +++ postgresql-8.4.0/contrib/pgbench/pgbench.c 2009-09-17 09:03:35.0 +0900 @@ -120,6 +120,7 @@ typedef struct } Variable; #define MAX_FILES 128 /* max number of SQL script files allowed */ +#define SHELL_COMMAND_SIZE 256 /* maximum size allowed for shell command */ /* * structures used in custom query mode @@ -984,7 +985,44 @@ top: st-listen = 1; } + else if (pg_strcasecmp(argv[0], shell) == 0) + { + int j, +retval, +retvalglob; + char commandLoc[SHELL_COMMAND_SIZE]; + + retval = snprintf(commandLoc,SHELL_COMMAND_SIZE-1,%s,argv[1]); + if (retval 0 +|| retval SHELL_COMMAND_SIZE-1) + { +fprintf(stderr, Error launching shell command: too many characters\n); +return; + } + retvalglob = retval; + for (j = 2; j argc; j++) + { +char *commandLoc2 = strdup(commandLoc); +retval = snprintf(commandLoc,SHELL_COMMAND_SIZE-1,%s %s, commandLoc2, argv[j]); +retvalglob += retval; +if (retval 0 + || retvalglob SHELL_COMMAND_SIZE-1) +{ + fprintf(stderr, Error launching shell command: too many characters\n); + free(commandLoc2); + return; +} +free(commandLoc2); + } + retval = system(commandLoc); + if (retval 0) + { +fprintf(stderr, Error launching shell command: command not launched); +return; + } + st-listen = 1; + } goto top; } } @@ -1280,6 +1318,14 @@ process_commands(char *buf) fprintf(stderr, %s: extra argument \%s\ ignored\n, my_commands-argv[0], my_commands-argv[j]); } + else if (pg_strcasecmp(my_commands-argv[0], shell) == 0) + { + if (my_commands-argc 1) + { +fprintf(stderr, %s: missing command\n, my_commands-argv[0]); +return NULL; + } + } else { fprintf(stderr, Invalid command %s\n, my_commands-argv[0]); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I'm in favor of removing VACUUM FULL in 8.5. To replace it, we should offer: 1) VACUUM REWRITE, which is like CLUSTER but doesn't use an index, and Check, although I'm not eager to make REWRITE a fully reserved word, which is what this would entail. I would propose that we call this VACUUM FULL. 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. I think it does require a table lock; you are ignoring the impact on concurrent transactions of changing existing tuples' CTIDs (and XMINs). In particular this could absolutely NOT be a standard part of plain vacuum, despite all the wishful thinking going on downthread. But if we get rid of old-style VACUUM FULL then we do need something to cover those few-and-far-between situations where you really do desperately need to compact a table in place; and a utility like this seems like a reasonable solution. I'm thinking in particular that it should be possible to have it move just a bounded number of tuples at a time, so that you could do a VACUUM to clean out the indexes in between move passes. Otherwise you run the risk of running out of disk space anyway, due to index bloat. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote: I was just going to post that we should make a decision about this, because ISTM there's some code in Simon's hot standby patch that is only required to support VACUUM FULL. If we make the decision that we drop VACUUM FULL in 8.5, we can take that part out of the patch now. It's not a huge amount of code, but still. All it saves is a few hacks, which realistically don't cause anything more than an eyesore. VF has been ugly for years so we don't need to react quickly and I don't want to delay HS. Please let's not focus on side problems. I'm in favor of removing VACUUM FULL in 8.5. To replace it, we should offer: 1) VACUUM REWRITE, which is like CLUSTER but doesn't use an index, and I think that can be called VACUUM FULL also. We are just changing the internal implementation after all. There are too many scripts that already invoke VF to ask people to rewrite. 2) Another utility that does something like UPDATE ... WHERE ctid ? to move tuples to lower pages. It will be different from current VACUUM FULL in some ways. It won't require a table lock, for example, but it won't be able to move update chains as nicely. But it would be trivial to write one, so I think we should offer that as a contrib module. Hmmm, I think such a utility could easily cause more complaints than VACUUM FULL unless we had a few other things as well. It doesn't move update chains so it will mean that the table will not be able to shrink immediately, nor even for a long time afterwards if there are long queries. If a table were concurrently updated then this would not help at all, unless the FSM channelled *all* backends carefully to parts of the table that would help the process rather than hinder it. It will also bloat indexes just as VF does now. REINDEX CONCURRENTLY would help with that and we need it anyway for other reasons - and it needs to be invoked by autovacuum. A better way would be to have the FSM sense that packing was needed and then alter the path transactions take so that they naturally begin to repack the table over time. That way we wouldn't need to run a utility at all in most cases. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 11:40 -0700, Jeff Davis wrote: Another thing to think about is that lazy vacuum only shrinks the heap file if it happens to be able to acquire an access exclusive lock. Because vacuum can't be run inside a transaction block, I don't think there's currently a way to ensure that the heap file actually gets shrunk. How about we provide some way to make it acquire an access exclusive lock at the beginning, but still perform a lazy vacuum? I think it would be useful to have an additional option to force VACUUM to wait for the lock so it can truncate. It's annoying to have to re-run VACUUM just to give it a chance at the lock again. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic copy options
On Wed, Sep 16, 2009 at 6:43 PM, Emmanuel Cecchet m...@asterdata.com wrote: Here is a new version of the patch with an updated doc and psql. Thanks, that's great. I don't think the way the doc changes are formatted is consistent with what we've done elsewhere. I think that breaking the options out as a separate block could be OK (because otherwise they have to be duplicated between COPY TO and COPY FROM) but it should be done more like the way that the SELECT page is done. Also, you haven't documented the syntax 100% correctly: the boolean options work just like the boolean explain options - they take an optional argument which if omitted defaults to true, but you can also specify 0, 1, true, false, on, off. See defGetBoolean. So those should be specified as: BINARY [boolean] OIDS [boolean] CSV [boolean] CSV_HEADER [boolean] See how we did it in sql-explain.html. I changed the name of the CSV options to prefix them with csv_ to avoid confusion with any future options. I also had to change the grammar to allow '*' as a parameter (needed for cvs_force_quote). You seem to have introduced a LARGE number of unnecessary whitespace changes here which are not going to fly. You need to go through and revert all of those. It's hard to tell what you've really changed here, but also every whitespace change that gets committed is a potential merge conflict for someone else; plus pgindent will eventually change it back, thus creating another potential merge conflict for someone else. I am not 100% sold on renaming all of the CSV-specific options to add csv_. I would like to get an opinion from someone else on whether that is a good idea or not. I am fairly certain it is NOT a good idea to support BOTH the old and new option names, as you've done here. If you're going to rename them, you should update gram.y and change the makeDefElem() calls within the copy_opt_list productions to emit the new names. When we decide to drop the old syntax (in 8.6?), we will be able to clean a lot especially in psql. Considering that we are still carrying syntax that was deprecated in 7.3, I don't think it's likely that we'll phase out the present syntax anywhere nearly that quickly. But it's reasonable to ask whether we should think about removing support for the pre-7.3 syntax altogether for 8.5. It doesn't seem to cost us much to keep that support around, but then again it's been deprecated for seven major releases, so it might be about time. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Well, Andrew McNamara just posted today: http://archives.postgresql.org/message-id/20090916063341.0735c5ac...@longblack.object-craft.com.au Had VACUUM FULL not been available, though, I'm pretty sure he would've come up with something else instead. Indeed I would have. And it was our own slackness that got us into the situation. Several people suggested using a portable drive - in this case, it would not have been practical as the machines are physically managed by another group at a remote location (the paperwork would be the real blocker). Getting more drives added to the SAN would have been even more painful. I was just going to post that we should make a decision about this, because ISTM there's some code in Simon's hot standby patch that is only required to support VACUUM FULL. If we make the decision that we drop VACUUM FULL in 8.5, we can take that part out of the patch now. It's not a huge amount of code, but still. I'm in favor of removing VACUUM FULL in 8.5. To replace it, we should offer: 1) VACUUM REWRITE, which is like CLUSTER but doesn't use an index, and My preference would be to keep the VACUUM FULL command, but to reimplement it as a table rewriter (like CLUSTER?). I see little risk to changing the behaviour without changing the name - only experts are currently aware exactly what it actually does, and they are more likely to keep an eye out for changes like this. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 20:36 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: For the kicks, I looked at what it would take to write a utility like that. It turns out to be quite trivial, patch attached. I don't think you've really thought this through; particularly not this: + rel = heap_open(relid, AccessShareLock); You can NOT modify a relation with only AccessShareLock, and frankly I doubt you should be doing this with less than exclusive lock. Which would make the thing quite unpleasant to use in practice. C'mon, we know he knows that. But I guess we should define the locking requirement for such a utility explicitly: ShareUpdateExclusiveLock, please. What we need is VACUUM FULL CONCURRENTLY and REINDEX CONCURRENTLY. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Simon Riggs si...@2ndquadrant.com writes: What we need is VACUUM FULL CONCURRENTLY and REINDEX CONCURRENTLY. VACUUM FULL CONCURRENTLY is a contradiction in terms. Wishing it were possible doesn't make it so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Tue, Sep 15, 2009 at 4:51 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Same problem. Build log attached. ...Robert My renonc, please, try new patch. I forgot mark regproc.c file. regards Pavel Stehule This one compiles for me. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 21:00 -0400, Tom Lane wrote: But if we get rid of old-style VACUUM FULL then we do need something to cover those few-and-far-between situations where you really do desperately need to compact a table in place; and a utility like this seems like a reasonable solution. I'm thinking in particular that it should be possible to have it move just a bounded number of tuples at a time, so that you could do a VACUUM to clean out the indexes in between move passes. Otherwise you run the risk of running out of disk space anyway, due to index bloat. Agreed to all of the above, though I see some challenges. The way I read the thread so far is that there are multiple requirements: * Shrink a table efficiently - when time and space available to do so * Shrink a table in place - when no space available * Shrink a table concurrently - when no dedicated time available We probably can't do all of them at once, but we do need all of them, at various times. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Simon Riggs si...@2ndquadrant.com writes: I think it would be useful to have an additional option to force VACUUM to wait for the lock so it can truncate. It's annoying to have to re-run VACUUM just to give it a chance at the lock again. It would be better to separate out the truncate-what-you-can behavior as an entirely distinct operation. If we go with Heikki's plan of a new special operation that moves tuples down without trying to preserve XMINs, then we could have that thing truncate any empty end pages as its first (not last) step. But it might be more useful/flexible if they were just two separate ops. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Simon Riggs si...@2ndquadrant.com writes: The way I read the thread so far is that there are multiple requirements: * Shrink a table efficiently - when time and space available to do so To be addressed by the CLUSTER-based solution (VACUUM REWRITE or whatever we call it). * Shrink a table in place - when no space available To be addressed by the UPDATE-style tuple-mover (which could be thought of as VACUUM FULL rewritten to not use any special mechanisms). * Shrink a table concurrently - when no dedicated time available Wishful thinking, which should not stop us from proceeding with the solutions we know how to implement. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, Sep 16, 2009 at 9:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: The way I read the thread so far is that there are multiple requirements: * Shrink a table efficiently - when time and space available to do so To be addressed by the CLUSTER-based solution (VACUUM REWRITE or whatever we call it). * Shrink a table in place - when no space available To be addressed by the UPDATE-style tuple-mover (which could be thought of as VACUUM FULL rewritten to not use any special mechanisms). * Shrink a table concurrently - when no dedicated time available Wishful thinking, which should not stop us from proceeding with the solutions we know how to implement. The UPDATE-style tuple-mover might work for this too, for certain workloads. If most of your transactions are short, and the server load is not too high, it might be OK to lock the table, move a few tuples, lock the table, move a few tuples, etc. Now if you have long-running transactions, not so much. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Robert Haas robertmh...@gmail.com writes: On Wed, Sep 16, 2009 at 9:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: * Shrink a table concurrently - when no dedicated time available Wishful thinking, which should not stop us from proceeding with the solutions we know how to implement. The UPDATE-style tuple-mover might work for this too, for certain workloads. If most of your transactions are short, and the server load is not too high, it might be OK to lock the table, move a few tuples, lock the table, move a few tuples, etc. Now if you have long-running transactions, not so much. Yeah, I was just wondering about that myself. Seems like there would be lots of situations where short exclusive-lock intervals could be tolerated, even though not long ones. So that's another argument for being able to set an upper bound on how many tuples get moved per call. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
On Wed, 2009-09-16 at 21:48 -0400, Tom Lane wrote: Yeah, I was just wondering about that myself. Seems like there would be lots of situations where short exclusive-lock intervals could be tolerated, even though not long ones. But a short-lived exclusive lock can turn into a long-lived exclusive lock if there are long-lived transactions ahead of it in the queue. We probably don't want to automate anything by default that acquires exclusive locks, even for a short time. However, I agree that it's fine in many situations if the administrator is choosing it. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Jeff Davis pg...@j-davis.com writes: On Wed, 2009-09-16 at 21:48 -0400, Tom Lane wrote: Yeah, I was just wondering about that myself. Seems like there would be lots of situations where short exclusive-lock intervals could be tolerated, even though not long ones. But a short-lived exclusive lock can turn into a long-lived exclusive lock if there are long-lived transactions ahead of it in the queue. We probably don't want to automate anything by default that acquires exclusive locks, even for a short time. However, I agree that it's fine in many situations if the administrator is choosing it. Right, which is why autovacuum can't have anything to do with this. But as an emergency recovery tool it seems reasonable enough. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] generic copy options
Robert Haas wrote: I don't think the way the doc changes are formatted is consistent with what we've done elsewhere. I think that breaking the options out as a separate block could be OK (because otherwise they have to be duplicated between COPY TO and COPY FROM) but it should be done more like the way that the SELECT page is done. I looked at the way it is done in SELECT and there is a section per clause (from clause, where clause, ...). So I am not sure how you want to apply that here besides the copy parameters and the option clause. Also, you haven't documented the syntax 100% correctly: the boolean options work just like the boolean explain options - they take an optional argument which if omitted defaults to true, but you can also specify 0, 1, true, false, on, off. See defGetBoolean. So those should be specified as: BINARY [boolean] OIDS [boolean] CSV [boolean] CSV_HEADER [boolean] See how we did it in sql-explain.html. Ok, fixed. I changed the name of the CSV options to prefix them with csv_ to avoid confusion with any future options. I also had to change the grammar to allow '*' as a parameter (needed for cvs_force_quote). You seem to have introduced a LARGE number of unnecessary whitespace changes here which are not going to fly. You need to go through and revert all of those. It's hard to tell what you've really changed here, but also every whitespace change that gets committed is a potential merge conflict for someone else; plus pgindent will eventually change it back, thus creating another potential merge conflict for someone else. Sorry, I overlooked a format in Eclipse that formatted the whole file instead of the block I was working on. This should be fixed now. I am not 100% sold on renaming all of the CSV-specific options to add csv_. I would like to get an opinion from someone else on whether that is a good idea or not. I am fairly certain it is NOT a good idea to support BOTH the old and new option names, as you've done here. If you're going to rename them, you should update gram.y and change the makeDefElem() calls within the copy_opt_list productions to emit the new names. Agreed for the makeDefElem(). For changing the names, I think that names like 'header', 'escape' and 'quote' are too generic to not conflict with something that is not csv. If you think of another format that could be added to copy, it is likely to re-use the same variable names. The only thing that seems odd is that if you use a CSV_* option, you still have to add CSV [on] to the option list which seems kind of redundant. When we decide to drop the old syntax (in 8.6?), we will be able to clean a lot especially in psql. Considering that we are still carrying syntax that was deprecated in 7.3, I don't think it's likely that we'll phase out the present syntax anywhere nearly that quickly. But it's reasonable to ask whether we should think about removing support for the pre-7.3 syntax altogether for 8.5. It doesn't seem to cost us much to keep that support around, but then again it's been deprecated for seven major releases, so it might be about time. While I understand the need for the server to still support the syntax, is it necessary for newer version of psql to support the old syntax? I am attaching the new version of the patch with the current modifications addressing your comments. Emmanuel -- Emmanuel Cecchet Aster Data Systems Web: http://www.asterdata.com ### Eclipse Workspace Patch 1.0 #P Postgres8.5-COPY Index: src/test/regress/sql/copy2.sql === RCS file: /home/manu/cvsrepo/pgsql/src/test/regress/sql/copy2.sql,v retrieving revision 1.18 diff -u -r1.18 copy2.sql --- src/test/regress/sql/copy2.sql 25 Jul 2009 00:07:14 - 1.18 +++ src/test/regress/sql/copy2.sql 17 Sep 2009 03:14:48 - @@ -73,17 +73,17 @@ \. -- various COPY options: delimiters, oids, NULL string -COPY x (b, c, d, e) from stdin with oids delimiter ',' null 'x'; +COPY x (b, c, d, e) from stdin (oids, delimiter ',', null 'x'); 50,x,45,80,90 51,x,\x,\\x,\\\x 52,x,\,,\\\,,\\ \. -COPY x from stdin WITH DELIMITER AS ';' NULL AS ''; +COPY x from stdin (DELIMITER ';', NULL ''); 3000;;c;; \. -COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X'; +COPY x from stdin (DELIMITER ':', NULL E'\\X'); 4000:\X:C:\X:\X 4001:1:empty:: 4002:2:null:\X:\X @@ -108,13 +108,13 @@ INSERT INTO no_oids (a, b) VALUES (20, 30); -- should fail -COPY no_oids FROM stdin WITH OIDS; -COPY no_oids TO stdout WITH OIDS; +COPY no_oids FROM stdin (OIDS); +COPY no_oids TO stdout (OIDS); -- check copy out COPY x TO stdout; COPY x (c, e) TO stdout; -COPY x (b, e) TO stdout WITH NULL 'I''m null'; +COPY x (b, e) TO stdout (NULL 'I''m null'); CREATE TEMP TABLE y ( col1 text, @@ -130,11 +130,23 @@ COPY y TO stdout WITH CSV FORCE QUOTE col2 ESCAPE E'\\'; COPY y TO stdout WITH CSV FORCE QUOTE
Re: [HACKERS] generic copy options
Emmanuel Cecchet m...@asterdata.com writes: Robert Haas wrote: When we decide to drop the old syntax (in 8.6?), we will be able to clean a lot especially in psql. Considering that we are still carrying syntax that was deprecated in 7.3, I don't think it's likely that we'll phase out the present syntax anywhere nearly that quickly. While I understand the need for the server to still support the syntax, is it necessary for newer version of psql to support the old syntax? psql has MORE need to support old syntax than the backend does, because it's supposed to work against old servers. I wonder though if we couldn't simplify matters. Offhand it seems to me that psql doesn't need to validate the command's syntax fully. All it really needs to do is find the target filename and replace it with STDIN/STDOUT. Could we have it just treat the remainder of the line literally, and not worry about the details of what the options might be? Let the backend worry about throwing an error if they're bad. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Sep 16, 2009 at 9:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: * Shrink a table concurrently - when no dedicated time available Wishful thinking, which should not stop us from proceeding with the solutions we know how to implement. The UPDATE-style tuple-mover might work for this too, for certain workloads. If most of your transactions are short, and the server load is not too high, it might be OK to lock the table, move a few tuples, lock the table, move a few tuples, etc. Now if you have long-running transactions, not so much. Yeah, I was just wondering about that myself. Seems like there would be lots of situations where short exclusive-lock intervals could be tolerated, even though not long ones. That was my thinking. The tuple moving can block if another backend is doing updates concurrently, and the moving can block other backends from updating (and cause serialization errors). But that seems like a perfectly acceptable limitation that we can simply document. Surely it's better than taking an ExclusiveLock. So that's another argument for being able to set an upper bound on how many tuples get moved per call. Yeah, that would alleviate it. We could write a client utility to call it repeatedly, and perhaps VACUUMs in between, to make it easier to use. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Linux LSB init script
On ons, 2009-09-16 at 12:05 -0500, Kevin Grittner wrote: what LSB actually entails, which is: - INFO header - standardized exit codes - functions from /lib/lsb/init-functions As well as a standardized set of actions and standard semantics for them, if we're only talking about the init script parts. Yes, I didn't fully realize that the existing script didn't implement all possible actions. But that can obviously be backported as well. My argument is that it might be better to just merge the INFO header and the exit codes into the existing script and forget about using the functions. That's where I started, but felt that it was of no benefit unless the OS used them, in which case I thought conforming behavior would be wanted. I think it's important to consider what the conforming behavior really achieves in practice. The INFO section is essential nowadays for correct functioning on a large portion of distributions. The exit codes are relatively uninteresting but occasionally useful. The init-functions don't make a difference at all, as far as I'm aware. The old script already has a chkconfig header, which was the moral predecessor to the LSB INFO section. So someone evidently thought this sort of thing was useful, and so we might as well update it. I'm not following what you say here. Is there something wrong with the current chkconfig header in the non-LSB script? Well, you mentioned earlier that you were hesitant about backporting the INFO header without adding all the other conforming stuff. From a practical point of view, I think an INFO header is nowadays essential independent of what the rest of the script does. Differentiating the exit codes is the bulk of your code, but what about merging this part into pg_ctl itself? That would result in different exit codes for a lot of circumstances. I'm OK with that, if everyone agrees that we want, for example an exit code of 0 for an attempt to start a server which is already running or an attempt to stop a server which isn't running. (These are only two examples of differences between what is required of an LSB conforming init script and what pg_ctl does. Are we all in universal agreement to make such a change to pg_ctl? Well, in such cases it may be useful to add an option such as --oknodo to select the idempotent behavior. The init-functions then don't really buy you anything, except that you replace pg_ctl with start_daemon and killproc, and good old echo with log_failure_msg. And then you proceed in pg_initd_stop to basically reimplement half of the pg_ctl logic in shell code. Well, with differences in behavior, of course, like attempting a fast shutdown, and escalating to an immediate shutdown if that doesn't succeed in the allowed time. Right, but if you think that this behavior the right/better one (which it arguably isn't), why shouldn't it be available from the mainline tools? Obviously, most people have felt for the last N years that pg_ctl provides adequate shutdown options. If not, then let's look there. We shouldn't hardcode an alternative policy into a marginal init script, which it will receive little testing and scrutiny. And using the init-functions is the only thing here that is not backward compatible, so removing it would address the concern about support pre/non-LSB distributions. Use of these functions is a requirement for LSB conforming scripts. If you don't use them, particularly for success and failure messages, you're not conforming. For one thing, using these functions causes the success and failure messages to be formatted like those for other services on the machine, and the format varies a lot between distributions. Also, consider this statement in the spec: | Error and status messages should be printed with the logging | functions (see Init Script Functions) log_success_msg(), | log_failure_msg() and log_warning_msg(). Scripts may write to | standard error or standard output, but implementations need not | present text written to standard error/output to the user or do | anything else with it. Not using the functions would cause the script to only work for some undefined subset of conforming implementations. Yeah, except that part of the spec is hilariously unrealistic. And there is no evidence to suggest that use of these functions will cause messages to be formatted like other services on the machine. Debian/Ubuntu for example have given up on those functions because they don't anything useful and provide their own set of functions that you ought to use to format messages like on those systems. First, the is, in my mind, no need to repeat half of the LSB spec in the comments. Well, that's a bit of hyperbole -- I did the math and it's less than 10% of what I consider to be the part of the spec fairly directly related to init scripts. Which parts of these comments do you feel should be