Re: [HACKERS] why can't plpgsql return a row-expression?
2012/12/4 Asif Rehman asifr.reh...@gmail.com:

Hi, Here is the updated patch. I overlooked the loop, checking to free the conversions map. Here are the results now.

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a numeric, b numeric);
   sum
----------
 303000.0
(1 row)

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a float, b numeric);
     sum
--------------
 303000.00012

Regards,
--Asif

Yes, it is fixed. OK, I have no objection.

Regards
Pavel Stehule

On Wed, Dec 5, 2012 at 12:40 AM, Pavel Stehule pavel.steh...@gmail.com wrote:

Hello

I fully agree with Asif's arguments. The previous tupconvert implementation was really strict, so using the enhanced tupconvert has minimal impact on existing users - and I measured the same speed for a plpgsql function on patched and unpatched pg.

Tested with:

CREATE OR REPLACE FUNCTION public.foo(i integer)
 RETURNS SETOF record
 LANGUAGE plpgsql
AS $function$
declare r record;
begin
  r := (10.1,20.1);
  for i in 1..10 loop
    return next r;
  end loop;
  return;
end;
$function$

select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a numeric, b numeric);

More - everywhere we use tupconvert internally, we use a cached conversion map - so I am sure the speed of ANALYZE cannot be impacted by this patch.

There are two other issues. First, it allows writing new, different slow code - an I/O cast is about 5-10-20% slower, and with this patch anybody has new possibilities for new bad code. But that is not a problem of this patch. It is related to plpgsql's design, and probably we should move some conversions outside plpgsql to make it possible to reuse the conversion map and enhance plpgsql. I have a simple half-solution: plpgsql_check_function can detect this situation in the most typical use cases and can raise a performance warning. So this issue is not a stopper for me - because it is not a new issue in plpgsql.
The second issue is more significant: there is a bug:

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a numeric, b numeric);
   sum
----------
 303000.0
(1 row)

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a float, b numeric);
          sum
-----------------------
 7.33675699577682e-232
(1 row)

It produces a wrong result. And with a minimal change it kills the session:

create or replace function foo(i integer)
returns setof record as $$
declare r record;
begin
  r := (10,20);
  for i in 1..10 loop
    return next r;
  end loop;
  return;
end;
$$ language plpgsql;

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral foo(i) as (a numeric, b numeric);
The connection to the server was lost. Attempting reset: Failed.

create or replace function fo(i integer)
returns record as $$
declare r record;
begin
  r := (10,20);
  return r;
end;
$$ language plpgsql;

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral fo(i) as (a int, b numeric);
 sum
---
 3
(1 row)

postgres=# select sum(a) from generate_series(1,3000) g(i), lateral fo(i) as (a numeric, b numeric);
The connection to the server was lost. Attempting reset: Failed.

Regards
Pavel Stehule

2012/12/3 Asif Rehman asifr.reh...@gmail.com:

Hi, Thanks for the review. Please see the updated patch.

Hmm. We're running an I/O cast during do_tup_convert() now, and looking up the required functions for each tuple. I think if we're going to support this operation at that level, we need to look up the necessary functions at convert_tuples_by_foo level, and then apply them unconditionally if they've been set up.

Done. The TupleConversionMap struct now keeps the array of function OIDs that need to be applied, though only for those cases where the attribute type IDs do not match. This way no unnecessary casting will happen.

Also, what are the implications for existing users of tupconvert? Do we want to apply casting during ANALYZE for example?
AFAICS the patch shouldn't break any case that works today, but I don't see that there has been any analysis of this.

I believe this part of the code should not impact existing users of tupconvert, as it is relaxing a restriction upon which an error would have been generated. Though it could have a little impact on performance, that should only be for cases where the attribute type IDs are different and are implicitly castable. Besides, existing users involve tupconvert in the case of inheritance, and in that case an exact match is expected.

Regards,
--Asif

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
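The conversion-map design discussed in this thread - resolve any needed cast once when the map is built, then apply it per tuple only for columns whose types differ - can be sketched outside PostgreSQL roughly like this (illustrative Python, not the patch itself; the small cast table stands in for the catalog lookup of cast function OIDs):

```python
def build_conversion_map(in_types, out_types, cast_lookup):
    """Return per-attribute converters; None means the column needs no work."""
    casts = []
    for src, dst in zip(in_types, out_types):
        if src == dst:
            casts.append(None)                     # exact match: no cast at all
        else:
            casts.append(cast_lookup[(src, dst)])  # resolved once, not per tuple
    return casts

def convert_tuple(tup, casts):
    # the per-tuple path only applies the pre-resolved converters
    return tuple(v if c is None else c(v) for v, c in zip(tup, casts))

# hypothetical cast table standing in for the catalog lookup
CASTS = {("numeric", "float"): float}

cmap = build_conversion_map(["numeric", "numeric"], ["float", "numeric"], CASTS)
print(convert_tuple(("10.1", "20.1"), cmap))  # -> (10.1, '20.1')
```

The point of building the map up front is exactly the one made in the review: the expensive lookup happens once per result shape, and the per-tuple loop does no catalog work.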
Re: [HACKERS] WIP: store additional info in GIN index
On Wed, Dec 5, 2012 at 1:56 AM, Tomas Vondra t...@fuzzy.cz wrote:
On 4.12.2012 20:12, Alexander Korotkov wrote:
Hi!
On Sun, Dec 2, 2012 at 5:02 AM, Tomas Vondra t...@fuzzy.cz wrote:

I've tried to apply the patch with the current HEAD, but I'm getting segfaults whenever VACUUM runs (either called directly or from autovac workers). The patch applied cleanly against 9b3ac49e and needed a minor fix when applied on HEAD (because of an assert added to ginRedoCreatePTree), but that shouldn't be a problem.

Thanks for testing! The patch is rebased against HEAD. The bug you reported was fixed.

Applies fine, but I get a segfault in dataPlaceToPage at gindatapage.c. The whole backtrace is here: http://pastebin.com/YEPuWeuV

The messages written into the PostgreSQL log are quite variable - usually it looks like this:

2012-12-04 22:31:08 CET 31839 LOG: database system was not properly shut down; automatic recovery in progress
2012-12-04 22:31:08 CET 31839 LOG: redo starts at 0/68A76E48
2012-12-04 22:31:08 CET 31839 LOG: unexpected pageaddr 0/1BE64000 in log segment 00010069, offset 15089664
2012-12-04 22:31:08 CET 31839 LOG: redo done at 0/69E63638

but I've seen this message too:

2012-12-04 22:20:29 CET 31709 LOG: database system was not properly shut down; automatic recovery in progress
2012-12-04 22:20:29 CET 31709 LOG: redo starts at 0/AEAFAF8
2012-12-04 22:20:29 CET 31709 LOG: record with zero length at 0/C7D5698
2012-12-04 22:20:29 CET 31709 LOG: redo done at 0/C7D55E

I wasn't able to prepare a simple testcase to reproduce this, so I've attached two files from my fun project where I noticed it. It's a simple DB + a bit of Python for indexing mbox archives inside Pg.
- create.sql - a database structure with a bunch of GIN indexes on tsvector columns on the messages table
- load.py - script for parsing mbox archives / loading them into the messages table (warning: it's a bit messy)

Usage:

1) create the DB structure

   $ createdb archives
   $ psql archives < create.sql

2) fetch some archives (I consistently get SIGSEGV after the first three)

   $ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-01.gz
   $ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-02.gz
   $ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-03.gz

3) gunzip and load them using the python script

   $ gunzip pgsql-hackers.*.gz
   $ ./load.py --db archives pgsql-hackers.*

4) et voila - a SIGSEGV :-(

I suspect this might be related to the fact that the load.py script uses savepoints quite heavily to handle UNIQUE_VIOLATION (duplicate messages).

Thanks for the bug report. It is fixed in the attached patch.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] DEALLOCATE IF EXISTS
On 30.11.2012 12:05, Vik Reykja wrote:
On Tue, Nov 27, 2012 at 3:15 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

I fail to see the point of DEALLOCATE IF EXISTS. Do you have a real use case for this, or was this just a case of adding IF EXISTS to all commands for the sake of completeness? Usually the client knows what statements have been prepared, but perhaps you want to make sure everything is deallocated in some error handling case or similar. But in that case, you might as well just issue a regular DEALLOCATE and ignore errors. Or even more likely, you'll want to use DEALLOCATE ALL.

Hmm. The test case I had for it, which was very annoying in an "I want to be lazy" sort of way, I am unable to reproduce now. So I guess this becomes "make it like the others" and the community can decide whether that's desirable. In my personal case, which again I can't reproduce because it's been a while since I've done it, DEALLOCATE ALL would have worked. I was basically preparing a query to work on it under the same conditions it would be executed in a function, and since I was only working on one of these at a time, ALL would have been fine.

Ok. Being the lazy person that I am, I'm going to just mark this as rejected then. There is no consensus that we should decorate every DDL command with IF EXISTS, and even if we did, it's not clear that it should include DEALLOCATE. But thanks for the effort anyway!

- Heikki
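For reference, the workaround suggested above uses only commands that already exist: a session that wants a clean slate can drop all prepared statements at once rather than guarding a single name with IF EXISTS.

```sql
PREPARE stmt AS SELECT $1::int + 1;
EXECUTE stmt(41);

-- instead of the proposed "DEALLOCATE IF EXISTS stmt":
DEALLOCATE ALL;  -- drops every prepared statement in the session; never errors
```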
Re: [HACKERS] the number of pending entries in GIN index with FASTUPDATE=on
On 28.11.2012 04:11, Kyotaro HORIGUCHI wrote:

3. pgstatginindex shows only version, pending_pages, and pending_tuples. Why don't you show the other properties such as entry pages, data pages, entries, and total pages as pgstatindex does?

I didn't expose those because they are only accurate as of the last VACUUM. But if you think they are useful, I don't object to exposing them.

Ok, my point was the apparent consistency of the functions. I don't have any distinct wish about this.

We can always add more fields later if there's a need. I'll mark this as Ready for Committer.

Thanks, committed.

- Heikki
Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage
On 30.11.2012 12:18, Vik Reykja wrote:

I am trying to review this patch and on my work computer everything compiles and tests perfectly. However, on my laptop, the regression tests don't pass, with "cache lookup failed for type XYZ" where XYZ is some number that does not appear to be any type oid. I don't really know where to go from here. I am asking that other people try this patch to see if they get errors as well.

I get the same error. I'll mark this as "waiting on author" in the commitfest.

- Heikki
Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage
2012/12/5 Heikki Linnakangas hlinnakan...@vmware.com:
On 30.11.2012 12:18, Vik Reykja wrote:

I am trying to review this patch and on my work computer everything compiles and tests perfectly. However, on my laptop, the regression tests don't pass, with "cache lookup failed for type XYZ" where XYZ is some number that does not appear to be any type oid. I don't really know where to go from here. I am asking that other people try this patch to see if they get errors as well.

I get the same error. I'll mark this as "waiting on author" in the commitfest.

Was it with the new patch?
http://archives.postgresql.org/message-id/cafj8prdyxfxzcl2freslu-dpqaokou-ekky12tbzdo559pw...@mail.gmail.com

Regards
Pavel
Re: [HACKERS] Dumping an Extension's Script
On 20.11.2012 21:25, Simon Riggs wrote:
On 19 November 2012 16:25, Robert Haas robertmh...@gmail.com wrote:

Beyond that, I think much of the appeal of the extension feature is that it dumps as "CREATE EXTENSION hstore;" and nothing more. That allows you to migrate a dump between systems with different but compatible versions of hstore and have things work as intended. I'm not opposed to the idea of being able to make extensions without files on disk work ... but I consider it a niche use case; the behavior we have right now works well for me and hopefully for others most of the time.

Distributing software should only happen by files? So why does Stackbuilder exist on the Windows binary? Why does yum exist? What's wrong with ftp, huh? Why does CPAN exist? I've a feeling this case might be a sensible way forwards, not a niche at all.

I have to join Robert in scratching my head over this. I don't understand what the use case is. Can you explain? I don't understand the comparison with stackbuilder, yum, ftp and CPAN. CPAN seems close to pgxn, but what does that have to do with this patch?

On 20.11.2012 11:08, Dimitri Fontaine wrote:

Apparently I'm not the only one doing extensions without anything to compile, all SQL: http://keithf4.com/extension_tips_3

No doubt about that. I'm sure extensions written in pure SQL or PL/pgSQL are very common. But what does that have to do with this patch?

- Heikki
Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage
On 05.12.2012 10:38, Pavel Stehule wrote:
2012/12/5 Heikki Linnakangas hlinnakan...@vmware.com:

I get the same error. I'll mark this as "waiting on author" in the commitfest.

Was it with the new patch?
http://archives.postgresql.org/message-id/cafj8prdyxfxzcl2freslu-dpqaokou-ekky12tbzdo559pw...@mail.gmail.com

Ah, no, sorry. The new patch passes regressions just fine. Never mind.

- Heikki
Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage
2012/12/5 Heikki Linnakangas hlinnakan...@vmware.com:
On 05.12.2012 10:38, Pavel Stehule wrote:
2012/12/5 Heikki Linnakangas hlinnakan...@vmware.com:

I get the same error. I'll mark this as "waiting on author" in the commitfest.

Was it with the new patch?
http://archives.postgresql.org/message-id/cafj8prdyxfxzcl2freslu-dpqaokou-ekky12tbzdo559pw...@mail.gmail.com

Ah, no, sorry. The new patch passes regressions just fine. Never mind.

:)
Pavel
Re: [HACKERS] Removing PD_ALL_VISIBLE
On Fri, 2012-11-30 at 13:16 -0800, Jeff Davis wrote:

I tried for quite a while to show any kind of performance difference between checking the VM and checking PD_ALL_VISIBLE on a 12-core box (24 if you count HT).

Three patches in question:

1. Current unpatched master
2. Patch that naively always checks the VM page, pinning and unpinning each time
3. Same as #2, but tries to keep buffers pinned (I had to fix a bug in this patch though -- new version forthcoming)

New patch attached.

Nathan Boley kindly lent me access to a 64-core box, and that shows a much more interesting result. The previous test (on the 12-core) basically showed no difference between any of the patches. Now, I see why on the 64-core box: the interesting region seems to be around 32 concurrent connections. The left column is the concurrency, and the right is the runtime. This test was for concurrent scans of a 350MB table (each process did 4 scans and quit). Test program attached.

Patch 1 (scan test):
001 004.299533
002 004.434378
004 004.708533
008 004.518470
012 004.487033
016 004.513915
024 004.765459
032 006.425780
048 007.089146
064 007.908850
072 009.461419
096 013.098646
108 015.278592
128 019.797206

Patch 2 (scan test):
001 004.385206
002 004.596340
004 004.616684
008 004.832248
012 004.858336
016 004.689959
024 005.016797
032 006.857642
048 012.049407
064 025.774772
072 032.680710
096 059.147500
108 083.654806
128 120.350200

Patch 3 (scan test):
001 004.464991
002 004.95
004 004.562364
008 004.649633
012 004.628159
016 004.518748
024 004.768348
032 004.834177
048 007.003305
064 008.242714
072 009.732261
096 013.231056
108 014.996977
128 020.488570

As you can see, patch #2 starts to show a difference at around 32 and completely falls over by 48 connections. This is expected because it's the naive approach that pins the VM page every time it needs it.
Patch #1 and #3 are effectively the same, subsequent runs (and with more measurements around concurrency 32) show that the differences are just noise (which seems to be greater around the inflection point of 32). All of the numbers that seem to show any difference can end up with patch #1 better or patch #3 better, depending on the run. I tried the delete test, too, but I still couldn't see any difference. (I neglected to mention in my last email: I aborted after each delete so that it would be repeatable). The inflection point there is significantly lower, so I assume it must be contending over something else. I tried making the table unlogged to see if that would change things, but it didn't change much. This test only scales linearly to about 8 or so. Or, there could be something wrong with my test. So, I conclude that contention is certainly a problem for scans for patch #2, but patch #3 seems to fix that completely by holding the buffer pins. The deletes are somewhat inconclusive, but I just can't see a difference. Holding more pins does have a distributed cost in theory, as Tom points out, but I don't know where to begin testing that. We'll have to make a decision between (a) maintaining the extra complexity and doing the extra page writes involved with PD_ALL_VISIBLE; or (b) holding onto one extra pin per table being scanned. Right now, if PD_ALL_VISIBLE did not exist, it would be pretty hard to justify putting it in as far as I can tell. 
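To make the contention visible at a glance, the figures above can be normalized against unpatched master at the same concurrency (plain arithmetic on the quoted runtimes, nothing more):

```python
# Runtimes in seconds from the scan-test tables above, keyed by concurrency.
patch1 = {32: 6.425780, 48: 7.089146, 64: 7.908850, 128: 19.797206}   # master
patch2 = {32: 6.857642, 48: 12.049407, 64: 25.774772, 128: 120.350200}  # naive VM check
patch3 = {32: 4.834177, 48: 7.003305, 64: 8.242714, 128: 20.488570}   # pins kept

for n in sorted(patch1):
    print(f"{n:3d} procs: patch2 {patch2[n] / patch1[n]:.2f}x, "
          f"patch3 {patch3[n] / patch1[n]:.2f}x of master")
# patch2 degrades to roughly 6x master at 128 procs, while patch3 stays
# within noise of master throughout -- matching the conclusion above.
```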
Regards,
	Jeff Davis

#include <libpq-fe.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

void
test(char *query, int niter)
{
	PGconn	   *conn;
	PGresult   *result;
	int			i;

	conn = PQconnectdb("host=/tmp dbname=postgres");
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed!\n");
		exit(1);
	}

	result = PQprepare(conn, "", query, 0, NULL);
	if (PQresultStatus(result) != PGRES_COMMAND_OK)
	{
		fprintf(stderr, "PREPARE failed: %s", PQerrorMessage(conn));
		PQclear(result);
		exit(1);
	}
	PQclear(result);

	for (i = 0; i < niter; i++)
	{
		result = PQexecPrepared(conn, "", 0, NULL, NULL, NULL, 0);
		if (PQresultStatus(result) != PGRES_TUPLES_OK)
		{
			fprintf(stderr, "EXECUTE PREPARED failed: %s\n", PQerrorMessage(conn));
			PQclear(result);
			exit(1);
		}
		PQclear(result);
	}

	PQfinish(conn);
}

int
main(int argc, char *argv[])
{
	int			niter;
	int			nprocs;
	char	   *query = "SELECT COUNT(*) FROM foo";
	int			i;
	pid_t	   *procs;
	struct timeval tv1, tv2;

	if (argc != 3)
	{
		fprintf(stderr, "expected 3 arguments, got %d\n", argc);
		exit(1);
	}

	nprocs = atoi(argv[1]);
	niter = atoi(argv[2]);
	procs = malloc(sizeof(pid_t) * nprocs);

	gettimeofday(&tv1, NULL);

	for (i = 0; i < nprocs; i++)
	{
		pid_t		pid = fork();

		if (pid == 0)
		{
			test(query, niter);
			exit(0);
		}
		else
			procs[i] = pid;
	}

	for (i = 0; i < nprocs; i++)
	{
		int			status;

		waitpid(procs[i], &status, 0);
		if (!WIFEXITED(status))
		{
			fprintf(stderr, "child did not exit!\n");
			exit(1);
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 5 December 2012 00:16, Josh Berkus j...@agliodbs.com wrote:

Sure, and the DevOps staff would be using the EXPLAIN feature (if we had it). After which they could do little anyway except complain to the ORM authors, who might or might not give a damn. I don't see that there's enough value-added from what you suggest to justify the development time.

You're still thinking of a schema change as a SQL script. ORM-based applications usually do not run their schema changes as SQL scripts, thus there's nothing to EXPLAIN. Anything which assumes the presence of a distinct, user-accessible SQL script is going to leave out a large class of our users.

And anything which assumes the *absence* of a manual script is also leaving out a large class of users. ORMs are very important, but not the only thing we serve.

Please assume that "script" means a set of SQL statements that are executed in a specific sequence to change a database model from one version to another. Anything which requires editing of all (or worse, just some) of the SQL statements is not a good solution. For ORMs, this requires each ORM to make its own change to support that functionality and to have a separate mode where it is used. For manual scripts, this requires specific editing, which fails, as already described. Either way EXPLAIN is bad, since editing/separate modes can introduce bugs.

I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels. That's best used as a function to turn it on and then a function to produce the report.

However, as I said, if we had the EXPLAIN ALTER, we could use auto-explain to log the ALTER plans (finally, a good use for auto-explain). So that's a workable workaround. And EXPLAIN ALTER would offer us more flexibility than any logging option, of course.
Auto-explain executes things twice, which is not possible for DDL, so it won't work.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 12/5/2012 1:42 AM, Simon Riggs wrote:

I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels.

Where does this get displayed? Is it just tossed into the postgres log files?
Re: [HACKERS] [BUGS] PITR potentially broken in 9.2
On 5 December 2012 00:35, Tom Lane t...@sss.pgh.pa.us wrote:

I wrote: So apparently this is something we broke since Nov 18. Don't know what yet --- any thoughts?

Further experimentation shows that reverting commit ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's something wrong/incomplete about that fix. This is a bit urgent since we now have to consider whether to withdraw 9.2.2 and issue a hasty 9.2.3. Do we have a regression here since 9.2.1, and if so how bad is it?

I'll look at this now.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Dumping an Extension's Script
Heikki Linnakangas hlinnakan...@vmware.com writes:

No doubt about that. I'm sure extensions written in pure SQL or PL/pgSQL are very common. But what does that have to do with this patch?

This patch is all about enabling users to create extensions without having to ship them as root on the file system of the database server(s) first.

When you're coding your extension in C, you know you're in for shipping a binary (.so, .dylib or .dll), and for security reasons it's well understood that you will need root privileges on the server's file system to ship your binary before you can ask PostgreSQL to load it and execute the code in there.

When you can code your extension using dynamic code such as SQL, PL/pgSQL, PL/pythonu or PL/perl, there's absolutely no good reason I can see to have to ship it on the server's file system first.

Technically, creating an extension inline (sending its definition in the CREATE EXTENSION query itself) solves the problem of having to access the server's file system as root. But then the next pg_dump will include "CREATE EXTENSION foo;" as usual, and pg_restore will then need to access files on the file system. And maybe you still are not granted root access to the server's file system on the pg_restore target, right? So now you need to be able to include the extension's script in the dump.

Now, we don't want to have more than one kind of extension. That's what we all learnt together when reviewing my proposal from last year. Having more than one way to ship an extension is good; having two different animals with two different incompatible behaviors named the same thing is bad. The solution we found is to be able to include an extension's script in pg_dump's output, and that's what my current patch implements, per last year's review.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Review of Row Level Security
Thanks for your review in spite of the large number of lines. My comments are below.

2012/12/4 Simon Riggs si...@2ndquadrant.com:

Patch looks good and also like it will/can be ready for 9.3. I'm happy to put time into this as committer and/or reviewer and take further responsibility for it, unless others wish to.

LIKES
* It's pretty simple to understand and use
* Check qual is stored in pre-parsed form. That is fast, and also secure, since changing the search_path of the user doesn't change the security check at all. Nice.
* Performance overhead is low, integration with indexing is clear and effective, and it works with partitioning
* It works, apart from the design holes listed below, which are easily plugged IMHO

DISLIKES
* Who gets to see stats on the underlying table? Are the stats protected by security? How does ANALYZE work?

I think ANALYZE should run on the raw tables without the row-security policy. Even though statistics are a gray zone, they are not a complete set of the raw table's contents, so all we can do is infer the original data from processed statistical values. The situation is similar to iterative probing using PK/FK violations. In general, this is called a covert channel, and it is out of scope for a regular access control mechanism (including MAC). So, I don't think we need special protection on pg_stat even if row security is configured.

* INSERT ignores the SECURITY clause, on the grounds that this has no meaning. So it's possible to INSERT data you can't see. For example, you can insert medical records for *another* patient, or insert new top-secret information. This also causes a security hole... since inserted rows can violate defined constraints, letting you know that other keys you can't see exist. Why don't we treat the SECURITY clause as a CHECK constraint? That makes intuitive sense to me.

* UPDATE allows you to bypass the SECURITY clause, to produce new rows that you can't see. (Not good). But you can't get them back again, cos you can't see them.
The above two comments suggest applying checks both at the row-scanning stage (UPDATE) and at the row-modifying stage (INSERT and UPDATE), to prevent touchable rows from disappearing from view. In the previous discussion, it was suggested we could implement this feature using a before-row trigger. However, I love the idea of enforcing the same row-security policy integrated with a CHECK constraint; that spares individual users from having to define triggers. One problem is the case when the row-security policy contains a SubLink node. I expect it takes special-case handling; however, I also guess it is not so hard to implement. Let me investigate the code around here.

* TRUNCATE works, and allows you to remove all rows of a table, even ones you can't see to run a DELETE on.

Er... that was my oversight. My preference is to rewrite the TRUNCATE command into a DELETE statement when a row-security policy is active on the target table. In this case, a NOTICE message may be helpful so that users do not assume the table is always empty after the command.

None of those things are cool, at all. Oracle defaults to putting VPD on all event types: INSERT, UPDATE, DELETE, SELECT. ISTM we should be doing the same, not just say "we can add an INSERT trigger if you want". Adding a trigger just begs the question as to why we are bothering in the first place, since this functionality could already be added by INSERT, UPDATE or DELETE triggers, if they are a full replacement for this feature. The only answer is ease of use. We can easily add syntax like this: [ROW SECURITY CHECK ( ) [ON [ ALL | INSERT, UPDATE, DELETE, SELECT [.., with the default being ALL

I think it is a flaw of Oracle. :-) When a user can define a leaky function, it enables leaking the contents of invisible rows at the moment the executor fetches them, prior to the modification stage, even if we allow configuring individual row-security policies for SELECT and for DELETE or UPDATE commands.
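As a concrete illustration, the syntax sketched in the quoted review could look roughly like this in use (hypothetical - neither the grammar nor the example table and column names are settled anywhere in this thread):

```sql
-- hypothetical syntax and example table, per the sketch quoted above
ALTER TABLE medical_record
    ROW SECURITY CHECK (patient = current_user) ON ALL;
-- ON ALL would cover INSERT, UPDATE, DELETE and SELECT, so the same
-- expression also acts like a CHECK constraint on newly written rows
```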
My preference is one policy on a particular table for all the commands.

* The design has nothing at all to do with SECURITY LABELs. Why did we create them, I wonder? I understood we would have row-level label security. Doesn't that require us to have a data type, such as reglabel or similar enum? Seems strange.

Oracle has two features: Oracle Label Security and Row Level Security. I think label security should be implemented in a next step. It additionally takes two independent features: (1) functionality to inject a column to store a security label at table definition, and (2) functionality to assign a default security label when a new row is inserted. As Oracle constructs OLS on top of the VPD feature, the base row-security feature should be upstreamed first.

OTHER
* The docs should explain a little better how to optimize using RLS. Specifically, the fact that indexable operators are marked leakproof and thus can be optimized ahead of the rls quals. The docs say rls quals
Re: [HACKERS] Fwd: question on foreign key lock
Robert, thank you for the answer.

1. "need exclusive lock anyway to add triggers": why does adding a trigger need an exclusive lock? Someone could argue that blocking reads is not needed (since the possible trigger events are Insert/Update/Delete/Truncate).

2. "will create a risk of deadlock": from a user perspective, a risk of deadlock is sometimes better than excessive locking. Transactional DDL users should be prepared for exceptions/retries anyway.

3. I made a naive test of simply changing AccessExclusiveLock to ExclusiveLock, and seeing how many regression tests it breaks. It breaks none :-) Current Git head gives me 2 fails/133 tests regardless of this change.

regards, Filip

On Mon, Nov 12, 2012 at 5:20 PM, Robert Haas robertmh...@gmail.com wrote:
On Thu, Nov 8, 2012 at 3:45 AM, Filip Rembiałkowski filip.rembialkow...@gmail.com wrote:

maybe this is a better group for this question? I can't see why creating a foreign key on table A referencing table B generates an AccessExclusiveLock on B. It seems (to a layman :-) ) that only writes to B should be blocked. I'm really interested whether this is expected behavior, an open TODO item, or suboptimal behavior of postgres.

This comment explains it:

/*
 * Grab an exclusive lock on the pk table, so that someone doesn't delete
 * rows out from under us. (Although a lesser lock would do for that
 * purpose, we'll need exclusive lock anyway to add triggers to the pk
 * table; trying to start with a lesser lock will just create a risk of
 * deadlock.)
 */
pkrel = heap_openrv(fkconstraint->pktable, AccessExclusiveLock);

Concurrent DDL is something that's been discussed in detail on this list in the past; unfortunately, there are some tricky race conditions around the shared invalidation queue and SnapshotNow that make it hard to implement properly. I'm hoping to have some time to work on this at some point, but it hasn't happened yet.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
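For anyone wanting to confirm the lock level locally, a quick sketch against a scratch database (the table and constraint names are made up for illustration):

```sql
-- Session 1: leave the ADD FOREIGN KEY uncommitted so the lock is held.
BEGIN;
ALTER TABLE child ADD CONSTRAINT child_parent_fk
    FOREIGN KEY (parent_id) REFERENCES parent (id);

-- Session 2: inspect what was taken on the *referenced* table.
SELECT mode, granted
FROM pg_locks
WHERE relation = 'parent'::regclass;
```

At the time of this thread this shows AccessExclusiveLock on `parent`, which blocks even plain SELECTs; later releases reduced the lock level taken on the referenced table.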
Re: [HACKERS] Switching timeline over streaming replication
On Tuesday, December 04, 2012 10:01 PM Heikki Linnakangas wrote:
After some diversions to fix bugs and refactor existing code, I've committed a couple of small parts of this patch, which just add some sanity checks to notice incorrect PITR scenarios. Here's a new version of the main patch based on current HEAD.

After testing with the new patch, the following problems are observed.

Defect 1:
1. Start primary A.
2. Start standby B following A.
3. Start cascade standby C following B.
4. Start another standby D following C.
5. Promote standby B.
6. After a successful timeline switch in cascade standbys C and D, stop D.
7. Restart D; startup is successful and it connects to standby C.
8. Stop C.
9. Restart C; startup fails.

Defect 2:
1. Start primary A.
2. Start standby B following A.
3. Start cascade standby C following B.
4. Start another standby D following C.
5. Execute the following commands on primary A:
   create table tbl(f int);
   insert into tbl values(generate_series(1,1000));
6. Promote standby B.
7. Execute the following commands on the new primary B:
   insert into tbl values(generate_series(1001,2000));
   insert into tbl values(generate_series(2001,3000));
8. Stop standby D normally and restart D. The restart fails.
9. Stop standby C normally and restart C. The restart fails.

Note: stopping a node means doing a smart shutdown.

With Regards, Amit Kapila.
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 05 December 2012 01:17, Josh Berkus wrote:
However, as I said, if we had the EXPLAIN ALTER, we could use auto-explain to log the ALTER plans (finally, a good use for auto-explain). So that's a workable workaround. And EXPLAIN ALTER would offer us more flexibility than any logging option, of course.

+1

And for the minority who want to check for themselves (like me) in their tooling, this is also a usable solution.

Regards, Petr Jelinek
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 5 December 2012 09:46, John R Pierce pie...@hogranch.com wrote: On 12/5/2012 1:42 AM, Simon Riggs wrote: I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels. where does this get displayed? is it just tossed into the postgres log files? Good question. Where would you like? It needs to be per-session rather than global. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Is ramdisk usefull ?
Hello everybody. There is a message from Tom Lane (2008!) about ramdisks:

(begin quote)
So, IMHO, saying "trust your OS + PostgreSQL" is not a 100% perfect approach for the people who are asking to keep their objects in RAM, even though I know that there is nothing we can say right now.

Well, nothing is a 100% solution. But my opinion is that people who think they are smarter than an LRU caching algorithm are typically mistaken. If the table is all that heavily used, it will stay in memory just fine. If it's not sufficiently heavily used to stay in memory according to an LRU algorithm, maybe the memory space really should be spent on something else. Now there are certainly cases where a standard caching algorithm falls down --- the main one I can think of offhand is where you would like to give one class of queries higher priority than another, and so memory space should preferentially go to tables that are needed by the first class. But if that's your problem, "pin these tables in memory" is still an awfully crude solution to the problem. I'd be inclined to think instead about a scheme that lets references made by higher-priority queries bump buffers' use-counts by more than 1, or some other way of making the priority considerations visible to an automatic cache management algorithm.

regards, tom lane
(end quote)

I'm working on a migration project (IDS2 to PostgreSQL) for a big organisation in Belgium. We need very quick access to a PostgreSQL table. I did some tests with and without a ramdisk, and I noticed that requests are faster using the ramdisk (context: more than 1,000k reads a day and only 5 or 6 writes a day). Is this message up-to-date? Can you confirm that it's better to let the PostgreSQL engine do its work? Is there another solution to give more or less priority to a request or a table? Giving less priority to a request also interests me, because we need to launch big batch requests. My first idea was to create a slave server using streaming replication.
Maybe there is another solution...? Best Regards, Gilles Fauvie
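For readers finding this thread later: rather than a ramdisk, one supported way to get a table resident in cache is the pg_prewarm contrib module (added in PostgreSQL 9.4, i.e. after this exchange). A minimal sketch, where hot_table is a made-up table name:

```sql
-- pg_prewarm ships in contrib from PostgreSQL 9.4 onwards.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read every block of hot_table into shared_buffers; the function
-- returns the number of blocks it loaded.
SELECT pg_prewarm('hot_table');
```

Note this pre-loads the cache but does not pin anything; eviction still follows the normal buffer replacement behavior Tom describes above.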
[HACKERS] Review: Dumping an Extension's Script
I have done a high-level review of the patch, and here are my comments:

1) I don't know if the community is really looking for this functionality; I followed the thread a bit and apparently there is disagreement over it. I will not comment on that. Let the big guys decide if they really want this feature.

2) I think the comments are very limited; it would be really great if you could add thorough comments for whatever new code you have added. A couple of examples: i) You have added a 'script' variable in the 'ExtensionControlFile' structure, but no comment. All other variables have one; you should follow suit, so we can understand what's going on at first look. ii) What is the need for the 'requires' option? It would be great if you could go through all the comments, add the missing ones, and improve the old ones.

3) There are spelling mistakes in the existing comments, for example: i) File: pg_dump.c, line 850; ii) File: pg_dump.h, line 136. It would be a good idea to run a spell check on your old and new comments.

4) The 'if_no_exits' flag of the new grammar rule is set to false, even though the rule states that the extension is created as 'IF NOT EXISTS'.

5) Why is the 'relocatable' option added to the new grammar rule of CREATE EXTENSION? In a regular CREATE EXTENSION statement, when the 'schema' option is specified, it means that the extension is relocatable. So why is this option needed? If there is a specific reason why the 'schema' option wasn't used, then please explain.

6) I am unable to figure out why the new 'requires' option is needed (comments would have helped). Is it because when we dump the extension with the -X flag, it will first dump the referenced extension? I created the following extension (hstore was already created):

    create extension testex with version '1' requires 'hstore' as
    $testex$
      create type testtype as (a char)
    $testex$;

I then dumped the extension using the -X option; only the 'testex' extension was dumped and not 'hstore'.
Apparently, looking at the query used to fetch extensions in the dump, the extensions that it depends on are also being fetched. But it's not working like that.

7) You have removed an old PG comment in pg_dump.c, line 7526, above the following check:

    if (!extinfo->dobj.dump || dataOnly)
        return;

I guess that was done by mistake?

8) In extension.c the following check is added:

    /*
     * If the extension script is not in the command, then the user is not
     * the extension packager and we want to read about relocatable and
     * requires in the control file, not in the SQL command.
     */
    if (d_relocatable || d_requires)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("parameter \"%s\" is only expected when using CREATE EXTENSION AS",
                        d_relocatable ? "relocatable" : "requires")));

Is the above code really reachable? If the user has not specified a script (the old CREATE EXTENSION syntax is used), then 'd_relocatable' and 'd_requires' will never be true, because the old CREATE EXTENSION syntax doesn't allow them.

9) For every extension being dumped, and then for every object inside that extension, the code traverses all the fetched items of the database to check whether each item belongs to that specific extension. Because of that, 'DumpableObject **dobjs' is traversed fully every time a single extension is dumped. This seems to be a big performance hit for a huge database, and if many extensions are dumped in a huge database, the dump might become very slow. Can you think of any other scheme? Maybe you should follow PG's design, and fetch the extension's objects in dumpExtension() and dump them there.

10) pg_dumpall is not handled at all. A user can't use the -X switch if he wants to dump all the databases with extensions.

Regards, Ali Dar
Re: [HACKERS] [WIP] pg_ping utility
Phil Sorber escribió:
On Mon, Dec 3, 2012 at 11:59 PM, Michael Paquier michael.paqu...@gmail.com wrote:

- Same thing with this example:

    + <para>
    +  Standard Usage:
    +  <screen>
    +   <prompt>$</prompt> <userinput>pg_isready</userinput>
    +   <prompt>$</prompt> <userinput>echo $?</userinput>
    +   <computeroutput>0</computeroutput>
    +  </screen>
    + </para>

For the time being PQPING_OK returns 0 because it is on top of the enum PGPing, but this might change if for one reason or another the order of outputs is changed.

So I understand what you mean by "the ordering might change", but this is actual output from the shell. I'm not sure how to convey that sentiment properly here and still have a real example. Perhaps just remove the example?

No, I think it is the reference docs on the returned value that must be fixed. That is, instead of saying that the return value corresponds to the enum values, you should say that it will return <literal>0</literal> if it's okay, 1 in another case and 2 in yet another case. And then, next to the PQping() enum, add a comment that the values must not be messed around with, because pg_isready exposes them to users and shell scripts.

-- Álvaro Herrera  http://www.2ndQuadrant.com/  PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Review: Dumping an Extension's Script
1) I don't know if the community is really looking for this functionality; I followed the thread a bit and apparently there is disagreement over it. I will not comment on that. Let the big guys decide if they really want this feature.

I saw very few distinct people voting on that. I think this is a killer feature to improve extension usage. The use case is very clear and confirmed (and has been explained by Dimitri already: deploy an extension into several hundred databases without setting up tons of new connections, ip/user/password associations, etc. -- just a single SQL command). BTW I can dump everything from port 5432, except extensions (I can even run pg_basebackup instead of rsync!).

-- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
John R Pierce pie...@hogranch.com writes: On 12/5/2012 1:42 AM, Simon Riggs wrote: I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels. where does this get displayed? is it just tossed into the postgres log files? And perhaps more to the point, what's the advantage compared to good old log_statement = ddl? regards, tom lane
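Tom's suggestion can be tried in a session directly; a minimal sketch (log_statement is a superuser-settable parameter, and some_table is a made-up name):

```sql
-- Valid values for log_statement are none, ddl, mod and all.
SET log_statement = 'ddl';

-- A statement like this now gets written to the server log:
ALTER TABLE some_table ADD COLUMN extra_col integer;
```

Setting it in postgresql.conf instead makes the logging apply server-wide rather than per-session.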
Re: [HACKERS] autovacuum truncate exclusive lock round two
On Tue, Dec 4, 2012 at 2:05 PM, Jan Wieck janwi...@yahoo.com wrote: So the question on the table is which of these three intervals should be GUCs, and what values to use if they aren't. I could live with all the above defaults, but would like to see more comments on them. I largely agree with what's already been said. The only interval that seems to me to maybe need its own knob is the total time after which the autovacuum worker will give up. If we set it to 2 * deadlock_timeout, some people might find that a reason to raise deadlock_timeout. Since people *already* raise deadlock_timeout to obscenely high values (a minute? an hour???) and then complain that things blow up in their face, I think there's a decent argument to be made that piggybacking anything else on that setting is unwise. Against that, FWICT, this problem only affects a small number of users: Jan is the only person I can ever remember reporting this issue. I'm not dumb enough to think he's the only person who it affects; but my current belief is that it's not an enormously common problem. So the main argument I can see against adding a GUC is that the problem is too marginal to justify a setting of its own. What I really see as the key issue is: suppose we hardcode this to say 2 seconds. Is that going to fix the problem effectively for 99% of the people who have this problem, or for 25% of the people who have this problem? In the former case, we probably don't need a GUC; in the latter case, we probably do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Review: Extra Daemons / bgworker
Here's a first attempt at a new documentation chapter. This goes in part "Server Programming", just after the SPI chapter. I just noticed that worker_spi could use some more sample code; for example, auth_counter was getting its own LWLock and also its own shmem area, which would be helpful to demonstrate, I think.

-- Álvaro Herrera  http://www.2ndQuadrant.com/  PostgreSQL Development, 24x7 Support, Training Services

<!-- doc/src/sgml/bgworker.sgml -->
<chapter id="bgworker">
 <title>Background Worker Processes</title>

 <indexterm zone="bgworker">
  <primary>Processes</primary>
  <secondary>Additional</secondary>
 </indexterm>

 <para>
  PostgreSQL can be extended to run user-supplied code in separate processes.
  Such processes are started, stopped and monitored by <command>postgres</command>,
  which permits them to have a lifetime closely linked to the server's status.
  These processes have the option to attach to <productname>PostgreSQL</>'s
  shared memory area and to connect to databases internally.
 </para>

 <warning>
  <para>
   There are considerable robustness and security risks in using background
   worker processes, because them being written in the <literal>C</> language
   gives them unrestricted access to data.  Administrators wishing to enable
   modules that include background worker processes should exercise extreme
   caution.  Only carefully audited modules should be permitted to run
   background worker processes.
  </para>
 </warning>

 <para>
  Only modules listed in <varname>shared_preload_libraries</> can run
  background workers.  A module wishing to register a background worker needs
  to do so by calling
  <function>RegisterBackgroundWorker(<type>BackgroundWorker *worker</type>)</function>.
  The structure <structname>BackgroundWorker</structname> is defined thus:

<programlisting>
typedef struct BackgroundWorker
{
    char       *bgw_name;
    int         bgw_flags;
    BgWorkerStartTime bgw_start_time;
    int         bgw_restart_time;   /* in seconds, or BGW_NEVER_RESTART */
    bgworker_main_type bgw_main;
    void       *bgw_main_arg;
    bgworker_sighdlr_type bgw_sighup;
    bgworker_sighdlr_type bgw_sigterm;
} BackgroundWorker;
</programlisting>
 </para>

 <para>
  <structfield>bgw_name</> is a string to be used in log messages, process
  lists and similar contexts.
 </para>

 <para>
  <structfield>bgw_flags</> is a bitwise-or'd bitmask indicating the
  capabilities that the module would like to have.  Possible values are
  <literal>BGWORKER_SHMEM_ACCESS</literal> (requesting shared memory access)
  and <literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal> (requesting the
  ability to establish a database connection, through which it can later run
  transactions and queries).
 </para>

 <para>
  <structfield>bgw_start_time</structfield> is the server state during which
  <command>postgres</> should start the process; it can be one of
  <literal>BgWorkerStart_PostmasterStart</> (start as soon as
  <command>postgres</> itself has finished its own initialization; processes
  requesting this are not eligible for database connections),
  <literal>BgWorkerStart_ConsistentState</> (start as soon as a consistent
  state has been reached in a HOT standby, allowing processes to connect to
  databases and run read-only queries), and
  <literal>BgWorkerStart_RecoveryFinished</> (start as soon as the system has
  entered normal read-write state).  Note the last two values are equivalent
  in a server that's not a HOT standby.
 </para>

 <para>
  <structfield>bgw_restart_time</structfield> is the interval, in seconds,
  that <command>postgres</command> should wait before restarting the process,
  in case it crashes.  It can be any positive value, or
  <literal>BGW_NEVER_RESTART</literal>, indicating not to restart the process
  in case of a crash.
 </para>

 <para>
  <structfield>bgw_main</structfield> is a pointer to the function to run once
  the process is started.  <structfield>bgw_main_arg</structfield> will be
  passed to it as its only argument.  Note that
  <literal>MyBgworkerEntry</literal> is a pointer to a copy of the
  <structname>BackgroundWorker</structname> structure passed at registration
  time.
 </para>

 <para>
  <structfield>bgw_sighup</structfield> and <structfield>bgw_sigterm</> are
  pointers to functions that will be installed as signal handlers for the new
  process.
 </para>

 <para>
  Once running, the process can connect to a database by calling
  <function>BackgroundWorkerInitializeConnection(<parameter>char *dbname</parameter>, <parameter>char *username</parameter>)</function>.
  This allows the process to run transactions and queries using the
  <literal>SPI</literal> interface.  If <varname>dbname</> is NULL, the
  session is not connected to any particular database, but shared catalogs
  can be accessed.  If <varname>username</> is NULL, the process will run as
  the superuser created during <command>initdb</>.
 </para>

 <para>
  Signals are initially blocked when control reaches the
  <structfield>bgw_main</> function, and must be unblocked by it; this is to
  allow the process to
Re: [HACKERS] [WIP] pg_ping utility
On Wed, Dec 5, 2012 at 8:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
[quoted text elided; it repeats Alvaro's message above in full: fix the return-value docs to name the literal values 0, 1 and 2, and comment the PQping() enum so the values are not reordered, since pg_isready exposes them to users and shell scripts]

+1 I'm on board with this.
Re: [HACKERS] Review: Dumping an Extension's Script
Hi,

Thanks for your review!

Ali Dar ali.munir@gmail.com writes:
"1) I don't know if the community is really looking for this functionality... Let the big guys decide if they really want this feature."

Let's open the extension feature to users who are not at ease with building packages for their distribution of choice, or to users who are not granted root privileges on their database servers. After all, you don't even need to be a superuser to CREATE an EXTENSION.

"2) I think comments are very limited..."

I've added some comments in the attached new revision of the patch, and fixed the errors you've been listing.

"ii) What is the need for the 'requires' option? It would be great if you could go through all the comments and try to add the missing comments and improve the old ones."

When CREATE EXTENSION is given the extension's script in the SQL command itself directly, the control file's properties are not to be found in a separate file. Most of those properties relate to the packaging and the reading of the file, so we don't need them by construction, but that's not the case with two of them.

The parameter 'requires' is used in the CREATE EXTENSION code to be able to set the search_path correctly before running the extension's script, so that dependencies are found automatically.

The parameter 'relocatable' ends up in the pg_extension catalog so that we are able to raise an ERROR if the user attempts ALTER EXTENSION … SET SCHEMA, so we need a way to get it from the SQL command now (so that pg_dump emits a non-lossy command).

Here's the comment I've been adding to gram.y to explain that. Note that I wouldn't expect to find such a comment at this place, but didn't see another place where to detail it…

    /*
     * This option is needed only when the script is given inline.
     * We then don't have any control file but still need to set the
     * search_path according to the dependency list when running the
     * extension's script.
     */
    if (strcmp($1, "requires") == 0)

"3) There are spelling mistakes in the existing comments..."

Fixed, thanks.

"4) The 'if_no_exits' flag of the new grammar rule is set to false, even though the rule states that the extension is created as 'IF NOT EXISTS'."

Oh, a copy/paste bug, the worst kind. Fixed, thanks.

"5) Why is the 'relocatable' option added to the new grammar rule of CREATE EXTENSION? ... If there is a specific reason why the 'schema' option wasn't used, then please explain."

When the schema option is specified, it does NOT mean that the extension is relocatable. The relocatable option is chosen by the extension's author to cope with some dependencies and other problems, and is important for reading the script (@extschema@ substitutions). Also, it's then essential to disallow ALTER EXTENSION … SET SCHEMA … even if that very command could work, so we keep that parameter in the catalogs. It's one of the only two control parameters that still needs to be passed in the SQL command now.

"6) I am unable to figure out why the new 'requires' option is needed (comments would have helped). Is it because when we dump the extension with the -X flag, it will first dump the referenced extension?"

That's not how it works, no. We can of course review that, but I don't currently see any reason to force dumping required extensions' scripts. The 'requires' processing didn't change at all in this patch. You are just allowed to give the list in the SQL command rather than the control file, that's all.
"I created the following extension (hstore was already created):

    create extension testex with version '1' requires 'hstore' as
    $testex$
      create type testtype as (a char)
    $testex$;

I then dumped the extension using the -X option; only the 'testex' extension was dumped and not 'hstore'."

I suppose that at restore time you will have to have hstore on disk with its .so file, but maybe you won't have testex readily deployed on the new server. So my understanding is that what's happening is what needs to happen. What do you think?

"7) You have removed an old PG comment in pg_dump.c, line 7526, above the following check:"

Oops. Added back. Thanks.

"8) In extension.c the following check is added:"

    /*
     * If the extension script is not in the command, then the user is not
     * the extension packager and we want to read about relocatable and
     * requires in the control file, not in the SQL command.
     */
    if (d_relocatable || d_requires)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("parameter \"%s\" is only expected when using CREATE EXTENSION AS",
                        d_relocatable ? "relocatable" :
Re: [HACKERS] autovacuum truncate exclusive lock round two
Robert Haas wrote: Since people *already* raise deadlock_timeout to obscenely high values (a minute? an hour???) and then complain that things blow up in their face, I think there's a decent argument to be made that piggybacking anything else on that setting is unwise. If people are really doing that, then I tend to agree. I wasn't aware of that practice. Against that, FWICT, this problem only affects a small number of users: Jan is the only person I can ever remember reporting this issue. I'm not dumb enough to think he's the only person who it affects; but my current belief is that it's not an enormously common problem. So the main argument I can see against adding a GUC is that the problem is too marginal to justify a setting of its own. What I really see as the key issue is: suppose we hardcode this to say 2 seconds. Is that going to fix the problem effectively for 99% of the people who have this problem, or for 25% of the people who have this problem? In the former case, we probably don't need a GUC; in the latter case, we probably do. Given the fact that autovacuum will keep throwing workers at it to essentially loop indefinitely at an outer level, I don't think the exact setting of this interval is all that critical either. My gut feel is that anything in the 2 second to 5 second range would be sane, so I won't argue over any explicit setting within that range. Below that, I think the overhead of autovacuum coming back to the table repeatedly would probably start to get too high; below that we could be causing some small, heavily-updated table to be neglected by autovacuum -- especially if you get multiple autovacuum workers tied up in this delay on different tables at the same time. Regarding how many people are affected, I have seen several reports of situations where users claim massive impact on performance when autovacuum kicks in. 
The reports have not included enough detail to quantify the impact or, in most cases, to establish a cause, but this seems like it could have a noticeable impact, especially if the deadlock timeout was set to more than a second. -Kevin
Re: [HACKERS] WIP json generation enhancements
On Nov 26, 2012, at 11:12 AM, Peter Eisentraut pete...@gmx.net wrote: Although my intuition would be [], the existing concatenation-like aggregates return null for no input rows, so this probably ought to be consistent with those. This annoys me at times, but I wrap such calls in COALESCE() and forget about it. So I agree to keep it consistent with other array-returning aggregate functions. Best, David
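David's COALESCE() idiom, spelled out as a sketch (this assumes the aggregate is named json_agg, as it eventually was in PostgreSQL 9.3):

```sql
-- Like other concatenation-style aggregates, json_agg over zero input
-- rows yields NULL; COALESCE substitutes an empty JSON array instead.
SELECT COALESCE(json_agg(t.val), '[]'::json) AS vals
FROM (SELECT 1 WHERE false) AS t(val);
-- returns []  (rather than NULL)
```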
Re: [HACKERS] json accessors
On Dec 4, 2012, at 10:05 AM, Josh Berkus j...@agliodbs.com wrote: json_get(json, variadic text) = json Given that I already do the equivalent in Python, this would suit me well. Not sure about other users ... Well, given that sometimes you will have mixed arrays and objects, how would you distinguish 42 as an object key or an array index? Best, David
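For the record, the path operators that later shipped (9.3's #> and friends) answer David's question by interpreting each path element against the type of the current value; a sketch:

```sql
-- "items" is looked up as an object key; the next element, 1, is then
-- treated as an array index because the current value is an array.
SELECT '{"items": [10, 20, 30]}'::json #> '{items,1}';
-- returns 20
```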
Re: [HACKERS] Dumping an Extension's Script
On 05.12.2012 12:22, Dimitri Fontaine wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: No doubt about that. I'm sure extensions written in pure SQL or PL/pgSQL are very common. But what does that have to do with this patch? This patch is all about enabling users to create extensions without having to ship them as root on the file system of the database server(s) first. ... When you can code your extension using dynamic code such as SQL or PL/pgSQL, PL/pythonu or PL/perl, there's absolutely no good reason that I can see to have to do the "ship on the server's file system first" step. Technically, creating an extension inline (sending its definition in the CREATE EXTENSION query itself) solves the problem of having to access the server's file system as root. Ok, I'm with you this far. Then, next, pg_dump will include CREATE EXTENSION foo; as usual, and at pg_restore time that accesses files on the file system. But maybe you still are not granted access to the server's file system as root on the pg_restore target, right? So now you need to be able to include the extension's script in the dump. Now you lost me. I can see the need to install an extension without access to the filesystem - but it does not follow that you need to be able to dump an extension script. In general, I think you're confusing three things: 1. The way an extension is deployed. It could be by copying the files to the file system, by sending them over libpq, or shipped in .rpms by the OS, or something else. 2. The way an extension's files are laid out before it's deployed. Typically, you want to keep an extension's source code (whether it's C or SQL or plpython) in a version control system. 3. Being able to deploy extensions to the server without superuser or root access. I think it would make this discussion a lot clearer if we keep those concerns separate. It's useful to have a mechanism to deploy an extension over libpq. It's not clear to me if you're envisioning to change 2.
I don't think we should; having a .sql file and a .control file seems perfectly fine to me. I'd suggest that we just need a way to upload an extension to the server via libpq. Something like UPLOAD EXTENSION foo, which goes into COPY mode and you can stream over a zip file containing the .sql and .control file that make up the extension. The server would unzip the file into the right directory. Now, point 3 is yet another issue. If you need to copy the extension files to /usr/share/, you need root (or similar) access on the filesystem. We could allow extensions to be located somewhere in the data directory instead. Like $PGDATA/extensions. But again, that would be an independent change from 1 and 2. And I still don't understand why pg_dump needs to know about any of this... - Heikki
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
Simon Riggs si...@2ndquadrant.com writes: On 5 December 2012 09:46, John R Pierce pie...@hogranch.com wrote: On 12/5/2012 1:42 AM, Simon Riggs wrote:

I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels.

Where does this get displayed? Is it just tossed into the postgres log files?

Good question. Where would you like?

What about pg_stat_ddl, a new system view? It would maybe need to have some xid/cid-like ordering to be able to reproduce the script. It could also maybe use the ddl_rewrite module I'm working on for the event triggers framework, as far as normalized SQL goes. The rewrite information would be a boolean column in that view, I guess.

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] json accessors
On Nov 28, 2012, at 4:10 PM, Merlin Moncure mmonc...@gmail.com wrote:

Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start.

Not completely buying that: see comments below. Not supporting xpath-style decompositions seems wrong to me. IOW, json_get should be set-returning (perhaps via wild cards in the keytext), or we need json_each.

The problem I see with the current proposal is that this limitation, it seems to me, would prevent the ability to index nested keys. If you're essentially composing and decomposing JSON values as you drill down, the intermediate JSON values between the original one and the final return value can't be indexed, can they? For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal?

Best, David

PS: Sorry for the delayed replies, digging my way out of a couple weeks of back posts…
Re: [HACKERS] json accessors
On 12/05/2012 12:11 PM, David E. Wheeler wrote: On Dec 4, 2012, at 10:05 AM, Josh Berkus j...@agliodbs.com wrote:

json_get(json, variadic text) = json

Given that I already do the equivalent in Python, this would suit me well. Not sure about other users ...

Well, given that sometimes you will have mixed arrays and objects, how would you distinguish 42 as an object key or an array index?

If the thing is an array, test to see if the string is a valid integer string, and if so use the integer value.

cheers, andrew
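Andrew's disambiguation rule can be illustrated against the proposed json_get signature. These calls are hypothetical - json_get was only a proposal at this point - and serve only to show the rule, not a real API:

```sql
-- Hypothetical calls to the proposed json_get(json, variadic text).
-- Rule under discussion: at each step, if the current value is an
-- object, the key text names a field; if it is an array and the key
-- text is a valid integer string, it is used as an array index.
SELECT json_get('{"a": [10, 20, 30]}', 'a', '1');  -- 'a' names a field, '1' indexes the array
SELECT json_get('{"42": "x"}', '42');              -- an object, so '42' is a key, not an index
```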
Re: [HACKERS] json accessors
On 12/05/2012 12:14 PM, David E. Wheeler wrote: On Nov 28, 2012, at 4:10 PM, Merlin Moncure mmonc...@gmail.com wrote:

Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start.

not completely buying that: see comments below. not supporting xpath style decompositions seems wrong to me. IOW, json_get should be set returning (perhaps via wild cards in the keytext) or we need json_each.

The problem I see with the current proposal is that this limitation, it seems to me, would prevent the ability to index nested keys. If you're essentially composing and decomposing JSON values as you drill down, the intermediate JSON values between the original one and the final return value can't be indexed, can they? For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal?

Probably not.

cheers, andrew
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 19:13:10 +0200, Heikki Linnakangas wrote: On 05.12.2012 12:22, Dimitri Fontaine wrote: Heikki Linnakangas hlinnakan...@vmware.com writes:

No doubt about that. I'm sure extensions written in pure SQL or PL/pgSQL are very common. But what does that have to do with this patch? This patch is all about enabling users to create extension without having to ship them as root on the file system of the database(s) server(s) first. ... When you can code your extension using dynamic code such as SQL or PL/pgSQL, PL/pythonu or PL/perl, there's absolutely no good reason to have to do the ship on the server's file system first that I can see.

Technically creating an extension inline (sending its definition in the CREATE EXTENSION query itself) solves the problem of having to access the server's file system as root.

Ok, I'm with you this far.

Then, next pg_dump will include CREATE EXTENSION foo; as usual and at pg_restore time that accesses files on the file system. But maybe you still are not granted access to the server's file system as root on the pg_restore target, right? So now you need to be able to include the extension's script into the dump.

Now you lost me. I can see the need to install an extension without access to the filesystem - but it does not follow that you need to be able to dump an extension script. In general, I think you're confusing three things:

1. The way an extension is deployed. It could be by copying the files to the file system, by sending them over libpq, or shipped in .rpms by the OS, or something else.

2. The way an extension's files are laid out before it's deployed. Typically, you want to keep an extension's source code (whether it's C or SQL or plpython) in a version control system.

3. Being able to deploy extensions to the server without superuser or root access.

I think it would make this discussion a lot clearer if we keep those concerns separate. It's useful to have a mechanism to deploy an extension over libpq.
It's not clear to me if you're envisioning to change 2. I don't think we should; having a .sql file and a .control file seems perfectly fine to me. I'd suggest that we just need a way to upload an extension to the server via libpq. Something like UPLOAD EXTENSION foo, which goes into COPY mode and you can stream over a zip file containing the .sql and .control file that make up the extension. The server would unzip the file into the right directory.

Not sure what is better here. Dimitri's way seems to be easier to manage for people who maintain their database in update scripts and such, and yours seems to be a bit simpler from the backend perspective.

Now, point 3 is yet another issue. If you need to copy the extension files to /usr/share/, you need root (or similar) access on the filesystem. We could allow extensions to be located somewhere in the data directory instead. Like $PGDATA/extensions. But again, that would be an independent change from 1 and 2.

I think installing them into some global space is not a sensible interim step. Having an UPLOAD EXTENSION in one database affect all other databases or even clusters (because you e.g. updated the version) would be really confusing. Which leads to:

And I still don't understand why pg_dump needs to know about any of this...

Extensions should be fully per-database, and we want pg_dump backups to be restorable into another database/cluster/server. So having a mode for pg_dump that actually makes dumps that are usable for recovering after a disaster seems sensible to me. Otherwise you need to redeploy from the VCS or whatever, which isn't really what you want when restoring a database backup. Comparing the situation to the one where you have extensions provided by the packaging system or by /contrib or whatever doesn't seem to be all that valid to me.
Greetings, Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] json accessors
On Dec 5, 2012, at 9:21 AM, Andrew Dunstan and...@dunslane.net wrote:

For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal?

Probably not.

That greatly reduces its utility for querying, though not, of course, for using it in procedural code. Wouldn't using a jsonpath-style implementation allow for indexing?

Best, David
Re: [HACKERS] Dumping an Extension's Script
On 05.12.2012 19:27, Andres Freund wrote:

And I still don't understand why pg_dump needs to know about any of this...

Extensions should be fully per-database and we want pg_dump backups to be restorable into another database/clusters/servers. So having a mode for pg_dump that actually makes dumps that are usable for recovering after a disaster seems sensible to me. Otherwise you need to redeploy from the VCS or whatever, which isn't really what you want when restoring a database backup.

Ok - but that is yet another issue, not to be confused with how you deploy extensions. If we are to have such a mode in pg_dump, it should be able to dump *all* extensions, regardless of how they were deployed. (Ok, it might be difficult for extensions that include .so files or similar, but certainly for an extension that only contains a .sql file and a .control file, it shouldn't matter how it was deployed.)

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per-cluster - that's *yet* another separate issue. You could argue for either behavior.

- Heikki
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 5 December 2012 15:01, Tom Lane t...@sss.pgh.pa.us wrote: John R Pierce pie...@hogranch.com writes: On 12/5/2012 1:42 AM, Simon Riggs wrote:

I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels.

Where does this get displayed? Is it just tossed into the postgres log files?

And perhaps more to the point, what's the advantage compared to good old log_statement = ddl?

That logs DDL statements for the whole system and isn't user-settable. It wouldn't be useful to extend that, since the user wouldn't be able to enable/disable it, and the stats would get dumped in the server log. We need something more user-specific. Ideas:

* pg_stat_ddl (from Dimitri), which would be a temp table containing results
* a stream of NOTICE statements back to the client - that seems easier
* err...

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
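The NOTICE-based variant could look roughly like this in a session. Both the schema_change_reporting parameter and the notice wording are hypothetical - nothing here was implemented - and serve only to show the shape of the idea:

```sql
-- Hypothetical: schema_change_reporting does not exist as a GUC.
SET schema_change_reporting = on;

ALTER TABLE accounts ALTER COLUMN balance TYPE numeric(12,2);
-- NOTICE:  ALTER TABLE rewrote table "accounts"
-- NOTICE:  ACCESS EXCLUSIVE lock was held on "accounts"
```

As the thread points out, unattended migration runs may simply discard such session output, which is the argument for a logging or aborting mechanism instead.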
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 19:40:58 +0200, Heikki Linnakangas wrote: On 05.12.2012 19:27, Andres Freund wrote:

And I still don't understand why pg_dump needs to know about any of this...

Extensions should be fully per-database and we want pg_dump backups to be restorable into another database/clusters/servers. So having a mode for pg_dump that actually makes dumps that are usable for recovering after a disaster seems sensible to me. Otherwise you need to redeploy from the VCS or whatever, which isn't really what you want when restoring a database backup.

Ok - but that is yet another issue, not to be confused with how you deploy extensions. If we are to have such a mode in pg_dump, it should be able to dump *all* extensions, regardless of how they were deployed. (Ok, it might be difficult for extensions that include .so files or similar, but certainly for an extension that only contains a .sql file and a .control file, it shouldn't matter how it was deployed.)

For me it seems pretty natural to support dumping extensions the way they got created. I.e., a plain CREATE EXTENSION ...; if the extension was preinstalled, and some form that includes the extension source if you installed it via the connection. Extensions that were installed in some global manner *should* not be dumped with ones installed over the connection. E.g. dumping /contrib or packaged modules seems to be a bad idea to me. That would possibly be useful as a debugging tool, but I don't see much point besides that.

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per-cluster - that's *yet* another separate issue. You could argue for either behavior.

What would be the case for per-cluster in the case of uploaded extensions?
Greetings, Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
Heikki Linnakangas hlinnakan...@vmware.com writes:

Ok - but that is yet another issue, not to be confused with how you deploy extensions. If we are to have such a mode in pg_dump, it should be able to dump *all* extensions, regardless of how they were deployed. (Ok, it might be difficult for extensions that include .so files or similar, but certainly for an extension that only contains a .sql file and a .control file, it shouldn't matter how it was deployed.)

That's what you have in the current patch. Try:

    create extension 'hstore';
    $ pg_dump --extension-script hstore

It works as far as the script is concerned, and the control file is not needed any more because the script as dumped does not need it, except for the two parameters 'require' and 'relocatable', which are added in the SQL command. The binary file is not taken care of by this mechanism. Remember that in most cases pg_restore will not be allowed to deploy it in the right place anyway, for security reasons.

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per-cluster - that's *yet* another separate issue. You could argue for either behavior.

At the SQL level, extensions do live in a database. The only reason why we currently have them on the file system is binary executables (.so, .dylib, .dll). And those are not per database, not even per cluster, not even per major version; they are *per server*. It's something that makes me very sad, and that I want to have the chance to fix later, but that won't happen in 9.3, and certainly not in that very patch…

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Dumping an Extension's Script
Andres Freund and...@2ndquadrant.com writes: On 2012-12-05 19:13:10 +0200, Heikki Linnakangas wrote:

And I still don't understand why pg_dump needs to know about any of this...

Extensions should be fully per-database and we want pg_dump backups to be restorable into another database/clusters/servers.

Wait a minute. I haven't bought into either of those statements, and most particularly not the first one. Upthread, Dimitri claimed that he wasn't creating two different kinds of extensions with this patch, but the more I read about it the more it seems that he *is* making a fundamentally different kind of animal. And I don't think it's necessarily a good idea, especially not if we still call it an extension.

I kind of like Heikki's idea of leaving CREATE EXTENSION alone and inventing a separate UPLOAD EXTENSION operation, but there's a problem with that: in many, probably most, installations, the server does not and should not have permission to scribble on the directories where the extension scripts are stored. Possibly we could circumvent that by creating an auxiliary extensions directory under $PGDATA. (But then it starts to seem like pg_dumpall --- not pg_dump --- ought to include those files in its output...)

regards, tom lane
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 11:14 AM, David E. Wheeler da...@justatheory.com wrote: On Nov 28, 2012, at 4:10 PM, Merlin Moncure mmonc...@gmail.com wrote:

Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start.

not completely buying that: see comments below. not supporting xpath style decompositions seems wrong to me. IOW, json_get should be set returning (perhaps via wild cards in the keytext) or we need json_each.

The problem I see with the current proposal is that this limitation, it seems to me, would prevent the ability to index nested keys. If you're essentially composing and decomposing JSON values as you drill down, the intermediate JSON values between the original one and the final return value can't be indexed, can they? For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal?

I think best practices for JSON manipulation (at least in performance-sensitive cases with large documents) are going to be to fully decompose into SQL structures and manipulate after the fact. JSON's primary role is to serve as data exchange, and Andrew's API (with the tweaks he came up with) seems to facilitate that pretty well; full decomposition is a snap. Indexing large documents for fancy querying is a niche case but also quite complex. This isn't very well covered by xmlpath either, btw -- I think for inspiration we should be looking at hstore.

That said, how would you do that? The first thing that jumps into my mind is to cut right to the chase: maybe the semantics could be defined so that implementing haystack @ needle would reasonably cover most cases.

So my takeaways are:

*) decomposition != precise searching. Andrew's API handles the former and stands on its own merits.

*) xmlpath/jsonpath do searching (and decomposition) but are very clunky from an SQL perspective and probably an absolute no-go in terms of GiST/GIN.
*) postgres spiritually wants to do things via operators, and we should (if possible) at least consider that first.

merlin
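The hstore inspiration Merlin mentions can be made concrete. The hstore containment operator below is real (hstore shipped with indexable containment at the time); the json analogue is purely hypothetical, a sketch of the kind of operator being argued for:

```sql
-- Real: hstore's indexable containment test, the suggested model.
CREATE EXTENSION IF NOT EXISTS hstore;
SELECT 'a=>1, b=>2'::hstore @> 'b=>2'::hstore;   -- true

-- Hypothetical json analogue discussed here: an indexable
-- haystack-contains-needle operator instead of path decomposition.
--   SELECT doc FROM people
--   WHERE doc @> '{"address": {"city": "Portland"}}';
-- ... which a GIN index on doc could then accelerate.
```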
Re: [HACKERS] json accessors
On 12/05/2012 12:29 PM, David E. Wheeler wrote: On Dec 5, 2012, at 9:21 AM, Andrew Dunstan and...@dunslane.net wrote:

For sufficiently large columns, I expect I would want a GIN index to speed JSON value extraction queries. Possible with this proposal?

Probably not.

That greatly reduces its utility for querying, though not, of course, for using it in procedural code. Wouldn't using a jsonpath-style implementation allow for indexing?

Indexing tree-like data isn't at all easy. We don't index XML either. There has been discussion of this sort of indexing in the past, and a couple of people have said they would work on it, but I have not seen a proposal or a single line of code. Jsonpath on its own would not do what you're suggesting. A first approach to indexing treeish data requires that you generate all the possible paths and index those. That would be quite explosive in volume. And anyway, jsonpath is not on offer here. I'm sorry what I have offered isn't what you want, but plenty of other people have told me it will go a long way toward meeting their needs.

cheers, andrew
Re: [HACKERS] Dumping an Extension's Script
Heikki Linnakangas hlinnakan...@vmware.com writes:

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per cluster - that's *yet* another separate issue. You could argue for either behavior.

I think anyone arguing for the former is confusing an installed extension with a not-installed one. Maybe it would help if we adopted different terminologies. Perhaps call the control+sql files a template, while using extension for the installed entity?

regards, tom lane
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
Simon Riggs si...@2ndquadrant.com writes: On 5 December 2012 15:01, Tom Lane t...@sss.pgh.pa.us wrote:

And perhaps more to the point, what's the advantage compared to good old log_statement = ddl?

That logs DDL statements for the whole system and isn't user settable.

Not true; you can set it with ALTER ROLE/DATABASE, and a superuser can adjust it for his own session.

* pg_stat_ddl (from Dimitri) which would be a temp table containing results
* Stream of NOTICE statements back to client that seems easier
* err...

And an ORM is going to do what with either, pray tell? None of these are of any use except in an interactive session, in which the user could perfectly well use EXPLAIN anyway.

regards, tom lane
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 12:55:42 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-12-05 19:13:10 +0200, Heikki Linnakangas wrote:

And I still don't understand why pg_dump needs to know about any of this...

Extensions should be fully per-database and we want pg_dump backups to be restorable into another database/clusters/servers.

Wait a minute. I haven't bought into either of those statements, and most particularly not the first one.

Ok. Upthread, Dimitri claimed that he wasn't creating two different kinds of extensions with this patch, but the more I read about it the more it seems that he *is* making a fundamentally different kind of animal. And I don't think it's necessarily a good idea, especially not if we still call it an extension.

I have to admit I haven't read the whole discussion about this. And I also have to say that I have no idea yet whether I like the current implementation because I haven't looked at it yet. I just wanted to give input on the separate problems Heikki listed, because I have wished for something roughly like this for years... To me it seems sensible that extensions which are preinstalled on the system are global, and extensions which a single user inside a single database created are per-database. Imo that doesn't make them all that fundamentally different.

I kind of like Heikki's idea of leaving CREATE EXTENSION alone and inventing a separate UPLOAD EXTENSION operation, but there's a problem with that: in many, probably most, installations, the server does not and should not have permission to scribble on the directories where the extension scripts are stored. Possibly we could circumvent that by creating an auxiliary extensions directory under $PGDATA. (But then it starts to seem like pg_dumpall --- not pg_dump --- ought to include those files in its output...)

UPLOAD EXTENSION seems to be a good idea. But I really, really would like them to go to a per-database directory, not a per-cluster one.
Otherwise the coordination between different database owners inside a cluster will get really hairy. I want to be able to install different versions of an application into different databases.

Greetings, Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
On 05.12.2012 20:07, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes:

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per cluster - that's *yet* another separate issue. You could argue for either behavior.

I think anyone arguing for the former is confusing an installed extension with a not-installed one. Maybe it would help if we adopted different terminologies. Perhaps call the control+sql files a template, while using extension for the installed entity?

+1 on the naming. You could still argue that templates should be per-database. It would make life easier for someone who is a database owner but not a superuser, for example, allowing you to install an extension that only affects your own database (assuming we set up the permissions so that that's possible, of course).

- Heikki
Re: [HACKERS] Dumping an Extension's Script
Dimitri Fontaine dimi...@2ndquadrant.fr writes:

At the SQL level, extensions do live in a database. The only reason why we currently have them on the file system is binary executables (.so, .dylib, .dll). And those are not per database, not even per cluster, not even per major version, they are *per server*. It's something that makes me very sad, and that I want to have the chance to fix later, but that won't happen in 9.3, and certainly not in that very patch…

I think you're wasting your time to imagine that that case will ever be fixed. Allowing the server to scribble on executable files would set off all kinds of security alarm bells, and rightly so. If Postgres ever did ship with such a thing, I rather imagine that I'd be required to patch it out of Red Hat releases (not that SELinux wouldn't prevent it from happening anyway).

I do see an argument for allowing SQL-only extensions to be installed this way, since that doesn't allow the execution of anything the user couldn't execute anyway. There's no need to worry about anything except control and script files, though.

regards, tom lane
Re: [HACKERS] Dumping an Extension's Script
On 05.12.2012 20:13, Andres Freund wrote:

But I really really would like them to go to a per-database directory not a per-cluster one. Otherwise the coordination between different database owners inside a cluster will get really hairy. I want to be able to install different versions of an application into different databases.

Extension authors should be careful to maintain backwards-compatibility, so that it would be enough to have the latest version installed. If you break compatibility, you probably should rename the extension.

That said, I can understand that in practice you'd want to have different versions installed at the same time, so that you don't need to re-test everything when upgrading an extension, and don't need to trust that the extension author didn't accidentally break backwards-compatibility anyway.

If you really meant different versions of an application, and not different versions of an extension, then it seems to me that you're abusing the extension infrastructure for something else. If you have some functions that you consider part of the application, even if those functions might be useful in other applications too, you probably don't want to treat them as an extension.

- Heikki
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 20:15:42 +0200, Heikki Linnakangas wrote: On 05.12.2012 20:07, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes:

And whether extension control files (or the same information stored in a table or wherever) should be per-database or per cluster - that's *yet* another separate issue. You could argue for either behavior.

I think anyone arguing for the former is confusing an installed extension with a not-installed one.

Not sure whether it would be the best design, but having something like UPLOAD EXTENSION which can only exist in the installed form would be enough for nearly all the use-cases I have experienced.

Maybe it would help if we adopted different terminologies. Perhaps call the control+sql files a template, while using extension for the installed entity?

+1 on the naming.

+1 on the idea of naming them separately. I am not happy with template, but then I don't have a better suggestion.

You could still argue that templates should be per-database. It would make life easier for someone who is database owner but not superuser, for example, allowing you to install an extension that only affects your own database (assuming we set up the permissions so that that's possible, of course).

+1. We could even have two variants, UPLOAD [GLOBAL] EXTENSION/TEMPLATE. ISTM that we would need some kind of search path anyway, so adding that separation seems to be a minimal amount of additional effort.

Greetings, Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 20:23:29 +0200, Heikki Linnakangas wrote: On 05.12.2012 20:13, Andres Freund wrote:

But I really really would like them to go to a per-database directory not a per-cluster one. Otherwise the coordination between different database owners inside a cluster will get really hairy. I want to be able to install different versions of an application into different databases.

Extension authors should be careful to maintain backwards-compatibility, so that it would be enough to have the latest version installed. If you break compatibility, you probably should rename the extension.

In theory, yes. In practice:

That said, I can understand that in practice you'd want to have different versions installed at the same time, so that you don't need to re-test everything when upgrading an extension, and don't need to trust that the extension author didn't accidentally break backwards-compatibility anyway.

;)

If you really meant different versions of an application, and not different versions of an extension, then it seems to me that you're abusing the extension infrastructure for something else. If you have some functions that you consider part of the application, even if those functions might be useful in other applications too, you probably don't want to treat them as an extension.

I was thinking of reusable parts of applications that might be used in more than one application. *But* I think this also is a good basis to encapsulate individual non-shared parts of an application. Why not?

Greetings, Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 13:18:16 -0500, Tom Lane wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: At the SQL level, extensions do live in a database. The only reason why we currently have them on the file system is binary executables (.so, .dylib, .dll). And those are not per database, not even per cluster, not even per major version, they are *per server*. It's something that makes me very sad, and that I want to have the chance to fix later, but that won't happen in 9.3, and certainly not in that very patch… Maybe I am missing something, but you already can separate them per major version. You co-wrote the Debian infrastructure to do so for some Debian packages, so I am not sure what you mean here. Adding some *NON WRITABLE* per-cluster library directory doesn't seem to be as controversial as other suggestions. I think you're wasting your time to imagine that that case will ever be fixed. Allowing the server to scribble on executable files would set off all kinds of security alarm bells, and rightly so. If Postgres ever did ship with such a thing, I rather imagine that I'd be required to patch it out of Red Hat releases (not that SELinux wouldn't prevent it from happening anyway). +1 Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
Simon, And anything which assumes the *absence* of a manual script is also leaving out a large class of users. ORMs are very important, but not the only thing we serve. Yes. In the long run, we'll probably need two solutions: an interactive EXPLAIN, and something which logs or aborts for the ORM folks. Please assume that script means a set of SQL statements that are executed in a specific sequence to change a database model from one version to another. Anything which requires editing of all (or worse, just some) of the SQL statements is not a good solution. For ORMs, this requires each ORM to make its own change to support that functionality and to have a separate mode where it is used. Exactly. And only the ORMs which are very close to PostgreSQL would be willing to do this. Most would not. I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] The problem with anything which reports back to the session is that even when DBAs are running SQL scripts, migrations are seldom run in an interactive session. For example, I manage all migrations for large projects using Python and YAML files, and Sqitch uses Perl and JSON wrappers for the SQL. Doing migrations via psql -f filename -q is also very common. So anything reported back in an interactive session would be lost. That's why we need a mechanism which either logs, or aborts on specific actions. From the perspective of the DevOps staff, abort is possibly the better option, but there may be issues with it on our end. That was the attraction of the original NOREWRITE patch, although as I said that suffers from new keywords and a total lack of extensibility. What about adding something like: ddl_action = [ none, log, warn, abort ] ddl_events = [ all, rewrite, exclusive, access_exclusive ] I realize I'm getting out into the weeds here, but I'm thinking as a contract DBA, what would *really* help me? and something like the above would do it.
This would allow me to do something like: I wanna test this Rails migration, and have it die if it tries to do a full table rewrite or take an access_exclusive lock. And I'll check the logs afterwards if it blows up. ddl_action = 'log,abort' ddl_events = 'rewrite,access_exclusive' This would make it very easy to set some rules for the organization, and enforce them with automated testing. Auto explain executes things twice, which is not possible for DDL, so it won't work. I keep trying to find a use for auto-explain. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] json accessors
On Dec 5, 2012, at 9:57 AM, Merlin Moncure mmonc...@gmail.com wrote: Indexing large documents for fancy querying is a niche case but also quite complex. This isn't very well covered by xmlpath either btw -- I think for inspiration we should be looking at hstore. Agreed, although hstore, IIRC, does not support nesting. That said, how would you do that? The first thing that jumps into my mind is to cut right to the chase: maybe the semantics could be defined so that haystack @ needle would reasonably cover most cases. Yes. So my takeaways are: *) decomposition != precise searching. andrew's api handles the former and stands on its own merits. Agreed. *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from a SQL perspective and probably a no-go in terms of GiST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first I don't understand how xmlpath/jsonpath is not able to be implemented with operators. Best, David
Re: [HACKERS] Dumping an Extension's Script
On Wed, Dec 5, 2012 at 5:22 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: This patch is all about enabling users to create extension without having to ship them as root on the file system of the database(s) server(s) first. Right, but it changes the way that existing extensions *dump*, which seems to me to undo the use case that works now. I mean, the advantage of dumping an extension as CREATE EXTENSION hstore is that you can reload that dump on a different server with a newer version of hstore installed, and it'll still work. If we go back to dumping all of the SQL commands that compose that extension, then it'll break again, in exactly the way things were broken before we had extensions in the first place. Back in the bad old days, you'd dump your old database (which had all of the SQL commands for say hstore) and then reload it on your new database - and it would fail, because the old SQL commands didn't match the new binaries. Oops. With the extension mechanism, it all works just fine: the old database emits CREATE EXTENSION hstore and the new database can execute that just fine. You still have a problem if the extension has meanwhile been changed in a backwards-incompatible way that doesn't work for your application (i.e. you're using the = operator which has since been removed) but hopefully that doesn't happen too often, and in any event it seems relatively unavoidable. And it takes nothing away from the problem that extensions DO solve, which is incompatibilities between the SQL file and the shared library. When you can code your extension using dynamic code such as SQL or PL/pgSQL, PL/pythonu or PL/perl, there's absolutely no good reason to have to do the ship on the server's file system first that I can see. Technically creating an extension inline (sending its definition in the CREATE EXTENSION query itself) solves the problem of having to access the server's file system as root. True, but so does not putting the code into an extension at all. 
You can just create loose functions and operators. It's unclear to me what advantage the extension mechanism provides if there's no shared library and no on-disk files involved. Then, next pg_dump will include CREATE EXTENSION foo; as usual and at pg_restore time that access files on the file systems. But maybe you still are not granted access to the server's file system as root on the pg_restore target, right? So now you need to be able to include the extension's script into the dump. Granting for the moment that there's a reason to call this an extension at all, rather than a schema or just a bunch of random CREATE commands, which is not obvious to me, yes, you need to include it in the dump. But sure then the extension needs to be marked as being, somehow, a different flavor of extension that can only use SQL (not shlibs) and needs to be dumped in-line, because otherwise, as noted above, we break things for the flavor of extensions we've already got. Also, even there, it seems to me that it ought to work something like this: CREATE EXTENSION inline_extension NULL; -- create an extension with no members CREATE FUNCTION blahblahblah ... ALTER EXTENSION inline_extension ADD FUNCTION blahblab ... and so on for all the other members That is, the extension members should just become dumpable objects. This seems quite bizarre since the whole point of extensions AIUI is to avoid dumping the members, but it's better than what the patch implements. In the patch, IIRC, you emit all the members as a separate dump that gets enclosed by dollar quotes. This strikes me as ugly, and I think you can construct circular dependency situations in which it will fail outright. Now, we don't want to have more than one kind of extensions. That's what we learnt all together when reviewing my proposal from last year. Having more than one way to ship an extension is good, having two different animals with two different incompatible behaviors named the same thing is bad. 
The solution we found is then to be able to include an extension's script into pg_dump's output, and that's what my current patch implements, per last year's review. I don't think I agree. I don't see a problem with having more than one kind of extension, but I'm worried that you're trying to shoehorn something that isn't really an extension into an extension-sized box. And I sure don't want that to mean let's break stuff that works right now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] json accessors
On Dec 5, 2012, at 10:04 AM, Andrew Dunstan and...@dunslane.net wrote: Indexing tree-like data isn't at all easy. We don't index XML either. There has been discussion of this sort of indexing in the past, and a couple of people have said they would work on it, but I have not seen a proposal or a single line of code. Yeah, I forgot that xmlpath was not indexable. Jsonpath on its own would not do what you're suggesting. A first approach to indexing treeish data requires that you generate all the possible paths and index that. That would be quite explosive in volume. And anyway, jsonpath is not on offer here. Yeah, explosive for sure, but for sufficiently small JSON values, that shouldn’t be much of an issue. I expect GINs to be expensive anyway (see full-text indexing). I am not invested in jsonpath; I just cited it as an example of using a single function call to do a nested search. Obviously `json_get(json, variadic text)` allows this, too, and could potentially use a GIN index of a JSON tree to perform the variadic text search at some point in the future, yes? I'm sorry what I have offered isn't what you want, but plenty of other people have told me it will go a long way toward meeting their needs. *Sigh.* I guess I have not been clear. The stuff you propose is *awesome*. I love it. The syntax with the chaining operators warms my heart, and I can’t wait to make *extensive* use of it in my procedural code. Maybe I would never *need* to do column queries of JSON contents often enough to require an expensive index. So I'm happy with this stuff, as long as it does not get in the way of supporting indexing at some point in the future. I can’t wait to start using it! Best, David
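A hypothetical sketch of the nested-accessor style under discussion. Only `json_get` and the general idea of chaining operators appear in the thread; the exact operator spelling and the behavior shown here are assumptions, not the committed API:

```sql
-- Fetch a nested value one key at a time with the variadic accessor:
SELECT json_get('{"a": {"b": [10, 20]}}'::json, 'a', 'b', '1');

-- The same lookup in the chained-operator style David mentions:
SELECT '{"a": {"b": [10, 20]}}'::json -> 'a' -> 'b' -> 1;
```

Either form could, in principle, later be backed by an index over extracted paths, which is the indexing question raised in this subthread.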
Re: [HACKERS] json accessors
*) xmlpath/jsonpath do searching (and decomposition) but are very clunky from a SQL perspective and probably a no-go in terms of GiST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first Why is it a no-go for GiST? Ltree works, doesn't it? If we only support equality lookups, in what way is a JSON doc different from a collection of ltree rows? We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Dumping an Extension's Script
On Wed, Dec 5, 2012 at 12:47 PM, Andres Freund and...@2ndquadrant.com wrote: For me it seems pretty natural to support dumping extensions the way they got created. I.e. a plain CREATE EXTENSION ...; if the extension was preinstalled, and some form that includes the extension source if you installed it via the connection. Extensions that were installed in some global manner *should* not be dumped with ones installed over the connection. E.g. dumping /contrib or packaged modules seems to be a bad idea to me. That would possibly be useful as a debugging tool, but I don't see much point besides that. I agree with all of that. What I can't quite figure out is - AIUI, extensions are a way of bundling shared libraries with SQL scripts, and a way of managing the dump and restore process. If you just have SQL, there's no bundling to do, and if you reverse out the pg_dump changes (which is more or less what's being proposed here), then what do you have left other than the good feeling of being part of an extension? At that point, it seems to me that you've gone to a lot of work to add a layer of packaging that serves no real purpose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] autovacuum truncate exclusive lock round two
On Wed, Dec 5, 2012 at 11:24 AM, Kevin Grittner kgri...@mail.com wrote: Robert Haas wrote: Since people *already* raise deadlock_timeout to obscenely high values (a minute? an hour???) and then complain that things blow up in their face, I think there's a decent argument to be made that piggybacking anything else on that setting is unwise. If people are really doing that, then I tend to agree. I wasn't aware of that practice. It's probably not quite common enough to be called a practice, but I have encountered it a number of times in support situations. Alas, I no longer remember the details of exactly what misery it caused, but I do remember it wasn't good. :-) Against that, FWICT, this problem only affects a small number of users: Jan is the only person I can ever remember reporting this issue. I'm not dumb enough to think he's the only person who it affects; but my current belief is that it's not an enormously common problem. So the main argument I can see against adding a GUC is that the problem is too marginal to justify a setting of its own. What I really see as the key issue is: suppose we hardcode this to say 2 seconds. Is that going to fix the problem effectively for 99% of the people who have this problem, or for 25% of the people who have this problem? In the former case, we probably don't need a GUC; in the latter case, we probably do. Given the fact that autovacuum will keep throwing workers at it to essentially loop indefinitely at an outer level, I don't think the exact setting of this interval is all that critical either. My gut feel is that anything in the 2 second to 5 second range would be sane, so I won't argue over any explicit setting within that range. 
Below that, I think the overhead of autovacuum coming back to the table repeatedly would probably start to get too high; below that we could be causing some small, heavily-updated table to be neglected by autovacuum -- especially if you get multiple autovacuum workers tied up in this delay on different tables at the same time. I think that part of what's tricky here is that the dynamics of this problem depend heavily on table size. I handled one support case where lowering autovacuum_naptime to 15s was an indispensable part of the solution, so in that case having an autovacuum worker retry for more than a few seconds sounds kind of insane. OTOH, that case involved a small, rapidly changing table. If you've got an enormous table where vacuum takes an hour to chug through all of it, abandoning the effort to truncate the table after a handful of seconds might sound equally insane. Maybe it'd be sensible to relate the retry time to the time spent vacuuming the table. Say, if the amount of time spent retrying exceeds 10% of the time spent vacuuming the table, with a minimum of 1s and a maximum of 1min, give up. That way, big tables will get a little more leeway than small tables, which is probably appropriate. Regarding how many people are affected, I have seen several reports of situations where users claim massive impact on performance when autovacuum kicks in. The reports have not included enough detail to quantify the impact or in most cases to establish a cause, but this seems like it could have a noticeable impact, especially if the deadlock timeout was set to more than a second. Yeah, I agree this could be a cause of those types of reports, but I don't have any concrete evidence that any of the cases I've worked were actually due to this specific issue. The most recent case of this type I worked on was due to I/O saturation - which, since it happened to be EC2, really meant network saturation.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
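Robert's suggested cap above (10% of the vacuum runtime, clamped to between 1 second and 1 minute) can be sketched as a SQL expression; the specific numbers are just the ones floated in the thread, not a settled design:

```sql
-- Retry budget = clamp(0.10 * vacuum runtime, 1s, 60s).
-- A 30-minute vacuum hits the 60-second ceiling here.
SELECT least(interval '60 seconds',
             greatest(interval '1 second', vacuum_runtime * 0.10)) AS retry_budget
FROM (VALUES (interval '30 minutes')) AS t(vacuum_runtime);
```

A small table whose vacuum took 5 seconds would get the 1-second floor instead, so autovacuum moves on quickly, which matches the leeway argument above.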
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 13:50:27 -0500, Robert Haas wrote: On Wed, Dec 5, 2012 at 12:47 PM, Andres Freund and...@2ndquadrant.com wrote: For me it seems pretty natural to support dumping extensions the way they got created. I.e. a plain CREATE EXTENSION ...; if the extension was preinstalled, and some form that includes the extension source if you installed it via the connection. Extensions that were installed in some global manner *should* not be dumped with ones installed over the connection. E.g. dumping /contrib or packaged modules seems to be a bad idea to me. That would possibly be useful as a debugging tool, but I don't see much point besides that. I agree with all of that. What I can't quite figure out is - AIUI, extensions are a way of bundling shared libraries with SQL scripts, and a way of managing the dump and restore process. If you just have SQL, there's no bundling to do, and if you reverse out the pg_dump changes (which is more or less what's being proposed here), then what do you have left other than the good feeling of being part of an extension? At that point, it seems to me that you've gone to a lot of work to add a layer of packaging that serves no real purpose. Manageability. E.g. for years I had a set of (trigger) functions to count the number of rows in a table in a lockless manner. That's used in 10+ applications of former clients of mine. All (plpg)sql. Imagine I want to ship an updated version that 1. removes some *internal* functions, 2. adds some internal function. 3. adds a new *external* function. Now most of the clients use completely different development models and completely different ways of managing upgrades. I needed to integrate my teensy module into all of them. If we had a way to package it nicely they could just upload the extension inside their own workflows and I (or they) would be freed from integrating foreign update scripts into their workflow.
Imagine embedding a PGXN module into your application which is used on many servers and doesn't need superuser privileges or anything. Same thing. That's not something all that uncommon, is it? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
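The lockless row-counting trigger Andres describes can be sketched roughly as follows. All names here are hypothetical; the idea is that each transaction inserts a delta row rather than updating a single counter row, so writers never contend on one tuple:

```sql
-- Delta table: one row per counted change, summed on read.
CREATE TABLE rowcount_deltas (
    tablename text   NOT NULL,
    delta     bigint NOT NULL
);

CREATE OR REPLACE FUNCTION count_rows() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO rowcount_deltas VALUES (TG_TABLE_NAME, 1);
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO rowcount_deltas VALUES (TG_TABLE_NAME, -1);
    END IF;
    RETURN NULL;  -- AFTER row trigger; return value is ignored
END;
$$;

-- Attach per table, e.g.:
--   CREATE TRIGGER mytable_count AFTER INSERT OR DELETE ON mytable
--       FOR EACH ROW EXECUTE PROCEDURE count_rows();
-- Read the count by aggregating the deltas:
--   SELECT sum(delta) FROM rowcount_deltas WHERE tablename = 'mytable';
```

A periodic job would normally collapse old delta rows into one summary row to keep reads cheap; that maintenance step is omitted here.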
Re: [HACKERS] Dumping an Extension's Script
On Wed, Dec 5, 2012 at 2:01 PM, Andres Freund and...@2ndquadrant.com wrote: E.g. for years I had a set of (trigger) functions to count the number of rows in a table in a lockless manner. That's used in 10+ applications of former clients of mine. All (plpg)sql. Imagine I want to ship an updated version that 1. removes some *internal* functions, 2. adds some internal function. 3. adds a new *external* function. Now most of the clients use completely different development models and completely different ways of managing upgrades. I needed to integrate my teensy module into all of them. If we had a way to package it nicely they could just upload the extension inside their own workflows and I (or they) would be freed from integrating foreign update scripts into their workflow. OK, but let me play devil's advocate here. Under the status quo, if they used loose database objects, they would need to execute some database code that does this:

DROP FUNCTION internalfunc1(int);
CREATE FUNCTION internalfunc2(int);
CREATE FUNCTION externalfunc3(int);

IIUC, under this proposal, the client would instead need to execute some SQL code that looks something like this (I'm faking the syntax here, forgive me, but the patch doesn't seem to contemplate ALTER):

ALTER EXTENSION myextension UPDATE TO 1.1 USING SCRIPT $$
    ALTER EXTENSION myextension DROP FUNCTION internalfunc1(int);
    DROP FUNCTION internalfunc1(int);
    CREATE FUNCTION internalfunc2(int);
    ALTER EXTENSION myextension ADD FUNCTION internalfunc2(int);
    CREATE FUNCTION externalfunc3(int);
    ALTER EXTENSION myextension ADD FUNCTION externalfunc3(int);
$$;

That doesn't really look like an improvement to me. What am I missing? Imagine embedding a PGXN module into your application which is used on many servers and doesn't need superuser privileges or anything. Same thing. That's not something all that uncommon, is it? Not at all. I'm not questioning the use case at all; I'm questioning whether extensions are the right tool for addressing it.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On Wed, Dec 5, 2012 at 1:41 PM, Josh Berkus j...@agliodbs.com wrote: That's why we need a mechanism which either logs, or aborts on specific actions. From the perspective of the DevOps staff, abort is possibly the better option, but there may be issues with it on our end. That was the attraction of the original NOREWRITE patch, although as I said that suffers from new keywords and a total lack of extensibility. You know, event triggers seem like an awfully good solution to this problem. All we'd need is a new event called table_rewrite: CREATE EVENT TRIGGER my_event_trigger ON table_rewrite EXECUTE PROCEDURE consider_whether_to_throw_an_error(); -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
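Robert's sketch could be fleshed out roughly as below. Note this is hypothetical at the point of this thread: no table_rewrite event existed yet (one was later added in PostgreSQL 9.5), and the policy function body here is purely illustrative:

```sql
-- Policy function: abort any DDL that would rewrite a table.
CREATE OR REPLACE FUNCTION consider_whether_to_throw_an_error()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE EXCEPTION 'table rewrites are forbidden by policy';
END;
$$;

CREATE EVENT TRIGGER my_event_trigger
    ON table_rewrite
    EXECUTE PROCEDURE consider_whether_to_throw_an_error();
```

This would give the log-or-abort behavior Josh asks for without new keywords: logging instead of aborting is just RAISE WARNING instead of RAISE EXCEPTION.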
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 14:10:34 -0500, Robert Haas wrote: On Wed, Dec 5, 2012 at 2:01 PM, Andres Freund and...@2ndquadrant.com wrote: E.g. for years I had a set of (trigger) functions to count the number of rows in a table in a lockless manner. That's used in 10+ applications of former clients of mine. All (plpg)sql. Imagine I want to ship an updated version that 1. removes some *internal* functions, 2. adds some internal function. 3. adds a new *external* function. Now most of the clients use completely different development models and completely different ways of managing upgrades. I needed to integrate my teensy module into all of them. If we had a way to package it nicely they could just upload the extension inside their own workflows and I (or they) would be freed from integrating foreign update scripts into their workflow. OK, but let me play devil's advocate here. Under the status quo, if they used loose database objects, they would need to execute some database code that does this:

DROP FUNCTION internalfunc1(int);
CREATE FUNCTION internalfunc2(int);
CREATE FUNCTION externalfunc3(int);

They would need to do exactly that if their database had version 1.1 and they upgrade to 1.3, but not if they already had 1.2... IIUC, under this proposal, the client would instead need to execute some SQL code that looks something like this (I'm faking the syntax here, forgive me, but the patch doesn't seem to contemplate ALTER):

ALTER EXTENSION myextension UPDATE TO 1.1 USING SCRIPT $$
    ALTER EXTENSION myextension DROP FUNCTION internalfunc1(int);
    DROP FUNCTION internalfunc1(int);
    CREATE FUNCTION internalfunc2(int);
    ALTER EXTENSION myextension ADD FUNCTION internalfunc2(int);
    CREATE FUNCTION externalfunc3(int);
    ALTER EXTENSION myextension ADD FUNCTION externalfunc3(int);
$$;

That doesn't really look like an improvement to me. What am I missing? They should be able to simply slurp the extension from a file, possibly even install it outside their own update mechanism.
Given that you don't know which version was installed beforehand, that's not really possible without some infrastructure. And they should be able to drop the extension again afterwards without it leaving a trace. Nearly everything I have seen out there fails at that, and the extension mechanism provides tracking of that. Imagine embedding a PGXN module into your application which is used on many servers and doesn't need superuser privileges or anything. Same thing. That's not something all that uncommon, is it? Not at all. I'm not questioning the use case at all; I'm questioning whether extensions are the right tool for addressing it. Do you have some alternative suggestion? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Switching timeline over streaming replication
On 05.12.2012 14:32, Amit Kapila wrote: On Tuesday, December 04, 2012 10:01 PM Heikki Linnakangas wrote: After some diversions to fix bugs and refactor existing code, I've committed a couple of small parts of this patch, which just add some sanity checks to notice incorrect PITR scenarios. Here's a new version of the main patch based on current HEAD. After testing with the new patch, the following problems are observed. Defect - 1: 1. start primary A 2. start standby B following A 3. start cascade standby C following B. 4. start another standby D following C. 5. Promote standby B. 6. After successful time line switch in cascade standby C D, stop D. 7. Restart D, Startup is successful and connecting to standby C. 8. Stop C. 9. Restart C, startup is failing. Ok, the error I get in that scenario is: C 2012-12-05 19:55:43.840 EET 9283 FATAL: requested timeline 2 does not contain minimum recovery point 0/3023F08 on timeline 1 C 2012-12-05 19:55:43.841 EET 9282 LOG: startup process (PID 9283) exited with exit code 1 C 2012-12-05 19:55:43.841 EET 9282 LOG: aborting startup due to startup process failure It seems that the commits I made to master already: http://archives.postgresql.org/pgsql-committers/2012-12/msg00116.php http://archives.postgresql.org/pgsql-committers/2012-12/msg00111.php were a few bricks shy of a load. The problem is that if recovery stops at a checkpoint record that changes timeline, so that minRecoveryPoint points to the end of the checkpoint record, we still record the old TLI as the TLI of minRecoveryPoint. This is because 1) there's a silly bug in the patch; replayEndTLI is not updated along with replayEndRecPtr. But even after fixing that, we're not good. The problem is that replayEndRecPtr is currently updated *before* calling the redo function. replayEndRecPtr is what becomes minRecoveryPoint when XLogFlush() is called. 
If the record is a checkpoint record, redoing it will switch recovery to the new timeline, but replayEndTLI will not be updated until the next record. IOW, as far as minRecoveryPoint is concerned, a checkpoint record that switches timeline is considered to be part of the old timeline. But when a server is promoted and a new timeline is created, the checkpoint record is considered to be part of the new timeline; that's what we write in the page header and in the control file. That mismatch causes the error. I'd like to fix this by always treating the checkpoint record to be part of the new timeline. That feels more correct. The most straightforward way to implement that would be to peek at the xlog record before updating replayEndRecPtr and replayEndTLI. If it's a checkpoint record that changes TLI, set replayEndTLI to the new timeline before calling the redo-function. But it's a bit of a modularity violation to peek into the record like that. Or we could just revert the sanity check at beginning of recovery that throws the requested timeline 2 does not contain minimum recovery point 0/3023F08 on timeline 1 error. The error I added to redo of checkpoint record that says unexpected timeline ID %u in checkpoint record, before reaching minimum recovery point %X/%X on timeline %u checks basically the same thing, but at a later stage. However, the way minRecoveryPointTLI is updated still seems wrong to me, so I'd like to fix that. I'm thinking of something like the attached (with some more comments before committing). Thoughts? 
- Heikki

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 702ea7c..bdae7a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5822,6 +5822,7 @@ StartupXLOG(void)
 			 */
 			do
 			{
+				TimeLineID	EndTLI;
 #ifdef WAL_DEBUG
 				if (XLOG_DEBUG ||
 					(rmid == RM_XACT_ID && trace_recovery_messages <= DEBUG2) ||
@@ -5895,8 +5896,20 @@ StartupXLOG(void)
 				 * Update shared replayEndRecPtr before replaying this record,
 				 * so that XLogFlush will update minRecoveryPoint correctly.
 				 */
+				if (record->xl_rmid == RM_XLOG_ID &&
+					(record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN)
+				{
+					CheckPoint	checkPoint;
+
+					memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
+					EndTLI = checkPoint.ThisTimeLineID;
+				}
+				else
+					EndTLI = ThisTimeLineID;
+
 				SpinLockAcquire(&xlogctl->info_lck);
 				xlogctl->replayEndRecPtr = EndRecPtr;
+				xlogctl->replayEndTLI = EndTLI;
 				recoveryPause = xlogctl->recoveryPause;
 				SpinLockRelease(&xlogctl->info_lck);
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 12:49 PM, Josh Berkus j...@agliodbs.com wrote: *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from a SQL perspective and probably a no-go in terms of GiST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first Why is it a no-go for GiST? Ltree works, doesn't it? If we only support equality lookups, in what way is a JSON doc different from a collection of ltree rows? We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. huh -- good point. xpath at least is quite complicated and likely impractical (albeit not impossible) to marry with GiST in a meaningful way. jsonpath (at least AIUI from here: http://code.google.com/p/json-path/) seems to be lighter weight, as are all things JSON when stacked up against XML. merlin
Re: [HACKERS] Dumping an Extension's Script
On Wed, Dec 5, 2012 at 2:19 PM, Andres Freund and...@2ndquadrant.com wrote: IIUC, under this proposal, the client would instead need to execute some SQL code that looks something like this (I'm faking the syntax here, forgive me, but the patch doesn't seem to contemplate ALTER):

ALTER EXTENSION myextension UPDATE TO 1.1 USING SCRIPT $$
ALTER EXTENSION myextension DROP FUNCTION internalfunc1(int);
DROP FUNCTION internalfunc1(int);
CREATE FUNCTION internalfunc2(int);
ALTER EXTENSION myextension ADD FUNCTION internalfunc2(int);
CREATE FUNCTION externalfunc3(int);
ALTER FUNCTION myextension ADD FUNCTION externalfunc3(int);
$$;

That doesn't really look like an improvement to me. What am I missing? They should be able to simply slurp the extension from a file, possibly even install it outside their own update mechanism. Given that you don't know which version was installed beforehand, that's not really possible without some infrastructure. And they should be able to drop the extension again afterwards without it leaving a trace. Nearly all I have seen out there fails at that, and the extension mechanism provides tracking of that. Ah, OK. Well, it sounds like this might be a decent fit for the TEMPLATE concept proposed upthread, then. I have no objection whatsoever to the concept of storing the SQL and control files somewhere that doesn't need access to the server filesystem - in fact, I think I previously proposed allowing those to be stored in a database table. You could do that with something like: CREATE TEMPLATE yadda; ALTER TEMPLATE yadda ADD FILE 'yadda--1.0.sql' CONTENT $$...$$; ...or whatever. And that'd be 100% fine with me, and it could dump and restore just that way, and life would be good. Or at least, it sounds to me like that would meet the requirements you are articulating without breaking anything that works today. In fact, it sounds pretty cool.
The root of my objections upthread, I think, is that the design this patch puts on the table seems to me to conflate the extension (which is always a database object) with the template (which is *currently* always a filesystem object). I think that's bound to create some problems. In the patch as it exists today, I think those problems are going to leak out in the form of breaking some of the things for which extensions can currently be used, but even if we address those points I have a sneaking suspicion that there will be others. For example, your point (in the portion of your email I'm not quoting here) about an upgrade across multiple versions is well-taken - you need a different script depending on the version that's currently installed. Fixing that, though, seems to require a catalog of upgrade scripts, so that the server can look at the installed version and the available scripts and decide how to proceed. That catalog currently takes the form of separate files in the filesystem, but I don't see any reason why we can't store it somewhere else. What I'm not sold on is the idea of shuttling it across as part of CREATE/ALTER EXTENSION statements. I'm willing to be persuaded, but right now I can't see how that's ever going to be either robust or convenient. Making it part of a separate SQL object type gets around that problem rather nicely, IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
You know, event triggers seem like an awfully good solution to this problem. All we'd need is a new event called table_rewrite: CREATE EVENT TRIGGER my_event_trigger ON table_rewrite EXECUTE PROCEDURE consider_whether_to_throw_an_error(); Oh, wow, that would be perfect. That also solves the problem of I don't care if I rewrite really_small_table, but I do care if I rewrite really_big_table. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
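[Editor's note: the table_rewrite event proposed above did not exist at the time of this thread, so the following is a purely illustrative sketch of how the use case Josh describes ("small tables OK, big tables not") might be expressed; the event name comes from Robert's proposal, and the size threshold and error policy are invented for the example.]

```sql
-- Hypothetical sketch: assumes a table_rewrite event as proposed upthread.
CREATE FUNCTION consider_whether_to_throw_an_error()
RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
    -- Refuse the rewrite if any really big table would be rewritten;
    -- the 10 GB threshold is an arbitrary example value.
    IF EXISTS (SELECT 1 FROM pg_class
               WHERE relname = 'really_big_table'
                 AND pg_total_relation_size(oid) > 10 * 1024^3) THEN
        RAISE EXCEPTION 'refusing table rewrite of really_big_table';
    END IF;
END;
$$;

CREATE EVENT TRIGGER my_event_trigger
    ON table_rewrite
    EXECUTE PROCEDURE consider_whether_to_throw_an_error();
```

With this shape, sites that only want a warning instead of an abort could RAISE WARNING rather than EXCEPTION, which addresses the log-or-abort question raised earlier in the thread.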
Re: [HACKERS] json accessors
On 2012-12-05 10:49:35 -0800, Josh Berkus wrote: *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from sql perspective and probably absolutely nogo in terms of GIST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first Why is it a nogo for GiST? Ltree works, doesn't it? If we only support equality lookups in what way is a JSON doc different from a collection of ltree rows? The space requirements for the paths are quite different. It's not that hard to build indexing support, it's hard to build efficient support. The more you hide from postgres (i.e. behind a single very complex operator/function) the harder it is for the planner to detect whether your expression is indexable or not. We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. I don't immediately see why SP-Gist would be beneficial. What kind of access structure do you have in mind? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] json accessors
On 12/05/2012 01:49 PM, Josh Berkus wrote: *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from sql perspective and probably absolutely nogo in terms of GIST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first Why is it a nogo for GiST? Ltree works, doesn't it? If we only support equality lookups in what way is a JSON doc different from a collection of ltree rows? We'd probably want to use SP-GiST for better index size/performance, but I don't see that this is impossible. Just some difficult code. The set of paths for a single json datum can be huge, as opposed to one for a single ltree datum. That strikes me as a serious barrier. In any case, nobody I know of is even offering to do this - when they do we can look at the design. Until then I'm assuming nothing. cheers andrew
Re: [HACKERS] json accessors
On 12/05/2012 01:48 PM, David E. Wheeler wrote: I'm sorry what I have offered isn't what you want, but plenty of other people have told me it will go a long way toward meeting their needs. *Sigh.* I guess I have not been clear. The stuff you propose is *awesome*. I love it. The syntax with the chaining operators warms my heart, and I can’t wait to make *extensive* use of it in my procedural code. Maybe I would never *need* to do column queries of JSON contents often enough to require an expensive index. OK, sorry if I misunderstood. I guess I'm trying pretty hard to concentrate on what can be accomplished now, and other people are talking about blue sky possibilities. So I'm happy with this stuff, as long as it does not get in the way of supporting indexing at some point in the future. I can’t wait to start using it! I don't see why it should get in the way of anything like that. If anything, the parser design changes I have proposed should make later development much easier. cheers andrew
Re: [HACKERS] json accessors
On Wed, Dec 5, 2012 at 12:42 PM, David E. Wheeler da...@justatheory.com wrote: On Dec 5, 2012, at 9:57 AM, Merlin Moncure mmonc...@gmail.com wrote: Indexing large documents for fancy querying is a niche case but also quite complex. This isn't very well covered by xmlpath either btw -- I think for inspiration we should be looking at hstore. Agreed, although hstore, IIRC, does not support nesting. That said, how would you do that? The first thing that jumps into my mind is to cut right to the chase: Maybe the semantics could be defined so that implementing haystack @> needle would reasonably cover most cases. Yes. So my takeaways are: *) decomposition != precise searching. andrew's api handles the former and stands on its own merits. Agreed. *) xmlpath/jsonpath do searching (and decomposition) but are very clunky from sql perspective and probably absolutely nogo in terms of GIST/GIN. postgres spiritually wants to do things via operators and we should (if possible) at least consider that first I don't understand how xmlpath/jsonpath is not able to be implemented with operators. yeah -- i phrased that badly -- by 'operators' I meant that on both sides would be json document with absolute minimum fanciness such as wildcards and predicate matches. basically, 'overlaps' and (especially) 'contains'. merlin
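[Editor's note: no json containment operator exists as of this thread; the following sketch only illustrates what Merlin's 'contains' suggestion, modeled on hstore's @> operator, might look like. The operator, its semantics, and the table are invented for the example.]

```sql
-- Purely illustrative: a 'contains' operator for json, in the style of
-- hstore's @>.  The left operand (haystack) would contain the right
-- operand (needle) if every key/path in the needle matches.
SELECT '{"a": {"b": 1}, "c": 2}'::json @> '{"a": {"b": 1}}'::json;

-- The attraction of a single simple operator is that the planner can
-- readily match it against an index on the column, e.g.:
-- CREATE INDEX ON docs USING gin (body);
-- SELECT * FROM docs WHERE body @> '{"status": "open"}';
```

This is the "minimum fanciness" shape Merlin is arguing for: equality-style containment rather than full jsonpath expressions, precisely because it is the kind of expression GiST/GIN support can see through.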
Re: [HACKERS] json accessors
On Dec 5, 2012, at 11:51 AM, Andrew Dunstan and...@dunslane.net wrote: So I'm happy with this stuff, as long as it does not get in the way of supporting indexing at some point in the future. I can’t wait to start using it! I don't see why it should get in the way of anything like that. If anything, the parser design changes I have proposed should make later development much easier. Awesome, thanks! David
Re: [HACKERS] Dumping an Extension's Script
Robert Haas escribió: I have no objection whatsoever to the concept of storing the SQL and control files somewhere that doesn't need access to the server filesystem - in fact, I think I previously proposed allowing those to be stored in a database table. You could do that with something like: CREATE TEMPLATE yadda; ALTER TEMPLATE yadda ADD FILE 'yadda--1.0.sql' CONTENT $$...$$; ...or whatever. This seems unnecessary to me. What the patch at hand does is take the file (actually, the contents of the file) and execute it directly, without installing anything on disk. The precise contents of the extension is still tracked through pg_depend, so you can drop it without having previously saved either the control file or the SQL script. (In fact, that's how DROP EXTENSION works currently.) There's also the pg_dump side of things; with your proposal we would be forced to move over the yadda--1.0.sql file from the old server to the new one; or, equivalently, put the whole ALTER TEMPLATE .. CONTENT command in the dump, which is equivalent to what Dimitri's patch does; so there doesn't seem to be a point. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
Andres Freund and...@2ndquadrant.com writes: Maybe I am missing something, but you already can separate them per major version. You co-wrote the debian infrastructure to do so for some debian packages, so I am not sure what you mean here. The debian infrastructure I've helped build is all about compiling an extension source package once and having as many binary artefacts as you have major versions of PostgreSQL lying around. So yes in debian you can actually install such extensions at different on disk locations per major version. Sorry for the confusion. Adding some *NON WRITABLE* per-cluster library directory doesn't seem to be as controversial as other suggestions. Well, it means a per-initdb (user driven) location where to store binary files, ask Tom what he thinks about that with his Red Hat Packager… Hat. On 2012-12-05 13:18:16 -0500, Tom Lane wrote: I think you're wasting your time to imagine that that case will ever be fixed. Allowing the server to scribble on executable files would set off all kinds of security alarm bells, and rightly so. If Postgres ever did ship with such a thing, I rather imagine that I'd be required to patch it out of Red Hat releases (not that SELinux wouldn't prevent it from happening anyway). That part I did understand. I still can't be happy about it, but I won't get back with any proposal where that's put into question. That said, while you're talking about it, what if it's an opt-in GUC? I do see an argument for allowing SQL-only extensions to be installed this way, since that doesn't allow the execution of anything the user couldn't execute anyway. There's no need to worry about anything except control and script files though. […please make sure you're not drinking (coffee) before reading further…] Now if we can't fix the executable files situation, what about making the C coded extensions not require an executable anymore?
I'm thinking about studying what it would take exactly to write a PL/C where the PostgreSQL backend would basically compile the embedded C code at CREATE FUNCTION time and store bytecode or binary in the probin column. I've stumbled across more than one dynamic code or retargetable compiler thing already, and several of those even have compatible licences. Maybe the most promising ones are PL/LLVM or embedding the QEMU code transformation code (a fork of the tcc compiler). So, we're talking about a PL/C language, in-core or extension, where you could define say hstore without shipping any executable binary. Yeah, I'm crazy that way. Now I'll get back to the main thread… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 21:16:52 +0100, Dimitri Fontaine wrote: Andres Freund and...@2ndquadrant.com writes: Adding some *NON WRITABLE* per-cluster library directory doesn't seem to be as controversial as other suggestions. Well, it means a per-initdb (user driven) location where to store binary files, ask Tom what he thinks about that with his Red Hat Packager… Hat. I think it might be different if the directory is non-writable and cannot be made writable by the user running postgres. I do see an argument for allowing SQL-only extensions to be installed this way, since that doesn't allow the execution of anything the user couldn't execute anyway. There's no need to worry about anything except control and script files though. […please make sure you're not drinking (coffee) before reading further…] Now if we can't fix the executable files situation, what about making the C coded extensions not require an executable anymore? I'm thinking about studying what it would take exactly to write a PL/C where the PostgreSQL backend would basically compile the embedded C code at CREATE FUNCTION time and store bytecode or binary in the probin column. So, we're talking about a PL/C language, in-core or extension, where you could define say hstore without shipping any executable binary. Yeah, I'm crazy that way. Now I'll get back to the main thread… Imo that's not a sensible thing to pursue. It would seriously shorten the effort needed to run user-provided code. Yes, you can execute unrestricted perl, python whatever already but with selinux and similar things in place that won't allow you to run your own binaries. And currently execute-or-write protection will prevent you from executing compiled code. So you would have to disable that as well...
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
So, lots of things are being said, most of them are really interesting and some of them are just re-hashing what we said about a year ago in the Inline Extensions thread, whose conclusion was that the key not to have two different beasts (inline, file based) was for pg_dump to process them all the same way. Meanwhile, I think I need to address that reaction first: Robert Haas robertmh...@gmail.com writes: What I can't quite figure out is - AIUI, extensions are a way of bundling shared libraries with SQL scripts, and a way of managing the dump and restore process. Not quite. Extensions are user defined CASCADE support. It's all about pg_depend extensibility. Extensions are a way to manage dependencies of SQL objects in a way that allow you to manage them as a single entity. Now you can CREATE EXTENSION, ALTER EXTENSION and DROP EXTENSION, and all you're doing is managing a bunch of SQL objects at once. The fact that it allows us to implement a working dump & restore of the aboriginal extensions called contribs has been the first step, not the whole goal. You will notice that there's nothing in the whole extension patch and framework that refers to a module, those little executable binary files whose name depend on the OS PostgreSQL is running on, and that you manage at the file system level, in postgresql.conf with some GUCs, and with the LOAD command. You will also notice that we have been *very* careful not to taint any extension related SQL command with the notion of files. That part is well separated away and only meant to be known by extension authors and packagers, not by mere mortals such as DBAs or users. The current patch is willing to push that a little further away, making it optional even to extension authors. Those two facts didn't just happen. And I was not alone in designing the system that way. Let's continue the design and its implementation!
:) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
On 5 December 2012 19:15, Robert Haas robertmh...@gmail.com wrote: On Wed, Dec 5, 2012 at 1:41 PM, Josh Berkus j...@agliodbs.com wrote: That's why we need a mechanism which either logs, or aborts on specific actions. From the perspective of the DevOps staff, abort is possibly the better option, but there may be issues with it on our end. That was the attraction of the original NOREWRITE patch, although as I said that suffers from new keywords and a total lack of extensibility. You know, event triggers seem like an awfully good solution to this problem. All we'd need is a new event called table_rewrite: CREATE EVENT TRIGGER my_event_trigger ON table_rewrite EXECUTE PROCEDURE consider_whether_to_throw_an_error(); +1, I was just thinking that myself. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Dumping an Extension's Script
Andres Freund escribió: On 2012-12-05 21:16:52 +0100, Dimitri Fontaine wrote: Now if we can't fix the executable files situation, what about making the C coded extensions not require an executable anymore? I'm thinking about studying what it would take exactly to write a PL/C where the PostgreSQL backend would basically compile the embedded C code at CREATE FUNCTION time and store bytecode or binary in the probin column. Imo that's not a sensible thing to pursue. +1. Certainly a pg_dump patch's thread is not the place to propose it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] why can't plpgsql return a row-expression?
Asif Rehman asifr.reh...@gmail.com writes: Here is the updated patch. I overlooked the loop, checking to free the conversions map. Here are the results now. I looked at this patch briefly. It seems to me to be completely schizophrenic about the coercion rules. This bit:

-		if (atttypid != att->atttypid ||
-			(atttypmod != att->atttypmod && atttypmod >= 0))
+		if ((atttypmod != att->atttypmod && atttypmod >= 0) ||
+			!can_coerce_type(1, &atttypid, &(att->atttypid), COERCE_IMPLICIT_CAST))

says that we'll allow column types to differ if there is an implicit cast from the source to the target (or at least I think that's what's intended, although it's got the source and target backwards). Fine, but then why don't we use the cast machinery to do the conversion? This is taking plpgsql's cowboy let's-coerce-via-IO-no-matter-whether-that's-right-or-not approach and sticking it into a core part of the system. There's no guarantee at all that applying typoutput then typinput will match the conversion semantics you get from an actual cast, and in many practical cases such as int4 to int8 it'll be drastically less efficient anyway. It would make more sense to do something similar to coerce_record_to_complex(), that is modify the expression tree to coerce the columns using the regular cast machinery. Also, the typmod part of the test seems completely broken. For one thing, comparing typmods isn't sane if the types themselves aren't the same. And it's quite unclear to me why we'd want to have an anything-goes policy for type coercion, or even an implicit-casts-only policy, but then insist that the typmods match exactly. This coding will allow varchar(42) to text, but not varchar(42) to varchar(43) ... where's the sense in that? The patch also seems to go a great deal further than what was asked for originally, or indeed is mentioned in the documentation patch, namely fixing the restriction on RETURN to allow composite-typed expressions.
Specifically it's changing the code that accepts composite input arguments. Do we actually want that? If so, shouldn't it be documented? I'm inclined to suggest that we drop all the coercion stuff and just do what Robert actually asked for originally, which was the mere ability to return a composite value as long as it matched the function's result type. I'm not convinced that we want automatic implicit type coercions here. In any case I'm very much against sticking such a thing into general-purpose support code like tupconvert.c. That will create a strong likelihood that plpgsql's poorly-designed coercion semantics will leak into other aspects of the system. regards, tom lane
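[Editor's note: for readers following the thread, the narrower behavior Tom endorses — returning a composite value whose type already matches the function's declared result type, with no coercion at all — can be sketched as below. The function and its OUT parameters are invented for illustration; the patch under review may behave differently.]

```sql
-- Sketch of the originally requested capability: RETURN accepts a
-- record variable whose row type matches the declared result exactly,
-- so no implicit casts (and no I/O coercion) are ever needed.
CREATE FUNCTION make_pair(OUT a int, OUT b int)
RETURNS record LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    r := (10, 20);   -- (int, int): matches (a int, b int) column for column
    RETURN r;        -- previously disallowed; the minimal fix allows this
END;
$$;

SELECT * FROM make_pair();
```

Anything beyond an exact type match — say assigning the result to (a numeric, b numeric) — is precisely the coercion territory where the thread shows wrong results and crashes, which is why Tom suggests dropping it.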
Re: [HACKERS] Dumping an Extension's Script
Alvaro Herrera alvhe...@2ndquadrant.com writes: +1. Certainly a pg_dump patch's thread is not the place to propose it. Sure. Sorry about that, the goal of that previous message was to let people come to understand better my whole vision of what is an Extension, a contrib, and where we are and what I wanted us to build. I refined those ideas in another email though, so you can safely ignore this sub-thread. I'll get back to the question of storing .so in a per-database location with an opt-in GUC later, when appropriate. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Dumping an Extension's Script
Andres Freund and...@2ndquadrant.com writes: On 2012-12-05 21:16:52 +0100, Dimitri Fontaine wrote: Now if we can't fix the executable files situation, what about making the C coded extensions not require an executable anymore? I'm thinking about studying what it would take exactly to write a PL/C where the PostgreSQL backend would basically compile the embedded C code at CREATE FUNCTION time and store bytecode or binary in the probin column. So, we're talking about a PL/C language, in-core or extension, where you could define say hstore without shipping any executable binary. Yeah, I'm crazy that way. Now I'll get back to the main thread… Imo that's not a sensible thing to pursue. That would be another thing that Red Hat would refuse to ship, as would any other distro with an ounce of concern about security. But in any case there's no way that we could implement it portably. regards, tom lane
Re: [HACKERS] Dumping an Extension's Script
Dimitri Fontaine dimi...@2ndquadrant.fr writes: On 2012-12-05 13:18:16 -0500, Tom Lane wrote: I think you're wasting your time to imagine that that case will ever be fixed. Allowing the server to scribble on executable files would set off all kinds of security alarm bells, and rightly so. If Postgres ever did ship with such a thing, I rather imagine that I'd be required to patch it out of Red Hat releases (not that SELinux wouldn't prevent it from happening anyway). That part I did understand. I still can't be happy about it, but I won't get back with any proposal where that's put into question. That said, while you're talking about it, what if it's an opt-in GUC? GUC or no GUC, it'd still be letting an unprivileged network-exposed application (PG) do something that's against any sane system-level security policy. Lipstick is not gonna help this pig. regards, tom lane
[HACKERS] strange isolation test buildfarm failure on guaibasaurus
Hi all! http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=guaibasaurus&dt=2012-12-05%2016%3A17%3A01 seems like a rather odd failure in the isolation test (client) Stefan
Re: [HACKERS] Review: Extra Daemons / bgworker
On 2012-12-05 12:09:58 -0300, Alvaro Herrera wrote: Here's a first attempt at a new documentation chapter. This goes in part Server Programming, just after the SPI chapter. I just noticed that worker_spi could use some more sample code, for example auth_counter was getting its own LWLock and also its own shmem area, which would be helpful to demonstrate I think. I am not exactly renowned for my english skills, but I have made a pass over the file, made some slight changes, and extended it in two places. I've also added a comment with a question that came to my mind when reading the docs... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

--- /tmp/bgworker.old.sgml	2012-12-05 22:12:00.220609454 +0100
+++ /tmp/bgworker.sgml	2012-12-05 22:26:11.604712943 +0100
@@ -11,16 +11,16 @@
  <para>
   PostgreSQL can be extended to run user-supplied code in separate processes.
   Such processes are started, stopped and monitored by <command>postgres</command>,
-  which permits them have a lifetime closely linked to the server's status.
+  which permits them to have a lifetime closely linked to the server's status.
   These processes have the option to attach to <productname>PostgreSQL</>'s
-  shared memory area and connect to databases internally.
+  shared memory area and to connect to databases internally.
  </para>

  <warning>
   <para>
    There are considerable robustness and security risks in using background
-   worker processes, because them being written in the <literal>C</> language
-   gives them unrestricted access to data.  Administrators wishing to enable
+   worker processes because, being written in the <literal>C</> language,
+   they have unrestricted access to data.  Administrators wishing to enable
    modules that include background worker process should exercise extreme
    caution.  Only carefully audited modules should be permitted to run
    background worker processes.
@@ -31,7 +31,8 @@
   Only modules listed in <varname>shared_preload_libraries</> can run
   background workers.  A module wishing to register a background worker needs
   to register it by calling
-  <function>RegisterBackgroundWorker(<type>BackgroundWorker *worker</type>)</function>.
+  <function>RegisterBackgroundWorker(<type>BackgroundWorker *worker</type>)</function>
+  from its <function>_PG_init()</>.
   The structure <structname>BackgroundWorker</structname> is defined thus:
 <programlisting>
 typedef struct BackgroundWorker
@@ -50,12 +51,12 @@
  <para>
   <structfield>bgw_name</> is a string to be used in log messages, process
-  lists and similar contexts.
+  listings and similar contexts.
  </para>

  <para>
   <structfield>bgw_flags</> is a bitwise-or'd bitmask indicating the
-  capabilities that the module would like to have.  Possible values are
+  capabilities that the module wants.  Possible values are
   <literal>BGWORKER_SHMEM_ACCESS</literal> (requesting shared memory access)
   and <literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal> (requesting the
   ability to establish a database connection, through which it can later run
@@ -68,33 +69,35 @@
   <literal>BgWorkerStart_PostmasterStart</> (start as soon as
   <command>postgres</> itself has finished its own initialization; processes
   requesting this are not eligible for database connections),
-  <literal>BgWorkerStart_ConsistentState</> (start as soon as consistent state
+  <literal>BgWorkerStart_ConsistentState</> (start as soon as a consistent state
   has been reached in a HOT standby, allowing processes to connect to
   databases and run read-only queries), and
   <literal>BgWorkerStart_RecoveryFinished</> (start as soon as the system has
   entered normal read-write state).  Note the last two values are equivalent
   in a server that's not a HOT standby.
  </para>

  <para>
   <structfield>bgw_restart_time</structfield> is the interval, in seconds, that
-  <command>postgres</command> should wait before restarting the process,
-  in case it crashes.  It can be any positive value, or
-  <literal>BGW_NEVER_RESTART</literal>, indicating not to restart the
-  process in case of a crash.
+  <command>postgres</command> should wait before restarting the process, in
+  case it crashes.  It can be any positive value,
+  or <literal>BGW_NEVER_RESTART</literal>, indicating not to restart the
+  process in case of a crash.
  </para>

  <para>
   <structfield>bgw_main</structfield> is a pointer to the function to run once
   the process is started.  <structfield>bgw_main_arg</structfield> will be
-  passed to it as its only argument.  Note that
-  <literal>MyBgworkerEntry</literal> is a pointer to a copy of the
-  <structname>BackgroundWorker</structname> structure passed
-  at registration time.
+  passed to it as its only argument.  Note that the global variable
+  <literal>MyBgworkerEntry</literal> points to a copy of the
+  <structname>BackgroundWorker</structname> structure passed at registration
+  time.
  </para>

  <para>
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 16:20:41 -0500, Tom Lane wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: On 2012-12-05 13:18:16 -0500, Tom Lane wrote: I think you're wasting your time to imagine that that case will ever be fixed. Allowing the server to scribble on executable files would set off all kinds of security alarm bells, and rightly so. If Postgres ever did ship with such a thing, I rather imagine that I'd be required to patch it out of Red Hat releases (not that SELinux wouldn't prevent it from happening anyway). That part I did understand. I still can't be happy about it, but I won't get back with any proposal where that's put into question. That said, while you're talking about it, what if it's an opt-in GUC? GUC or no GUC, it'd still be letting an unprivileged network-exposed application (PG) do something that's against any sane system-level security policy. Lipstick is not gonna help this pig. What about the non-writable per cluster directory? That's something I've actively wished for in the past when developing a C module that's also used in other clusters. Greetings, Andres Freund
Re: [HACKERS] Dumping an Extension's Script
Andres Freund and...@anarazel.de writes:
> On 2012-12-05 16:20:41 -0500, Tom Lane wrote:
> > GUC or no GUC, it'd still be letting an unprivileged network-exposed
> > application (PG) do something that's against any sane system-level
> > security policy. Lipstick is not gonna help this pig.
>
> What about the non-writable per-cluster directory? That's something I've
> actively wished for in the past when developing a C module that's also
> used in other clusters.

I see no security objection to either per-cluster or per-database
script+control-file directories, as long as they can only contain SQL
scripts and not executable files. If we allow such things to be installed
by less-than-superusers, we'll have to think carefully about what
privileges are given when running the script. I forget at the moment how
much of that we already worked out back in the 9.1 era; I remember it was
discussed but not whether we had a bulletproof solution.

regards, tom lane
Re: [HACKERS] Review: Extra Daemons / bgworker
Andres Freund wrote:
> On 2012-12-05 12:09:58 -0300, Alvaro Herrera wrote:
> > Here's a first attempt at a new documentation chapter. This goes in
> > part Server Programming, just after the SPI chapter. I just noticed
> > that worker_spi could use some more sample code; for example,
> > auth_counter was getting its own LWLock and also its own shmem area,
> > which would be helpful to demonstrate, I think.
>
> I am not exactly renowned for my english skills, but I have made a pass
> over the file, made some slight changes, and extended it in two places.

Thanks, I have applied it.

> I've also added a comment with a question that came to my mind when
> reading the docs...
>
>     <para>
>      <structfield>bgw_sighup</structfield> and
>      <structfield>bgw_sigterm</structfield> are pointers to functions
>      that will be installed as signal handlers for the new
> -    process.
> +    process.  XXX: Can they be NULL?
>     </para>

Hm. The code doesn't check, so what happens is probably a bug anyhow. I
don't know whether sigaction crashes in this case; its manpage doesn't
say. I guess the right thing to do is have RegisterBackgroundWorker check
for a NULL sighandler, and set it to something standard if so (say SIG_IGN
for SIGHUP and maybe quickdie() or similar for SIGTERM).

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
On 5 December 2012 15:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Here's a first attempt at a new documentation chapter. This goes in
> part Server Programming, just after the SPI chapter. I just noticed
> that worker_spi could use some more sample code; for example,
> auth_counter was getting its own LWLock and also its own shmem area,
> which would be helpful to demonstrate, I think.

"to run once" -> "to run when"

Prefer BgWorkerStart_ConsistentState to be renamed to
BgWorkerRun_InHotStandby, and BgWorkerStart_RecoveryFinished to be renamed
to BgWorkerRun_InNormalMode.

Presumably a process will shut down if
(BgWorkerRun_InHotStandby && !BgWorkerRun_InNormalMode)

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Dumping an Extension's Script
On 2012-12-05 16:42:38 -0500, Tom Lane wrote:
> Andres Freund and...@anarazel.de writes:
> > On 2012-12-05 16:20:41 -0500, Tom Lane wrote:
> > > GUC or no GUC, it'd still be letting an unprivileged network-exposed
> > > application (PG) do something that's against any sane system-level
> > > security policy. Lipstick is not gonna help this pig.
> >
> > What about the non-writable per-cluster directory? That's something
> > I've actively wished for in the past when developing a C module that's
> > also used in other clusters.
>
> I see no security objection to either per-cluster or per-database
> script+control-file directories, as long as they can only contain SQL
> scripts and not executable files.

Well, I was explicitly talking about C code above. The question doesn't
really have to do too much with this thread, sorry. Given I am proposing
the directory to be explicitly read-only, and under permissions that don't
allow postgres to change it, it's not really suitable for this topic...

Greetings,

Andres Freund
Re: [HACKERS] Dumping an Extension's Script
On Wed, Dec 5, 2012 at 2:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Robert Haas wrote:
> > I have no objection whatsoever to the concept of storing the SQL and
> > control files somewhere that doesn't need access to the server
> > filesystem - in fact, I think I previously proposed allowing those to
> > be stored in a database table. You could do that with something like:
> > CREATE TEMPLATE yadda; ALTER TEMPLATE yadda ADD FILE 'yadda--1.0.sql'
> > CONTENT $$...$$; ...or whatever.
>
> This seems unnecessary to me. What the patch at hand does is take the
> file (actually, the contents of the file) and execute it directly,
> without installing anything on disk. The precise contents of the
> extension are still tracked through pg_depend, so you can drop it
> without having previously saved either the control file or the SQL
> script. (In fact, that's how DROP EXTENSION works currently.)

Yeah, DROP will work. But what about ALTER .. UPDATE?

> There's also the pg_dump side of things; with your proposal we would be
> forced to move over the yadda--1.0.sql file from the old server to the
> new one; or, equivalently, put the whole ALTER TEMPLATE .. CONTENT
> command in the dump, which is equivalent to what Dimitri's patch does;
> so there doesn't seem to be a point.

Well, there's certainly a point, because IIUC Dimitri's patch dumps the
file into the pg_dump output no matter whether the file originally came
from an SQL command or the filesystem. IMHO, anyone who thinks that isn't
going to break things rather badly isn't thinking hard enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Review: Extra Daemons / bgworker
Simon Riggs wrote:
> On 5 December 2012 15:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> > Here's a first attempt at a new documentation chapter. This goes in
> > part Server Programming, just after the SPI chapter. I just noticed
> > that worker_spi could use some more sample code; for example,
> > auth_counter was getting its own LWLock and also its own shmem area,
> > which would be helpful to demonstrate, I think.
>
> "to run once" -> "to run when"

Thanks.

> Prefer BgWorkerStart_ConsistentState to be renamed to
> BgWorkerRun_InHotStandby, and BgWorkerStart_RecoveryFinished to be
> renamed to BgWorkerRun_InNormalMode. Presumably a process will shut down
> if (BgWorkerRun_InHotStandby && !BgWorkerRun_InNormalMode)

Hmm, no, I haven't considered that. You mean that a bgworker that
specifies to start at BgWorkerStart_ConsistentState will stop once normal
mode is reached? Currently they don't do that. And since we don't have
the notion that workers stop working, it wouldn't work -- postmaster
would start them back up immediately.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Fwd: question on foreign key lock
On Wed, Dec 5, 2012 at 7:08 AM, Filip Rembiałkowski filip.rembialkow...@gmail.com wrote:
> Robert, thank you for the answer.
>
> 1. need exclusive lock anyway to add triggers. Why does adding a trigger
> need an exclusive lock? Someone would say blocking reads is not needed
> (since possible trigger events are: Insert/Update/Delete/Truncate).
>
> 2. will create a risk of deadlock. From a user perspective, a risk of
> deadlock is sometimes better than excessive locking. Transactional DDL
> users should be prepared for exceptions/retries anyway.
>
> 3. I made a naive test of simply changing AccessExclusiveLock to
> ExclusiveLock, and seeing how many regression tests it breaks. It breaks
> none :-) Current Git head gives me 2 fails/133 tests regardless of this
> change.

Sure. You could probably downgrade it quite a bit further without
breaking the regression tests, but that doesn't mean it's safe in all
cases. Rather than having this discussion all over again, I suggest that
you have a look at commits 2dbbda02e7e688311e161a912a0ce00cde9bb6fc,
2c3d9db56d5d49bdc777b174982251c01348e3d8, and
a195e3c34f1eeb6a607c342121edf48e49067ea9, and the various mailing list
discussions pertaining thereto, particularly the thread "ALTER TABLE lock
strength reduction patch is unsafe", which was started by Tom Lane.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Dumping an Extension's Script
Robert Haas robertmh...@gmail.com writes:
> Yeah, DROP will work. But what about ALTER .. UPDATE?

What about it?

> Well, there's certainly a point, because IIUC Dimitri's patch dumps the
> file into the pg_dump output no matter whether the file originally came
> from an SQL command or the filesystem. IMHO, anyone who thinks that
> isn't going to break things rather badly isn't thinking hard enough.

Only if you ask for it using --extension-script. The default behaviour
didn't change, whether you decide to install your extension from the file
system or the PostgreSQL port.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Fwd: question on foreign key lock
On 2012-12-05 17:05:41 -0500, Robert Haas wrote:
> On Wed, Dec 5, 2012 at 7:08 AM, Filip Rembiałkowski
> filip.rembialkow...@gmail.com wrote:
> > Robert, thank you for the answer.
> >
> > 1. need exclusive lock anyway to add triggers. Why does adding a
> > trigger need an exclusive lock? Someone would say blocking reads is
> > not needed (since possible trigger events are:
> > Insert/Update/Delete/Truncate).
> >
> > 2. will create a risk of deadlock. From a user perspective, a risk of
> > deadlock is sometimes better than excessive locking. Transactional
> > DDL users should be prepared for exceptions/retries anyway.
> >
> > 3. I made a naive test of simply changing AccessExclusiveLock to
> > ExclusiveLock, and seeing how many regression tests it breaks. It
> > breaks none :-) Current Git head gives me 2 fails/133 tests
> > regardless of this change.
>
> Sure. You could probably downgrade it quite a bit further without
> breaking the regression tests, but that doesn't mean it's safe in all
> cases. Rather than having this discussion all over again, I suggest
> that you have a look at commits 2dbbda02e7e688311e161a912a0ce00cde9bb6fc,
> 2c3d9db56d5d49bdc777b174982251c01348e3d8, and
> a195e3c34f1eeb6a607c342121edf48e49067ea9, and the various mailing list
> discussions pertaining thereto, particularly the thread "ALTER TABLE
> lock strength reduction patch is unsafe", which was started by Tom Lane.

Just to give an example of the complexities surrounding this: lowering the
lock level for foreign key creation would probably be dangerous for query
planning, more precisely join removal.

S1: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
S1: SELECT * FROM a;
S2: DELETE FROM a WHERE a.id IN (all_duplicate_ids);
S2: ALTER TABLE a ADD CONSTRAINT a_unique UNIQUE (a);
S2: ALTER TABLE b ADD CONSTRAINT b_fkey FOREIGN KEY (b_id) REFERENCES a(id);
S1: SELECT b.id FROM a LEFT JOIN b ON (b.id = a.id);

The last S1 query might now remove the join to b because of the foreign
key (which it sees due to SnapshotNow semantics), although rows that
violate the unique key (which is required for the foreign key) still
exist. The existence of those duplicate values would change the result,
though!

(come to think of it, I think we might still hit the above case if S1
doesn't access a before the foreign key gets altered...)

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Dumping an Extension's Script
Dimitri Fontaine wrote:
> Robert Haas robertmh...@gmail.com writes:
> > Well, there's certainly a point, because IIUC Dimitri's patch dumps
> > the file into the pg_dump output no matter whether the file originally
> > came from an SQL command or the filesystem. IMHO, anyone who thinks
> > that isn't going to break things rather badly isn't thinking hard
> > enough.
>
> Only if you ask for it using --extension-script. The default behaviour
> didn't change, whether you decide to install your extension from the
> file system or the PostgreSQL port.

What happens on a normal pg_dump of the complete database? For extensions
that were installed using strings instead of files, do I get a string
back? Because if not, the restore is clearly going to fail anyway. I
mean, clearly the user doesn't want to list the extensions, figure out
which ones were installed by strings, and then do pg_dump
--extension-script on them.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] ALTER TABLE ... NOREWRITE option
Simon Riggs si...@2ndquadrant.com writes:
> CREATE EVENT TRIGGER my_event_trigger
>     ON table_rewrite
>     EXECUTE PROCEDURE consider_whether_to_throw_an_error();
>
> +1, I was just thinking that myself.

+1, and I think that can happen for 9.3, as soon as we agree on the list
of code points where we want that event to fire. ALTER TABLE variants
that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Review: Extra Daemons / bgworker
On 2012-12-05 19:03:44 -0300, Alvaro Herrera wrote:
> Simon Riggs wrote:
> > Prefer BgWorkerStart_ConsistentState to be renamed to
> > BgWorkerRun_InHotStandby, and BgWorkerStart_RecoveryFinished to be
> > renamed to BgWorkerRun_InNormalMode. Presumably a process will shut
> > down if (BgWorkerRun_InHotStandby && !BgWorkerRun_InNormalMode)
>
> Hmm, no, I haven't considered that. You mean that a bgworker that
> specifies to start at BgWorkerStart_ConsistentState will stop once
> normal mode is reached? Currently they don't do that. And since we
> don't have the notion that workers stop working, it wouldn't work --
> postmaster would start them back up immediately.

I personally don't see too much demand for this from a use-case
perspective. Simon, did you have anything in mind that made you ask this?

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review: Extra Daemons / bgworker
On 2012-12-05 18:42:42 -0300, Alvaro Herrera wrote:
> >     <para>
> >      <structfield>bgw_sighup</structfield> and
> >      <structfield>bgw_sigterm</structfield> are pointers to functions
> >      that will be installed as signal handlers for the new
> > -    process.
> > +    process.  XXX: Can they be NULL?
> >     </para>
>
> Hm. The code doesn't check, so what happens is probably a bug anyhow. I
> don't know whether sigaction crashes in this case; its manpage doesn't
> say. I guess the right thing to do is have RegisterBackgroundWorker
> check for a NULL sighandler, and set it to something standard if so (say
> SIG_IGN for SIGHUP and maybe quickdie() or similar for SIGTERM).

AFAIR a NULL sigaction is used to query the current handler, which
indirectly might lead to problems due to the wrong handler being called.
Setting up SIG_IGN and quickdie in that case seems to be sensible.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services