Re: [HACKERS] Changing the concept of a DATABASE
On 22.05.2012 15:27, Albe Laurenz wrote: If you need different applications to routinely access each other's tables, why not assign them to different schemas in one database?

I just saw another use case here. There are lots of offices / departments creating maps: topography maps, pipeline maps, nature conservancy (e.g. where are the nests of endangered birds?), mineral resources, wire maps, street maps, bicycle / jogging maps, tourist maps, tree maps, the cadastral land register, and so on. All these departments have their own databases for their own maps. They only map their own stuff. Towns / states / regions have a department where all these maps get collected. You can go to your town and ask for weird maps today - e.g. a map with all jogging routes and waste water pipes but without autobahns. You could say that you have one database per layer. As I said - I saw this construction in the real world. I am pretty sure that other states may have other solutions, but the described solution exists.

Susanne -- Dipl. Inf. Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/23 Robert Haas robertmh...@gmail.com: On Wed, May 23, 2012 at 3:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I wanted to have a discussion about how to handle this problem. Unlike the leaky-view problem, we don't need to worry about unexpected qualifier distribution on either side of a join, because a scan on a table never contains any join. Thus, all we need to care about is the order of qualifiers chained on a particular scan node, which get reordered by the cost to invoke the functions. How about an idea to track FuncExpr nodes that come from the security policy and enforce a cost of 0 on them? Regular functions never reach zero cost, so the security policy would be re-ordered to the head.

Hmm. That would disregard the relative costs of multiple qualifiers all of which were injected by the security policy, which I suspect is not a good idea. Furthermore, I think that we should not assume that there is no join involved. I would expect a fairly popular RLS qual to be something of the form WHERE NOT EXISTS (SELECT 1 FROM hide_me WHERE hide_me.pk = thistab.pk).

Please ignore the approach to track the cost value of qualifiers. I believe it does not work well without some fundamental updates.

Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view.

I have come to think it is a better approach than tracking the origin of each qualifier. One problem is how to handle UPDATE and DELETE statements. It may be possible to rewrite the update / delete query as follows:

From:
    UPDATE tbl SET X = X + 1 WHERE f_leak(Y)
To:
    UPDATE tbl SET X = X + 1
    WHERE ctid = (
        SELECT * FROM (
            SELECT ctid FROM tbl
            WHERE uname = getpgusername()  -- (*) should have security-barrier
        ) AS tbl_subqry
        WHERE f_leak(Y)
    );

Expanded sub-queries will have the security-barrier flag, so it enforces that uname = getpgusername() is checked earlier than f_leak(Y).
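The security-barrier behavior this rewrite relies on is the same one views gained in 9.2; a minimal illustration, with hypothetical table, column, and function names invented for the example:

```sql
-- Hypothetical table with one row-owner column
CREATE TABLE tbl (x int, y text, uname name);

-- A security_barrier view forces its own quals to be evaluated
-- before any user-supplied (possibly leaky) quals
CREATE VIEW tbl_secure WITH (security_barrier) AS
    SELECT x, y FROM tbl WHERE uname = current_user;

-- f_leak(y) only ever sees rows that already passed the
-- uname = current_user filter
SELECT * FROM tbl_secure WHERE f_leak(y);
```

Replacing an RLS-protected table reference with a subquery RTE carrying the same flag would give plain table scans the same ordering guarantee.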
We may need to measure the performance impact due to the rewrite.

Also, suppose that Bob applies an RLS policy to a table, and, later, Alice selects from the table. How do we keep Bob from usurping Alice's privileges? If we insist that Bob's RLS policy function runs as Bob, then it defeats inlining; but if it runs as Alice, then Bob can steal Alice's credentials. One idea is to apply the security policy only if Alice's access to the table is granted by Bob. That way, if Alice is (for example) the superuser, she's immune to RLS. But that doesn't seem to completely solve the problem, because Alice might merely be some other relatively unprivileged user, and we still don't want Bob to be able to walk off with her access.

I think this situation is similar to the case where we reference a view without privileges on the underlying tables. If Bob sets up a view with some tricky function, it allows Bob to reference the credentials of users who reference the view. More or less, it can be a problem whenever a user invokes a user-defined function declared by someone else. (Thus, the sepgsql policy does not allow users to invoke a function declared by another one in a different domain, without the DBA's checks.) I think it is a good idea not to apply RLS when the current user has superuser privileges, from the perspective of security-model consistency, but it would be inconsistent to check privileges on the underlying tables.

Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful.

Sorry, I don't favor this idea. Even if the table owner sets up a function to generate additional qualifiers, there is still no guarantee that those qualifiers are invoked prior to user-given ones. It seems to me this approach will have the same problem...
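The credential-stealing qualifier being worried about here can be as simple as a function that leaks its argument through a side channel; a sketch (the function name and cost are invented for illustration):

```sql
-- A malicious qualifier: marked very cheap so the planner is
-- tempted to evaluate it before the security qual, letting it
-- observe rows the caller should never see
CREATE FUNCTION f_leak(t text) RETURNS boolean
    COST 0.0001 LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'leaked: %', t;  -- side channel
    RETURN true;
END
$$;
```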
Also, if the point here is to provide security for tables, not views, it seems like you really need to have (at least a design for) RLS security on insert/update/delete operations. Just adding the same filter condition might be adequate for deletes, but I don't think it works at all for inserts. And for updates, what prevents the user from updating columns he shouldn't, or updating them to key values he shouldn't be able to use?

If we also apply the security policy to the new versions of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether the tuple satisfies the security policy. For the same reason, the trigger should be executed at the end of the trigger chain.

It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table.
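A sketch of the existing trigger-based mechanism Robert refers to, assuming a hypothetical table with a uname column as the policy key:

```sql
-- Enforce on writes what the RLS qual would enforce on reads:
-- a row may only be inserted or updated on behalf of its owner
CREATE FUNCTION enforce_owner() RETURNS trigger
    LANGUAGE plpgsql AS $$
BEGIN
    IF NEW.uname <> current_user THEN
        RAISE EXCEPTION 'new row violates the security policy';
    END IF;
    RETURN NEW;
END
$$;

CREATE TRIGGER tbl_write_policy
    BEFORE INSERT OR UPDATE ON tbl
    FOR EACH ROW EXECUTE PROCEDURE enforce_owner();
```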
Re: [HACKERS] [RFC] Interface of Row Level Security
On May23, 2012, at 22:12 , Robert Haas wrote: Also, suppose that Bob applies an RLS policy to a table, and, later, Alice selects from the table. How do we keep Bob from usurping Alice's privileges?

That problem isn't restricted to RLS, though. Bob could just as well have booby-trapped any other SQL object, e.g. could have added bobs_malicious_function() to a view Alice selects from, or attached it as a trigger to a table Alice inserts into. It would be nice if there was (optional) protection against these kinds of attacks, but it's not really something that RLS should be burdened with.

BTW, I've wondered in the past how tight our protection is against someone trying to take over the postgres role during pg_dump. On the surface, it seems that we're safe because pg_dump doesn't invoke user-defined functions except output functions (which require superuser privileges to modify). But that's not exactly a solid line of defense...

If we also apply the security policy to the new versions of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether the tuple satisfies the security policy. For the same reason, the trigger should be executed at the end of the trigger chain.

It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one?

Yeah, applying the security policy to the new row (for UPDATEs and INSERTs) seems weird - the policy determines what you can see, not what you can store, which might be two different things. But the security policy should still apply to the old rows, i.e. you shouldn't be able to UPDATE or DELETE rows you cannot see, no?
best regards, Florian Pflug
Re: [HACKERS] Changing the concept of a DATABASE
On May24, 2012, at 11:39 , Susanne Ebrecht wrote: There are lots of offices / departments creating maps: topography maps, pipeline maps, nature conservancy (e.g. where are the nests of endangered birds?), mineral resources, wire maps, street maps, bicycle / jogging maps, tourist maps, tree maps, the cadastral land register, and so on. All these departments have their own databases for their own maps. They only map their own stuff. Towns / states / regions have a department where all these maps get collected.

The question is, how do they get collected? If they use some home-grown replication, they might just as well collect them into schemas instead of databases. The same is possible with slony, I think. And if they use WAL-based replication, they have no choice but to collect them in different clusters, so cross-database queries within a cluster wouldn't help.

I think that you're right that reporting would be one of the main use-cases for cross-database queries. But reporting is also, I think, one of the main use-cases for WAL-based replication. So having cross-database queries which don't allow queries across multiple replicas will leave quite a few people out in the cold.

best regards, Florian Pflug
[HACKERS] pg_stat_statements temporary file
If pg_stat_statements is set to store its data across restarts, it stores it in global/pg_stat_statements.stat. This causes some interesting things to happen in combination with a base backup - namely, if you take a base backup *after* you have restarted the master, the slave will get a snapshot of whatever was in the temporary file at the time of the restart. This is quite unpredictable - particularly in relation to a slave, where it gets a snapshot from the last restart, not from the base backup, after which it diverges.

AFAICT, it also has the property that if the server crashes, it will reload the latest snapshot - not reset to 0 or anything like that. Finally, if the server were to crash *while* the file is being written, it will get a corrupt file (I haven't tested this part, but it's rather obvious from the code). I'm pretty sure this could lead to a situation where the database wouldn't restart.

Fixing the last part is easy - we need to write the file to a temporary file and then rename() it into place, like we do with the stats collector's file. Fixing the first one, I can think of a few things:

1) unlink() the file after we've read it.
2) forcibly exclude the file from base backups taken with pg_basebackup. We'd still have the problem when it comes to backups taken manually.
3) avoid loading the file on a standby (that wouldn't fix the similar problem on the master, of course)

And perhaps some other solution I haven't thought of?

If we want to go with option 2, we have another problem - the file is in the global directory, but its name is dependent on what's in a contrib module. Meaning we'd have to teach core postgresql about a contrib module's filename, which pretty much breaks the abstraction layer. Perhaps a better choice here would be to create another directory under the data directory that is always excluded from base backups, and store it there? That would also work for third-party modules that core can never learn about...
In general, should a contrib module really store data in the global/ directory? Seems pretty ugly to me... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: If we also apply the security policy to the new versions of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether the tuple satisfies the security policy. For the same reason, the trigger should be executed at the end of the trigger chain. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? Yeah, applying the security policy to the new row (for UPDATEs and INSERTs) seems weird - the policy determines what you can see, not what you can store, which might be two different things. But the security policy should still apply to the old rows, i.e. you shouldn't be able to UPDATE or DELETE rows you cannot see, no?

The cases of INSERT / DELETE are simple; all we need is to check either the new or the old tuple. In the case of UPDATE, we need to check on the old tuple whether the user can see it, and on the new tuple whether the user can store it. Indeed, these are different checks; however, the table turns into a black hole if a user is allowed to write a new tuple that he has no privilege to read back. I expect most use cases will choose the same policy at read time and write time for UPDATE statements.

Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
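As a footnote, the separation KaiGai describes - one check for whether the old tuple is visible, another for whether the new tuple may be stored - is essentially what PostgreSQL's eventual RLS implementation (CREATE POLICY, added in 9.5) ended up providing; the table and column names below are invented:

```sql
-- USING controls which existing rows an UPDATE may touch;
-- WITH CHECK controls which new row versions may be stored.
-- Making them identical avoids the "black hole" case.
CREATE POLICY tbl_update_policy ON tbl
    FOR UPDATE
    USING (uname = current_user)
    WITH CHECK (uname = current_user);
```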
[HACKERS] pg_stat_statements queryid
Was there any actual reason why we didn't end up exposing queryid in the pg_stat_statements view? It would be highly useful when tracking changes over time. Right now I see people doing md5(query) to do that, which is a lot more ugly (and obviously uses more space and is slow, too). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] LISTEN/NOTIFY Security and the docs
On Fri, May 18, 2012 at 5:08 PM, Chander Ganesan chan...@otg-nc.com wrote: Hi All, I just realized that anyone can listen for notifications (using listen) so long as they know the channel name. This means that a user could receive and view the payload for another user. Perhaps it would be good to note this in the documentation (i.e., there should be no expectation of privacy/security when using listen/notify, so any user that can connect to a database could issue and receive notifications for any channel.) Might be worth a note, yes. The lack of a note really should tell you that it's a broadcast, but it wouldn't hurt to have an extra one. Want to prepare a patch? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 12:43 , Kohei KaiGai wrote: The cases of INSERT / DELETE are simple; all we need is to check either the new or the old tuple. In the case of UPDATE, we need to check on the old tuple whether the user can see it, and on the new tuple whether the user can store it. Indeed, these are different checks; however, the table turns into a black hole if a user is allowed to write a new tuple that he has no privilege to read back. I expect most use cases will choose the same policy at read time and write time for UPDATE statements.

I don't think preventing black holes is sensible here - a black hole might, in fact, be *just* what the user wants. Imagine a messaging system. A reasonable RLS policy would be to allow a user to see messages addressed to her. Yet you wouldn't want to prevent her from creating messages to other people - because what good is a messaging system that only allows you to send messages to yourself. What you probably *would* want to do, though, is to check that she did put herself in as the sender when she creates a message. And you'd probably want to forbid updates entirely. So you'd have

- An RLS policy that checks current_user = ANY(recipients)
- An ON INSERT trigger which checks current_user = sender
- An ON UPDATE trigger which errors out

If the RLS policy applied to INSERTed rows also, how would you do that?

Another example, although in the realm of filesystem permissions, is Mac OS X. By default, every user has a Drop Box folder which anybody can write to, yet only the owner can read. This allows you to easily transfer files from one user to another without allowing a third party to read them.

best regards, Florian Pflug
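Florian's three-part design could be sketched today with mechanisms that already exist, using a security_barrier view to stand in for the not-yet-existing RLS policy; all names here are invented:

```sql
CREATE TABLE messages (sender name, recipients name[], body text);

-- "RLS policy": you only see messages addressed to you
CREATE VIEW my_messages WITH (security_barrier) AS
    SELECT * FROM messages WHERE current_user = ANY (recipients);

-- Write-side rules: you may only send as yourself,
-- and updates are forbidden entirely
CREATE FUNCTION messages_write_check() RETURNS trigger
    LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        RAISE EXCEPTION 'messages may not be updated';
    END IF;
    IF NEW.sender <> current_user THEN
        RAISE EXCEPTION 'sender must be the current user';
    END IF;
    RETURN NEW;
END
$$;

CREATE TRIGGER messages_policy
    BEFORE INSERT OR UPDATE ON messages
    FOR EACH ROW EXECUTE PROCEDURE messages_write_check();
```

Note that the read policy and the write policy are deliberately different, which is exactly the "black hole" the previous mail argued against.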
Re: [HACKERS] pg_stat_statements temporary file
On 24 May 2012 11:43, Magnus Hagander mag...@hagander.net wrote: In general, should a contrib module really store data in the global/ directory? Seems pretty ugly to me... I think the case could be made for moving pg_stat_statements into core, as an optionally enabled view, like pg_stat_user_functions, since pg_stat_statements is now rather a lot more useful than it used to be. That would solve that problem, as well as putting pg_stat_statements into the hands of the largest possible number of people, which would be a positive development, in my humble and fairly predictable opinion. However, pg_stat_statements will not prevent the database from starting if the file is corrupt. It makes some basic attempts to detect that within pgss_shmem_startup(), and will simply log the problem and unlink the file in the event of detecting corruption. Otherwise, I suppose you might get garbage values in pg_stat_statements, which, while rather annoying and possibly unacceptable, is hardly the end of the world. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pg_stat_statements temporary file
On Thu, May 24, 2012 at 1:36 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 24 May 2012 11:43, Magnus Hagander mag...@hagander.net wrote: In general, should a contrib module really store data in the global/ directory? Seems pretty ugly to me... I think the case could be made for moving pg_stat_statements into core, as an optionally enabled view, like pg_stat_user_functions, since pg_stat_statements is now rather a lot more useful than it used to be. That would solve that problem, as well as putting pg_stat_statements into the hands of the largest possible number of people, which would be a positive development, in my humble and fairly predictable opinion.

Well, it would solve the problem for this specific case - but there will always be yet another extension. Actually, it would only solve the *ugliness*, and not the actual problem. (That's not to say that I don't agree that moving it into core would be a good idea, but that's not happening for 9.2 - and the problem exists in 9.1 as well.)

However, pg_stat_statements will not prevent the database from starting if the file is corrupt. It makes some basic attempts to detect that within pgss_shmem_startup(), and will simply log the problem and unlink the file in the event of detecting corruption. Otherwise, I suppose you might get garbage values in pg_stat_statements, which, while rather annoying and possibly unacceptable, is hardly the end of the world.

Ok. I was worried it might crash on loading the data when it was corrupt - say a size field that ended up specifying gigabytes that it then tries to allocate, or something like that. What actually happens if it tries to repalloc() something huge? palloc will throw an elog(ERROR), and since this happens during postmaster startup, are you sure it won't prevent the server from starting?
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_receivexlog stops upon server restart
On Thu, Apr 19, 2012 at 1:00 PM, Thom Brown t...@linux.com wrote: On 10 April 2012 21:07, Magnus Hagander mag...@hagander.net wrote: On Friday, April 6, 2012, Thom Brown wrote: Hi, I've tried out pg_receivexlog and have noticed that when restarting the cluster, pg_receivexlog gets cut off... it doesn't keep waiting. This is surprising, as the DBA would have to remember to start pg_receivexlog up again. This is intentional, insofar as that's how the code was written; there's not a malfunctioning piece of code somewhere. It would probably make sense to have an auto-reconnect feature, and to have an option to turn it on/off. If you haven't already (my wifi here is currently quite useless, which is why I'm working on my email backlog, so I can't check), please add it to the open items list. I think it would also be useful to add a paragraph to the documentation stating use-cases for this feature, and its advantages.

Attached is a patch that implements this. Seems reasonable?

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/

[Attachment: pg_receivexlog_loop.patch (binary data)]
[HACKERS] shared_preload_libraries path
Would it make sense to have a postgresql.conf parameter that would add to LD_LIBRARY_PATH when loading libraries from shared_preload_libraries (and other library loads), to make it possible to configure it without having to mess around with the operating system configuration? Or is that too much reimplementing OS functionality? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_stat_statements temporary file
On 24 May 2012 12:42, Magnus Hagander mag...@hagander.net wrote: What actually happens if it tries to repalloc() something huge? palloc will throw an elog(ERROR), and since this happens during postmaster startup, are you sure it won't prevent the server from starting?

Oh, yes, missed that.

    /* Previous incarnation might have had a larger query_size */
    if (temp.query_len >= buffer_size)
    {
        buffer = (char *) repalloc(buffer, temp.query_len + 1);
        buffer_size = temp.query_len + 1;
    }

Here, temp receives its value from an fread(). This could probably be coded to be defensive against such things, but a better fix would be preferred. I have to wonder how much of a problem corruption is likely to be, though, given that we only save to disk in a corresponding pgss_shmem_shutdown() call, which actually has more protections against corruption. The window for the saved file to be corrupt seems rather small, though I accept that a better window would be zero.

-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pg_stat_statements temporary file
On Thu, May 24, 2012 at 2:16 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 24 May 2012 12:42, Magnus Hagander mag...@hagander.net wrote: What actually happens if it tries to repalloc() something huge? palloc will throw an elog(ERROR), and since this happens during postmaster startup, are you sure it won't prevent the server from starting?

Oh, yes, missed that.

    /* Previous incarnation might have had a larger query_size */
    if (temp.query_len >= buffer_size)
    {
        buffer = (char *) repalloc(buffer, temp.query_len + 1);
        buffer_size = temp.query_len + 1;
    }

Here, temp receives its value from an fread(). This could probably be coded to be defensive against such things, but a better fix would be preferred. I have to wonder how much of a problem corruption is likely to be, though, given that we only save to disk in a corresponding pgss_shmem_shutdown() call, which actually has more protections against corruption. The window for the saved file to be corrupt seems rather small, though I accept that a better window would be zero.

Right. But writing to a temp file and rename()ing it into place is trivial. It's really the other issues raised that are bigger ;)

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] shared_preload_libraries path
On 24 May 2012 13:09, Magnus Hagander mag...@hagander.net wrote: Would it make sense to have a postgresql.conf parameter that would add to LD_LIBRARY_PATH when loading libraries from shared_preload_libraries (and other library loads)? To make it possible to configure it without having to mess around with the operating system configuration? Or is that too much reimplementing OS functionality? -1, I'm afraid. I found this blog post to be insightful: https://blogs.oracle.com/rie/entry/tt_ld_library_path_tt -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] shared_preload_libraries path
On 24 May 2012 13:22, Peter Geoghegan pe...@2ndquadrant.com wrote: I found this blog post to be insightful: https://blogs.oracle.com/rie/entry/tt_ld_library_path_tt This one might be more useful, and itself refers to the aforementioned, earlier post: https://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pg_receivexlog stops upon server restart
On 24 May 2012 13:05, Magnus Hagander mag...@hagander.net wrote: On Thu, Apr 19, 2012 at 1:00 PM, Thom Brown t...@linux.com wrote: On 10 April 2012 21:07, Magnus Hagander mag...@hagander.net wrote: On Friday, April 6, 2012, Thom Brown wrote: Hi, I've tried out pg_receivexlog and have noticed that when restarting the cluster, pg_receivexlog gets cut off... it doesn't keep waiting. This is surprising as the DBA would have to remember to start pg_receivexlog up again. This is intentional as far as that's how the code was written, there's not a malfunctioning piece of code somewhere. It would probably make sense to have an auto-reconnect feature, and to have an option to turn it on/off. If you haven't already (my wifi here is currently quite useless, which is why I'm working on my email backlog, so I can't check), please add it to the open items list. I think it would also be useful to add a paragraph to the documentation stating use-cases for this feature, and its advantages. Attached is a patch that implements this. Seems reasonable? s/non fatal/non-fatal/ Yes, this solves the problem for me, except you forgot to translate noloop in long_options[] . :) -- Thom
Re: [HACKERS] pg_receivexlog stops upon server restart
On Thu, May 24, 2012 at 2:34 PM, Thom Brown t...@linux.com wrote: On 24 May 2012 13:05, Magnus Hagander mag...@hagander.net wrote: On Thu, Apr 19, 2012 at 1:00 PM, Thom Brown t...@linux.com wrote: On 10 April 2012 21:07, Magnus Hagander mag...@hagander.net wrote: On Friday, April 6, 2012, Thom Brown wrote: Hi, I've tried out pg_receivexlog and have noticed that when restarting the cluster, pg_receivexlog gets cut off... it doesn't keep waiting. This is surprising as the DBA would have to remember to start pg_receivexlog up again. This is intentional as far as that's how the code was written, there's not a malfunctioning piece of code somewhere. It would probably make sense to have an auto-reconnect feature, and to have an option to turn it on/off. If you haven't already (my wifi here is currently quite useless, which is why I'm working on my email backlog, so I can't check), please add it to the open items list. I think it would also be useful to add a paragraph to the documentation stating use-cases for this feature, and its advantages. Attached is a patch that implements this. Seems reasonable? s/non fatal/non-fatal/ Yes, this solves the problem for me, except you forgot to translate noloop in long_options[] . :) Fixed :-) Did you test it, or just assumed it worked? ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_receivexlog stops upon server restart
On 24 May 2012 13:37, Magnus Hagander mag...@hagander.net wrote: On Thu, May 24, 2012 at 2:34 PM, Thom Brown t...@linux.com wrote: On 24 May 2012 13:05, Magnus Hagander mag...@hagander.net wrote: On Thu, Apr 19, 2012 at 1:00 PM, Thom Brown t...@linux.com wrote: On 10 April 2012 21:07, Magnus Hagander mag...@hagander.net wrote: On Friday, April 6, 2012, Thom Brown wrote: Hi, I've tried out pg_receivexlog and have noticed that when restarting the cluster, pg_receivexlog gets cut off... it doesn't keep waiting. This is surprising as the DBA would have to remember to start pg_receivexlog up again. This is intentional as far as that's how the code was written, there's not a malfunctioning piece of code somewhere. It would probably make sense to have an auto-reconnect feature, and to have an option to turn it on/off. If you haven't already (my wifi here is currently quite useless, which is why I'm working on my email backlog, so I can't check), please add it to the open items list. I think it would also be useful to add a paragraph to the documentation stating use-cases for this feature, and its advantages. Attached is a patch that implements this. Seems reasonable? s/non fatal/non-fatal/ Yes, this solves the problem for me, except you forgot to translate noloop in long_options[] . :) Fixed :-) Did you test it, or just assumed it worked? ;) How very dare you. Of course I tested it. It successfully reconnects on multiple restarts, checks intermittently when I've stopped the server, showing the connection error message, successfully continues when I eventually bring the server back up, and doesn't attempt a reconnect when using -n. So looks good to me. -- Thom
Re: [HACKERS] Interrupting long external library calls
On Wed, May 16, 2012 at 07:30:03PM +0300, Heikki Linnakangas wrote: On 16.05.2012 15:42, Sandro Santilli wrote: But CHECK_FOR_INTERRUPTS doesn't return, right? Is there another macro for just checking without yet acting upon it?

Hmm, no. CHECK_FOR_INTERRUPTS() checks the InterruptPending variable, but on Windows it also checks for UNBLOCKED_SIGNAL_QUEUE(). And even if InterruptPending is set, it's not totally certain that CHECK_FOR_INTERRUPTS() won't return. I think InterruptPending can be set spuriously (even if that's not possible today, I wouldn't rely on it), and if you're in a HOLD/RESUME_INTERRUPTS block, CHECK_FOR_INTERRUPTS() will do nothing even if InterruptPending is true. The only sane way to make 3rd-party code interruptible is to add CHECK_FOR_INTERRUPTS() to it, in safe places.

No place is safe if CHECK_FOR_INTERRUPTS doesn't return. How could caller code clean up on interruption?

--strk;
[HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Hi, I've been running some tests on pg 9.2beta1 and in particular a set of queries like

create table _tmp0 as select * from (
    select *, (select healpixid from idt_match as m
               where m.transitid = o.transitid) as x
    from idt_photoobservation as o offset 0
) as y where x % 16 = ZZZ order by x;

(where ZZZ is some integer number, 0 <= ZZZ < 16) With the following plan:

 Sort  (cost=3228814504.96..3228815137.21 rows=252902 width=498)
   Sort Key: y.x
   ->  Subquery Scan on y  (cost=0.00..3228791809.25 rows=252902 width=498)
         Filter: ((y.x % 16::bigint) = 0)
         ->  Limit  (cost=0.00..3228033102.41 rows=50580456 width=490)
               ->  Seq Scan on idt_photoobservation o  (cost=0.00..3228033102.41 rows=50580456 width=490)
                     SubPlan 1
                       ->  Index Scan using idt_match_transitid_idx on idt_match m  (cost=0.00..63.74 rows=1 width=8)
                             Index Cond: (transitid = o.transitid)

The schema of the tables are:

e2es2=> \d idt_match
    Table "public.idt_match"
  Column   |   Type   | Modifiers
-----------+----------+-----------
 sourceid  | bigint   |
 transitid | bigint   |
 healpixid | bigint   |
 flagsxm   | smallint |
Indexes:
    "idt_match_idx" btree (healpixid)
    "idt_match_transitid_idx" btree (transitid)

Table "public.idt_photoobservation"
  Column   |  Type  | Modifiers
-----------+--------+-----------
 transitid | bigint |
 fluxbp0   | real   |
 more columns
Indexes:
    "idt_photoobservation_idx" btree (transitid)

And I noticed that when I run the query like the one shown above in parallel (in multiple connections for ZZZ=0...8) the performance of each query drops down significantly (factor of 2) (despite the fact that during the execution of the query postgres is mostly CPU bound).
So I tried to oprofile it and strace it, and that's what I saw: The strace -c of each process shows something like that

# Process 18660 detached
% time     seconds  usecs/call     calls errors syscall
------ ----------- ----------- --------- ------ ----------
 76.25    0.001342           0    268987        semop
 23.75    0.000418           0     61694        read
  0.00    0.000000           0       138        lseek
  0.00    0.000000           0       355        select
  0.00    0.000000           0         3        kill
------ ----------- ----------- --------- ------ ----------
100.00    0.001760                331177        total

And the oprofile shows this on top:

-------------------------------------------------------------------------------
2863981  25.7117  ReleasePredicateLocks
  2863981  100.000  ReleasePredicateLocks [self]
-------------------------------------------------------------------------------
1246629  11.1917  LocalBufferAlloc
  1246629  100.000  LocalBufferAlloc [self]
-------------------------------------------------------------------------------
1135393  10.1931  CheckForSerializableConflictIn
  1135393  100.000  CheckForSerializableConflictIn [self]

So there is a lot of locking for some reason, and I was surprised to see anything related to SSI in the profile at all, because I'm not running serializable transactions (I was just running my queries from multiple psql sessions). Is there a reasonable explanation for what I'm seeing ? Thanks in advance, Sergey

More info: there is no other concurrent activity on the machine. shared_buffers is 10G. The machine has ~ 60G of RAM, 24 cores and proper RAID. Another note is that the oprofile and strace were obtained during the stage when the idt_photo.. table was scanned and data was being stored in the pgsql_tmp (not during the last bit when the data in pgsql_tmp were actually being sorted).

* Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK
Re: [HACKERS] Interrupting long external library calls
On 24.05.2012 16:04, Sandro Santilli wrote: On Wed, May 16, 2012 at 07:30:03PM +0300, Heikki Linnakangas wrote: On 16.05.2012 15:42, Sandro Santilli wrote: But CHECK_FOR_INTERRUPTS doesn't return, right ? Is there another macro for just checking w/out yet acting upon it ? Hmm, no. CHECK_FOR_INTERRUPTS() checks the InterruptPending variable, but on Windows it also checks for UNBLOCKED_SIGNAL_QUEUE(). And even if InterruptPending is set, it's not totally certain that CHECK_FOR_INTERRUPTS() won't return. I think InterruptPending can be set spuriously (even if that's not possible today, I wouldn't rely on it), and if you're in a HOLD/RESUME_INTERRUPTS block, CHECK_FOR_INTERRUPTS() will do nothing even if InterruptPending is true. The only sane way to make 3rd party code interruptible is to add CHECK_FOR_INTERRUPTS() to it, in safe places. No place is safe if CHECK_FOR_INTERRUPTS doesn't return. How could caller code clean up on interruption ? It would only be safe to call it in places where no cleanup is necessary. I don't know if there are any such places in the geos library. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On Tue, May 22, 2012 at 9:50 AM, Robert Haas robertmh...@gmail.com wrote: Hmm. I think that if you do it this way, the minimum recovery point won't be respected, which could leave you with a corrupted database. Now, if all the WAL files that you need are present in pg_xlog anyway, then they ought to get replayed anyway, but I think that if you are using restore_command (as opposed to streaming replication) we restore WAL segments under a different file name, which might cause this problem. Uhm, maybe I add some more details, so you get a better idea of what I did: The idea was to promote the standby to be the new master. There was streaming replication active but at some time I had to take the master down. IIRC from the log I saw that after the master went down, the standby continued recovering from a bunch of archived log files (via restore_command), I had suspected that either the standby was lagging behind a bit or that the master archived them during shutdown. When the standby didn't have anything else left to recover from (saying both xlog file foo doesn't exist and cannot connect to master), I deleted recovery.conf on the standby and restarted it. I wouldn't have assumed any corruption was possible given that I did clean shutdowns on both sides...
Re: [HACKERS] Interrupting long external library calls
On Thu, May 24, 2012 at 04:55:34PM +0300, Heikki Linnakangas wrote: On 24.05.2012 16:04, Sandro Santilli wrote: On Wed, May 16, 2012 at 07:30:03PM +0300, Heikki Linnakangas wrote: On 16.05.2012 15:42, Sandro Santilli wrote: But CHECK_FOR_INTERRUPTS doesn't return, right ? Is there another macro for just checking w/out yet acting upon it ? Hmm, no. CHECK_FOR_INTERRUPTS() checks the InterruptPending variable, but on Windows it also checks for UNBLOCKED_SIGNAL_QUEUE(). And even if InterruptPending is set, it's not totally certain that CHECK_FOR_INTERRUPTS() won't return. I think InterruptPending can be set spuriously (even if that's not possible today, I wouldn't rely on it), and if you're in a HOLD/RESUME_INTERRUPTS block, CHECK_FOR_INTERRUPTS() will do nothing even if InterruptPending is true. The only sane way to make 3rd party code interruptible is to add CHECK_FOR_INTERRUPTS() to it, in safe places. No place is safe if CHECK_FOR_INTERRUPTS doesn't return. How could caller code clean up on interruption ? It would only be safe to call it in places where no cleanup is necessary. I don't know if there are any such places in the geos library. Sure, right before starting and after finishing. That is nowhere useful for a premature interruption... The whole point of the work I'm doing these days is letting the users interrupt GEOS calls which often take a long time and a lot of memory resources. The current plan is to provide custom allocators to have automatic garbage collection on interruption. Turned out not to be an easy task to bend GEOS into using palloc/pfree, but there's progress in that direction. --strk;
[HACKERS] [9.2] crash on regex
Following query crashes backend on 9.2:

select substring('asd TO foo' from ' TO (([a-z0-9._]+|([^]+|)+)+)');

It is supposed to load potentially quoted table name from CREATE RULE definition. Works fine on 8.3 .. 9.1. Backtrace:

Program terminated with signal 11, Segmentation fault.
#0  ExecMakeFunctionResult (fcache=0x, econtext=0x953f3e8, isNull=0xbffb8d1f \b, isDone=0x0) at execQual.c:1833
1833        *isNull = fcinfo->isnull;
(gdb) bt
#0  ExecMakeFunctionResult (fcache=0x, econtext=0x953f3e8, isNull=0xbffb8d1f \b, isDone=0x0) at execQual.c:1833
#1  0x08203104 in ExecEvalExprSwitchContext (expression=0x953ef60, econtext=0x953f3e8, isNull=0xbffb8d1f \b, isDone=0x0) at execQual.c:4187
#2  0x082844d7 in evaluate_expr (expr=<value optimized out>, result_type=25, result_typmod=<value optimized out>, result_collation=100) at clauses.c:4408
...

-- marko
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 12:43 , Kohei KaiGai wrote: The case of INSERT / DELETE are simple; All we need to apply is checks on either new or old tuples. In case of UPDATE, we need to check on the old tuple whether the user can see it, and on the new tuple whether the user can store it. Indeed, these are different checks; however, it seems like a black hole if the new tuple is allowed to be written but has no reader privileges. I expect most use cases choose the same policy at reader timing and writer timing in an UPDATE statement. I don't think preventing black holes is sensible here - it might, in fact, be *just* what the user wants. Imagine a messaging system. A reasonable RLS policy would be to allow a user to see messages addressed to him. Yet you wouldn't want to prevent her from creating messages to other people - cause what good is a messaging system that only allows you to send messages to yourself. What you probably *would* want to do, though, is to check that she did put herself in as the sender when she creates a message. And you'd probably wanna forbid updates entirely. So you'd have
- A RLS policy that checks current_user = ANY(recipients)
- An ON INSERT trigger which checks current_user = sender
- An ON UPDATE trigger which errors out
If RLS policy applies to INSERTed rows also, how would you do that? Another example, although in the realm of filesystem permissions, is Mac OS X. Per default, every user has a Drop Box folder, which anybody can write to, yet only the owner can read. This allows you to easily transfer files from one user to another without allowing a third party to read it. Indeed, you are right. We have no special reason to enforce the same rules on both the reader and writer stages of an UPDATE statement. So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...)
[FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied at the tail of the before-row triggers. (*) I have not checked whether it conflicts with the existing syntax yet. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] pg_stat_statments queryid
On 24 May 2012 11:50, Magnus Hagander mag...@hagander.net wrote: Was there any actual reason why we didn't end up exposing queryid in the pg_stat_statements view? It would be highly useful when tracking changes over time. Right now I see people doing md5(query) to do that, which is a lot more ugly (and obviously uses more space and is slow, too). Right. I continue to maintain that this is a good idea. I raised the issue more than once. However, my proposal was not accepted by Tom and Robert, apparently on the basis that queryId's actual value was partially dictated by things like the endianness of the architecture used, and the value of OIDs when serialising and subsequently hashing the post-analysis tree. What I'd like to be able to do is aggregate this information over time and/or across standbys in a cluster, as queries are evicted and subsequently re-entered into pg_stat_statement's shared hash table. Now, there are situations where this isn't going to work, like when a third-party logical replication system is used. That's unfortunate, but I wouldn't expect it makes the information any less useful to the large majority of people. I'd also credit our users with being discerning enough to realise that they should not jump to the conclusion that the value will be stable according to any particular standard. Arguments against including an internal value in the view could equally well be applied to any of the internal statistic collector views, many of which have oid columns, despite the fact that various name columns already unambiguously identify tuples in most cases. I see no reason for the inconsistency, particularly given that the pg_stat_statements.query column *is* still somewhat ambiguous, as described in the docs, and given that the query hash value is referred to in the docs anyway.
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] shared_preload_libraries path
Peter Geoghegan pe...@2ndquadrant.com writes: On 24 May 2012 13:09, Magnus Hagander mag...@hagander.net wrote: Would it make sense to have a postgresql.conf parameter that would add to LD_LIBRARY_PATH when loading libraries from shared_preload_libraries (and other library loads). To make it possible to configure it without having to mess around with the operating system configuration? Or is that too much reimplementing OS functionality? -1, I'm afraid. I think the same. Aside from the problems Peter points out, there are too many different ways to spell that variable name on different platforms. regards, tom lane
Re: [HACKERS] Interrupting long external library calls
On May24, 2012, at 15:04 , Sandro Santilli wrote: On Wed, May 16, 2012 at 07:30:03PM +0300, Heikki Linnakangas wrote: On 16.05.2012 15:42, Sandro Santilli wrote: But CHECK_FOR_INTERRUPTS doesn't return, right ? Is there another macro for just checking w/out yet acting upon it ? Hmm, no. CHECK_FOR_INTERRUPTS() checks the InterruptPending variable, but on Windows it also checks for UNBLOCKED_SIGNAL_QUEUE(). And even if InterruptPending is set, it's not totally certain that CHECK_FOR_INTERRUPTS() won't return. I think InterruptPending can be set spuriously (even if that's not possible today, I wouldn't rely on it), and if you're in a HOLD/RESUME_INTERRUPTS block, CHECK_FOR_INTERRUPTS() will do nothing even if InterruptPending is true. The only sane way to make 3rd party code interruptible is to add CHECK_FOR_INTERRUPTS() to it, in safe places. No place is safe if CHECK_FOR_INTERRUPTS doesn't return. How could caller code clean up on interruption ? The postgres way is to use PG_TRY/PG_CATCH to make sure stuff gets cleaned up if an error or an interrupt occurs. You could use those to make the third-party library exception safe, but it'll probably be a quite invasive change :-(. Alternatively, you could replicate the check CHECK_FOR_INTERRUPTS() does, but then don't actually call ProcessInterrupts() but instead just make the third-party code abort signalling an error, and call CHECK_FOR_INTERRUPTS() after the third-party code has returned. On unix-like systems that'd be as simple as aborting if InterruptPending is set, while on windows you'd have to do the if(UNBLOCKED_SIGNAL_QUEUE())… stuff first, since otherwise InterruptPending will never get set.
The following function should do the trick:

bool
have_pending_interrupts()
{
#ifdef WIN32
	if (UNBLOCKED_SIGNAL_QUEUE())
		pgwin32_dispatch_queued_signals();
#endif
	return InterruptPending && !InterruptHoldoffCount && !CritSectionCount;
}

The third-party code would then do

	if (have_pending_interrupts())
		return some_error;

and you'd invoke it with

	state = third_party_code();
	CHECK_FOR_INTERRUPTS();
	if (state != success)
		ereport(…);

There might be a slim chance for false positives with that approach, since ProcessInterrupts() might not always ereport(), even if InterruptHoldoffCount and CritSectionCount are zero. But they'll simply get turned into spurious ereport() by the if (state != success) check after CHECK_FOR_INTERRUPTS, and should be very rare, and happen only shortly after a query cancel request was received. best regards, Florian Pflug
Re: [HACKERS] pg_stat_statments queryid
On Thu, May 24, 2012 at 4:26 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 24 May 2012 11:50, Magnus Hagander mag...@hagander.net wrote: Was there any actual reason why we didn't end up exposing queryid in the pg_stat_statements view? It would be highly useful when tracking changes over time. Right now I see people doing md5(query) to do that, which is a lot more ugly (and obviously uses more space and is slow, too). Right. I continue to maintain that this is a good idea. I raised the issue more than once. However, my proposal was not accepted by Tom and Robert, apparently on the basis that queryId's actual value was partially dictated by things like the endianness of the architecture used, and the value of OIDs when serialising and subsequently hashing the post-analysis tree. What I'd like to be able to do is aggregate this information over time and/or across standbys in a cluster, as queries are evicted and subsequently re-entered into pg_stat_statement's shared hash table. That's exactly the usecase I'm looking at here, except it's not actually across standbys in this case. Now, there are situations where this isn't going to work, like when a third-party logical replication system is used. That's unfortunate, but I wouldn't expect it makes the information any less useful to the large majority of people. I'd also credit our users with being discerning enough to realise that they should not jump to the conclusion that the value will be stable according to any particular standard. As long as it's documented as such, I don't see a problem with that at all. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_stat_statments queryid
On Thu, May 24, 2012 at 10:26 AM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 24 May 2012 11:50, Magnus Hagander mag...@hagander.net wrote: Was there any actual reason why we didn't end up exposing queryid in the pg_stat_statements view? It would be highly useful when tracking changes over time. Right now I see people doing md5(query) to do that, which is a lot more ugly (and obviously uses more space and is slow, too). Right. I continue to maintain that this is a good idea. I raised the issue more than once. However, my proposal was not accepted by Tom and Robert, apparently on the basis that queryId's actual value was partially dictated by things like the endianness of the architecture used, and the value of OIDs when serialising and subsequently hashing the post-analysis tree. No, my concern was more that I wasn't clear on what you hoped to do with it. But I think this explanation is enough to convince me that it might be worthwhile: What I'd like to be able to do is aggregate this information over time and/or across standbys in a cluster, as queries are evicted and subsequently re-entered into pg_stat_statement's shared hash table. Now, there are situations where this isn't going to work, like when a third-party logical replication system is used. That's unfortunate, but I wouldn't expect it makes the information any less useful to the large majority of people. I'd also credit our users with being discerning enough to realise that they should not jump to the conclusion that the value will be stable according to any particular standard. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons. First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skip rows for which the policy function returns false, instead of reporting 'permission denied' or something. But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting an error. And second, it avoids mimicking functionality that is already provided by an existing feature, namely triggers. People will have to deal with the trigger ordering issue, but that's nothing new, and I bet most people have a system in place for that. I usually prefix my trigger names with 'a_' to 'z_', for example, to make the ordering explicit. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to create references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… best regards, Florian Pflug
Re: [HACKERS] pg_stat_statments queryid
Robert Haas robertmh...@gmail.com writes: On Thu, May 24, 2012 at 10:26 AM, Peter Geoghegan pe...@2ndquadrant.com wrote: But I think this explanation is enough to convince me that it might be worthwhile: What I'd like to be able to do is aggregate this information over time and/or across standbys in a cluster, as queries are evicted and subsequently re-entered into pg_stat_statement's shared hash table. It appears to me that the above ... ... I'd also credit our users with being discerning enough to realise that they should not jump to the conclusion that the value will be stable according to any particular standard. ... is in direct contradiction to this. The proposed usage absolutely requires that the hash be stable over time and/or across standbys. I do not want to promise that it's stable over any timeframe longer than a server reboot. Aside from the OID dependence problem, we might well change the way the hash is calculated in minor releases, for example by adding or removing struct fields. regards, tom lane
Re: [HACKERS] [9.2] crash on regex
On Thu, May 24, 2012 at 10:16 AM, Marko Kreen mark...@gmail.com wrote: Following query crashes backend on 9.2: select substring('asd TO foo' from ' TO (([a-z0-9._]+|([^]+|)+)+)'); I spent some time trying to reduce this to the simplest case that still causes a crash, and came up with this: select substring('a' from '((a))+'); -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On Thu, May 24, 2012 at 10:13 AM, Joachim Wieland j...@mcknight.de wrote: On Tue, May 22, 2012 at 9:50 AM, Robert Haas robertmh...@gmail.com wrote: Hmm. I think that if you do it this way, the minimum recovery point won't be respected, which could leave you with a corrupted database. Now, if all the WAL files that you need are present in pg_xlog anyway, then they ought to get replayed anyway, but I think that if you are using restore_command (as opposed to streaming replication) we restore WAL segments under a different file name, which might cause this problem. Uhm, maybe I add some more details, so you get a better idea of what I did: The idea was to promote the standby to be the new master. There was streaming replication active but at some time I had to take the master down. IIRC from the log I saw that after the master went down, the standby continued recovering from a bunch of archived log files (via restore_command), I had suspected that either the standby was lagging behind a bit or that the master archived them during shutdown. When the standby didn't have anything else left to recover from (saying both xlog file foo doesn't exist and cannot connect to master), I deleted recovery.conf on the standby and restarted it. I wouldn't have assumed any corruption was possible given that I did clean shutdowns on both sides... The thing that's worrying me is that there's not really any such thing as a clean shutdown on a standby. When you shut down the master, it checkpoints. When you shut down the standby, it can't checkpoint, so I think it's still going to enter recovery at startup. It'd be interesting to know where that recovery began and ended as compared with the minimum recovery point just before the shutdown. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 8:24 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've been running some tests on pg 9.2beta1 and in particular a set of queries like create table _tmp0 as select * from ( select *, (select healpixid from idt_match as m where m.transitid=o.transitid) as x from idt_photoobservation as o offset 0 ) as y where x%16=ZZZ order by x; (where ZZZ is some integer number, 0 <= ZZZ < 16) With the following plan: -- Sort (cost=3228814504.96..3228815137.21 rows=252902 width=498) Sort Key: y.x - Subquery Scan on y (cost=0.00..3228791809.25 rows=252902 width=498) Filter: ((y.x % 16::bigint) = 0) - Limit (cost=0.00..3228033102.41 rows=50580456 width=490) - Seq Scan on idt_photoobservation o (cost=0.00..3228033102.41 rows=50580456 width=490) SubPlan 1 - Index Scan using idt_match_transitid_idx on idt_match m (cost=0.00..63.74 rows=1 width=8) Index Cond: (transitid = o.transitid) The schema of the tables are: e2es2= \d idt_match Table public.idt_match Column | Type | Modifiers ---+--+--- sourceid | bigint | transitid | bigint | healpixid | bigint | flagsxm | smallint | Indexes: idt_match_idx btree (healpixid) idt_match_transitid_idx btree (transitid) Table public.idt_photoobservation Column | Type | Modifiers ---+--+--- transitid | bigint | fluxbp0 | real | more columns Indexes: idt_photoobservation_idx btree (transitid) And I noticed that when I run the query like the one shown above in parallel (in multiple connections for ZZZ=0...8) the performance of each query drops down significantly (factor of 2)(despite the fact that during the execution of the query postgres is mostly CPU bound).
So I tried to oprofile it and strace it, And that's what I saw: The strace -c of each process shows something like that # Process 18660 detached % time seconds usecs/call calls errors syscall -- --- --- - - 76.25 0.001342 0 268987 semop 23.75 0.000418 0 61694 read 0.00 0.00 0 138 lseek 0.00 0.00 0 355 select 0.00 0.00 0 3 kill -- --- --- - - 100.00 0.001760 331177 total ### And the oprofile shows this on top: --- 2863981 25.7117 ReleasePredicateLocks 2863981 100.000 ReleasePredicateLocks [self] --- 1246629 11.1917 LocalBufferAlloc 1246629 100.000 LocalBufferAlloc [self] --- 1135393 10.1931 CheckForSerializableConflictIn 1135393 100.000 CheckForSerializableConflictIn [self] So there is a lot of locking for some reason, And I was surprised to see anything related to SSI in the profile at all, because I'm not running serializable transactions (I was just running my queries from multiple psql sessions). Are you sure? I looked at all the ReleasePredicateLocks calls and they appear to be guarded by:

    /* Nothing to do if this is not a serializable transaction */
    if (MySerializableXact == InvalidSerializableXact)
        return false;

What's the default isolation mode set to? merlin
Re: [HACKERS] pg_stat_statments queryid
On 24 May 2012 16:06, Tom Lane t...@sss.pgh.pa.us wrote: I do not want to promise that it's stable over any timeframe longer than a server reboot. You already have though, since pg_stat_statements persistently stores statistics to disk by default, and can only ever recognise statement equivalence based on the (dbid, userid, queryid) hash key. Aside from the OID dependence problem, we might well change the way the hash is calculated in minor releases, for example by adding or removing struct fields. You've already invalidated the saved statistics if you do that, so all bets are off anyway. If you have to do it, it'll be necessary to bump PGSS_FILE_HEADER, so that pg_stat_statements will be cleared upon restart. That will in turn necessitate documenting the issue in the minor version release notes. I'd hope to avoid that, but it doesn't seem to me that the situation is made any worse than before by exposing the value. On the contrary, it could help users to understand where the problem may have affected them. If you don't expose the value, users are going to do this sort of thing anyway, but will be far worse off due to using the query text or a hash thereof instead of the internal value. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Creating multiple indexes in one table scan.
* Gurjeet Singh (singh.gurj...@gmail.com) wrote: Bruce points out the even simpler case is to build several indexes in parallel over the same scan. I thought I had posted a patch to that effect long back, but upon searching my emails apparently I forgot about the patch. Attached is the patch that I developed in Nov. 2010, so expect a lot of bit rot. I had tried to make it elegant, but I have to admit it's a hack. This patch does not imply that it is using any kind of parallelism, the context in which that above statement was made. It just helps to avoid scanning the same relation multiple times. I performed some tests on it and AFAICR, this did not produce a net win. But those tests may have been performed in a virtual machine and not on bare metal, I forget. I'm not too surprised that it didn't help all that much since you're doing everything in one backend. My guess at what Bruce was talking about is being able to actually have multiple CREATE INDEX's going in parallel in different backends. If they didn't all grab an AccessExclusive lock, wouldn't they be able to use the same ring buffer to read through the table anyway? And with CREATE INDEX CONCURRENTLY, isn't this supported already? I haven't tried this yet, but it sure seems like something we could probably already claim to support.. Thanks, Stephen
Re: [HACKERS] [9.2] crash on regex
On 24 May 2012 16:08, Robert Haas robertmh...@gmail.com wrote: On Thu, May 24, 2012 at 10:16 AM, Marko Kreen mark...@gmail.com wrote: Following query crashes backend on 9.2: select substring('asd TO foo' from ' TO (([a-z0-9._]+|([^]+|)+)+)'); I spent some time trying to reduce this to the simplest case that still causes a crash, and came up with this: select substring('a' from '((a))+'); It appears to occur with any quantifier attached to a group that is more than 1 level deep with nothing between the groups, such as ((a))?, ((a))*, ((a)){1,4}, (((a)))+ Or if the quantifier is the only additional thing between the groups, such as ((a)+) But when breaking the groups up, it's fine, so this works: ((a)b)+ -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Creating multiple indexes in one table scan.
On Thu, May 24, 2012 at 11:22 AM, Stephen Frost sfr...@snowman.net wrote: * Gurjeet Singh (singh.gurj...@gmail.com) wrote: Bruce points out the even simpler case is to build several indexes in parallel over the same scan. I thought I had posted a patch to that effect long back, but upon searching my emails apparently I forgot about the patch. Attached is the patch that I developed in Nov. 2010, so expect a lot of bit rot. I had tried to make it elegant, but I have to admit its a hack. This patch does not imply that it is using any kind of parallelism, the context in which that above statement was made. It just helps to avoid scanning the same relation multiple times. I performed some tests on it and AFAICR, this did not produce a net win. But those tests may have been performed in a virtual machine and not on a bare metal, I forget. I'm not too surprised that it didn't help all that much since you're doing everything in one backend. My guess at what Bruce was talking about is being able to actually have multiple CREATE INDEX's going in parallel in different backends. If they didn't all grab an AccessExclusive lock, wouldn't they be able to use the same ring buffer to read through the table anyway? And with CREATE INDEX CONCURRENTLY, isn't this supported already? I haven't tried this yet, but it sure seems like something we could probably already claim to support.. It'd be great if one of standard utilities like pg_restore supported this, by spawning every concurrent index build in separate backends. Just a thought. -- Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] [9.2] crash on regex
On 24 May 2012 16:24, Thom Brown t...@linux.com wrote: On 24 May 2012 16:08, Robert Haas robertmh...@gmail.com wrote: On Thu, May 24, 2012 at 10:16 AM, Marko Kreen mark...@gmail.com wrote: Following query crashes backend on 9.2: select substring('asd TO foo' from ' TO (([a-z0-9._]+|([^]+|)+)+)'); I spent some time trying to reduce this to the simplest case that still causes a crash, and came up with this: select substring('a' from '((a))+'); It appears to occur with any quantifier attached to a group that is more than 1 level deep with nothing between the groups, such as ((a))?, ((a))*, ((a)){1,4}, (((a)))+ Or if the quantifier is the only additional thing between the groups, such as ((a)+) But when breaking the groups up, it's fine, so this works: ((a)b)+ Hmmm... curiously, lazy (non-greedy) quantifiers are stable, such as: ((a))*? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
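The greedy-versus-lazy distinction Thom observes can be illustrated with Python's regex engine (illustration only; Python's engine is unrelated to PostgreSQL's and does not crash here — this just shows the quantifier semantics being discussed):

```python
import re

# A greedy quantifier on a nested group takes as many repetitions as
# possible; a lazy (non-greedy) one takes as few as possible (zero here),
# which is why the lazy form exercises a different matching path.
greedy = re.match(r'((a))+', 'aaa').group(0)   # consumes all three 'a's
lazy = re.match(r'((a))*?', 'aaa').group(0)    # matches the empty string
```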
Re: [HACKERS] pg_stat_statments queryid
On 24 May 2012 16:06, Tom Lane t...@sss.pgh.pa.us wrote: What I'd like to be able to do is aggregate this information over time and/or across standbys in a cluster, as queries are evicted and subsequently re-entered into pg_stat_statement's shared hash table. It appears to me that the above ... ... I'd also credit our users with being discerning enough to realise that they should not jump to the conclusion that the value will be stable according to any particular standard. ... is in direct contradiction to this. I simply meant that we ought to be able to trust that people will actually investigate the stability guarantees of a newly exposed query_id before going and writing a tool that does some sort of aggregation - they should and will make informed decisions. If that guarantee is limited to "we might have to change the walker logic during a minor release, so the value might change for some queries, so we don't promise that it will be a stable identifier, but we will naturally strive to do our best to avoid invalidating existing statistics (within postgres and aggregated by external tools)", that wouldn't put many people off. Still, I don't think it's all that likely that we'll ever have to adjust the walker logic, since pg_stat_statements already doesn't strictly promise that collisions cannot occur, while making them very improbable. I'm not even asking that possible uses for queryId be documented as being useful for this sort of thing. I only ask that we expose it and document it. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
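The fragility being debated can be shown with a hypothetical sketch (invented names, not the actual pg_stat_statements code): a query id computed by hashing internal fields of the parse tree. Adding or removing a hashed field in a minor release changes every stored id, which is why saved statistics would have to be discarded (the PGSS_FILE_HEADER bump mentioned above).

```python
import hashlib
import struct

def query_id(dbid, userid, fields):
    """Hypothetical fingerprint: hash the (dbid, userid) key plus whatever
    internal fields the walker chooses to include."""
    h = hashlib.sha256(struct.pack('II', dbid, userid))
    for field in fields:
        h.update(repr(field).encode())
    return h.hexdigest()[:16]

old_id = query_id(1, 10, [("cmd", "SELECT"), ("rel_oid", 16384)])
# Suppose a minor release starts hashing one more struct field:
new_id = query_id(1, 10, [("cmd", "SELECT"), ("rel_oid", 16384), ("added", 0)])
```

The same OID-dependence applies: if `rel_oid` changes across a dump/restore, the id changes even though the query text did not.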
Re: [HACKERS] Creating multiple indexes in one table scan.
On Thu, May 24, 2012 at 11:25 AM, Gurjeet Singh singh.gurj...@gmail.com wrote: It'd be great if one of standard utilities like pg_restore supported this, by spawning every concurrent index build in separate backends. Just a thought. If parallel restore doesn't already take this into account when doing job scheduling, that would be a worthwhile improvement to consider. Personally, I think the big win in this area is likely to be parallel sort. There may well be some more we can squeeze out of our existing sort implementation first, and I'm all in favor of that, but ultimately if you've got 60GB of data to sort and it's all in cache, you want to be able to use more than one CPU for that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons. First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skip rows for which the policy function returns false, instead of reporting 'permission denied' or something. But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting an error. I definitely agree with starting a new feature from a simple implementation. Although I'm inclined to the approach of replacing references to tables with a security policy by sub-queries with the security barrier flag, instead of adding qualifiers to the where clause, to avoid the leaky-view scenario, it will make its implementation much simpler. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to create references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… I agree. The row level security policy should not be applied during FK checks (or other internal stuff; to be harmless). 
In the previous discussion, it was raised that iterating over FK/PK probes enables a malicious user to infer the existence of an invisible tuple and its key value, even though they cannot see the actual values. It is a well-documented limitation; thus, users should not use row-level security (or should not use a natural key) if they cannot accept this limitation. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
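The qualifier-ordering hazard driving the security-barrier design can be modelled in a few lines. This is a toy sketch in plain Python with invented names, not PostgreSQL code: if a cheap user-supplied function runs before the policy qualifier, it observes rows the policy was supposed to hide.

```python
leaked = []

def f_leak(row):            # attacker's function: records every row it sees
    leaked.append(row)
    return True

def policy(row, user):      # the security qualifier: uname = getpgusername()
    return row["uname"] == user

rows = [{"uname": "alice", "y": 1}, {"uname": "bob", "y": 2}]

# Cost-based reordering could evaluate f_leak first: bob's row leaks.
_ = [r for r in rows if f_leak(r) and policy(r, "alice")]
saw_secret = any(r["uname"] == "bob" for r in leaked)

# Security-barrier ordering: the policy filters first; f_leak only ever
# sees rows the current user is allowed to see.
leaked.clear()
visible = [r for r in rows if policy(r, "alice")]
result = [r for r in visible if f_leak(r)]
```

This is exactly what the security_barrier subquery enforces: user-supplied quals are never pushed below the policy qual, while leakproof quals still may be.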
Re: [HACKERS] [9.2] crash on regex
Robert Haas robertmh...@gmail.com writes: On Thu, May 24, 2012 at 10:16 AM, Marko Kreen mark...@gmail.com wrote: Following query crashes backend on 9.2: select substring('asd TO foo' from ' TO (([a-z0-9._]+|([^]+|)+)+)'); I spent some time trying to reduce this to the simplest case that still causes a crash, and came up with this: select substring('a' from '((a))+'); Yeah, I'm looking at it. Looks like I broke memory management somewhere in the quantifier revisions --- it seems to be stomping the stack during cleanup at the end of pg_regexec. Possibly a multiple-free deal? Haven't quite found it yet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On 24.05.2012 18:16, Robert Haas wrote: On Thu, May 24, 2012 at 10:13 AM, Joachim Wielandj...@mcknight.de wrote: I wouldn't have assumed any corruption was possible given that I did clean shutdowns on both sides... The thing that's worrying me is that there's not really any such thing as a clean shutdown on a standby. When you shut down the master, it checkpoints. When you shut down the standby, it can't checkpoint, so I think it's still going to enter recovery at startup. It'd be interesting to know where that recovery began and ended as compared with the minimum recovery point just before the shutdown. Perhaps we should introduce the concept of a clean standby shutdown. We can't write a checkpoint record, but we could write the same information somewhere else. Like in the control file. At startup, we'd see that we did a clean shutdown at WAL point X/X, and start up without having to read through all the WAL from the last master checkpoint. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
I'd like to summarize the current design being discussed. Syntax: ALTER TABLE tblname WITH ROW LEVEL SECURITY ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY; I tried to patch parser/gram.y, but there were syntax conflicts on the ADD / DROP sub-command. And I noticed ROW LEVEL SECURITY can be implemented without adding a new keyword, unlike SECURITY POLICY. As we discussed, the approach of implicitly appending additional qualifiers to the where clause causes a problem, because it does not address the order in which qualifiers are executed. So, I'm inclined to the approach of replacing references to tables with a security policy by sub-queries with the security barrier flag. For example, if tbl has a security policy, this query shall be rewritten internally as follows: original) SELECT * FROM tbl WHERE X > 20 AND f_leak(Y); rewritten) SELECT * FROM ( SELECT * FROM tbl WHERE uname = getpgusername() ) AS tbl_subqry WHERE X > 20 AND f_leak(Y); The sub-query shall have the security-barrier flag, so f_leak() is never pushed down, but X > 20 will be pushed down because of the leakproof attribute of the function. It is a bit more complex for an UPDATE or DELETE statement, but what I try to do is the same. original) UPDATE tbl SET X = X + 1 WHERE X > 20 AND f_leak(Y); rewritten) UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT ctid FROM ( SELECT ctid, * FROM tbl WHERE uname = getpgusername() ) AS tbl_subqry WHERE X > 20 AND f_leak(Y) ); That guarantees the security policy (uname = getpgusername()) is evaluated prior to user-given conditions. One thing I'm still thinking about is whether the security policy should be provided as a function or a clause. A simple enough sql function is inlined at simplify_function(), so there is no meaningful difference. I was afraid of code complexity, but all we should do is append the configured clause to the where clause of the inner sub-query. 
| ALTER TABLE tblname WITH ROW LEVEL SECURITY | ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; So, I tried to put condition clause instead of a function, right now. Regarding to FK constraints, I don't think it is a situation to apply row-level security policy towards internal queries. So, I plan to disable during FK checks. One other issue we didn't have discussed is table inheritance. In case when a table TBLP has a child table TBLC and only TBLC has its security policy, what security policy should be applied when we run SELECT * FROM TBLP. My preference is, the security policy is only applied to scan on TBLC, not TBLP. It is not desirable behavior that visible tuples are different from way to reference a certain table. In addition, if and when TBLP and TBLC have their own policy individually, what is a desirable behavior? I think, the security policy of both TBLP and TBLC should be applied on TBLC; in other words, it applies the security policy of all the parent tables to scan on child table. Any comments please. Thanks, 2012/5/24 Kohei KaiGai kai...@kaigai.gr.jp: 2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skips rows for which the policy function returns false, instead of reporting 'permission denied' or something. 
But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting and error. I definitely agree with starting a new feature from simple implementation. Although I'm inclined to the approach to replace references to tables with security policy by sub-queries with security barrier flag, instead of adding qualifiers of where clause to avoid the leaky-view scenario, it will make its implementation mush simpler. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On Thursday, May 24, 2012 06:35:06 PM Heikki Linnakangas wrote: On 24.05.2012 18:16, Robert Haas wrote: On Thu, May 24, 2012 at 10:13 AM, Joachim Wielandj...@mcknight.de wrote: I wouldn't have assumed any corruption was possible given that I did clean shutdowns on both sides... The thing that's worrying me is that there's not really any such thing as a clean shutdown on a standby. When you shut down the master, it checkpoints. When you shut down the standby, it can't checkpoint, so I think it's still going to enter recovery at startup. It'd be interesting to know where that recovery began and ended as compared with the minimum recovery point just before the shutdown. Perhaps we should introduce the concept of a clean standby shutdown. We can't write a checkpoint record, but we could write the same information somewhere else. Like in the control file. At startup, we'd see that we did a clean shutdown at WAL point X/X, and start up without having to read through all the WAL from the last master checkpoint. The control file currently is not a very good match because of the current requirement of staying below 512 bytes. If we were to include the list of running xacts, that wouldn't be enough. I wondered before if there is more to do to fix that than to do the atomic write();fsync();rename();fsync(); dance. I don't see a problem with the cost of that personally... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
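The write();fsync();rename();fsync() dance Andres mentions can be sketched as follows. This is an illustrative Python sketch (the file name and contents are made up, not an actual PostgreSQL state file): a reader always sees either the complete old contents or the complete new contents, never a torn write.

```python
import os
import tempfile

def atomic_write(path, data):
    """Atomically replace `path` with `data`: write to a temp file in the
    same directory, fsync it, rename over the target, fsync the directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, data)
        os.fsync(fd)              # data durable before it becomes visible
    finally:
        os.close(fd)
    os.replace(tmp, path)         # atomic rename over the old file
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # make the rename itself durable
    finally:
        os.close(dir_fd)
```

Opening a directory read-only to fsync it works on POSIX systems; a Windows port would need a different mechanism.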
Re: [HACKERS] pg_basebackup -x stream from the standby gets stuck
On Wed, May 23, 2012 at 9:25 PM, Magnus Hagander mag...@hagander.net wrote: While reviewing and cleaning this patch up a bit I noticed it actually broke pg_receivexlog in the renaming. Here is a new version of the patch, reworked based on the above so we're down to a single callback. I moved the "rename last segment file even if it's not complete" behaviour to be a parameter into ReceiveXlogStream() instead of trying to overload a third functionality on the callback (which is what broke pg_receivexlog). How does this look? Have I overlooked any cases? Thanks for the patch! Looks good to me except for the following: pg_basebackup.c:233: warning: passing argument 6 of 'ReceiveXlogStream' from incompatible pointer type I got the above warning on compile. To fix this, the third argument segment_finished needs to be added to reached_end_position(). It seems confusing that *stream_continue()* returns TRUE when streaming *cannot continue*, i.e., its name seems to be inconsistent with what it does. What about renaming it to stream_stop? Similarly, it also seems confusing that *continue_streaming()* returns TRUE when streaming *cannot continue*. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 6:11 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view. I become to think it is a better approach than tracking origin of each qualifiers. One problem is case handling on update or delete statement. It may be possible to rewrite the update / delete query as follows: From: UPDATE tbl SET X = X + 1 WHERE f_leak(Y) To: UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT * FROM ( SELECT ctid FROM tbl WHERE uname = getpgusername() == (*) should have security-barrier ) AS tbl_subqry WHERE f_leak(Y) ); Expanded sub-queries will have security-barrier flag, so it enforces the uname = getpgusername() being checked earlier than f_leak(Y). We may need to measure the performance impact due to the reform. The problem with this is that it introduces an extra instance of tbl into the query - there are now two rather than one. UPDATE .. FROM is supposed to be a way to avoid this, but it's insufficiently general to handle all the cases (e.g. UPDATE a LEFT JOIN b can't be written using the existing syntax). Anyway we want to avoid inserting self-joins for performance reasons if at all possible. It should be easy to do that in the case of SELECT; UPDATE and DELETE may need a bit more work. I think, this situation is similar to a case when we reference a view without privileges to underlying tables. If Bob set up a view with something tricky function, it allows Bob to reference credentials of users who reference the view. More or less, it might be a problem when a user try to invoke a user defined function declared by others. (Thus, sepgsql policy does not allow users to invoke a function declared by another one in different domain; without DBA's checks.) This is true, but there are still some new threat models. 
For example, currently, pg_dump isn't going to run any user-defined code just because you do SELECT * FROM table, but that will change with this patch. Note that pg_dump need not actually select from views, only tables. I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Seems like a somewhat random wart, if it's just an exception for superusers. I think we need to do better than that. For example, at my last company, sales reps A and B were permitted to see all customers of the company, but sales reps C, D, E, F, G, H, I, and J were permitted to see only their own accounts. Those sorts of policies need to be easy to implement. Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful. Sorry, I don't favor this idea. Even if table owner set up a function to generate additional qualifiers, it also has no guarantee the qualifiers are invoked prior to user-given one. It seems to me this approach will have same problem... It's not intended to solve the qual-ordering problem, just to allow additional policy flexibility. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? Yes, we don't need any new invent to check the value of new tuples. But it should be done after all the user-defined triggers. 
Existing triggers do not have a mechanism to enforce the order in which they are invoked. So, what I would really be implementing is a mechanism to inject some pseudo-triggers at the tail of the Trigger array. Start the trigger names with the letter z. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 6:20 AM, Florian Pflug f...@phlo.org wrote: But the security policy should still apply to the old rows, i.e. you shouldn't be able to UPDATE or DELETE rows you cannot see, no? Agreed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] plperl_helpers.h fix for clang
clang warns about the newish SvREFCNT_inc(sv) call in plperl_helpers.h because of an unused return value: the macro expansion of SvREFCNT_inc(sv) returns sv. The merit of that warning might be debatable, but it seems easy to fix by using SvREFCNT_inc_void(sv) instead. And we could use SvREFCNT_inc_simple_void(sv), since sv doesn't have any side effects, but that's optional. Any concerns? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… I agree. The row level security policy should not be applied during FK checks (or other internal stuff; to be harmless). At the previous discussion, it was issued that iteration of FK/PK proving enables malicious one to estimate existence of invisible tuple and its key value, although they cannot see the actual values. It is well documented limitation, thus, user should not use row- level security (or should not use natural key) if they cannot accept this limitation. You say I agree, but it seems to me that you and Florian are in fact taking opposite positions. FWIW, I'm inclined to think that you should NOT be able to create a row that references an invisible row. You might end up with that situation anyway, because we don't know what the semantics of the security policy are: rows might become visible or invisible after the fact, and we can't police that. But I think that if you take the opposite position that the select queries inside fkey triggers ought to be exempt from security policy, then you need to build some new mechanism to make that happen, which seems like extra work for no benefit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl_helpers.h fix for clang
On Thu, May 24, 2012 at 11:19 AM, Peter Eisentraut pete...@gmx.net wrote: clang warns about that newish SvREFCNT_inc(sv) call in plperl_helpers.h about an unused return value, because the macro expansion of SvREFCNT_inc(sv) returns sv. The merit of that warning might be debatable, but it seems easy to fix by using SvREFCNT_inc_void(sv) instead. And we could use SvREFCNT_inc_simple_void(sv), since sv doesn't have any side effects, but that's optional. Any concerns? Hrm, I can't seem to find either of those functions in my copy of perl 5.8.1 or 5.16.0. I also looked for _inc_ in the event I got the casing wrong :-(. Perhaps I misunderstood, are you proposing to introduce those functions? I'm fine with that. Or doing (void)SvREFCNT_inc(). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 18:42 , Kohei KaiGai wrote: I'd like to summarize the current design being discussed. syntax: ALTER TABLE tblname WITH ROW LEVEL SECURITY ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY; I tried to patch the parser/gram.y, but here was syntax conflicts on ADD / DROP sub-command. And, I noticed ROW LEVEL SECURITY allows to implement without adding new keyword, unlike SECURITY POLICY. Let the bike-shedding begin ;-) ALTER TABLE … WITH sounds a bit weird. What about ALTER TABLE tblname SET ROW POLICY condition FOR { SELECT | UPDATE | DELETE } ALTER TABLE tblname RESET ROW POLICY As we discussed, it causes a problem with approach to append additional qualifiers to where clause implicitly, because it does not solve the matter corresponding to the order to execute qualifiers. So, I'm inclined to the approach to replace reference to tables with security policy by sub-queries with security barrier flag. Since the security barrier flag carries a potentially hefty performance penalty, I think it should be optional. Application which don't allow SQL-level access to the database might still benefit from row-level security, because it saves them from having to manually add the WHERE clause to every statement, or having to wrap all their tables with views. Yet without direct SQL-level access, the security barrier thing isn't really necessary, so it'd be nice if they wouldn't have to pay for it. How about ALTER TABLE … SET ROW POLICY … WITH (security_barrier) One thing still I'm thinking is whether the security policy should be provided as a function or a clause. Enough simple sql function is inlined at simplify_function(), so here is no meaningful difference. I was afraid of code complexity, but all we should do is to append configured clause on the where clause of sub-query inside. 
| ALTER TABLE tblname WITH ROW LEVEL SECURITY | ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; So, I tried to put a condition clause instead of a function, right now. A single function seems much easier implementation-wise, since you wouldn't need to store an arbitrary expression in the catalog, just an oid. It also delegates the dependency tracking problem to the function. It also simplifies the grammar, because the FOR … clause cannot be mistaken to belong to the condition clause. One other issue we haven't discussed is table inheritance. In case when a table TBLP has a child table TBLC and only TBLC has its security policy, what security policy should be applied when we run SELECT * FROM TBLP? My preference is, the security policy is only applied to a scan on TBLC, not TBLP. Agreed. It is not desirable behavior that visible tuples differ depending on the way a certain table is referenced. In addition, if and when TBLP and TBLC have their own policies individually, what is a desirable behavior? I think the security policies of both TBLP and TBLC should be applied on TBLC; in other words, it applies the security policy of all the parent tables to a scan on the child table. I think security policies should only apply to the table they're declared for, not their child tables. Mostly because that is how triggers operate, and security policies and triggers will often be used together, so having their semantics regarding inheritance be the same seems to be the least surprising option. Also, if policies are inherited, how would you define a policy which applies only to the parent, not to the child? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Backends stalled in 'startup' state: index corruption
Yesterday I had a client that experienced a sudden high load on one of their servers (8.3.5 - yes, I know. Those of you with clients will understand). When I checked, almost all connections were in a startup state, very similar to this thread: http://postgresql.1045698.n5.nabble.com/9-1-3-backends-getting-stuck-in-startup-td5670712.html Running a strace showed a lot of semop activity, and the logs showed a successful connection, then a 5 minute plus wait before a query was issued. So obviously, blocking on something. Unlike the thread above, I *did* find problems in the system catalogs. For example, both pg_class and pg_index gave warnings like this for every index during a VACUUM FULL VERBOSE tablename: WARNING: index pg_class_relname_nsp_index contains 7712 row versions, but table contains 9471 row versions HINT: Rebuild the index with REINDEX. A REINDEX did not solve the problem (initially), as a REINDEX followed by a VAC showed the same warning and hint. The next step was dropping to standalone mode, but before that could be done, the REINDEXes fixed the problem (no warnings, no stalled connections). So my questions are:

* Why would a REINDEX not fix the problem as the hint suggested? Why would it then start working?
* What exactly is the relationship between bad indexes and shared memory locks?
* Is there some other emergency fix when it happens, such as killing all backends and hoping you kill the one that is actually holding the lock (if any was)?
* Did anything in the 8.3 series fix this? I saw nothing relevant in the release notes for everything up to 8.3.18 (which it will be on soon).

-- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] plperl_helpers.h fix for clang
On Thu, May 24, 2012 at 11:31 AM, Alex Hunsaker bada...@gmail.com wrote: On Thu, May 24, 2012 at 11:19 AM, Peter Eisentraut pete...@gmx.net wrote: And we could use SvREFCNT_inc_simple_void(sv), since sv doesn't have any side effects, but that's optional. Hrm I can't seem to find either of those functions in my copy of perl 5.8.1 or 5.16.0. I also looked for _inc_ in the event I got casing wrong :-(. Doh, it is indeed there in 5.16.0, looks like it got added in 5.10 :-(. (I was on the wrong branch...). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, 24 May 2012, Merlin Moncure wrote: Are you sure? I looked at all the ReleasePredicateLocks calls and they appear to be guarded by: /* Nothing to do if this is not a serializable transaction */ if (MySerializableXact == InvalidSerializableXact) return false; What's the default isolation mode set to? Sorry, it was indeed my mistake. I was pointing opreport to the binary which was recompiled, while the old version was still running, so I guess oprofile was picking up wrong function names. Here is the correct oprofile, when multiple queries are running: samples % symbol name --- 952457 13.8715 LWLockAcquire 952457 100.000 LWLockAcquire [self] --- 779077 11.3464 PinBuffer 779077 100.000 PinBuffer [self] --- 759898 11.0671 s_lock 759898 100.000 s_lock [self] --- 689753 6.7178 slot_deform_tuple 689753 100.000 slot_deform_tuple [self] --- 526002 5.1230 UnpinBuffer 526002 100.000 UnpinBuffer [self] When only one is running: samples % symbol name --- 163268 14.0343 slot_deform_tuple 163268 100.000 slot_deform_tuple [self] --- 126018 10.8324 _bt_compare 126018 100.000 _bt_compare [self] --- 113975 9.7972 ExecProject 113975 100.000 ExecProject [self] --- 49760 4.2773 FunctionCall2Coll 49760 100.000 FunctionCall2Coll [self] --- 49164 4.2261 LWLockAcquire 49164 100.000 LWLockAcquire [self] --- 43526 3.7414 hash_search_with_hash_value 43526 100.000 hash_search_with_hash_value [self] I guess there is nothing catastrophically wrong with that, but still I'm very surprised that you get severe locking problems (factor of two slow-down) when running parallel read-only transactions. Sergey * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.2] crash on regex
Thom Brown t...@linux.com writes: Hmmm... curiously, lazy (non-greedy) quantifiers are stable, such as: ((a))*? I've found it. The triggering conditions are (1) more than one set of capturing parens in a substring() pattern, and (2) at least one trial midpoint failing in ccondissect() or one of its siblings. That results in zaptreesubs() trying to clear an array entry that isn't there... trivial to fix fortunately. So it's not so much what the quantifiers are as whether the first attempted match succeeds. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_preload_libraries path
On tor, 2012-05-24 at 14:09 +0200, Magnus Hagander wrote: Would it make sense to have a postgresql.conf parameter that would add to LD_LIBRARY_PATH when loading libraries from shared_preload_libraries (and other library loads)? Well, you could write a library that sets it in its init function, and load that before the library that needs it set. :-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 6:24 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've been running some tests on pg 9.2beta1 and in particular a set of queries like ... And I noticed than when I run the query like the one shown above in parallel (in multiple connections for ZZZ=0...8) the performance of each query drops down significantly (factor of 2)(despite the fact that during the execution of the query postgres is mostly CPU bound). Is this a regression from previous versions (9.0, 9.1), or has the slow down always been there? Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: I guess there is nothing catastrophically wrong with that, but still I'm very suprised that you get severe locking problems (factor of two slow-down) when running parallel read-only transactions. Me, too. How many concurrent connections are you running, and does your working set exceed shared_buffers? Can you provide a self-contained reproducible test case? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_preload_libraries path
On Thu, May 24, 2012 at 7:58 PM, Peter Eisentraut pete...@gmx.net wrote: On tor, 2012-05-24 at 14:09 +0200, Magnus Hagander wrote: Would it make sense to have a postgresql.conf parameter that would add to LD_LIBRARY_PATH when loading libraries from shared_preload_libraries (and other library loads)? Well, you could write a library that sets it in its init function, and load that before the library that needs it set. :-) :-) What's life without workarounds, eh? ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl_helpers.h fix for clang
On tor, 2012-05-24 at 11:36 -0600, Alex Hunsaker wrote: On Thu, May 24, 2012 at 11:31 AM, Alex Hunsaker bada...@gmail.com wrote: On Thu, May 24, 2012 at 11:19 AM, Peter Eisentraut pete...@gmx.net wrote: And we could use SvREFCNT_inc_simple_void(sv), since sv doesn't have any side effects, but that's optional. Hrm I can't seem to find either of those functions in my copy of perl 5.8.1 or 5.16.0. I also looked for _inc_ in the event I got casing wrong :-(. Doh, it is indeed there in 5.16.0, looks like it got added in 5.10 :-(. (I was on the wrong branch...). It's in ppport.h. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Creating multiple indexes in one table scan.
On 05/24/2012 11:44 AM, Robert Haas wrote: On Thu, May 24, 2012 at 11:25 AM, Gurjeet Singh singh.gurj...@gmail.com wrote: It'd be great if one of the standard utilities like pg_restore supported this, by spawning each concurrent index build in a separate backend. Just a thought. If parallel restore doesn't already take this into account when doing job scheduling, that would be a worthwhile improvement to consider. Parallel restore allows it in some cases, but not in others. You can't create an index-backed constraint concurrently with any other index because ALTER TABLE takes an ACCESS EXCLUSIVE lock. We could get around that by changing pg_dump to create the index first and then add the constraint using the existing index. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
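For reference, the existing-index form Andrew mentions is already available at the SQL level (since 9.1); table and index names below are illustrative:

```sql
-- CREATE INDEX takes only a SHARE lock on the table, so several index
-- builds can proceed in parallel restore jobs:
CREATE UNIQUE INDEX foo_pkey_idx ON foo (id);

-- The ALTER TABLE still takes ACCESS EXCLUSIVE, but it is now a brief
-- catalog-only step: the existing index is adopted (and renamed to the
-- constraint name) instead of being rebuilt.
ALTER TABLE foo ADD CONSTRAINT foo_pkey PRIMARY KEY USING INDEX foo_pkey_idx;
```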
Re: [HACKERS] Changing the concept of a DATABASE
Yes, we do. It would be best to conclude that things I do on hackers relate in some way to those goals, even if it isn't immediately clear how. See, now you've got me all curious. How do inter-DB queries relate to the New Replication? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Stateful pointers in set-returning functions
Hi, I'm writing a set-returning function which places a file handle into PG's FuncCallContext's user_fctx space. My problem is that when the function is run with a LIMIT clause (SELECT * FROM foo() LIMIT 10) the server will stop calling the function automatically, not giving me a chance to close the file handle. Is there a way to get the limit value inside of foo() and set the max_calls parameter correctly? Thanks, Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, 24 May 2012, Robert Haas wrote: On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: I guess there is nothing catastrophically wrong with that, but still I'm very suprised that you get severe locking problems (factor of two slow-down) when running parallel read-only transactions. Me, too. How many concurrent connections are you running, and does your working set exceed shared_buffers? Can you provide a self-contained reproducible test case? The last tests I've been doing were with 8 connections. And the working set is roughly 30Gig, which is ~ 3x the shared buffers (but ~ 50% of RAM). Regarding the test case, I'll try to see whether I can still observe the same slowdown if I chop the main table down by a factor of a few, so I can put the data somewhere for download. S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stateful pointers in set-returning functions
Ian Pye ian...@gmail.com writes: I'm writing a set-returning function which places a file handle into PG's FuncCallContext's user_fctx space. My problem is that when the function is ran with a limit clause (SELECT * FROM foo() LIMIT 10) the server will stop calling the function automatically, not giving me a chance to close the file handle. Is there a way to get the limit value inside of foo() and set the max max_calls parameter correctly? No, and even if there were, this would be a very unsafe practice, since errors or other issues could result in early termination of the query. You would likely be better off using tuplestore return mode so that you can do all the reading during one call and not have to assume that you'll get control back again. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
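A minimal sketch of the tuplestore approach Tom describes, written against the usual server-side SRF APIs (error handling and the actual file-reading loop are elided; this is an outline under those assumptions, not a drop-in implementation):

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "miscadmin.h"

PG_FUNCTION_INFO_V1(foo);

Datum
foo(PG_FUNCTION_ARGS)
{
    ReturnSetInfo   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
    TupleDesc        tupdesc;
    Tuplestorestate *tupstore;
    MemoryContext    oldcxt;

    if (rsinfo == NULL || !(rsinfo->allowedModes & SFRM_Materialize))
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("materialize mode required")));

    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    /* The tuplestore must outlive this call, so build it in per-query memory */
    oldcxt = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
    tupstore = tuplestore_begin_heap(true, false, work_mem);
    MemoryContextSwitchTo(oldcxt);

    /*
     * Open the file here, loop reading rows and calling
     * tuplestore_putvalues(tupstore, tupdesc, values, nulls),
     * then close the handle -- all within this single call, so a LIMIT
     * that cuts off the scan early can no longer leak the handle.
     */

    rsinfo->returnMode = SFRM_Materialize;
    rsinfo->setResult = tupstore;
    rsinfo->setDesc = tupdesc;

    PG_RETURN_NULL();
}
```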
Re: [HACKERS] Changing the concept of a DATABASE
On Thursday, May 24, 2012 08:12:56 PM Josh Berkus wrote: Yes, we do. It would be best to conclude that things I do on hackers relate in some way to those goals, even if it isn't immediately clear how. See, now you've got me all curious. How does inter-DB queries relate to the New Replication? Being able to apply changes from one process to multiple databases reduces implementation/runtime overhead... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stateful pointers in set-returning functions
Fair enough -- thanks for the tip. On Thu, May 24, 2012 at 11:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ian Pye ian...@gmail.com writes: I'm writing a set-returning function which places a file handle into PG's FuncCallContext's user_fctx space. My problem is that when the function is ran with a limit clause (SELECT * FROM foo() LIMIT 10) the server will stop calling the function automatically, not giving me a chance to close the file handle. Is there a way to get the limit value inside of foo() and set the max max_calls parameter correctly? No, and even if there were, this would be a very unsafe practice, since errors or other issues could result in early termination of the query. You would likely be better off using tuplestore return mode so that you can do all the reading during one call and not have to assume that you'll get control back again. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
Andres Freund and...@anarazel.de writes: The control file currently is not a very good match because of the current requirement of staying below 512 bytes. If we would include the list of running xacts that wouldn't be enough. I wondered before if there is more to do to fix that than to do the atomic write();fsync();rename();fsync(); dance. I don't see a problem with the cost of that personally... The reason for keeping it to one sector is that you're screwed if the file is broken, so the fewer failure modes the better. I'm not sure I believe that we can make a recovery resume from an arbitrary point in WAL anyway, or that it would be worth the trouble. Can't we just resume from the last restartpoint? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
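The write/fsync/rename/fsync dance Andres refers to, sketched as a standalone POSIX C helper (path handling and error reporting are simplified; the function name is invented for illustration):

```c
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Atomically replace "path" with "buf": write a temp file and fsync it,
 * rename it over the target (atomic on POSIX filesystems), then fsync
 * the containing directory so the rename itself is durable.
 * Returns 0 on success, -1 on error.
 */
static int
atomic_replace(const char *dir, const char *path, const char *tmp,
               const void *buf, size_t len)
{
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    if (rename(tmp, path) != 0)
        return -1;

    /* make the directory entry change durable too */
    int dfd = open(dir, O_RDONLY);

    if (dfd < 0 || fsync(dfd) != 0)
        return -1;
    return close(dfd);
}
```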
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On Thursday, May 24, 2012 08:32:47 PM Tom Lane wrote: Andres Freund and...@anarazel.de writes: The control file currently is not a very good match because of the current requirement of staying below 512 bytes. If we would include the list of running xacts that wouldn't be enough. I wondered before if there is more to do to fix that than to do the atomic write();fsync();rename();fsync(); dance. I don't see a problem with the cost of that personally... The reason for keeping it to one sector is that you're screwed if the file is broken, so the fewer failure modes the better. Yea, sure. But given the amount of software that depends on the above sequence to work correctly I don't really see much of a problem... I'm not sure I believe that we can make a recovery resume from an arbitrary point in WAL anyway, or that it would be worth the trouble. Can't we just resume from the last restartpoint? Well, with a decent sized checkpoint_segments getting up2date can take quite a noticeable amount of time... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 2:19 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Thu, 24 May 2012, Robert Haas wrote: On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: I guess there is nothing catastrophically wrong with that, but still I'm very suprised that you get severe locking problems (factor of two slow-down) when running parallel read-only transactions. Me, too. How many concurrent connections are you running, and does your working set exceed shared_buffers? Can you provide a self-contained reproducible test case? The last tests I've been doing were with 8 connections. And the working set is roughly 30Gig, which is ~ 3x the shared buffers. (but ~ 50% of RAM). Given that additional information, I would say these results are expected. Unfortunately, our BufFreelistLock is a serious contention point, and I think that's what you're hitting. See the graph here: http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html As you can see, raw performance isn't much worse with the larger data sets, but scalability at high connection counts is severely degraded once the working set no longer fits in shared_buffers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
On Wed, May 23, 2012 at 2:28 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Dec 31, 2011 at 10:34 PM, Simon Riggs si...@2ndquadrant.com wrote: Send new protocol keepalive messages to standby servers. Allows streaming replication users to calculate transfer latency and apply delay via internal functions. No external functions yet. Is there plan to implement such external functions before 9.2 release? If not, keepalive protocol seems to be almost useless because there is no use of it for a user and the increase in the number of packets might increase the replication performance overhead slightly. No? Good point. IMHO, this shouldn't really have been committed like this, but since it was, we had better fix it, either by reverting the change or forcing an initdb to expose the functionality. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
Andres Freund and...@anarazel.de writes: On Thursday, May 24, 2012 08:32:47 PM Tom Lane wrote: I'm not sure I believe that we can make a recovery resume from an arbitrary point in WAL anyway, or that it would be worth the trouble. Can't we just resume from the last restartpoint? Well, with a decent sized checkpoint_segments getting up2date can take quite a noticeable amount of time... So? Slow restart is precisely the price you pay for a large checkpoint interval. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
Robert Haas robertmh...@gmail.com writes: On Wed, May 23, 2012 at 2:28 PM, Fujii Masao masao.fu...@gmail.com wrote: Is there plan to implement such external functions before 9.2 release? If not, keepalive protocol seems to be almost useless because there is no use of it for a user and the increase in the number of packets might increase the replication performance overhead slightly. No? Good point. IMHO, this shouldn't really have been committed like this, but since it was, we had better fix it, either by reverting the change or forcing an initdb to expose the functionality. I see no reason to rip the code out if we have plans to make use of it in the near future. I am also not for going back into development mode on 9.2, which is what adding new functions now would amount to. What's wrong with leaving well enough alone? It's not like there is no unfinished work anywhere else in Postgres ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not open relation with OID errors after promoting the standby to master
On Thursday, May 24, 2012 08:46:21 PM Tom Lane wrote: Andres Freund and...@anarazel.de writes: On Thursday, May 24, 2012 08:32:47 PM Tom Lane wrote: I'm not sure I believe that we can make a recovery resume from an arbitrary point in WAL anyway, or that it would be worth the trouble. Can't we just resume from the last restartpoint? Well, with a decent sized checkpoint_segments getting up2date can take quite a noticeable amount of time... So? Slow restart is precisely the price you pay for a large checkpoint interval. Well, no. If you stop the master in an orderly fashion it will checkpoint and thus have a sensible startup time. There is no such possibility for a standby without interaction with the master. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 1:43 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, May 24, 2012 at 2:19 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Thu, 24 May 2012, Robert Haas wrote: On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: I guess there is nothing catastrophically wrong with that, but still I'm very suprised that you get severe locking problems (factor of two slow-down) when running parallel read-only transactions. Me, too. How many concurrent connections are you running, and does your working set exceed shared_buffers? Can you provide a self-contained reproducible test case? The last tests I've been doing were with 8 connections. And the working set is roughly 30Gig, which is ~ 3x the shared buffers. (but ~ 50% of RAM). Given that additional information, I would say these results are expected. Unfortunately, our BufFreelistLock is a serious contention point, and I think that's what you're hitting. See the graph here: http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html As you can see, raw performance isn't much worse with the larger data sets, but scalability at high connection counts is severely degraded once the working set no longer fits in shared_buffers. Hm, wouldn't the BufFreelistLock issue be ameliorated if StrategyGetBuffer could reserve multiple buffers so that you'd draw down your local list and only then go back to the global pool? (easier said than done obviously). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Archiver not exiting upon crash
On Wed, May 23, 2012 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Jeff Janes jeff.ja...@gmail.com writes: But what happens if the SIGQUIT is blocked before the system(3) is invoked? Does the ignore take precedence over the block, or does the block take precedence over the ignore, and so the signal is still waiting once the block is reversed after the system(3) is over? I could write a test program to see, but that wouldn't be very good evidence of the portability. AFAICT from the POSIX spec for system(3), that would be a bug in system(). Actually, on further thought, it seems like there is *necessarily* a race condition in this. There must be some interval where the child process has already exited but the waiting parent hasn't de-ignored the signals. Yup. The POSIX man page doesn't say that this is guaranteed to be atomic, so I assume no such guarantee is even attempted. For that matter, the man page for system(3) doesn't even tell how one goes about signaling the child--unless the signal is sent to the negation of the group leader, the child won't receive it. (Postgres handles the group-leader thing correctly; I just point it out to show the man page is not intended to be complete.) However, I remain unsatisfied with this idea as an explanation for the behavior you're seeing. In the first place, that race condition window ought not be wide enough to allow failure probabilities as high as 10%. In the second place, that code has been like that for a long while, so this theory absolutely does not explain why you're seeing a materially higher probability of failure in HEAD than 9.1. There is something else going on. After a while trying to bisect the behavior, I decided it was a mug's game. Both arms of the race (the firing of archive_command and the engineered crash) are triggered indirectly by the same event, the start of a checkpoint.
Small changes in the code can lead to small changes in the timing which make drastic changes in how likely it is that the two arms collide exactly at the vulnerability. So my test harness is an inexplicably effective show-case for the vulnerability, but it is not the reason the vulnerability should be fixed. By the way, my archive_command is very fast, as all it does is echo the date into a log file. I want postgres to think it is in archive mode, but for this purpose I don't want to actually deal with having an archive. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
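For the record, a logging-only archive_command of the kind described can be set up like this in postgresql.conf (the log path is illustrative; for test use only, since the WAL segment is not actually preserved):

```
archive_mode = on
# %f expands to the WAL file name; returning 0 satisfies the archiver
archive_command = 'echo "$(date) %f" >> /tmp/archive_stub.log'
```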
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
Greg Sabino Mullane g...@endpoint.com writes: Yesterday I had a client that experienced a sudden high load on one of their servers (8.3.5 - yes, I know. Those of you with clients will understand). When I checked, almost all connections were in a startup state, very similar to this thread: http://postgresql.1045698.n5.nabble.com/9-1-3-backends-getting-stuck-in-startup-td5670712.html Running a strace showed a lot of semop activity, and the logs showed a successful connection, then a 5 minute plus wait before a query was issued. So obviously, blocking on something. Did you check I/O activity? I looked again at Jeff Frost's report and now think that what he saw was probably a lot of seqscans on bloated system catalogs, cf http://archives.postgresql.org/message-id/28484.1337887...@sss.pgh.pa.us Unlike the thread above, I *did* find problems in the system catalogs. For example, both pg_class and pg_index gave warnings like this for every index during a VACUUM FULL VERBOSE tablename: WARNING: index pg_class_relname_nsp_index contains 7712 row versions, but table contains 9471 row versions HINT: Rebuild the index with REINDEX. That's fairly interesting, but if it was a bloat situation then it would've been the VAC FULL that fixed it rather than the REINDEX. Did you happen to save the VERBOSE output? It'd be really useful to know whether there was any major shrinkage of the core catalogs (esp. pg_class, pg_attribute). * Did anything in the 8.3 series fix this? I think there are probably two independent issues here. The missing index entries are clearly bad but it's not clear that they had anything to do with the startup stall. There are a couple of fixes in recent 8.3.x releases that might possibly explain the index corruption, especially if you're in the habit of reindexing the system catalogs frequently. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Missing optimization when filters are applied after window functions
On Wed, May 16, 2012 at 09:25:13AM -0400, Tom Lane wrote: Volker Grabsch v...@notjusthosting.com writes: I propose the following general optimization: if all window functions are partitioned by the same first field (here: id), then any filter on that field should be executed before WindowAgg. I'm not sure if that rule is correct in detail, but in any case the short answer is that window aggregates are a new feature in Postgres and we basically haven't done any optimization work on them yet. Feel free to work in that area if it interests you... Is this a TODO? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
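An illustration of the proposed optimization (table and column names invented): because every window function below partitions by id, the outer filter on id could safely be evaluated before the WindowAgg node:

```sql
-- As written, the filter on id is applied only after WindowAgg:
SELECT *
FROM (SELECT id, ts,
             avg(val) OVER (PARTITION BY id ORDER BY ts) AS avg_val
      FROM measurements) AS sub
WHERE id = 42;

-- Equivalent form with the filter pushed below the window aggregate;
-- the proposal is for the planner to perform this rewrite itself:
SELECT id, ts,
       avg(val) OVER (PARTITION BY id ORDER BY ts) AS avg_val
FROM measurements
WHERE id = 42;
```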
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
On Thu, May 24, 2012 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, May 23, 2012 at 2:28 PM, Fujii Masao masao.fu...@gmail.com wrote: Is there plan to implement such external functions before 9.2 release? If not, keepalive protocol seems to be almost useless because there is no use of it for a user and the increase in the number of packets might increase the replication performance overhead slightly. No? Good point. IMHO, this shouldn't really have been committed like this, but since it was, we had better fix it, either by reverting the change or forcing an initdb to expose the functionality. I see no reason to rip the code out if we have plans to make use of it in the near future. I am also not for going back into development mode on 9.2, which is what adding new functions now would amount to. What's wrong with leaving well enough alone? It's not like there is no unfinished work anywhere else in Postgres ... So, extra TCP overhead for no user-visible benefit doesn't bother you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 12:46 PM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, May 24, 2012 at 2:24 PM, Merlin Moncure mmonc...@gmail.com wrote: As you can see, raw performance isn't much worse with the larger data sets, but scalability at high connection counts is severely degraded once the working set no longer fits in shared_buffers. Hm, wouldn't the BufFreelistLock issue be ameliorated if StrategyGetBuffer could reserve multiple buffers so that you'd draw down your local list and only then go back to the global pool? (easier said than done obviously). hm, looking at the code some more, it looks like the whole point of the strategy system is to do that. I thought you were suggesting that StrategyGetBuffer would pre-allocate multiple buffers to a backend under the cover of a single BufFreelistLock. If that is what you were suggesting, that is not what the strategy system is currently for. It is for locally reusing buffers, not for gang-allocating them. If a backend could somehow predict that the buffer it is about to read in is likely going to be a cold buffer, perhaps it would make sense for each backend to maintain a small ring of its own which it can reuse for such cold buffers. ISTM bulk insert type queries would be good candidates for a buffer strategy somehow? Probably. There is a code or README comment to that effect that I stumbled upon just a couple of hours ago, but can't immediately re-find it. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 3:46 PM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, May 24, 2012 at 2:24 PM, Merlin Moncure mmonc...@gmail.com wrote: As you can see, raw performance isn't much worse with the larger data sets, but scalability at high connection counts is severely degraded once the working set no longer fits in shared_buffers. Hm, wouldn't the BufFreelistLock issue be ameliorated if StrategyGetBuffer could reserve multiple buffers so that you'd draw down your local list and only then go back to the global pool? (easier said than done obviously). hm, looking at the code some more, it looks like the whole point of the strategy system is to do that. ISTM bulk insert type queries would be good candidates for a buffer strategy somehow? Yep. commit 85e2cedf985bfecaf43a18ca17433070f439fb0e Author: Tom Lane t...@sss.pgh.pa.us Date: Thu Nov 6 20:51:15 2008 + Improve bulk-insert performance by keeping the current target buffer pinned (but not locked, as that would risk deadlocks). Also, make it work in a small ring of buffers to avoid having bulk inserts trash the whole buffer arena. Robert Haas, after an idea of Simon Riggs'. But that doesn't help here, since pgbench is only looking up one row per query. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 3:35 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, May 24, 2012 at 3:46 PM, Merlin Moncure mmonc...@gmail.com wrote: hm, looking at the code some more, it looks like the whole point of the strategy system is to do that. ISTM bulk insert type queries would be good candidates for a buffer strategy somehow? Yep.

commit 85e2cedf985bfecaf43a18ca17433070f439fb0e
Author: Tom Lane t...@sss.pgh.pa.us
Date: Thu Nov 6 20:51:15 2008 +

Improve bulk-insert performance by keeping the current target buffer pinned (but not locked, as that would risk deadlocks). Also, make it work in a small ring of buffers to avoid having bulk inserts trash the whole buffer arena. Robert Haas, after an idea of Simon Riggs'.

But that doesn't help here, since pgbench is only looking up one row per query. Wait -- the OP's gripe isn't regarding standard pgbench, but multiple large concurrent 'insert into foo select...'. I looked in the code, and it appears that the only operations using a bulk-insert strategy are COPY, CREATE TABLE AS SELECT, and table rewrite operations. Concurrent INSERT ... SELECT apparently doesn't get the benefit of a strategy and should be fighting over the freelist once the pool exhausts. We don't get to skip WAL of course, but we should be able to use a bulk-insert strategy, especially if there were some way of predicting that a large number of tuples were going to be inserted. I'm wondering, though, if contention on the free list is in fact the OP's problem. merlin
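The ring-of-buffers idea behind that commit can be illustrated with a minimal sketch. This is not PostgreSQL's actual BufferAccessStrategy code; the type names, the ring size, and the stand-in global pool are all invented for illustration. The point is that once the private ring is primed, further allocations recycle ring slots instead of going back to the shared pool (and its lock):

```c
#include <assert.h>

#define RING_SIZE 16            /* invented size; the real bulk-write ring differs */

typedef struct BufferRing
{
    int buffers[RING_SIZE];     /* buffer ids owned by this backend */
    int next;                   /* next slot to recycle once the ring is full */
    int filled;                 /* how many slots have been populated so far */
} BufferRing;

/* Stand-in for the shared pool: every call models one trip through the
 * BufFreelistLock-protected allocator. */
static int next_global_buffer;
static int global_allocations;

static int
get_buffer_from_global_pool(void)
{
    global_allocations++;
    return next_global_buffer++;
}

/* Get a victim buffer for a bulk insert.  The ring is primed from the
 * shared pool; after that, allocations recycle ring slots and never touch
 * the shared pool again, so they cannot trash the whole buffer arena. */
static int
ring_get_buffer(BufferRing *ring)
{
    if (ring->filled < RING_SIZE)
    {
        int buf = get_buffer_from_global_pool();

        ring->buffers[ring->filled++] = buf;
        return buf;
    }
    else
    {
        int buf = ring->buffers[ring->next];

        ring->next = (ring->next + 1) % RING_SIZE;
        return buf;
    }
}
```

Under this sketch, a bulk load that touches thousands of pages only performs RING_SIZE shared-pool allocations, which is why giving INSERT ... SELECT such a strategy would relieve freelist contention.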
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
On Thu, May 24, 2012 at 03:54:54PM -0400, Tom Lane wrote: Did you check I/O activity? I looked again at Jeff Frost's report and now think that what he saw was probably a lot of seqscans on bloated system catalogs, cf http://archives.postgresql.org/message-id/28484.1337887...@sss.pgh.pa.us Thank you for the response. Yes, we did look at I/O, but nothing unusual was seen that would explain the load. If it happens again I'll see if the system catalogs are getting a lot of seqscans. That's fairly interesting, but if it was a bloat situation then it would've been the VAC FULL that fixed it rather than the REINDEX. Did you happen to save the VERBOSE output? It'd be really useful to know whether there was any major shrinkage of the core catalogs (esp. pg_class, pg_attribute). I did have them in screen, but the home screen box just crashed a few hours ago (after weeks of uptime: Murphy's law). It certainly could have been VAC FULL, as we have processes that do both VAC FULL and REINDEX periodically; I simply assumed based on the HINT that it was the REINDEX that cleared it up. The only snippet I have is:

# vacuum full verbose pg_class;
INFO: vacuuming pg_catalog.pg_class
INFO: pg_class: found 43 removable, 10376 nonremovable row versions in 518 pages
DETAIL: 6078 dead row versions cannot be removed yet. Nonremovable row versions range from 160 to 628 bytes long. There were 7367 unused item pointers. Total free space (including removable row versions) is 157000 bytes. 0 pages are or will become empty, including 0 at the end of the table. 0 pages containing 0 free bytes are potential move destinations. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index pg_class_oid_index now contains 7712 row versions in 32 pages
DETAIL: 6 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING: index pg_class_oid_index contains 7712 row versions, but table contains 9471 row versions
HINT: Rebuild the index with REINDEX.
INFO: index pg_class_relname_nsp_index now contains 7712 row versions in 113 pages
DETAIL: 6 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING: index pg_class_relname_nsp_index contains 7712 row versions, but table contains 9471 row versions
HINT: Rebuild the index with REINDEX.
VACUUM
Time: 65.635 ms

Actually, since we have full logging, those warnings should still be there... (checks) ... yep. Hints elided, the last ones seen were:

WARNING: index pg_class_oid_index contains 14877 row versions, but table contains 66648 row versions
WARNING: index pg_class_relname_nsp_index contains 14877 row versions, but table contains 66648 row versions
WARNING: index pg_class_oid_index contains 60943 row versions, but table contains 112714 row versions
WARNING: index pg_class_relname_nsp_index contains 60943 row versions, but table contains 112714 row versions

Looks like there are some more going back a ways: I'll gather those together and send them offlist after this. I think there are probably two independent issues here. The missing index entries are clearly bad, but it's not clear that they had anything to do with the startup stall. There are a couple of fixes in recent 8.3.x releases that might possibly explain the index corruption, especially if you're in the habit of reindexing the system catalogs frequently. Yes, we are in that habit. I hope they are not independent: I'd rather have one problem to worry about than two. :) I'll certainly report if I see either problem pop up again. Oh, almost forgot: reading your reply to the old thread reminded me of something I saw in one of the straces right as it woke up and left the startup state to do some work.
Here's a summary:

12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0 (x a gazillion)
...
12:18:40 brk(0x1c0af000) = 0x1c0af000
...(some more semops)...
12:18:40 mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062c98000
...(handful of semops)...
12:18:40 unlink(base/1554846571/pg_internal.init.11803) = -1 ENOENT (No such file or directory)
12:18:40 open(base/1554846571/pg_internal.init.11803, O_WRONLY|O_CREAT|O_TRUNC, 0666) = 13
12:18:40 fstat(13, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
12:18:40 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062cd9000
12:18:40 write(13, ...
...(normalish looking strace output after this)...

-- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
I think there are probably two independent issues here. The missing index entries are clearly bad but it's not clear that they had anything to do with the startup stall. On further log digging, I think you are correct, as those index warnings go back many days before the startup problems appeared. Let me know if you'd like any of those warnings from the logs. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Thu, May 24, 2012 at 4:46 PM, Merlin Moncure mmonc...@gmail.com wrote: Wait -- the OP's gripe isn't regarding standard pgbench, but multiple large concurrent 'insert into foo select...'. I looked in the code, and it appears that the only operations using a bulk-insert strategy are COPY, CREATE TABLE AS SELECT, and table rewrite operations. Concurrent INSERT ... SELECT apparently doesn't get the benefit of a strategy and should be fighting over the freelist once the pool exhausts. I think you are right. We don't get to skip WAL of course, but we should be able to use a bulk-insert strategy, especially if there were some way of predicting that a large number of tuples were going to be inserted. I'm wondering, though, if contention on the free list is in fact the OP's problem. Not sure. It might be some other LWLock, but it's hard to tell which one from the information provided. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Draft release notes complete
On 21 May 2012 19:10, Josh Berkus j...@agliodbs.com wrote: For these reasons, it may be timely and appropriate, from a purely advocacy point-of-view, to call our new group commit group commit in release notes and documentation, and announce it as a new feature. First, shouldn't we be having this discussion on -advocacy? Well, no, because this is a specific discussion about release notes. In any case, I've given up on the idea that we should market new group commit as group commit. I believe that would be a useful and fair way of representing the feature, but there doesn't seem to be any support for that view. In passing, I noticed this:

E.1.3.12.2. pg_stat_statements
Improve pg_stat_statements to aggregate similar queries (Peter Geoghegan, Tom Lane)
Improve pg_stat_statements' handling of PREPARE/EXECUTE statements (Tom Lane)
Add dirtied and written block counts to pg_stat_statements (Robert Haas)

I think that the second entry should be listed as a bug fix, or a compatibility note, rather than an actual feature. At the very least, it should be listed after "Add dirtied and written block counts". I also think that we should separately list as a feature pg_stat_statements' new ability to track I/O timings at the query granularity. Are we any closer to a list of major features? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] release note item
On Mon, May 21, 2012 at 12:19:27PM -0400, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Regarding the item: * Properly handle empty arrays returned by PL/Perl functions (Andrew Dunstan) DETAILS? This was a bug fix, not a feature, and in any case is due to Alex Hunsaker, not me. I assume this is in reference to these commits: Thanks, removed from 9.2 release notes. ---

Author: Andrew Dunstan and...@dunslane.net
Branch: REL9_1_STABLE Release: REL9_1_0 [22a55b3af] 2011-08-17 12:03:26 -0400
Properly handle empty arrays returned from plperl functions. Bug reported by David Wheeler, fix by Alex Hunsaker.

Author: Andrew Dunstan and...@dunslane.net
Branch: master [68c903a66] 2011-08-17 11:59:18 -0400
Properly handle empty arrays returned from plperl functions. Bug reported by David Wheeler, fix by Alex Hunsaker.

git_changelog failed to merge them because of the different commit messages, which is probably why Bruce thought it was a HEAD-only commit. AFAICS we should just remove this item, since it was (or should've been) documented already in the 9.1 release notes. regards, tom lane -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
Re: [HACKERS] Draft release notes complete
On Thu, May 24, 2012 at 10:34:22PM +0100, Peter Geoghegan wrote: In passing, I noticed this: E.1.3.12.2. pg_stat_statements Improve pg_stat_statements to aggregate similar queries (Peter Geoghegan, Tom Lane) Improve pg_stat_statements' handling of PREPARE/EXECUTE statements (Tom Lane) Add dirtied and written block counts to pg_stat_statements (Robert Haas) I think that the second entry should be listed as a bug fix, or a compatibility note, rather than an actual feature. At the very least, it should be listed after Add dirtied and written block counts. I also think that we should separately list as a feature OK, item moved down. We do not have a bug-fix designation. Do you have a suggestion? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] heap metapages
On Tue, May 22, 2012 at 09:52:30AM +0100, Simon Riggs wrote: Having pg_upgrade touch data files is both dangerous and difficult to back out in case of mistake, so I am wary of putting the metapage at block 0. Doing it the way I suggest means the .meta files would be wholly new and can be deleted as a back-out. We can also clean away any unnecessary .vm/.fsm files as a later step. Pg_upgrade never modifies the old cluster, except to lock it in link mode, so there is never anything to back out. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Draft release notes complete
On 24 May 2012 22:57, Bruce Momjian br...@momjian.us wrote: OK, item moved down. We do not have a bug-fix designation. Do you have a suggestion? I assumed you were going to put it beside the other compatibility note relating to pg_stat_statements, Change pg_stat_statements' total_time column to be measured in milliseconds (Tom Lane). The Improve pg_stat_statements' handling of PREPARE/EXECUTE statements item is just a way of preventing SQL PREPARE and EXECUTE utility statements from being double counted in various ways as both utility statements and optimisable statements. No one actually noticed this before, and it wouldn't have been feasible to fix in back branches, I think. Here are the relevant comments:

 * If it's an EXECUTE statement, we don't track it and don't increment
 * the nesting level. This allows the cycles to be charged to the
 * underlying PREPARE instead (by the Executor hooks), which is much more
 * useful.
 *
 * We also don't track execution of PREPARE. If we did, we would get one
 * hash table entry for the PREPARE (with hash calculated from the query
 * string), and then a different one with the same query string (but hash
 * calculated from the query tree) would be used to accumulate costs of
 * ensuing EXECUTEs. This would be confusing, and inconsistent with other
 * cases where planning time is not included at all.

Also, as I've said, this I/O timings thing certainly deserves to be separately listed as a new pg_stat_statements feature: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5b4f346611431361339253203d486789e4babb02 -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Robert Haas robertmh...@gmail.com writes: On Thu, May 24, 2012 at 4:46 PM, Merlin Moncure mmonc...@gmail.com wrote: We don't get to skip WAL of course, but we should be able to use a bulk-insert strategy, especially if there were some way of predicting that a large number of tuples were going to be inserted. I'm wondering, though, if contention on the free list is in fact the OP's problem. Not sure. It might be some other LWLock, but it's hard to tell which one from the information provided. Yeah. It seems quite plausible that Robert's select-only benchmark might be mainly tripping over the freelist lock, but I'm less convinced about something that's doing INSERT/SELECT, and therefore is also doing a lot of WAL activity, index insertions, etc. I'd want to see some instrumentation results before assuming we know where the bottleneck is there. regards, tom lane
Re: [HACKERS] Per-Database Roles
On Tue, May 22, 2012 at 10:19:12AM -0400, Robert Haas wrote: In retrospect, I think the idea of shared catalogs was probably a bad idea. I think we should have made roles and tablespaces database objects rather than shared objects, and come up with some ad-hoc method of representing the set of available databases. But that decision seems to have been made sometime pre-1996, so the thought of changing it now is pretty painful, but I can dream... Yes, pre-1996. I think the fact that authentication/user names appear in pg_hba.conf really locked the user name idea into global objects, and we have never really been able to make a dent in that. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
Greg Sabino Mullane g...@endpoint.com writes: Oh, almost forgot: reading your reply to the old thread reminded me of something I saw in one of the straces right as it woke up and left the startup state to do some work. Here's a summary:

12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0
12:18:39 semop(4390981, 0x7fff66c4ec10, 1) = 0 (x a gazillion)
...
12:18:40 brk(0x1c0af000) = 0x1c0af000
...(some more semops)...
12:18:40 mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062c98000
...(handful of semops)...
12:18:40 unlink(base/1554846571/pg_internal.init.11803) = -1 ENOENT (No such file or directory)
12:18:40 open(base/1554846571/pg_internal.init.11803, O_WRONLY|O_CREAT|O_TRUNC, 0666) = 13
12:18:40 fstat(13, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
12:18:40 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac062cd9000
12:18:40 write(13, ...
...(normalish looking strace output after this)...

Yeah, this is proof that what it was doing is the same as what we saw in Jeff's backtrace, ie loading up the system catalog relcache entries the hard way via seqscans on the core catalogs. So the question to be answered is why that's suddenly a big performance bottleneck. It's not a cheap operation of course (that's why we cache the results ;-)) but it shouldn't take minutes either. And, because they are seqscans, it doesn't seem like messed-up indexes should matter. The theory I have in mind about Jeff's case is that it was basically an I/O storm, but it's not clear whether the same explanation works for your case. There may be some other contributing factor that we haven't identified yet. regards, tom lane
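The strace excerpt shows the backend, after rebuilding the relcache entries the hard way, writing out a fresh pg_internal.init under a pid-suffixed temporary name. The general pattern it follows — write a private temp file, then rename it into place so readers never see a half-written cache file — can be sketched as below. This is an illustrative sketch only (the filename and helper are invented, not PostgreSQL's relcache.c):

```c
#include <stdio.h>
#include <unistd.h>

/* Write `data` to `path` atomically: first write a pid-suffixed temp file
 * (as in pg_internal.init.11803 above), then rename it into place.
 * rename(2) is atomic within a filesystem, so concurrent readers see
 * either the old complete file or the new complete file, never a torn one. */
static int
write_init_file(const char *path, const char *data)
{
    char tmp[256];
    FILE *f;

    snprintf(tmp, sizeof(tmp), "%s.%d", path, (int) getpid());

    f = fopen(tmp, "w");
    if (f == NULL)
        return -1;

    if (fputs(data, f) == EOF || fclose(f) == EOF)
    {
        remove(tmp);            /* don't leave a half-written temp file */
        return -1;
    }

    if (rename(tmp, path) != 0)
    {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

The stray unlink() returning ENOENT in the trace is consistent with this scheme: the backend clears any stale temp file of the same name before creating its own.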
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Hi, On Thu, 24 May 2012, Robert Haas wrote: Not sure. It might be some other LWLock, but it's hard to tell which one from the information provided. If you could tell me the best way to find out the info that you need, then I could run it reasonably quickly. S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 11:36 PM, Ants Aasma a...@cybertec.at wrote: ... The free list itself is a bit trickier, but if it's still necessary/useful then SC->firstFreeBuffer and buf->freeNext are in effect a linked-list stack; there should be plenty of tested lock-free algorithms floating around for that. (btw. lastFreeBuffer looks like dead code, is that correct?) Thinking about it a bit more, if the freelist is mostly empty, a simpler alternative would be to make an unprotected read to check SC->firstFreeBuffer and only acquire BufFreelistLock if there's anything to pop. This would reduce the lock-free parts to just atomically incrementing a variable. Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
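Ants' simpler alternative — peek at the freelist head without the lock, and take BufFreelistLock only when the list looks non-empty — can be sketched as follows. This is illustrative C, not PostgreSQL's freelist.c: the lock is modeled as a bare counter so the effect of skipping it is visible, and the head must be re-checked under the lock because the unprotected read can race with another backend:

```c
#include <assert.h>
#include <stdatomic.h>

#define NBUFFERS 64
#define FREELIST_EMPTY (-1)

static _Atomic int firstFreeBuffer = FREELIST_EMPTY; /* head of the free list */
static int freeNext[NBUFFERS];                       /* per-buffer next links */
static int lock_acquisitions;                        /* stands in for BufFreelistLock */

static void acquire_freelist_lock(void) { lock_acquisitions++; }
static void release_freelist_lock(void) { }

static void
freelist_push(int buf)
{
    acquire_freelist_lock();
    freeNext[buf] = atomic_load(&firstFreeBuffer);
    atomic_store(&firstFreeBuffer, buf);
    release_freelist_lock();
}

/* Pop a buffer: an unprotected (racy but safe) read first, so that once the
 * pool is exhausted the common path never touches the lock at all. */
static int
freelist_pop(void)
{
    int buf;

    if (atomic_load_explicit(&firstFreeBuffer, memory_order_relaxed) == FREELIST_EMPTY)
        return FREELIST_EMPTY;  /* steady-state fast path: no lock traffic */

    acquire_freelist_lock();
    buf = atomic_load(&firstFreeBuffer);    /* re-check under the lock */
    if (buf != FREELIST_EMPTY)
        atomic_store(&firstFreeBuffer, freeNext[buf]);
    release_freelist_lock();
    return buf;
}
```

Since the freelist is typically empty once shared_buffers fills up, this double-checked pattern eliminates lock acquisitions in exactly the regime where contention was observed.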