Re: [HACKERS] leaky views, yet again
(2010/09/02 13:30), KaiGai Kohei wrote:
(2010/09/02 12:38), Robert Haas wrote:
2010/9/1 KaiGai Kohei kai...@ak.jp.nec.com:
(2010/09/02 11:57), Robert Haas wrote:
2010/9/1 KaiGai Kohei kai...@ak.jp.nec.com:

Right now, it stands on a strict assumption that operators implemented with built-in functions are safe; that is, they have no possibility of leaking the supplied arguments anywhere. Please note that this patch does not care about the case where a function inside a view and a function outside a view are distributed into the same level and the latter function has a lower cost value.

Without making some attempt to address these two points, I don't see the point of this patch. Also, I believe we decided previously to do this deoptimization only in case the user requests it with CREATE SECURITY VIEW.

Perhaps I remember the previous discussion incorrectly. If we have a hint about whether the supplied view is intended for security purposes or not, it seems to me that a reliable method is to prevent pulling up the subqueries that come from security views. Is that too much deoptimization?

Well, that'd prevent something like id = 3 from getting pushed down, which seems a bit harsh.

I've tried to implement a proof-of-concept patch according to the following logic. (As a quick hack, it does not include statement support. It just considers views whose names begin with s to be security views.)

Hmm. If so, we need to remember which FromExprs came from subqueries of security views and which did not. Then we need to prevent leakable clauses from being distributed inside them. In addition, we also need to care about the order of function calls at the same level, because that is not implicitly solved.

At first, let's consider the top half of the matter. When views are expanded into subqueries in the query rewriter, the Query tree loses the OID of the view, because RangeTblEntry does not have the OID of the relation when it is RTE_SUBQUERY.
So, we need to patch here to set a flag that records whether the supplied view is security focused or not. This patch adds a 'security_view' flag to RangeTblEntry. It is set when the query rewriter expands security views.

Then, pull_up_simple_subquery() pulls up a supplied subquery into a normal join, if possible. In this case, the FromExpr is chained into the upper level. Of course, FromExpr does not have a flag to show its origin, so we also need to copy the new flag in RangeTblEntry to FromExpr. This patch also adds a 'security_view' flag to FromExpr. It is set when pull_up_simple_subquery() pulls up a RangeTblEntry with security_view = true.

Then, when distribute_qual_to_rels() is called, the caller also provides a Bitmapset of relation IDs which are contained under a FromExpr flagged as coming from a security view. Even if the supplied clause references only a part of the Bitmapset, we need to prevent the clause from being pushed down into the relations that came from security views, except for clauses we can be sure are safe.

Just before distribute_qual_to_rels(), deconstruct_recurse() is invoked. It walks the supplied join tree to collect the relations that appear under the current FromExpr/JoinExpr. deconstruct_recurse() was modified to take two new arguments, 'bool below_sec_barriers' and 'Relids *sec_barriers'. The first one means the current recursion is under a FromExpr with security_view being true. In that case, it adds the relations it encounters to sec_barriers, then returns. As a result, 'sec_barriers' becomes a bitmapset of the relations under the FromExpr that originated from security views.

Then, 'sec_barriers' is delivered to distribute_qual_to_rels(). If the supplied qualifier references a part of 'sec_barriers' and contains possibly leakable functions, it appends the whole of sec_barriers to the bitmapset of relations on which the clause depends.
As a result, it is not pushed down into the security view.

Example)
testdb=# CREATE VIEW n_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW
testdb=# CREATE VIEW s_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW
testdb=# EXPLAIN SELECT * FROM n_view WHERE f_malicious(y);
 QUERY PLAN
---
 Hash Join  (cost=334.93..365.94 rows=410 width=72)
   Hash Cond: (t1.a = t2.x)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=329.80..329.80 rows=410 width=36)
         ->  Seq Scan on t2  (cost=0.00..329.80 rows=410 width=36)
               Filter: f_malicious(y)
(6 rows)

testdb=# EXPLAIN SELECT * FROM s_view WHERE f_malicious(y);
 QUERY PLAN
---
 Hash Join  (cost=37.68..384.39 rows=410 width=72)
   Hash Cond: (t1.a = t2.x)
   Join Filter: f_malicious(t2.y)
   ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
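
The barrier rule described in this message can be modelled roughly as follows. This is only a sketch in Python, not the patch's C code: the function name required_relids and the integer relation IDs are made up for illustration, and plain frozensets stand in for the planner's Bitmapset.

```python
from typing import FrozenSet


def required_relids(clause_relids: FrozenSet[int],
                    sec_barriers: FrozenSet[int],
                    contains_leakable_func: bool) -> FrozenSet[int]:
    """Return the relations a clause must wait for before it may be evaluated.

    Hypothetical model of the rule in the message: a possibly leakable clause
    that touches any relation behind a security barrier is made to depend on
    every relation behind that barrier, so it cannot be pushed below the join.
    """
    touches_barrier = bool(clause_relids & sec_barriers)
    if touches_barrier and contains_leakable_func:
        return clause_relids | sec_barriers
    return clause_relids


# Assume s_view expands to t1 (relid 1) JOIN t2 (relid 2).
sec_barriers = frozenset({1, 2})

# f_malicious(y) references only t2 but may leak its argument.
leaky = required_relids(frozenset({2}), sec_barriers, True)
# A clause known to be safe keeps its original dependency set.
safe = required_relids(frozenset({2}), sec_barriers, False)

print(sorted(leaky))  # [1, 2]
print(sorted(safe))   # [2]
```

With the dependency set widened to both relations, the clause can only be attached at the join level, which matches the Join Filter seen in the s_view plan.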
Re: [HACKERS] git: uh-oh
On Thu, Sep 2, 2010 at 05:13, Robert Haas robertmh...@gmail.com wrote:
On Wed, Sep 1, 2010 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote:

That definitely didn't fix it, although I'm not quite sure why. Can you throw the modified CVS you ran this off of up somewhere I can rsync it?

no rsync server on that box, but I put up a tarball for you at http://www.hagander.net/pgsql/cvsrepo.tgz

OK, color me baffled. I looked at gram.c and I believe you obsoleted the right revs. The only difference I see between this and some other random deleted file is that it has a couple of tags pointing to revs that don't exist any more, but I can't see how that would cause the observed weirdness.

Well, I can try removing those to see what happens and run again.. Which tags and where? (and how do I actually remove them :P)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: register/unregister standby Re: [HACKERS] Synchronous replication
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

Hmm, that's clever. I was thinking that you'd initialize the standby from an existing backup, and in that context the standby would not need to connect to the master except via the replication connection. To take a base backup, you'll need not only that but also access to the filesystem in the master, ie. shell access.

In fact you don't need shell access here, it's rather easy to stream the base backup from the libpq connection, as implemented here:
http://github.com/dimitri/pg_basebackup

There's been some talk of being able to stream a base backup over the replication connection too, which would be extremely handy.

Yes please! :)

--
dim
Re: [HACKERS] array_agg() NULL Handling
David E. Wheeler da...@kineticode.com writes:
On Sep 1, 2010, at 10:52 AM, Thom Brown wrote:

Would appreciate the recipe for removing the NULLs.

WHERE clause :P

There may be cases where that's undesirable, such as there being more than one aggregate in the SELECT list, or the column being grouped on needing to return rows regardless as to whether there's NULLs in the column being targeted by array_agg() or not.

Exactly the issue I ran into:

SELECT name AS distribution,
       array_agg(
           CASE relstatus WHEN 'stable' THEN version ELSE NULL END
           ORDER BY version) AS stable,
       array_agg(
           CASE relstatus WHEN 'testing' THEN version ELSE NULL END
           ORDER BY version) AS testing
  FROM distributions
 GROUP BY name;

What about adding WHERE support to aggregates, adding to the ORDER BY capability they already have?

SELECT array_agg(version WHERE relstatus = 'stable' ORDER BY version)

The current way to do that is to use a subquery with unnest() and a WHERE clause there, but that's not a good way to avoid processing stored data in the aggregate / in the query.

Regards,
--
dim
Re: register/unregister standby Re: [HACKERS] Synchronous replication
On Thu, Sep 2, 2010 at 6:41 PM, Dimitri Fontaine dfonta...@hi-media.com wrote:

In fact you don't need shell access here, it's rather easy to stream the base backup from the libpq connection, as implemented here:
http://github.com/dimitri/pg_basebackup

There's been some talk of being able to stream a base backup over the replication connection too, which would be extremely handy.

Yes please! :)

One issue with the base backup function is that the operation will be a long transaction. So non-transactional special commands, like VACUUM, would be better in terms of performance. For example, CREATE or ALTER REPLICATION.

Of course, the function-based approach is more flexible and less invasive to the SQL parser. There are trade-offs.

--
Itagaki Takahiro
Re: [HACKERS] Synchronous replication - patch status inquiry
On Wed, Sep 1, 2010 at 7:23 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

That requirement falls out from the handling of disconnected standbys. If a standby is not connected, what does the master do with commits? If the answer is anything else than acknowledge them to the client immediately, as if the standby never existed, the master needs to know what standby servers exist. Otherwise it can't know if all the standbys are connected or not.

Thanks. I understood why the registration is required.

I'd like to keep this as simple as possible, yet flexible so that with enough scripting and extensions, you can get all sorts of behavior. I think quorum commit falls into the extension category; if your setup is complex enough, it's going to be impossible to represent that in our config files no matter what. But if you write a little proxy, you can implement arbitrary rules there.

Agreed.

I think recv/fsync/replay should be specified in the standby. It has no direct effect on the master, the master would just relay the setting to the standby when it connects, or the standby would send multiple XLogRecPtrs and let the master decide when the WAL is persistent enough.

The latter seems wasteful since the master uses only one XLogRecPtr even if the standby sends multiple ones. So I prefer the former design. It also makes the code and design very simple, and we can easily write the proxy.

sync vs async on the other hand should be specified in the master, because it has a direct impact on the behavior of commits in the master. I propose a configuration file standbys.conf, in the master:

# STANDBY NAME    SYNCHRONOUS   TIMEOUT
importantreplica  yes           100ms
tempcopy          no            10s

Seems good. In fact, instead of yes/no, could async/recv/fsync/replay be specified in the SYNCHRONOUS field?

OTOH, something like a standby_name parameter should be introduced in recovery.conf.

Should we allow multiple standbys with the same name? Probably yes.
We might need to add a NUMBER field to standbys.conf in the future.

Yeah, though of course you might want to set that per-standby too..

Yep.

Let's step back a bit and ask what would be the simplest thing that you could call synchronous replication in good conscience, and also be useful at least to some people. Let's leave out the down mode, because that requires registration. We'll probably have to do registration at some point, but let's take as small steps as possible.

Agreed.

Without the down mode in the master, frankly I don't see the point of the recv and fsync levels in the standby. Either way, when the master acknowledges a commit to the client, you don't know if it has made it to the standby yet because the replication connection might be down for some reason.

True. We cannot know whether the standby can be brought up to the master without any data loss when the master crashes, because the standby might have been disconnected earlier for some reason and be missing some of the latest data. But the situation would be the same even when 'replay' mode is chosen. Though we might be able to check whether the latest transaction has replicated to the standby by running a read-only query on the standby, it's actually difficult to do that. How can we know the content of the latest transaction?

Also, even when 'recv' or 'fsync' is chosen, we might be able to check that by calling pg_last_xlog_receive_location() on the standby. But a similar question occurs to me: how can we know the LSN of the latest transaction?

I'm thinking of introducing a new parameter specifying a command which is executed when the standby is disconnected. This command is executed by walsender before resuming the transaction processing that was suspended by the disconnection. For example, if STONITH against the standby is supplied as the command, we can prevent a standby that does not have the latest data from becoming the master by forcibly shutting such a delayed standby down. Thoughts?
That leaves us the 'replay' mode, which *is* useful, because it gives you the guarantee that when the master acknowledges a commit, it will appear committed in all hot standby servers that are currently connected. With that guarantee you can build a reliable cluster with something like pgpool-II where all writes go to one node, and reads are distributed to multiple nodes.

I'm concerned that the conflict between read-only queries and recovery might harm performance on the master in 'replay' mode. If the conflict occurs, all running transactions on the master have to wait for it to disappear, which can take very long. Of course, even without the conflict, waiting until the standby has received, fsync'd, read and replayed the WAL would take long. So I'd like to support 'recv' and 'fsync' as well. I believe those two modes are not complicated or difficult to implement.

I'm not sure what we should aim for in the first phase. But if you want as little code as possible yet
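
To make the standbys.conf proposal from this message a little more concrete, here is a rough Python sketch of how such a file could be read. Nothing here is taken from an actual patch: the three-column layout follows the example in the message, while the parser, the StandbyEntry type and the handling of the ms/s units are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyEntry:
    name: str
    synchronous: bool
    timeout_ms: int


def parse_timeout_ms(text: str) -> int:
    """Convert '100ms' or '10s' into milliseconds (only these two units assumed)."""
    if text.endswith("ms"):
        return int(text[:-2])
    if text.endswith("s"):
        return int(text[:-1]) * 1000
    raise ValueError("unsupported timeout: " + text)


def parse_standbys_conf(text: str) -> List[StandbyEntry]:
    """Parse the hypothetical three-column standbys.conf format."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, sync, timeout = line.split()
        if sync not in ("yes", "no"):
            raise ValueError("SYNCHRONOUS must be yes or no: " + sync)
        entries.append(StandbyEntry(name, sync == "yes", parse_timeout_ms(timeout)))
    return entries


SAMPLE = """\
# STANDBY NAME    SYNCHRONOUS   TIMEOUT
importantreplica  yes           100ms
tempcopy          no            10s
"""

entries = parse_standbys_conf(SAMPLE)
for entry in entries:
    print(entry.name, entry.synchronous, entry.timeout_ms)
```

If the SYNCHRONOUS column later takes async/recv/fsync/replay instead of yes/no, as suggested in the reply, only the validation step would need to change.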
Re: register/unregister standby Re: [HACKERS] Synchronous replication
On 30 August 2010 13:14, Fujii Masao masao.fu...@gmail.com wrote:

I think that the advantage of registering standbys is that we can specify which WAL files the master has to keep for the upcoming standby. IMO, it's usually called together with pg_start_backup as follows:

SELECT register_standby('foo', pg_start_backup())

This requests the master to keep all the WAL files following the backup starting location which pg_start_backup returns. Now we can do that by using wal_keep_segments, but it's not easy to set because it's difficult to predict how many WAL files the standby will require.

+1

I don't like the idea of having to guess how many WAL files you think you'll need to keep around. And if these standby instances have to register, could there be a view to list subscriber information?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
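
One way to picture the retention rule being discussed, as a rough sketch only: the master remembers the backup start location for each registered standby and keeps WAL from the oldest of them onward. The StandbyRegistry class below is hypothetical and is not how register_standby() would be implemented in the server; it only assumes that locations are written as two hexadecimal numbers separated by a slash, as in 0/2000020.

```python
from typing import Dict, Optional, Tuple

Lsn = Tuple[int, int]


def parse_lsn(text: str) -> Lsn:
    """Parse a location such as '0/2000020' into a comparable (high, low) pair."""
    high, low = text.split("/")
    return int(high, 16), int(low, 16)


class StandbyRegistry:
    """Hypothetical in-memory model of registered standbys on the master."""

    def __init__(self) -> None:
        self._start: Dict[str, Lsn] = {}

    def register_standby(self, name: str, start_lsn: str) -> None:
        self._start[name] = parse_lsn(start_lsn)

    def unregister_standby(self, name: str) -> None:
        self._start.pop(name, None)

    def oldest_required_lsn(self) -> Optional[Lsn]:
        """WAL at or after this location must be kept; None means no constraint."""
        return min(self._start.values()) if self._start else None


registry = StandbyRegistry()
registry.register_standby("foo", "0/2000020")
registry.register_standby("bar", "0/5000000")
print(registry.oldest_required_lsn() == parse_lsn("0/2000020"))  # True

registry.unregister_standby("foo")
print(registry.oldest_required_lsn() == parse_lsn("0/5000000"))  # True
```

A listing of the registry contents would be the natural source for the kind of subscriber view asked about above.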
Re: register/unregister standby Re: [HACKERS] Synchronous replication
Itagaki Takahiro itagaki.takah...@gmail.com writes:

http://github.com/dimitri/pg_basebackup

There's been some talk of being able to stream a base backup over the replication connection too, which would be extremely handy.

Yes please! :)

One issue with the base backup function is that the operation will be a long transaction. So non-transactional special commands, like VACUUM, would be better in terms of performance. For example, CREATE or ALTER REPLICATION.

Well, you still need to stream the data to the client in a format it will understand. Would that be the plan of your command proposal?

Of course, the function-based approach is more flexible and less invasive to the SQL parser. There are trade-offs.

Well that was easier for a proof-of-concept at least.

--
Dimitri Fontaine
PostgreSQL DBA, Architecte
Re: register/unregister standby Re: [HACKERS] Synchronous replication
On Thu, Sep 2, 2010 at 7:54 PM, Dimitri Fontaine dfonta...@hi-media.com wrote:

One issue with the base backup function is that the operation will be a long transaction. So non-transactional special commands, like VACUUM, would be better in terms of performance. For example, CREATE or ALTER REPLICATION.

Well, you still need to stream the data to the client in a format it will understand.

True, but using a libpq connection might not be the most important thing. The simplest proof-of-concept might be system(rsync) in the function ;-)

Would that be the plan of your command proposal?

What I meant was that function-based maintenance does not work well in some cases. I heard before that pg_start_backup( no-fast-checkpoint ) caused a table bloat problem because it was a long transaction lasting 20+ minutes. The backup function would have a similar issue.

--
Itagaki Takahiro
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, 2010-09-02 at 19:24 +0900, Fujii Masao wrote:
On Wed, Sep 1, 2010 at 7:23 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

That requirement falls out from the handling of disconnected standbys. If a standby is not connected, what does the master do with commits? If the answer is anything else than acknowledge them to the client immediately, as if the standby never existed, the master needs to know what standby servers exist. Otherwise it can't know if all the standbys are connected or not.

Thanks. I understood why the registration is required.

I don't. There is a simpler design that does not require registration. Please explain why we need registration, with an explanation that does not presume it as a requirement.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] git: uh-oh
Robert Haas wrote:
On Wed, Sep 1, 2010 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote:

That definitely didn't fix it, although I'm not quite sure why. Can you throw the modified CVS you ran this off of up somewhere I can rsync it?

no rsync server on that box, but I put up a tarball for you at http://www.hagander.net/pgsql/cvsrepo.tgz

OK, color me baffled. I looked at gram.c and I believe you obsoleted the right revs. The only difference I see between this and some other random deleted file is that it has a couple of tags pointing to revs that don't exist any more, but I can't see how that would cause the observed weirdness.

What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved.

Michael
Re: [HACKERS] Synchronous replication - patch status inquiry
On 02/09/10 15:03, Simon Riggs wrote:
On Thu, 2010-09-02 at 19:24 +0900, Fujii Masao wrote:
On Wed, Sep 1, 2010 at 7:23 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

That requirement falls out from the handling of disconnected standbys. If a standby is not connected, what does the master do with commits? If the answer is anything else than acknowledge them to the client immediately, as if the standby never existed, the master needs to know what standby servers exist. Otherwise it can't know if all the standbys are connected or not.

Thanks. I understood why the registration is required.

I don't. There is a simpler design that does not require registration. Please explain why we need registration, with an explanation that does not presume it as a requirement.

Please explain how you would implement don't acknowledge commits until they're replicated to all standbys without standby registration.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, 2010-09-02 at 15:15 +0300, Heikki Linnakangas wrote:
On 02/09/10 15:03, Simon Riggs wrote:
On Thu, 2010-09-02 at 19:24 +0900, Fujii Masao wrote:
On Wed, Sep 1, 2010 at 7:23 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

That requirement falls out from the handling of disconnected standbys. If a standby is not connected, what does the master do with commits? If the answer is anything else than acknowledge them to the client immediately, as if the standby never existed, the master needs to know what standby servers exist. Otherwise it can't know if all the standbys are connected or not.

Thanks. I understood why the registration is required.

I don't. There is a simpler design that does not require registration. Please explain why we need registration, with an explanation that does not presume it as a requirement.

Please explain how you would implement don't acknowledge commits until they're replicated to all standbys without standby registration.

All standbys has no meaning without registration. It is not a question that needs an answer.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, Sep 2, 2010 at 8:44 AM, Simon Riggs si...@2ndquadrant.com wrote:

All standbys has no meaning without registration. It is not a question that needs an answer.

Tell that to the DBA. I bet s/he knows what all standbys means. The fact that the system doesn't know something doesn't make it unimportant.

I agree that we don't absolutely need standby registration for some really basic version of synchronous replication. But I think we'd be better off biting the bullet and adding it. I think that without it we're going to resort to a series of increasingly grotty and user-unfriendly hacks to make this work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] git: uh-oh
On Thu, Sep 2, 2010 at 8:13 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
Robert Haas wrote:
On Wed, Sep 1, 2010 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote:

That definitely didn't fix it, although I'm not quite sure why. Can you throw the modified CVS you ran this off of up somewhere I can rsync it?

no rsync server on that box, but I put up a tarball for you at http://www.hagander.net/pgsql/cvsrepo.tgz

OK, color me baffled. I looked at gram.c and I believe you obsoleted the right revs. The only difference I see between this and some other random deleted file is that it has a couple of tags pointing to revs that don't exist any more, but I can't see how that would cause the observed weirdness.

What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved.

Lots of commits that look like this:

commit c50da22b6050e0bdd5e2ef97541d91aa1d2e63fb
Author: PostgreSQL Daemon webmas...@postgresql.org
Date: Sat Dec 2 08:36:42 2006 +

    This commit was manufactured by cvs2svn to create branch 'REL8_2_STABLE'.

    Sprout from master 2006-12-02 08:36:41 UTC PostgreSQL Daemon webmas...@postgresql.org ''
    Delete:
        src/backend/parser/gram.c
        src/interfaces/ecpg/preproc/pgc.c
        src/interfaces/ecpg/preproc/preproc.c

It seems there's something that cvs(2svn) doesn't like about the history of those files. Magnus tried obsoleting the revisions that show up as modifications of the dead revision, which seems to make that history basically identical to the histories of other files that are handled properly, but evidently there's still something wonky going on.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] compiling with RELCACHE_FORCE_RELEASE doesn't pass regression
Jeff Davis pg...@j-davis.com writes:

I think I see how this fixes the problem, but I still don't completely understand. Why can't we just make a real copy of the tuple descriptor for the type cache entry, rather than sharing it between the relcache and the type cache?

The issue isn't really about whether we're sharing the physical copy of the tupdesc. The problem the code is trying to deal with is making sure that the typcache's copy gets thrown away (so it can be refreshed on next use) when the relation's rowtype changes, due to ALTER TABLE ADD COLUMN for example. So we need to do that whenever we get a SI inval event for the rel. We were driving that purely off of relcache flushes, which meant that discarding a relcache entry had to force a typcache flush, since nothing would happen if a SI inval arrived at an instant where we had no relcache entry for the rel. Now the typcache is wired directly to the SI inval events, so it'll get a call whether there is a corresponding relcache entry or not.

regards, tom lane
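
The difference between the old and new wiring can be illustrated with a toy model. This is a Python sketch with made-up class names, not the backend's C code: both caches register their own callback with the dispatcher of invalidation events, so the typcache entry is dropped even when no relcache entry exists for the relation.

```python
from typing import Callable, Dict, List


class InvalDispatcher:
    """Toy stand-in for delivery of shared-invalidation events."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[int], None]] = []

    def register(self, callback: Callable[[int], None]) -> None:
        self._callbacks.append(callback)

    def deliver(self, rel_oid: int) -> None:
        # Every registered cache hears about the event, independently.
        for callback in self._callbacks:
            callback(rel_oid)


class RelCache:
    def __init__(self, dispatcher: InvalDispatcher) -> None:
        self.entries: Dict[int, str] = {}
        dispatcher.register(self.on_inval)

    def on_inval(self, rel_oid: int) -> None:
        self.entries.pop(rel_oid, None)


class TypCache:
    def __init__(self, dispatcher: InvalDispatcher) -> None:
        self.tupdescs: Dict[int, str] = {}
        dispatcher.register(self.on_inval)

    def on_inval(self, rel_oid: int) -> None:
        # Drop the cached row type so it is rebuilt on next use.
        self.tupdescs.pop(rel_oid, None)


dispatcher = InvalDispatcher()
relcache = RelCache(dispatcher)
typcache = TypCache(dispatcher)

# The typcache holds a row type, but there is no relcache entry for the rel.
typcache.tupdescs[16384] = "(a int)"
dispatcher.deliver(16384)
print(16384 in typcache.tupdescs)  # False
```

Under the old scheme the typcache flush would have been triggered from inside RelCache.on_inval, and only when a relcache entry was actually found and discarded.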
Re: [HACKERS] git: uh-oh
Robert Haas wrote:
On Thu, Sep 2, 2010 at 8:13 AM, Michael Haggerty mhag...@alum.mit.edu wrote:

What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved.

Lots of commits that look like this:

commit c50da22b6050e0bdd5e2ef97541d91aa1d2e63fb
Author: PostgreSQL Daemon webmas...@postgresql.org
Date: Sat Dec 2 08:36:42 2006 +

    This commit was manufactured by cvs2svn to create branch 'REL8_2_STABLE'.

    Sprout from master 2006-12-02 08:36:41 UTC PostgreSQL Daemon webmas...@postgresql.org ''
    Delete:
        src/backend/parser/gram.c
        src/interfaces/ecpg/preproc/pgc.c
        src/interfaces/ecpg/preproc/preproc.c

I addressed that problem in this email:
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01819.php

Summary: it is caused by a known weakness in cvs2svn's branch-parent-choosing code that would be difficult to solve.

But it just occurred to me--the script contrib/git-move-refs.py is supposed to fix problems like this. Have you run this script against your git repository? (Caveat: I am not very familiar with the script, which was contributed by a user. Please check the results carefully and let us know how it works for you.)

Michael
Re: [HACKERS] Synchronous replication - patch status inquiry
Robert Haas robertmh...@gmail.com writes:

Tell that to the DBA. I bet s/he knows what all standbys means. The fact that the system doesn't know something doesn't make it unimportant.

Well as a DBA I think I'd much prefer to attribute votes to each standby so that each ack is weighted. Let me explain in more detail the setup I'm thinking about.

The transaction on the master wants a certain service level (async, recv, fsync, replay) and a certain number of votes. As proposed earlier, the standby would feed back the last XID known locally in each state (received, synced, replayed) and its current weight, and the master would arbitrate given that information.

That's highly flexible: you can have slaves join the party at any point in time, and change 2 user GUCs (set by session, transaction, function, database, role, in postgresql.conf) to set up the service level target you want to ensure, from the master.

(We could go as far as wanting fsync:2,replay:1 as a service level.)

From that you get both the "fail when a slave disappears" and the "please don't shut the service down if a slave disappears" settings, per transaction, and per slave too (that depends on its weight, remember).

(You can set up the slave weights as powers of 2 and have the service level be masks to allow you to choose precisely which slave will ack your fsync service level, and you can switch this slave at run time easily. That sounds cleverer, but it also sounds easier to implement given the flexibility it gives. Precedents in PostgreSQL? The PITR and WAL shipping facilities are hard to use, full of traps, but very flexible.)

You can even give some more weight to one slave while you're maintaining another so that the master just doesn't complain.

I see a need for a very dynamic *and decentralized* replication topology setup; I fail to see a need for a centralized, registration-based setup.

I agree that we don't absolutely need standby registration for some really basic version of synchronous replication.
But I think we'd be better off biting the bullet and adding it.

What does that mechanism allow us to implement that we can't do without it?

--
dim
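
The weighted-vote idea in this message could look roughly like the following. It is a sketch under assumptions, not a design from any patch: the StandbyReport type, the integer positions (standing in for the "last XID known in each state") and the dictionary form of a service level such as fsync:2,replay:1 are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StandbyReport:
    name: str
    weight: int
    reached: Dict[str, int]  # level ("recv", "fsync", "replay") -> last position


def votes_for(level: str, commit_pos: int, reports: List[StandbyReport]) -> int:
    """Sum the weights of standbys that have reached commit_pos at this level."""
    return sum(r.weight for r in reports if r.reached.get(level, -1) >= commit_pos)


def commit_satisfied(commit_pos: int,
                     requirement: Dict[str, int],
                     reports: List[StandbyReport]) -> bool:
    """True once every level in the requirement has gathered enough votes.

    An empty requirement models the async level: nothing to wait for.
    """
    return all(votes_for(level, commit_pos, reports) >= needed
               for level, needed in requirement.items())


reports = [
    StandbyReport("a", weight=2, reached={"recv": 120, "fsync": 110, "replay": 100}),
    StandbyReport("b", weight=1, reached={"recv": 120, "fsync": 120, "replay": 90}),
]

print(commit_satisfied(105, {"fsync": 2}, reports))               # True
print(commit_satisfied(105, {"fsync": 2, "replay": 1}, reports))  # False
print(commit_satisfied(105, {}, reports))                         # True
```

Because the master only sums whatever reports it currently has, a standby can appear or vanish without any registration step; whether a missing standby blocks the commit depends purely on the votes requested.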
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, 2010-09-02 at 08:59 -0400, Robert Haas wrote:
On Thu, Sep 2, 2010 at 8:44 AM, Simon Riggs si...@2ndquadrant.com wrote:

All standbys has no meaning without registration. It is not a question that needs an answer.

Tell that to the DBA. I bet s/he knows what all standbys means. The fact that the system doesn't know something doesn't make it unimportant.

I agree that we don't absolutely need standby registration for some really basic version of synchronous replication. But I think we'd be better off biting the bullet and adding it. I think that without it we're going to resort to a series of increasingly grotty and user-unfriendly hacks to make this work.

I'm personally quite happy to have server registration.

My interest is in ensuring we have master-controlled robustness, which is so far being ignored because we need simple. Referring to the above, we are clearly quite willing to go beyond the most basic implementation, so there's no further argument to exclude it for that reason.

The implementation of master-controlled robustness is no more difficult than the alternative.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, Sep 2, 2010 at 10:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
On Thu, 2010-09-02 at 08:59 -0400, Robert Haas wrote:
On Thu, Sep 2, 2010 at 8:44 AM, Simon Riggs si...@2ndquadrant.com wrote:

All standbys has no meaning without registration. It is not a question that needs an answer.

Tell that to the DBA. I bet s/he knows what all standbys means. The fact that the system doesn't know something doesn't make it unimportant.

I agree that we don't absolutely need standby registration for some really basic version of synchronous replication. But I think we'd be better off biting the bullet and adding it. I think that without it we're going to resort to a series of increasingly grotty and user-unfriendly hacks to make this work.

I'm personally quite happy to have server registration.

OK, thanks for clarifying.

My interest is in ensuring we have master-controlled robustness, which is so far being ignored because we need simple. Referring to the above, we are clearly quite willing to go beyond the most basic implementation, so there's no further argument to exclude it for that reason.

The implementation of master-controlled robustness is no more difficult than the alternative.

But I'm not sure I quite follow this part. I don't think I know what you mean by master-controlled robustness.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] git: uh-oh
On 02/09/10 14:40, Michael Haggerty wrote: Robert Haas wrote: On Thu, Sep 2, 2010 at 8:13 AM, Michael Haggerty mhag...@alum.mit.edu wrote: What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved. Lots of commits that look like this: commit c50da22b6050e0bdd5e2ef97541d91aa1d2e63fb Author: PostgreSQL Daemon webmas...@postgresql.org Date: Sat Dec 2 08:36:42 2006 + This commit was manufactured by cvs2svn to create branch 'REL8_2_STABLE'. Sprout from master 2006-12-02 08:36:41 UTC PostgreSQL Daemon webmas...@postgresql.org '' Delete: src/backend/parser/gram.c src/interfaces/ecpg/preproc/pgc.c src/interfaces/ecpg/preproc/preproc.c I addressed that problem in this email: http://archives.postgresql.org/pgsql-hackers/2010-08/msg01819.php Summary: it is caused by a known weakness in cvs2svn's branch-parent-choosing code that would be difficult to solve. But it just occurred to me--the script contrib/git-move-refs.py is supposed to fix problems like this. Have you run this script against your git repository? (Caveat: I am not very familiar with the script, which was contributed by a user. Please check the results carefully and let us know how it works for you.) Moving refs can't possibly splice out branch creation commits. Max.
Re: [HACKERS] Synchronous replication - patch status inquiry
On 02/09/10 17:06, Simon Riggs wrote: On Thu, 2010-09-02 at 08:59 -0400, Robert Haas wrote: On Thu, Sep 2, 2010 at 8:44 AM, Simon Riggs si...@2ndquadrant.com wrote: "All standbys" has no meaning without registration. It is not a question that needs an answer. Tell that to the DBA. I bet s/he knows what "all standbys" means. The fact that the system doesn't know something doesn't make it unimportant. I agree that we don't absolutely need standby registration for some really basic version of synchronous replication. But I think we'd be better off biting the bullet and adding it. I think that without it we're going to resort to a series of increasingly grotty and user-unfriendly hacks to make this work. I'm personally quite happy to have server registration. My interest is in ensuring we have master-controlled robustness, which is so far being ignored because "we need simple". Referring to above, we are clearly quite willing to go beyond the most basic implementation, so there's no further argument to exclude it for that reason. The implementation of master-controlled robustness is no more difficult than the alternative. I understand what you're after; the idea of being able to set the synchronization level on a per-transaction basis is cool. But I haven't seen a satisfactory design for it. I don't understand how it would work in practice. Even though it's cool, having different kinds of standbys connected is a more common scenario, and the design needs to accommodate that too. I'm all ears if you can sketch a design that can do that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] serializable in comments and names
Robert Haas robertmh...@gmail.com wrote: I could knock out a couple other files from the main patch if people considered it acceptable to enable the SHMQueueIsDetached function now, which the patch uses in several places within asserts. I would remove the #ifdef NOT_USED from around the (very short) function, and add it to the .h file. -1. OK, I'll leave that part out. The changes to the comments and local variables seem pretty safe. The change of IsXactIsoLevelSerializable to IsXactIsoLevelXactSnapshotBased (or whatever name the community prefers) How about IsXactIsoLevelSnapshot? Just to be a bit shorter. I need two macros -- one which has the same definition as the current IsXactIsoLevelSerializable, to be used everywhere the old macro name currently is used, which conveys that it is an isolation level which is based on a transaction snapshot rather than statement snapshots (i.e., REPEATABLE READ or SERIALIZABLE) and a new macro (which I was planning to call IsXactIsoLevelFullySerializable) which conveys that it is the SERIALIZABLE isolation level. Do you feel that IsXactIsoLevelSnapshot works with IsXactIsoLevelFullySerializable to convey the right semantics? If not, what would you suggest? I'm not attached to any particular names; what matters is that when people see them, they get the right meanings from them. I have some concern that IsXactIsoLevelSnapshot might suggest that it excludes the fully serializable transaction isolation level. -Kevin
Re: [HACKERS] git: uh-oh
Max Bowsher wrote: On 02/09/10 14:40, Michael Haggerty wrote: Robert Haas wrote: On Thu, Sep 2, 2010 at 8:13 AM, Michael Haggerty mhag...@alum.mit.edu wrote: What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved. Lots of commits that look like this: commit c50da22b6050e0bdd5e2ef97541d91aa1d2e63fb Author: PostgreSQL Daemon webmas...@postgresql.org Date: Sat Dec 2 08:36:42 2006 + This commit was manufactured by cvs2svn to create branch 'REL8_2_STABLE'. Sprout from master 2006-12-02 08:36:41 UTC PostgreSQL Daemon webmas...@postgresql.org '' Delete: src/backend/parser/gram.c src/interfaces/ecpg/preproc/pgc.c src/interfaces/ecpg/preproc/preproc.c I addressed that problem in this email: http://archives.postgresql.org/pgsql-hackers/2010-08/msg01819.php Summary: it is caused by a known weakness in cvs2svn's branch-parent-choosing code that would be difficult to solve. But it just occurred to me--the script contrib/git-move-refs.py is supposed to fix problems like this. Have you run this script against your git repository? (Caveat: I am not very familiar with the script, which was contributed by a user. Please check the results carefully and let us know how it works for you.) Moving refs can't possibly splice out branch creation commits. Max, My understanding was that the problem is not that the branches are created, but that they are created from a non-optimal starting point, making it necessary for each of them to be doctored using a fixup commit. Since the tree contents following the first branch commit is identical to the tree contents on trunk one commit later, moving the branch tags will give the same branch contents without the need for branch fixup commits, and the old (branch-fixed) commits, no longer being referenced, will be garbage collected at the next git gc. Why don't you think this will work? 
Michael
Re: [HACKERS] Synchronous replication - patch status inquiry
On Wed, Sep 01, 2010 at 04:53:38PM +0900, Fujii Masao wrote: - down When that situation occurs, the master shuts down immediately. Though this is unsafe for the system requiring high availability, as far as I recall, some people wanted this mode in the previous discussion. Oracle provides this, among other possible configurations; perhaps that's why it came up earlier. -- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com
[HACKERS] installcheck-world failure
I just started trying out the new make targets with world in the name. `make world` and `make install-world` seem to work (unless I'm missing something), but `make installcheck-world` ends with Error 2. The relevant bits seem to be:

make[2]: Entering directory `/home/kgrittn/git/postgresql/kgrittn/contrib/dblink'
make -C ../../src/test/regress pg_regress
make[3]: Entering directory `/home/kgrittn/git/postgresql/kgrittn/src/test/regress'
make[3]: `pg_regress' is up to date.
make[3]: Leaving directory `/home/kgrittn/git/postgresql/kgrittn/src/test/regress'
../../src/test/regress/pg_regress --inputdir=. --psqldir=/usr/local/pgsql-serializable/bin --dbname=contrib_regression dblink
(using postmaster on Unix socket, default port)
== dropping database contrib_regression ==
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test dblink ... FAILED
== 1 of 1 tests failed. ==
The differences that caused some tests to fail can be viewed in the file /home/kgrittn/git/postgresql/kgrittn/contrib/dblink/regression.diffs. A copy of the test summary that you see above is saved in the file /home/kgrittn/git/postgresql/kgrittn/contrib/dblink/regression.out.
make[2]: *** [installcheck] Error 1
make[2]: Leaving directory `/home/kgrittn/git/postgresql/kgrittn/contrib/dblink'

*** /home/kgrittn/git/postgresql/kgrittn/contrib/dblink/expected/dblink.out 2010-06-16 08:47:55.0 -0500
--- /home/kgrittn/git/postgresql/kgrittn/contrib/dblink/results/dblink.out 2010-09-02 11:51:11.0 -0500
***
*** 905,926
  ADD COLUMN col4 INT NOT NULL DEFAULT 42;
  SELECT dblink_build_sql_insert('test_dropped', '2', 1, ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
! dblink_build_sql_insert
! ---
! INSERT INTO test_dropped(id,col2b,col3,col4) VALUES('2','113','foo','42')
! (1 row)
!
  SELECT dblink_build_sql_update('test_dropped', '2', 1, ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
! dblink_build_sql_update
! ---
! UPDATE test_dropped SET id = '2', col2b = '113', col3 = 'foo', col4 = '42' WHERE id = '2'
! (1 row)
!
  SELECT dblink_build_sql_delete('test_dropped', '2', 1, ARRAY['2'::TEXT]);
! dblink_build_sql_delete
! -
! DELETE FROM test_dropped WHERE id = '2'
  (1 row)
--- 905,918
  ADD COLUMN col4 INT NOT NULL DEFAULT 42;
  SELECT dblink_build_sql_insert('test_dropped', '2', 1, ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
! ERROR: source row not found
  SELECT dblink_build_sql_update('test_dropped', '2', 1, ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
! ERROR: source row not found
  SELECT dblink_build_sql_delete('test_dropped', '2', 1, ARRAY['2'::TEXT]);
! dblink_build_sql_delete
!
! DELETE FROM test_dropped WHERE col2b = '2'
  (1 row)
==

Suggestions? -Kevin
Re: [HACKERS] installcheck-world failure
Kevin Grittner kevin.gritt...@wicourts.gov writes: I just started trying out the new make targets with world in the name. `make world` and `make install-world` seem to work (unless I'm missing something), but `make installcheck-world` ends with Error 2. You're pulling from that broken git repository, aren't you? The symptom you're showing is inconsistent regression-test files for dblink, which is the same thing fennec was showing till its owner moved it back to pulling from CVS. regards, tom lane
Re: [HACKERS] upcoming wraps
Peter Eisentraut pete...@gmx.net writes: And what about 9.1alpha1? Peter muttered something about doing that this week. The major blocker is preparing the release notes. If someone has time for that ... Done. regards, tom lane
Re: [HACKERS] installcheck-world failure
Tom Lane t...@sss.pgh.pa.us wrote: You're pulling from that broken git repository, aren't you? I'm pulling from git, so apparently it's a broken repository. I guess I'll give up on the installcheck-world target until we have a working git repository. Thanks, -Kevin
Re: [HACKERS] serializable in comments and names
On Thu, Sep 2, 2010 at 11:41 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: How about IsXactIsoLevelSnapshot? Just to be a bit shorter. I need two macros -- one which has the same definition as the current IsXactIsoLevelSerializable, to be used everywhere the old macro name currently is used, which conveys that it is an isolation level which is based on a transaction snapshot rather than statement snapshots (i.e., REPEATABLE READ or SERIALIZABLE) and a new macro (which I was planning to call IsXactIsoLevelFullySerializable) which conveys that it is the SERIALIZABLE isolation level. Do you feel that IsXactIsoLevelSnapshot works with IsXactIsoLevelFullySerializable to convey the right semantics? If not, what would you suggest? OK, I see what you were going for. The current definition is: #define IsXactIsoLevelSerializable (XactIsoLevel >= XACT_REPEATABLE_READ) ...which is certainly a bit odd, since you'd think it would be comparing against XACT_SERIALIZABLE given the name. IsXactIsoLevelRepeatableRead()? XactUsesPerXactSnapshot()? Or, inverting the sense of it, XactUsesPerStatementSnapshot()? Just brainstorming... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] serializable in comments and names
Robert Haas robertmh...@gmail.com wrote: The current definition is: #define IsXactIsoLevelSerializable (XactIsoLevel >= XACT_REPEATABLE_READ) ...which is certainly a bit odd, since you'd think it would be comparing against XACT_SERIALIZABLE given the name. Precisely why I want to rename it. ;-) IsXactIsoLevelRepeatableRead()? Since the SSI implementation of a fully serializable transaction isolation level needs to do everything that the snapshot isolation of REPEATABLE READ does, plus a wee bit more, it is convenient to have a macro with the same semantics; just a less confusing name. I don't see anywhere in the code where there's a need to test for *just* REPEATABLE READ -- anything done for that also needs to be done for SERIALIZABLE. XactUsesPerXactSnapshot()? That seems unambiguous. I think I prefer it to IsXactIsoLevelXactSnapshotBased, so if there are no objections, I'll switch to XactUsesPerXactSnapshot. The current code uses a macro without parentheses; are you suggesting that the new code add those? Or, inverting the sense of it, XactUsesPerStatementSnapshot()? I don't see anywhere that the code is throwing an exclamation point in front of the macro name, so inverting it seems like a bad idea. I'd rather go from: if (IsXactIsoLevelSerializable) to: if (XactUsesPerXactSnapshot) than: if (!XactUsesPerStatementSnapshot) Given the suggested name above, IsXactIsoLevelFullySerializable no longer seems, well, symmetrical. How do you feel about XactIsFullySerializable? Names starting with IsXactIsoLevel seem more technically correct, but the names get long enough that it seems to me that the meaning gets a bit lost in the jumble of words -- which is why I like the shorter suggested name. Any other opinions out there? -Kevin
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On 02/09/10 06:46, Fujii Masao wrote: On Wed, Sep 1, 2010 at 4:11 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: The obvious next question is how to wait for multiple sockets and a latch at the same time? Perhaps we should have a select()-like interface where you can pass multiple file descriptors. Then again, looking at the current callers of select() in the backend, apart from postmaster they all wait for only one fd. Currently backends have not waited for multiple sockets, so I don't think that interface is required for now. Similarly, we don't need to wait for the socket to be ready to *write* because there is no use case for now. Ok, here's an updated patch with WaitLatchOrSocket that lets you do that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/configure b/configure
index bd9b347..432cd58 100755
--- a/configure
+++ b/configure
@@ -27773,6 +27773,13 @@ _ACEOF
   SHMEM_IMPLEMENTATION="src/backend/port/win32_shmem.c"
 fi
 
+# Select latch implementation type.
+if test "$PORTNAME" != "win32"; then
+  LATCH_IMPLEMENTATION="src/backend/port/unix_latch.c"
+else
+  LATCH_IMPLEMENTATION="src/backend/port/win32_latch.c"
+fi
+
 # If not set in template file, set bytes to use libc memset()
 if test x"$MEMSET_LOOP_LIMIT" = x"" ; then
   MEMSET_LOOP_LIMIT=1024
@@ -29098,7 +29105,7 @@ fi
 ac_config_files="$ac_config_files GNUmakefile src/Makefile.global"
 
-ac_config_links="$ac_config_links src/backend/port/dynloader.c:src/backend/port/dynloader/${template}.c src/backend/port/pg_sema.c:${SEMA_IMPLEMENTATION} src/backend/port/pg_shmem.c:${SHMEM_IMPLEMENTATION} src/include/dynloader.h:src/backend/port/dynloader/${template}.h src/include/pg_config_os.h:src/include/port/${template}.h src/Makefile.port:src/makefiles/Makefile.${template}"
+ac_config_links="$ac_config_links src/backend/port/dynloader.c:src/backend/port/dynloader/${template}.c src/backend/port/pg_sema.c:${SEMA_IMPLEMENTATION} src/backend/port/pg_shmem.c:${SHMEM_IMPLEMENTATION} src/backend/port/pg_latch.c:${LATCH_IMPLEMENTATION} src/include/dynloader.h:src/backend/port/dynloader/${template}.h src/include/pg_config_os.h:src/include/port/${template}.h src/Makefile.port:src/makefiles/Makefile.${template}"
 
 if test "$PORTNAME" = "win32"; then
@@ -29722,6 +29729,7 @@ do
     "src/backend/port/dynloader.c") CONFIG_LINKS="$CONFIG_LINKS src/backend/port/dynloader.c:src/backend/port/dynloader/${template}.c" ;;
     "src/backend/port/pg_sema.c") CONFIG_LINKS="$CONFIG_LINKS src/backend/port/pg_sema.c:${SEMA_IMPLEMENTATION}" ;;
     "src/backend/port/pg_shmem.c") CONFIG_LINKS="$CONFIG_LINKS src/backend/port/pg_shmem.c:${SHMEM_IMPLEMENTATION}" ;;
+    "src/backend/port/pg_latch.c") CONFIG_LINKS="$CONFIG_LINKS src/backend/port/pg_latch.c:${LATCH_IMPLEMENTATION}" ;;
     "src/include/dynloader.h") CONFIG_LINKS="$CONFIG_LINKS src/include/dynloader.h:src/backend/port/dynloader/${template}.h" ;;
     "src/include/pg_config_os.h") CONFIG_LINKS="$CONFIG_LINKS src/include/pg_config_os.h:src/include/port/${template}.h" ;;
     "src/Makefile.port") CONFIG_LINKS="$CONFIG_LINKS src/Makefile.port:src/makefiles/Makefile.${template}" ;;
diff --git a/configure.in b/configure.in
index 7b09986..7f84cea 100644
--- a/configure.in
+++ b/configure.in
@@ -1700,6 +1700,13 @@ else
   SHMEM_IMPLEMENTATION="src/backend/port/win32_shmem.c"
 fi
 
+# Select latch implementation type.
+if test "$PORTNAME" != "win32"; then
+  LATCH_IMPLEMENTATION="src/backend/port/unix_latch.c"
+else
+  LATCH_IMPLEMENTATION="src/backend/port/win32_latch.c"
+fi
+
 # If not set in template file, set bytes to use libc memset()
 if test x"$MEMSET_LOOP_LIMIT" = x"" ; then
   MEMSET_LOOP_LIMIT=1024
@@ -1841,6 +1848,7 @@ AC_CONFIG_LINKS([
   src/backend/port/dynloader.c:src/backend/port/dynloader/${template}.c
   src/backend/port/pg_sema.c:${SEMA_IMPLEMENTATION}
   src/backend/port/pg_shmem.c:${SHMEM_IMPLEMENTATION}
+  src/backend/port/pg_latch.c:${LATCH_IMPLEMENTATION}
   src/include/dynloader.h:src/backend/port/dynloader/${template}.h
   src/include/pg_config_os.h:src/include/port/${template}.h
   src/Makefile.port:src/makefiles/Makefile.${template}
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 615a7fa..094d0c9 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -55,6 +55,7 @@
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
+#include "replication/walsender.h"
 #include "storage/fd.h"
 #include "storage/procarray.h"
 #include "storage/sinvaladt.h"
@@ -1025,6 +1026,13 @@ EndPrepare(GlobalTransaction gxact)
     /* If we crash now, we have prepared: WAL replay will fix things */
 
+    /*
+     * Wake up all walsenders to send WAL up to the PREPARE record
+     * immediately if replication is enabled
+     */
+    if (max_wal_senders > 0)
+        WalSndWakeup();
+
     /* write correct CRC and close file */
     if ((write(fd, &statefile_crc, sizeof(pg_crc32))) != sizeof(pg_crc32))
     {
@@ -2005,6 +2013,13 @@
Re: [HACKERS] serializable in comments and names
Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: XactUsesPerXactSnapshot()? That seems unambiguous. I think I prefer it to IsXactIsoLevelXactSnapshotBased, so if there are no objections, I'll switch to XactUsesPerXactSnapshot. The current code uses a macro without parentheses; are you suggesting that the new code add those? +1 for adding parens; we might want to make a function of it someday. Names starting with IsXactIsoLevel seem more technically correct, but the names get long enough that it seems to me that the meaning gets a bit lost in the jumble of words -- which is why I like the shorter suggested name. Any other opinions out there? I don't much like the XactUses... aspect of it; that's just about meaningless, because almost everything in PG could be said to be used by a transaction. How about IsolationUsesXactSnapshot (versus IsolationUsesStmtSnapshot)? regards, tom lane
Re: [HACKERS] serializable in comments and names
Tom Lane t...@sss.pgh.pa.us wrote: +1 for adding parens; we might want to make a function of it someday. Makes sense; will do. I don't much like the XactUses... aspect of it; that's just about meaningless, because almost everything in PG could be said to be used by a transaction. How about IsolationUsesXactSnapshot (versus IsolationUsesStmtSnapshot)? And IsolationIsSerializable to make that test symmetrical? Any objections to this plan? -Kevin
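For readers following the naming discussion, here is a minimal sketch of how the two proposed macros could look, with the parentheses Tom asked for. It is an illustration, not the committed code: the XACT_* constants and the XactIsoLevel variable are declared locally as stand-ins for the backend's own definitions, and the numeric values are assumptions.

```c
#include <stdbool.h>

/* Stand-ins for the backend's isolation-level constants (values assumed). */
enum
{
    XACT_READ_UNCOMMITTED = 0,
    XACT_READ_COMMITTED = 1,
    XACT_REPEATABLE_READ = 2,
    XACT_SERIALIZABLE = 3
};

/* Stand-in for the backend's current-transaction isolation setting. */
static int XactIsoLevel = XACT_READ_COMMITTED;

/*
 * True for REPEATABLE READ and SERIALIZABLE: both keep one snapshot for the
 * whole transaction instead of taking a new one per statement.
 */
#define IsolationUsesXactSnapshot() (XactIsoLevel >= XACT_REPEATABLE_READ)

/* True only for the SERIALIZABLE level itself. */
#define IsolationIsSerializable() (XactIsoLevel == XACT_SERIALIZABLE)
```

Writing the macros with empty parentheses keeps call sites unchanged if either one is later turned into a real function.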
[HACKERS] Needs Suggestion
Can anyone explain this? My question is: what are lnext:; and lreplace:; in the postgres code? I found lnext:; at line 1501 and lreplace:; at line 2065 in the execMain.c file. -- Thank You, Subham Roy. CSE, IIT Bombay.
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Ok, here's an updated patch with WaitLatchOrSocket that lets you do that.

Minor code review items:

s/There is three/There are three/ in unix_latch.c header comment

The header comment points out the correct usage of ResetLatch, but I think it should also emphasize that the correct usage of SetLatch is to set whatever flags indicate work-to-do before SetLatch.

I don't care for the Asserts in InitLatch and InitSharedLatch. Initialization functions ought not assume that the struct they are told to initialize contains anything but garbage. And *especially* not assume that without documentation.

s/inter-proess/inter-process/, at least 2 places

Does ReleaseLatch() have any actual use-case, and if so what would it be? I think we could do without it.

The WaitLatch timeout API could use a bit of refinement. I'd suggest defining negative timeout as meaning wait forever, so that timeout = 0 can be used for "check but don't wait". Also, it seems like the function shouldn't just return void but should return a bool to show whether it saw the latch set or timed out. (Yeah, I realize the caller could look into the latch to find that out, but callers really ought to treat latches as opaque structs.)

I don't think you have the select-failed logic right in WaitLatchOrSocket; on EINTR it will suppose that FD_ISSET is a valid test to make, which I think ain't the case. Just continue around the loop.

Comment for unix_latch's latch_sigusr1_handler refers to SetEvent, which is a Windows-ism.

+* XXX: Is it safe to elog(ERROR) in a signal handler?

No, it isn't.

It seems like both implementations are #include'ing more than they ought to --- why replication/walsender.h, in particular? I don't think unix_latch needs spin.h either.

+typedef struct
+{
+ volatile sig_atomic_t is_set;
+ volatile sig_atomic_t owner_pid;
+} Latch;

I don't believe it is either sane or portable to declare struct fields as volatile. You need to attach the volatile qualifier to the function arguments instead, eg extern WaitLatch(volatile Latch *latch, ...)

Also, using sig_atomic_t for owner_pid is entirely not sane. On many platforms sig_atomic_t is only a byte, and besides which you have no need for that field to be settable by a signal handler.

regards, tom lane
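The API shape suggested in this review can be sketched as follows. This is only an illustration of the interface points (volatile on the pointer argument rather than on the fields, a plain pid_t for owner_pid, a negative timeout meaning wait forever, zero meaning check without waiting, and a bool result). The body is a simple polling loop chosen to keep the example self-contained; it is an assumption of this sketch and not the self-pipe mechanism used by the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

typedef struct
{
    sig_atomic_t is_set;    /* may be written from a signal handler */
    pid_t        owner_pid; /* ordinary pid_t: never written by a handler */
} Latch;

/* The volatile qualifier goes on the pointer argument, not on the fields. */
void SetLatch(volatile Latch *latch)
{
    latch->is_set = 1;
}

void ResetLatch(volatile Latch *latch)
{
    latch->is_set = 0;
}

/*
 * Returns true if the latch was seen set, false on timeout.
 * timeout_ms < 0 waits forever; timeout_ms == 0 checks without waiting.
 * Polling in 1 ms steps is a placeholder for a real wakeup mechanism.
 */
bool WaitLatch(volatile Latch *latch, long timeout_ms)
{
    const struct timespec tick = {0, 1000000L};
    long waited_ms = 0;

    for (;;)
    {
        if (latch->is_set)
            return true;
        if (timeout_ms >= 0 && waited_ms >= timeout_ms)
            return false;
        nanosleep(&tick, NULL);
        waited_ms++;
    }
}
```

Returning a bool lets callers treat the latch as opaque: they never need to read is_set themselves to learn why the wait ended.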
Re: [HACKERS] Needs Suggestion
On Mon, Aug 30, 2010 at 11:12 AM, sub...@cse.iitb.ac.in wrote: Can anyone explain this ? My question is - What is lnext:; , lreplace:; in postgres code ? I found lnext:; in 1501 and lreplace:; in 2065 in execMain.c file. It's a label. http://www.lysator.liu.se/c/bwk-tutor.html#goto -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
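To make the answer concrete: `lnext:;` is a label named lnext followed by an empty statement (the lone semicolon), which is needed because a C label must be attached to a statement. A `goto lnext;` elsewhere in the same function jumps back to it. The function below is a made-up example of the pattern, not code from execMain.c.

```c
/*
 * Hypothetical example: add up the positive entries of an array, using a
 * label and goto to "fetch the next item" the way a retry loop might.
 */
int sum_positive(const int *vals, int n)
{
    int i = 0;
    int total = 0;

lnext:;                 /* label plus empty statement */
    if (i >= n)
        return total;
    if (vals[i] <= 0)
    {
        i++;
        goto lnext;     /* skip this entry and look at the next one */
    }
    total += vals[i];
    i++;
    goto lnext;
}
```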
[HACKERS] Cost estimates for parameterized paths
Awhile back I ranted about replacing the planner's concept of inner indexscans with a more generalized notion of parameterized paths: http://archives.postgresql.org/pgsql-hackers/2009-10/msg00994.php The executor fixes for that are done, and now I'm grappling with getting the planner to do something useful with it. The biggest problem I've run into is that a parameterized path can't really be assigned a fixed cost in the same way that a normal path can. The current implementation of cost_index() depends on knowing the size of the outer relation --- that is, the expected number of execution loops for the indexscan --- in order to account for cache effects sanely while estimating the average cost of any one inner indexscan. We know that that is an important thing to do because the cost estimates seem to be a lot closer to reality now that we do that than what we were getting before; so dropping the consideration is entirely out of the question. The planner is already cheating on this to a considerable extent, because it estimates the cost of an inner indexscan only once, using the first outer rel we try to join to. That cost is cached and reused with other potential outer-rel join partners, even though very different numbers of outer rows might be involved. This problem will get a lot worse with the types of plans that I hope the planner will be able to come up with after this fix goes in, because the indexscan might be at the bottom of a join nest. So we need a real fix not another hack. The best idea I can come up with at the moment is to compute best case and worst case costs for a parameterized path, corresponding to the largest and smallest numbers of loops we might expect it to be repeated for. The largest number of loops could be estimated via the cartesian product of the sizes of the other tables in the query, for example. The worst case cost is its cost if only repeated once. 
Then, during path pruning in add_path, we only discard a parameterized path if its best-case cost is worse than the worst-case cost of some otherwise comparable path. Whenever we join the parameterized path with the required outer relation, we redo the cost calculation using that rel's actual rowcount estimate in order to form a final cost estimate for the no-longer-parameterized join path. While this looks like it would work in principle, I'm concerned that it would be unable to prune very many parameterized paths, and thus that planning time might run unacceptably long. The repeated cost calculations aren't much fun either, although we could probably cache most of that work if we're willing to throw memory at the problem. I wonder if anyone's got a better idea ... regards, tom lane
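The pruning rule described above can be written down in a few lines. The struct and function names here are hypothetical and exist only for this sketch; the planner's real Path structures carry far more information.

```c
#include <stdbool.h>

/* Hypothetical cost range for a parameterized path. */
typedef struct
{
    double best_cost;   /* cost assuming the largest plausible loop count */
    double worst_cost;  /* cost assuming the path is executed only once */
} ParamPathCost;

/*
 * A candidate may be discarded only when even its best case is more
 * expensive than the worst case of an otherwise comparable path.
 */
bool can_discard(const ParamPathCost *candidate, const ParamPathCost *other)
{
    return candidate->best_cost > other->worst_cost;
}
```

When the two ranges overlap, neither path can be discarded, which is the source of the concern that few parameterized paths would actually be pruned.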
Re: [HACKERS] Replacing the pg_get_expr security hack with a datatype solution
Peter Eisentraut pete...@gmx.net writes: On Sat, 2010-08-21 at 15:30 -0400, Tom Lane wrote: The only thing that seems like it might need discussion is the name to give the datatype. My first instinct was pg_expr or pg_expression, but there are some cases where this doesn't exactly fit. In particular, pg_rewrite.ev_action contains a whole Query, in fact a list of them. Perhaps pg_node then. pg_node sounds like there's just one. Maybe pg_node_tree? regards, tom lane
[HACKERS] returning multiple result sets from a stored procedure
I noticed that in postgres you cannot return multiple result sets from a stored procedure (surprisingly, as it looks like a very good DBMS). I would like to suggest adding this feature.
- It is very useful.
- It is supported by all other DBMSs I have worked with.
- Its absence makes porting applications to postgres very difficult (we have used this feature in our stored procedures and now there is no easy way of porting to postgres).
Thanks, and we are waiting.
Re: [HACKERS] returning multiple result sets from a stored procedure
Excerpts from John Adams's message of Thu Sep 02 18:25:45 -0400 2010: I noticed in postgres you cannot return multiple result sets from a stored procedure (surprisingly as it looks like a very good dbms). If you're really intent on doing this, you can emulate it by returning a set of refcursors. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On Fri, Sep 3, 2010 at 5:13 AM, Tom Lane t...@sss.pgh.pa.us wrote: Does ReleaseLatch() have any actual use-case, and if so what would it be? I think we could do without it. In Unix, probably we can live without that. But in Windows, we need to free the SharedEventHandles slot for an upcoming process using a latch when ending. The WaitLatch timeout API could use a bit of refinement. I'd suggest defining negative timeout as meaning wait forever, so that timeout = 0 can be used for "check but don't wait". Also, it seems like the function shouldn't just return void but should return a bool to show whether it saw the latch set or timed out. (Yeah, I realize the caller could look into the latch to find that out, but callers really ought to treat latches as opaque structs.) Agreed. I don't think you have the select-failed logic right in WaitLatchOrSocket; on EINTR it will suppose that FD_ISSET is a valid test to make, which I think ain't the case. Just continue around the loop. EINTR already makes us go back to the top of the loop since FD_ISSET(pipe) is not checked. Then we would clear the pipe and break out of the loop because of latch->is_set == true. + * XXX: Is it safe to elog(ERROR) in a signal handler? No, it isn't. We should use elog(FATAL) or check proc_exit_inprogress, instead?

+ if (errno != EAGAIN && errno != EWOULDBLOCK)
+ {
+ /*
+ * XXX: Is it safe to elog(ERROR) in a signal handler?
+ */
+ elog(ERROR, "write() on self-pipe failed: %m");
+ }
+ if (errno == EINTR)
+ goto retry;

(errno == EINTR) seems to be never checked. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Fujii Masao masao.fu...@gmail.com writes:
> + * XXX: Is it safe to elog(ERROR) in a signal handler?
>
> No, it isn't. We should use elog(FATAL) or check proc_exit_inprogress, instead?

elog(FATAL) is *certainly* not a better idea. I think there's really nothing that can be done, you just have to silently ignore the error.

BTW, if we retry, there had probably better be a limit on how many times to retry ...

> + if (errno != EAGAIN && errno != EWOULDBLOCK)
> + {
> +     /*
> +      * XXX: Is it safe to elog(ERROR) in a signal handler?
> +      */
> +     elog(ERROR, "write() on self-pipe failed: %m");
> + }
> + if (errno == EINTR)
> +     goto retry;
>
> "(errno == EINTR)" seems to be never checked.

Another issue with coding like that is that it supposes elog() won't change errno.

regards, tom lane
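The three points raised above (silently ignore the error, cap the number of retries, and don't assume errno survives) can be sketched as follows. This is an illustration under stated assumptions, not the code that was committed: wake_self_pipe and WAKEUP_MAX_RETRIES are hypothetical names, and the cap of 8 is arbitrary.

```c
#include <errno.h>
#include <unistd.h>

#define WAKEUP_MAX_RETRIES 8    /* hypothetical cap; the thread names no value */

/*
 * Write one wakeup byte to the self-pipe from a signal handler.
 * Only async-signal-safe calls are used, failures are ignored silently,
 * EINTR is retried a bounded number of times, and the interrupted code's
 * errno is restored before returning.
 */
void
wake_self_pipe(int write_fd)
{
    int         saved_errno = errno;
    const char  dummy = 0;
    int         tries;

    for (tries = 0; tries < WAKEUP_MAX_RETRIES; tries++)
    {
        ssize_t     rc = write(write_fd, &dummy, 1);

        /*
         * Success, a full pipe (EAGAIN/EWOULDBLOCK, meaning a wakeup is
         * already pending), or any other error: stop. Only EINTR retries.
         */
        if (rc >= 0 || errno != EINTR)
            break;
    }

    errno = saved_errno;
}
```

Saving and restoring errno at the handler's boundaries addresses the last point in the message: whatever the handler calls may change errno, so checks made after such a call cannot rely on the value left by write().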
Re: [HACKERS] git: uh-oh
On 02/09/10 16:44, Michael Haggerty wrote:
> Max Bowsher wrote:
>> On 02/09/10 14:40, Michael Haggerty wrote:
>>> Robert Haas wrote:
>>>> On Thu, Sep 2, 2010 at 8:13 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
>>>>> What weirdness, exactly, are you discussing now? I've lost track of which problem(s) are still unresolved.
>>>>
>>>> Lots of commits that look like this:
>>>>
>>>> commit c50da22b6050e0bdd5e2ef97541d91aa1d2e63fb
>>>> Author: PostgreSQL Daemon webmas...@postgresql.org
>>>> Date: Sat Dec 2 08:36:42 2006 +
>>>>
>>>>     This commit was manufactured by cvs2svn to create branch 'REL8_2_STABLE'.
>>>>
>>>>     Sprout from master 2006-12-02 08:36:41 UTC PostgreSQL Daemon webmas...@postgresql.org ''
>>>>     Delete:
>>>>         src/backend/parser/gram.c
>>>>         src/interfaces/ecpg/preproc/pgc.c
>>>>         src/interfaces/ecpg/preproc/preproc.c
>>>
>>> I addressed that problem in this email:
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-08/msg01819.php
>>>
>>> Summary: it is caused by a known weakness in cvs2svn's branch-parent-choosing code that would be difficult to solve.
>>>
>>> But it just occurred to me--the script contrib/git-move-refs.py is supposed to fix problems like this. Have you run this script against your git repository? (Caveat: I am not very familiar with the script, which was contributed by a user. Please check the results carefully and let us know how it works for you.)
>>
>> Moving refs can't possibly splice out branch creation commits.
>
> Max,
>
> My understanding was that the problem is not that the branches are created, but that they are created from a non-optimal starting point, making it necessary for each of them to be doctored using a fixup commit. Since the tree contents following the first branch commit is identical to the tree contents on trunk one commit later, moving the branch tags will give the same branch contents without the need for branch fixup commits, and the old (branch-fixed) commits, no longer being referenced, will be garbage collected at the next git gc.
>
> Why don't you think this will work?

You can't move a branchpoint after there are commits on the branch. I'm pretty certain there will be commits on the REL8_2_STABLE branch :-)

Also, IIUC, this isn't the "one commit later" version of the problem - it's a case of, for a period of *years*, the RCS files for these three files claim they exist on trunk but no branches branching off trunk during this period.

I am exploring the option of setting the unwanted revisions of the files to the dead state (removing them outright doesn't work, since they have a branch from one of the revisions in question.)

I have a test conversion running (well, a test conversion to bzr, because I like qbzr so much more than gitk) and will report back.

Max.
Re: [HACKERS] git: uh-oh
Max Bowsher wrote:
> On 02/09/10 16:44, Michael Haggerty wrote:
>> My understanding was that the problem is not that the branches are created, but that they are created from a non-optimal starting point, making it necessary for each of them to be doctored using a fixup commit. Since the tree contents following the first branch commit is identical to the tree contents on trunk one commit later, moving the branch tags will give the same branch contents without the need for branch fixup commits, and the old (branch-fixed) commits, no longer being referenced, will be garbage collected at the next git gc.
>>
>> Why don't you think this will work?
>
> You can't move a branchpoint after there are commits on the branch. I'm pretty certain there will be commits on the REL8_2_STABLE branch :-)

Good point. In the case of git, the branchpoint for a branch with commits could be moved using grafts and then baked in using git filter-branch. But you are right that this is beyond the abilities of contrib/git-move-refs.py, harder to justify, and wouldn't help in the current case given your next point.

> Also, IIUC, this isn't the "one commit later" version of the problem - it's a case of, for a period of *years*, the RCS files for these three files claim they exist on trunk but no branches branching off trunk during this period.

I didn't realize that the anomaly was so long-lived.

> I am exploring the option of setting the unwanted revisions of the files to the dead state (removing them outright doesn't work, since they have a branch from one of the revisions in question.)

That sounds promising. If it doesn't work, perhaps manually changing the timestamps on the trunk revisions to an earlier date would help isolate the problem and allow the branches to sprout from the post-delete revision...

Thanks for the explanation.

Michael
Re: [HACKERS] Synchronous replication - patch status inquiry
On Thu, Sep 2, 2010 at 11:32 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> I understand what you're after, the idea of being able to set synchronization level on a per-transaction basis is cool. But I haven't seen a satisfactory design for it. I don't understand how it would work in practice. Even though it's cool, having different kinds of standbys connected is a more common scenario, and the design needs to accommodate that too. I'm all ears if you can sketch a design that can do that.

That design would affect what the standby should reply. If we choose async/recv/fsync/replay on a per-transaction basis, the standby should send multiple LSNs and the master needs to decide when replication has been completed. OTOH, if we choose just sync/async, the standby only has to send one LSN.

The former seems more useful, but it triples the number of ACKs from the standby. I'm not sure whether that overhead is negligible, especially when the distance between the master and the standby is very long.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
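To make the "multiple LSNs" alternative concrete, here is a rough sketch of how the master might decide when replication is complete for one transaction. Nothing here comes from an actual patch: Lsn, SyncLevel, StandbyAck and commit_is_acked are hypothetical names, and an LSN is simplified to a single 64-bit integer.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified LSN; an assumption of this sketch */

/* Per-transaction synchronization levels named in the message above. */
typedef enum SyncLevel
{
    SYNC_ASYNC,                 /* don't wait for the standby at all */
    SYNC_RECV,                  /* wait until the standby has received the WAL */
    SYNC_FSYNC,                 /* wait until the standby has fsynced it */
    SYNC_REPLAY                 /* wait until the standby has replayed it */
} SyncLevel;

/* What the standby would report if it sends all three positions. */
typedef struct StandbyAck
{
    Lsn         received;
    Lsn         fsynced;
    Lsn         replayed;
} StandbyAck;

/*
 * Master-side decision: has the standby progressed far enough for a
 * transaction whose commit record ends at commit_lsn, given the level
 * that transaction asked for?
 */
bool
commit_is_acked(SyncLevel level, Lsn commit_lsn, const StandbyAck *ack)
{
    switch (level)
    {
        case SYNC_ASYNC:
            return true;
        case SYNC_RECV:
            return ack->received >= commit_lsn;
        case SYNC_FSYNC:
            return ack->fsynced >= commit_lsn;
        case SYNC_REPLAY:
            return ack->replayed >= commit_lsn;
    }
    return false;               /* unknown level: treat as not yet acked */
}
```

With a plain sync/async choice, StandbyAck would shrink to a single LSN and the switch to a single comparison, which is the trade-off in ACK traffic the message describes.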
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On Fri, Sep 3, 2010 at 11:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Fujii Masao masao.fu...@gmail.com writes:
>> + * XXX: Is it safe to elog(ERROR) in a signal handler?
>>
>> No, it isn't. We should use elog(FATAL) or check proc_exit_inprogress, instead?
>
> elog(FATAL) is *certainly* not a better idea. I think there's really nothing that can be done, you just have to silently ignore the error.

Hmm.. some functions called by a signal handler use elog(FATAL); e.g., RecoveryConflictInterrupt() does that when an unknown conflict mode is given as an argument. Are these calls unsafe, too?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center