Re: [HACKERS] Sync Rep Design
On 12/30/2010 10:27 PM, Simon Riggs wrote: On Thu, 2010-12-30 at 22:08 +0100, Stefan Kaltenbrunner wrote: On 12/30/2010 10:01 PM, Simon Riggs wrote: On Thu, 2010-12-30 at 15:51 -0500, Robert Treat wrote: Still, one thing that has me concerned is that in the case of two slaves, you don't know which one is the more up-to-date one if you need to failover. It'd be nice if you could just guarantee they both are... Regrettably, nobody can know that, without checking. how exactly would you check? - this seems like something that needs to be done from the SQL and the CLI level and also very well documented (which I cannot see in your proposal). This is a proposal for sync rep, not multi-node failover. I'm definitely not going to widen the scope of this project. Functions already exist to check the thing you're asking. well your proposal includes a lot of stuff on how to avoid data loss and getting High Availability - so I think it is a requirement for us to tell the DBA what the procedures are both for setting it up (which is what is in the docs - but that is only half of the job) and for what to do in the case of a disaster (which is the other part of the problem). Put another way - sync rep is not very useful if there is no easy and reliable way to get that information; if that stuff is already available, even better, but I'm not aware of what is there and what not, so I expect others to have the same problem. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep Design
On 12/30/2010 10:23 PM, Simon Riggs wrote: On Thu, 2010-12-30 at 21:42 +0100, Stefan Kaltenbrunner wrote: Synchronous replication offers the ability to guarantee that all changes made by a transaction have been transferred to at least one remote standby server. This is an extension to the standard level of durability offered by a transaction commit. This is referred to as semi-synchronous replication. When synchronous replication is requested, the commit of a write transaction will wait until confirmation that the commit record has been transferred successfully to at least one standby server. Waiting for confirmation increases the user's confidence that the changes will not be lost in the event of server crashes but it also necessarily increases the response time for the requesting transaction. The minimum wait time is the roundtrip time from primary to standby. hmm, this is one of the main problems I see with the "master is only sometimes aware of the standby" (as in the feedback mode) concept this proposal has. If it waits for only one of the standbys there is some issue with the terminology. As a DBA I would expect the master to only return if ALL of the declared sync replication nodes replied ok. Well, as a DBA, I expect it to work with just one. That's how MySQL and Oracle work, at least. If ALL standbys reply, it takes longer, makes the code harder, how do you determine robustly what "all" is, etc. Plus it's been discussed already. Maybe it has been discussed but I still don't see why it makes any sense. If I declare a standby a sync standby I better want it sync - not "maybe sync". Consider the case of one master and two identical sync standbys - one sync standby is in the same datacenter, the other is in a backup location, say 15km away.
Given there is a small constant latency to the second box (even if you have fast networks) the end effect is that the second standby will NEVER be sync (because the local one will always be faster) and you end up with an async slave that cannot be used per your business rules. What I'm really missing with that proposal is how people expect that solution to be managed. What aspect do you wish to monitor? I'm happy to consider your suggestions. Given there is only sometimes a feedback channel into the master, you can't do the monitoring. Not sure what you mean. Please explain more. Well, hot_standby_feedback (not sure I like the name but I can't think of anything better either) provides feedback to the master (like sync rep) and the master is actually acting on that feedback. So in effect we have stuff affecting the master that we need to be able to monitor clearly. But to make that information useful it would help to see which standby provided what kind of feedback, so we are back to having the master being aware of what standbys are connected and what standbys are supposed to be there - which is the issue sync rep got stalled on before... Even if you could (which we really need!) there is nothing in the proposal yet that will help to determine which is the most recent standby (in the case of more than one sync standby). Functions to determine that already exist. - but it would require real standby registration, or at least a standby management possibility on the master, not a halfway-done one. So do we really need hot_standby_feedback as part of the initial sync-rep patch? It is a Hot Standby feature, but so tightly integrated with this code that it isn't possible for me to submit it as two separate patches. Well, you are the developer of that feature, but if it is already a knob that can be turned on and off it seems not too hard to submit as a separate feature...
Stefan
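The "wait for the first acknowledgement" rule Simon describes, and Stefan's objection that the nearby standby will then always win, can be sketched in a few lines. This is a toy simulation only (names and structure are illustrative, not PostgreSQL internals):

```python
import threading
import time

def commit_with_sync_rep(standby_delays, timeout=5.0):
    """Simulate a commit that waits for the FIRST standby acknowledgement
    ('first response wins').  standby_delays maps a standby name to its
    simulated ack latency in seconds.  Returns the name of the standby
    whose ack released the commit."""
    first_ack = threading.Event()
    lock = threading.Lock()
    winner = []

    def standby(name, delay):
        time.sleep(delay)            # network round trip + fsync on standby
        with lock:
            if not winner:           # only the first responder "wins"
                winner.append(name)
                first_ack.set()

    threads = [threading.Thread(target=standby, args=item)
               for item in standby_delays.items()]
    for t in threads:
        t.start()
    if not first_ack.wait(timeout):
        raise TimeoutError("no standby acknowledged the commit")
    for t in threads:
        t.join()
    return winner[0]

# Stefan's scenario: the same-datacenter standby always answers first,
# so the remote standby effectively never releases a commit.
print(commit_with_sync_rep({"same-dc": 0.01, "remote-dc": 0.2}))
```

Note how with any stable latency difference the result is deterministic, which is exactly Stefan's point: the remote standby is de facto async.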
Re: [HACKERS] Snapshot synchronization, again...
On 12/30/2010 10:45 PM, Heikki Linnakangas wrote: On 30.12.2010 16:49, Florian Pflug wrote: On Dec30, 2010, at 13:31 , Joachim Wieland wrote: We return snapshot information as a chunk of data to the client. At the same time however, we set a checksum in shared memory to protect against modification of the snapshot. A publishing backend can revoke its snapshot by deleting the checksum, and a backend that is asked to install a snapshot can verify that the snapshot is correct and current by calculating the checksum and comparing it with the one in shared memory. We'd still have to stream these checksums to the standbys though, or would they be exempt from the checksum checks? I still wonder whether these checks are worth the complexity. I believe we'd only allow snapshot modifications for read-only queries anyway, so what point is there in preventing clients from setting broken snapshots? Hmm, our definition of read-only is a bit fuzzy. While a transaction doesn't modify the database itself, it could still send NOTIFYs or call a PL function to do all sorts of things outside the database. Imagine that you're paranoid about data integrity, and have a security definer function that runs cross checks on the data. If it finds any anomalies, it wakes up the operator or forces shutdown or similar. Are people actually doing that in reality? I'm also having a hard time picturing a realistic example of what that data integrity check function would actually be able to check with the default isolation mode and concurrent activity... Now a malicious user could set a snapshot that passes the basic validity checks, i.e. xmin = GlobalXmin, but contains a combination of still-in-progress transactions that never existed in reality. If he then calls the paranoia function, it would see an inconsistent state of committed tuples and get upset.
Sure, but I would expect setting a snapshot to require either superuser or some sort of WITH SNAPSHOT grant thingy - and in general there are much more trivial and far less obscure scenarios in which a normal user can cause the admin to get paged :) Maybe that's a bit far-stretched, but it's not entirely clear that running with an inconsistent snapshot is harmless. Well, there has been some discussion in connection with the SSI stuff that we might want to reconsider our definition of read-only - maybe that would be the right way to approach the problem? Stefan
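The publish/revoke/verify cycle Joachim describes can be modelled compactly. This is a toy model only - the dictionary stands in for a shared-memory slot, and all names are illustrative, not the actual patch's API:

```python
import hashlib

shared_memory = {}  # backend_pid -> checksum; stands in for a shmem slot

def snapshot_checksum(snapshot: bytes) -> str:
    """Checksum over the serialized snapshot the client carries around."""
    return hashlib.sha256(snapshot).hexdigest()

def publish_snapshot(pid: int, snapshot: bytes) -> bytes:
    """Publishing backend: store only the checksum in 'shared memory',
    hand the full snapshot back through the client."""
    shared_memory[pid] = snapshot_checksum(snapshot)
    return snapshot

def revoke_snapshot(pid: int) -> None:
    """Publisher withdraws its snapshot by deleting the checksum."""
    shared_memory.pop(pid, None)

def install_snapshot(pid: int, snapshot: bytes) -> bool:
    """Importing backend: valid only if the publisher still advertises a
    checksum matching the snapshot presented by the client."""
    return shared_memory.get(pid) == snapshot_checksum(snapshot)
```

A tampered or revoked snapshot fails the comparison, which is the whole point of keeping the checksum server-side rather than trusting the client's chunk of data.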
Re: [HACKERS] Old git repo
On Fri, Dec 31, 2010 at 03:04, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Davis pg...@j-davis.com writes: Personally, my utility for the old repo is not much (if it was anything important, I wouldn't have relied on the unofficial repo). But we should probably give a little bit of warning for folks that might want to rebase or translate some old notes. Well, I guess the question is how much warning. I suggested O(1 week) but Robert seems to want O(1 year). As long as there's some agreed deadline, I'm not very picky about what it is. I don't need to get rid of it *now*, but I agree a deadline is good. How about we either say when 9.1 is released, or we say September 2011, because that's a year after we made the switch? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Sync Rep Design
On 31.12.2010 09:50, Hannu Krosing wrote: On 30.12.2010 22:27, Robert Haas wrote: On Thu, Dec 30, 2010 at 2:04 PM, Simon Riggssi...@2ndquadrant.com wrote: synchronous_replication (boolean) Specifies whether transaction commit will wait for WAL records to be replicated before the command returns a success indication to the client. The word replicated here could be taken to mean different things, most obviously: - slave has received the WAL - slave has fsync'd the WAL - slave has applied the WAL Perhaps the level of replication guarantee should be decided on the slave side, by having a configuration parameter there report_as_replicated = received|written_to_disk|fsynced|applied since different types of hosts may have wildly different guarantees and performance parameters for these. One could envision a WAL-archive type standby which is there for data persistence only and will never apply WAL. Agreed, it feels natural to specify when a piece of WAL is acknowledged in the standby. Regarding the rest of the proposal, I would still prefer the UI discussed here: http://archives.postgresql.org/message-id/4cae030a.2060...@enterprisedb.com It ought to be the same amount of work to implement, and provides the same feature set, but makes administration a bit easier by being able to name the standbys. Also, I dislike the idea of having the standby specify that it's a synchronous standby that the master has to wait for. Behavior on the master should be configured on the master. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
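Hannu's suggestion, written out as it might appear on the standby side. Note this is purely a sketch of the proposal under discussion - report_as_replicated is a hypothetical parameter that does not exist in any released PostgreSQL:

```ini
# Standby-side setting per Hannu's proposal (hypothetical, not implemented):
# report WAL back to the master as "replicated" only once it has reached
# the chosen durability level locally.
report_as_replicated = fsynced   # received | written_to_disk | fsynced | applied
```

The idea is that a WAL-archive style standby could report at "fsynced" while never reaching "applied", and the master would not need to know the difference.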
Re: [HACKERS] Sync Rep Design
On Fri, 2010-12-31 at 09:27 +0100, Stefan Kaltenbrunner wrote: Maybe it has been discussed but I still don't see why it makes any sense. If I declare a standby a sync standby I better want it sync - not "maybe sync". Consider the case of one master and two identical sync standbys - one sync standby is in the same datacenter, the other is in a backup location, say 15km away. Given there is a small constant latency to the second box (even if you have fast networks) the end effect is that the second standby will NEVER be sync (because the local one will always be faster) and you end up with an async slave that cannot be used per your business rules? Your picture above is a common misconception. I will add something to the docs to explain this. 1. Sync is a guarantee about how we respond to the client when we commit. If we wait for more than one response that slows things down, makes the cluster more fragile, complicates the code and doesn't appreciably improve the guarantee. 2. Sync does not guarantee that the updates to the standbys are in any way coordinated. You can run a query on one standby and get one answer and at the exact same time run the same query on another standby and get a different answer (slightly ahead/behind). That also means that if the master crashes, one of the servers will be ahead or behind. You can use pg_last_xlog_receive_location() to check which one that is. When people say they want *all* servers to respond, it's usually because they want (2), but that is literally impossible in a distributed system. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
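The check Simon refers to looks roughly like this when run by hand on each candidate standby after a master failure (a sketch of the manual procedure only; the functions themselves exist in 9.0/9.1):

```sql
-- Run on each candidate standby after the master has failed:
SELECT pg_last_xlog_receive_location();  -- last WAL location received and synced to disk
SELECT pg_last_xlog_replay_location();   -- last WAL location applied during recovery

-- Promote the standby reporting the highest receive location; locations
-- are 'logid/offset' pairs and must be compared field by field.
```

This is exactly the "done from the SQL and CLI level" step Stefan asks to see documented: it works, but it is a manual comparison, not something the master does for you.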
Re: [HACKERS] Sync Rep Design
On 12/31/2010 11:06 AM, Heikki Linnakangas wrote: On 31.12.2010 09:50, Hannu Krosing wrote: On 30.12.2010 22:27, Robert Haas wrote: On Thu, Dec 30, 2010 at 2:04 PM, Simon Riggssi...@2ndquadrant.com wrote: synchronous_replication (boolean) Specifies whether transaction commit will wait for WAL records to be replicated before the command returns a success indication to the client. The word replicated here could be taken to mean different things, most obviously: - slave has received the WAL - slave has fsync'd the WAL - slave has applied the WAL Perhaps the level of replication guarantee should be decided on the slave side, by having a configuration parameter there report_as_replicated = received|written_to_disk|fsynced|applied since different types of hosts may have wildly different guarantees and performance parameters for these. One could envision a WAL-archive type standby which is there for data persistence only and will never apply WAL. Agreed, it feels natural to specify when a piece of WAL is acknowledged in the standby. Regarding the rest of the proposal, I would still prefer the UI discussed here: http://archives.postgresql.org/message-id/4cae030a.2060...@enterprisedb.com It ought to be the same amount of work to implement, and provides the same feature set, but makes administration a bit easier by being able to name the standbys. Also, I dislike the idea of having the standby specify that it's a synchronous standby that the master has to wait for. Behavior on the master should be configured on the master. Well, that proposal is much closer to what I want as an admin - except that it would be nice to configure that through actual DDL. My wish would be more like:
* standby provides a unique name identifier
* standby has a flag to say the maximum (or minimum?) replication_reported support it can do
* standby connects to the master async by default and the master registers the standby automatically
* on the master I can do the following with every standby that is visible to the master or has been in the past:
  * enable/disable and add/remove permanently (if not added permanently the registration is transient) - enabled if not set explicitly
  * sync_rep_enabled (boolean), so you can still do per-transaction or per-database or whatever sync rep - disabled if not set explicitly
  * sync_reply_required (boolean), a per-sync-standby flag to require a reply before the master returns - if there is only one sync standby this is the default behaviour; if there are more, the admin can choose
  * wait_forever or timeout per standby
  * maybe a way to set the report_as_replicated from the master (if feasible) up to the max of what the standby can do
So you would get the proposed semi-sync rep mode by simply setting more than one standby as sync_rep_enabled with sync_reply_required false for all of them (i.e. any one of them can reply and the master returns) - if you want better than that, just require a reply from a specific one or from more than one. This would also help us provide a simple view with a nice and handy status report on the slaves (which ones are there, which ones should be there, how far are they in terms of applying WAL, what status do they have).
Imho an interface like this would:
a) be convenient, because it would not require any additional setup for async rep except providing a name on the standby by default
b) enable the master to specify the business rules clearly
c) still support the simple "one sync reply is enough" semi-sync replication case people like to have
d) also enable the admin to get more than ONE sync standby that is really sync - not "maybe sync"
e) have hot_standby_feedback (if enabled) look at only the permanently enabled slaves, so only a DBA-approved standby would be able to affect the master in terms of table bloat
f) provide the necessary meta information for the handy, quick replication status overview reporting feature people want and need
g) for all the permanently enabled async nodes, let us keep track of the oldest required WAL and keep that (optionally), so we could get rid of the hard-to-size wal_keep_segments and maintain that automatically.
The only disadvantage I can see would be that you would have to manually remove a non-functional slave from the master (and only one that you set some explicit configuration for!) if you decide you don't need it any more. Stefan
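For concreteness, the DDL-style interface Stefan wishes for might look something like the following. None of this syntax exists in PostgreSQL; it is purely a sketch of the wish-list above, with every statement and the status view invented for illustration:

```sql
-- Hypothetical DDL mirroring the wish-list above (no such syntax exists):
ALTER STANDBY dc2_backup REGISTER PERMANENTLY;          -- survives disconnects
ALTER STANDBY dc2_backup SET (sync_rep_enabled   = on,
                              sync_reply_required = on, -- must ack each commit
                              wait_timeout        = '30s');
ALTER STANDBY dc1_reporting DISABLE;                     -- transient, async only

-- Hypothetical overview view: which standbys are there, which should be,
-- how far along each one is, and its current status.
SELECT * FROM pg_standby_status;
```

The point of the sketch is item b) above: the business rules live on the master, expressed per named standby, rather than being implied by whichever standby happens to connect first.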
Re: [HACKERS] Sync Rep Design
On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote: On 31.12.2010 09:50, Hannu Krosing wrote: On 30.12.2010 22:27, Robert Haas wrote: On Thu, Dec 30, 2010 at 2:04 PM, Simon Riggssi...@2ndquadrant.com wrote: synchronous_replication (boolean) Specifies whether transaction commit will wait for WAL records to be replicated before the command returns a success indication to the client. The word replicated here could be taken to mean different things, most obviously: - slave has received the WAL - slave has fsync'd the WAL - slave has applied the WAL Perhaps the level of replication guarantee should be decided on the slave side, by having a configuration parameter there report_as_replicated = received|written_to_disk|fsynced|applied since different types of hosts may have wildly different guarantees and performance parameters for these. One could envision a WAL-archive type standby which is there for data persistence only and will never apply WAL. Agreed, it feels natural to specify when a piece of WAL is acknowledged in the standby. That can also be done, it's not a problem. Many people asked for just on or off. Currently "on" means the slave has fsync'd the WAL. Also, I dislike the idea of having the standby specify that it's a synchronous standby that the master has to wait for. Behavior on the master should be configured on the master. The parameter on the standby affects the behaviour of the standby. The standby is saying "don't pick me, I'm not a good target". -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep Design
On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote: Regarding the rest of the proposal, I would still prefer the UI discussed here: http://archives.postgresql.org/message-id/4cae030a.2060...@enterprisedb.com It ought to be the same amount of work to implement, and provides the same feature set, but makes administration a bit easier by being able to name the standbys. Also, I dislike the idea of having the standby specify that it's a synchronous standby that the master has to wait for. Behavior on the master should be configured on the master. Good point; I've added the people on the copy list from that post. This question is the key, so please respond after careful thought on my points below. There are ways to blend together the two approaches, discussed later, though first we need to look at the reasons behind my proposals. I see significant real-world issues with configuring replication using multiple named servers, as described in the link above: 1. Syncing to multiple standbys does not guarantee that the updates to the standbys are in any way coordinated. You can run a query on one standby and get one answer and at the exact same time run the same query on another standby and get a different answer (slightly ahead/behind). That also means that if the master crashes, one of the servers can still be ahead or behind, even though you asked them to be the same. So you don't actually get what you think you're getting. 2. Availability of the cluster just went down. If *any* of the important nodes goes down, then everything just freezes. (I accept that you want that, and have provided that as an option.) 3. Administrative complexity just jumped a huge amount. (a) If you add or remove servers in the config you need to respecify all the parameters, which need to be specific to the exact set of servers.
There is no way to test that you have configured the parameters correctly without a testbed that exactly mirrors production, with the same names etc., or trying it directly in production. So availability takes another potential hit because of user error. (b) After failover, the list of synchronous_standbys needs to be re-specified, yet what is the correct list of servers? The only way to make that config work is with complex middleware that automatically generates new config files. I don't think that is the same amount of work to implement; it's an order of magnitude harder overall. 4. As a result of the administrative complexity, testing the full range of function will take significantly longer, and that is likely to have a direct impact on the robustness of PostgreSQL 9.1. 5. Requesting sync from more than one server performs poorly, since you must wait for additional servers. If there are sporadic or systemic network performance issues you will be badly hit by them. Monitoring that just got harder also. First-response-wins is more robust in the case of volatile resources since it implies responsiveness to changing conditions. 6. You just lost the ability to control performance on the master with a userset. Performance is a huge issue with sync rep. If you can't control it, you'll simply turn it off. Having a feature that we daren't ever use because it performs poorly helps nobody. This is not a tick-box in our marketing checklist; I want it to be genuinely real-world usable. I understand very well that Oracle provides that level of configuration, though I think it is undesirable in 90% of real-world use cases. I also understand how sexy that level of configuration *sounds*, but I genuinely believe trying to deliver that would be a mistake for PostgreSQL. IMHO we should take the same road here as we do in other things: simplicity encouraged, complexity allowed.
So I don't have any objection to supporting that functionality in the future, but I believe it is not something we should be encouraging (ever), nor is it something we need for this release. I suppose we might regard the feature set I am proposing as being the same as making synchronous_standbys a USERSET parameter, and allowing just two options: none - allowing the user to specify async if they wish it; * - allowing people to specify that syncing to *any* standby is acceptable. We can blend the two approaches together, if we wish, by having two parameters (plus server naming):
synchronous_replication = on | off (USERSET)
synchronous_standbys = '...'
If synchronous_standbys is not set and synchronous_replication = on then we sync to any standby. If synchronous_replication = off then we use async replication, whatever synchronous_standbys is set to. If synchronous_standbys is set, then we use sync rep to all listed servers. My proposal amounts to "let's add synchronous_standbys as a parameter in 9.2". If you really think that we need that functionality in this release, let's get the basic stuff added now and then fold in those ideas on top afterwards. If we do that, I will help. However, my only insistence is that we
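Simon's blended two-parameter proposal reduces to a small decision table. The sketch below is one reading of the semantics as described in the mail above (parameter names from the proposal; this is not shipped PostgreSQL behaviour):

```python
def commit_wait_mode(synchronous_replication: bool,
                     synchronous_standbys: str) -> str:
    """Decision table for the blended proposal: the boolean USERSET chooses
    sync vs. async per transaction, and the optional server list narrows
    which standbys must acknowledge the commit."""
    if not synchronous_replication:
        return "async"                        # regardless of the server list
    if synchronous_standbys.strip():
        # every server named in the list must acknowledge
        return "sync to all listed standbys"
    return "sync to any standby"              # first response wins
```

Seen this way, the 9.1-vs-9.2 question is only about whether the second branch (the named-server list) exists yet; the other two branches are the same in both proposals.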
Re: [HACKERS] Sync Rep Design
On Thu, 2010-12-30 at 20:26 -0700, Joshua Tolley wrote: 2) initiate fsync on the primary first - In this case, the slave is always slightly behind. If your primary falls over, you don't give commit messages to the clients, but if it recovers, it might have committed data, and slaves will still be able to catch up. The thing is that currently, even without replication, #2 can happen. For what little it's worth, I vote for this option, because it's a problem that can already happen (as opposed to adding an entirely new type of problem to the mix). This proposal provides #2, so your wish is met. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Old git repo
On Fri, Dec 31, 2010 at 4:58 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Dec 31, 2010 at 03:04, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Davis pg...@j-davis.com writes: Personally, my utility for the old repo is not much (if it was anything important, I wouldn't have relied on the unofficial repo). But we should probably give a little bit of warning for folks that might want to rebase or translate some old notes. Well, I guess the question is how much warning. I suggested O(1 week) but Robert seems to want O(1 year). As long as there's some agreed deadline, I'm not very picky about what it is. I don't need to get rid of it *now*, but I agree a deadline is good. How about we either say when 9.1 is released, or we say September 2011, because that's a year after we made the switch? Either of those would be fine with me. Thanks! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep Design
On Fri, Dec 31, 2010 at 5:26 AM, Simon Riggs si...@2ndquadrant.com wrote: Your picture above is a common misconception. I will add something to the docs to explain this. 2. Sync does not guarantee that the updates to the standbys are in any way coordinated. You can run a query on one standby and get one answer and at the exact same time run the same query on another standby and get a different answer (slightly ahead/behind). That also means that if the master crashes, one of the servers will be ahead or behind. You can use pg_last_xlog_receive_location() to check which one that is. When people say they want *all* servers to respond, it's usually because they want (2), but that is literally impossible in a distributed system. Just to try and be clear again: in the kind of sync rep that Stefan and I are talking about, we really don't care that the slave could be a hot standby answering queries. In fact, mine wouldn't be. Mine would likely be pg_streamrecv or something. I'm just looking for a guarantee that I've got a copy of the data safely in the next rack, and in a separate building, before I tell the client I've moved his money. I want synchronous replication of the *data*, not a system where I can distribute queries. I'm looking for disaster mitigation, not load mitigation. A replacement for clustered/replicated devices/filesystems under pg_xlog. Having the next-rack slave be hot in terms of applying WAL and ready to take over instantly would be a bonus, as long as I can guarantee it's current (i.e. has all data a primary's COMMIT has acknowledged). So, that's what I want, and that's what your docs suggest is impossible currently; first-past-the-post means that I can only ever reliably configure one sync slave and be sure it will have all acknowledged commits.
I can likely get *close* to that by putting only my slowest slave as the only sync slave, and monitoring the heck out of my "asynchronous but I want it to be synchronous" slave, but I'd rather trust the PG community to build robust synchronization than trust myself to build monitoring robust enough to catch that my async slave is farther behind than the slower synchronous one. That said, I think the expectation is that if I were building a query-able hot standby cluster in sync rep mode, once I get a commit confirmation, I should be able to then initiate a new transaction on any member of that sync rep cluster and see the data I just committed. Yes, I know I could see *newer* data. And I know that the primary could already have newer data. Yes, we have the problem even on a single pg cluster on a single machine. But the point is that if you've committed, any new transaction sees *at least* that data or newer. But no chance of older. But personally, I'm not interested in that ;-) -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Re: [HACKERS] Sync Rep Design
On Fri, Dec 31, 2010 at 6:48 AM, Simon Riggs si...@2ndquadrant.com wrote: I suppose we might regard the feature set I am proposing as being the same as making synchronous_standbys a USERSET parameter, and allowing just two options: none - allowing the user to specify async if they wish it * - allowing people to specify that syncing to *any* standby is acceptable We can blend the two approaches together, if we wish, by having two parameters (plus server naming) synchronous_replication = on | off (USERSET) synchronous_standbys = '...' If synchronous_standbys is not set and synchronous_replication = on then we sync to any standby. If synchronous_replication = off then we use async replication, whatever synchronous_standbys is set to. If synchronous_standbys is set, then we use sync rep to all listed servers. My proposal amounts to lets add synchronous_standbys as a parameter in 9.2. FWIW, this plan of attack would be OK with me. I had taken your previous comments to imply that you were opposed to EVER having a parameter like this on the master, but I could see deferring it. In my ideal world, we'd not have the synchronous_replication_service parameter on the standbys at all - all standbys would be candidates, unless someone decides to name them and set synchronous_standbys on the master. But maybe having a simple on/off on the standby is reasonable. *thinks a little bit* Someone may have proposed this before, but one way of getting standby naming for free would be to make the standby names the same as the roles used to log in, rather than adding a separate parameter. We could just recommend to people that they use a separate, descriptive role name for each standby. Then the synchronous_standbys parameter - when added - would mean a standby from one of these roles. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep Design
On 31.12.2010 13:48, Simon Riggs wrote: On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote: Regarding the rest of the proposal, I would still prefer the UI discussed here: http://archives.postgresql.org/message-id/4cae030a.2060...@enterprisedb.com It ought to be the same amount of work to implement, and provides the same feature set, but makes administration a bit easier by being able to name the standbys. Also, I dislike the idea of having the standby specify that it's a synchronous standby that the master has to wait for. Behavior on the master should be configured on the master. Good point; I've added the people on the copy list from that post. This question is the key, so please respond after careful thought on my points below. There are ways to blend together the two approaches, discussed later, though first we need to look at the reasons behind my proposals. I see significant real-world issues with configuring replication using multiple named servers, as described in the link above: All of these points only apply to specifying *multiple* named servers in the synchronous_standbys='...' list. That's certainly a more complicated scenario, and the configuration is more complicated as a result. With your proposal, it's not possible in the first place. Multiple synchronous standbys probably aren't needed by most people, so I'm fine with leaving that out for now, keeping the design the same otherwise. I included it in the proposal because it easily falls out of the design. So, if you're worried about the complexities of multiple synchronous standbys, let's keep the UI exactly the same as what I described in the link above, but only allow one name in the synchronous_standbys setting, instead of a list. 3. Administrative complexity just jumped a huge amount. (a) If you add or remove servers in the config you need to respecify all the parameters, which need to be specific to the exact set of servers.
Hmm, this could be alleviated by allowing the master to have a name too. All the configs could then be identical, except for the unique name for each server. For example, for a configuration with three servers that are all synchronous with each other, each server would have synchronous_standbys='server1, server2, server3' in the config file. The master would simply ignore the entry for itself. (b) After failover, the list of synchronous_standbys needs to be re-specified, yet what is the correct list of servers? The only way to make that config work is with complex middleware that automatically generates new config files. It depends on what you want. I think you're envisioning that the original server is taken out of the system and not waited for, meaning that you accept a lower level of persistence after failover. Yes, then you need to change the config. Or more likely you prepare the config file in the standby that way to begin with. I don't think that is the same amount of work to implement, its an order of magnitude harder overall. I meant it's the same amount of work to implement the feature in PostgreSQL. No doubt that maintaining such a setup in production is more complicated. 5. Requesting sync from more than one server performs poorly, since you must wait for additional servers. If there are sporadic or systemic network performance issues you will be badly hit by them. Monitoring that just got harder also. First-response-wins is more robust in the case of volatile resources since it implies responsiveness to changing conditions. 6. You just lost the ability to control performance on the master, with a userset. Performance is a huge issue with sync rep. If you can't control it, you'll simply turn it off. Having a feature that we daren't ever use because it performs poorly helps nobody. This is not a tick-box in our marketing checklist, I want it to be genuinely real-world usable. 
You could make synchronous_standbys a user-settable GUC, just like your proposed boolean switch. You could then control on a per-transaction basis which servers you want to wait to respond. Although perhaps it would be more user-friendly to just have an additional boolean GUC, similar to synchronous_commit=on/off. Or maybe synchronous_commit is enough to control that. I suppose we might regard the feature set I am proposing as being the same as making synchronous_standbys a USERSET parameter, and allowing just two options: none - allowing the user to specify async if they wish it * - allowing people to specify that syncing to *any* standby is acceptable We can blend the two approaches together, if we wish, by having two parameters (plus server naming) synchronous_replication = on | off (USERSET) synchronous_standbys = '...' If synchronous_standbys is not set and synchronous_replication = on then we sync to any standby. If synchronous_replication = off then we use async replication, whatever synchronous_standbys is set to. If synchronous_standbys is
Re: [HACKERS] Old git repo
On Fri, Dec 31, 2010 at 13:10, Robert Haas robertmh...@gmail.com wrote: On Fri, Dec 31, 2010 at 4:58 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Dec 31, 2010 at 03:04, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Davis pg...@j-davis.com writes: Personally, my utility for the old repo is not much (if it was anything important, I wouldn't have relied on the unofficial repo). But we should probably give a little bit of warning for folks that might want to rebase or translate some old notes. Well, I guess the question is how much warning. I suggested O(1 week) but Robert seems to want O(1 year). As long as there's some agreed deadline, I'm not very picky about what it is. I don't need to get rid of it *now*, but I agree a deadline is good. How about we either say when 9.1 is released, or we say september 2011 because that's a year after we made the switch? Either of those would be fine with me. Let's just decide it's when 9.1 is released. And I put it on you to remind me when the time comes ;) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep Design
On 31.12.2010 14:40, Robert Haas wrote: Someone may have proposed this before, but one way of getting standby naming for free would be to make the standby names the same as the roles used to log in, rather than adding a separate parameter. We could just recommend to people that they use a separate, descriptive role name for each standby. Then the synchronous_standbys parameter - when added - would mean a standby from one of these roles. Seems a bit weird. It's not a lot of effort to give each standby a name. But if you want something automatic, how about gethostname()? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] contrib/snapshot
Happy new year, fellow pgsql-hackers! This is the first alpha release of a new, hopefully quite interesting, little tool named snapshot. Feedback welcomed. -- Best regards, Joel Jacobson Glue Finance URL https://github.com/gluefinance/snapshot

DESCRIPTION Take a snapshot of, or roll back, all your stored procedures in your PostgreSQL database.

RATIONALE Before reading any further, ask yourselves the following questions. 1. Have you ever a) modified stored procedures in your production database, b) thought it went OK because all your tests passed, c) later on realized something was wrong, and d) been unable to find or fix the bug immediately, leaving you no option other than a revert? If so, go to step 2. If not, go to step 4. 2. During the minutes/hours while your malfunctioning patch made a mess in the production database, was there any user activity causing important writes to the database? If so, go to step 3. If not, go to step 4. 3. Did you enjoy the revert experience in step 1? If so, go to step 4. If not, go to step 5. 4. Are any of the following statements TRUE? a) your application is not very database centric. b) your users won't stop using your service if you lose their data. c) your application is read-only. d) your application does not have a lot of user traffic. If so, lucky you! If not, you probably have a good solution to my problem already; I would highly appreciate it if you wanted to share it with me, please contact me at j...@gluefinance.com. 5. This proposed solution might be interesting for you. I would highly appreciate your feedback on how to improve it, please contact me at j...@gluefinance.com.

INTRODUCTION snapshot can take a snapshot of all your database functions and the objects depending on them, such as constraints and views using functions. snapshot can roll back to a previous snapshot without modifying any of your data or tables. It will only execute the minimum set of drop/create commands to carry out the rollback.
snapshot depends on the pgcrypto contrib package.

TERMINOLOGY

object type      objects of the same type are created and dropped the same way,
                 i.e. they use the same functions to build proper create and
                 drop SQL commands.
object           is of an object type and has a SHA1 hash of its content,
                 consisting of two SQL commands, one to create and another to
                 drop the object.
revision         has a timestamp of when it was created and a list of objects.
snapshot         has a timestamp of when it was taken and has a revision.
active snapshot  the last snapshot taken.
take snapshot    create a new revision of all objects currently live in the
                 database, then create a new snapshot if the revision differs
                 from the active snapshot.
rollback         restores a previously taken snapshot.

SYNOPSIS

-- 1. Take a snapshot.
postgres=# SELECT * FROM snapshot();
 _snapshotid | _revisionid
-------------+-------------
           1 |           1
(1 row)

-- 2. Take a snapshot.
postgres=# SELECT * FROM snapshot();
 _snapshotid | _revisionid
-------------+-------------
           1 |           1
(1 row)

-- 3. We notice nothing changed between steps 1 and 2.

-- 4. Modify your functions.
postgres=# CREATE FUNCTION myfunc() RETURNS VOID AS $$ $$ LANGUAGE sql;
CREATE FUNCTION
glue=# \df myfunc
                        List of functions
 Schema |  Name  | Result data type | Argument data types |  Type
--------+--------+------------------+---------------------+--------
 public | myfunc | void             |                     | normal
(1 row)

-- 5. Take a snapshot.
postgres=# SELECT * FROM snapshot();
 _snapshotid | _revisionid
-------------+-------------
           2 |           2
(1 row)

-- 6. Rollback to snapshot 1.
postgres=# SELECT * FROM snapshot(1);
 _snapshotid | _revisionid
-------------+-------------
           3 |           1
(1 row)

-- 7. We notice the function we created in step 4 has been dropped.
postgres=# \df myfunc
                        List of functions
 Schema | Name | Result data type | Argument data types | Type
--------+------+------------------+---------------------+------
(0 rows)

-- 8. Rollback to snapshot 2.
postgres=# SELECT * FROM snapshot(2);
 _snapshotid | _revisionid
-------------+-------------
           4 |           2
(1 row)

-- 9. We notice the function we created in step 4 has been re-created.
postgres=# \df myfunc
                        List of functions
 Schema |  Name  | Result data type | Argument data types |  Type
--------+--------+------------------+---------------------+--------
 public | myfunc | void
Re: [HACKERS] and it's not a bunny rabbit, either
Excerpts from Robert Haas's message of vie dic 31 02:07:18 -0300 2010: I think that's true in some cases but not all. The system-generated attribute names thing actually applies in several cases, and I think it's pretty cut-and-dried. When you get into something like which kinds of relations support triggers, that's a lot more arbitrary. I think part of the problem with the phrase system-generated attribute names is: how are users supposed to figure out what that means, and what relation types it applies to? It seems entirely non-obvious. I think for now what I had better do is try to get this SQL/MED patch finished up by soldiering through this mess rather than trying to fix it. I think it's going to be kind of ugly, but if we haven't got another plan then we're just going to have to live with the ugliness. Perhaps it would make sense to fix the cases for which there is a consensus, and leave the rest alone for now. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Snapshot synchronization, again...
Excerpts from Joachim Wieland's message of vie dic 31 00:15:57 -0300 2010: On Thu, Dec 30, 2010 at 9:40 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Disadvantage of b: It doesn't allow a snapshot to be installed on a different server. It requires a serializable open transaction to hold the snapshot. Why does it require a serializable transaction? You could simply register the snapshot in any transaction. (Of course, the net effect would be pretty similar to a serializable transaction). I am not assuming that the publishing transaction blocks until its snapshot is being picked up. A read committed transaction would get a new snapshot for every other query, so the published snapshot is no longer represented by an actual backend until it is being picked up by one. Since nobody is holding off xmin/GlobalXmin, eventually vacuum would remove tuples that the published-but-not-yet-picked-up snapshot should still be able to see, no? A backend can have any number of snapshots registered, and those don't allow GlobalXmin to advance. Consider an open cursor, for example. Even if the rest of the transaction is read committed, the snapshot registered by the open cursor still holds back GlobalXmin. My (handwavy) idea is that whenever the transaction calls pg_publish_snapshot(), said snapshot is registered, which makes it safe to use even if the transaction continues to operate and obtain newer snapshots. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] estimating # of distinct values
Excerpts from Tom Lane's message of jue dic 30 23:02:04 -0300 2010: Alvaro Herrera alvhe...@commandprompt.com writes: I was thinking that we could have two different ANALYZE modes, one full and one incremental; autovacuum could be modified to use one or the other depending on how many changes there are (of course, the user could request one or the other, too; not sure what should be the default behavior). How is an incremental ANALYZE going to work at all? It has no way to find out the recent changes in the table, for *either* inserts or deletes. Unless you want to seqscan the whole table looking for tuples with xmin later than something-or-other ... which more or less defeats the purpose. Yeah, I was thinking that this incremental ANALYZE would be the stream in the stream-based estimator but evidently it doesn't work that way. The stream that needs to be passed to the estimator consists of new tuples as they are being inserted into the table, so this would need to be done by the inserter process ... or it'd need to transmit the CTIDs for someone else to stream them ... not an easy thing, in itself. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
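The "stream-based estimator" idea discussed above can be illustrated with a minimal KMV (k minimum values) distinct-value sketch — a hypothetical Python sketch of the general technique, not the estimator actually proposed for ANALYZE; function names and the choice of SHA1 as the hash are illustrative only:

```python
import hashlib
import heapq

def kmv_estimate(stream, k=256):
    """Estimate the number of distinct values in a stream by keeping
    the k smallest hash values seen (a KMV sketch). Each value is
    hashed to a point in [0, 1); if n distinct values are seen, the
    k-th smallest point is expected near k / (n + 1)."""
    heap = []        # max-heap of the k smallest points, stored negated
    members = set()  # the points currently kept, for de-duplication
    for value in stream:
        h = int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")
        x = h / 2.0**64  # map the 64-bit hash to [0, 1)
        if x in members:
            continue  # duplicate value: same hash, already counted
        if len(heap) < k:
            heapq.heappush(heap, -x)
            members.add(x)
        elif x < -heap[0]:
            # x displaces the current k-th smallest point
            evicted = -heapq.heappushpop(heap, -x)
            members.discard(evicted)
            members.add(x)
    if len(heap) < k:
        return len(heap)            # fewer than k distinct values: exact
    return int((k - 1) / -heap[0])  # standard KMV estimator (k-1)/x_(k)
```

With k = 256 the relative error is roughly 1/sqrt(k), about 6%, which shows why a single pass over an insert stream can maintain a usable ndistinct figure without rescanning the table.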
Re: [HACKERS] and it's not a bunny rabbit, either
On Fri, Dec 31, 2010 at 8:10 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: I think for now what I had better do is try to get this SQL/MED patch finished up by soldiering through this mess rather than trying to fix it. I think it's going to be kind of ugly, but if we haven't got another plan then we're just going to have to live with the ugliness. Perhaps it would make sense to fix the cases for which there is a consensus, and leave the rest alone for now. Sure. Which cases do we have consensus on? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep Design
On Fri, Dec 31, 2010 at 7:57 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 31.12.2010 14:40, Robert Haas wrote: Someone may have proposed this before, but one way of getting standby naming for free would be to make the standby names the same as the roles used to log in, rather than adding a separate parameter. We could just recommend to people that they use a separate, descriptive role name for each standby. Then the synchronous_standbys parameter - when added - would mean a standby from one of these roles. Seems a bit weird. It's not a lot of effort to give each standby a name. But if you want something automatic, how about gethostname() ? Uh, wow, no way. That would be making a critical aspect of system reliability depend on something way, way outside of our configuration. What's weird about using the role name? That's our standard way of distinguishing between two or more users. Why invent something new? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] and it's not a bunny rabbit, either
On tor, 2010-12-30 at 11:03 -0500, Robert Haas wrote: No, quite the opposite. With the other approach, you needed:

constraints cannot be used on views
constraints cannot be used on composite types
constraints cannot be used on TOAST tables
constraints cannot be used on indexes
constraints cannot be used on foreign tables

With this, you just need:

constraints can only be used on tables

At the beginning of this thread you said that the error messages should focus on what you tried to do, not what you could do instead. Also, in this particular case, the user could very well assume that a TOAST table or a foreign table is a table.
Re: [HACKERS] Sync Rep Design
On 12/31/2010 02:39 PM, Robert Haas wrote: On Fri, Dec 31, 2010 at 7:57 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 31.12.2010 14:40, Robert Haas wrote: Someone may have proposed this before, but one way of getting standby naming for free would be to make the standby names the same as the roles used to log in, rather than adding a separate parameter. We could just recommend to people that they use a separate, descriptive role name for each standby. Then the synchronous_standbys parameter - when added - would mean a standby from one of these roles. Seems a bit weird. It's not a lot of effort to give each standby a name. But if you want something automatic, how about gethostname() ? Uh, wow, no way. That would be making a critical aspect of system reliability depend on something way, way outside of our configuration. +1 What's weird about using the role name? That's our standard way of distinguishing between two or more users. Why invent something new? wel a user is not a host/server for me - given there is no real benefit from using distinct roles for each standby yet I don't see why we should complicate the replication setup procedure by requiring 1 role per standby. So I'm all for giving each standby a name but please make it an explicit one and not something that is only vaguely related to the actual standby host. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] and it's not a bunny rabbit, either
On tor, 2010-12-30 at 11:49 -0500, Tom Lane wrote: ISTM there are four things we might potentially want to state in the error message: the feature/operation you tried to apply, the name of the object you tried to apply it to, the type of that object, and the set of object types that the feature/operation will actually work for. I think the latter should be completely omitted unless it's exceptionally important. You can construct pretty silly things down this line:

ERROR: permission denied for relation x
ERROR: relation x does not exist

vs.

ERROR: you only have permission on relation a, b, c
ERROR: only the following relations exist: a, b, c
Re: [HACKERS] Sync Rep Design
On Fri, Dec 31, 2010 at 8:48 AM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: What's weird about using the role name? That's our standard way of distinguishing between two or more users. Why invent something new? well, a user is not a host/server for me - given there is no real benefit from using distinct roles for each standby yet, I don't see why we should complicate the replication setup procedure by requiring 1 role per standby. So I'm all for giving each standby a name, but please make it an explicit one and not something that is only vaguely related to the actual standby host. OK. If that's the consensus, can someone post a patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming replication as a separate permissions
On Thu, Dec 30, 2010 at 15:54, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-12-29 at 11:09 +0100, Magnus Hagander wrote: I've applied this version (with some minor typo-fixes). This page is now somewhat invalidated: http://developer.postgresql.org/pgdocs/postgres/role-attributes.html Hmm. Somehow I missed that page completely when looking through the docs. I'll go update that. First, it doesn't mention the replication privilege, and second it continues to claim that superuser status bypasses all permission checks. Well, that was *already* wrong. Superuser doesn't bypass NOLOGIN. That doesn't mean it shouldn't be fixed, but that's independent of the replication role. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Snapshot synchronization, again...
On Fri, Dec 31, 2010 at 8:28 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: A backend can have any number of snapshots registered, and those don't allow GlobalXmin to advance. Consider an open cursor, for example. Even if the rest of the transaction is read committed, the snapshot registered by the open cursor still holds back GlobalXmin. My (handwavy) idea is that whenever the transaction calls pg_publish_snapshot(), said snapshot is registered, which makes it safe to use even if the transaction continues to operate and obtain newer snapshots. Cool, even better that this is taken care of already :-) Thanks, Joachim
Re: [HACKERS] Sync Rep Design
On Fri, 2010-12-31 at 07:33 -0500, Aidan Van Dyk wrote: On Fri, Dec 31, 2010 at 5:26 AM, Simon Riggs si...@2ndquadrant.com wrote: Your picture above is a common misconception. I will add something to the docs to explain this. 2. sync does not guarantee that the updates to the standbys are in any way coordinated. You can run a query on one standby and get one answer and at the exact same time run the same query on another standby and get a different answer (slightly ahead/behind). That also means that if the master crashes one of the servers will be ahead or behind. You can use pg_last_xlog_receive_location() to check which one that is. When people say they want *all* servers to respond, its usually because they want (2), but that is literally impossible in a distributed system. Just to try and be clear again, in sync that Stefan and I are talking about, we really don't care that the slave could be a hot standby answering queries. In fact, mine wouldn't be. Mine would likely be pg_streamrecv or something. I'm just looking for a guarantee that I've got a copy of the data safely in the next rack, and a separate building before I tell the client I've moved his money. I want a synchronous replication of the *data*, and not a system where I can distribute queries. I'm looking for disaster mitigation, not load mitigation. A replacement for clustered/replicated devices/filesystems under pg_xlog. Having the next rack slave be hot in terms of applying WAL and ready to take over instantly would be a bonus, as long as I can guarantee it's current (i.e has all data a primary's COMMIT has acknowledged). So, that's what I want, and that's what your docs suggest is impossible currently; 1st past post means that I can only ever reliably configure 1 sync slave and be sure it will have all acknowledged commits. 
I can likely get *close* to that by putting only my slowest slave as the only sync slave, and monitoring the heck out of my "asynchronous but I want it to be synchronous" slave, but I'd rather trust the PG community to build robust synchronization than myself to build robust enough monitoring to catch that my slave is farther behind than the slower synchronous one. That said, I think the expectation is that if I were building a query-able hot standby cluster in sync rep mode, once I get a commit confirmation, I should be able to then initiate a new transaction on any member of that sync rep cluster and see the data I just committed. Yes, I know I could see *newer* data. And I know that the primary could already have newer data. Yes, we have the problem even on a single pg cluster on a single machine. But the point is that if you've committed, any new transactions see *at least* that data or newer. But no chance of older. But personally, I'm not interested in that ;-) I understand your requirements, listed above. There are good technical reasons why trying to achieve *all* of the above lets slip the other unstated requirements of availability, complexity, performance etc. Inventing parameter combinations merely hides the fact that these things aren't all simultaneously achievable. In light of that, I have been espousing a simple approach to the typical case, and for the first release. I can see that people may assume my words have various other reasons behind them, but that's not the case. If I could give it all to you, I would. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
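The progress checks discussed above can be done with functions that already exist in 9.0-era PostgreSQL — a sketch, with each statement run on the node indicated:

```sql
-- On the primary: the current WAL insert position
SELECT pg_current_xlog_location();

-- On each standby: the last WAL position received and synced to disk
SELECT pg_last_xlog_receive_location();

-- On each standby: the last WAL position replayed (visible to queries)
SELECT pg_last_xlog_replay_location();
```

Comparing the receive locations across standbys after a primary crash shows which one is furthest ahead and therefore the best failover candidate.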
Re: [HACKERS] contrib/snapshot
On Dec 31, 2010, at 5:00 AM, Joel Jacobson wrote: Happy new year fellow pgsql-hackers! This is the first alpha release of a new hopefully quite interesting little tool, named snapshot. Feedback welcomed. This looks awesome, Joel! One question: Why the dependence on pgcrypto? If it's just for SHA1 support, and you're just using it to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes? Looks really cool. Best, David
Re: [HACKERS] contrib/snapshot
2010/12/31 David E. Wheeler da...@kineticode.com This looks awesome, Joel! One question: Why the dependence on pgcrypto? If it's just for SHA1 support, and you're just using it to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes? Thanks for the fast reply. My guests hate me because I had to escape from the dinner party, but I simply couldn't wait a whole year fixing this bug. Commit. -- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] contrib/snapshot
On Dec 31, 2010, at 10:15 AM, Joel Jacobson wrote: 2010/12/31 David E. Wheeler da...@kineticode.com This looks awesome, Joel! One question: Why the dependence on pgcrypto? If it's just for SHA1 support, and you're just using it to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes? Thanks for the fast reply. My guests hate me because I had to escape from the dinner party, but I simply couldn't wait a whole year fixing this bug. lol! Go enjoy the party. There was no rush -- and no bug! :-) Best, David
Re: [HACKERS] contrib/snapshot
On Fri, 2010-12-31 at 14:00 +0100, Joel Jacobson wrote: This is the first alpha release of a new hopefully quite interesting little tool, named snapshot. Please call it something other than snapshot. There's already about 3 tools called something similar and a couple of different meanings of the term in the world of Postgres. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep Design
On Fri, 2010-12-31 at 14:40 +0200, Heikki Linnakangas wrote: On 31.12.2010 13:48, Simon Riggs wrote: I see significant real-world issues with configuring replication using multiple named servers, as described in the link above: All of these points only apply to specifying *multiple* named servers in the synchronous_standbys='...' list. Unfortunately, some of the points apply to using named servers ever, even if there is only one. That's certainly a more complicated scenario, and the configuration is more complicated as a result. With your proposal, it's not possible in the first place. Multiple synchronous standbys probably isn't needed by most people, so I'm fine with leaving that out for now, keeping the design the same otherwise. I included it in the proposal because it easily falls out of the design. So, if you're worried about the complexities of multiple synchronous standbys, let's keep the UI exactly the same as what I described in the link above, but only allow one name in the synchronous_standbys setting, instead of a list. The best usage recommendation is still to have 2+ standbys, *any* of which can be used to provide sync rep. That is the best performance, best availability and easiest to configure that I know of. That best usage is not achievable with uniquely named servers; using non-unique names defeats the point of having names in the first place. I accept that the best usage is a general case and there may be circumstances that make the difficulties of named servers worth the trouble. So replicating to multiple synchronous standbys is definitely needed in this release. *Confirming* replication to multiple named sync standbys is the thing we don't need in this release. 
I suppose we might regard the feature set I am proposing as being the same as making synchronous_standbys a USERSET parameter, and allowing just two options: none - allowing the user to specify async if they wish it * - allowing people to specify that syncing to *any* standby is acceptable We can blend the two approaches together, if we wish, by having two parameters (plus server naming) synchronous_replication = on | off (USERSET) synchronous_standbys = '...' If synchronous_standbys is not set and synchronous_replication = on then we sync to any standby. If synchronous_replication = off then we use async replication, whatever synchronous_standbys is set to. If synchronous_standbys is set, then we use sync rep to all listed servers. Sounds good. I still don't like the synchronous_standbys='' and synchronous_replication=on combination, though. IMHO that still amounts to letting the standby control the behavior on master, and it makes it impossible to temporarily add an asynchronous standby to the mix. I could live with it, you wouldn't be forced to use it that way after all, but I would still prefer to throw an error on that combination. Or at least document the pitfalls and recommend always naming the standbys. We need a parameter set that makes the best practice easy/easiest to specify, and yet more complicated configurations possible. So I'm happy to add synchronous_standbys parameter, as long as it is possible to specify any (for which I would use *), which would be the default. Initially that would be restricted to just one name. Will pass the server name as an option after IDENTIFY SYSTEM name. Anyway, lets continue the discussion next year. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
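The blended parameter set Simon describes above might look like this in postgresql.conf — a sketch of the proposal's semantics with illustrative values; these names and defaults are from this discussion, not settings that exist in any released PostgreSQL version:

```
# Primary-side settings (proposed, hypothetical)
synchronous_replication = on    # USERSET; off => this session commits async,
                                # whatever synchronous_standbys is set to
synchronous_standbys = '*'      # '*' = any standby may confirm the commit;
                                # a name (only one allowed initially) restricts
                                # confirmation to that named standby
```

Under these semantics a per-transaction `SET synchronous_replication = off` gives the user-level performance control Simon argues for, while naming a standby gives the administrative control Heikki argues for.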
Re: [HACKERS] contrib/snapshot
2010/12/31 Simon Riggs si...@2ndquadrant.com Please call it something other than snapshot. There's already about 3 tools called something similar and a couple of different meanings of the term in the world of Postgres. Thanks, good point. Renamed to fsnapshot. Commit. -- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] contrib/snapshot
Hi. It would be useful to add a column with the timestamp of the revision, and a comment. Can you do it? Not today, though, so that your friends don't kill you.. ;-) -- Sent from my mobile device pasman
Re: [HACKERS] Sync Rep Design
On 31.12.2010 13:40, Heikki Linnakangas wrote:
> Sounds good. I still don't like the synchronous_standbys='' and synchronous_replication=on combination, though. IMHO that still amounts to letting the standby control the behavior on the master, and it makes it impossible to temporarily add an asynchronous standby to the mix.

A sync standby _will_have_ the ability to control the master anyway, simply by being there or not. What is currently proposed is having dual power lines / dual UPSes and working happily on when one of them fails. Requiring both of them to be present defeats the original purpose of doubling them.

So following Simon's design of two standbys with only one required to ACK the commit, you get 2x the reliability of a single standby. In a design where you have two standbys and both are required to ACK the commit, you get only 1/2 the reliability of a single standby. With a list of 10 standbys, requiring an ACK from all of them, you get only 10% of the reliability.

I agree that there can be scenarios where you may need 10 sync copies before committing on the master - usually for non-technical reasons like some accounting law or whatever - but these are far rarer than the requirement to have reasonable performance and 99.999% system uptime when using only 99.99% reliable hardware. And in such cases where you need multiple copies, you will need some out-of-database technology (like signed timestamps) to make the data non-falsifiable as well, so you can't solve this with just configuring sync rep.

> I could live with it, you wouldn't be forced to use it that way after all, but I would still prefer to throw an error on that combination. Or at least document the pitfalls and recommend always naming the standbys.

My proposal amounts to "let's add synchronous_standbys as a parameter in 9.2". If you really think that we need that functionality in this release, let's get the basic stuff added now and then fold in those ideas on top afterwards. If we do that, I will help.
However, my only insistence is that we explain the above points very clearly in the docs, to specifically dissuade people from using those features for typical cases.

Huh, wait, if you leave out synchronous_standbys, that's a completely different UI again. I think we've finally reached agreement on how this should be configured; let's stick to that, please. (I would be fine with limiting synchronous_standbys to just one server in this release, though.)

If you wondered why I ignored your post previously, it's because I understood that Fujii's post of 15 Oct, one week later, effectively accepted my approach, albeit with two additional parameters. That is the UI that I had been following. http://archives.postgresql.org/pgsql-hackers/2010-10/msg01009.php

That thread makes no mention of how to specify which standbys are synchronous and which are not.

The simplest way would be to have separate database users for sync and async standbys? That would allow any standby with the right credentials to act as a sync standby, and those who are not eligible would not be accepted even if they tried to act as a synchronicity (?) provider.

It's about specifying the timeout and whether to wait for a disconnected standby. Yeah, Fujii-san's proposal seems reasonable for configuring that.

Hannu Krosing
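Hannu's reliability arithmetic above can be sketched numerically. This is an illustration of the argument only, assuming independent standby outages with equal probability; it is not PostgreSQL code:

```python
def p_blocked_first_ack(p_down, n):
    """First-ack-wins: commits block only if every one of n standbys is down."""
    return p_down ** n

def p_blocked_all_ack(p_down, n):
    """All-must-ack: commits block if any one of n standbys is down."""
    return 1 - (1 - p_down) ** n

# With 99.99%-available standbys (p_down = 1e-4):
p = 1e-4
print(p_blocked_first_ack(p, 2))   # ~1e-8: a second standby adds nines
print(p_blocked_all_ack(p, 2))     # ~2e-4: roughly twice as bad as one standby
print(p_blocked_all_ack(p, 10))    # ~1e-3: requiring 10 acks, ~10x worse
```

This matches the 99.999%-uptime-from-99.99%-hardware point: first-ack-wins multiplies availability, all-must-ack divides it.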
Re: [HACKERS] TODO item for pg_ctl and server detection
Bruce Momjian wrote:
> Bruce Momjian wrote:
>> Yes, that was my calculus too. I realized that we create session ids by merging the process id and backend start time, so I went ahead and added the postmaster start time epoch to the postmaster.pid file. While there is no way to pass back the postmaster start time from PQping, I added code to pg_ctl to make sure the time in the postmaster.pid file is not _before_ pg_ctl started running. We only check PQping() after we have started the postmaster ourselves, so it fits our needs.
>
> Tom suggested that there might be clock skew between pg_ctl and the postmaster, so I added a 2-second slop in checking the postmaster start time. Tom also wanted the connection information to be output all at once, but that causes a problem with detecting pre-9.1 servers, so I avoided it.

Patch applied, and TODO item removed, because the patch now mostly detects if a stale postmaster created the postmaster.pid file. The TODO was:

  Allow pg_ctl to work properly with configuration files located outside the PGDATA directory

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
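The stale-pid-file check Bruce describes (postmaster start time recorded in postmaster.pid, compared against pg_ctl's own start time, with a 2-second allowance for clock skew) can be sketched like this. The function name and structure are illustrative only, not pg_ctl's actual C code:

```python
CLOCK_SKEW_SLOP_SECS = 2  # allow small skew between pg_ctl and the postmaster

def pidfile_is_current(postmaster_start_epoch, pg_ctl_start_epoch):
    """A postmaster.pid whose recorded start time predates pg_ctl's own
    start (beyond the slop) must belong to a stale postmaster, not the
    one pg_ctl just launched."""
    return postmaster_start_epoch >= pg_ctl_start_epoch - CLOCK_SKEW_SLOP_SECS

# Start time 1s "before" pg_ctl's: within the skew allowance, accepted.
print(pidfile_is_current(1000, 1001))  # True
# Pid file written five minutes earlier: stale.
print(pidfile_is_current(700, 1000))   # False
```

The check only makes sense in the direction Bruce describes: pg_ctl runs PQping() after starting the postmaster itself, so a fresh pid file can never legitimately be much older than pg_ctl.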
Re: [HACKERS] pl/python refactoring
On Thu, 2010-12-23 at 14:41 +0100, Jan Urbański wrote:
> It does some architectural changes to PL/Python that make it easier to implement other features, like for instance a validator function. The full list of changes in the patch is:

I would review this and the following patches, but I'd really prefer it if you could split this particular patch into about 11 single-purpose patches. I think most of the changes here are not interrelated.
Re: [HACKERS] Sync Rep Design
On 12/31/10 4:40 AM, Robert Haas wrote:
> Someone may have proposed this before, but one way of getting standby naming for free would be to make the standby names the same as the roles used to log in, rather than adding a separate parameter. We could just recommend to people that they use a separate, descriptive role name for each standby. Then the synchronous_standbys parameter - when added - would mean a standby from one of these roles.

I like this idea; it has an elegant simplicity about it, and right now I can't think of any real faults. It would have the added benefit that each standby group would show up by the group name in ps and in pg_stat_activity.

However, I agree strongly with Simon that we really want the simplest possible sync rep implementation for 9.1, given that we're currently 15 days away from the *last* commitfest. Nobody, at this point, has really even tested any of the sync rep patches for reliability or performance. Here's how I could see us developing:

Simplest (9.1):
-- synch/async capability set on the standby in recovery.conf
-- synch/async transaction status (async, recv, fsync, apply) declared as a userset by the executing session (from Simon's patch)
-- only one ack for sync, regardless of the number of standbys

This would allow users to have a single sync standby plus a pool of async standbys, which is what I think 90% of users who care about sync standby want.

More Complex (9.2):
-- all of the above, *plus* the ability to have standbys with ROLEs and require acks from one server in each ROLE

Web-Scale (9.3):
-- all of the above, plus group commit and the ability to specify the number of servers in each ROLE that must ack

... but the most important thing is to make the feature for 9.1 simple, simple, simple. We know we won't get it right on the first try.

--Josh Berkus
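The "Simplest (9.1)" tier above could be sketched as follows. This is a guess at proposal-era syntax for illustration only; the parameter names (synchronous_standby in recovery.conf, synchronous_replication as a session GUC) are assumptions drawn from the thread, not committed behavior:

```
# standby side, recovery.conf: opt this standby into sync duty
standby_mode = 'on'
synchronous_standby = 'on'

# master side, per session (USERSET, from Simon's patch):
SET synchronous_replication = 'apply';  -- or: async | recv | fsync
COMMIT;  -- waits for one ack at the requested level, from any sync standby
```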
Re: [HACKERS] estimating # of distinct values
On Dec 31, 2010, at 7:34 AM, Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of Thu Dec 30 23:02:04 -0300 2010:
>> Alvaro Herrera alvhe...@commandprompt.com writes:
>>> I was thinking that we could have two different ANALYZE modes, one full and one incremental; autovacuum could be modified to use one or the other depending on how many changes there are (of course, the user could request one or the other, too; not sure what should be the default behavior).
>>
>> How is an incremental ANALYZE going to work at all? It has no way to find out the recent changes in the table, for *either* inserts or deletes. Unless you want to seqscan the whole table looking for tuples with xmin later than something-or-other ... which more or less defeats the purpose.
>
> Yeah, I was thinking that this incremental ANALYZE would be the stream in the stream-based estimator, but evidently it doesn't work that way. The stream that needs to be passed to the estimator consists of new tuples as they are being inserted into the table, so this would need to be done by the inserter process ... or it'd need to transmit the CTIDs for someone else to stream them ... not an easy thing, in itself.

Perhaps listen/notify could be used for this, now that it allows passing a payload.

BTW, if we reduce the frequency at which full scans of large tables are needed, then presumably the cost of the scans could be largely ignored. If we don't need to scan frequently, then we shouldn't care very much about how long a scan takes, which means we could throttle it heavily. Presumably even a heavily used system can spare 500kB/s of I/O to perform background scanning...

-- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
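Jim's throttling estimate can be made concrete. A sketch of the arithmetic (plain illustration, not an ANALYZE implementation):

```python
def throttled_scan_hours(table_bytes, throttle_bytes_per_sec=500_000):
    """How long a background scan of a table takes at a given I/O throttle."""
    return table_bytes / throttle_bytes_per_sec / 3600.0

# At 500 kB/s, a 10 GB table takes about 6 hours to scan -
# entirely acceptable if full scans are needed only rarely.
print(round(throttled_scan_hours(10 * 1024**3), 1))  # 6.0
```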
Re: [HACKERS] and it's not a bunny rabbit, either
On Fri, Dec 31, 2010 at 8:48 AM, Peter Eisentraut pete...@gmx.net wrote:
> On Thu, 2010-12-30 at 11:03 -0500, Robert Haas wrote:
>> No, quite the opposite. With the other approach, you needed:
>>
>>   constraints cannot be used on views
>>   constraints cannot be used on composite types
>>   constraints cannot be used on TOAST tables
>>   constraints cannot be used on indexes
>>   constraints cannot be used on foreign tables
>>
>> With this, you just need:
>>
>>   constraints can only be used on tables
>
> At the beginning of this thread you said that the error messages should focus on what you tried to do, not what you could do instead.

Yeah, and I still believe that. I'm having difficulty coming up with a workable approach, though. It would be simple enough if we could write:

  /* translator: first %s is a feature, second %s is a relation type */
  %s cannot be used on %s

...but I think this is likely to cause some translation headaches.

> Also, in this particular case, the user could very well assume that a TOAST table or a foreign table is a table.

There's a limited amount we can do about confused users, but it is true that the negative phrasing is better for that case.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
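The translation tension Robert describes can be illustrated outside of C. A toy sketch (not PostgreSQL's actual message catalog) contrasting whole-sentence messages, which translators can render freely, with a composed template whose word order and agreement are fixed by the English pattern:

```python
# One complete sentence per relation type: each string is translated as a
# unit, so the translator controls word order, gender, and agreement.
WHOLE_SENTENCES = {
    "view":           "constraints cannot be used on views",
    "composite type": "constraints cannot be used on composite types",
    "TOAST table":    "constraints cannot be used on TOAST tables",
    "index":          "constraints cannot be used on indexes",
    "foreign table":  "constraints cannot be used on foreign tables",
}

def composed(feature, relkind):
    # translator: first %s is a feature, second %s is a relation type.
    # One string covers every combination, but the translator must make a
    # single template fit all feature/relkind pairs at once.
    return "%s cannot be used on %s" % (feature, relkind)

print(composed("constraints", "views"))
```

Both produce the same English text; the difference only shows up in languages where "on views" and "on indexes" inflect differently.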