Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 6:43 PM, Joshua Berkus j...@agliodbs.com wrote:

> Simon,
>
>> The point I have made is that I disagree with a feature freeze date fixed ahead of time without regard to the content of the forthcoming release. I've not said I disagree with feature freezes altogether, which would be utterly ridiculous. Fixed dates are IMHO much less important than a sensible and useful feature set for our users.
>
> This is such a non-argument it's silly. We have so many new major features for 9.1 that I'm having trouble writing sensible press releases which don't sound like a laundry list.

You're right, this is a non-argument. I am not continuing this debate using the above point. I am merely correcting people's assertions about what I think, which is a little tiresome for all of us, and it would be much better if people didn't foolishly put words in my mouth, as multiple people have done on this thread.

I'm also quite happy with the feature set for 9.1.

>> MySQL repeatedly delivered releases with half-finished features and earned much disrespect. We have never done that previously and I am against doing so in the future.
>
> This is also total BS. I worked on the MySQL team. Before Sun/Oracle, MySQL specifically had feature-driven releases, where Marketing decided what features 5.0, 5.1 and 5.2 would have. They also accepted new features during beta if Marketing liked them enough. This resulted in the 5.1 release being *three years late*, and 5.3 being cancelled altogether. And let's talk about the legendary instability of 5.0, because they decided that they couldn't cancel partitioning and stored procedures, whether they were ready for prime time or not and because they kept changing the API during beta.
>
> MySQL never had time-based releases before Oracle took them over. And Oracle has been having feature-free releases because they're trying to work through MySQL's list of thousands of unfixed bugs which dates back to 2003.

I claimed they delivered half-finished features. You clearly agree with me on that. I'm not sure which part you see as BS?

> An argument for feature-driven releases is in fact an argument for the MySQL AB development model. And that's not a company I want to emulate.

Yes, I've also experienced totally marketing-driven software development, and that's why I'm *here*.

I've spoken at length about how good our process is and have considerable respect for it and the people that have made it work. I am not advocating any changes to it at all, especially not to the model used by MySQL AB.

I have asked that we maintain the reasonableness we have always had about how the feature freeze date was applied. An example of such reasonableness is that if a feature is a few days late and it is important, then it would still go into the release. An example of unreasonableness would be to close the feature freeze on a predetermined date, without regard to the state of the feature set in the release. To date, we have always been reasonable and I don't want to change the process in the way Robert has suggested we should change. I was one of a number of developers making that point at the developer meeting and I would say I was part of the majority view.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Thu, Jun 9, 2011 at 5:09 AM, Simon Riggs si...@2ndquadrant.com wrote:

> I have asked that we maintain the Reasonableness we have always had about how the feature freeze date was applied. An example of such reasonableness is that if a feature is a few days late and it is important, then it would still go into the release. An example of unreasonableness would be to close the feature freeze on a predetermined date, without regard to the state of the feature set in the release. To date, we have always been reasonable and I don't want to change the process in the way Robert has suggested we should change.

Now you're putting words in my mouth. I wouldn't want to put out a release without a good feature set, either, but we don't have that problem. Getting them out on a fairly regular schedule without a really long feature freeze has traditionally been a bit harder. I believe that over the last few releases we've actually gotten better at integrating larger patches while also sticking closer to the schedule; and I'd like to continue to get better at both of those things. I don't advocate blind adherence to the feature freeze date either, but I do prefer to see deviations measured in days or at most weeks rather than months; and I have a lot more sympathy for the "patch submitted and no one got around to reviewing it" situation than I do for the "patch just plain got here late" case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Thu, Jun 9, 2011 at 2:13 PM, Robert Haas robertmh...@gmail.com wrote:

> On Thu, Jun 9, 2011 at 5:09 AM, Simon Riggs si...@2ndquadrant.com wrote:
>
>> I have asked that we maintain the Reasonableness we have always had about how the feature freeze date was applied. An example of such reasonableness is that if a feature is a few days late and it is important, then it would still go into the release. An example of unreasonableness would be to close the feature freeze on a predetermined date, without regard to the state of the feature set in the release. To date, we have always been reasonable and I don't want to change the process in the way Robert has suggested we should change.
>
> Now you're putting words in my mouth. I wouldn't want to put out a release without a good feature set, either, but we don't have that problem. Getting them out on a fairly regular schedule without a really long feature freeze has traditionally been a bit harder. I believe that over the last few releases we've actually gotten better at integrating larger patches while also sticking closer to the schedule; and I'd like to continue to get better at both of those things. I don't advocate blind adherence to the feature freeze date either, but I do prefer to see deviations measured in days or at most weeks rather than months; and I have a lot more sympathy for the "patch submitted and no one got around to reviewing it" situation than I do for the "patch just plain got here late" case.

Can we make this the last post on this topic please?

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
> Can we make this the last post on this topic please?

+1 :)

Thanks,
Pavan
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Jun 7, 2011, at 8:24 AM, Stephen Frost wrote:

> * Alvaro Herrera (alvhe...@commandprompt.com) wrote:
>
>> I note that if 2nd Quadrant is interested in having a game-changing platform without having to wait a full year for 9.2, they can obviously distribute a modified version of Postgres that integrates Robert's patch.
>
> Having thought about this, I've got to agree with Alvaro on this one. The people who need this patch are likely to pull it down and patch it in and use it, regardless of if it's in a release or not. My money is that Treat's already got it running on some massive prod system that he supports ( ;) ).
>
> If we get it into the first CF of 9.2 then people are going to be even more likely to pull it down and back-patch it into 9.1. As soon as we wrap up CF1 and put out our first alpha, the performance testers will have something to point at and say "look! PG scales *even better* now!" and they're not going to particularly care that it's an alpha and the blog-o-sphere isn't going to either, especially if we can say "and it'll be in the next release which is scheduled for May".

From the Thinking Outside The Box dept.:

Also, if the performance gains prove to be as earth-shattering as initial results indicate, there's nothing that says we *have* to wait until the middle of next year to get this out. We could push to get 9.2 out with fewer other features, or possibly even break with tradition and backport this to 9.1 (or perhaps have a fork of 9.1 that we only support until 9.2 is out).

Obviously, those options all involve serious time commitments and the community will have to weigh those carefully. And we'd have to have very strong evidence of the benefits before even having that discussion, because the discussion itself will likely be resource intensive. But the option *is* there, should we decide to pursue it.

This means that "this patch is too important to wait another 12 months" isn't really a valid point: it only has to wait 12 months if that's what the community thinks is best; otherwise it could miss 9.1 *and* be out significantly before 12 months from now.

--
Jim C. Nasby, Database Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 5:19 AM, Bruce Momjian br...@momjian.us wrote:

> Robert Haas wrote:
>
>> On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote:
>>
>>> My point was that we have in the past implemented performance changes to increase scalability at the last minute, and also that our personal risk perspectives are not always set in stone. Robert has highlighted the value of this change and it's clearly not beyond our wit to include it, even if it is beyond our will to do so.
>>
>> So, at the risk of totally derailing this thread -- what this boils down to is a philosophical disagreement.
>>
>> It seems to me (and, I think, to Tom and Heikki and others as well) that it's not possible to keep on making changes to the release right up until the last minute and then expect the release to be of high quality. If we keep committing new features, then we'll keep introducing new bugs. The only hope of making the bug count go down at some point is to stop making changes that aren't bug fixes. We could come up with some complex procedure for determining whether a patch is important enough and non-invasive enough to bypass the normal deadline, but that would probably lead to a lot more arguing about procedure, and realistically, it's still going to increase the bug count at least somewhat. IMHO, it's better to just have a deadline, and stuff either makes it or it doesn't. I realize we haven't always adhered to the principle in the past, but at least IMV that's not a mistake we want to continue repeating.
>
> Simon is right that we slipped the vxid patch into 8.3 when a Postgres user I talked to at Linuxworld mentioned high vacuum freeze activity and simple calculations showed the many read-only queries could cause high xid usage. Fortunately we already had a patch available and Tom applied it during beta. It was an existing patch that took on new urgency during beta.
> Robert's point above is that it isn't so much making the decision of whether something should slip past the deadline, but the time-sapping discussion of whether something should slip, and the frankly disturbing behavior of some in this group to not accept a clear consensus, therefore prolonging the discussion of slippage far longer than necessary.
>
> Basically, if you propose something, and it gets shot down due to procedure, accept that unless you have some very good _new_ reason for continuing the discussion. If you don't like that, then you are not going to do well in our group and maybe this isn't the group for you.
>
> I think we are going to need to be much more forceful about this, and if the threat is that someone has commit rights and therefore we can't ignore them, we will have to reconsider who can commit to this project.
>
> Do I need to be any clearer?

You are very clear, but as to why, I am not sure.

On Monday, realising that Robert had discovered something of massive potential benefit to the community, I asked Tom to take a look at the patch to see if I could get his interest in including it in this release. I did that out of pure altruism; how could I possibly benefit from highlighting the work of another person, another company?

Tom has agreed with me that making tuning proposals during beta is acceptable. In this case, he thinks it is too risky to apply. In fact, I agreed, having reviewed the patch myself, suggesting a much simpler, non-invasive patch instead (a new reason, as you say). I then immediately accepted his decision to exclude any patch involving locking from further consideration.

Given the level of potential benefit, I don't have a problem tapping Tom on the shoulder to review it and see if it is tweakable. At no point have I discussed applying the patch myself, nor have I ever even considered it. The main point is that in his hands a task can be done in days, not the months others have quoted.
You can read that as respect and optimism, or you can see chaos and disrespect, but that is all in the eye of the beholder.

As a result of this, I've been insulted, told I have no respect for process, and it was even suggested there was a threat of patch war. None of that is reasonable or anywhere close to truth.

If there has been a time-sapping discussion, it is because people have jumped to conclusions and responded irrationally. To be honest, I'm completely surprised by all of that. I had no idea that me asking Tom a question was perceived as a denial of service attack on the community, nor that it would result in the comments made to me and about me.

As long as I am allowed the freedom to speak in this forum then I will speak up for PostgreSQL users, committer or not. As long as I'm a committer, I will take responsibility for the code and seek to improve it and fix it according to the community process.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 11:39 AM, Jim Nasby j...@nasby.net wrote:

> On Jun 7, 2011, at 8:24 AM, Stephen Frost wrote:
>
>> * Alvaro Herrera (alvhe...@commandprompt.com) wrote:
>>
>>> I note that if 2nd Quadrant is interested in having a game-changing platform without having to wait a full year for 9.2, they can obviously distribute a modified version of Postgres that integrates Robert's patch.
>>
>> Having thought about this, I've got to agree with Alvaro on this one. The people who need this patch are likely to pull it down and patch it in and use it, regardless of if it's in a release or not. My money is that Treat's already got it running on some massive prod system that he supports ( ;) ).
>>
>> If we get it into the first CF of 9.2 then people are going to be even more likely to pull it down and back-patch it into 9.1. As soon as we wrap up CF1 and put out our first alpha, the performance testers will have something to point at and say "look! PG scales *even better* now!" and they're not going to particularly care that it's an alpha and the blog-o-sphere isn't going to either, especially if we can say "and it'll be in the next release which is scheduled for May".
>
> From the Thinking Outside The Box dept.:
>
> Also, if the performance gains prove to be as earth-shattering as initial results indicate, there's nothing that says we *have* to wait until the middle of next year to get this out. We could push to get 9.2 out with fewer other features, or possibly even break with tradition and backport this to 9.1 (or perhaps have a fork of 9.1 that we only support until 9.2 is out).
>
> Obviously, those options all involve serious time commitments and the community will have to weigh those carefully. And we'd have to have very strong evidence of the benefits before even having that discussion, because the discussion itself will likely be resource intensive. But the option *is* there, should we decide to pursue it.
>
> This means that "this patch is too important to wait another 12 months" isn't really a valid point: it only has to wait 12 months if that's what the community thinks is best; otherwise it could miss 9.1 *and* be out significantly before 12 months from now.

Right. The community gets to decide when the community wants to release, and with what features. Right now, the consensus is that we want to finish up 9.1 and release it. It doesn't seem impossible that we could manage to do that before this patch is ready for commit, which is why I don't want to try to slip this into 9.1 no matter how valuable it is.

I also feel that the fundamental thing we need in order to have better releases is more developers spending more time developing cool stuff. That is why I am somewhat dismayed to see this discussion veer off on what I consider to be a tangent about release scheduling. It took me about 3 days to write the patch. I've now spent the better part of a day on this scheduling discussion. I would rather have spent that time improving the patch. Or working on some other patch. Or getting 9.1 out the door.

Now, mind you, I think release scheduling is important. I believe in the value of good project management. But if we make every cool patch that comes along into an opportunity to fight about the release schedule, that's not productive. Already, I feel that any hope I might have had of getting useful technical feedback on this patch anytime in the near future has been basically obliterated. What a bummer.

As for the 9.2 schedule, I'm actually hoping that 9.2 will be a big release for performance, sorta like 8.3 was. I think that to make that happen, we're going to need more than one good patch. This patch can be part of that picture, but there are many users who derive no benefit or only a small benefit from it. Of course, there are some who will get a big benefit, and I'm as excited about that as everyone else, but if we can broaden the aperture a bit and come up with a variety of improvements that hit on a variety of use cases, then we'll really have something to brag about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 6:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:

> Bruce Momjian br...@momjian.us writes:
>
>> Simon is right that we slipped the vxid patch into 8.3 when a Postgres user I talked to at Linuxworld mentioned high vacuum freeze activity and simple calculations showed the many read-only queries could cause high xid usage. Fortunately we already had a patch available and Tom applied it during beta. It was an existing patch that took on new urgency during beta.
>
> Just to set the record straight on this ... the vxid patch went in on 2007-09-05:
> http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php
> which was a day shy of a month before we wrapped 8.3beta1:
> http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php
> so it was during alpha phase not beta. And 8.3RC1 was stamped on 2008-01-03. So Simon's assertion that this was "days before we produced a release candidate" is correct, if you take "days" as "4 months".

The patch went in slightly more than 6 months after feature freeze, even though it was written by a summer student and did not even pass review by the student's mentor (me). The patch is invasive, involving core changes to the transaction infrastructure and touching more than 30 files.

It was a brilliant contribution from Florian. I take it as an example of

* what you can do when you set your mind to it, given sufficient cause and a good starting point
* how people can propose things of value to the community even at a late stage
* how I have respected the process at other times

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 5:33 AM, Bruce Momjian br...@momjian.us wrote:

> One more thing --- when Tom applied that patch during 8.3 beta it was with everyone's agreement, so the policy should be that if we are going to break the rules, everyone has to agree --- if anyone disagrees, the rules stand.

I spoke against applying the patch, and to my knowledge was the only person to have reviewed it at that stage.

I was happy that Tom applied it, but I would not have done so myself then, nor would I do so now. I would trust only Tom to do that, which is why I proposed to Tom that he look at Robert's patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 12:25 PM, Simon Riggs si...@2ndquadrant.com wrote:

> As a result of this, I've been insulted, told I have no respect for process and even suggested there was a threat of patch war.

Well, you've pretty much said flat out you don't like the process, and you don't agree with having a firm feature freeze. I think it's a perfectly legitimate question to ask whether we're going to have to continually relitigate that point. This is at least the second major dust-up on this point since the end of 9.1CF4, and there were some smaller ones, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 5:32 PM, Robert Haas robertmh...@gmail.com wrote:

> It took me about 3 days to write the patch. I've now spent the better part of a day on this scheduling discussion. I would rather have spent that time improving the patch. Or working on some other patch. Or getting 9.1 out the door.

Sync Rep took 6 days to write initially and about 6 months to discuss, so you have a long way to go before your experience matches mine.

Sometimes people sidetrack you onto things you think are pointless, and sometimes you voice the opinion that they shouldn't have done so.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 5:44 PM, Robert Haas robertmh...@gmail.com wrote:

> On Wed, Jun 8, 2011 at 12:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
>
>> As a result of this, I've been insulted, told I have no respect for process and even suggested there was a threat of patch war.
>
> Well, you've pretty much said flat out you don't like the process, and you don't agree with having a firm feature freeze. I think it's a perfectly legitimate question to ask whether we're going to have to continually relitigate that point. This is at least the second major dust-up on this point since the end of 9.1CF4, and there were some smaller ones, too.

Why do you address this to me? Many others have been committing patches against raised issues well after feature freeze.

You do not wish to stop all patches, only those you disagree with. How would I know you disagree with a patch without discussing it? I note that you've claimed *everything* I have discussed is a new feature, whereas everything you or others have done is an open item. You can claim that everything I suggest is a dust-up if you wish, but who makes it a dust-up and why?

The point I have made is that I disagree with a feature freeze date fixed ahead of time without regard to the content of the forthcoming release. I've not said I disagree with feature freezes altogether, which would be utterly ridiculous. Fixed dates are IMHO much less important than a sensible and useful feature set for our users.

MySQL repeatedly delivered releases with half-finished features and earned much disrespect. We have never done that previously and I am against doing so in the future.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon,

> The point I have made is that I disagree with a feature freeze date fixed ahead of time without regard to the content of the forthcoming release. I've not said I disagree with feature freezes altogether, which would be utterly ridiculous. Fixed dates are IMHO much less important than a sensible and useful feature set for our users.

This is such a non-argument it's silly. We have so many new major features for 9.1 that I'm having trouble writing sensible press releases which don't sound like a laundry list.

> MySQL repeatedly delivered releases with half-finished features and earned much disrespect. We have never done that previously and I am against doing so in the future.

This is also total BS. I worked on the MySQL team. Before Sun/Oracle, MySQL specifically had feature-driven releases, where Marketing decided what features 5.0, 5.1 and 5.2 would have. They also accepted new features during beta if Marketing liked them enough. This resulted in the 5.1 release being *three years late*, and 5.3 being cancelled altogether. And let's talk about the legendary instability of 5.0, because they decided that they couldn't cancel partitioning and stored procedures, whether they were ready for prime time or not and because they kept changing the API during beta.

MySQL never had time-based releases before Oracle took them over. And Oracle has been having feature-free releases because they're trying to work through MySQL's list of thousands of unfixed bugs which dates back to 2003.

An argument for feature-driven releases is in fact an argument for the MySQL AB development model. And that's not a company I want to emulate.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/07/2011 11:55 AM, Tom Lane wrote:

> Simon Riggs si...@2ndquadrant.com writes:
>
>> Before you arrived, it was quite normal to suggest tuning patches after feature freeze.
>
> *Low risk* tuning patches make sense at this stage, yes. Fooling with the lock mechanisms doesn't qualify as low risk in my book. The probability of undetected subtle problems is just too great.
>
> regards, tom lane

I would like to see us continue on the path of release, not destabilization. Any patch that breaks into core feature mechanisms (like locking) is bound to have something unsuspected waiting in the wings.

+1 for submitting for 9.2.
+1 for not committing to 9.1.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Wed, Jun 8, 2011 at 1:10 PM, Simon Riggs si...@2ndquadrant.com wrote:

> Why do you address this to me? Many others have been committing patches against raised issues well after feature freeze.

No one other than you has proposed committing anything nearly as invasive as this, and the great majority of what we've committed has been targeted at new regressions in 9.1. There is a difference between a feature and a bug fix. Sometimes the distinction is arguable, but this isn't one of those cases. A feature freeze does not mean an absolute code freeze; it means a freeze on *features*.

> You do not wish to stop all patches, only those you disagree with. How would I know you disagree with a patch without discussing it? I note that you've claimed *everything* I have discussed is a new feature, whereas everything you or others have done is an open item. You can claim that everything I suggest is a dust-up if you wish, but who makes it a dust-up and why?

I think the people, including me, who feel that it's not a good idea to commit new features have been very clear about the reasons for their position - namely, (1) the desire to get the release out the door in a timely fashion, and (2) the desire to treat everyone's patches in a fair and even-handed way rather than privileging some over others. I'm just as much against committing my own features, or Tom's features, or Alvaro's features as I am against committing your features - not because I don't like the features (I do) but because I want to release 9.1 in about a month.

> The point I have made is that I disagree with a feature freeze date fixed ahead of time without regard to the content of the forthcoming release. I've not said I disagree with feature freezes altogether, which would be utterly ridiculous. Fixed dates are IMHO much less important than a sensible and useful feature set for our users.
>
> MySQL repeatedly delivered releases with half-finished features and earned much disrespect. We have never done that previously and I am against doing so in the future.

So am I. But apparently, we have very different ideas of what that means. I thought that making the server shut down properly, even if you are using sync rep, was a clear-cut case of correcting a half-finished feature, but you argued against that change. And I think that revamping the locking mechanism so it's faster is clearly a new feature, not a repair to something half-finished.

I don't expect it's very realistic to think that everyone is going to agree on every patch, but if we can't agree that bug fixes and features should be treated differently, or if we can't agree at least in most cases on what the difference is between one and the other, then we will spend a lot of time talking past each other.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes: On Wed, Jun 8, 2011 at 6:02 AM, Tom Lane t...@sss.pgh.pa.us wrote: Just to set the record straight on this ... the vxid patch went in on 2007-09-05: http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php which was a day shy of a month before we wrapped 8.3beta1: http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php so it was during alpha phase not beta. And 8.3RC1 was stamped on 2008-01-03. So Simon's assertion that this was days before we produced a release candidate is correct, if you take days as 4 months. The patch went in slightly more than 6 months after feature freeze, even though it was written by a summer student and did not even pass review by the student's mentor (me). I'm not sure why you're having such a hard time distinguishing before beta from after beta, but in any case please notice that you're describing a cycle where we spent nine months in feature freeze. Nobody else here is going to hold that up as an example of sound project management that we ought to repeat. And the way to not repeat it is to not accept risky new patches late in the cycle. (This may be something of an apples-to-oranges comparison, though, since as best I can tell from a quick look in the archives, we were not then using the term feature freeze the same as we are now --- 2007-04-01 seems to have been the point that we would now call beginning of the last CF, ie, all feature patches for 8.3 were supposed to have been *submitted*, not necessarily committed. And we had a lot of them pending at that point, because of lack of the CF process to get things in earlier.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Joshua Berkus j...@agliodbs.com writes: Simon, The point I have made is that I disagree with a feature freeze date fixed ahead of time without regard to the content of the forthcoming release. I've not said I disagree with feature freezes altogether, which would be utterly ridiculous. Fixed dates are IMHO much less important than a sensible and useful feature set for our users. This is such a non-argument it's silly. Perhaps more to the point, we've tried that approach in the past, repeatedly, and it's been a scheduling disaster every single time. Slipping the release date in order to get in newly-written features, no matter *how* attractive they are, does not work. Maybe there are people who can make it work, but not us. regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 12:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: Dave Page dp...@pgadmin.org writes: On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote: If we're going to start putting in changes like this, I'd suggest that we try and target something like September for 9.1 to actually be released. Playing with the lock management isn't something we want to be doing lightly and I think we definitely need to have serious testing of this, similar to what has been done for the SSI changes, before we're going to be able to release it. Completely aside from the issue at hand, aren't we looking at a September release by now anyway (assuming we have to avoid late July/August as we usually do)? Very possibly. So if we add this in, we're talking November or December instead of September. You can't argue that July/August will be lost time for one development path but not another. That would depend on 2 things - a) whether testing and review of this single patch would really add 2 - 3 months to the schedule (I'm no expert on our locking, but I suspect it would not), and b) whether there are people around over the summer who could test/review. The reason we usually skip the summer isn't actually a wholesale lack of people - it's because it's not so good from a publicity perspective, and it's hard to get all the packagers around at the same time. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
* Alvaro Herrera (alvhe...@commandprompt.com) wrote: I note that if 2nd Quadrant is interested in having a game-changing platform without having to wait a full year for 9.2, they can obviously distribute a modified version of Postgres that integrates Robert's patch. Having thought about this, I've got to agree with Alvaro on this one. The people who need this patch are likely to pull it down and patch it in and use it, regardless of if it's in a release or not. My money is that Treat's already got it running on some massive prod system that he supports ( ;) ). If we get it into the first CF of 9.2 then people are going to be even more likely to pull it down and back-patch it into 9.1. As soon as we wrap up CF1 and put out our first alpha, the performance testers will have something to point at and say look! PG scales *even better* now! and they're not going to particularly care that it's an alpha and the blog-o-sphere isn't going to either, especially if we can say and it'll be in the next release which is scheduled for May. So, all-in-all, -1 from me on trying to get this into 9.1. Let's get 9.1 done and out the door already, hopefully before summer saps away *too* many resources.. Thanks, Stephen
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/06/2011 04:43 PM, Robert Haas wrote: On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011: I've now spent enough time working on this issue now to be convinced that the approach has merit, if we can work out the kinks. I'll start with some performance numbers. I hereby recommend that people with patches such as this one while on the last weeks till release should refrain from posting them until the release has actually taken place. %@#! Next time I'll be sure to only post my patches during beta if they suck. I think Alvaro's point isn't directed at you Robert but at the idea that this should be applied to 9.1. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:50 PM, Dave Page dp...@pgadmin.org wrote: On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: On 06/06/2011 09:24 PM, Dave Page wrote: On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “should this very change be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. I disagree - the proposed patch maybe provides a very significant improvement for a certain workload type (nothing less but nothing more), but it was posted way after -BETA and I'm not sure we yet understand all implications of the changes. We certainly need to be happy with the implications if we were to make such a decision. We also have to consider that the underlying issues are known problems for multiple years^releases so I don't think there is a particular rush to force them into a particular release (as in 9.1). No, there's no *technical* reason we need to do this, as there would be if it were a bug fix for example. I would just like to see us narrow the gap with our competitors sooner rather than later, *if* we're a) happy with the change, and b) we're talking about a minimal delay (which we may be - Robert says he thinks the patch is good, so with another review and beta testing). Stefan/Robert's observation that we perform a VirtualXactLockTableInsert() to no real benefit is a good one. It leads to the following simple patch to remove one lock table hit per transaction. It's a lot smaller impact on the LockMgr locks, but it will still be substantial. Performance tests please? 
This patch is much less invasive and has impact only on CREATE INDEX CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and test and there's no way it will cause any delay at all to the release schedule. (Though I'm sure Robert can improve it). If we combine this patch with Koichi-san's recommended changes to the number of lock partitions, we will have considerable impact for 9.1. Robert will still get his day in the sun, just with 9.2. This way we get something now *and* something later, while the risk minimisers will have succeeded in protecting the code. A compromise for everyone. Please consider this as a serious proposal for tuning in 9.1. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services remove_VirtualXactLockTableInsert.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 12:51 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, Jun 6, 2011 at 8:50 PM, Dave Page dp...@pgadmin.org wrote: On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: On 06/06/2011 09:24 PM, Dave Page wrote: On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “should this very change be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. I disagree - the proposed patch maybe provides a very significant improvement for a certain workload type (nothing less but nothing more), but it was posted way after -BETA and I'm not sure we yet understand all implications of the changes. We certainly need to be happy with the implications if we were to make such a decision. We also have to consider that the underlying issues are known problems for multiple years^releases so I don't think there is a particular rush to force them into a particular release (as in 9.1). No, there's no *technical* reason we need to do this, as there would be if it were a bug fix for example. I would just like to see us narrow the gap with our competitors sooner rather than later, *if* we're a) happy with the change, and b) we're talking about a minimal delay (which we may be - Robert says he thinks the patch is good, so with another review and beta testing). Stefan/Robert's observation that we perform a VirtualXactLockTableInsert() to no real benefit is a good one. It leads to the following simple patch to remove one lock table hit per transaction. 
It's a lot smaller impact on the LockMgr locks, but it will still be substantial. Performance tests please? This patch is much less invasive and has impact only on CREATE INDEX CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and test and there's no way it will cause any delay at all to the release schedule. (Though I'm sure Robert can improve it). If we combine this patch with Koichi-san's recommended changes to the number of lock partitions, we will have considerable impact for 9.1. Robert will still get his day in the sun, just with 9.2. This way we get something now *and* something later, while the risk minimisers will have succeeded in protecting the code. A compromise for everyone. Please consider this as a serious proposal for tuning in 9.1. You seem to have completely ignored the reason why it works that way in the first place, which is that there is otherwise a risk of undetected deadlock. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 11:56 AM, Joshua D. Drake j...@commandprompt.com wrote: On 06/06/2011 04:43 PM, Robert Haas wrote: On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of vie jun 03 09:17:08 -0400 2011: I've now spent enough time working on this issue now to be convinced that the approach has merit, if we can work out the kinks. I'll start with some performance numbers. I hereby recommend that people with patches such as this one while on the last weeks till release should refrain from posting them until the release has actually taken place. %@#! Next time I'll be sure to only post my patches during beta if they suck. I think Alvaro's point isn't directed at you Robert but at the idea that this should be applied to 9.1. Oh, I get that. I'm just dismayed that we can't have a discussion about the patch without getting sidetracked into a conversation about whether we should throw feature freeze out the window. If posting patches that do interesting things during beta results in everyone ignoring both the work that needs to be done to get from beta to final release, and the patch itself, in favor of talking about the release schedule, then I think at the next developer meeting we're going to get to hear Tom argue that overlapping the end of beta with the beginning of the next release cycle is a mistake and we should go back to the old system where we yell at everyone to shut up unless they're helping test or fix bugs. Since that overlap is going to (hopefully) allow this patch to get into the tree ~2-3 months SOONER than it would have under the old system, I would be unhappy to see it abolished. Everyone who is arguing for the inclusion of this patch in 9.1 should take a minute to think about the following fact: If the PostgreSQL development process does not work for Tom, it does not work. Full stop. 
We all know that Tom is conservative with respect to release management, but we also know that his output is enormous, that he fixes virtually all of the bugs that *get* fixed, and that our well-deserved reputation for high quality releases is in large part attributable to him. We will not be better off if we design a process that leaves him cold. The fact that Alvaro, Heikki, Andrew, Kevin, and myself don't like the proposed process either is just icing on the cake. And I use the term process loosely, because what's really being proposed is the complete absence of any process. The idea of having a feature freeze some time prior to release is hardly a novel roadblock that we've invented here at the PostgreSQL Global Development Group. It's a basic software engineering principle that has been universally adopted by just about every open and closed source development project in existence, and with good reason. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes: Please consider this as a serious proposal for tuning in 9.1. Look: it is at least four months too late for anything of the sort in 9.1. We should be fixing bugs, and nothing else, if we ever want to get 9.1 out the door. Performance improvements don't qualify, especially not ones that tinker with fundamental parts of the system and seem highly likely to introduce new bugs. regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
The reason we usually skip the summer isn't actually a wholesale lack of people - it's because it's not so good from a publicity perspective, and it's hard to get all the packagers around at the same time. Actually, the summer is *excellent* from a publicity perspective ... at least, June and July are. Both of those months are full of US conferences whose PR we can piggyback on to make a splash. August is really the only bad month from a PR perspective, because we lose a lot of our European RCs, and there's no bandwagons to jump on. But even August has the advantage of having no major US or Christian holidays to interfere with release dates. However, we're more likely to have an issue with *packager* availability in August. Besides, isn't this a little premature? Last I looked, we still have some big nasty open items. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com writes: ... I think at the next developer meeting we're going to get to hear Tom argue that overlapping the end of beta with the beginning of the next release cycle is a mistake and we should go back to the old system where we yell at everyone to shut up unless they're helping test or fix bugs. I think we have already got quite enough evidence to conclude that this approach is broken. Not only does it appear that hardly anybody but me is actively working on stabilizing 9.1, but I'm wasting quite a bit of my time trying to keep Simon from destabilizing it; to say nothing of reacting to design proposals for 9.2 work (or else feeling guilty because I'm ignoring them, which is in fact what I've mostly been doing). As a measure of how completely this is not working: I've had read the SSI code as a number one priority item for about two months now, and still haven't found time to read one line of it. Everyone who is arguing for the inclusion of this patch in 9.1 should take a minute to think about the following fact: If the PostgreSQL development process does not work for Tom, it does not work. I'd like to think that I'm not the sole driver of this process. However, if everybody else is going to start playing in their 9.2 sandbox and ignore getting a release out, then yeah it comes down to how much bandwidth I've got. And that's finite. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert, Oh, I get that. I'm just dismayed that we can't have a discussion about the patch without getting sidetracked into a conversation about whether we should throw feature freeze out the window. That's not something you can change. Whatever the patch is, even if it's a psql improvement, *someone* will argue that it's super-critical to shoehorn it into the release at the last minute. It's a truism of human nature to rationalize exceptions where your own interest is concerned. As long as we have solidarity of the committers that this is not allowed, however, this is not a real problem. And it appears that we do. In the future, it shouldn't even be necessary to discuss it. For my part, I'm excited that we seem to be getting some big hairy important patches into CF1, which means that those patches will be well-tested by the time 9.2 reaches beta. Especially getting Robert's patch and Simon's WALInsertLock work into CF1 means that we'll have 7 months to find serious bugs before beta starts. So I'd really like to carry on with the current development schedule. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)
Joshua Berkus j...@agliodbs.com writes: Actually, the summer is *excellent* from a publicity perspective ... at least, June and July are. Both of those months are full of US conferences whose PR we can piggyback on to make a splash. August is really the only bad month from a PR perspective, because we lose a lot of our European RCs, and there's no bandwagons to jump on. But even August has the advantage of having no major US or Christian holidays to interfere with release dates. However, we're more likely to have an issue with *packager* availability in August. Besides, isn't this a little premature? Last I looked, we still have some big nasty open items. Well, we're trying to fix them --- I'm still hoping that the known beta blockers will be cleared by Thursday so we can ship beta2. However, what happens after that is uncertain. I'm concerned that once the CF starts, the number of developer cycles devoted to 9.1 testing will go to zero, meaning that four weeks or so from now when the CF is over, we'll have made no real progress beyond beta2. It's hard to see how we have a release before August if that's how things stand in early July. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote: As long as we have solidarity of the committers that this is not allowed, however, this is not a real problem. And it appears that we do. In the future, it shouldn't even be necessary to discuss it. Solidarity? Simon - who was a committer last time I checked - seems to think that the current process is entirely bunko. And that is resulting in the waste of a lot of time that could be better spent. Our ability to sustain this development process rests on the idea that we have some kind of shared idea of what is and is not acceptable in general and at particular points in the release cycle. It *shouldn't* be necessary to discuss it, but it apparently is. Over and over and over again, in fact. It is critically important for the future success of this project that we learn to walk and chew gum at the same time. We are failing outright. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)
On 7 June 2011 19:32, Tom Lane t...@sss.pgh.pa.us wrote: Joshua Berkus j...@agliodbs.com writes: Actually, the summer is *excellent* from a publicity perspective ... at least, June and July are. Both of those months are full of US conferences whose PR we can piggyback on to make a splash. August is really the only bad month from a PR perspective, because we lose a lot of our European RCs, and there's no bandwagons to jump on. But even August has the advantage of having no major US or Christian holidays to interfere with release dates. However, we're more likely to have an issue with *packager* availability in August. Besides, isn't this a little premature? Last I looked, we still have some big nasty open items. Well, we're trying to fix them --- I'm still hoping that the known beta blockers will be cleared by Thursday so we can ship beta2. However, what happens after that is uncertain. I'm concerned that once the CF starts, the number of developer cycles devoted to 9.1 testing will go to zero, meaning that four weeks or so from now when the CF is over, we'll have made no real progress beyond beta2. It's hard to see how we have a release before August if that's how things stand in early July. Speaking of which, is it now safe to remove the NOT VALID constraints don't dump properly issue from the blocker list since the fix has been committed? -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 1:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: ... I think at the next developer meeting we're going to get to hear Tom argue that overlapping the end of beta with the beginning of the next release cycle is a mistake and we should go back to the old system where we yell at everyone to shut up unless they're helping test or fix bugs. I think we have already got quite enough evidence to conclude that this approach is broken. Not only does it appear that hardly anybody but me is actively working on stabilizing 9.1, but I'm wasting quite a bit of my time trying to keep Simon from destabilizing it; to say nothing of reacting to design proposals for 9.2 work (or else feeling guilty because I'm ignoring them, which is in fact what I've mostly been doing). As a measure of how completely this is not working: I've had read the SSI code as a number one priority item for about two months now, and still haven't found time to read one line of it. Everyone who is arguing for the inclusion of this patch in 9.1 should take a minute to think about the following fact: If the PostgreSQL development process does not work for Tom, it does not work. I'd like to think that I'm not the sole driver of this process. However, if everybody else is going to start playing in their 9.2 sandbox and ignore getting a release out, then yeah it comes down to how much bandwidth I've got. And that's finite. I plead guilty to taking my eye off the ball post-beta1. I busted my ass for two months stabilizing other people's code after CF4 was over, and then I moved on to other things. I will try to get my eye back on the ball - but actually I'm not sure there's all that much to do. A quick review of the open items list suggests that we have fixed a total of six issues since beta1, as opposed to 47 prior to beta1. And all of those are being handled (two by you). I also don't see much in the way of unanswered 9.1 bug reports on pgsql-bugs, either. 
There may well be other open items, and I'm not unwilling to work on them, but I don't read minds. What needs doing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)
On Tue, Jun 7, 2011 at 1:45 PM, Thom Brown t...@linux.com wrote: Speaking of which, is it now safe to remove the NOT VALID constraints don't dump properly issue from the blocker list since the fix has been committed? I hope so, because I just did that (before noticing this email from you). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com writes: On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote: As long as we have solidarity of the committers that this is not allowed, however, this is not a real problem. And it appears that we do. In the future, it shouldn't even be necessary to discuss it. Solidarity? Simon - who was a committer last time I checked - seems to think that the current process is entirely bunko. And that is resulting in the waste of a lot of time that could be better spent. Yes. If it were anybody but Simon, we wouldn't be spending a lot of time on it; we'd just say sorry, this has to wait for 9.2 and that would be the end of it. As things stand, we have to convince him not to commit these things ... or else be prepared to fight a war over whether to revert them, which will be even more time-consuming and trust-destroying. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote: As long as we have solidarity of the committers that this is not allowed, however, this is not a real problem. And it appears that we do. In the future, it shouldn't even be necessary to discuss it. Solidarity? Simon - who was a committer last time I checked - seems to think that the current process is entirely bunko. I'm not sure why anyone that disagrees with you should be accused of wanting to junk the whole process. I've not said that and I don't think this. Before you arrived, it was quite normal to suggest tuning patches after feature freeze. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
* Simon Riggs (si...@2ndquadrant.com) wrote: Before you arrived, it was quite normal to suggest tuning patches after feature freeze. I haven't been around as long as some, but I think I've been around longer than Robert, and I can say that I don't recall serious performance patches, particularly ones around lock management and which change a fair bit of code, generally being white-listed from feature freeze or being pushed in after beta1. Perhaps I've missed them, or perhaps there have been a few exceptions that I'm not remembering that make it look routine rather than an exception basis. We might have tweaked a config variable or changed a #define somewhere close to the end of a cycle, but I really don't put those into the same category as this change. Thanks, Stephen
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com wrote: Before you arrived, it was quite normal to suggest tuning patches after feature freeze. I've worn a lot of hats in the practical end of this industry, and regardless of which perspective I look at this from, I can't think of anything so destructive to productivity, developer morale, meeting deadlines or release quality as slipping in just one more item after feature freeze. It's *always* something that someone feels is so important that it's worth the delay and/or risk, and it never works out well. There are a lot of aspects of the development and release processes on which I can see valid trade-offs and a lot of room for negotiations and compromise, but having a feature freeze which is treated seriously isn't one of them. If nobody else was making an issue of this, I still would be. There's absolutely nothing personal or political in this -- I just know what I've seen work and what I've seen cause problems. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 2:06 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, Jun 7, 2011 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jun 7, 2011 at 1:27 PM, Joshua Berkus j...@agliodbs.com wrote: As long as we have solidarity of the committers that this is not allowed, however, this is not a real problem. And it appears that we do. In the future, it shouldn't even be necessary to discuss it. Solidarity? Simon - who was a committer last time I checked - seems to think that the current process is entirely bunko. I'm not sure why anyone that disagrees with you should be accused of wanting to junk the whole process. I've not said that and I don't think this. Before you arrived, it was quite normal to suggest tuning patches after feature freeze. I, of course, am not in a position to comment on what happened before I arrived. But of the six committers who have weighed in on this thread, you're the only one who thinks this can plausibly be called a tuning patch. Nor would the outcome of this discussion have been any different if I hadn't participated in it, which is why I steered clear of the whole topic of how the patch should be handled procedurally for the first three days. By the time I weighed in with my opinion, Tom and Heikki had already expressed theirs. Now it's possible that my influence is so widespread and pernicious that I've managed to change Tom and Heikki's opinions on the topic of feature freeze. Perhaps, three years ago, they would have been willing to accept the patch at the last minute, but now, because of my advocacy for a disciplined feature freeze, they are not. To accept this argument, you would have to believe that I have the power to make Tom Lane more conservative. I don't believe I have either the power or the inclination to do any such thing.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes: Before you arrived, it was quite normal to suggest tuning patches after feature freeze. *Low risk* tuning patches make sense at this stage, yes. Fooling with the lock mechanisms doesn't qualify as low risk in my book. The probability of undetected subtle problems is just too great. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 11:20 PM, Jignesh Shah jks...@gmail.com wrote: Okay, I tried it out with the sysbench read scaling test. Note I had tried that earlier on 9.0: http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html On that test I found that running on anything bigger than 4 cores led to decreased performance. Redoing the same test with 100 users on a 4 vCPU virtual machine with 8GB RAM and 1M rows, I get transactions: 17870082 (59566.46 per sec.), which is in line with the best number on 9.0. This test hardly had any idle CPUs. However, where it made a huge impact was the same test on my 8 vCPU VM with 8GB RAM: I get transactions: 33274594 (110914.85 per sec.), which is a whopping 1.8x scaling for 2x the vCPUs (from 4 to 8). My idle CPU was less than 7%, which, considering that the useful work is in line with my expectations, is really impressive. (Plus, the last time I did MySQL they were around 95K or so for the same test.) Next step, DBT-2. I tried with a warehouse size of 50, all cached in memory, and my initial tests with DBT-2 using 8 vCPU do not show any major changes for a quick 10 minute run. I did eliminate write bottlenecks for this test so as to stress the locks (using full_page_writes=off, synchronous_commit=off, etc). I also have a large enough bufferpool to fit the whole 50-warehouse DB in memory. Without patch score: 29088 NOTPM. With patch score: 30161 NOTPM. It could be that I have other problems in the setup. One of the things I noticed is that there are too many Idle in Connections being reported, which tells me something else is becoming a bottleneck here :-) I also tested with multiple clients but got similar results: in both cases PostgreSQL shows multiple idle in transaction and fetch in waiting, while the clients show waiting in SocketCheck, as shown below for example.
#0 0x7fc4e83a43c6 in poll () from /lib64/libc.so.6
#1 0x7fc4e8abd61a in pqSocketCheck ()
#2 0x7fc4e8abd730 in pqWaitTimed ()
#3 0x7fc4e8abc215 in PQgetResult ()
#4 0x7fc4e8abc398 in PQexecFinish ()
#5 0x004050e1 in execute_new_order ()
#6 0x0040374f in process_transaction ()
#7 0x00403519 in db_worker ()
So yes for DBT2 I think this is inconclusive since there still could be other bottlenecks in play.. (Networking included) But overall yes I like the sysbench read scaling numbers quite a bit.. Regards, Jignesh
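As a quick check on the figures quoted in this message, the sketch below recomputes the scaling from the two sysbench throughput numbers and the DBT-2 gain from the two NOTPM scores. The helper name scaling_efficiency is made up for illustration; only the numbers come from the message.

```python
def scaling_efficiency(tps_small, tps_large, cpus_small, cpus_large):
    """Return (speedup, efficiency) when moving from cpus_small to cpus_large.

    efficiency is the measured speedup divided by the ideal (linear) speedup.
    """
    speedup = tps_large / tps_small
    ideal = cpus_large / cpus_small
    return speedup, speedup / ideal


# sysbench read-only throughput quoted above: 4 vCPU vs 8 vCPU
speedup, efficiency = scaling_efficiency(59566.46, 110914.85, 4, 8)
print(f"sysbench speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")

# DBT-2 scores quoted above (NOTPM without and with the patch)
dbt2_gain = 30161 / 29088 - 1
print(f"DBT-2 gain: {dbt2_gain:.1%}")
```

On these inputs the sysbench speedup comes out a little above the "1.8x" quoted, while the DBT-2 difference is a few percent, which fits the message's own reading that the DBT-2 run is inconclusive.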
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com writes: I plead guilty to taking my eye off the ball post-beta1. I busted my ass for two months stabilizing other people's code after CF4 was over, and then I moved on to other things. I will try to get my eye back on the ball - but actually I'm not sure there's all that much to do. A quick review of the open items list suggests that we have fixed a total of six issues since beta1, as opposed to 47 prior to beta1. And all of those are being handled (two by you). I also don't see much in the way of unanswered 9.1 bug reports on pgsql-bugs, either. There may well be other open items, and I'm not unwilling to work on them, but I don't read minds. What needs doing? Well, right at the moment there's not that much (if there were, I'd not have proposed wrapping beta2 in two days). You could look at some of the not blocker items on the open-items list --- we really ought to either do those things, or punt them off to TODO or the next CF as appropriate, sometime before 9.1 final. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 7:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Before you arrived, it was quite normal to suggest tuning patches after feature freeze. *Low risk* tuning patches make sense at this stage, yes. Fooling with the lock mechanisms doesn't qualify as low risk in my book. The probability of undetected subtle problems is just too great. Good, then we do agree. Some things are allowed, with suitable justification. That has not been a point accepted by everybody here though. Upthread, I proposed that we leave Robert's patch until 9.2. That was *after* I had reviewed it for impact and risk. I agree, it's High Risk, and so must be put off until normal dev opens because of the sensitivity and criticality of getting the locking interactions right. Moving on from that, I have proposed other solutions. Koichi, Jignesh and then Robert have shown measurements of the huge contention in this area of our software. Robert's patch addresses the problems, as do Koichi's and my latest patch. I would like to see us do *something* about these problems for 9.1. Not all of them are risky or time consuming. I'm clearly not alone in this thought; Dave, Dimitri and Koichi-san have also spoken in favour of action for this release. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes: Moving on from that, I have proposed other solutions. Koichi, Jignesh and then Robert have shown measurements of the huge contention in this area of our software. Robert's patch addresses the problems, as do Koichi's and my latest patch. I would like to see us do *something* about these problems for 9.1. Not all of them are risky or time consuming. In the first place, all of these issues predate 9.1 by years. They are not regressions or new bugs, and they have not suddenly gotten more urgent. In the second place, I haven't seen any proposals in the area that appear low risk. I seriously doubt that I would consider *any* meaningful change in the locking area to be low risk. regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 3:44 PM, Jignesh Shah jks...@gmail.com wrote: On Mon, Jun 6, 2011 at 11:20 PM, Jignesh Shah jks...@gmail.com wrote: Okay, I tried it out with the sysbench read scaling test. Note I had tried that earlier on 9.0: http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html On that test I found that running on anything bigger than 4 cores led to decreased performance. Redoing the same test with 100 users on a 4 vCPU virtual machine with 8GB RAM and 1M rows, I get transactions: 17870082 (59566.46 per sec.), which is in line with the best number on 9.0. This test hardly had any idle CPUs. However, where it made a huge impact was the same test on my 8 vCPU VM with 8GB RAM: I get transactions: 33274594 (110914.85 per sec.), which is a whopping 1.8x scaling for 2x the vCPUs (from 4 to 8). My idle CPU was less than 7%, which, considering that the useful work is in line with my expectations, is really impressive. (Plus, the last time I did MySQL they were around 95K or so for the same test.) Next step, DBT-2. I tried with a warehouse size of 50, all cached in memory, and my initial tests with DBT-2 using 8 vCPU do not show any major changes for a quick 10 minute run. I did eliminate write bottlenecks for this test so as to stress the locks (using full_page_writes=off, synchronous_commit=off, etc). I also have a large enough bufferpool to fit the whole 50-warehouse DB in memory. Without patch score: 29088 NOTPM. With patch score: 30161 NOTPM. It could be that I have other problems in the setup. One of the things I noticed is that there are too many Idle in Connections being reported, which tells me something else is becoming a bottleneck here :-) I also tested with multiple clients but got similar results: in both cases PostgreSQL shows multiple idle in transaction and fetch in waiting, while the clients show waiting in SocketCheck, as shown below for example.
#0 0x7fc4e83a43c6 in poll () from /lib64/libc.so.6
#1 0x7fc4e8abd61a in pqSocketCheck ()
#2 0x7fc4e8abd730 in pqWaitTimed ()
#3 0x7fc4e8abc215 in PQgetResult ()
#4 0x7fc4e8abc398 in PQexecFinish ()
#5 0x004050e1 in execute_new_order ()
#6 0x0040374f in process_transaction ()
#7 0x00403519 in db_worker ()
So yes for DBT2 I think this is inconclusive since there still could be other bottlenecks in play.. (Networking included) But overall yes I like the sysbench read scaling numbers quite a bit.. I think you will find that for write workloads WALInsertLock is so badly contended that nothing else matters. We really need to spend some time working on that during the 9.2 cycle, but I don't have anything that resembles a plan at this point. If you have the cycles, try compiling with LWLOCK_STATS defined and looking at the blk numbers just to confirm that's where the bottleneck is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
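One way to act on the LWLOCK_STATS suggestion is to add up the per-backend counters the server logs and rank the locks by blk. The sketch below assumes each log line has the shape "PID <pid> lwlock <id>: shacq <n> exacq <n> blk <n>"; that shape is an assumption and should be checked against the output of the actual build. The sample lines and lock ids are invented for illustration and are not measurements.

```python
import re
from collections import defaultdict

# Assumed line shape; verify against the output of your own build.
STAT_LINE = re.compile(
    r"PID (?P<pid>\d+) lwlock (?P<lock>\d+): "
    r"shacq (?P<shacq>\d+) exacq (?P<exacq>\d+) blk (?P<blk>\d+)"
)


def summarize(lines):
    """Sum counters per lock id across backends, most-blocked lock first."""
    totals = defaultdict(lambda: {"shacq": 0, "exacq": 0, "blk": 0})
    for line in lines:
        match = STAT_LINE.search(line)
        if match is None:
            continue  # ignore unrelated log output
        entry = totals[int(match["lock"])]
        for key in ("shacq", "exacq", "blk"):
            entry[key] += int(match[key])
    return sorted(totals.items(), key=lambda item: item[1]["blk"], reverse=True)


# Invented sample: two backends, two lock ids.
sample = [
    "PID 4242 lwlock 7: shacq 0 exacq 120000 blk 45000",
    "PID 4243 lwlock 7: shacq 0 exacq 118000 blk 47000",
    "LOG:  some unrelated message",
    "PID 4242 lwlock 25: shacq 300000 exacq 4000 blk 300",
    "PID 4243 lwlock 25: shacq 310000 exacq 4100 blk 280",
]

ranked = summarize(sample)
for lock_id, counts in ranked:
    print(lock_id, counts)
```

Summing across backends matters because each backend reports its own counters; a lock that looks unremarkable in one process can dominate once all processes are added together.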
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 9:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Moving on from that, I have proposed other solutions. Koichi, Jignesh and then Robert have shown measurements of the huge contention in this area of our software. Robert's patch addresses the problems, as do Koichi's and my latest patch. I would like to see us do *something* about these problems for 9.1. Not all of them are risky or time consuming. In the first place, all of these issues predate 9.1 by years. They are not regressions or new bugs, and they have not suddenly gotten more urgent. In the second place, I haven't seen any proposals in the area that appear low risk. I seriously doubt that I would consider *any* meaningful change in the locking area to be low risk. That's a shame. We'll fix it in 9.2 then. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 12:51 PM, Simon Riggs si...@2ndquadrant.com wrote: Stefan/Robert's observation that we perform a VirtualXactLockTableInsert() to no real benefit is a good one. It leads to the following simple patch to remove one lock table hit per transaction. It's a lot smaller impact on the LockMgr locks, but it will still be substantial. Performance tests please? This patch is much less invasive and has impact only on CREATE INDEX CONCURRENTLY and Hot Standby. It's taken me about 2 hours to write and test and there's no way it will cause any delay at all to the release schedule. (Though I'm sure Robert can improve it). Incidentally, I spent the morning (before we got off on this tangent) writing a patch to make VXID locks spring into existence on demand instead of creating them for every transaction. This applies on top of my fastlock patch and fits in quite nicely with the existing infrastructure that patch creates, and it helps modestly. Well, according to one metric, at least, it helps dramatically: traffic on each lock manager partition lock drops from hundreds of thousands of lock requests in a five minute period to just a few hundred. But the actual user-visible performance benefit is fairly modest - it goes from ~36K TPS unpatched to ~129K TPS with the fast relation locks alone to ~138K TPS with the fast relation locks plus a similar hack for fast VXID locks (all results with pgbench -c 36 -j 36 -n -S -T 300 on a Nate-Boley-provided 24-core box). Now, I'm not going to knock a 7% performance improvement, and the benefit may be larger on Stefan's 80-core box, and I think it's definitely worth going to the trouble to implement that optimization for 9.2, but it appears, at least based on the testing that I've done so far, that the fast relation locks are the big win and after that it gets much harder to make an improvement.
If we were to fix ONLY the vxid issue in 9.1 as you were advocating, the benefit would probably be much less, because at least in my tests, the fast relation lock patch increases overall system throughput sufficiently to cause a 12x increase in contention due to vxid traffic. With both the fast-relation locks and the fast-vxid locks in place, as I mentioned, the lock manager partition lock contention is completely gone; in fact the lock manager partition traffic is pretty much gone. The remaining contention comes mostly from the free list locks (blk ~13%) and the buffer mapping locks (which were roughly: 800k shacq, 12000 exacq, 850 blk) Interestingly, I saw that one buffer mapping lock got about 5x hotter than the others, which is odd, but possibly harmless, since the absolute amount of blocking is really rather small (~0.1%). At least for read performance, we may need to start looking less at reducing lock contention and more at making the actual underlying operations faster. In the process of doing all of this, I discovered that I had neglected to update GetLockConflicts() and, consequently, fastlock-v2 is broken insofar as CREATE INDEX CONCURRENTLY and Hot Standby are concerned. I will fix that and post an updated version; and I'll also post the follow-on patch to accelerate the VXID locks at that time. In the meantime, I would appreciate any review or testing of the remainder of the patch. If we combine this patch with Koichi-san's recommended changes to the number of lock partitions, we will have considerable impact for 9.1. Robert will still get his day in the sun, just with 9.2. I am at this point of the viewpoint that there is little point in raising the number of lock partitions. If you are doing very simple SELECT statements across a large number of tables, then increasing the number of lock partitions will help. On read-write workloads, there's really no benefit, because WALInsertLock contention is the bottleneck. 
And on read-only workloads that only touch one or a handful of tables, the individual lock manager partitions where the locks fall get very hot regardless of how many partitions you have. Now that does still leave some space for improvement - specifically, lots of tables, read-only or read-mostly - but the fast-relation-lock and fast-vxid-lock stuff will address those bottlenecks far more thoroughly. And increasing the number of lock partitions also has a downside: it will slow down end-of-transaction cleanup, which is already an area where we know we have problems. There might be some point in raising the number of buffer mapping partitions, but I don't know how to create a test case where it's actually material, especially without the fastlock stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
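The percentages in this message follow from the figures it quotes. The sketch below recomputes them; the function names are illustrative, and the TPS inputs are the rounded values given above, so the results are approximate.

```python
def relative_gain(before, after):
    """Fractional improvement of `after` over `before`."""
    return after / before - 1.0


def blocked_fraction(shacq, exacq, blk):
    """Share of lock acquisitions (shared + exclusive) that had to block."""
    return blk / (shacq + exacq)


# Rounded pgbench -S results quoted above, in transactions per second.
unpatched, fast_rel, fast_rel_vxid = 36_000, 129_000, 138_000

rel_speedup = fast_rel / unpatched
vxid_gain = relative_gain(fast_rel, fast_rel_vxid)
# Buffer mapping lock counters quoted above: 800k shacq, 12000 exacq, 850 blk.
bufmap_blocked = blocked_fraction(800_000, 12_000, 850)

print(f"fast relation locks vs unpatched: {rel_speedup:.1f}x")
print(f"adding fast VXID locks on top:    {vxid_gain:+.1%}")
print(f"buffer mapping locks blocked:     {bufmap_blocked:.2%}")
```

These reproduce the "7%" and "~0.1%" figures in the text, and show why the fast relation locks are described as the big win: they account for most of the gain, with the VXID change adding a single-digit percentage on top.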
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 9:52 PM, Robert Haas robertmh...@gmail.com wrote: If we were to fix ONLY the vxid issue in 9.1 as you were advocating Sensible debate is impossible when you don't read what I've written. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Tue, Jun 7, 2011 at 5:43 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, Jun 7, 2011 at 9:52 PM, Robert Haas robertmh...@gmail.com wrote: If we were to fix ONLY the vxid issue in 9.1 as you were advocating Sensible debate is impossible when you don't read what I've written. I've read every word you've written on this thread. Much of it, multiple times. I am unclear what we are arguing about. I don't want to have a debate. I want to figure out what works, and do it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 6/7/11 1:11 PM, Simon Riggs wrote: that appear low risk. I seriously doubt that I would consider *any* meaningful change in the locking area to be low risk. That's a shame. We'll fix it in 9.2 then. I will point out that we bounced Alvaro's FK patch, which *was* submitted in time for CF4, because of unknown locking impact. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: 9.1 release scheduling (was Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch)
Excerpts from Robert Haas's message of mar jun 07 13:53:23 -0400 2011: On Tue, Jun 7, 2011 at 1:45 PM, Thom Brown t...@linux.com wrote: Speaking of which, is it now safe to remove the NOT VALID constraints don't dump properly issue from the blocker list since the fix has been committed? I hope so, because I just did that (before noticing this email from you). Yeah, pg_dump works in HEAD ... the bug now is that psql prints NOT VALID twice. Will fix. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas wrote: On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote: My point was that we have in the past implemented performance changes to increase scalability at the last minute, and also that our personal risk perspectives are not always set in stone. Robert has highlighted the value of this change and it's clearly not beyond our wit to include it, even if it is beyond our will to do so. So, at the risk of totally derailing this thread -- what this boils down to is a philosophical disagreement. It seems to me (and, I think, to Tom and Heikki and others as well) that it's not possible to keep on making changes to the release right up until the last minute and then expect the release to be of high quality. If we keep committing new features, then we'll keep introducing new bugs. The only hope of making the bug count go down at some point is to stop making changes that aren't bug fixes. We could come up with some complex procedure for determining whether a patch is important enough and non-invasive enough to bypass the normal deadline, but that would probably lead to a lot more arguing about procedure, and realistically, it's still going to increase the bug count at least somewhat. IMHO, it's better to just have a deadline, and stuff either makes it or it doesn't. I realize we haven't always adhered to the principle in the past, but at least IMV that's not a mistake we want to continue repeating. Simon is right that we slipped the vxid patch into 8.3 when a Postgres user I talked to at Linuxworld mentioned high vacuum freeze activity and simple calculations showed that many read-only queries could cause high xid usage. Fortunately we already had a patch available and Tom applied it during beta. It was an existing patch that took on new urgency during beta.
Robert's point above is that the problem isn't so much the decision of whether something should slip past the deadline, but the time-sapping discussion of whether something should slip, and the frankly disturbing behavior of some in this group in not accepting a clear consensus, therefore prolonging the discussion of slippage far longer than necessary. Basically, if you propose something, and it gets shot down due to procedure, accept that unless you have some very good _new_ reason for continuing the discussion. If you don't like that, then you are not going to do well in our group and maybe this isn't the group for you. I think we are going to need to be much more forceful about this, and if the threat is that someone has commit rights and therefore we can't ignore them, we will have to reconsider who can commit to this project. Do I need to be any clearer? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Bruce Momjian wrote: Simon is right that we slipped the vxid patch into 8.3 when a Postgres user I talked to at Linuxworld mentioned high vacuum freeze activity and simple calculations showed that many read-only queries could cause high xid usage. Fortunately we already had a patch available and Tom applied it during beta. It was an existing patch that took on new urgency during beta. Robert's point above is that the problem isn't so much the decision of whether something should slip past the deadline, but the time-sapping discussion of whether something should slip, and the frankly disturbing behavior of some in this group in not accepting a clear consensus, therefore prolonging the discussion of slippage far longer than necessary. Basically, if you propose something, and it gets shot down due to procedure, accept that unless you have some very good _new_ reason for continuing the discussion. If you don't like that, then you are not going to do well in our group and maybe this isn't the group for you. I think we are going to need to be much more forceful about this, and if the threat is that someone has commit rights and therefore we can't ignore them, we will have to reconsider who can commit to this project. Do I need to be any clearer? One more thing --- when Tom applied that patch during 8.3 beta it was with everyone's agreement, so the policy should be that if we are going to break the rules, everyone has to agree --- if anyone disagrees, the rules stand. In this case, several people felt early on that we should stick with the rules --- at that point, there should have been no further discussion of slipping things into 9.1. Discussion takes energy, and discussing slipping things into 9.1 after anyone objects is just wasting our valuable time. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Bruce Momjian br...@momjian.us writes: Simon is right that we slipped the vxid patch into 8.3 when a Postgres user I talked to at Linuxworld mentioned high vacuum freeze activity and simple calculations showed the many read-only queries could cause high xid usage. Fortunately we already had a patch available and Tom applied it during beta. It was an existing patch that took on new urgency during beta. Just to set the record straight on this ... the vxid patch went in on 2007-09-05: http://archives.postgresql.org/pgsql-committers/2007-09/msg00026.php which was a day shy of a month before we wrapped 8.3beta1: http://archives.postgresql.org/pgsql-committers/2007-10/msg00089.php so it was during alpha phase not beta. And 8.3RC1 was stamped on 2008-01-03. So Simon's assertion that this was days before we produced a release candidate is correct, if you take days as 4 months. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06.06.2011 07:12, Robert Haas wrote: I did some further investigation of this. It appears that more than 99% of the lock manager lwlock traffic that remains with this patch applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION. Every SELECT statement runs in a separate transaction, and for each new transaction we run VirtualXactLockTableInsert(), which takes a lock on the vxid of that transaction, so that other processes can wait for it. That requires acquiring and releasing a lock manager partition lock, and we have to do the same thing a moment later at transaction end to dump the lock. A quick grep seems to indicate that the only places where we actually make use of those VXID locks are in DefineIndex(), when CREATE INDEX CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay expires. Considering that these are not commonplace events, it seems tremendously wasteful to incur the overhead for every transaction. It might be possible to make the lock entry spring into existence on demand - i.e. if a backend wants to wait on a vxid entry, it creates the LOCK and PROCLOCK objects for that vxid. That presents a few synchronization challenges, and plus we have to make sure that the backend that's just been given a lock knows that it needs to release it, but those seem like they might be manageable problems, especially given the new infrastructure introduced by the current patch, which already has to deal with some of those issues. I'll look into this further. Ah, I remember I saw that vxid lock pop up quite high in an oprofile profile recently. I think it was the case of executing a lot of very simple prepared queries. So it would be nice to address that, even from a single CPU point of view. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
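The cost argument in this message can be illustrated with a toy model. The sketch below is not PostgreSQL code: ToyLockTable, run_eager and run_lazy are hypothetical names, the "lock table" is a plain set, and all of the synchronization problems the message mentions are ignored. It only counts how often the shared table is touched when a VXID entry is created for every transaction versus only for transactions that someone actually waits on.

```python
class ToyLockTable:
    """Stand-in for the shared lock table; counts how often it is touched."""

    def __init__(self):
        self.entries = set()
        self.touches = 0

    def insert(self, vxid):
        self.touches += 1
        self.entries.add(vxid)

    def remove(self, vxid):
        self.touches += 1
        self.entries.discard(vxid)


def run_eager(table, n_transactions):
    """Every transaction inserts its VXID entry at start and removes it at end."""
    for vxid in range(n_transactions):
        table.insert(vxid)
        table.remove(vxid)


def run_lazy(table, n_transactions, waited_on):
    """An entry is created only when some other process wants to wait on it."""
    for vxid in range(n_transactions):
        materialized = vxid in waited_on  # in reality: a waiter creates it
        if materialized:
            table.insert(vxid)
        # ... the transaction does its work ...
        if materialized:
            table.remove(vxid)  # the owner must know it has to release it


eager = ToyLockTable()
run_eager(eager, 1000)

lazy = ToyLockTable()
run_lazy(lazy, 1000, waited_on={3, 500})

print("eager touches:", eager.touches)
print("lazy touches: ", lazy.touches)
```

In this model the eager scheme touches the table twice per transaction regardless of demand, while the lazy scheme touches it only for the rare transactions that are waited on, which is the asymmetry the message describes for CREATE INDEX CONCURRENTLY and Hot Standby.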
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year. That suggestion is completely insane. The patch is only WIP and full of bugs, even according to its author. Even if it were solid, it is way too late to be pushing such stuff into 9.1. We're trying to ship a release, not find ways to cause it to slip more. In 8.3, you implemented virtual transactionids days before we produced a Release Candidate, against my recommendation. At that time, I didn't start questioning your sanity. In fact we all applauded that because it was a great performance gain. The fact that you disagree with me does not make me insane. Inaction on this point, resulting in a year's delay, will be considered to be a gross waste by the majority of objective observers. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06.06.2011 12:40, Simon Riggs wrote: On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year. That suggestion is completely insane. The patch is only WIP and full of bugs, even according to its author. Even if it were solid, it is way too late to be pushing such stuff into 9.1. We're trying to ship a release, not find ways to cause it to slip more. In 8.3, you implemented virtual transactionids days before we produced a Release Candidate, against my recommendation. FWIW, this bottleneck was not introduced by the introduction of virtual transaction ids. Before that patch, we just took the lock on the real transaction id instead. The fact that you disagree with me does not make me insane. You are not insane, even if your suggestion is. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 2:54 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ah, I remember I saw that vxid lock pop up quite high in an oprofile profile recently. I think it was the case of executing a lot of very simple prepared queries. So it would be nice to address that, even from a single CPU point of view. It doesn't seem too hard to do, although I have to think about the details. Even though the VXID locks involved are Exclusive locks, they are actually very much like the weak locks that the current patch accelerates, because the Exclusive lock is taken only by the VXID owner, and it can therefore be safely assumed that the initial lock acquisition won't block anything. Therefore, it's really unnecessary to touch the primary lock table at transaction start (and to only touch it at the end if someone's waiting). However, there's a fly in the ointment: when someone tries to ShareLock a VXID, we need to determine whether that VXID is still around and, if so, make an Exclusive lock entry for it in the primary lock table. And, unlike what I'm doing for strong relation locks, it's probably NOT acceptable for that to acquire and release every per-backend LWLock, because every place that waits for VXID locks waits for a list of locks in sequence, so we could end up with O(n^2) behavior. Now, in theory that's not a huge problem: the VXID includes the backend ID, so we ought to be able to figure out which single per-backend LWLock is of interest and just acquire/release that one. Unfortunately, it appears that there's no easy way to go from a backend ID to a PGPROC. The backend IDs are offsets into the ProcState array, so they give us a pointer to the backend's sinval state, not its PGPROC. And while the PGPROC has a pointer to the sinval info, there's no pointer in the opposite direction. Even if there were, we'd probably need to hold SInvalWriteLock in shared mode to follow it. 
That might not be the end of the world, since VXID locks are fairly infrequently used, but it's certainly a little grotty. I do rather wonder if we should be trying to reduce the number of separate places where we list the running processes. We have arrays of PGPROC structures, and then we have one set of pointers to PGPROCs in the ProcArray, and then we have the ProcState structures for sinval. I wonder if there's some way to rearrange all this to simplify the bookkeeping. BTW, how do you identify from oprofile that *vxid* locks were the problem? I didn't think it could produce that level of detail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
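The O(n^2) concern above can be sketched with a toy Python model (not PostgreSQL code). It compares checking a list of VXIDs by acquiring every per-backend lock for each one against going straight to the single backend named in the VXID. The Backend class, both functions, and the backend-id lookup table are hypothetical; in particular, the direct lookup assumes exactly the backend-ID-to-process mapping that the message says is not readily available.

```python
import threading


class Backend:
    """Toy backend: one lock standing in for its per-backend LWLock."""

    def __init__(self, backend_id, current_lxid):
        self.backend_id = backend_id
        self.current_lxid = current_lxid  # local transaction id in progress
        self.lock = threading.Lock()


def still_running_scan(backends, vxid, stats):
    """Acquire and release every per-backend lock to look for the owner."""
    backend_id, lxid = vxid
    found = False
    for b in backends:
        with b.lock:
            stats["acquisitions"] += 1
            if b.backend_id == backend_id and b.current_lxid == lxid:
                found = True
    return found


def still_running_direct(backends_by_id, vxid, stats):
    """Use the backend id inside the vxid to take only the relevant lock."""
    backend_id, lxid = vxid
    b = backends_by_id[backend_id]  # assumed lookup, see the note above
    with b.lock:
        stats["acquisitions"] += 1
        return b.current_lxid == lxid


backends = [Backend(i, 100 + i) for i in range(8)]
backends_by_id = {b.backend_id: b for b in backends}
wait_list = [(b.backend_id, b.current_lxid) for b in backends]

scan_stats = {"acquisitions": 0}
scan_results = [still_running_scan(backends, v, scan_stats) for v in wait_list]

direct_stats = {"acquisitions": 0}
direct_results = [still_running_direct(backends_by_id, v, direct_stats)
                  for v in wait_list]

print(scan_stats["acquisitions"], direct_stats["acquisitions"])
```

Waiting on n VXIDs across n backends costs n * n lock acquisitions in the scanning variant and n in the direct variant, which is why a cheap mapping from backend ID to the owning process matters here.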
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06.06.2011 07:12, Robert Haas wrote: I did some further investigation of this. It appears that more than 99% of the lock manager lwlock traffic that remains with this patch applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION. Every SELECT statement runs in a separate transaction, and for each new transaction we run VirtualXactLockTableInsert(), which takes a lock on the vxid of that transaction, so that other processes can wait for it. That requires acquiring and releasing a lock manager partition lock, and we have to do the same thing a moment later at transaction end to dump the lock. A quick grep seems to indicate that the only places where we actually make use of those VXID locks are in DefineIndex(), when CREATE INDEX CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay expires. Considering that these are not commonplace events, it seems tremendously wasteful to incur the overhead for every transaction. It might be possible to make the lock entry spring into existence on demand - i.e. if a backend wants to wait on a vxid entry, it creates the LOCK and PROCLOCK objects for that vxid. That presents a few synchronization challenges, and plus we have to make sure that the backend that's just been given a lock knows that it needs to release it, but those seem like they might be manageable problems, especially given the new infrastructure introduced by the current patch, which already has to deal with some of those issues. I'll look into this further. At the moment, the transaction with given vxid acquires an ExclusiveLock on the vxid, and anyone who wants to wait for it to finish acquires a ShareLock. If we simply reverse that, so that the transaction itself takes ShareLock, and anyone wanting to wait on it take an ExclusiveLock, will this fastlock patch bust this bottleneck too? 
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:02 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 06.06.2011 07:12, Robert Haas wrote: I did some further investigation of this. It appears that more than 99% of the lock manager lwlock traffic that remains with this patch applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION. Every SELECT statement runs in a separate transaction, and for each new transaction we run VirtualXactLockTableInsert(), which takes a lock on the vxid of that transaction, so that other processes can wait for it. That requires acquiring and releasing a lock manager partition lock, and we have to do the same thing a moment later at transaction end to dump the lock. A quick grep seems to indicate that the only places where we actually make use of those VXID locks are in DefineIndex(), when CREATE INDEX CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay expires. Considering that these are not commonplace events, it seems tremendously wasteful to incur the overhead for every transaction. It might be possible to make the lock entry spring into existence on demand - i.e. if a backend wants to wait on a vxid entry, it creates the LOCK and PROCLOCK objects for that vxid. That presents a few synchronization challenges, and plus we have to make sure that the backend that's just been given a lock knows that it needs to release it, but those seem like they might be manageable problems, especially given the new infrastructure introduced by the current patch, which already has to deal with some of those issues. I'll look into this further. At the moment, the transaction with given vxid acquires an ExclusiveLock on the vxid, and anyone who wants to wait for it to finish acquires a ShareLock. If we simply reverse that, so that the transaction itself takes ShareLock, and anyone wanting to wait on it take an ExclusiveLock, will this fastlock patch bust this bottleneck too? Not without some further twaddling. 
Right now, the fast path only applies when you are taking a lock weaker than ShareUpdateExclusiveLock on an unshared relation. See also the email I just sent on why using the exact same mechanism might not be such a hot idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
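To see why simply swapping the lock modes would not be enough, here is a small Python sketch of the eligibility rule, reading the condition as "a lock mode weaker than ShareUpdateExclusiveLock on an unshared relation" (that reading is an assumption about the wording above). The mode numbering follows PostgreSQL's table-level lock modes from weakest to strongest; fast_path_eligible and the tag-type strings are names invented for this sketch, not the patch's code.

```python
from enum import IntEnum


class LockMode(IntEnum):
    # Table-level lock modes, weakest first.
    AccessShareLock = 1
    RowShareLock = 2
    RowExclusiveLock = 3
    ShareUpdateExclusiveLock = 4
    ShareLock = 5
    ShareRowExclusiveLock = 6
    ExclusiveLock = 7
    AccessExclusiveLock = 8


def fast_path_eligible(tag_type, mode, is_shared_relation=False):
    """True only for weak locks on unshared relations (hypothetical helper)."""
    return (
        tag_type == "relation"
        and not is_shared_relation
        and mode < LockMode.ShareUpdateExclusiveLock
    )


# An ordinary SELECT's lock on a regular table qualifies.
select_lock = fast_path_eligible("relation", LockMode.AccessShareLock)

# The reversed scheme: the transaction takes ShareLock on its own vxid.
reversed_vxid_lock = fast_path_eligible("virtualxid", LockMode.ShareLock)

print(select_lock, reversed_vxid_lock)
```

Under this reading the reversed VXID lock misses on two counts: the lock tag is not a relation, and ShareLock is stronger than ShareUpdateExclusiveLock.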
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06.06.2011 14:59, Robert Haas wrote: BTW, how do you identify from oprofile that *vxid* locks were the problem? I didn't think it could produce that level of detail. It can show the call stack of each call, with the --callgraph=n option, so you can see what percentage of the calls to LockAcquire come from VirtualXactLockTableInsert. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 11:19 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 06.06.2011 12:40, Simon Riggs wrote: On Sat, Jun 4, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year. That suggestion is completely insane. The patch is only WIP and full of bugs, even according to its author. Even if it were solid, it is way too late to be pushing such stuff into 9.1. We're trying to ship a release, not find ways to cause it to slip more. In 8.3, you implemented virtual transactionids days before we produced a Release Candidate, against my recommendation. FWIW, this bottleneck was not introduced by the introduction of virtual transaction ids. Before that patch, we just took the lock on the real transaction id instead. Of course it wasn't. You've misunderstood completely. My point was that we have in the past implemented performance changes to increase scalability at the last minute, and also that our personal risk perspectives are not always set in stone. Robert has highlighted the value of this change and it's clearly not beyond our wit to include it, even if it is beyond our will to do so. The fact that you disagree with me does not make me insane. You are not insane, even if your suggestion is. LOL. Your logic is still poor though. :-) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 10:49 AM, Simon Riggs si...@2ndquadrant.com wrote: My point was that we have in the past implemented performance changes to increase scalability at the last minute, and also that our personal risk perspectives are not always set in stone. Robert has highlighted the value of this change and its clearly not beyond our wit to include it, even if it is beyond our will to do so. So, at the risk of totally derailing this thread -- what this boils down to is a philosophical disagreement. It seems to me (and, I think, to Tom and Heikki and others as well) that it's not possible to keep on making changes to the release right up until the last minute and then expect the release to be of high quality. If we keep committing new features, then we'll keep introducing new bugs. The only hope of making the bug count go down at some point is to stop making changes that aren't bug fixes. We could come up with some complex procedure for determining whether a patch is important enough and non-invasive enough to bypass the normal deadline, but that would probably lead to a lot more arguing about procedure, and realistically, it's still going to increase the bug count at least somewhat. IMHO, it's better to just have a deadline, and stuff either makes it or it doesn't. I realize we haven't always adhered to the principle in the past, but at least IMV that's not a mistake we want to continue repeating. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com wrote: IMHO, it's better to just have a deadline, and stuff either makes it or it doesn't. I realize we haven't always adhered to the principle in the past, but at least IMV that's not a mistake we want to continue repeating. +1 I've said it before, but I think it bears repeating, that deferring this to 9.2 doesn't mean that it comes out in a production release 12 months later -- unless we continue to repeat this mistake endlessly. It means that this release comes out closer to when we said it would -- for the sake of argument let's hypothesize one month. So by holding the line on such inclusions all the current 9.1 features come out one month sooner, and this feature comes out 11 months later than it would have if we'd put it into 9.1. With some feature we consider squeezing in, it would be more like delaying everything which is done by three months so that one feature gets out nine months earlier. Perhaps the best way to describe the suggestion that this be included in 9.1 isn't that it's an insane suggestion; but that it's a suggestion which, if adopted, would be likely to drive those who are striving for a more organized development and release process insane. Or one could look at it in a cost/benefit format -- major features delivered per year go up by holding the line, administrative costs are reduced, and people who are focusing on release stability get more months per year to do development. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
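The arithmetic in the message above can be written out as a tiny Python sketch. It assumes, as the message does for the sake of argument, that the next release would otherwise follow about 12 months later; squeeze_in_tradeoff is a name invented here, and the slip values are the hypothetical one and three months used above.

```python
def squeeze_in_tradeoff(slip_months, cycle_months=12):
    """Per-feature effect of slipping a release to include one late feature.

    Returns (months the late feature ships earlier,
             months every already-finished feature ships later).
    """
    gained_by_late_feature = cycle_months - slip_months
    lost_by_each_finished_feature = slip_months
    return gained_by_late_feature, lost_by_each_finished_feature


one_month = squeeze_in_tradeoff(1)
three_months = squeeze_in_tradeoff(3)

print(one_month, three_months)
```

The second number applies to every feature that is already done, so the total cost of a slip grows with the size of the release, while the gain applies to the one late feature only.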
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 5:14 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Perhaps the best way to describe the suggestion that this be included in 9.1 isn't that it's an insane suggestion; but that it's a suggestion which, if adopted, would be likely to drive those who are striving for a more organized development and release process insane. Kevin, I respect your opinion and thank you for stating your case without insults. In this discussion it should be recognised that I have personally driven the development of a more organized dev and release process. I requested and argued for stated release dates to assist resource planning and suggested commitfests as a mechanism to reduce the feedback times for developers. I also provided the first guide to patch reviews we published. So I am a proponent of planning and organization, though some would like to claim I see things differently. The major problems of the dev process are now solved, yet more organization is still being discussed, as if more == better. What I hear is changed organization and I am not certain that all change == better in what I see is a leading example of how to produce great software. Releasing regularly is important, but not more important than anything. Ever. Period. Trying to force that will definitely make you mad, I can see. I request that people stop trying to enforce a process so strictly that sensible and important change cannot take place when needed. Or one could look at it in a cost/benefit format -- major features delivered per year go up by holding the line, administrative costs are reduced, and people who are focusing on release stability get more months per year to do development. I do look at it in a cost/benefit format. The problem is the above statement has nothing user-centric about it. The cost to us is a few days work and the benefit is a whole year's worth of increased performance for our user base, which has a hardware equivalent well into the millions of dollars. 
And that's ignoring the users that would've switched to using Postgres earlier, and those who might leave because of competitive comparison. I won't say any more about this because I am in no way a beneficiary from this and even my opinion is given unpaid. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%. Wow! That's fantastic. Jignesh, are you in a position to test any of Robert's work using DBT or other benchmarks? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com writes: IMHO, it's better to just have a deadline, Well, that's the fine point we're now talking about. I still think that we should try to make the best release possible. And if that means including changes at beta time because that's when someone got around to doing them, so be it, though they should really be worth it. So, to the question “do we want hard deadlines?” I think the answer is “no”; to “do we need hard deadlines?”, my answer is still “no”; and to the question “should this very change be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “does this very change should be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/06/2011 09:24 PM, Dave Page wrote: On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “does this very change should be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. I disagree - the proposed patch may provide a very significant improvement for a certain workload type (nothing less, but nothing more), but it was posted way after -BETA and I'm not sure we yet understand all implications of the changes. We also have to consider that the underlying issues have been known problems for multiple years^releases, so I don't think there is a particular rush to force them into a particular release (as in 9.1). Stefan
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
* Dave Page (dp...@pgadmin.org) wrote: Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. So, with folks putting up that we should hammer this patch out and force it into 9.1.. What should our new release date for 9.1 be? What about other patches that didn't make it into 9.1? What about the upcoming CommitFest that we've asked people to start working on? If we're going to start putting in changes like this, I'd suggest that we try and target something like September for 9.1 to actually be released. Playing with the lock management isn't something we want to be doing lightly and I think we definitely need to have serious testing of this, similar to what has been done for the SSI changes, before we're going to be able to release it. I don't agree that we should delay 9.1, but if people really want this in, then we need to figure out what the new schedule is going to be. Thanks, Stephen
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:40 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: On 06/06/2011 09:24 PM, Dave Page wrote: On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “does this very change should be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. I disagree - the proposed patch maybe provides a very significant improvment for a certain workload type(nothing less but nothing more), but it was posted way after -BETA and I'm not sure we yet understand all implications of the changes. We certainly need to be happy with the implications if we were to make such a decision. We also have to consider that the underlying issues are known problems for multiple years^releases so I don't think there is a particular rush to force them into a particular release (as in 9.1). No, there's no *technical* reason we need to do this, as there would be if it were a bug fix for example. I would just like to see us narrow the gap with our competitors sooner rather than later, *if* we're a) happy with the change, and b) we're talking about a minimal delay (which we may be - Robert says he thinks the patch is good, so with another review and beta testing). -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 6/6/11 12:12 PM, Dimitri Fontaine wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “does this very change should be considered this late?” my answer is yes. I could not disagree more strongly. We're in *beta* now. It's not like the last CF closed a couple weeks ago. Heck, I'm about to open the first CF for 9.2 in just over a week. Also, a patch like this needs several months of development, discussion and testing in order to fix the issues Robert already identified and make sure it doesn't break something fundamental to concurrency. Which would mean the release would be delayed until at least November, screwing over all the users who don't care about this patch. There will *always* be another really cool patch. If we keep delaying release to get in one more patch, then we never release. At some point you just have to take what you have and call it a release, and we are months past that point. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/06/2011 03:24 PM, Dave Page wrote: On Mon, Jun 6, 2011 at 8:12 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “does this very change should be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users. Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. I'm not a fan of hard and fast deadlines for releases - it puts too much pressure on us to release before we might be ready. But I'm also not a fan of totally abandoning our established processes, which is what accepting this would do. I don't mind bending the rules a bit occasionally; I do mind throwing them out the door. cheers andrew
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote: * Dave Page (dp...@pgadmin.org) wrote: Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer. So, with folks putting up that we should hammer this patch out and force it into 9.1.. What should our new release date for 9.1 be? What about other patches that didn't make it into 9.1? What about the upcoming CommitFest that we've asked people to start working on? If we're going to start putting in changes like this, I'd suggest that we try and target something like September for 9.1 to actually be released. Playing with the lock management isn't something we want to be doing lightly and I think we definitely need to have serious testing of this, similar to what has been done for the SSI changes, before we're going to be able to release it. Completely aside from the issue at hand, aren't we looking at a September release by now anyway (assuming we have to avoid late July/August as we usually do)? -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote: That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%. Wow! That's fantastic. Jignesh, are you in a position to test any of Robert's work using DBT or other benchmarks? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com I missed the discussion. Can you send me the patch (will that work with 9.1 beta?)? I can do a before and after with DBT2 and let you know. And also test it with sysbench read test which also has a relation locking bottleneck. Thanks. Regards, Jignesh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 5:13 PM, Simon Riggs si...@2ndquadrant.com wrote: The cost to us is a few days work and the benefit is a whole year's worth of increased performance for our user base, which has a hardware equivalent well into the millions of dollars. I doubt that this is an accurate reflection of the cost. What was presented by Robert Haas was a proof of concept, and he pointed out that it had numerous problems. To requote: There are numerous problems with the code as it stands at this point. It crashes if you try to use 2PC, which means the regression tests fail; it probably does horrible things if you run out of shared memory; pg_locks knows nothing about the new mechanism (arguably, we could leave it that way: only locks that can't possibly be conflicting with anything can be taken using this mechanism, but it would be nice to fix, I think); and there are likely some other gotchas as well. Turning this into something ready for production deployment in 9.1 would require a non-trivial amount of additional effort, and would likely have the adverse effect of deferring the release of 9.1, as well as of further deferring all the effects of the patches submitted for the latest commitfest (https://commitfest.postgresql.org/action/commitfest_view?id=10), since this defers release of 9.2, as well. While the patch is a fine one, in that it has interesting effects, it seems like a way wiser idea to me to let it go through the 9.2 process, so that it has 6 months worth of buildfarm runs before it gets deployed for real just like all the other items in the 2011-06 CommitFest. Note that it may lead to further discoveries, so that perhaps, in the 9.2 series, we'd see further improvements due to things that are discovered as further consequence of testing https://commitfest.postgresql.org/action/patch_view?id=572. -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this? 
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 3:59 PM, Christopher Browne cbbro...@gmail.com wrote: On Mon, Jun 6, 2011 at 5:13 PM, Simon Riggs si...@2ndquadrant.com wrote: The cost to us is a few days work and the benefit is a whole year's worth of increased performance for our user base, which has a hardware equivalent well into the millions of dollars. I doubt that this is an accurate reflection of the cost. What was presented by Robert Haas was a proof of concept, and he pointed out that it had numerous problems. To requote: There are numerous problems with the code as it stands at this point. It crashes if you try to use 2PC, which means the regression tests fail; it probably does horrible things if you run out of shared memory; pg_locks knows nothing about the new mechanism (arguably, we could leave it that way: only locks that can't possibly be conflicting with anything can be taken using this mechanism, but it would be nice to fix, I think); and there are likely some other gotchas as well. The latest version of the patch is in much better shape: http://archives.postgresql.org/pgsql-hackers/2011-06/msg00403.php But this is not intended as disparagement for the balance of your argument. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Stephen Frost sfr...@snowman.net wrote:

if people really want this in, then we need to figure out what the new schedule is going to be.

I suggest June, 2012. That way we can get a whole bunch more really cool patches in, and the users won't have to wait for 9.2 to get them.

-Kevin
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 8:52 PM, Dave Page dp...@pgadmin.org wrote:
On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:
* Dave Page (dp...@pgadmin.org) wrote:

Much as I hate to say it (I too want to keep our schedule as predictable and organised as possible), I have to agree. Assuming the patch is good, I think this is something we should push into 9.1. It really could be a game changer.

So, with folks putting up that we should hammer this patch out and force it into 9.1.. What should our new release date for 9.1 be? What about other patches that didn't make it into 9.1? What about the upcoming CommitFest that we've asked people to start working on?

If we're going to start putting in changes like this, I'd suggest that we try and target something like September for 9.1 to actually be released. Playing with the lock management isn't something we want to be doing lightly and I think we definitely need to have serious testing of this, similar to what has been done for the SSI changes, before we're going to be able to release it.

Completely aside from the issue at hand, aren't we looking at a September release by now anyway (assuming we have to avoid late July/August as we usually do)?

I see no reason to delay from a July release as has long been planned. What open items are genuine blockers?

If we need deadlines anywhere, it's in beta and final release; otherwise we all just sit around shrugging and saying "another week, I guess".

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Excerpts from Dimitri Fontaine's message of Mon Jun 06 15:12:54 -0400 2011:

So, to the question “do we want hard deadlines?” I think the answer is “no”, to “do we need hard deadlines?”, my answer is still “no”, and to the question “should this very change be considered this late?” my answer is yes. Because it really changes the game for PostgreSQL users.

Maybe so, but the problem is that the patch is really WIP at this point and it obviously still needs a lot of work, judging from the patch author's comments.

I note that if 2nd Quadrant is interested in having a game-changing platform without having to wait a full year for 9.2, they can obviously distribute a modified version of Postgres that integrates Robert's patch.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Excerpts from Robert Haas's message of Fri Jun 03 09:17:08 -0400 2011:

I've now spent enough time working on this issue to be convinced that the approach has merit, if we can work out the kinks. I'll start with some performance numbers.

I hereby recommend that people with patches such as this one refrain from posting them in the last weeks before a release, until the release has actually taken place.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Dave Page dp...@pgadmin.org writes:
On Mon, Jun 6, 2011 at 8:44 PM, Stephen Frost sfr...@snowman.net wrote:

If we're going to start putting in changes like this, I'd suggest that we try and target something like September for 9.1 to actually be released. Playing with the lock management isn't something we want to be doing lightly and I think we definitely need to have serious testing of this, similar to what has been done for the SSI changes, before we're going to be able to release it.

Completely aside from the issue at hand, aren't we looking at a September release by now anyway (assuming we have to avoid late July/August as we usually do)?

Very possibly. So if we add this in, we're talking November or December instead of September. You can't argue that July/August will be lost time for one development path but not another.

regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 6:53 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
Excerpts from Robert Haas's message of Fri Jun 03 09:17:08 -0400 2011:

I've now spent enough time working on this issue to be convinced that the approach has merit, if we can work out the kinks. I'll start with some performance numbers.

I hereby recommend that people with patches such as this one while on the last weeks till release should refrain from posting them until the release has actually taken place.

%@#! Next time I'll be sure to only post my patches during beta if they suck.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
* Simon Riggs (si...@2ndquadrant.com) wrote:

I see no reason to delay from a July release as has long been planned. What open items are genuine blockers? If we need deadlines anywhere its in beta and final release, otherwise we all just sit around shrugging and saying another week I guess.

I'm a bit confused by your response here. Clearly, if we're going to try and get this patch cleaned up and committable, then it's an open item and a genuine blocker with a couple of months of work associated with it. If we don't try to shove this patch in then perhaps we can get a release out in the next month or so.

It was my understanding that we're in beta and final release right now, and we're trying to hit deadlines now which are associated with that. Adding this patch into the queue of things to be done before release moves us back out of the beta testing and final release stage.

In other words, if you're arguing to stick to a release soon then it doesn't make sense, to me anyway, to advocate applying a mostly untested patch which changes a great deal of very important core logic.

Thanks,

Stephen
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Mon, Jun 6, 2011 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote:

That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.

Wow! That's fantastic. Jignesh, are you in a position to test any of Robert's work using DBT or other benchmarks?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

Okay, I tried it out with the sysbench read scaling test. Note that I had tried that earlier on 9.0:

http://jkshah.blogspot.com/2010/11/postgresql-90-simple-select-scaling.html

On that test I found that running on anything bigger than 4 cores led to decreased performance.

Redoing the same test with 100 users on a 4 vCPU virtual machine with 8GB and 1M rows, I get

transactions: 17870082 (59566.46 per sec.)

which is in line with the best number on 9.0. This test hardly had any idle CPU.

However, where it made a huge impact was running the same test on my 8 vCPU VM with 8GB RAM, where I get

transactions: 33274594 (110914.85 per sec.)

which is a whopping 1.8x scaling for 2x the hardware (from 4 to 8 vCPU). My idle CPU was less than 7%, which, given that the useful work is in line with my expectations, is really impressive. (Plus, the last time I ran MySQL it was around 95K or so for the same test.)

Also note that in my earlier case 60K was the max irrespective of the hardware I threw at it. For this fastlock patch that does not seem to be the problem anymore :-)

This gain is impressive. Next step: DBT-2.

Regards,
Jignesh
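As a quick check of the "1.8x scaling for 2x" figure in this message, the arithmetic can be sketched in Python. The function name `scaling` and its parameter names are made up for this illustration; the two throughput values are the sysbench results quoted in the message.

```python
def scaling(tps_small, tps_large, cpus_small, cpus_large):
    """Return (speedup, efficiency) when going from the small to the large box.

    speedup    = throughput ratio between the two runs
    efficiency = speedup divided by the ratio of CPU counts (1.0 = linear scaling)
    """
    speedup = tps_large / tps_small
    efficiency = speedup / (cpus_large / cpus_small)
    return speedup, efficiency


# Transactions per second reported for the 4 vCPU and 8 vCPU virtual machines.
speedup, efficiency = scaling(59566.46, 110914.85, cpus_small=4, cpus_large=8)
print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.0%}")
```

With these inputs the speedup comes out a little above 1.8x, which matches the figure given in the message.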
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/03/2011 03:17 PM, Robert Haas wrote:

[...]

As you can see, this works out to a bit more than a 4% improvement on this two-core box. I also got access (thanks to Nate Boley) to a 24-core box and ran the same test with scale factor 100 and shared_buffers=8GB. Here are the results of alternating runs without and with the patch on that machine:

tps = 36291.996228 (including connections establishing)
tps = 129242.054578 (including connections establishing)
tps = 36704.393055 (including connections establishing)
tps = 128998.648106 (including connections establishing)
tps = 36531.208898 (including connections establishing)
tps = 131341.367344 (including connections establishing)

That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.

nice - but let's see on real hardware...

Testing this on a brand new E7-4850 4 Socket/10cores+HT Box - so 80 hardware threads:

first some numbers with -HEAD (-T 120, runtimes at lower -c counts have fairly high variation in the results, first number is the number of connections/threads):

-j1:  tps = 7928.965493 (including connections establishing)
-j8:  tps = 53610.572347 (including connections establishing)
-j16: tps = 80835.446118 (including connections establishing)
-j32: tps = 75666.731883 (including connections establishing)
-j40: tps = 74628.568388 (including connections establishing)
-j64: tps = 68268.081973 (including connections establishing)
-c80: tps = 66704.216166 (including connections establishing)

postgresql is completely lock limited in this test; anything beyond around -j10 is basically not able to push the box below 80% IDLE(!)

and now with the patch applied:

-j1:  tps = 7783.295587 (including connections establishing)
-j8:  tps = 44361.661947 (including connections establishing)
-j16: tps = 92270.464541 (including connections establishing)
-j24: tps = 108259.524782 (including connections establishing)
-j32: tps = 183337.422612 (including connections establishing)
-j40: tps = 209616.052430 (including connections establishing)
-j48: tps = 229621.292382 (including connections establishing)
-j56: tps = 218690.391603 (including connections establishing)
-j64: tps = 188028.348501 (including connections establishing)
-j80: tps = 118814.741609 (including connections establishing)

so much better - but I still think there is some headroom left, although pgbench itself is a CPU hog in those benchmarks, eating up to 10 cores in the worst case scenario - will retest with sysbench, which in the past showed more reasonable CPU usage for me.

and a profile (patched code) for the -j48 (aka fastest) case:

731535  11.8408  postgres  s_lock
291878   4.7244  postgres  LWLockAcquire
242373   3.9231  postgres  AllocSetAlloc
239083   3.8698  postgres  LWLockRelease
202341   3.2751  postgres  SearchCatCache
190055   3.0763  postgres  hash_search_with_hash_value
187148   3.0292  postgres  base_yyparse
173265   2.8045  postgres  GetSnapshotData
 75700   1.2253  postgres  core_yylex
 74974   1.2135  postgres  MemoryContextAllocZeroAligned
 61404   0.9939  postgres  _bt_compare
 57529   0.9312  postgres  MemoryContextAlloc

and one for the -j80 case (also patched):

485798  48.9667  postgres  s_lock
 60327   6.0808  postgres  LWLockAcquire
 57049   5.7503  postgres  LWLockRelease
 18357   1.8503  postgres  hash_search_with_hash_value
 17033   1.7169  postgres  GetSnapshotData
 14763   1.4881  postgres  base_yyparse
 14460   1.4575  postgres  SearchCatCache
 13975   1.4086  postgres  AllocSetAlloc
  6416   0.6467  postgres  PinBuffer
  5024   0.5064  postgres  SIGetDataEntries
  4704   0.4741  postgres  core_yylex
  4625   0.4662  postgres  _bt_compare

Stefan
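To put the HEAD and patched pgbench runs in this message side by side, a small Python sketch follows. The dictionaries only restate the tps figures from the message; the helper names `speedups` and `peak` are invented for the example, and treating the `-c80` HEAD entry as 80 clients is an assumption.

```python
# tps by number of connections/threads, copied from the two runs in the message.
head = {1: 7928.965493, 8: 53610.572347, 16: 80835.446118, 32: 75666.731883,
        40: 74628.568388, 64: 68268.081973, 80: 66704.216166}
patched = {1: 7783.295587, 8: 44361.661947, 16: 92270.464541,
           24: 108259.524782, 32: 183337.422612, 40: 209616.052430,
           48: 229621.292382, 56: 218690.391603, 64: 188028.348501,
           80: 118814.741609}


def speedups(before, after):
    """Ratio after/before for every client count measured in both runs."""
    return {n: after[n] / before[n] for n in sorted(before) if n in after}


def peak(run):
    """Client count at which a run reached its highest tps."""
    return max(run, key=run.get)


ratios = speedups(head, patched)
for clients, ratio in ratios.items():
    print(f"{clients:>3} clients: {ratio:.2f}x")
print(f"peak HEAD at {peak(head)} clients, peak patched at {peak(patched)} clients")
```

Only client counts present in both runs are compared, so the 24, 48 and 56 client results show up in the peak calculation but not in the ratios.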
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:

and one for the -j80 case(also patched).

485798  48.9667  postgres  s_lock
 60327   6.0808  postgres  LWLockAcquire
 57049   5.7503  postgres  LWLockRelease
 18357   1.8503  postgres  hash_search_with_hash_value
 17033   1.7169  postgres  GetSnapshotData
 14763   1.4881  postgres  base_yyparse
 14460   1.4575  postgres  SearchCatCache
 13975   1.4086  postgres  AllocSetAlloc
  6416   0.6467  postgres  PinBuffer
  5024   0.5064  postgres  SIGetDataEntries
  4704   0.4741  postgres  core_yylex
  4625   0.4662  postgres  _bt_compare

Hmm, does that mean that it's spending 50% of the time spinning on a spinlock? That's bad. It's one thing to be contended on a lock, and have a lot of idle time because of that, but it's even worse to spend a lot of time spinning because that CPU time won't be spent on doing more useful work, even if there is some other process on the system that could make use of that CPU time.

I like the overall improvement on the throughput, of course, but we have to find a way to avoid the busy-wait.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:

and one for the -j80 case(also patched).

485798  48.9667  postgres  s_lock
 60327   6.0808  postgres  LWLockAcquire
 57049   5.7503  postgres  LWLockRelease
 18357   1.8503  postgres  hash_search_with_hash_value
 17033   1.7169  postgres  GetSnapshotData
 14763   1.4881  postgres  base_yyparse
 14460   1.4575  postgres  SearchCatCache
 13975   1.4086  postgres  AllocSetAlloc
  6416   0.6467  postgres  PinBuffer
  5024   0.5064  postgres  SIGetDataEntries
  4704   0.4741  postgres  core_yylex
  4625   0.4662  postgres  _bt_compare

Hmm, does that mean that it's spending 50% of the time spinning on a spinlock? That's bad. It's one thing to be contended on a lock, and have a lot of idle time because of that, but it's even worse to spend a lot of time spinning because that CPU time won't be spent on doing more useful work, even if there is some other process on the system that could make use of that CPU time.

well yeah - we are broken right now with only being able to use ~20% of CPU on a modern mid-range box, but using 80% CPU (or 4x like in the above case) and only getting less than 2x the performance seems wrong as well. I also wonder if we are still missing something fundamental - because even with the current patch we are quite far away from linear scaling and light-years from some of our competitors...

Stefan
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sun, Jun 5, 2011 at 4:01 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote:
On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:

and one for the -j80 case(also patched).

485798  48.9667  postgres  s_lock
 60327   6.0808  postgres  LWLockAcquire
 57049   5.7503  postgres  LWLockRelease
 18357   1.8503  postgres  hash_search_with_hash_value
 17033   1.7169  postgres  GetSnapshotData
 14763   1.4881  postgres  base_yyparse
 14460   1.4575  postgres  SearchCatCache
 13975   1.4086  postgres  AllocSetAlloc
  6416   0.6467  postgres  PinBuffer
  5024   0.5064  postgres  SIGetDataEntries
  4704   0.4741  postgres  core_yylex
  4625   0.4662  postgres  _bt_compare

Hmm, does that mean that it's spending 50% of the time spinning on a spinlock? That's bad. It's one thing to be contended on a lock, and have a lot of idle time because of that, but it's even worse to spend a lot of time spinning because that CPU time won't be spent on doing more useful work, even if there is some other process on the system that could make use of that CPU time.

well yeah - we are broken right now with only being able to use ~20% of CPU on a modern mid-range box, but using 80% CPU (or 4x like in the above case) and only getting less than 2x the performance seems wrong as well. I also wonder if we are still missing something fundamental - because even with the current patch we are quite far away from linear scaling and light-years from some of our competitors...

Could you compile with LWLOCK_STATS, rerun these tests, total up the blk numbers by LWLockId, and post the results? (Actually, totalling up the shacq and exacq numbers would be useful as well, if you wouldn't mind.)

Unless I very much miss my guess, we're going to see zero contention on the new structures introduced by this patch. Rather, I suspect what we're going to find is that, with the hideous contention on one particular lock manager partition lock removed, there's a more spread-out contention problem, likely involving the lock manager partition lock, the buffer mapping locks, and possibly other LWLocks as well. The fact that the system is busy-waiting rather than just not using the CPU at all probably means that the remaining contention is more spread out than that which is removed by this patch. We don't actually have everything pile up on a single LWLock (as happens in git master), but we do spend a lot of time fighting cache lines away from other CPUs. Or at any rate, that's my guess: we need some real numbers to know for sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sun, Jun 5, 2011 at 5:46 PM, Robert Haas robertmh...@gmail.com wrote:

Could you compile with LWLOCK_STATS, rerun these tests, total up the blk numbers by LWLockId, and post the results? (Actually, totalling up the shacq and exacq numbers would be useful as well, if you wouldn't mind.)

I did this on the loaner 24-core box from Nate Boley and got the following results. This is just the LWLocks that had blk>0.

lwlock 0: shacq 0 exacq 200625 blk 24044
lwlock 4: shacq 80101430 exacq 196 blk 28
lwlock 33: shacq 8333673 exacq 11977 blk 864
lwlock 34: shacq 7092293 exacq 11890 blk 803
lwlock 35: shacq 7893875 exacq 11909 blk 848
lwlock 36: shacq 7567514 exacq 11912 blk 830
lwlock 37: shacq 7427774 exacq 11930 blk 745
lwlock 38: shacq 7120108 exacq 11989 blk 853
lwlock 39: shacq 7584952 exacq 11982 blk 782
lwlock 40: shacq 7949867 exacq 12056 blk 821
lwlock 41: shacq 6612240 exacq 11929 blk 746
lwlock 42: shacq 47512112 exacq 11844 blk 4503
lwlock 43: shacq 7943511 exacq 11871 blk 878
lwlock 44: shacq 7534558 exacq 12033 blk 800
lwlock 45: shacq 7128256 exacq 12045 blk 856
lwlock 46: shacq 7575339 exacq 12015 blk 818
lwlock 47: shacq 6745173 exacq 12094 blk 806
lwlock 48: shacq 8410348 exacq 12104 blk 977
lwlock 49: shacq 0 exacq 5007594 blk 172533
lwlock 50: shacq 0 exacq 5011704 blk 172282
lwlock 51: shacq 0 exacq 5003356 blk 172802
lwlock 52: shacq 0 exacq 5009020 blk 174648
lwlock 53: shacq 0 exacq 5010808 blk 172080
lwlock 54: shacq 0 exacq 5004908 blk 169934
lwlock 55: shacq 0 exacq 5009324 blk 170281
lwlock 56: shacq 0 exacq 5005904 blk 171001
lwlock 57: shacq 0 exacq 5006984 blk 169942
lwlock 58: shacq 0 exacq 5000346 blk 170001
lwlock 59: shacq 0 exacq 5004884 blk 170484
lwlock 60: shacq 0 exacq 5006304 blk 171325
lwlock 61: shacq 0 exacq 5008421 blk 170866
lwlock 62: shacq 0 exacq 5008162 blk 170868
lwlock 63: shacq 0 exacq 5002238 blk 170291
lwlock 64: shacq 0 exacq 5005348 blk 169764
lwlock 307: shacq 0 exacq 2 blk 1
lwlock 315: shacq 0 exacq 3 blk 2
lwlock 337: shacq 0 exacq 4 blk 3
lwlock 345: shacq 0 exacq 2 blk 1
lwlock 349: shacq 0 exacq 2 blk 1
lwlock 231251: shacq 0 exacq 2 blk 1
lwlock 253831: shacq 0 exacq 2 blk 1

So basically, even with the patch, at 24 cores the lock manager locks are still under tremendous pressure. But note that there's a big difference between what's happening here and what's happening without the patch. Here's without the patch:

lwlock 0: shacq 0 exacq 191613 blk 17591
lwlock 4: shacq 21543085 exacq 102 blk 20
lwlock 33: shacq 2237938 exacq 11976 blk 463
lwlock 34: shacq 1907344 exacq 11890 blk 458
lwlock 35: shacq 2125308 exacq 11908 blk 442
lwlock 36: shacq 2038220 exacq 11912 blk 430
lwlock 37: shacq 1998059 exacq 11927 blk 449
lwlock 38: shacq 1916179 exacq 11953 blk 409
lwlock 39: shacq 2042173 exacq 12019 blk 479
lwlock 40: shacq 2140002 exacq 12056 blk 448
lwlock 41: shacq 1776772 exacq 11928 blk 392
lwlock 42: shacq 12777368 exacq 11842 blk 2451
lwlock 43: shacq 2132240 exacq 11869 blk 478
lwlock 44: shacq 2026845 exacq 12031 blk 446
lwlock 45: shacq 1918618 exacq 12045 blk 449
lwlock 46: shacq 2038437 exacq 12011 blk 472
lwlock 47: shacq 1814660 exacq 12089 blk 401
lwlock 48: shacq 2261208 exacq 12105 blk 478
lwlock 49: shacq 0 exacq 1347524 blk 17020
lwlock 50: shacq 0 exacq 1350678 blk 16888
lwlock 51: shacq 0 exacq 1346260 blk 16744
lwlock 52: shacq 0 exacq 1348432 blk 16864
lwlock 53: shacq 0 exacq 22216779 blk 4914363
lwlock 54: shacq 0 exacq 22217309 blk 4525381
lwlock 55: shacq 0 exacq 1348406 blk 13438
lwlock 56: shacq 0 exacq 1345996 blk 13299
lwlock 57: shacq 0 exacq 1347890 blk 13654
lwlock 58: shacq 0 exacq 1343486 blk 13349
lwlock 59: shacq 0 exacq 1346198 blk 13471
lwlock 60: shacq 0 exacq 1346236 blk 13532
lwlock 61: shacq 0 exacq 1343688 blk 13547
lwlock 62: shacq 0 exacq 1350068 blk 13614
lwlock 63: shacq 0 exacq 1345302 blk 13420
lwlock 64: shacq 0 exacq 1348858 blk 13635
lwlock 321: shacq 0 exacq 2 blk 1
lwlock 329: shacq 0 exacq 4 blk 3
lwlock 337: shacq 0 exacq 6 blk 4
lwlock 347: shacq 0 exacq 5 blk 4
lwlock 357: shacq 0 exacq 3 blk 2
lwlock 363: shacq 0 exacq 3 blk 2
lwlock 369: shacq 0 exacq 4 blk 3
lwlock 379: shacq 0 exacq 2 blk 1
lwlock 383: shacq 0 exacq 2 blk 1
lwlock 445: shacq 0 exacq 2 blk 1
lwlock 449: shacq 0 exacq 2 blk 1
lwlock 451: shacq 0 exacq 2 blk 1
lwlock 1023: shacq 0 exacq 2 blk 1
lwlock 11401: shacq 0 exacq 2 blk 1
lwlock 115591: shacq 0 exacq 2 blk 1
lwlock 117177: shacq 0 exacq 2 blk 1
lwlock 362839: shacq 0 exacq 2 blk 1

In the unpatched case, two lock manager locks are getting beaten to death, and the others are all about equally contended. By eliminating the portion of the lock manager contention that pertains specifically to the two heavily trafficked locks, system throughput improves by about 3.5x - and, not surprisingly, traffic on the lock manager locks increases by approximately the same multiple. Those locks now become the contention bottleneck, with about 12x the blocking they had pre-patch.
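For anyone wanting to repeat the "total up the blk numbers by LWLockId" step, here is a minimal Python sketch. It assumes only that each stats entry contains the text `lwlock N: shacq A exacq B blk C` as shown in this message; any prefix on the line is ignored. The function name `total_by_lock` and the sample strings are illustrative, and the sample rows are a handful of entries copied from the two listings in this message.

```python
import re
from collections import defaultdict

# Matches one stats entry anywhere in the text, so it also works on output
# where several entries ended up on the same line.
ENTRY_RE = re.compile(r"lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)")


def total_by_lock(text):
    """Sum shacq, exacq and blk per lwlock id over all entries in text."""
    totals = defaultdict(lambda: {"shacq": 0, "exacq": 0, "blk": 0})
    for match in ENTRY_RE.finditer(text):
        lock_id, shacq, exacq, blk = map(int, match.groups())
        totals[lock_id]["shacq"] += shacq
        totals[lock_id]["exacq"] += exacq
        totals[lock_id]["blk"] += blk
    return dict(totals)


# A few entries from the message: unpatched first, then patched.
unpatched = """
lwlock 49: shacq 0 exacq 1347524 blk 17020
lwlock 53: shacq 0 exacq 22216779 blk 4914363
lwlock 54: shacq 0 exacq 22217309 blk 4525381
lwlock 55: shacq 0 exacq 1348406 blk 13438
"""
patched = """
lwlock 49: shacq 0 exacq 5007594 blk 172533
lwlock 53: shacq 0 exacq 5010808 blk 172080
lwlock 54: shacq 0 exacq 5004908 blk 169934
lwlock 55: shacq 0 exacq 5009324 blk 170281
"""

before = total_by_lock(unpatched)
after = total_by_lock(patched)
for lock_id in sorted(before):
    ratio = after[lock_id]["blk"] / before[lock_id]["blk"]
    print(f"lwlock {lock_id}: blk {before[lock_id]['blk']} -> "
          f"{after[lock_id]['blk']} ({ratio:.2f}x)")
```

On these sample rows the blocking on locks 53 and 54 drops sharply with the patch, while it rises by roughly an order of magnitude on locks 49 and 55, which is the pattern described in the message.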
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sun, Jun 5, 2011 at 10:16 PM, Robert Haas robertmh...@gmail.com wrote:

I'm definitely interested in investigating what to do about that, but I don't think it's this patch's problem to fix all of our lock manager bottlenecks.

I did some further investigation of this. It appears that more than 99% of the lock manager lwlock traffic that remains with this patch applied has locktag_type == LOCKTAG_VIRTUALTRANSACTION. Every SELECT statement runs in a separate transaction, and for each new transaction we run VirtualXactLockTableInsert(), which takes a lock on the vxid of that transaction, so that other processes can wait for it. That requires acquiring and releasing a lock manager partition lock, and we have to do the same thing a moment later at transaction end to dump the lock.

A quick grep seems to indicate that the only places where we actually make use of those VXID locks are in DefineIndex(), when CREATE INDEX CONCURRENTLY is in use, and during Hot Standby, when max_standby_delay expires. Considering that these are not commonplace events, it seems tremendously wasteful to incur the overhead for every transaction. It might be possible to make the lock entry spring into existence on demand - i.e. if a backend wants to wait on a vxid entry, it creates the LOCK and PROCLOCK objects for that vxid. That presents a few synchronization challenges, and plus we have to make sure that the backend that's just been given a lock knows that it needs to release it, but those seem like they might be manageable problems, especially given the new infrastructure introduced by the current patch, which already has to deal with some of those issues. I'll look into this further.

It's likely that if we lick this problem, the BufFreelistLock and BufMappingLocks are going to be the next hot spot. Of course, we're ignoring the ten-thousand pound gorilla in the corner, which is that on write workloads we have a pretty bad contention problem with WALInsertLock, which I fear will not be so easily addressed. But one problem at a time, I guess.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Fri, Jun 3, 2011 at 2:17 PM, Robert Haas robertmh...@gmail.com wrote:

I've now spent enough time working on this issue now to be convinced that the approach has merit, if we can work out the kinks.

Yes, the approach has merits and I'm sure we can work out the kinks.

As you can see, this works out to a bit more than a 4% improvement on this two-core box. I also got access (thanks to Nate Boley) to a 24-core box and ran the same test with scale factor 100 and shared_buffers=8GB. Here are the results of alternating runs without and with the patch on that machine:

tps = 36291.996228 (including connections establishing)
tps = 129242.054578 (including connections establishing)
tps = 36704.393055 (including connections establishing)
tps = 128998.648106 (including connections establishing)
tps = 36531.208898 (including connections establishing)
tps = 131341.367344 (including connections establishing)

That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.

Congratulations. I believe that is realistic based upon my investigations.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
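The "about ~3.5x" figure quoted in this message can be checked from the six alternating runs. The sketch below is only that arithmetic in Python; the variable names are mine, and averaging the three runs on each side is one reasonable way to summarize them, not necessarily how the original figure was derived.

```python
from statistics import mean

# Alternating pgbench runs on the 24-core box, as quoted in the message:
# odd-numbered runs are without the patch, even-numbered runs are with it.
without_patch = [36291.996228, 36704.393055, 36531.208898]
with_patch = [129242.054578, 128998.648106, 131341.367344]

improvement = mean(with_patch) / mean(without_patch)
print(f"mean tps without patch: {mean(without_patch):.1f}")
print(f"mean tps with patch:    {mean(with_patch):.1f}")
print(f"improvement:            {improvement:.2f}x")
```

The ratio of the means lands between 3.5x and 3.6x, consistent with the figure in the message.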
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sat, Jun 4, 2011 at 2:59 PM, Simon Riggs si...@2ndquadrant.com wrote:

As you can see, this works out to a bit more than a 4% improvement on this two-core box. I also got access (thanks to Nate Boley) to a 24-core box and ran the same test with scale factor 100 and shared_buffers=8GB. Here are the results of alternating runs without and with the patch on that machine:

tps = 36291.996228 (including connections establishing)
tps = 129242.054578 (including connections establishing)
tps = 36704.393055 (including connections establishing)
tps = 128998.648106 (including connections establishing)
tps = 36531.208898 (including connections establishing)
tps = 131341.367344 (including connections establishing)

That's an improvement of about ~3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.

Congratulations. I believe that is realistic based upon my investigations.

Tom,

You should look at this. It's good.

The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

I will happily add that it's a completely different approach from the one I'd been working on and, even more happily, that it is so different from the Oracle approach that we are definitely unencumbered by patent issues here.

Well done Robert, Noah.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 04.06.2011 18:01, Simon Riggs wrote:

It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

-1

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs wrote:

we should be considering this for inclusion in 9.1, not wait another year.

-1

I'm really happy that we're addressing the problems with scaling to a large number of cores, and this patch sounds great. Adding a new feature at this point in the release cycle would be horrible. Frankly, from the tone of Robert's post, it probably wouldn't be appropriate to include it in a release if it showed up in this condition at the start of the last CF for that release.

The nice thing about annual releases is there's never one too far away -- unless, of course, we hold a release up to squeeze in just one more feature.

-Kevin
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes:

The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

That suggestion is completely insane. The patch is only WIP and full of bugs, even according to its author. Even if it were solid, it is way too late to be pushing such stuff into 9.1. We're trying to ship a release, not find ways to cause it to slip more.

regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Robert Haas robertmh...@gmail.com wrote:

That's an improvement of about ~3.5x.

Outstanding! I don't want to even peek at this until I've posted the two WIP SSI patches (now both listed on the Open Items page), but will definitely take a look after that.

-Kevin
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Fri, Jun 3, 2011 at 10:13 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
Robert Haas robertmh...@gmail.com wrote:

That's an improvement of about ~3.5x.

Outstanding! I don't want to even peek at this until I've posted the two WIP SSI patches (now both listed on the Open Items page), but will definitely take a look after that.

Yeah, those SSI items are important to get nailed down RSN. But thanks for your interest in this patch. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company