Re: [HACKERS] Minor inheritance/check bug: Inconsistent behavior
On Fri, Sep 20, 2013 at 7:17 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 20, 2013 at 9:43 AM, Amit Kapila amit.kapil...@gmail.com wrote: Handling for OID is not clear, shall we allow it or not in check constraints? In my current patch OID is not allowed in check constraints. I vote for allowing it. I had tried to allow the OID case, but setting OID in case of an UPDATE operation is a bit tricky. In ExecUpdate(), we need to set OID in the new tuple by getting it from the old tuple, but we only have the old tupleid in this function. Using the old tupleid, I can get the tuple and then fetch OID from it to set in the new tuple as is done in heap_update(), but it seems a bit more work, and the same has to be done in heap_update() anyway unless we change the signature of heap_update(). For now, I have updated the patch for the below points: a. set tableoid in the Copy case b. set tableoid in the Alter case c. update the docs for the Create Table page. The other pages, like Alter Table and Constraints, don't have any information about what is not allowed in check constraints, so I left them as is. I have not added new tests for Alter/Copy as they would test the same thing the insert testcase already tests. However, if you feel we should add tests for the Alter/Copy/Update operations, I will update the patch. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com sys_col_constr_v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
Hello Noah, Thanks a lot for all these comments! I'm not planning to apply all of them directly, especially removing features that I think are really desirable. Please find a defense against some of your suggestions. I wish to wait for some more feedback about these arguments before spending time on heavy changes. Improve pgbench measurements progress report These changes are loosely coupled; please separate them into several patch files: I thought about this. I submitted a bunch of very small pgbench patches to the previous commit fest, and it resulted in a lot of duplicated tests by reviewers, which made it a waste of their precious time. ISTM that what you suggest will result in more reviewer time, which is the critical resource of the commitfest. Moreover, splitting the patch as you suggest, especially in 7 patches, would result in conflicting or dependent patches, which creates yet more issues for the reviews. There is *some* coupling, because the reports are about the measurements performed, and I change both to try to have something consistent in the end. I can separate the doc fix, but this is just one line... Basically, I tried to optimize reviewer time by a combined patch focussed on pgbench measurements and their reporting. But this does not prevent discussing features separately! - Use the progress option both under initialization and benchmarking. Activate the progress report by default, every 5 seconds. When initializing, --quiet reverts to the old every-100,000-insertions behavior... Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. The gettimeofday call time is very small compared to network and database (disk!)
accesses involved in a pgbench run. On my small laptop, gettimeofday takes about 0.02 µs (microseconds) per call, which is typically under 1/1000 of the fastest local-socket read-only one-table cache-hit prepared trivial transaction. This is a little different when initializing, but I put a guard on gettimeofday in that case. If this is a blocker, I can put 0, but I really do not think it is necessary because the performance impact is so small. About the default behavior: it should be whatever is most useful. I found that running pgbench to get significant results requires long minutes or even hours because of warmup time. Running a command for 30 minutes without any feedback is annoying. The second point is that the feedback would help users notice that the figures evolve slowly but significantly, and realise that their too-short runs are not significant at all. So it seems to me that it should be the desirable behavior, and --progress=0 is always possible anyway for the very performance (over-)conscious tester. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. I totally agree that this quiet is not convincing! My actual opinion is that quiet should just mean quiet :-), i.e. no progress report.
I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. - Measure transaction latency instead of computing it. Patch (3): Report the standard deviation of latency. Seems potentially useful; see inline comments below. In my limited testing, the interesting latency cases involved stray clusters of transactions showing 10x-100x mean latency. If that's typical, I'd doubt mean/stddev is a great analytical tool. But I have little reason to believe that my experience here was representative. The stddev measurement is sensitive to the cache hit ratio, so this is a good indicator of whether the steady state is reached. Indeed, mean is the most basic tool, and stddev the second most basic... They are better than nothing, and the stddev does add value.
Re: [HACKERS] pgbench progress report improvements
On Sat, Sep 21, 2013 at 4:55 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: I suggest getting the term stddev in there somehow, maybe like this: progress: 37.0 s, 115.2 tps, latency avg=8.678 ms stddev=1.792 My issue is to try to keep the line width under control so as to avoid line breaks in the terminal. Under throttling, there is also the time lag appended to the line. Moreover, using 'xxx=figure' breaks simple cut pipelining to extract the figures, so I would prefer to stick to spaces. Maybe: progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms but I liked my +- approach :-) 100 +- 3 implies a range of 97 to 103 and no values are outside of that range.
Re: [HACKERS] pgbench progress report improvements
Moreover, using 'xxx=figure' breaks simple cut pipelining to extract the figures, so I would prefer to stick to spaces. Maybe: progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms but I liked my +- approach :-) 100 +- 3 implies a range of 97 to 103 and no values are outside of that range. This is one of the possible meanings, but ISTM that it is not exclusive. Anyway, here we do not have a symmetric distribution of run times. Well, it seems I'm alone in liking my +-, so I'll backtrack on that. -- Fabien.
[HACKERS] VMs for Reviewers Available
Folks, In order to make sure that nobody is prevented from reviewing due to not having a suitable development environment available, the PostgreSQL community is offering free virtual machines for reviewing and testing patches for this CommitFest. If you want a VM for this purpose, please email me the following at any point during the CF: 1. Your public key. 2. Which patch you intend to review/test. 3. Preference for CentOS or Ubuntu. 4. When you expect to do the review. Since these are EC2 virtual servers, they will NOT be suitable for performance testing. However, if you want to review a replication patch, we can make multiple VMs available. All VMs will be terminated at the end of the CF. VMs will come equipped with all tools required to build PostgreSQL, plus a checkout of the PostgreSQL code from the beginning of the CF. You may need to install additional requirements if you are reviewing extensions or special features with dependencies. Note that I may take a day to respond with your VM access. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] pgbench progress report improvements
On Sat, Sep 21, 2013 at 10:55:54AM +0200, Fabien COELHO wrote: Improve pgbench measurements progress report These changes are loosely coupled; please separate them into several patch files: I thought about this. I submitted a bunch of very small pgbench patches to the previous commit fest, and it resulted in a lot of duplicated tests by reviewers, which made it a waste of their precious time. ISTM that what you suggest will result in more reviewer time, which is the critical resource of the commitfest. Moreover, splitting the patch as you suggest, especially in 7 patches, would result in conflicting or dependent patches, which creates yet more issues for the reviews. There is *some* coupling, because the reports are about the measurements performed, and I change both to try to have something consistent in the end. I can separate the doc fix, but this is just one line... Post all the patch files to the same mailing list thread and have them reviewed under a single CommitFest entry. A reviewer or committer preferring less granularity can squash patches together effortlessly. The inverse task, splitting a patch, is not so mechanical. Consequently, err on the side of more, smaller patches; it almost always saves reviewer and committer time. Basically, I tried to optimize reviewer time by a combined patch focussed on pgbench measurements and their reporting. Relatedness of changes is a continuum, and committers sometimes disagree on how far to subdivide before commit. This patch is not a borderline case; removing a bias from --rate throttle delays does not belong in the same commit as enabling progress reports by default. But this does not prevent discussing features separately! Actually, it does hamper debate. With smaller patches, a person interested in just one issue can read just that patch. With a monolithic patch, one must take in the entire thing to fully comprehend any single issue. 
Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. The gettimeofday call time is very small compared to network and database (disk!) accesses involved in a pgbench run. On my small laptop, gettimeofday takes about 0.02 µs (microseconds) per call, which is typically under 1/1000 of the fastest local-socket read-only one-table cache-hit prepared trivial transaction. Many systems do have cheap gettimeofday(). Many don't. We recently added contrib/pg_test_timing in response to the variability in this area. I found that running pgbench to get significant results requires long minutes or even hours because of warmup time. Running a command for 30 minutes without any feedback is annoying. The second point is that the feedback would help users notice that the figures evolve slowly but significantly, and realise that their too-short runs are not significant at all. So it seems to me that it should be the desirable behavior, and --progress=0 is always possible anyway for the very performance (over-)conscious tester. Those benefits aren't compelling enough to counterbalance the risk of gettimeofday() overhead affecting results. (Other opinions welcome.) For a tool like pgbench that requires considerable skill to use effectively, changing a default only helps slightly. It doesn't take much of a risk to make us better off leaving the default unchanged. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. 
In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. I totally agree that this quiet is not convincing! My actual opinion is that quiet should just mean quiet:-), i.e. no progress report. I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. I would be fine with
Re: [HACKERS] SSI freezing bug
Kevin Grittner kgri...@ymail.com wrote: Andres Freund and...@2ndquadrant.com wrote: On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote: When a tuple is predicate-locked, the key of the lock is ctid+xmin. However, when a tuple is frozen, its xmin is changed to FrozenXid. That effectively invalidates any predicate lock on the tuple, as checking for a lock on the same tuple later won't find it as the xmin is different. Attached is an isolationtester spec to demonstrate this. Do you have any idea to fix that besides keeping the xmin horizon below the lowest of the xids that are predicate locked? Which seems nasty to compute and is probably not trivial to fit into the procarray.c machinery? A better solution probably is to promote tuple-level locks, if they exist, to a relation-level one upon freezing, I guess? That would work. A couple other ideas would be to use the oldest serializable xmin which is being calculated in predicate.c to either prevent freezing of newer transactions or to summarize predicate locks (using the existing SLRU-based summarization functions) which use xmin values for xids which are being frozen. The transactions must already be committed, and so are eligible for summarization. What's the point of using xmin as part of the lock key in the first place, wouldn't ctid alone suffice? If the tuple was visible to the reading transaction, it cannot be vacuumed away until the transaction commits. No other tuple can appear with the same ctid. - Heikki
Re: [HACKERS] SSI freezing bug
Heikki Linnakangas hlinnakan...@vmware.com schrieb: What's the point of using xmin as part of the lock key in the first place, wouldn't ctid alone suffice? If the tuple was visible to the reading transaction, it cannot be vacuumed away until the transaction commits. No other tuple can appear with the same ctid. SSI locks can live longer than one TX/snapshot. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] SSI freezing bug
On 09/21/2013 10:46 PM, Andres Freund wrote: What's the point of using xmin as part of the lock key in the first place, wouldn't ctid alone suffice? If the tuple was visible to the reading transaction, it cannot be vacuumed away until the transaction commits. No other tuple can appear with the same ctid. SSI locks can live longer than one TX/snapshot. But the only way that ctid can change without xmin changing is when you update the tuple in the same TX that created it. Could it be the case here, or can we safely exclude this? 
-- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
Re: [HACKERS] Assertions in PL/PgSQL
On 9/19/13 7:08 AM, Pavel Stehule wrote: FWIW, we've written a framework (currently available in the EnovaTools project on pgFoundry) that allows for very, very fine-grained control over asserts. - Every assert has a name (and an optional sub-name) as well as a level - You can globally set the minimum level that will trigger an assert. This is useful for some debugging stuff; have an assert with a negative level and normally it won't fire unless you set the minimum level to be less than zero. - You can disable an assert globally (across all backends) - You can disable an assert only within your session We should eventually allow for disabling an assert only for your transaction; we just haven't gotten around to it yet. The reason for all this flexibility is the concept of "it should be very difficult but not impossible for the code to do X". We use it for sanity-checking things. I think similar frameworks exist (we have some similar functionality) in orafce too - and that is not a reason why we should not merge some functions into core. I am with Marko in that some simple, user-friendly statement for assertions would be a very nice plpgsql feature. I differ in opinion on how to implement it and about the syntax. I would prefer the possibility (not necessarily currently implemented) to enhance this feature for similar tasks (as a builtin or external feature). Probably you and I share the opinion that a simple and very primitive assert is not enough. I see as useful features for assertions: a) possibility to specify a message (two-parameter assert) b) possibility to specify some threshold c) possibility to specify some level (exception, warning, notice) .. default should be exception d) possibility to specify a handled/unhandled exception I'm not opposed to the improvements you're proposing, but I do want to point out that none of them would allow us to use these asserts, because we definitely need the ability to enable and disable individual asserts. 
(Understand that what we've developed is actually rather different from the C concept of asserts...) I'm not saying that's necessarily bad, but there is an interesting point here: different environments might have radically different needs for dealing with asserts that fail. What we *could* make use of would be asserts that are extremely fast when the assert passes but then allow us to do whatever we want when an assert fails (including possibly ignoring the fact that the assert failed). Of course, if the community wanted to just accept the API and functionality we've developed I'd be fine with that too... ;P -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Assertions in PL/PgSQL
On Fri, Sep 20, 2013 at 5:17 AM, Marko Tiikkaja ma...@joh.to wrote: On 9/20/13 12:09 PM, Amit Khandekar wrote: On 16 September 2013 03:43, Marko Tiikkaja ma...@joh.to wrote: I think it would be extremely surprising if a command like that got optimized away based on a GUC, so I don't think that would be a good idea. In pl_gram.y, in the rule stmt_raise, determine that this RAISE is for ASSERT, and then return NULL if plpgsql_curr_compile->enable_assertions is false. Isn't this possible? Of course it's possible. But I, as a PostgreSQL user writing PL/PgSQL code, would be extremely surprised if this new cool option to RAISE didn't work for some reason. If we use ASSERT the situation is different; most people will realize it's a new command and works differently from RAISE. What about just adding a WHEN clause to the RAISE statement and using the level machinery (client_min_messages) to make it appear or not? Of course, this has the disadvantage that an EXCEPTION level will always happen... Or you can make it a new loglevel that means EXCEPTION if asserts_enabled. -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
I think the point here is that, at least as I understand it, encoding conversion and sanitization happens at a very early stage right now, when we first receive the input from the client. If the user sends a string of bytes as part of a query or bind placeholder that's not valid in the database encoding, it's going to error out before any type-specific code has an opportunity to get control. Look at textin(), for example. There's no encoding check there. That means it's already been done at that point. To make this work, someone's going to have to figure out what to do about *that*. Until we have a sketch of what the design for that looks like, I don't see how we can credibly entertain more specific proposals. I don't think the bind placeholder case is a problem. That is processed by exec_bind_message() in postgres.c. It has enough info about the type of the placeholder, and I think we can easily deal with NCHAR. The same can be said of the COPY case. The problem is an ordinary query (simple protocol Q message), as you pointed out. Encoding conversion happens at a very early stage (note that the fast-path case has the same issue). If a query message contains, say, SHIFT-JIS and EUC-JP, then we are in trouble because the encoding conversion routine (pg_client_to_server) assumes that the message from the client contains only one encoding. However, my question is, does it really happen? I doubt there's any text editor which can create SHIFT-JIS and EUC-JP mixed text. So my guess is, when a user wants to use NCHAR for SHIFT-JIS text, the rest of the query consists of either SHIFT-JIS or plain ASCII. If so, what the user needs to do is set the client encoding to SHIFT-JIS and everything should be fine. Maumau, is my guess correct? -- Tatsuo Ishii SRA OSS, Inc. 
Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] VMs for Reviewers Available
3. Preference for CentOS or Ubuntu. Windows VMs are also available, but I don't have the ability to preconfigure them with tools. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Hi Stephen, On Fri, Sep 20, 2013 at 6:55 PM, Stephen Frost sfr...@snowman.net wrote: I'm not sure I follow this completely- you're saying that a definition of 'upsert' which includes having to lock rows which aren't in your current snapshot (for reasons stated) isn't a useful one. Is the implication that a useful definition of 'upsert' is that it *doesn't* have to lock rows which aren't in your current snapshot, and if so, then what would the semantics of that upsert look like? No, I'm suggesting that the useful semantics are that it does potentially lock rows not yet visible to our snapshot that have committed - the latest row version. I see no alternative (we can't throw a serialization failure at read committed isolation level), and Andres seemed to agree that this was the way forward. Robert described problems he saw with this a few years ago [1]. It *is* a problem (we need to think very carefully about it), but, as I've said, it is a problem that anyone implementing this feature for a Snapshot Isolation/MVCC database would have to deal with, and several have. So, what the patch does right now is (if you squint) analogous to how SELECT FOR UPDATE uses EvalPlanQual already. However, instead of re-verifying a qual, we're re-verifying that the value locking has identified the right tid (there will probably be a different one in the subsequent iteration, or maybe we *can* go insert this time). We need consensus across unique indexes to go ahead with insertion, but once we know that we can't (and have a tid to lock), value locks can go away - we'll know if anything has changed about the tid's logical row that we need to care about when row locking. Besides, holding value locks while row locking has deadlock hazards, and, because value locks only stop insertions *finishing*, holding on to them is at best pointless. 
The tid we get from locking, that points to a would-be duplicate heap tuple has always committed - otherwise we'd never return from locking, because that blocks pending the outcome of a duplicate-inserting-xact (and only returns the tid when that xact commits). Even though this tuple is known to be visible, it may be deleted in the interim before row locking, in which case restarting from before value locking is appropriate. It might also be updated, which would necessitate locking a later row version in order to prevent race conditions. But it seems dangerous, invasive, and maybe even generally impossible to try and wait for the transaction that updated to commit or abort so that we can lock that later version the usual way (the usual EvalPlanQual looping thing) - better to restart value locking. The fundamental observation about value locking (at least for any half-way reasonable implementation), that I'd like to emphasize, is that short of a radical overhaul that would have many downsides, it can only ever prevent insertion from *finishing*. The big picture of my design is that it tries to quickly grab value locks, release them and grab a row lock (or insert heap tuples, index tuples, and then release value locks). If row locking fails, it waits for the conflicter xact to finish, and then restarts before the value locking of the current slot. If you think that's kind of questionable, maybe you have a point, but consider: 1. How else are you going to handle it if row locking needs to handle conflicts? You might say I can re-verify that no unique index columns were affected instead, and maybe you can, but what if that doesn't help because they *were* changed? Besides, doesn't this break the amcanunique contract? Surely judging what's really a duplicate is the AM's job. You're back to I need to throw an error to get out of this but I have no good excuse to do so at read committed -- you've lost the usual duplicate key error excuse. 
I don't think you can expect holding the value locks throughout row locking to help, because, as I've said, that causes morally indefensible deadlocks, and besides, it doesn't stop what row locking would consider to be a conflict, it just stops insertion from *finishing*. 2. In the existing btree index insertion code, the order that retries occur in the event of unique index tuple insertion finding an unfinished conflicting xact *is* undefined. Yes, that's right - the first waiter is not guaranteed to be the first to get a second chance. It's not even particularly probable! See remarks from my last mail to Robert for more information. 3. People with a real purist's view on the (re)ordering of value locking must already think that EvalPlanQual() is completely ghetto for very similar reasons, and as such should just go use a higher isolation level. For the rest of us, what concurrency control anomaly can allowing this cause over and above what's already possible there? Are lock starvation risks actually appreciably raised at all? What Andres and Robert seem to expect generally - that value locks only be released when the locker has a definitive answer
Re: [HACKERS] VMs for Reviewers Available
On Sat, Sep 21, 2013 at 3:35 PM, Josh Berkus j...@agliodbs.com wrote: Windows VMs are also available, but I don't have the ability to preconfigure them with tools. Wasn't there an EC2 image doing the rounds that Magnus created, that took care of all of that for you? http://blog.hagander.net/archives/151-Testing-PostgreSQL-patches-on-Windows-using-Amazon-EC2.html I'm not sure how current it is - was this before or after we started shipping our own WinFlex? Magnus? -- Peter Geoghegan
Re: [HACKERS] VMs for Reviewers Available
On 09/21/2013 06:48 PM, Peter Geoghegan wrote: On Sat, Sep 21, 2013 at 3:35 PM, Josh Berkus j...@agliodbs.com wrote: Windows VMs are also available, but I don't have the ability to preconfigure them with tools. Wasn't there an EC2 image doing the rounds that Magnus created, that took care of all of that for you? http://blog.hagander.net/archives/151-Testing-PostgreSQL-patches-on-Windows-using-Amazon-EC2.html I'm not sure how current it is - was this before or after we started shipping our own WinFlex? Magnus? There is really no longer any need to use our WinFlex. Just use the flex that comes with MSysGit - it works perfectly well, and you only need flex if you're building from git anyway. cheers andrew
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sun, Sep 15, 2013 at 8:23 AM, Andres Freund and...@2ndquadrant.com wrote: I'll find it very difficult to accept any implementation that is going to bloat things even worse than our upsert looping example. How would any even halfway sensible example cause *more* bloat than the upsert looping thing? I was away in Chicago over the week, and didn't get to answer this. Sorry about that. In the average/uncontended case, the subxact example bloats less than all alternatives to my design proposed to date (including the unborn heap tuple idea Robert mentioned in passing to me in person the other day, which I think is somewhat similar to a suggestion of Heikki's [1]). The average case is very important, because in general contention usually doesn't happen. But you need to also appreciate that because of the way row locking works and the guarantees value locking makes, any ON DUPLICATE KEY LOCK FOR UPDATE implementation is going to have to potentially restart in more places (as compared to the doc's example), maybe including value locking of each unique index and certainly including row locking. So the contended case might even be worse as well. On average, it is quite likely that either the UPDATE or INSERT will succeed - there has to be some concurrent activity around the same values for either to fail, and in general that's quite unlikely. If the UPDATE doesn't succeed, it won't bloat, and it's then very likely that the INSERT at the end of the loop will go ahead and succeed without itself creating bloat. Going forward with this discussion, I would like us all to take as read that the buffer locking stuff is a prototype approach to value locking, to be refined later (subject to my basic design being judged fundamentally sound). 
I don't think anyone believes that it's fundamentally incorrect in that it doesn't do something that it claims to do (concerns are more around what it might do or prevent that it shouldn't), and it can still drive discussion in a very useful direction. So far criticism of this patch has almost entirely been on aspects of buffer locking, but it would be much more useful for the time being to simply assume that the buffer locks *are* interruptible. It's probably okay with me to still be a bit suspicious of deadlocking, though, because if we refine the buffer locking using a more granular SLRU value locking approach, that doesn't necessarily guarantee that it's impossible, even if it does (I guess) prevent undesirable interactions with other buffer locking. [1] http://www.postgresql.org/message-id/45e845c4.6030...@enterprisedb.com -- Peter Geoghegan
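For reference, the "upsert looping example" compared against throughout this thread is the subtransaction-based pattern from the PL/pgSQL documentation, roughly as follows (table and function names as in the docs' example):

```sql
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- First try to update the key.
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- Not there, so try to insert the key.  If someone else
        -- inserts the same key concurrently, we get a unique-key
        -- failure and loop back to the UPDATE.
        BEGIN
            INSERT INTO db(a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing; loop to try the UPDATE again.
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
```

Each trip through the EXCEPTION block starts a subtransaction, and an aborted INSERT leaves behind a dead heap tuple - that is the bloat being compared here.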
Re: [HACKERS] logical changeset generation v6
On 09/20/2013 06:33 AM, Andres Freund wrote: Hi, The points I find daunting are the semantics, like: * How do we control whether a standby is allowed to prevent WAL file removal? What if archiving is configured? * How do we control whether a standby is allowed to peg xmin? * How long do we peg an xmin/prevent WAL file removal if the standby is gone? * How does the user interface look to remove a slot if a standby is gone? * How do we decide/control which commands use a slot in which cases? I think we are going to want to be flexible enough to support users with a couple of different use-cases. * Some people will want to keep xmin pegged and prevent WAL removal so a standby with a slot can always catch up. * Most people will want to say keep X megabytes of WAL (if needed by a lagging slot) and keep xmin pegged so that the WAL can be consumed by a logical plugin. I can see us also implementing a restore_command that the walsender could use to get archived segments, but for logical replication xmin would still need to be low enough. I don't think the current patch set is incompatible with us later implementing any of the above. I'd rather see us focus on getting the core functionality committed and worry about a good interface for managing slots later. Greetings, Andres Freund Steve
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 20, 2013 at 5:48 PM, Peter Geoghegan p...@heroku.com wrote: ProcLockWakeup() only wakes as many waiters from the head of the queue as can all be granted the lock without any conflicts. So I don't think there is a race condition in that path. Right, but what about XactLockTableWait() itself? It only acquires a ShareLock on the xid of the got-there-first inserter that potentially hasn't yet committed/aborted. There will be no conflicts between multiple second-chance-seeking blockers trying to acquire this lock concurrently, and so in fact there is (what I guess you'd consider to be) a race condition in the current btree insertion code. I should add: README.tuplock says the following: The protocol for waiting for a tuple-level lock is really:

    LockTuple()
    XactLockTableWait()
    mark tuple as locked by me
    UnlockTuple()

When there are multiple waiters, arbitration of who is to get the lock next is provided by LockTuple(). So because this isn't a tuple-level lock - it's really a value-level lock - LockTuple() is not called by the btree code at all, and so arbitration of who gets the lock is, as I've said, essentially undefined. -- Peter Geoghegan
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sat, Sep 21, 2013 at 7:22 PM, Peter Geoghegan p...@heroku.com wrote: So because this isn't a tuple-level lock - it's really a value-level lock - LockTuple() is not called by the btree code at all, and so arbitration of who gets the lock is, as I've said, essentially undefined. Addendum: It isn't even a value-level lock, because the buffer locks are of course released before the XactLockTableWait() call. It's a simple attempt to acquire a shared lock on an xid. -- Peter Geoghegan
Re: [HACKERS] VMs for Reviewers Available
On 21/09/2013 18:09, Andrew Dunstan and...@dunslane.net wrote: On 09/21/2013 06:48 PM, Peter Geoghegan wrote: On Sat, Sep 21, 2013 at 3:35 PM, Josh Berkus j...@agliodbs.com wrote: Windows VMs are also available, but I don't have the ability to preconfigure them with tools. Wasn't there an EC2 image doing the rounds that Magnus created, that took care of all of that for you? http://blog.hagander.net/archives/151-Testing-PostgreSQL-patches-on-Windows-using-Amazon-EC2.html I'm not sure how current it is - was this before or after we started shipping our own WinFlex? Magnus? There is really no longer any need to use our WinFlex. Just use the flex that comes with MSysGit - it works perfectly well, and you only need flex if you're building from git anyway. If you're reviewing patches you're probably compiling from git. Do those Windows machines come with MSys (and flex)? Our reviewers don't always know how to install those tools. -- Jaime Casanova 2ndQuadrant: Your PostgreSQL partner
Re: [HACKERS] Assertions in PL/PgSQL
On 21/09/2013 17:16, Jaime Casanova ja...@2ndquadrant.com wrote: On Fri, Sep 20, 2013 at 5:17 AM, Marko Tiikkaja ma...@joh.to wrote: On 9/20/13 12:09 PM, Amit Khandekar wrote: On 16 September 2013 03:43, Marko Tiikkaja ma...@joh.to wrote: I think it would be extremely surprising if a command like that got optimized away based on a GUC, so I don't think that would be a good idea. In pl_gram.y, in the rule stmt_raise, determine that this RAISE is for ASSERT, and then return NULL if plpgsql_curr_compile->enable_assertions is false. Isn't this possible? Of course it's possible. But I, as a PostgreSQL user writing PL/PgSQL code, would be extremely surprised if this new cool option to RAISE didn't work for some reason. If we use ASSERT the situation is different; most people will realize it's a new command and works differently from RAISE. What about just adding a WHEN clause to the RAISE statement and using the level machinery (client_min_messages) to make it appear or not? Of course, this has the disadvantage that an EXCEPTION level will always happen... or you could make it a new log level that means EXCEPTION if asserts are enabled, meaning RAISE ASSERT, of course. -- Jaime Casanova 2ndQuadrant: Your PostgreSQL partner
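To make the two proposals concrete, here are hypothetical syntax sketches (neither is existing PL/pgSQL syntax; the GUC name and keywords are illustrative assumptions, not from the thread):

```sql
-- Option 1: a dedicated ASSERT statement, which the PL/pgSQL compiler
-- could drop entirely when a GUC such as plpgsql.enable_assertions is off:
ASSERT balance >= 0, 'balance must not be negative';

-- Option 2: a WHEN clause on RAISE, relying on the existing message-level
-- machinery to decide whether the check fires at all:
RAISE ASSERT 'balance must not be negative' WHEN balance < 0;
```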
Re: [HACKERS] VMs for Reviewers Available
On 09/21/2013 10:48 PM, Jaime Casanova wrote: On 21/09/2013 18:09, Andrew Dunstan and...@dunslane.net wrote: On 09/21/2013 06:48 PM, Peter Geoghegan wrote: On Sat, Sep 21, 2013 at 3:35 PM, Josh Berkus j...@agliodbs.com wrote: Windows VMs are also available, but I don't have the ability to preconfigure them with tools. Wasn't there an EC2 image doing the rounds that Magnus created, that took care of all of that for you? http://blog.hagander.net/archives/151-Testing-PostgreSQL-patches-on-Windows-using-Amazon-EC2.html I'm not sure how current it is - was this before or after we started shipping our own WinFlex? Magnus? There is really no longer any need to use our WinFlex. Just use the flex that comes with MSysGit - it works perfectly well, and you only need flex if you're building from git anyway. If you're reviewing patches you're probably compiling from git. Do those Windows machines come with MSys (and flex)? Our reviewers don't always know how to install those tools. It's documented on the wiki - see http://wiki.postgresql.org/wiki/Building_With_MinGW cheers andrew
Re: [HACKERS] [PATCH] Add use of asprintf()
On Mon, 2013-09-16 at 17:31 -0300, Alvaro Herrera wrote: Looks good to me, except that pg_asprintf seems to be checking ret instead of rc. Ah, good catch! Is there a reason for the API discrepancy of pg_asprintf vs. psprintf? I don't see that we use the integer return value anywhere. Callers interested in the return value can use asprintf directly (and you have already inserted callers that do nonstandard things using direct asprintf). I wanted to keep pg_asprintf the same as asprintf. I think there is some value in that, but it's not supremely important.
Re: [HACKERS] [PATCH] Add use of asprintf()
On Tue, 2013-09-17 at 15:13 +0500, Asif Naeem wrote: 1. It seems that you have used strdup() in multiple places in the patch; e.g., in the code snippet below, is it going to lead to a crash if newp->ident is NULL because of strdup() failure?

static EPlan *
find_plan(char *ident, EPlan **eplan, int *nplans)
{
    ...
    newp->ident = strdup(ident);
    ...
}

The previous code used unchecked malloc, so this doesn't change anything. It's only example code anyway. (The use of malloc instead of palloc is dubious anyway.) 2. Can we rely on the return value of asprintf() instead of recomputing the length with strlen(cmdbuf)?

if (asprintf(&cmdbuf, "COMMENT ON LARGE OBJECT %u IS '", loid) < 0)
    return fail_lo_xact("\\lo_import", own_transaction);
bufptr = cmdbuf + strlen(cmdbuf);

Yes, good point. 3. There seems to be a naming convention for functions like pfree (which seems similar to free()), pstrdup, etc., but psprintf is named differently from sprintf. Can't we use pg_asprintf instead in the patch, as it seems that both (psprintf and pg_asprintf) provide almost the same functionality? pg_asprintf uses malloc, psprintf uses palloc, so they have to be different functions. The naming convention where psomething = something with memory context management, pg_something = something with error checking isn't very helpful, but it's what we have. Better ideas welcome. 5. It seems that in the previous implementation, error handling was not there; is it appropriate here that if there is an issue in duplicating the environment, we quit the application? I.e.:

/*
 * Handy subroutine for setting an environment variable var to val
 */
static void
doputenv(const char *var, const char *val)
{
    char *s;

    pg_asprintf(&s, "%s=%s", var, val);
    putenv(s);
}

I think so, yes.
Re: [HACKERS] Extensions makefiles - coverage
On Thu, 2013-07-25 at 17:07 +0200, Ronan Dunklau wrote: I am using approximately the layout that was proposed here: http://www.postgresql.org/message-id/51bb1b6e.2070...@dunslane.net It looks like everything is hard-coded to take the source and the gcda, gcno files in the base directory, but these files live in a src directory with the proposed layout. The PostgreSQL build system isn't going to work very well if you build files outside of the current directory. If you want to put your source files into a src/ subdirectory, then your top-level makefile should do a $(MAKE) -C src, and you need to have a second makefile in the src directory. If you do that, then the existing coverage targets will work all right, I think.
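A minimal sketch of that two-makefile layout with PGXS (the extension name myext and the target list are illustrative assumptions, not from the thread):

```makefile
# ---- Makefile (top level): just delegate everything to src/ ----
all install installcheck coverage clean:
	$(MAKE) -C src $@

# ---- src/Makefile: an ordinary PGXS makefile, run from inside src/ ----
MODULES = myext                  # hypothetical module name
EXTENSION = myext
DATA = myext--1.0.sql

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```

With this arrangement the compiler runs inside src/, so the .gcda/.gcno files land next to the sources there, and the existing PGXS coverage targets operate in the directory where those files actually live.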