Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Thu, Jun 5, 2014 at 5:37 PM, Peter Geoghegan wrote:
> One thing that isn't all that obvious about this worst case is that
> it's in general very qsort() friendly, and therefore the startup costs
> (copying) totally dominates. Actually, you're not even sorting -
> you're verifying that the tuples are already exactly in order (a
> questionable optimization we apply at every level).

Kevin mentioned something about the Wisconsin courts having columns that all began with "The State of Wisconsin Vs." in the dev meeting in Ottawa. I thought that this was an interesting case, because it is representative of reality, which is crucially important to consider here. I decided to simulate it. In my original test database:

postgres=# create table wisconsin(casen text);
CREATE TABLE
postgres=# insert into wisconsin select 'The State of Wisconsin Vs. ' || city from cities;
INSERT 0 317102

sort-wisconsin.sql: select * from (select * from wisconsin order by casen offset 100) d;

Master:

pgbench -M prepared -f sort-wisconsin.sql -T 300 -n
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 300 s
number of transactions actually processed: 55
latency average: 5454.545 ms
tps = 0.181191 (including connections establishing)
tps = 0.181191 (excluding connections establishing)

Patch (most recent revision, with ameliorations, HyperLogLog, etc.):

pgbench -M prepared -f sort-wisconsin.sql -T 300 -n
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 300 s
number of transactions actually processed: 55
latency average: 5454.545 ms
tps = 0.182593 (including connections establishing)
tps = 0.182594 (excluding connections establishing)

Earlier patch (no ameliorations for Heikki's case):

pgbench -M prepared -f sort-wisconsin.sql -T 300 -n
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 300 s
number of transactions actually processed: 54
latency average: 5555.556 ms
tps = 0.176914 (including connections establishing)
tps = 0.176915 (excluding connections establishing)

With my most recent revision, the ameliorating measures are effective enough that, with the sortsupport shim and fmgr trampoline avoided, we still come out ahead even for this case. Great. But you may be surprised that the regression is so small in the case of the patch without any ameliorating measures (the original patch). That's because the data isn't *perfectly* logically/physically correlated here, as in Heikki's worst case, so the 317,102 wasted strxfrm() calls are relatively inexpensive.

Consider how cost_sort() models the cost of a sort when an in-memory quicksort is anticipated:

    /* We'll use plain quicksort on all the input tuples */
    startup_cost += comparison_cost * tuples * LOG2(tuples);

In the case of this quicksort, the planner guesses there'll be "317102 * LOG2(317102)" comparisons -- about 5,794,908 comparisons, which implies over 10 times as many strcoll() calls as wasted strxfrm() calls. The cost of those strxfrm() calls begins to look insignificant before n gets too big (at n = 100, it's 100 wasted strxfrm() calls to about 664 strcoll() calls). Unless, of course, you have a "bubble sort best case" where everything is already completely in order, in which case there'll be a 1:1 ratio between wasted strxfrm() calls and strcoll() calls. This optimization was something that we added to our qsort().
It doesn't appear in the original NetBSD implementation, it doesn't appear in the Bentley/McIlroy paper, and it doesn't appear anywhere else that I'm aware of. I'm not the only person to regard it with suspicion - Tom has in the past expressed doubts about it too [1]. Also, note that no comparison-based sorting algorithm can do better than O(n log n) in the average case - that's the information-theoretical lower bound on the average-case speed of any comparison sort.

To be clear: I'm certainly not saying that we shouldn't fix Heikki's worst case - indeed, I believe I have - but we should also put this worst case in perspective.

By the way, I just realized that I failed to fully remove client overhead (I should have put an extra 0 on the end of my offset for the city.sql query), which added noise to the "city"/"country"/"province" tests. Revised figures are as follows (these are better than before):

Master:
==

pgbench -M prepared -f sort-city.sql -T 300 -n
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 300 s
number of transactions actually processed: 278
latency average: 1079.137 ms
tps = 0.924358 (including connections establishing)
tps = 0.924362 (excluding connections establishing)

Patch:
=

pgbench -M prepared -f sort-city.sql -T 300 -n
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of thr
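Incidentally, the comparison arithmetic in the cost_sort() discussion above is easy to sanity-check straight from psql (this is plain arithmetic, not part of any patch):

-- n * log2(n) expected comparisons for the in-memory quicksort,
-- versus n wasted strxfrm() calls; the ratio is simply log2(n).
SELECT 317102 * log(2, 317102) AS expected_comparisons,        -- ~5,794,908
       log(2, 317102)          AS comparisons_per_wasted_call; -- ~18.3

So at n = 317,102 each wasted strxfrm() call buys roughly eighteen strcoll() comparisons, which is why the unameliorated patch regresses so little here.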
Re: [HACKERS] Why is it "JSQuery"?
On 06/06/2014 03:23 PM, David E. Wheeler wrote:
> On Jun 6, 2014, at 12:51 PM, Josh Berkus wrote:
>> Well, I'd also say that we don't care about syntaxes which are not
>> already popular. There's no point in being compatible with something
>> nobody uses. How many of the above have any uptake?
>
> I think there is JQuery, JSONPath, and everything else, really. If we can
> draw some parallels, I think that would be sufficient to make people
> comfortable.

Well, then those are the only ones worth considering.

>>> I do think that the name should be changed if we don’t follow an existing
>>> standard, as [JSQuery](https://code.google.com/p/gwtquery/wiki/JsQuery) is
>>> already a thing.
>>
>> I saw that too, but I don't get the impression that Google jsquery is
>> all that active. No?
>
> It’s Google. You really want to wrangle with their attorneys?

Google is not going to sue us over a minor OSS project which isn't a commercial product. The relevant question is: are users liable to confuse our jsquery with Google jsquery?

Maybe we should call it "jsonesque" ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Why is it "JSQuery"?
On Jun 6, 2014, at 12:51 PM, Josh Berkus wrote:
>> * [JAQL](https://code.google.com/p/jaql/): Too different from SQL
>> * [JSONPath](http://goessner.net/articles/JsonPath/): Too verbose
>
> I don't agree with the too verbose, but lacking AND|OR is pretty crippling.

I had enough people complain about Test::XPath, which tests the structure of XML and HTML documents using XPath. They didn't like how verbose XPath was, preferring CSS selectors. So I ended up with a patch to support CSS syntax, too. CSS-style syntax is part of what people like about JQuery, too.

> Well, I'd also say that we don't care about syntaxes which are not
> already popular. There's no point in being compatible with something
> nobody uses. How many of the above have any uptake?

I think there is JQuery, JSONPath, and everything else, really. If we can draw some parallels, I think that would be sufficient to make people comfortable.

>> I do think that the name should be changed if we don’t follow an existing
>> standard, as [JSQuery](https://code.google.com/p/gwtquery/wiki/JsQuery) is
>> already a thing.
>
> I saw that too, but I don't get the impression that Google jsquery is
> all that active. No?

It’s Google. You really want to wrangle with their attorneys?

David
Re: [HACKERS] Suppressing unused subquery output columns
On 6/5/14, 9:54 PM, Tom Lane wrote:
> Rod Taylor writes:
>> On Thu, Jun 5, 2014 at 10:27 PM, Tom Lane wrote:
>>> I'm not entirely convinced that it's worth the extra planning cycles,
>>> though. Given the small number of complaints to date, it might not be
>>> worth doing this. Thoughts?
>>
>> Would this avoid execution of expensive functions in views when their
>> output is discarded?
>
> Yes, as long as they're not marked volatile and don't return sets.

That would certainly make it useful for us.

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
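To picture the kind of case being discussed, here's a minimal sketch (the table, function, and view names are made up for illustration):

CREATE TABLE t (id int);

-- Stand-in for genuinely expensive work; declared STABLE (not VOLATILE)
-- and scalar-returning, so it qualifies per Tom's caveat above.
CREATE FUNCTION expensive(i int) RETURNS int
LANGUAGE plpgsql STABLE AS $$
BEGIN
    PERFORM pg_sleep(0.1);  -- simulate costly computation
    RETURN i * 2;
END
$$;

CREATE VIEW v AS SELECT id, expensive(id) AS costly FROM t;

-- The view's "costly" output column is discarded here; with the
-- proposed change the planner could skip evaluating expensive():
SELECT id FROM v;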
Re: [HACKERS] Inaccuracy in VACUUM's tuple count estimates
On 2014-06-06 15:44:25 -0400, Tom Lane wrote:
> I figured it'd be easy enough to get a better estimate by adding another
> counter to count just LIVE and INSERT_IN_PROGRESS tuples (thus effectively
> assuming that in-progress inserts and deletes will both commit). I did
> that, and found that it helped Tim's test case not at all :-(. A bit of
> sleuthing revealed that HeapTupleSatisfiesVacuum actually returns
> INSERT_IN_PROGRESS for any tuple whose xmin isn't committed, regardless of
> whether the transaction has since marked it for deletion:
>
>     /*
>      * It'd be possible to discern between INSERT/DELETE in progress
>      * here by looking at xmax - but that doesn't seem beneficial for
>      * the majority of callers and even detrimental for some. We'd
>      * rather have callers look at/wait for xmin than xmax. It's
>      * always correct to return INSERT_IN_PROGRESS because that's
>      * what's happening from the view of other backends.
>      */
>     return HEAPTUPLE_INSERT_IN_PROGRESS;

That's only been the case for a couple of days. I really wasn't sure whether to go that way or to discern the two cases. That changed in the wake of:
http://www.postgresql.org/message-id/20140530143150.GA11051@localhost

I tried to solicit feedback (e.g. by CCing you :)) but I mostly failed. Alvaro agreed, on IM, that it's better this way.

> It did not use to blow this question off: back around 8.3 you got
> DELETE_IN_PROGRESS if the tuple had a delete pending. I think we need
> less laziness + fuzzy thinking here.

My argument for not discerning wasn't that it's hard to do, but that it might confuse callers more the other way round. E.g. doing a XactLockTableWait(xmax) might not be sufficient for the tuple to be alive.

> Maybe we should have a separate
> HEAPTUPLE_INSERT_AND_DELETE_IN_PROGRESS result code?

Maybe.

> Is it *really*
> the case that callers other than VACUUM itself are okay with failing
> to make this distinction? I'm dubious: there are very few if any
> callers that treat the INSERT and DELETE cases exactly alike.

I looked through all of them and saw none that'd be problematic. And for some, like predicate.c, the new behaviour seems to be better. Most of the ones that care about INSERT/DELETE_IN_PROGRESS wait on xmin/xmax respectively.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why is it "JSQuery"?
On 06/06/2014 09:12 AM, David E. Wheeler wrote:
> On Jun 6, 2014, at 6:54 AM, Oleg Bartunov wrote:
>
>> Jsquery - is QUERY language, JsonPath - is language to EXTRACT json parts.
>
> Sure, but could we not potentially build on its syntax, instead of building a
> new one? I’m not saying we *should*, but if we don’t, I think there should be
> a discussion about why not. For example, I think it would not be a good idea
> to follow [JSONiq](http://www.jsoniq.org/) because who wants to write queries
> in JSON? (Have we learned nothing from XSLT?).
>
> Here’s a (partial) list of existing JSON query languages:
>
>   http://stackoverflow.com/a/7812073/79202
>
> The arguments might be:
>
> * [JSONiq](http://jsoniq.org/): Queries in JSON? Gross!

Also overly complex for the functionality we support. There's also no way to make the jsquery strings valid JSON without adding a bunch of extra text.

> * [UNQL](http://www.unqlspec.org/): Too similar to SQL

... also intended to be a *complete* replacement for SQL, whereas we just want a syntax to search JSON fields.

> * [JAQL](https://code.google.com/p/jaql/): Too different from SQL
> * [JSONPath](http://goessner.net/articles/JsonPath/): Too verbose

I don't agree with the too verbose, but lacking AND|OR is pretty crippling.

> * [JSON Query](https://github.com/mmckegg/json-query): Too little there
> * [Mongo](http://www.mongodb.org/display/DOCS/Inserting#Inserting-JSON): Gross syntax
> * [LINQ](http://james.newtonking.com/archive/2008/03/02/json-net-2-0-beta-2): Too similar to SQL
> * [searchjs](https://github.com/deitch/searchjs): Queries in JSON? Gross!
> * [JQuery](http://jquery.org/): It's for HTML, not JSON
> * [SpahQL](http://danski.github.io/spahql/): More like XPath
> * [ObjectPath](http://adriank.github.io/ObjectPath/): Too verbose
> * [JFunk](https://code.google.com/p/jfunk/): XPathy
> * [JData](http://jaydata.org): Queries in JavaScript? C’mon.
>
> These are just off-the-cuff evaluations in 10 minutes of looking -- surely
> not all of them are accurate. Some of them maybe *are* useful to emulate.
> It’s definitely worthwhile, IMHO, to evaluate prior art and decide what, if
> any of it, should inspire the JSQuery syntax, and there should be reasons why
> and why not.

Well, I'd also say that we don't care about syntaxes which are not already popular. There's no point in being compatible with something nobody uses. How many of the above have any uptake?

Also, the explosion of query languages in this area is not an encouraging sign for us being able to pick the "right" one. UUID-OSSP, anyone?

So the advantage of the current "jsquery" syntax is that it's similar to tsquery, which already has some adoption in our userbase. On the other hand, I'm not sure how many people actually understand the tsquery syntax, and jsquery will be different enough to trip people up.

> I do think that the name should be changed if we don’t follow an existing
> standard, as [JSQuery](https://code.google.com/p/gwtquery/wiki/JsQuery) is
> already a thing.

I saw that too, but I don't get the impression that Google jsquery is all that active. No?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
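For readers who haven't seen the two side by side, the parallel looks roughly like this (the jsquery line follows the syntax in the extension's own examples as I understand it; treat it as illustrative, and note jsquery is an extension, not core):

-- tsquery: boolean operators over text-search lexemes
SELECT to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');

-- jsquery (proposed): comparable boolean/path operators over jsonb
SELECT '{"a": {"b": 1}}'::jsonb @@ 'a.b = 1'::jsquery;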
[HACKERS] Inaccuracy in VACUUM's tuple count estimates
I've been looking at the complaint Tim Wilson posted in pgsql-performance about badly inaccurate reltuples updates coming from VACUUM. There seem to be a number of problems leading to that.

The key point is that when VACUUM has scanned only part of the relation, it assumes that the live-tuple density in that part of the relation is num_tuples (I'm speaking of the counter accumulated in lazy_scan_heap) divided by scanned_pages, and then it tries to extrapolate that information to the rest of the relation. Now, the validity of that extrapolation is a bit questionable given that VACUUM is considering a highly nonrandom subset of the table's pages, but the real problem is the values are wrong even for the pages we did look at. To wit:

* scanned_pages is not reliably the number of pages we scanned, because somebody thought it would be cute to bump it even for pages we decided didn't need to be scanned because they contain no freezable tuples. So we have an increment in scanned_pages, but no corresponding increment in the tuple count, leading to a density underestimate. This seems to only happen in vacuum-for-wraparound cases, but it's still wrong. We need to separate the logic about whether we skipped any pages from the statistical counters.

* num_tuples has very little to do with the number of live tuples, because it actually counts all nonremovable tuples, including RECENTLY_DEAD, INSERT_IN_PROGRESS, and DELETE_IN_PROGRESS tuples. In the case Tim is complaining about, the VACUUM happens concurrently with a long transaction that is bulk-updating most tuples in the relation, some of them several times, so that VACUUM sees multiple images of every tuple (one INSERT_IN_PROGRESS, the rest DELETE_IN_PROGRESS), and thus arrives at a horrid overestimate of the number of live tuples.

I figured it'd be easy enough to get a better estimate by adding another counter to count just LIVE and INSERT_IN_PROGRESS tuples (thus effectively assuming that in-progress inserts and deletes will both commit). I did that, and found that it helped Tim's test case not at all :-(. A bit of sleuthing revealed that HeapTupleSatisfiesVacuum actually returns INSERT_IN_PROGRESS for any tuple whose xmin isn't committed, regardless of whether the transaction has since marked it for deletion:

    /*
     * It'd be possible to discern between INSERT/DELETE in progress
     * here by looking at xmax - but that doesn't seem beneficial for
     * the majority of callers and even detrimental for some. We'd
     * rather have callers look at/wait for xmin than xmax. It's
     * always correct to return INSERT_IN_PROGRESS because that's
     * what's happening from the view of other backends.
     */
    return HEAPTUPLE_INSERT_IN_PROGRESS;

It did not use to blow this question off: back around 8.3 you got DELETE_IN_PROGRESS if the tuple had a delete pending. I think we need less laziness + fuzzy thinking here. Maybe we should have a separate HEAPTUPLE_INSERT_AND_DELETE_IN_PROGRESS result code? Is it *really* the case that callers other than VACUUM itself are okay with failing to make this distinction? I'm dubious: there are very few if any callers that treat the INSERT and DELETE cases exactly alike.

			regards, tom lane
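For concreteness, the shape of Tim's scenario can be reproduced in two psql sessions like so (the table is made up for illustration):

-- setup, once:
CREATE TABLE big AS SELECT g AS x FROM generate_series(1, 100000) g;

-- Session 1: a long bulk-updating transaction, left open.
BEGIN;
UPDATE big SET x = x + 1;  -- prior versions now have a pending delete
UPDATE big SET x = x + 1;  -- piles up yet more in-progress versions

-- Session 2, while session 1 is still open: every version of every
-- row is nonremovable (INSERT_ or DELETE_IN_PROGRESS), so num_tuples
-- counts roughly 3x the real row count.
VACUUM VERBOSE big;
SELECT reltuples FROM pg_class WHERE relname = 'big';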
Re: [HACKERS] Proposing pg_hibernate
On 6/4/14, 8:56 AM, Andres Freund wrote:
> On 2014-06-04 09:51:36 -0400, Robert Haas wrote:
>> On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund wrote:
>>> On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
>>>> Incase of recovery, the shared buffers saved by this utility are
>>>> from previous shutdown which doesn't seem to be of more use
>>>> than buffers loaded by recovery.
>>>
>>> Why? The server might have been queried if it's a hot standby one?
>>
>> I think that's essentially the same point Amit is making. Gurjeet is
>> arguing for reloading the buffers from the previous shutdown at end of
>> recovery; IIUC, Amit, you, and I all think this isn't a good idea.
>
> I think I am actually arguing for Gurjeet's position. If the server is
> actively being queried (i.e. hot_standby=on and actually used for
> queries) it's quite reasonable to expect that shared_buffers has lots
> of content that is *not* determined by WAL replay.

Perhaps instead of trying to get data actually into shared buffers it would be better to just advise the kernel that we think we're going to need it? ISTM it's reasonably fast to pull data from disk cache into shared buffers.

On a related note, what I really wish for is the ability to restore the disk cache after a restart/unmount...

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
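As it happens, the pg_prewarm extension that is new in 9.4 already offers both flavors of this; a quick sketch (the relation name is made up):

CREATE EXTENSION pg_prewarm;

-- 'prefetch' merely advises the kernel (posix_fadvise, where available)
-- to pull the relation's blocks into the OS page cache, asynchronously:
SELECT pg_prewarm('some_table', 'prefetch');

-- 'buffer' (the default) actually reads the blocks into shared_buffers:
SELECT pg_prewarm('some_table', 'buffer');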
Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build
Bruce Momjian wrote:
> On Wed, May 28, 2014 at 12:29:28PM -0400, Tom Lane wrote:
>> Bruce Momjian writes:
>>> I think this is caused because the variable is not defined as SOCKET.
>>> The attached patch fixes this. This should prevent the warning.
>>
>> Surely that's just going to move the errors somewhere else. The call
>> site still expects the argument to be int[].
>
> Ah, yes, you are right. This is a similar problem I had with libpq
> where PQsocket() had to return an int.
>
> Attached is an updated patch which follows my previous coding of
> checking for PGINVALID_SOCKET, and if not equal, assigns the value to an
> integer handle. I would also like to rename variable 's' to
> 'listen_sock', but that is not in the patch, for clarity reasons.
>
> Should this be held for 9.5? I think it is only warning removal. On
> the other hand, portability is what we do during beta testing.

I think this should go in 9.4, but as you say it's only warning removal, so there is probably little point in patching further back (this code dates back to 9.3).

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_receivexlog add synchronous mode
On Fri, Jun 6, 2014 at 8:05 PM, wrote:
>> -----Original Message-----
>>>> Flush is not performed on every write; it is performed
>>>> collectively, like walreceiver.
>>>
>>> I only glanced at this, but afaics you're only flushing at the end of
>>> every WAL segment. That will result in absolutely horrible
>>> performance, right? Walreceiver does flush more frequently than
>>> that. It basically syncs every chunk of received WAL...
>>
>> IMO the completion of the write loop was the completion of received WAL.
>> And Walreceiver is the same.
>>
>> I confirm it about the flush position.
>
> As you say, Walreceiver does flush more frequently than that.

No. IIUC walreceiver does flush *less* frequently than what you implemented in pg_receivexlog. Your version of pg_receivexlog tries to flush every time it receives one WAL chunk. OTOH, walreceiver does flush only when there is no extra WAL chunk in the receive buffer. IOW, after writing a WAL chunk, if there is another WAL chunk that walreceiver can receive immediately, it postpones the flush until later.

> However, it seems difficult to apply the same way.

Why? ISTM that's not so difficult.

Regards,

--
Fujii Masao
Re: [HACKERS] Why is it "JSQuery"?
On Jun 6, 2014, at 6:54 AM, Oleg Bartunov wrote:

> Jsquery - is QUERY language, JsonPath - is language to EXTRACT json parts.

Sure, but could we not potentially build on its syntax, instead of building a new one? I’m not saying we *should*, but if we don’t, I think there should be a discussion about why not. For example, I think it would not be a good idea to follow [JSONiq](http://www.jsoniq.org/) because who wants to write queries in JSON? (Have we learned nothing from XSLT?)

Here’s a (partial) list of existing JSON query languages:

  http://stackoverflow.com/a/7812073/79202

The arguments might be:

* [JSONiq](http://jsoniq.org/): Queries in JSON? Gross!
* [UNQL](http://www.unqlspec.org/): Too similar to SQL
* [JAQL](https://code.google.com/p/jaql/): Too different from SQL
* [JSONPath](http://goessner.net/articles/JsonPath/): Too verbose
* [JSON Query](https://github.com/mmckegg/json-query): Too little there
* [Mongo](http://www.mongodb.org/display/DOCS/Inserting#Inserting-JSON): Gross syntax
* [LINQ](http://james.newtonking.com/archive/2008/03/02/json-net-2-0-beta-2): Too similar to SQL
* [searchjs](https://github.com/deitch/searchjs): Queries in JSON? Gross!
* [JQuery](http://jquery.org/): It's for HTML, not JSON
* [SpahQL](http://danski.github.io/spahql/): More like XPath
* [ObjectPath](http://adriank.github.io/ObjectPath/): Too verbose
* [JFunk](https://code.google.com/p/jfunk/): XPathy
* [JData](http://jaydata.org): Queries in JavaScript? C’mon.

These are just off-the-cuff evaluations in 10 minutes of looking -- surely not all of them are accurate. Some of them maybe *are* useful to emulate. It’s definitely worthwhile, IMHO, to evaluate prior art and decide what, if any of it, should inspire the JSQuery syntax, and there should be reasons why and why not.

I do think that the name should be changed if we don’t follow an existing standard, as [JSQuery](https://code.google.com/p/gwtquery/wiki/JsQuery) is already a thing.

Best,

David
Re: [HACKERS] Why is it "JSQuery"?
Jsquery - is QUERY language, JsonPath - is language to EXTRACT json parts.

On Fri, Jun 6, 2014 at 4:34 AM, David E. Wheeler wrote:
> On Jun 5, 2014, at 5:25 PM, Andrew Dunstan wrote:
>
>> My understanding is that it's meant to be analogous to tsquery.
>>
>> At first glance, JsonPath doesn't seem to support AND and OR operators,
>> which would make it rather less expressive than I gather JSQuery is
>> meant to be.
>
> Yes, but perhaps it could be a superset.
>
> I guess my real question is: Should it not be based on some existing
> dialect, preferably something in fairly wide use outside the Postgres
> community?
>
> Unless that something is awful, of course.
>
> David
Re: [HACKERS] Proposing pg_hibernate
On Thu, Jun 5, 2014 at 11:32 PM, Amit Kapila wrote:
> On Thu, Jun 5, 2014 at 5:39 PM, Gurjeet Singh wrote:
>> Case 2 also won't cause any buffer restores because the save-files are
>> created only on clean shutdowns; not on a crash or immediate shutdown.
>
> How do you ensure that buffers are saved only on clean shutdown?

Postmaster sends SIGTERM only in "smart" or "fast" shutdown requests.

> Buffer saver process itself can crash while saving or restoring
> buffers.

True. That may lead to a partial list of buffers being saved. The code in the Reader process tries hard to read only valid data, and punts at the first sight of data that doesn't make sense, or on an ERROR raised from a Postgres API call.

> IIUC on shutdown request, postmaster will send signal to BG Saver
> and BG Saver will save the buffers and then postmaster will send
> signal to checkpointer to shutdown. So before writing Checkpoint
> record, BG Saver can crash (it might have saved half the buffers)

Case handled as described above.

> or may BG saver saves buffers, but checkpointer crashes (due to
> power outage or any such thing).

Checkpointer process' crash seems to be irrelevant to Postgres Hibernator's workings.

I think you are trying to argue the wording in my claim "save-files are created only on clean shutdowns; not on a crash or immediate shutdown", by implying that a crash may occur at any time during and after the BufferSaver processing. I agree the wording can be improved. How about:

    ... save-files are created only when Postgres is requested to shut down
    in normal (smart or fast) modes.

Note that I am leaving out the mention of crash.

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/
EDB www.EnterpriseDB.com
Re: [HACKERS] wrapping in extended mode doesn't work well with default pager
Hello,

Where are we with this feature? Are there any barriers to committing the bugfix?

Regards

Pavel

2014-05-18 19:46 GMT+02:00 Sergey Muraviov:
> I found some new bugs and fixed them.
> And I had to make many changes.
>
> 2014-05-17 21:31 GMT+04:00 Greg Stark:
>> Sorry, a couple things still look to not be quite right.
>>
>> 1) The width of the table when linestyle=old-ascii and border=0 or
>> border=1 (and expanded=on and format=wrapped) seems to be off by one.
>>
>> 2) The hyphens following the RECORD NN are short by one.
>>
>> I'm surprised the last patch was so big since it sounded like a simple
>> off-by-one bug. It looks like you've removed the leading space on the
>> border=0 expanded case. I guess that makes sense but we should
>> probably stop making significant changes now and just focus on fixing
>> the off-by-one bugs.
>
> --
> Best regards,
> Sergey Muraviov
Re: [HACKERS] pg_receivexlog add synchronous mode
> -----Original Message-----
>>> Flush is not performed on every write; it is performed
>>> collectively, like walreceiver.
>>
>> I only glanced at this, but afaics you're only flushing at the end of
>> every WAL segment. That will result in absolutely horrible
>> performance, right? Walreceiver does flush more frequently than
>> that. It basically syncs every chunk of received WAL...
>
> IMO the completion of the write loop was the completion of received WAL.
> And Walreceiver is the same.
>
> I confirm it about the flush position.

As you say, Walreceiver does flush more frequently than that. However, it seems difficult to apply the same way. So, I have tried a different approach:

1. The select() timeout is set to 100 msec.
2. The flush check is performed on select() timeout.
3. write() is done only when flushing.

I think this may cause a problem, but I don't have a good idea to solve it. Can someone please advise me?

Regards,
--
Furuya Osamu

Attachment: pg_receivexlog-add-synchronous-mode-v2.patch
Re: [HACKERS] Proposing pg_hibernate
On Thu, Jun 5, 2014 at 11:32 PM, Amit Kapila wrote:
> Another thing is don't you want to handle SIGQUIT signal in bg saver?

I think bgworker_quickdie, registered in StartBackgroundWorker(), serves the purpose just fine.

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/
EDB www.EnterpriseDB.com
[HACKERS] "cancelling statement due to user request error" occurs but the transaction has committed.
Hi All,

When log_duration is true (or log_min_duration_statement >= 0), if a transaction that has internally been committed receives a SIGINT signal, then a query cancellation error is output. For example:

1. A query like a TRUNCATE is removing bigger table files.
2. The session receives a SIGINT signal.
3. A query cancellation error occurs.
4. But the query has committed.

e.g.)
---
naoya=# \d
        List of relations
 Schema | Name | Type  | Owner
--------+------+-------+-------
 public | hoge | table | naoya
(1 row)

naoya=# set log_duration=on;
SET
naoya=# select count(*) from hoge;
 count
-------
    10
(1 row)

naoya=# truncate hoge;
Cancel request sent
ERROR:  canceling statement due to user request
naoya=# select count(*) from hoge;
 count
-------
     0
(1 row)
---

This is because the ProcessInterrupts function is called by errfinish (in the query-duration ereport). I think this cancellation request must not interrupt an internally committed transaction, because clients may misunderstand it to mean "the transaction has rolled back".

Now, I tried to fix the problem:

--- postgresql-fe7337f/src/backend/utils/error/elog.c	2014-06-06 11:57:44.0 +0900
+++ postgresql-fe7337f.new/src/backend/utils/error/elog.c	2014-06-06 13:10:51.0 +0900
@@ -580,7 +580,8 @@
 	 * can stop a query emitting tons of notice or warning messages, even if
 	 * it's in a loop that otherwise fails to check for interrupts.
 	 */
-	CHECK_FOR_INTERRUPTS();
+	if (IsTransactionState())
+		CHECK_FOR_INTERRUPTS();
 }

With this change, when ereport is called at a non-error level and we are not in a transaction state, PostgreSQL never calls the ProcessInterrupts function from errfinish. But I am anxious about changing the errfinish function, because errfinish is called in many, many situations. Could you please confirm it?

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Innovators, Ltd.
E-Mail: anzai-na...@mxu.nes.nec.co.jp
---

Attachment: postgresql-fe7337f_elog.patch
Re: [HACKERS] Allowing join removals for more join types
On Fri, Jun 6, 2014 at 11:44 AM, Tom Lane wrote:
> Noah Misch writes:
>> On Thu, Jun 05, 2014 at 02:12:33AM +0200, Andres Freund wrote:
>>> A bit more crazy, but how about trying to plan joins with an added
>>> one-time qual that checks the size of the deferred trigger queue? Then
>>> we wouldn't even need special case plans.
>>
>> That, too, sounds promising to investigate.
>
> Not terribly. You can't actually do join removal in such a case, so it's
> not clear to me that there's much win to be had. The planner would be at
> a loss as to what cost to assign such a construct, either.
>
> Moreover, what happens if the trigger queue gets some entries after the
> query starts?

In the scripts below I've created a scenario (scenario 1) in which the inner query, which I've put in a trigger function, does see the referenced table before the RI triggers execute, so the query SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id) returns 1 row. This works, and I agree it's a problem that needs to be looked at in the patch.

I'm also trying to create the situation you describe, where the RI trigger queue gets added to during the query. I'm likely doing it wrong somehow, but I can't see what I'm doing wrong. Here are both scripts. I need help with scenario 2 to create the problem you describe; I can't get my version to give me any stale non-cascaded records.

-- Scenario 1: Outer command causes a foreign key trigger to be queued
--             and this results in a window of time where we have records
--             in the referencing table which don't yet exist in the
--             referenced table.

DROP TABLE IF EXISTS j1;
DROP TABLE IF EXISTS j2;
DROP TABLE IF EXISTS records_violating_fkey;

CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
CREATE TABLE j1 (
  id INT PRIMARY KEY,
  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
);

INSERT INTO j2 VALUES(10),(20);
INSERT INTO j1 VALUES(1,10),(2,20);

-- create a table to store records that 'violate' the fkey.
CREATE TABLE records_violating_fkey (j2_id INT NOT NULL);

CREATE OR REPLACE FUNCTION j1_update() RETURNS TRIGGER AS $$
BEGIN
  RAISE notice 'Trigger fired';
  INSERT INTO records_violating_fkey
    SELECT j2_id FROM j1 WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = j2.id);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER j1_update_trigger BEFORE UPDATE ON j2
  FOR EACH ROW EXECUTE PROCEDURE j1_update();

UPDATE j2 SET id = id+1;

-- returns 1 row.
SELECT * FROM records_violating_fkey;

-- Scenario 2: Inner command causes a foreign key trigger to be queued.

DROP TABLE IF EXISTS j1;
DROP TABLE IF EXISTS j2;

CREATE TABLE j2 (id INT NOT NULL PRIMARY KEY);
CREATE TABLE j1 (
  id INT PRIMARY KEY,
  j2_id INT NOT NULL REFERENCES j2 (id) MATCH FULL ON DELETE CASCADE ON UPDATE CASCADE
);

INSERT INTO j2 VALUES(10),(20);
INSERT INTO j1 VALUES(1,10),(2,20);

CREATE OR REPLACE FUNCTION update_j2(p_id int) RETURNS int AS $$
BEGIN
  RAISE NOTICE 'Updating j2 id = % to %', p_id, p_id + 1;
  UPDATE j2 SET id = id + 1 WHERE id = p_id;
  RETURN 1;
END;
$$ LANGUAGE plpgsql;

-- try and get some records to be returned by causing an update on the
-- record that is not the current record.
SELECT * FROM j1
WHERE NOT EXISTS(SELECT 1 FROM j2 WHERE j2_id = id)
  AND update_j2((SELECT MIN(j2_id) FROM j1 ij1 WHERE ij1.j2_id <> j1.j2_id)) = 1;

Regards

David Rowley