Re: [HACKERS] patch for new feature: Buffer Cache Hibernation
Hi, Sorry, I missed these messages because I didn't subscribe to this list. # I've just subscribed temporary I think that all the complexity with CRCs etc. is unlikely to lead anywhere too, and those two issues are not completely unrelated. The simplest, safest thing here is the right way to approach this, not the most complicated one, and a simpler format might add some flexibility here to reload more cache state too. The bottleneck on reloading the cache state is reading everything from disk. Trying to micro-optimize any other part of that is moving in the wrong direction to me. I doubt you'll ever measure a useful benefit that overcomes the expense of maintaining the code. And you seem to be moving to where someone can't restore cache state when they change shared_buffers. A simpler implementation might still work in that situation; reload until you run out of buffers if shared_buffers shrinks, reload until you're done with the original size. Yeah, I'm pretty well convinced this whole approach is a dead end. Priming the OS buffer cache seems way more useful. I also think saving the blocks to be read rather than the actual blocks makes a lot more sense. OK, there are two your suggestions here IIUC. # if not, please correct me. 1. restore buffer blocks based on buffer descriptors, not from the saved file. 2. support restoring cache state even if shared_buffers had changed. For 1, I've just finish my work. The latest patch is available at: http://people.freebsd.org/~iwasaki/postgres/buffer-cache-hibernation-postgresql-20110507.patch On my box, shared_buffers can be set up to only 200MB. Elapsed time for starting up is almost the same, about 3 sec (w/o hibernation takes about 1 sec). For shutdown, writing buffer blocks takes about 10 sec, otherwise about 1 sec. Well, it seems you were right :) By restoring buffer blocks based on buffer descriptors, the OS buffer cache will be filled too. This can help buffer updating performance I believe. I think saving buffer blocks is still useful for debugging or portability, so I would like to remain the support code in my patch. For 2, I'm not sure how to implement this. The problem is that freelist.c:StrategyControl is also restored at startup, but I have no idea currently how to adjust StrategyControl when shared_buffer had changed. StrategyControl has important data on buffer allocation, so this should be matched with shared_buffer, I belive. Changing shared_buffer is not so often on production environment. Current implementation like this; If shared_buffer had changed, restoring is aborted only on that time and saving is executed with new shared_buffer at shutdown, restoring is executed at startup on next time. I have one more day for working on this, but I may give up... Thanks -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed patch: Smooth replication during VACUUM FULL
On Mon, May 2, 2011 at 6:15 PM, Gabriele Bartolini gabriele.bartol...@2ndquadrant.it wrote: You can easily spot in the graphs the point where VACUUM FULL terminates, then it is just a matter of flushing the WAL delay for replication. Agreed. Anyway, I hope I can give you more detailed information tomorrow. Thanks. Did you find anything else of note, or is your patch ready to commit? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] clog_redo causing very long recovery time
On Fri, May 6, 2011 at 4:22 AM, Tom Lane t...@sss.pgh.pa.us wrote: The attached fix-clogredo diff is my proposal for a fix for this. That seems pretty grotty :-( I think a more elegant fix might be to just swap the order of the ExtendCLOG and ExtendSUBTRANS calls in GetNewTransactionId. The reason that would help is that pg_subtrans isn't WAL-logged, so if we succeed doing ExtendSUBTRANS and then fail in ExtendCLOG, we won't have written any XLOG entry, and thus repeated failures will not result in repeated XLOG entries. I seem to recall having considered exactly that point when the clog WAL support was first done, but the scenario evidently wasn't considered when subtransactions were stuck in :-(. I agree with Tom about the need for a fix that prevents generation of repeated WAL records. OTOH, I also like Joe's fix in the recovery code to avoid responding to repeated records. Can we have both please? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Process wakeups when idle and power consumption
I've taken a look into it, and I'm not optimistic about the likelihood of the way I've suggested that we can register a callback on process termination on windows being acceptable. It seems to be a kludge too far. It does work on Vista, just not very well. There is a considerable delay on closing the above console application that uses this technique, for example, and there seems to be an unpredictable delay in the callback occurring. A simpler solution on Windows might be to make the timeout on auxiliary processes much smaller, but have it increase on each subsequent timeout (starting from scratch if we wakeup for any reason other than timeout) until eventually it maxes out at something like the current value for PGARCH_AUTOWAKE_INTERVAL. If backends are sleeping for increasing periods of time, the chance of the postmaster crashing goes down, so denial of service is much less of a concern. An alternative might be to just not do this on Windows. Certainly, idle wakeups are likely to be less important on that platform, which is not a very popular choice for virtual machines deployed on cloudy infrastructure, the use case that will benefit from these enhancements the most, by some margin. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not write block xlog flush request 3FD/0 is not satisfied
All right, what would we be the best way to debug such a problem? Yves -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Greg Stark Sent: Sunday, May 08, 2011 6:36 PM To: weis...@rbg.informatik.tu-darmstadt.de Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] could not write block xlog flush request 3FD/0 is not satisfied On Sun, May 8, 2011 at 4:18 PM, Yves Weißig weis...@rbg.informatik.tu-darmstadt.de wrote: ERROR: xlog flush request 3FD/0 is not satisfied --- flushed only to 0/20E2DC4 That's a pretty big difference in log positions. It seems likely you've overwritten the block header writing garbage to the LSN. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Process wakeups when idle and power consumption
On 09.05.2011 12:20, Peter Geoghegan wrote: I've taken a look into it, and I'm not optimistic about the likelihood of the way I've suggested that we can register a callback on process termination on windows being acceptable. It seems to be a kludge too far. It does work on Vista, just not very well. There is a considerable delay on closing the above console application that uses this technique, for example, and there seems to be an unpredictable delay in the callback occurring. Can't we use the pipe trick on Windows? The API is different, but we use pipes on Windows for other things already. When a process is launched, open a pipe between postmaster and the child process. In the child, spawn a thread that just calls ReadFile() on the pipe, which blocks. If postmaster dies, the ReadFile() call will return with an error. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed patch: Smooth replication during VACUUM FULL
Il 09/05/11 09:14, Simon Riggs ha scritto: Anyway, I hope I can give you more detailed information tomorrow. Thanks. Did you find anything else of note, or is your patch ready to commit? Unfortunately I did not have much time to run further tests. The ones I have done so far show that it mostly works (see attached graph), but there are some unresolved spikes that will require further work in 9.2. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it attachment: smooth-replication-vacuum-full-s50-600.png -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Process wakeups when idle and power consumption
On 9 May 2011 11:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Can't we use the pipe trick on Windows? The API is different, but we use pipes on Windows for other things already. When a process is launched, open a pipe between postmaster and the child process. In the child, spawn a thread that just calls ReadFile() on the pipe, which blocks. If postmaster dies, the ReadFile() call will return with an error. Alright. I'm currently working on a proof-of-concept implementation of that. In the meantime, any thoughts on how this should meld with the existing latch implementation? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] superusers are members of all roles?
On Sat, May 7, 2011 at 11:42 PM, Bruce Momjian br...@momjian.us wrote: Is this a TODO? I think so. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make check in src/test/isolation is unworkable
On 05/08/2011 09:54 PM, Andrew Dunstan wrote: On 05/08/2011 07:35 PM, Tom Lane wrote: I believe that the make check target in src/test/isolation is fundamentally unportable, as is illustrated by the fact that buildfarm member coypu is currently choking on it. The reason is that the pg_isolation_regress program depends on libpq, and in particular it depends on having an *installed* libpq. Anyplace where it appears to work, it's because you already installed Postgres, or at least libpq. darn, you're right. OK, I have crake running the installation checks: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crakedt=2011-05-09%2004%3A17%3A01stg=isolation-check, so I have checked in a hot fix for the buildfarm client: https://github.com/PGBuildFarm/client-code/commit/c3c20a6457f57efcdcecb83e9c8168791f33f699 What's a bit annoying is that these tests were checked in without a vestige of MSVC support, and nobody pinged the usual suspects (i.e. Magnus and me) to ask for help in providing it, unless my memory is even worse than usual. We have a bit of work to do to enable that, which I'll try to get done before pgcon. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [pgsql-advocacy] New Canadian nonprofit for trademark, postgresql.org domain, etc.
On Sat, May 7, 2011 at 2:01 PM, Peter Eisentraut pete...@gmx.net wrote: On fre, 2011-05-06 at 21:53 +0200, Cédric Villemain wrote: I think it might be better if the association don't need (or have ) activity other than 'technical' and to set up another nonprofit association for real activity. If instead you limit yourselves to holding and maintaining the mentioned assets, the board meets once a year to approve last year's minutes, file the paperwork, and go home, you can't do much wrong. I quite agree on this. The NPO new is excellent, btw. Roberto
Re: [HACKERS] clog_redo causing very long recovery time
Simon Riggs si...@2ndquadrant.com writes: I agree with Tom about the need for a fix that prevents generation of repeated WAL records. OTOH, I also like Joe's fix in the recovery code to avoid responding to repeated records. Can we have both please? Why? The patch in the recovery code is seriously ugly, and it won't do anything useful once we've fixed the other end. Please notice also that we'd need several instances of that kluge if we want to cover all the SLRU-based cases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make check in src/test/isolation is unworkable
Andrew Dunstan and...@dunslane.net writes: What's a bit annoying is that these tests were checked in without a vestige of MSVC support, and nobody pinged the usual suspects (i.e. Magnus and me) to ask for help in providing it, Speaking of pinging Windows people, have either of you noticed the reports that CREATE/ALTER USER VALID UNTIL 'infinity' is crashing on Windows? http://archives.postgresql.org/pgsql-bugs/2011-05/msg9.php http://archives.postgresql.org/pgsql-bugs/2011-05/msg00030.php regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
On Sat, May 7, 2011 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On the flip side, the risk of it flat-out blowing up seems pretty small. For someone to invent their own version of wchar_t that uses something other than Unicode code points would be pretty much pure masochism, wouldn't it? Well, no, that's not clear. The C standard has pretty carefully avoided constraining the wchar_t representation, so implementors are free to do whatever is most convenient from the standpoint of their library routines. I could easily see somebody deciding to do something that wasn't quite Unicode because it let him re-use lookup tables designed for some other encoding, or some such. Now it's also perfectly possible, maybe even likely, that nobody's done that on any platform we care about. But I don't believe we know that with any degree of certainty. We definitely have not made any effort to establish whether it's true --- for example, we have no regression tests that address the point. (I think that collate.linux.utf8 touches on it, but we're a long way from being able to run that on non-glibc platforms...) Well, since any problems in this are are going to bite us eventually in 9.0+ even without any further action on our part, maybe it would be wise to think up something we could add to the regression tests. That would give us some immediate feedback from the buildfarm, and also significantly improve the odds of someone compiling on a weird platform noticing if things are broken. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf error checking strategy
On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: Robert Haas wrote: On Wed, Apr 6, 2011 at 5:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: So I'm thinking we should adopt a strategy that's less likely to result in divergent behavior among different backends. ?The idea I have in mind is to have the first validation pass only check that each name is a legal GUC variable name, and not look at the values at all. ?If so, try to apply all the values. ?Any that fail to apply we log as usual, but still apply the others. ?ISTM that verifying the names should be enough protection against broken files for practical purposes, and it should be something that all backends will agree on even if there are individual values that are not valid for all. Comments? I don't think now is a good time for a major behavior change in this area, and I'm not convinced this is the best possible design. There are a number of parameters which are currently PGC_POSTMASTER rather than PGC_SIGHUP precisely because of the possibility of backends being out of step with each other. wal_level is an obvious example, and one that it would be *really* nice to be able to change without a server restart. It would be nice to have a real solution to that problem, but this isn't it, and I don't want to engineer it right now. Is this a TODO? Yes, definitely. Perhaps summarize as rethink how we handle partially correct postgresql.conf files. Or maybe Robert sees it as rethink approach to making sure all backends share the same value of critical settings? Or maybe those are two different TODOs? The second is what I had in mind. I'm thinking that at least for critical GUCs we need a different mechanism for making sure everything stays in sync, like having the postmaster write a precompiled file and convincing the backends to read it in some carefully synchronized fashion. However, it's not clear to me whether something along those lines (or some other lines) would solve the problem you were complaining about; therefore it's possible, as you say, that there are two separate action items here. Or maybe not: maybe someone can come up with an approach that swats both problems in one go. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
Robert Haas robertmh...@gmail.com writes: Well, since any problems in this are are going to bite us eventually in 9.0+ even without any further action on our part, maybe it would be wise to think up something we could add to the regression tests. That would give us some immediate feedback from the buildfarm, and also significantly improve the odds of someone compiling on a weird platform noticing if things are broken. No objection here, but how will we do that? The regression tests are designed to work in any locale/encoding, and would become significantly less useful if they weren't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
On Sun, May 8, 2011 at 12:02 AM, Greg Smith g...@2ndquadrant.com wrote: Attached patch is a first cut at what moving one contrib module (in this case pg_buffercache) to a new directory structure might look like. The idea is that src/extension could become a place for first-class extensions to live. Those are ones community is committed to providing in core, but are just better implemented as extensions than in-database functions, for reasons that include security. This idea has been shared by a lot of people for a while, only problem is that it wasn't really practical to implement cleanly until the extensions code hit. I think it is now, this attempts to prove it. Since patches involving file renaming are clunky, the changes might be easier to see at https://github.com/greg2ndQuadrant/postgres/commit/507923e21e963c873a84f1b850d64e895776574f where I just pushed this change too. The install step for the modules looks like this now: gsmith@grace:~/pgwork/src/move-contrib/src/extension/pg_buffercache$ make install /bin/mkdir -p '/home/gsmith/pgwork/inst/move-contrib/lib/postgresql' /bin/mkdir -p '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension' /bin/sh ../../../config/install-sh -c -m 755 pg_buffercache.so '/home/gsmith/pgwork/inst/move-contrib/lib/postgresql/pg_buffercache.so' /bin/sh ../../../config/install-sh -c -m 644 ./pg_buffercache.control '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension/' /bin/sh ../../../config/install-sh -c -m 644 ./pg_buffercache--1.0.sql ./pg_buffercache--unpackaged--1.0.sql '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension/' $ psql -c create extension pg_buffercache CREATE EXTENSION The only clunky bit I wasn't really happy with is the amount of code duplication that comes from having a src/extension/Makefile that looks almost, but not quite, identical to contrib/Makefile. The rest of the changes don't seem too bad to me, and even that's really only 36 lines that aren't touched often. Yes, the paths are different, so backports won't happen without an extra step. But the code changes required were easier than I was expecting, due to the general good modularity of the extensions infrastructure. So long as the result ends up in share/postgresql/extension/ , whether they started in contrib/module or src/extension/module doesn't really matter to CREATE EXTENSION. But having them broke out this way makes it easy for the default Makefile to build and install them all. (I recognize I didn't do that last step yet though) I'll happily go covert pgstattuple and the rest of the internal diagnostics modules to this scheme, and do the doc cleanups, this upcoming week if it means I'll be able to use those things without installing all of contrib one day. Ditto for proposing RPM and Debian packaging changes that match them. All that work will get paid back the first time I don't have to fill out a bunch of paperwork (again) at a customer site justifying why they need to install the contrib [RPM|deb] package (which has some scary stuff in it) on all their servers, just so I can get some bloat or buffer inspection module. I would really like to see us try to group things by topic, and not just by whether or not we can all agree that the extension is important enough to be first-class (which is bound to be a bit tendentious). We probably can't completely avoid some bikeshedding on that topic, but even there it strikes me that sorting by topic might make things a bit more clear. For example, if we could somehow group together all the diagnostic tools, maybe something like the list below, I think that would be a start. Now then we might go on to argue about which are the more useful diagnostic tools, but I think it's easier to argue about that category than it is to argue in the abstract about whether you'd rather have hstore or pgstattuple, to which the answer can only be that depends. auto_explain oid2name pageinspect pg_buffercache pg_freespacemap pg_stat_statements pg_test_fsync (perhaps) pgrowlocks pgstattuple -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
On Mon, May 9, 2011 at 10:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, since any problems in this are are going to bite us eventually in 9.0+ even without any further action on our part, maybe it would be wise to think up something we could add to the regression tests. That would give us some immediate feedback from the buildfarm, and also significantly improve the odds of someone compiling on a weird platform noticing if things are broken. No objection here, but how will we do that? The regression tests are designed to work in any locale/encoding, and would become significantly less useful if they weren't. I'm just shooting from the hip here, but maybe we could have a separate (probably smaller) set of tests that are only designed to work in a limited range of locales and/or encodings. I'm really pleased that we now have the src/test/isolation stuff, and I think some more auxilliary test suites would be quite excellent. Even if people didn't always want to run every single one when doing things manually, the buildfarm certainly could. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: ... Maybe someone out there is under the impression that I get high off of rejecting patches; but the statistics you cite from the CF app don't exactly support the contention that I'm going around looking for reasons to reject things, or if I am, I'm doing a pretty terrible job finding them. Hm ... there are people out there who think *I* get high off rejecting patches. I have a t-shirt to prove it. But I seem to be pretty ineffective at it too, judging from these numbers. Late reply, but almost all the things Tom rejects I would have rejected too. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On Mon, May 9, 2011 at 10:58 AM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: ... Maybe someone out there is under the impression that I get high off of rejecting patches; but the statistics you cite from the CF app don't exactly support the contention that I'm going around looking for reasons to reject things, or if I am, I'm doing a pretty terrible job finding them. Hm ... there are people out there who think *I* get high off rejecting patches. I have a t-shirt to prove it. But I seem to be pretty ineffective at it too, judging from these numbers. Late reply, but almost all the things Tom rejects I would have rejected too. Well, I think I've been guilty more than once of leaning on Tom to try to get him to accept patches that he might've been inclined to reject. I think that my standards for code quality are similar to Tom's (though sometimes I let through things he would have caught, woops) but I think I am more inclined to commit feature changes that he might not find entirely worthwhile. Like Tom, I'm reasonably wary of random knickknacks that are extremely special-purpose or will slow down common cases, but on the average I think I'm slightly more new-feature-positive than he is. Not without some exceptions, of course. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf error checking strategy
Robert Haas robertmh...@gmail.com writes: On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: Yes, definitely. Perhaps summarize as rethink how we handle partially correct postgresql.conf files. Or maybe Robert sees it as rethink approach to making sure all backends share the same value of critical settings? Or maybe those are two different TODOs? The second is what I had in mind. I'm thinking that at least for critical GUCs we need a different mechanism for making sure everything stays in sync, like having the postmaster write a precompiled file and convincing the backends to read it in some carefully synchronized fashion. However, it's not clear to me whether something along those lines (or some other lines) would solve the problem you were complaining about; therefore it's possible, as you say, that there are two separate action items here. Or maybe not: maybe someone can come up with an approach that swats both problems in one go. Well, the thing that was annoying me was that because a backend saw one value in postgresql.conf as incorrect, it was refusing to apply any changes at all from postgresql.conf. And worse, there was no log entry to give any hint what was going on. This doesn't seem to me to have much to do with the problem you're on about. I agree it's conceivable that someone might think of a way to solve both issues at once, but I think we'd better list them as separate TODOs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Greg Smith wrote: On 04/21/2011 12:39 PM, Robert Haas wrote: In fact, I've been wondering if we shouldn't consider extending the support window for 8.2 past the currently-planned December 2011. There seem to be quite a lot of people running that release precisely because the casting changes in 8.3 were so painful, and I think the incremental effort on our part to extend support for another year would be reasonably small. The pending EOL for 8.2 is the only thing that keeps me sane when speaking with people who refuse to upgrade, yet complain that their 8.2 install is slow. This last month, that seems to be more than usual why does autovacuum suck so much? complaints that would all go away with an 8.3 upgrade. Extending the EOL is not doing any of these users a favor. Every day that goes by when someone is on a version of PostgreSQL that won't ever allow in-place upgrade is just making worse the eventual dump and reload they face worse. The time spent porting to 8.3 is a one-time thing; the suffering you get trying to have a 2011 sized database on 2006's 8.2 just keeps adding up the longer you postpone it. Interesting. You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
Robert Haas robertmh...@gmail.com writes: On Mon, May 9, 2011 at 10:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: No objection here, but how will we do that? The regression tests are designed to work in any locale/encoding, and would become significantly less useful if they weren't. I'm just shooting from the hip here, but maybe we could have a separate (probably smaller) set of tests that are only designed to work in a limited range of locales and/or encodings. I'm really pleased that we now have the src/test/isolation stuff, and I think some more auxilliary test suites would be quite excellent. Even if people didn't always want to run every single one when doing things manually, the buildfarm certainly could. Hmm. We don't need new infrastructure like the isolation tests do, so another subdirectory seems like overkill. I am thinking about a new target installcheck-collations in src/test/regress/GNUmakefile that creates a UTF8-encoding database and runs a different test schedule than the regular tests. The problem we'd have is that there's no way (at present) to make such a test pass on every platform. Windows has its own set of locale names (which initdb fails to install as collations anyway) and we also have the problem that OS X can be counted on to get UTF8 sorting wrong. (It might be okay for case-folding though; not sure.) Possibly we could just provide an alternate expected file for OS X, but I don't see a decent workaround for Windows --- it would pretty much have to have its very own test case. Andrew, what kinds of options have we got for persuading the buildfarm to run such tests only on a subset of platforms? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] switch UNLOGGED to LOGGED
Excerpts from Robert Haas's message of vie may 06 23:25:09 -0300 2011: On Fri, Apr 22, 2011 at 4:13 AM, Leonardo Francalanci m_li...@yahoo.it wrote: Maybe you should change xl_act_commit to have a separate list of rels to drop the init fork for (instead of mixing those with the list of files to drop as a whole). I tried to follow your suggestion, thank you very much. I have to admit I don't like this approach very much. I can't see adding 4 bytes to every commit record for this feature. Hmm, yeah. Maybe we can add a flags int8 somewhere in that struct and set a bit in it if nrels, nsubxacts, nmsgs and respective arrays are present. That would save some int's that are already in there. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
All, I agree that we should not reduce the support window. The fact that we can do in place upgrades of the data only addresses one pain point in upgrading. Large legacy apps require large retesting efforts when upgrading, often followed by lots more work renovating the code for backwards incompatibilities. Definitely. Heck, I can't get half our clients to apply *update* releases because they have a required QA process which takes a month. And a lot of companies are just now deploying the 8.4 versions of their products. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On 05/09/2011 12:06 PM, Andrew Dunstan wrote: The fact that we can do in place upgrades of the data only addresses one pain point in upgrading. Large legacy apps require large retesting efforts when upgrading, often followed by lots more work renovating the code for backwards incompatibilities. This can be a huge cost for what the suits see as little apparent gain, and making them do it more frequently in order to stay current will not win us any friends. I just had a why a new install on 8.3? conversation today, and it was all about the application developer not wanting to do QA all over again for a later release. Right now, one of the major drivers for why upgrade? has been the performance improvements in 8.3, relative to any older version. The main things pushing happy 8.3 sites to 8.4 or 9.0 that I see are either VACUUM issues (improved with partial vacuum in 8.4) or wanting real-time replication (9.0). I predict many sites that don't want either are likely to sit on 8.3 for a really long time. The community won't be able to offer a compelling reason why smaller sites in particular should go through the QA an upgrade requires. The fact that the app QA time is now the main driver--not the dump and reload time--is good, because it makes it does make it easier for the people with the biggest data sets to move. They're the ones that need the newer versions the most anyway, and in that regard having in-place upgrade start showing up as of 8.3 was really just in time. I think 8.3 is going to be one of those releases like 7.4, where people just keep running it forever. At least shortening the upgrade path has made that concern a little bit better. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching
Tom Lane t...@sss.pgh.pa.us wrote: I am thinking about a new target installcheck-collations in src/test/regress/GNUmakefile that creates a UTF8-encoding database and runs a different test schedule than the regular tests. I don't know the best way to do this (or how many people agree we should), but I found that running the standard `make installcheck` against a database where the postgresql.conf file set default_transaction_isolation = serializable helped provide some baseline testing of SSI. None of the results change, but it helps protect against certain classes of dumb error. (I know this from personal experience.) I know *I'd* feel better if at least a few buildfarm animals were set up to do this. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Greg, [There were complaints upthread about things like how Aster's patch submissions were treated. Those were WIP patches that half implemented some useful ideas. There are two reasons why I think we failed with the Aster patches: 1) I passed Aster along to Bruce, who said he would review the patches and give them a private response on them before they put them on -hackers (which response would be these aren't nearly ready) Bruce punted on this instead, passing their submissions straight through to -hackers without review. 2) Our process for reviewing and approving patches, and what criteria such patches are required to meet, is *very* opaque to a first-time submitter (as in no documentation the submitter knows about), and does not become clearer as they go through the process. Aster, for example, was completely unable to tell the difference between hackers who were giving them legit feedback, and random list members who were bikeshedding. As a result, they were never able to derive a concrete list of these are the things we need to fix to make the patch acceptable, and gave up. While the first was specific to the Aster submissions, I've seen the second problem with lots of first-time submissions to this list. Our feedback to submitters of big patches requires a lot of comprehension of project personalities and politics to make any sense of. As I don't think we can change this, I think the best answer is to tell people Don't submit a big patch to PostgreSQL until you've done a few small patches first. You'll regret it. That goes double for some of the people complaining in this thread about dissatisfaction with the current process. The problem is not the process itself, but that there is little documentation of that process, and that much of that documentation does not match the defacto process. Obviously, the onus is on me as much as anyone else to fix this. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fsync reliability
FYI, does wal.c need updated comments to explain the file system semantics we expect, and how our code triggers it? --- Greg Smith wrote: On 04/23/2011 09:58 AM, Matthew Woodcraft wrote: As far as I can make out, the current situation is that this fix (the auto_da_alloc mount option) doesn't work as advertised, and the ext4 maintainers are not treating this as a bug. See https://bugzilla.kernel.org/show_bug.cgi?id=15910 I agree with the resolution that this isn't a bug. As pointed out there, XFS does the same thing, and this behavior isn't going away any time soon. Leaving behind zero-length files in situations where developers tried to optimize away a necessary fsync happens. Here's the part where the submitter goes wrong: We first added a fsync() call for each extracted file. But scattered fsyncs resulted in a massive performance degradation during package installation (factor 10 or more, some reported that it took over an hour to unpack a linux-headers-* package!) In order to reduce the I/O performance degradation, fsync calls were deferred... Stop right there; the slow path was the only one that had any hope of being correct. It can actually slow things by a factor of 100X or more, worst-case. So, we currently have the choice between filesystem corruption or major performance loss: yes, you do. Writing files is tricky and it can either be slow or safe. If you're going to avoid even trying to enforce the right thing here, you're really going to get really burned. It's unfortunate that so many people are used to the speed you get in the common situation for a while now with ext3 and cheap hard drives: all writes are cached unsafely, but the filesystem resists a few bad behaviors. Much of the struggle where people say this is so much slower, I won't put up with it and try to code around it is futile, and it's hard to separate out the attempts to find such optimizations from the legitimate complaints. Anyway, you're right to point out that the filesystem is not necessarily going to save anyone from some of the tricky rename situations even with the improvements made to delayed allocation. They've fixed some of the worst behavior of the earlier implementation, but there are still potential issues in that area it seems. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On Mon, May 9, 2011 at 1:53 PM, Josh Berkus j...@agliodbs.com wrote: While the first was specific to the Aster submissions, I've seen the second problem with lots of first-time submissions to this list. Our feedback to submitters of big patches requires a lot of comprehension of project personalities and politics to make any sense of. Ah ha! Now we're getting somewhere. As was doubtless obvious from my previous responses, I don't agree that the process is as broken as I felt you were suggesting, and I think we've made a lot of improvements. However, I am in complete agreement with you on this point. Unfortunately, people often come into our community with incorrect assumptions about how it works, including: - someone's in charge - there's one right answer - it's our job to fix your problem Now if you read a few hundred emails (which is not that much calendar time, if you read them all) it's not too hard to figure out what the real dynamic is, and I think that real dynamic is increasingly positive (with some unfortunate exceptions). But if the first thing you do is post (no doubt about some large or controversial change), yeah, serious culture shock. That goes double for some of the people complaining in this thread about dissatisfaction with the current process. The problem is not the process itself, but that there is little documentation of that process, and that much of that documentation does not match the defacto process. Obviously, the onus is on me as much as anyone else to fix this. I can't disagree with this, either. I'm not sure where it would be possible for us to document this that people would actually see and read, and I think it's a tough to understand just from reading a wiki page or a blog post: if you've never been part of a community that operates this way, then it's kind of strange and it takes a while to adjust. Of course from the inside it seems to make a fair amount of sense, but what good is that? Anyhow, whatever we can do to help people get into the swing of things I'm highly in favor of. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
On Mon, May 9, 2011 at 1:14 PM, Greg Smith g...@2ndquadrant.com wrote: On 05/09/2011 10:53 AM, Robert Haas wrote: I would really like to see us try to group things by topic, and not just by whether or not we can all agree that the extension is important enough to be first-class (which is bound to be a bit tendentious). Having played around with the prototype, I think it doesn't actually matter if there's a further division below the new one I introduced. The main thing I think is worth pointing out is that I only feel extensions with no external dependencies are worth the trouble of re-classifying here. If it were worth reorganizing contrib just for the sake of categorizing it, that would have been done years ago. The new thing is that extensions make it really easy to make some tools available in the server's extensions subdirectly, without actually activating them in the default install. Looking at your list: auto_explain oid2name pageinspect pg_buffercache pg_freespacemap pg_stat_statements pg_test_fsync (perhaps) pgrowlocks pgstattuple oid2name and pg_test_fsync would be out because those are real executables. I'd rather not introduce the risk/complexity of playing around with moving standalone utilities of such marginal value. Whereas I think it sets an excellent precedent if the server is shipping with some standard add-ons, built using the same extension mechanism available to external code, in the core server package. I'd certainly be happy to add auto_explain and pg_stat_statements (also extremely popular things to install for me) to that list. I'm happy enough with that set of guidelines: namely, that we'd use src/extension only for things that don't require additional dependencies, and not for things that build standalone executables. If we're going to move things around, I think we should take the trouble to categorize them along the way, and your idea of inserting one more subdirectory under src/extension for grouping seems fine to me. I don't think we should be too obstinate about trying to twist the arm of packagers who (as Tom points out) will do whatever they want in spite of us, but the current state of contrib, with all sorts of things of varying type, complexity, and value mixed together cannot possibly be a good thing. Even if the effect of all this is that some distributions end up with postgresql-server-instrumentation and postgresql-server-datatypes packages, rather than putting everything in postgresql-server, I still think that'd be better than having a monolithic lump called postgresql-contrib. Heaven only knows what could be in there (says the sys admin)... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Robert, I can't disagree with this, either. I'm not sure where it would be possible for us to document this that people would actually see and read, and I think it's a tough to understand just from reading a wiki page or a blog post: Still, if we had a wiki page which was a really comprehensive guide to submitting patches, then we could send people a link after they submit their first patch. As well as having it in the header for the commitfest page. While it wouldn't do everything, it would help. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures
Josh Berkus wrote: Peter, I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those. Delta between SPs and Functions for PostgreSQL: * SPs are executed using CALL or EXECUTE, and not SELECT. * SPs do not return a value ** optional: SPs *may* have OUT parameters. [ Late reply.] What is it about stored procedures that would require it not to return a value or use CALL? I am trying to understand what part of this is procedures (doesn't return a values, we decided there isn't much value for that syntax vs. functions), and anonymous transactions. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote: Greg Smith wrote: On 04/21/2011 12:39 PM, Robert Haas wrote: In fact, I've been wondering if we shouldn't consider extending the support window for 8.2 past the currently-planned December 2011. There seem to be quite a lot of people running that release precisely because the casting changes in 8.3 were so painful, and I think the incremental effort on our part to extend support for another year would be reasonably small. The pending EOL for 8.2 is the only thing that keeps me sane when speaking with people who refuse to upgrade, yet complain that their 8.2 install is slow. This last month, that seems to be more than usual why does autovacuum suck so much? complaints that would all go away with an 8.3 upgrade. Extending the EOL is not doing any of these users a favor. Every day that goes by when someone is on a version of PostgreSQL that won't ever allow in-place upgrade is just making worse the eventual dump and reload they face worse. The time spent porting to 8.3 is a one-time thing; the suffering you get trying to have a 2011 sized database on 2006's 8.2 just keeps adding up the longer you postpone it. Interesting. You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. Personally, I think the support window is on the borderline of being too short already. There are several Linux distributions out there that offer 5-year support for certain releases. Even assuming they incorporate the latest version of PostgreSQL at the time they wrap the final release, it'll already be some months since we released that version, and that means we'll stop supporting that version of PostgreSQL before they stop supporting that release. I regularly have systems that run for 3 or 4 years without needing to be reinstalled, and they're not necessarily running the bleeding-edge version of PostgreSQL when first installed. So they, too, are on the trailing edge of our support. As much as I believe that 9.0 (and, now, 9.1) are the future and people should move to them, we can't enforce that. EOL doesn't necessarily drive people to move. If they're just running yum update they're going to get 8.whatever.latest, and that's out of support and missing relevant bug fixes, then it is. I haven't run into much 8.1 recently, but it seems there is still a decent chunk of 8.2 out there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf error checking strategy
On Mon, May 9, 2011 at 11:10 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: Yes, definitely. Perhaps summarize as rethink how we handle partially correct postgresql.conf files. Or maybe Robert sees it as rethink approach to making sure all backends share the same value of critical settings? Or maybe those are two different TODOs? The second is what I had in mind. I'm thinking that at least for critical GUCs we need a different mechanism for making sure everything stays in sync, like having the postmaster write a precompiled file and convincing the backends to read it in some carefully synchronized fashion. However, it's not clear to me whether something along those lines (or some other lines) would solve the problem you were complaining about; therefore it's possible, as you say, that there are two separate action items here. Or maybe not: maybe someone can come up with an approach that swats both problems in one go. Well, the thing that was annoying me was that because a backend saw one value in postgresql.conf as incorrect, it was refusing to apply any changes at all from postgresql.conf. And worse, there was no log entry to give any hint what was going on. This doesn't seem to me to have much to do with the problem you're on about. I agree it's conceivable that someone might think of a way to solve both issues at once, but I think we'd better list them as separate TODOs. OK by me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Robert Haas wrote: Interesting. ?You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. Personally, I think the support window is on the borderline of being too short already. There are several Linux distributions out there that offer 5-year support for certain releases. Even assuming they incorporate the latest version of PostgreSQL at the time they wrap the final release, it'll already be some months since we released that version, and that means we'll stop supporting that version of PostgreSQL before they stop supporting that release. I regularly have systems that run for 3 or 4 years without needing to be reinstalled, and they're not necessarily running the bleeding-edge version of PostgreSQL when first installed. So they, too, are on the trailing edge of our support. As much as I believe that 9.0 (and, now, 9.1) are the future and people should move to them, we can't enforce that. EOL doesn't necessarily drive people to move. If they're just running yum update they're going to get 8.whatever.latest, and that's out of support and missing relevant bug fixes, then it is. I haven't run into much 8.1 recently, but it seems there is still a decent chunk of 8.2 out there. I agree we don't want to shorten the window --- I was just pointing out that we have more upgrade options than in the past. One big push for shortening was the Win32 issues on 8.0 and perhaps 8.1 that were unfixable, which helped push retiring, at least on that platforms, and once you retire on one platform, there is momentum to retire all platforms for that release. With Win32 stable on 8.2, we could say we don't need to shorten the window as much, but pg_upgrade would allow us to keep it the same as now because upgrades are potentially easier. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Greg Smith wrote: [There were complaints upthread about things like how Aster's patch submissions were treated. Those were WIP patches that half implemented some useful ideas. But they were presented as completed features, and they seemed to expect the community would pick those up and commit in that not quite right state without extended additional work on their side. Not doing that sort of thing is part of the reason the PostgreSQL code isn't filled with nothing but the fastest hack to get any given job done. Anyone who thinks I'm misrepresenting that view of history should revisit the lengthy feedback provided to them at https://commitfest.postgresql.org/action/patch_view?id=173 and https://commitfest.postgresql.org/action/patch_view?id=205 -- it actually goes back even further than that because the first versions of these patches were even less suitable for commit.] [ Again, sorry for my late reply.] Greg hits a big item above --- it takes 3-4x more work to get a patch to merge cleanly into our code (look like it was always there) than to write the initial patch. If the author isn't willing to do that 3-4x work, it is not something the community is going to do on a regular basis, so it is not surprising the patches are dropped. This is very often true of academicly-developed patches too. (I know I rewrite my patches 4-5 times, and some feel even that is not enough interations for me. ;-) ) That goes double for some of the people complaining in this thread about dissatisfaction with the current process. If you're not helping review patches already, you're not participating in the thing that needs the most help. This is not a problem you make better with fuzzy management directives to be nicer to people. There are real software engineering issues about how to ensure good code quality at its core. I agree on this one too. It is good for people outside the patch review group to make suggestions (external review is good), but when those external people can't give clear examples of problems, it is impossible for the patch review group to react or improve, and the complaints do more harm than good. The complaints did spark discussion to reevaluate our development process, so something good did come out of it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On Mon, May 9, 2011 at 2:41 PM, Josh Berkus j...@agliodbs.com wrote: Robert, I can't disagree with this, either. I'm not sure where it would be possible for us to document this that people would actually see and read, and I think it's a tough to understand just from reading a wiki page or a blog post: Still, if we had a wiki page which was a really comprehensive guide to submitting patches, then we could send people a link after they submit their first patch. As well as having it in the header for the commitfest page. While it wouldn't do everything, it would help. I'm all in favor. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Collation mega-cleanups
Tom this collation stuff has seen more post-feature-commit cleanups than I think any patch I remember. Is there anything we can learn from this? Yes, this is coming from me, who some consider to be the king of post-commit cleanups, namely, cleaning up my own commits. --- Tom Lane wrote: I just noticed that the collation patch has modified char2wchar and wchar2char to accept a collation OID as argument ... but it hasn't done anything to make those arguments actually work. Since those functions depend on wcstombs and mbstowcs, which respond to LC_CTYPE and nothing else, this flat out does not work in non-default collations. What's more, there doesn't seem to be any such thing as wcstombs_l or mbstowcs_l (at least my Fedora box hasn't got them), so this can't be fixed within the available glibc API. Right at the moment this only affects str_tolower, str_toupper, and str_initcap; there are other uses of these functions in the text search code, but those always pass DEFAULT_COLLATION_OID. It's possible that things are not too broken in practice, because it's likely that the transformations done by these functions only depend on the encoding indicated by LC_CTYPE, and we (try to) enforce that all locales used in a given database match the database encoding. Still, that's a rather shaky chain of reasoning. The complete lack of code comments on this doesn't make me any happier --- in fact, the comments for char2wchar and wchar2char still claim that they have the same API as wcstombs and mbstowcs, which can hardly be considered true when they don't even have the same argument lists. Any thoughts what to do about this? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Robert Haas robertmh...@gmail.com writes: On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote: Interesting. You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. Personally, I think the support window is on the borderline of being too short already. There are several Linux distributions out there that offer 5-year support for certain releases. Keep in mind that at least some contributors are paid to do exactly that long-term support (and if you've not heard, Red Hat is up to seven years support on RHEL ...). So the work is going to get done, and if it doesn't get committed to the community SCM, I'm not sure that really helps anybody. Although whether we do formal releases is a different question. Maybe it would be sensible to continue patching an old branch but not bother wrapping up release tarballs? But the incremental work to do one more set of release notes and one more tarball build is not that large. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] switch UNLOGGED to LOGGED
On Mon, May 9, 2011 at 12:51 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of vie may 06 23:25:09 -0300 2011: On Fri, Apr 22, 2011 at 4:13 AM, Leonardo Francalanci m_li...@yahoo.it wrote: Maybe you should change xl_act_commit to have a separate list of rels to drop the init fork for (instead of mixing those with the list of files to drop as a whole). I tried to follow your suggestion, thank you very much. I have to admit I don't like this approach very much. I can't see adding 4 bytes to every commit record for this feature. Hmm, yeah. Maybe we can add a flags int8 somewhere in that struct and set a bit in it if nrels, nsubxacts, nmsgs and respective arrays are present. That would save some int's that are already in there. Yes, that seems like a very appealing approach. There is plenty of bit-space available in xinfo, and we could reserve a bit each for nrels, nsubxacts, and nmsgs, with set meaning that an integer count of that item is present and clear meaning that the count is omitted from the structure (and zero). This will probably require a bit of tricky code reorganization so I think it should be done separately from the main patch. With that done, then it's not a big deal for the main patch to add in one more array that will normally get omitted. And in the process, we can save 12 bytes on every commit record in the common case, which is quite appealing: I don't expect a huge performance gain, but a penny saved is a penny earned. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Inconsistent treatment of serials in pg_dump
All, Just encountered this: create table josh ( id serial not null, desc text ); pg_dump -Fc -T josh -f no_josh_dump postgres pg_dump -Fc -t josh -f josh_dump postgres pg_restore -d new no_josh_dump pg_restore -d new josh_dump pg_restore: [archiver (db)] Error from TOC entry 2645; 1259 49910 SEQUENCE josh_id_seq postgres pg_restore: [archiver (db)] could not execute query: ERROR: relation josh_id_seq already exists It seems that if I exclude a table using -T, its dependant sequences do not get excluded. But if I include it using -t, its dependent sequences *do* get included. Is there a reason this is a good idea, or is it just an oversight? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote: Interesting. ?You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. Personally, I think the support window is on the borderline of being too short already. There are several Linux distributions out there that offer 5-year support for certain releases. Keep in mind that at least some contributors are paid to do exactly that long-term support (and if you've not heard, Red Hat is up to seven years support on RHEL ...). So the work is going to get done, and if it doesn't get committed to the community SCM, I'm not sure that really helps anybody. Although whether we do formal releases is a different question. Maybe it would be sensible to continue patching an old branch but not bother wrapping up release tarballs? But the incremental work to do one more set of release notes and one more tarball build is not that large. I think the big reason we trimmed the support window was to push people off of old releases, not to lighten our workload. Until we stated that a release was not supported, we didn't give administrators ammunition to force upgrades within their organizations. Yeah, that is a lousy reason, but it was the stated case when we shrunk to five years. You can argue that our more recent releases are not as stop using them bad as previous ones, but Greg Smith's statement about autovacuum badness reinforces that goal. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On 05/09/2011 11:43 AM, Robert Haas wrote: Interesting. You could argue that once 8.3 is our earliest supported release that we could even shrink the support window because the argument I can't dump/reload my data would be gone. Personally, I think the support window is on the borderline of being too short already. There are several Linux distributions out there that offer 5-year support for certain releases. Some (RH?) offer significantly longer periods. I agree that we should not reduce the support window. The fact that we can do in place upgrades of the data only addresses one pain point in upgrading. Large legacy apps require large retesting efforts when upgrading, often followed by lots more work renovating the code for backwards incompatibilities. This can be a huge cost for what the suits see as little apparent gain, and making them do it more frequently in order to stay current will not win us any friends. I often want to wait a while after a release for certain customers, while it beds down, and to get them to start moving towards upgrading well before it's the last minute. That makes an effective life of four years or less per release as things are now. That's plenty short enough. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] hint bit cache v5
Attached is an updated version of the 'hint bit cache'. What's changed: *) 'bucket' was changed to 'page' everywhere *) rollup array is now gets added during 'set', not the 'get' (pretty dumb the way it was before -- wasn't really dealing with non-commit bits yet) *) more source comments, including a description of the cache in the intro *) now caching 'invalid' bits. I went back and forth several times whether to store invalid bits in the same cache, a separate cache, or not at all. I finally settled upon storing them in the same cache which has some pros and cons. It makes it more or less exactly like the clog cache (so I could copy/pasto some code out from there), but adds some overhead because 2 bit addressing is more expensive than 1 bit addressing -- this is showing up in profiling...i'm upping the estimate of cpu bound scan overhead from 1% to 2%. Still fairly cheap, but i'm running into the edge of where I can claim the cache is 'free' for most workloads -- any claim is worthless without real world testing though. Of course, if tuple hint bits are set or PD_ALL_VISIBLE is set, you don't have to pay that price. What's not: *) Haven't touched any satisfies routine besides HeapTupleSatisfiesMVCC (should they be?) *) Haven't pushed the cache data into CacheMemoryContext. I figure this is the way to go, but requires extra 'if' on every cache 'get'. *) Didn't abstract the clog bit addressing macros. I'm leaning on not doing this, but maybe they should be. My reasoning is that there is no requirement for hint bit cache that pages should be whatever block size is, and I'd like to reserve the ability to adjust the cache page size independently. I'd like to know if this is a strategy that merits further work...If anybody has time/interest that is. It's getting close to the point where I can just post it to the commit fest for review. In particular, I'm concerned if Tom's earlier objections can be satisfied. If not, it's back to the drawing board... merlin hbache_v5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
Excerpts from Robert Haas's message of lun may 09 14:31:33 -0400 2011: I'm happy enough with that set of guidelines: namely, that we'd use src/extension only for things that don't require additional dependencies, and not for things that build standalone executables. If we're going to move things around, I think we should take the trouble to categorize them along the way, and your idea of inserting one more subdirectory under src/extension for grouping seems fine to me. For executables we already have src/bin. Do we really need a separate place for, say, pg_standby or pg_upgrade? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collation mega-cleanups
Bruce Momjian br...@momjian.us writes: Tom this collation stuff has seen more post-feature-commit cleanups than I think any patch I remember. Is there anything we can learn from this? The pre-commit review was obviously woefully inadequate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures
On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian br...@momjian.us wrote: Josh Berkus wrote: Peter, I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those. Delta between SPs and Functions for PostgreSQL: * SPs are executed using CALL or EXECUTE, and not SELECT. * SPs do not return a value ** optional: SPs *may* have OUT parameters. [ Late reply.] What is it about stored procedures that would require it not to return a value or use CALL? I am trying to understand what part of this is procedures (doesn't return a values, we decided there isn't much value for that syntax vs. functions), and anonymous transactions. FWICT the sql standard. The only summary of standard behaviors I can find outside of the standard itself is here: http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. Peter's synopsis of how the standard works is murky at best and competing implementations are all over the place...SQL server's 'CALL' feature is basically what I personally would like to see. It would complement our functions nicely. Procedures return values and are invoked with CALL. Functions return values and are in-query callable. The fact that 'CALL' is not allowed inside a query seems to make it pretty darn convenient to make the additional distinction of allowing transactional control statements there and not in functions. You don't *have* to allow transactional control statements and could offer this feature as an essentially syntax sugar enhancement, but then run the risk of boxing yourself out of a useful properties of this feature later on because of backwards compatibility issues (in particular, the assumption that your are in a running transaction in the procedure body). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collation mega-cleanups
On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian br...@momjian.us wrote: Tom this collation stuff has seen more post-feature-commit cleanups than I think any patch I remember. Is there anything we can learn from this? How about don't commit all the large patches at the end of the cycle? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inconsistent treatment of serials in pg_dump
Josh Berkus j...@agliodbs.com writes: It seems that if I exclude a table using -T, its dependant sequences do not get excluded. But if I include it using -t, its dependent sequences *do* get included. Is there a reason this is a good idea, or is it just an oversight? It's not immediately clear to me that those switches ought to be exact inverses. As a counterexample, consider the case where multiple tables share the same sequence. Suppressing one of the tables with -T ought not lead to suppressing the sequence. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Robert Haas's message of lun may 09 14:31:33 -0400 2011: I'm happy enough with that set of guidelines: namely, that we'd use src/extension only for things that don't require additional dependencies, and not for things that build standalone executables. If we're going to move things around, I think we should take the trouble to categorize them along the way, and your idea of inserting one more subdirectory under src/extension for grouping seems fine to me. For executables we already have src/bin. Do we really need a separate place for, say, pg_standby or pg_upgrade? Putting them in there implies we think they are of core-code quality. I'm definitely *not* ready to grant that status to pg_upgrade, for instance. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inconsistent treatment of serials in pg_dump
As a counterexample, consider the case where multiple tables share the same sequence. Suppressing one of the tables with -T ought not lead to suppressing the sequence. Now, that's a good point. And I don't expect that pg_dump can distinguish between a serial and an sequence with a dependency? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
Tom Lane t...@sss.pgh.pa.us writes: Sure, but that's a documentation issue, which again is not going to be helped by a source-tree rearrangement. So we have several problem to solve here, and I agree that source code rearrangement is fixing none of them. Maybe it would ease maintaining down the road, though, but I'll leave that choice up to you. Which contribs are ready (safe) for production? We could handle that in the version numbers, having most of contrib at version 0.9.1 (say) and some of them at version 1.0. We could also stop distributing examples in binary form, only ship them in source package. Then we need to include inspection extensions into the core packaging, but still as extensions. That's more of a packager problem, except that they need a clear and strong message about it. Maybe have a new Makefile and build those extensions as part of the server build, and install them as part as the normal install. Another mix of those ideas is to ship inspection extensions and ready for production ones all into a new package postgresql-server-extensions that the main server would depend on, and the ones that are adding more dependencies still in contrib, where not-ready for production extensions would not get built. This kind of ideas would also allow to quite easily remove things from the main server and have them as extensions, available on any install but a CREATE EXTENSION (WITH SCHEMA pg_catalog) command away. And still maintained at the same quality level in the source tree. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inconsistent treatment of serials in pg_dump
Excerpts from Josh Berkus's message of lun may 09 16:43:10 -0400 2011: Now, that's a good point. And I don't expect that pg_dump can distinguish between a serial and an sequence with a dependency? They're the same thing, so no. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
E.1.2.2. Casting * Tighten casting checks for domains based on arrays (Tom Lane) When a domain is based on an array type,..., such a domain type is no longer allowed to match an anyarray parameter of a polymorphic function, except by explicitly downcasting it to the base array type. This will require me to add hundreds of casts to my code. I do not get how this will Tighten casting checks. It will certainly not tighten my code! Could you explain how it is good to not be able to do array operations with a type which is an array? BTW: All of my DOMAINs which are array types exist because of PostgreSQL's inability to infer array types for DOMAINs, so I have lots of code like this: CREATE DOMAIN foo_ids AS integer; CREATE DOMAIN foo_id_arrays AS integer[]; I would love to be able to simply use foo_ids[] instead of having to have the second DOMAIN foo_id_arrays. If there is some value which I'm missing in the above Tightening, perhaps it could be put in *after* PostgreSQL were given the ability to understand foo_ids[] as an array of foo_ids. Thanks, _Greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Server Programming Interface underspecified in 4.1beta1
The documentation for the Server Programming Interface is underspecified such that it is impossible to create real extensions using only the documented interface. For example, in the example 43.5 http://www.postgresql.org/docs/9.1/static/spi-examples.html a variable of type SPITupleTable is being dereferenced as tupdesc-natts but this is not documented. The SPI documentation suggests studying the contributed extension code for further examples but that code is full of hundreds of features which are not in the SPI at all. My own SPI code uses dozens of macros and field names which I found in existing source and in conversations on the mailing lists. I'm always nervous when a new release comes out because the API I am using is not official and might therefore change and break my code. This is especially bad since much of the code is not type-safe and problems will tend to be hidden by all of the casts in the pre-ISO-C-style macro code. It would be good if the Server Programming Interface were sufficiently documented that most of the contributed extensions which ship with PostgreSQL and most of the SPI-based modules in the backend were using ONLY the documented features of the SPI. Macros hiding casts and typedefs hiding void * types should replaced with inline functions and pointers to specific strong types. To be very clear: Yes, I can always rummage around in the include files and source to find out how to do things IN THIS RELEASE. But I can't expect PostgreSQL development to avoid breaking the idioms I happen upon, nor can the developers write unit tests to ensure that proper extensions using the SPI will not break. I would like to be able to program to a C or C++ SPI which is clean, complete and type-safe. I am good at reading API documentation in C or C++ and would be happy to review any proposed improvements. _Greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
On 05/09/2011 03:31 PM, Alvaro Herrera wrote: For executables we already have src/bin. Do we really need a separate place for, say, pg_standby or pg_upgrade? There's really no executables in contrib that I find myself regularly desperate for/angry at because they're not installed as an integral part of the server, the way I regularly find myself cursing some things that are now extensions. The only contrib executable I use often is pgbench, and that's normally early in server life--when it's still possible to get things installed easily most places. And it's something that can be removed when finished, in cases where people are nervous about the contrib package. Situations where pg_standby or pg_upgrade suddenly pop up as emergency needs seem unlikely too, which is also the case with oid2name, pg_test_fsync, pg_archivecleanup, and vacuumlo. I've had that happen with pg_archivecleanup exactly once since it appeared--running out of space and that was the easiest way to make the problem go away immediately and permanently--but since it was on an 8.4 server we had to download the source and build anyway. Also, my experience is that people are not that paranoid about running external binaries, even though they could potentially do harm to the database. Can always do a backup beforehand. But the idea of loading a piece of code that lives in the server all the time freaks them out. The way the word contrib implies (and sometimes is meant to mean) low quality, while stuff that ships with the main server package does not, has been beaten up here for years already. It's only a few cases where that's not fully justified, and the program can easily become an extension, that I feel are really worth changing here. There are 49 directories in contrib/ ; at best maybe 20% of them will ever fall into that category. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why not install pgstattuple by default?
On 05/09/2011 02:31 PM, Robert Haas wrote: I don't think we should be too obstinate about trying to twist the arm of packagers who (as Tom points out) will do whatever they want in spite of us, but the current state of contrib, with all sorts of things of varying type, complexity, and value mixed together cannot possibly be a good thing. I think the idea I'm running with for now means that packagers won't actually have to do anything. I'd expect typical packaging for 9.1 to include share/postgresql/extension from the build results without being more specific. You need to grab 3 files from there to get the plpgsql extension, and I can't imagine any packager listing them all by name. So if I dump the converted contrib extensions to put files under there, and remove them from the contrib build area destination, I suspect they will magically jump from the contrib to the extensions area of the server package at next package build; no packager level changes required. The more I look at this, the less obtrusive of a change it seems to be. Only people who will really notice are users who discover more in the basic server package, and of course committers with backporting to do. Since packaged builds of 9.1 current with beta1 seem to be in short supply still, this theory looks hard to prove just yet. I'm very excited that it's packaging week here however (rolls eyes), so I'll check it myself. I'll incorporate the suggestions made since I posted that test patch and do a bigger round of this next, end to end with an RPM set as the output. It sounds like everyone who has a strong opinion on what this change might look like has sketched a similar looking bikeshed. Once a reasonable implementation is hammered out, I'd rather jump to the big argument between not liking change vs. the advocacy benefits to PostgreSQL of doing this; they are considerable in my mind. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Server Programming Interface underspecified in 4.1beta1
On 05/09/2011 05:23 PM, J. Greg Davidson wrote: It would be good if the Server Programming Interface were sufficiently documented that most of the contributed extensions which ship with PostgreSQL and most of the SPI-based modules in the backend were using ONLY the documented features of the SPI. Docs patches welcome. (BTW, 4.1beta) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On Mon, May 9, 2011 at 7:18 PM, Robert Haas robertmh...@gmail.com wrote: Ah ha! Now we're getting somewhere. As was doubtless obvious from my previous responses, I don't agree that the process is as broken as I felt you were suggesting, and I think we've made a lot of improvements. However, I am in complete agreement with you on this point. Unfortunately, people often come into our community with incorrect assumptions about how it works, including: - someone's in charge - there's one right answer - it's our job to fix your problem Now if you read a few hundred emails (which is not that much calendar time, if you read them all) it's not too hard to figure out what the real dynamic is, and I think that real dynamic is increasingly positive (with some unfortunate exceptions). But if the first thing you do is post (no doubt about some large or controversial change), yeah, serious culture shock. Honestly it's not even that clear. It took me years to realize that when Tom says There's problems x, y, z he doesn't mean give up now there are all these fatal flaws but rather think about these things and maybe they're problems and maybe they're not, but we need to figure that out. To be fair that's true for everyone on th4 list depending on the audience. We have a tendency to state general concerns as pretty black-and-white statements that would read to a newbie as fatal flaws that aren't worth investigating. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unfriendly handling of pg_hba SSL options with SSL off
Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On m??n, 2011-04-25 at 14:18 -0400, Tom Lane wrote: In the particular case at hand, if someone is trying to use the same hostssl-containing pg_hba.conf across multiple systems, is it not reasonable to suppose that he should have SSL turned on in postgresql.conf on all those systems? If he doesn't, it's far more likely to be a configuration mistake that he'd appreciate being pointed out to him, instead of having to reverse-engineer why some of the systems aren't working like others. I think, people use and configure PostgreSQL in all kinds of ways, so we shouldn't assume what they might be thinking. Especially if an artificial boundary has the single purpose of being helpful. Well, it's not just to be helpful, it's to close off code paths that are never going to be sufficiently well-tested to not have bizarre failure modes. That helps both developers (who don't have to worry about testing/fixing such code paths) and users (who won't have to deal with the bizarre failure modes). But in any case, I think that the presence of a hostssl line in pg_hba.conf is pretty strong evidence that the admin intends to use SSL, so we should tell him about it if he's forgotten the other piece of setup he needs. Late reply, but we are basically ignoring 'local' lines if the build doesn't support unix domain sockets (windows), but throwing an error for hostssl usage if ssl is not compiled in. Is the only logic here that 'local' is part of the default pg_hba.conf and hostssl is not? Is that good logic? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Kevin Grittner wrote: Peter Eisentraut pete...@gmx.net wrote: what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. The two things which leap to mind for me are: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. The \d commands would be changed to call the SPs for releases recent enough to support this. Eventually psql would be free of worrying about which release contained which columns in which system tables, because it would just be passing the parameters in and displaying whatever results came back. I have used products which implemented something like this, and found it quite useful. Uh, why does this require stored procedures? Seems our existing function capabilities are even better suited to this. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On m??n, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Excerpts from Greg Stark's message of lun may 09 19:44:15 -0400 2011: Honestly it's not even that clear. It took me years to realize that when Tom says There's problems x, y, z he doesn't mean give up now there are all these fatal flaws but rather think about these things and maybe they're problems and maybe they're not, but we need to figure that out. These things may seem trivial but I think they are worth documenting. It feels weird to document something that's inherently social rather than technical (to me at least), but if that's what we need to help others to collaborate, then we should. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On 05/09/2011 08:20 PM, Bruce Momjian wrote: Tom Lane wrote: Peter Eisentrautpete...@gmx.net writes: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
Josh Berkus wrote: As I don't think we can change this, I think the best answer is to tell people Don't submit a big patch to PostgreSQL until you've done a few small patches first. You'll regret it. When I last did a talk about getting started writing patches, I had a few people ask me afterwards if I'd ever run into problems with having patch submissions rejected. I said I hadn't. When asked what my secret was, I told them my first serious submission modified exactly one line of code[1]. And *that* I had to defend in regards to its performance impact.[2] Anyway, I think the intro message should be Don't submit a big patch to PostgreSQL until you've done a small patch and some patch review instead though. It's both a good way to learn what not to do, and it helps with one of the patch acceptance bottlenecks. The problem is not the process itself, but that there is little documentation of that process, and that much of that documentation does not match the defacto process. Obviously, the onus is on me as much as anyone else to fix this. I know the documentation around all this has improved a lot since then. Unfortunately there's plenty of submissions done by people who never read it. Sometimes it's because people didn't know about it; in others I suspect it was seen but some hard parts ignored because it seemed like too much work. One of these days I'm going to write the Formatting Curmudgeon Guide to Patch Submission, to give people an idea just how much diff reading and revision a patch should go through in order to keep common issues like spurious whitespace diffs out of it. Submitters can either spent a block of time sweating those details out themselves, or force the job onto a reviewer/committer; they're always there. And every minute it's sitting in someone else's hands who is doing that job instead of reading the real code, the odds of the patch being kicked back go up. [1] http://archives.postgresql.org/pgsql-patches/2007-03/msg00553.php [2] http://archives.postgresql.org/pgsql-patches/2007-02/msg00222.php -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
On Mon, May 9, 2011 at 6:25 PM, J. Greg Davidson g...@ngender.net wrote: E.1.2.2. Casting * Tighten casting checks for domains based on arrays (Tom Lane) When a domain is based on an array type,..., such a domain type is no longer allowed to match an anyarray parameter of a polymorphic function, except by explicitly downcasting it to the base array type. This will require me to add hundreds of casts to my code. I do not get how this will Tighten casting checks. It will certainly not tighten my code! Could you explain how it is good to not be able to do array operations with a type which is an array? BTW: All of my DOMAINs which are array types exist because of PostgreSQL's inability to infer array types for DOMAINs, so I have lots of code like this: CREATE DOMAIN foo_ids AS integer; CREATE DOMAIN foo_id_arrays AS integer[]; I would love to be able to simply use foo_ids[] instead of having to have the second DOMAIN foo_id_arrays. If there is some value which I'm missing in the above Tightening, perhaps it could be put in *after* PostgreSQL were given the ability to understand foo_ids[] as an array of foo_ids. I didn't read the thread that led up to this change (see: http://postgresql.1045698.n5.nabble.com/Domains-versus-arrays-versus-typmods-td3227701.html) but if I had, I would have argued that the problem cases listed, the worst being the failed constraint check, do not justify the compatibility break :(. In the pre-unnest() world, you might have gotten away with it, but it's been out for two released versions (and some ad hoc approaches for longer than that) and perhaps we were too quick on the trigger. Considering we've already got a report during beta1, this does not exactly inspire confidence. We've got other cases of known bugs where a good solution is unclear or breaks unknown amounts of user code. The giant clusterfark surrounding is null and composites comes to mind. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] crash-safe visibility map, take five
On Fri, May 6, 2011 at 2:47 PM, Robert Haas robertmh...@gmail.com wrote: Comments? At my day job there is saying: Silence is consent. I am surprised there has not been more discussion of this change, considering the magnitude of the possibilities it unlocks. -- Rob Wultsch wult...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Process wakeups when idle and power consumption
On Mon, May 9, 2011 at 8:27 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 9 May 2011 11:19, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Can't we use the pipe trick on Windows? The API is different, but we use pipes on Windows for other things already. When a process is launched, open a pipe between postmaster and the child process. In the child, spawn a thread that just calls ReadFile() on the pipe, which blocks. If postmaster dies, the ReadFile() call will return with an error. Alright. I'm currently working on a proof-of-concept implementation of that. In the meantime, any thoughts on how this should meld with the existing latch implementation? How about making WaitLatch monitor the file descriptor for the pipe by using select()? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers
On 05/09/2011 09:43 PM, Greg Smith wrote: When I last did a talk about getting started writing patches, I had a few people ask me afterwards if I'd ever run into problems with having patch submissions rejected. I said I hadn't. Part of the trouble is in the question. Having a patch rejected is not really a problem; it's something you should learn from. I know it can be annoying. I get annoyed when it happens to me too. But I try to get over it as quickly as possible, and either fix the patch, or find another (and better) way to do the same thing, or move on. Everybody here is acting in good faith, and nobody's on a power trip. That's one of the good things about working on Postgres. If it were otherwise I would have moved on to something else long ago. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan and...@dunslane.net wrote: On 05/09/2011 08:20 PM, Bruce Momjian wrote: Tom Lane wrote: Peter Eisentrautpete...@gmx.net writes: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] the big picture for index-only scans
So, what do we need in order to find our way to index-only scans? 1. The visibility map needs to be crash-safe. The basic idea of index-only scans is that, instead of checking the heap to find out whether each tuple is visible, we first check the visibility map. If the visibility map bit is set, then we know all tuples on the page are visible to all transactions, and therefore the tuple of interest is visible to our transaction. Assuming that a significant number of visibility map bits are set, this should enable us to avoid a fair amount of I/O, especially on large tables, because the visibility map is roughly 8000 times smaller than the heap, and therefore far more practical to keep in cache. However, before we can rely on the visibility map for this purpose, we need to fix the problems that can leave bits set inappropriately in the face of an inconveniently-timed crash. I've been working on a patch for this on and off for a few months now; my latest version is in need of review[1]. 2. Crash safe visibility map vs. pg_upgrade. Even if we make the visibility map crash-safe in 9.2, people are going to want to use pg_upgrade to migrate from older versions, bringing their possibly-not-quite-correct visibility map forks along with them. How should we handle that? We could (2A) arrange to have pg_upgrade nuke all visibility forks when upgrading from a release where the visibility map is not crash-safe to one where it is; (2B) keep a piece of state somewhere indicating, for each relation, whether or not the visibility map can be trusted, set it to false only if pg_upgrade brings the relation over from and older version, and set it to true after a successful vacuum that skips no intervening pages; or (2C) advise the user to do a VACUUM FULL on each of their tables pre-upgrade, and if they don't, treat wrong answers as their own fault. (I doubt anyone will advocate for this option, but for the sake of completeness...). (2A) seems like the simplest solution, especially because it also avoids the overhead of checking the is the visibility map bit reliable? flag every time we want to plan a query. 3. Statistics. I believe that in order to accurately estimate the cost of an index-only scan, we're going to need to know the fraction of tuples that are on pages whose visibility map bits are set. I believe it should be fairly straightforward to have ANALYZE collect this information; and I'm inclined to do that as a separate patch. It seems like it would also be nice to know what fraction of tuples are on pages that don't have the visibility map set but where, in fact, all tuples on the page are visible to all transactions, so it would be legal to set the bit. A large discrepancy between these two percentages might be a good reason to auto-vacuum the table (perhaps using a really lazy vacuum[2]). I'm not sure if this can be added cheaply, though, and in any case, any change to the set of criteria that will trigger an auto-vacuum is probably a can of worms. Thoughts? 4. Planner and executor changes. In contrast to Heikki's original implementation, I'm inclined to not to try to split the Index Scan node into index scan and heap fetch. Since there are many choices for where to put the heap fetch node (any level of the join tree between the index scan and the root), this seems likely to result in a combinatorial explosion of paths[3], and I'm not real sure that the payback will be adequate. Furthermore, the idea of allowing user code to see tuples that will only later be determined not to have been visible to that MVCC snapshot in the first place seems pretty scary from a security perspective, though certainly there are possible benefits[4]. Instead, I'm inclined to just have the planner evaluate whether the necessary columns can be extracted purely from the index. If not, we proceed as now. If so, we can use the index only approach of using the visibility map to decide which heap fetches can be skipped. It's not clear to me whether we need to compare the cost of the standard approach with the cost of the index only approach: in theory, if there aren't any visibility map bits anyway, the index only approach could be slower. But I'm not sure whether that problem is significant or common enough to be worth expending a lot of code on. Either way, the number of actual paths doesn't need to increase, because in this design, even if we apply a costing model, one approach will dominate the other. Heikki also suggested considering index scans in cases where we don't now[4, again] but I'm inclined to leave this, too, for a later optimization, again because balancing the increase in paths against the possible performance benefits of using indexes in more situations seems finicky. In short, for a first cut at this, I just want to look at this as a way to get cheaper index scans, and leave everything else to future work. Any thoughts welcome. Incidentally, if anyone else feels like working on this, feel
Re: [HACKERS] crash-safe visibility map, take five
On Fri, May 6, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Mar 30, 2011 at 8:52 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Another question: To address the problem in http://archives.postgresql.org/pgsql-hackers/2010-02/msg02097.php , should we just clear the vm before the log of insert/update/delete? This may reduce the performance, is there another solution? Yeah, that's a straightforward way to fix it. I don't think the performance hit will be too bad. But we need to be careful not to hold locks while doing I/O, which might require some rearrangement of the code. We might want to do a similar dance that we do in vacuum, and call visibilitymap_pin first, then lock and update the heap page, and then set the VM bit while holding the lock on the heap page. Here's an attempt at implementing the necessary gymnastics. Is there a quick synopsis of why you have to do (sometimes) the pin-lock-unlock-pin-lock mechanic? How come you only can fail to get the pin at most once? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] the big picture for index-only scans
On Mon, May 9, 2011 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: So, what do we need in order to find our way to index-only scans? 1. The visibility map needs to be crash-safe. The basic idea of index-only scans is that, instead of checking the heap to find out whether each tuple is visible, we first check the visibility map. If the visibility map bit is set, then we know all tuples on the page are visible to all transactions, and therefore the tuple of interest is visible to our transaction. Assuming that a significant number of visibility map bits are set, this should enable us to avoid a fair amount of I/O, especially on large tables, because the visibility map is roughly 8000 times smaller than the heap, and therefore far more practical to keep in cache. hm, what are the implications for tuple hint bits, short and long term? I'm particularly interested if you think any hint bit i/o mitigation strategies are worth pursuing. 2. Crash safe visibility map vs. pg_upgrade. Even if we make the visibility map crash-safe in 9.2, people are going to want to use pg_upgrade to migrate from older versions, bringing their possibly-not-quite-correct visibility map forks along with them. How should we handle that? We could (2A) arrange to have pg_upgrade nuke all visibility forks when upgrading from a release where the visibility map is not crash-safe to one where it is; +1 on 2A. 3. Statistics. I believe that in order to accurately estimate the cost of an index-only scan, we're going to need to know the fraction of tuples that are on pages whose visibility map bits are set. It would be helpful to know the performance benefit of index only scans before knowing how much benefit to attribute here. Maybe a system wide kludge would for starters anyway, like assuming 60% of pages can be vis checked from the VM, or a single GUC, Then again, maybe not. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XML with invalid chars
On Sun, May 08, 2011 at 06:25:27PM -0400, Andrew Dunstan wrote: On 04/27/2011 11:41 PM, Noah Misch wrote: On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote: On 04/27/2011 05:30 PM, Noah Misch wrote: To make things worse, the dump/reload problems seems to depend on your version of libxml2, or something. With git master, a CentOS 5 system with 2.6.26-2.1.2.8.el5_5.1 accepts the ^A byte, but an Ubuntu 8.04 LTS system with 2.6.31.dfsg-2ubuntu rejects it. Even with a patch like this, systems with a lenient libxml2 will be liable to store XML data that won't restore on a system with a strict libxml2. Perhaps we should emit a build-time warning if the local libxml2 is lenient? No, I think we need to be strict ourselves. Then I suppose we'd also scan for invalid characters in xml_parse()? Or, at least, do so when linked to a libxml2 that neglects to do so itself? Yep. I see you've gone with doing it unconditionally. I'd lean toward testing the library in pg_xml_init and setting a flag indicating whether we need the extra pass. However, a later patch can always optimize that. Injecting the check here aids xmlelement and xmlforest , but xmlcomment and xmlpi still let the invalid byte through. You can also still inject the byte into an attribute value via xmlelement. I wonder if it wouldn't make more sense to just pass any XML that we generate from scratch through libxml2. There are a lot of holes to plug, otherwise. Maybe there are, but I'd want lots of convincing that we should do that at this stage. Maybe for 9.2. I think we can plug the holes fairly simply for xmlpi and xmlcomment, and catch the attributes by moving this check up into map_sql_value_to_xml_value(). I don't have much convincing to offer -- hunting down the holes seem fine, too. I think I've done that. Here's the patch I have now. It looks like we can catch pretty much everything by putting checks in four places, which isn't too bad. Please review and try to break. Here are the test cases I tried: -- caught successfully SELECT E'\x01'::xml; SELECT xmlcomment(E'\x01'); SELECT xmlelement(name foo, xmlattributes(E'\x01' AS bar), ''); SELECT xmlelement(name foo, NULL, E'\x01'); SELECT xmlforest(E'\x01' AS foo); SELECT xmlpi(name foo, E'\x01'); SELECT query_to_xml($$SELECT E'\x01'$$, true, false, ''); -- not caught SELECT xmlroot('root/', version E'\x01'); SELECT xmlcomment(E'\ufffe'); -- not directly related, but also wrongly accepted SELECT xmlroot('root/', version ' '); SELECT xmlroot('root/', version 'foo'); Offhand, I don't find libxml2's handling of XML declarations particularly consistent. My copy's xmlCtxtReadDoc() API (used by xml_in when xmloption = document) accepts '?xml version=foo?' but rejects '?xml version= ?'. Its xmlParseBalancedChunkMemory() API (used by xml_in when xmloption = content) accepts anything, even control characters. The XML 1.0 standard is stricter: the version must match ^1\.[0-9]+$. We might want to tighten this at the same time. diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index ee82d46..12cfd56 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -142,6 +142,20 @@ static void SPI_sql_row_to_xmlelement(int rownum, StringInfo result, #define NAMESPACE_XSI http://www.w3.org/2001/XMLSchema-instance; #define NAMESPACE_SQLXML http://standards.iso.org/iso/9075/2003/sqlxml; +/* forbidden C0 control chars */ +#define FORBIDDEN_C0 \ + \x01\x02\x03\x04\x05\x06\x07\x08\x0B\x0C\x0E\x0F\x10\x11 \ + \x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F + +static inline void +check_forbidden_c0(char * str) +{ + if (strpbrk(str,FORBIDDEN_C0) != NULL) + ereport(ERROR, + (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg(character out of range), + errdetail(XML does not support control characters.))); This would be an errhint, I think. However, the message seems to emphasize the wrong thing. XML 1.0 defines a lexical production called Char that includes various Unicode character ranges. Control characters as we know them happen to not fall in any of those ranges. The characters aren't unsupported in the sense of being missing features; the language simply forbids them. libxml2's error message for this case is PCDATA invalid Char value 1 (assuming \x01). Mentioning PCDATA seems redundant, since no other context offers greater freedom. How about: ereport(ERROR, (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), errmsg(invalid XML 1.0 Char \\U%08x, char_val))); nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Christopher Browne wrote: Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays
J. Greg Davidson g...@ngender.net writes: * Tighten casting checks for domains based on arrays (Tom Lane) When a domain is based on an array type,..., such a domain type is no longer allowed to match an anyarray parameter of a polymorphic function, except by explicitly downcasting it to the base array type. This will require me to add hundreds of casts to my code. I do not get how this will Tighten casting checks. It will certainly not tighten my code! Could you explain how it is good to not be able to do array operations with a type which is an array? The discussion that led up to that decision is in this thread: http://archives.postgresql.org/pgsql-hackers/2010-10/msg01362.php specifically here: http://archives.postgresql.org/pgsql-hackers/2010-10/msg01545.php The previous behavior was clearly broken. The new behavior is at least consistent. It might be more user-friendly if we did automatic downcasts in these cases, but we were not (and still are not) doing automatic downcasts for domains over scalar types in comparable cases, so it's not very clear why domains over array types should be treated differently. To be concrete, consider the function array_append(anyarray, anyelement) yielding anyarray. Suppose we have a domain D over int[] and the call array_append(var_of_type_D, 42). If we automatically downcast the variable to int[], should the result of the function be considered to be of type D, or type int[]? This isn't a trivial distinction because choosing to consider it of type D means we have to re-check D's domain constraints, which might or might not be satisfied by the modified array. Previous releases considered the result to be of type D, *without* rechecking the domain constraints, which was flat out wrong. So we basically had three alternatives to make it better: * downcast to the array type, which would possibly silently break applications that were relying on the function result being considered of the domain type * re-apply domain checks on the function result, which would be a performance hit and possibly again result in unobvious breakage * explicitly break it by throwing a parse error until you downcast (and then upcast the function result if you want) I realize that #3 is a bit unpleasant, but are either of the other two better? At least #3 shows you where you need to check for problems. There is another issue that wasn't really mentioned in the previous thread, which is that if we are matching a domain-over-array to a function's ANYARRAY argument, what exactly should be allowed to match to ANYELEMENT --- or if the function returns ANYELEMENT, what should the imputed result type be? AFAICS it's impossible to give an answer to that without effectively deciding that function argument matching smashes the domain to its base type (the array type). It's not very clear what's the point of a domain type if every operation on it is going to neglect its domain-ness. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unfriendly handling of pg_hba SSL options with SSL off
Bruce Momjian br...@momjian.us writes: Late reply, but we are basically ignoring 'local' lines if the build doesn't support unix domain sockets (windows), but throwing an error for hostssl usage if ssl is not compiled in. Is the only logic here that 'local' is part of the default pg_hba.conf and hostssl is not? Is that good logic? I wouldn't have a problem with making the Windows port throw an error for local lines. We'd have to fix initdb to remove that line from the sample file (if it doesn't already), but that's surely not hard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
2011/5/10 Bruce Momjian br...@momjian.us: Christopher Browne wrote: Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. no - you are little bit confused :). CALL and function execution shares nothing. There is significant differences between function and procedure. Function is called only from executor - from some plan, and you have to know a structure of result before run. The execution of CALL is much simple - you just execute code - without plan and waiting for any result - if there is. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. We can use a SETOF cursors for returning a multiple result sets now. But there are a few complications: a) The client should to wait for finish of all sets from multiple result sets - minimally in PL/pgSQL b) client 'psql' doesn't support a unpacking result when result is multiple result set c) The using cursors isn't too comfortable - in comparation to MS SQL or MySQL Regards Pavel -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers