[HACKERS] Removing unreferenced files
Hello,

This is in regard to the patch[0] posted in 2006, based on previous work[1]. Below is a summary of the issues as I currently understand them, along with some questions.

Initial questions that had no consensus in previous discussions:

1. Approach on file handling undecided
2. Startup vs standalone tool
3. If startup: how to determine when to run

Outstanding problems (point-for-point with original 2005 post):

1. Does not work with non-standard tablespaces. The check will even report that all files are stale, etc.
2. Has issues with stale subdirs of a tablespace (subdirs corresponding to a nonexistent database) [appears related to #1 because of maintenance mode and not failing]
3. Assumes relfilenode is unique database-wide when it’s only safe to assume it is unique tablespace-wide
4. Does not examine table segment files such as “nnn.1” - it should instead complain when “nnn” does not match a hash entry
5. It loads every value of relfilenode in pg_class into the hash table without checking whether it is meaningful - needs to check.
6. strtol vs strspn (or other) [not sure what the problem here is. If errors are handled correctly this should not be an issue]
7. No checks for readdir failure [this should be easy to check for]

Other thoughts:

1. What to do if a problem happens during drop table/index and the files that should be removed are still there. The DBA needs to know when this happens somehow.
2. What happened to pgfsck: was that a better approach? Why was that abandoned?
3. What to do about stale files and missing files

References:

0 - http://www.postgresql.org/message-id/200606081508.k58f85m29...@candle.pha.pa.us
1 - http://www.postgresql.org/message-id/8291.1115340...@sss.pgh.pa.us

Ron

--
Command Prompt, Inc.
http://www.commandprompt.com/
+1-800-492-2240
PostgreSQL Centered full stack support, consulting, and development.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
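For the "Removing unreferenced files" list above, outstanding problems 3, 4 and 6 can be illustrated with a small sketch. This is Python rather than the patch's C, and every name in it (parse_datafile_name, find_stale, the shape of the known set) is hypothetical, not taken from the patch. It assumes the caller has already built a set of (tablespace OID, relfilenode) pairs from pg_class and has dropped entries that are not meaningful (problem 5).

```python
import re

# Hypothetical names throughout; none of this comes from the patch itself.
_DATAFILE = re.compile(r"([0-9]+)(?:\.([0-9]+))?")


def parse_datafile_name(name):
    """Split "nnn" or "nnn.N" into (relfilenode, segment); None for other names."""
    m = _DATAFILE.fullmatch(name)
    if m is None:
        # Reject trailing junk outright instead of silently truncating it,
        # which is one way to read the strtol vs strspn concern.
        return None
    return int(m.group(1)), int(m.group(2) or 0)


def find_stale(tablespace_oid, filenames, known):
    """Return file names whose (tablespace, relfilenode) pair is not in `known`.

    `known` is assumed to be a set of (tablespace_oid, relfilenode) pairs,
    so the same relfilenode in two tablespaces is kept distinct.
    """
    stale = []
    for name in filenames:
        parsed = parse_datafile_name(name)
        if parsed is None:
            continue  # not a relation file name; a real tool must decide what to do
        relfilenode, _segment = parsed
        # Segment files ("nnn.1") are judged by their base relfilenode "nnn".
        if (tablespace_oid, relfilenode) not in known:
            stale.append(name)
    return stale
```

Keying the set on the pair rather than on relfilenode alone is the design choice that matters here: a file is only considered referenced if pg_class places that relfilenode in the tablespace being scanned.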
Re: [HACKERS] LOCK for non-tables
Tom Lane wrote:
> Simon Riggs si...@2ndquadrant.com writes:
>> It's a major undertaking trying to write software that runs against
>> PostgreSQL for more than one release. We should be making that easier,
>> not harder.
>
> None of the proposals would make it impossible to write a LOCK statement
> that works on all available releases, [...]

+1 for this as a nice guideline/philosophy for syntax changes in general.

Personally I don't mind changing a few SQL statements when I upgrade to a new release; but it sure is nice if there's at least some syntax that works on both a current and previous release.
Re: [HACKERS] Compatibility GUC for serializable
Josh Berkus wrote:
>> Mainly, that it's not clear we need it. Nobody's pointed to a concrete
>> failure mechanism that makes it necessary for an existing app to run
>> under fake-SERIALIZABLE mode.
>
> I think it's quite possible that you're right, and nobody depends on
> current SERIALIZABLE behavior because it's undependable. However, we
> don't *know* that -- most of our users aren't on the mailing lists,
> especially those who use packaged vendor software.
>
> That being said, the case for a backwards-compatibility GUC is weak, and
> I'd be ok with not having one barring someone complaining during beta, or
> survey data showing that there's more SERIALIZABLE users than we think.
>
> Oh, survey: http://www.postgresql.org/community/

That survey's missing one important distinction for that discussion. Do you take the current survey answer

  "Yes, we depend on it for production code"

to imply

  "Yes, we depend on actual real SERIALIZABLE transactions in production and will panic if you tell us we're not getting that"

or

  "Yes, we depend on the legacy not-quite SERIALIZABLE transactions in production and don't want real serializable transactions"?
Re: [HACKERS] Why percent_rank is so slower than rank?
Tom Lane wrote:
> argue that there was a regression. It's certainly a performance bug
> though: nobody would expect that giving a query *more* work_mem would
> cause it to run many times slower.

I wouldn't be that surprised - otherwise it'd just be hard-coded to something large.

Especially since earlier in the thread:

> The example is *not* particularly slow if you leave work_mem at default.

which makes me think it's arguably not quite a bug.
Re: [HACKERS] Spread checkpoint sync
Josh Berkus wrote:
> On 11/20/10 6:11 PM, Jeff Janes wrote:
>> True, but I think that changing these from their defaults is not
>> considered to be a dark art reserved for kernel hackers, i.e. they are
>> something that sysadmins are expected to tweak to suit their workload,
>> just like the shmmax and such.
>
> I disagree. Linux kernel hackers know about these kinds of parameters,
> and I suppose that Linux performance experts do. But very few
> sysadmins, in my experience, have any idea.

To me, a lot of this conversation feels parallel to the arguments that occasionally come up debating writing directly to raw disks, bypassing the filesystems altogether.

Might smoother checkpoints be better solved by talking to the OS vendors' virtual-memory-tuning-knob authors and working with them on exposing the ideal knobs, rather than saying that our only tool is a hammer (fsync) so the problem must be handled as a nail?

Hypothetically - what would the ideal knobs be?

Something like madvise WONTNEED, but one that leaves pages in the OS's cache after writing them?
Re: [HACKERS] Hash support for arrays
Tom Lane wrote:
> It's possible that the multiply-by-31 method is as good as the
> rotate-and-xor method by that measure, or even better; but it's far from
> obvious that it's better. And I'm not convinced that the multiply method
> has a pedigree that should encourage me to take it on faith.

Short summary:
 * some history of its trivia follows
 * (nothing here suggests it's better - just old and common and cheap)

Longer - some trivia regarding its pedigree:

It (or at least a *31 variant) seems to have a history of advocacy going back to Chris Torek in 1990:
http://groups.google.com/group/comp.lang.c/browse_thread/thread/28c2095282f0c1b5/193be99e9836791b?q=#193be99e9836791b

  #define HASH(str, h, p) \
      for (p = str, h = 0; *p;) h = (h << 5) - h + *p++

and gets referred to in Usenix papers in the early 90's as well:
http://www.usenix.com/publications/library/proceedings/sa92/salz.pdf

Regarding why the magic number 31 [or 33, which also often comes up]: apparently the only thing magic about it is that it's an odd number != 1. The rest of the odd numbers work about as well, according to this guy who tried to explain it:
http://svn.eu.apache.org/repos/asf/apr/apr/trunk/tables/apr_hash.c

 * The magic of number 33, i.e. why it works better than many other
 * constants, prime or not, has never been adequately explained by
 * anyone. So I try an explanation: if one experimentally tests all
 * multipliers between 1 and 256 (as I did while writing a low-level
 * data structure library some time ago) one detects that even
 * numbers are not useable at all. The remaining 128 odd numbers
 * (except for the number 1) work more or less all equally well.
 * They all distribute in an acceptable way and this way fill a hash
 * table with an average percent of approx. 86%.
 *
 * If one compares the chi^2 values of the variants (see
 * Bob Jenkins ``Hashing Frequently Asked Questions'' at
 * http://burtleburtle.net/bob/hash/hashfaq.html for a description
 * of chi^2), the number 33 not even has the best value. But the
 * number 33 and a few other equally good numbers like 17, 31, 63,
 * 127 and 129 have nevertheless a great advantage to the remaining
 * numbers in the large set of possible multipliers: their multiply
 * operation can be replaced by a faster operation based on just one
 * shift plus either a single addition or subtraction operation. And
 * because a hash function has to both distribute good _and_ has to
 * be very fast to compute, those few numbers should be preferred.
 ...
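The "one shift plus a single addition or subtraction" point in the hash thread above can be checked directly: (h << 5) - h is h * 31 and (h << 5) + h is h * 33. A minimal sketch in Python follows; the function names and the 32-bit mask are my own assumptions for illustration (the original macro relies on C integer arithmetic), so treat it as a demonstration of the identity rather than a port of anyone's code.

```python
MASK32 = 0xFFFFFFFF  # assumption: emulate a 32-bit unsigned accumulator


def hash_shift31(data: bytes) -> int:
    """Multiply-by-31 hash written as shift-and-subtract, like the Torek macro."""
    h = 0
    for byte in data:
        h = ((h << 5) - h + byte) & MASK32
    return h


def hash_mul31(data: bytes) -> int:
    """The same hash written with an explicit multiplication."""
    h = 0
    for byte in data:
        h = (h * 31 + byte) & MASK32
    return h


def hash_shift33(data: bytes) -> int:
    """Multiply-by-33 variant: one shift plus one addition."""
    h = 0
    for byte in data:
        h = ((h << 5) + h + byte) & MASK32
    return h


def hash_mul33(data: bytes) -> int:
    """Multiply-by-33 variant with an explicit multiplication."""
    h = 0
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h
```

For example, hash_shift31(b"abc") gives 96354, the same value hash_mul31 produces. None of this says anything about distribution quality, which is the question Tom raised; it only shows why these multipliers are cheap.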
Re: [HACKERS] Synchronization levels in SR
Markus Wanner wrote:
> On 09/07/2010 02:16 PM, Robert Haas wrote:
>> practice, this means that the master and standby need to compare notes
>> on the ending WAL location and whichever one is further advanced needs
>> to stream the intervening records to the other.
>
> Not necessarily, no. Remember that the client didn't get a commit
> confirmation. So reverting might also be a correct solution (i.e. not
> violating the durability constraint).

In that situation, wouldn't it be possible that a different client queried the slave and already saw the result of that transaction which would later be rolled back?
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> [ shrug... ] I stated before that the Hot Standby patch is doing
>>> utterly unsafe things in signal handlers. Simon rejected that.
>>> I am waiting for irrefutable evidence to emerge from the field
>>> (and am very confident that it will be forthcoming...) [...]
>>
>> [...]Why are we releasing 9.0 with this problem again? Surely this is nuts.

Will the docs give enough info so that release note readers will know when they're giving well-informed consent to volunteer to produce such field evidence?
Re: [HACKERS] antisocial things you can do in git (but not CVS)
Robert Haas wrote:
> If git had a place to store all the information we care about, that would
> be fine...
> There's no reviewer header, and there's no concept that a patch might
> have come from the author (or perhaps multiple authors), but then have
> been adjusted by one or more reviewers and then frobnicated some more by
> the committer
> ...
> I don't think that non-linear history is an advantage in any situation.

ISTM we could have the best of both those worlds - linear history and author/reviewer/committer information.

Instead of squashing every patch into a single commit, what if it got squashed into perhaps 3 separate commits -- one as submitted, one as reviewed, and one as re-written by the committer? History stays linear; and you keep the most relevant parts of the history, while dropping all the development brainstorming commits.

And ISTM the patch reviewer could be responsible for this squashing so it's not much more work for the committer.

For example, instead of

  commit 351c0b92eca40c1a36934cf83fe75db9dc458cde
  Author: Robert Haas robertmh...@gmail.com
  Date: Fri Jul 23 00:43:00 2010 +

      Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
      Andres Freund, with cleanup and adjustment for older branches by me.

we'd see

  Author: Andres Freund
  Date: Fri Jul 23 00:43:00 2010 +

      Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
      Patch as originally submitted to commitfest.

  Author: [Whomever the reviewer was]
  Date: Fri Jul 23 00:43:00 2010 +

      Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
      Patch marked ready for committer by reviewer.

  Author: Robert Haas robertmh...@gmail.com
  Date: Fri Jul 23 00:43:00 2010 +

      Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
      Patch as rewritten by committer.

For a complex enough patch with many authors, the reviewer could choose to keep an extra author commit or two to credit the extra authors.
Re: [HACKERS] Keepalive for max_standby_delay
Robert Haas wrote:
> On Wed, Jun 16, 2010 at 9:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Sorry, I've been a bit distracted by other responsibilities (libtiff
>> security issues for Red Hat, if you must know). I'll get on it shortly.
>
> What? You have other things to do besides hack on PostgreSQL? Shocking! :-)

I suspect you're kidding, but in case some on the list didn't realize, Tom's probably as famous (if not more so) in the image compression community as he is in the database community:

http://www.jpeg.org/jpeg/index.html
  "Probably the largest and most important contribution however was the work of the Independent JPEG Group (IJG), and Tom Lane in particular."

http://www.w3.org/TR/PNG-Credits.html , http://www.w3.org/TR/PNG/
  "PNG (Portable Network Graphics) Specification Version 1.0 ... Contributing Editor Tom Lane, t...@sss.pgh.pa.us"

http://www.fileformat.info/format/tiff/egff.htm
  "... by Dr. Tom Lane of the Independent JPEG Group, a member of the TIFF Advisory Committee"
Re: [HACKERS] Specification for Trusted PLs?
Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> So... can we get back to coming up with a reasonable definition,
>
> (1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what Flash seems to do in web browsers), could that be considered trusted?
Re: [HACKERS] Anyone know if Alvaro is OK?
Jaime Casanova wrote:
> At Saturday, 02/27/2010 on 4:21 pm Marc G. Fournier scra...@hub.org wrote:
> On Sat, Feb 27, 2010 at 10:45 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote:
>
> Is there a higher than normal amount of earthquakes happening recently?
>
> Re: the more frequent earthquakes, yeah I was thinking the same today.
> An actual scientific study would be more useful than idle speculation
> though
>
> This is a technical list so I won't insist on this, but those of you that
> wanna give a try can read Matthew 24:3, 7, 8 and Luke 21:11

I find these links useful:

http://earthquake.usgs.gov/earthquakes/eqinthenews/2010/
http://earthquake.usgs.gov/earthquakes/eqinthenews/2009/
...

I note:

  an 8.1 in Samoa in Sep 2009
  no 8.x's in 2008
  an 8.5 in Sumatra Sep 12 2007
  an 8.0 in Peru, Aug 2007
  an 8.1 in Solomon Islands Apr 2007
  an 8.1 in Kuril Islands Jan 13 2007
  an 8.3 in Kuril Islands Nov 2006
  an 8.7 in Sumatra, March 2005
  an 8.1 in Macquarie Island Dec 2004
  an 8.3 in Hokkaido Japan, Sep 2003

So yeah, if we're counting 8.8+'s, this year's worse than usual; but 2005's 8.7 is close. But if we're counting anything over 8.0, 2007's up there as well.
Re: [HACKERS] scheduler in core
Lucas wrote:
> I believe that in core may be installed by default in case of

Those seem like totally orthogonal concepts to me.

A feature may be in core but not installed by default (like many PLs). A feature might not be in core but installed by many installers (say postgis).

It seems like half the people here are arguing for the former concept. It seems the other half are arguing against the latter concept.

Is the real need here for a convenient way to enable and/or recommend packagers to install some non-core modules by default?
Re: [HACKERS] RFC: PostgreSQL Add-On Network
Magnus Hagander wrote:
> On Fri, Jan 8, 2010 at 05:22, Ron Mayer rm...@cheapcomplexdevices.com wrote:
>> David E. Wheeler wrote:
>>> On Jan 7, 2010, at 1:31 PM, Dave Page wrote:
>>>> No, I'm suggesting the mechanism needs to support source and binary
>>>> distribution. For most *nix users, source will be fine. For Windows,
>>>> binaries are required.
>>>
>>> I would love to follow what Strawberry Perl has done to solve this
>>> problem. In 2.0.
>>
>> +1. They did a nice job.
>
> For those of us who have no idea what Strawberry Perl did (other than
> not shipping Microsoft compatible libraries, and is thus useless for
> PostgreSQL), could someone explain it?

As far as I can tell, they shipped the minimal set of tools that they needed to build extensions rather than distribute binaries.

http://en.wikipedia.org/wiki/Strawberry_Perl

I don't know the details, but it works smoothly for them.
Re: [HACKERS] RFC: PostgreSQL Add-On Network
David E. Wheeler wrote:
> On Jan 7, 2010, at 1:31 PM, Dave Page wrote:
>> No, I'm suggesting the mechanism needs to support source and binary
>> distribution. For most *nix users, source will be fine. For Windows,
>> binaries are required.
>
> I would love to follow what Strawberry Perl has done to solve this
> problem. In 2.0.

+1. They did a nice job.
Re: [HACKERS] Setting oom_adj on linux?
Tom Lane wrote:
> Magnus Hagander mag...@hagander.net writes:
>> ...oom_adj...
>
> One interesting thing I read there is:
>     Swapped out tasks are killed first. Half of each child's memory
>     size is added to the parent's score if they do not share the
>     same memory.
>
> This suggests that PG's shared memory ought not be counted in the
> postmaster's OOM score, which would mean that the problem shouldn't be
> quite as bad as we've believed. I wonder if that is a recent change?

Or maybe it's supposed to be that way and is not implemented correctly?

The code for oom_kill.c looks fairly readable (link below [1]):

    points = mm->total_vm;
    /* ... */
    list_for_each_entry(child, &p->children, sibling) {
        task_lock(child);
        if (child->mm != mm && child->mm)
            points += child->mm->total_vm/2 + 1;
        task_unlock(child);
    }

This seems to add points for each child that doesn't share the same mm structure as the parent, which I think is quite a bit stricter an interpretation of "if they do not share the same memory".

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=mm/oom_kill.c;h=f52481b1c1e5442c9a5b16b06b1b75b9bb7c;hb=HEAD
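To make the effect of that oom_kill.c excerpt concrete, here is a small Python model of just the lines quoted above. It is a sketch under stated assumptions: the MM and Task classes and base_points are hypothetical stand-ins for the kernel structures, only the quoted fragment is modelled (the real function applies further adjustments not shown here), and the numbers in the usage note are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MM:
    """Stand-in for an mm structure; only total_vm matters here."""
    total_vm: int


@dataclass
class Task:
    """Stand-in for a task: its mm (None for a task without one) and its children."""
    mm: Optional[MM]
    children: List["Task"] = field(default_factory=list)


def base_points(task: Task) -> int:
    """Model of the quoted fragment: own total_vm plus half of each child's
    total_vm (plus one), counting only children with a *different* mm object."""
    mm = task.mm
    points = mm.total_vm
    for child in task.children:
        # Identity comparison mirrors the pointer comparison child->mm != mm.
        if child.mm is not None and child.mm is not mm:
            points += child.mm.total_vm // 2 + 1
    return points
```

With a parent whose total_vm is 1000 and two forked children whose own mm objects each report 800, the model yields 1000 + 401 + 401 = 1802. The test is whether the mm object is the same one, not whether the pages behind it overlap, which is the stricter reading described above.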
Re: [HACKERS] creating index names automatically?
Peter Eisentraut wrote:
> Could we create an option to create index names automatically, so you'd
> only have to write
>
> CREATE INDEX ON foo (a);
>
> which would pick a name like foo_a_idx.

Why wouldn't it default to a name more like:

  CREATE INDEX "foo(a)" on foo(a);

which would extend pretty nicely to things like:

  CREATE INDEX "foo USING GIN(hstore)" ON foo USING GIN(hstore);

Seems to be both more readable and less chance for arbitrary collisions if I have column names with underscores. Otherwise, what would the rule be for distinguishing create index on foo(a_b) from create index on foo(a,b), etc.?
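The collision concern in the index-naming message above is easy to show. The sketch below is Python and purely illustrative: underscore_name assumes the rule is "table, then columns, then idx, joined by underscores" (inferred only from the foo_a_idx example in the quote), and literal_name assumes the name is the table followed by the parenthesized column list. Neither is claimed to be what PostgreSQL does.

```python
from typing import Sequence


def underscore_name(table: str, columns: Sequence[str]) -> str:
    """Assumed rule: join table, columns and a fixed 'idx' suffix with underscores."""
    return "_".join([table, *columns, "idx"])


def literal_name(table: str, columns: Sequence[str]) -> str:
    """Assumed rule: table name followed by the parenthesized column list."""
    return f"{table}({','.join(columns)})"
```

Under the first rule, an index on foo(a_b) and an index on foo(a,b) both come out as foo_a_b_idx, so some tie-breaking step would be needed; under the second they come out as foo(a_b) and foo(a,b), at the cost of names that must always be double-quoted.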
Re: [HACKERS] PATCH: Add hstore_to_json()
+1 for such a feature, simply to avoid the need to write an hstore parser (which wasn't too bad to write, but it felt unnecessary). Doesn't matter to me if it's hstore-to-json or hstore-to-xml or hstore-to-yaml. Just something that parsers are readily available for.

Heck, I wouldn't mind if hstore moved to using any one of those for its external representations by default.

Tom Lane wrote:
> a ton of special syntax for xml support, ...a json type...
> [ I can already hear somebody insisting on a yaml type :-( ]

If these were CPAN-like installable modules, I'd hope there would be eventually. Don't most languages and platforms have both YAML and JSON libraries? YAML's user-defined types are an example of where this might be useful eventually.

Tom Lane wrote:
> Well, actually, now that you mention it: how much of a json type would
> be duplicative of the xml stuff? Would it be sufficient to provide
> json <-> xml converters and let the latter type do all the heavy lifting?

I imagine eventually a JSON type could validate fields using JSON Schema. But that's drifting away from hstore.

> (If so, this patch ought to be hstore_to_xml instead.)

Doesn't matter to me so long as it's any format with readily available parsers.
Re: [HACKERS] Adding support for SE-Linux security
Bruce Momjian wrote:
> Well, the bottom line is that this effort should grow the development
> and user community of Postgres --- if it doesn't, it is a failure.

Really? Even if it only allows existing Postgres users and companies to expand their use into higher security applications, IMHO it's a success.

If a main goal were increasing users, implementing MySQL-isms and MSFT SQL Server-isms would seem the biggest bang for the buck.
Re: [HACKERS] explain output infelicity in psql
Alvaro Herrera wrote:
> Robert Haas escribió:
>> On first blush, I'm inclined to suggest that the addition of + signs to
>> mark continuation lines is a misfeature.
>
> EXPLAIN is a special case. IMHO it should be dealt with accordingly.

Is it? Wouldn't it affect anyone who stuck XML in a text column and wanted to copy and paste it into a parser?

Perhaps single-column output usually won't want the + signs (because it's copy-and-pasteable) but multi-column output could?
Re: [HACKERS] explain output infelicity in psql
Tom Lane wrote:
> Why don't you just do \pset format unaligned (or \a if you're lazy)?

That's fair. Now that I see it, I guess I should have been doing that for copy-and-paste work anyway.
Re: [HACKERS] YAML Was: CommitFest status/management
Greg Smith wrote:
> That's a step backwards. By providing JSON format, we've also satisfied
> people who want YAML. Ripping out JSON would mean we *only* support YAML.
> There are far more JSON parsers than YAML parsers, which is why I thought
> the current code committed was good enough.

XML parsers are common enough that, IMHO, the other computer-readable formats can't be that important from a computer-readability perspective, leaving their main benefit as being human-friendly.

I like YAML output; but I think the most compelling argument against the patch is that if so many people want different formats, it may be a good use case for external modules.

And far more than YAML output, I'd like to see a flexible module system with an equivalent of "cpan install yaml" or "gem install yaml".

I suppose one could argue that instead of YAML we design a different human-oriented format for loosely structured data; but that seems even harder.
Re: [HACKERS] Adding support for SE-Linux security
Robert Haas wrote:
> On Thu, Dec 3, 2009 at 5:23 PM, Josh Berkus j...@agliodbs.com wrote:
>> Kaigai, you've said that you could get SELinux folks involved in the
>> patch review. I think it's past time that they were; please solicit them.
>
> Actually, we tried that already, in a previous iteration of this
> discussion. Someone actually materialized and commented on a few
> things. The problem, as I remember it, was that they didn't know much
> about PostgreSQL, so we didn't get very far with it. Unfortunately, I
> can't find the relevant email thread at the moment.

IIRC, at least a couple of the guys mentioned on the NSA's SE-Linux page[1] participated - Joshua Brindle[2] and Chad Sellers[3] (in addition to Kaigai/NEC, who's credited on the NSA site as well). Perhaps one or two others too - but with common names it's hard to guess.

Links to the threads with Chad and Joshua below.

[1] http://www.nsa.gov/research/selinux/contrib.shtml
[2] http://www.google.com/search?q=site%3Aarchives.postgresql.org+brindle
[3] http://www.google.com/search?q=site%3Aarchives.postgresql.org+chad+sellers
Re: [HACKERS] YAML Was: CommitFest status/management
Josh Berkus wrote:
> ... YAML for easier readability ... almost as good ...
> I agree with Kevin that it's more readable.
>
> Again, if there were a sensible way to do YAML as a contrib module, I'd
> go for that, but there isn't.

Perhaps that'd be a direction this could take? What would it take for this feature to be a demo/example for some future module system?

It seems like there have been a few recent features related to decorating output (UTF8 tables, YAML explain, \d... updates).

While there's no great way to make this a contrib module today, would it make sense to add such hooks for an eventual module system?
Re: [HACKERS] YAML Was: CommitFest status/management
Tom Lane wrote:
> Andrew Dunstan and...@dunslane.net writes:
>> YAML...
>
> Hmm. So the argument for it is "let's make a machine-readable format
> more human-readable"? I'm not getting the point. People should look
> at the regular text output.

IMHO YAML beats the regular text format for human-readability - at least for people with narrow terminal windows, and for novices. Greg posted examples comparing regular text vs yaml vs json here:
http://archives.postgresql.org/pgsql-hackers/2009-08/msg02090.php

I think it's more human-readable for novices since it explicitly spells out which values refer to startup values vs totals. I think it's more human-readable to me because the current text format frequently wraps for me on even a modestly complex query, and I find scrolling down easier than scrolling both ways.

None of the other machine-intended formats seem to suit that purpose well because they're dominated by a lot of markup.

That said, though, it's not that big a deal.
Re: [CORE] [HACKERS] EOL for 7.4?
Dave Page wrote:
> On Tue, Dec 1, 2009 at 4:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> ... 8.1 in RHEL5 ...

+1 for letting 7.* and 8.0 die whenever no one's motivated to bother supporting them anymore.

> Presumably you'll be on the hook until 2014 for 8.1 security patches
> I can't see the community wanting to support it for that long

-1 for letting 8.1 die while someone major is still supporting it, even if that means EOLing 8.2 before 8.1.

As a PG user, it's confidence-inspiring to see a project that can provide 7 years of support on a version.

As a Red Hat customer, I'd feel happier if my database were not considered dead by the upstream community.

It also feels more in the spirit of open source to me -- where if one member is willing to put in work (Red Hat/Tom), the benefits are shared back; and in exchange the rest of the community can help with that contribution.

> I'm for EOLing *at least* 7.4 and 8.0 by January 2011, and I'm
> certainly not going to argue against doing the same for 8.1. Frankly,
> I think we could do 7.4 and maybe 8.0 six months earlier.

I think the best would be to say 7.4 and 8.0 end in Jan 2011, and 8.1 switches to only high-priority security patches at that date.
Re: [HACKERS] Adding support for SE-Linux security
KaiGai Kohei wrote:
> Needless to say, NEC is also a supporter to develop and maintain
> SE-PgSQL feature. We believe it is a necessity feature to construct
> secure platform for SaaS/Cloud computing, so my corporation has funded
> to develop SE-PgSQL for more than two years.

Rather than "needless to say", I think this is worth elaborating on. Knowing how companies like NEC and their customers see SELinux and SE-PgSQL helping their database projects would probably be one of the most compelling stories for getting broader support for the feature.

Before googling "nec software" after seeing you mention this, I knew very little about NEC's software business. I can read some about NEC's software/database business for NEC North America[1] and NEC Global Services[2], but imagine globally there's even more to it than that.

Understanding how SE-PgSQL (and presumably SE-Linux) helps build a better SaaS/Cloud computing platform would probably help many people support this feature more. The cloud computing platforms I see more are ones that isolate a user's data either at a higher application layer (like salesforce) or a lower virtual machine layer (like amazon's elastic cloud).

Is a vision of SE-PgSQL to help cloud computing companies sell customers access to a single underlying postgres instance, and share selected data between each other at a row level? Just curious.

[1] http://www.necam.com/EntSw/
[2] http://www.necgs.com/partners.php
Re: [HACKERS] SE-PgSQL patch review
Joshua D. Drake wrote:
> On Tue, 2009-12-01 at 14:46 -0500, Tom Lane wrote:
>> Joshua D. Drake j...@commandprompt.com writes:
>>> On Mon, 2009-11-30 at 20:28 -0800, David Fetter wrote:
>>>> This is totally separate from the really important question of whether
>>>> SE-Linux has a future, and another about whether, if SE-Linux has a
>>>> future, PostgreSQL needs to go there.
>>>
>>> Why would we think that it doesn't?
>>
>> Have you noticed anyone except Red Hat taking it seriously?
>
> I just did a little research and it appears the other two big names in
> this world (Novell and Ubuntu) are using something called AppArmor.

How much of SE-PgSQL would also complement the AppArmor framework?

Also, yet another MAC system called Tomoyo from NTT was merged into the Linux kernel earlier this year.

Is SE-PgSQL orthogonal and/or complementary to all of those?

Since I see MAC features continuing to be added to operating systems, I can certainly imagine they're important to some customers.
Re: [HACKERS] cvs chapters in our docs
Brendan Jurd wrote:
> 2009/11/29 Bruce Momjian br...@momjian.us:
>> Wow, we mention 28k modems --- we are legacy software: ;-)
>>
>>     This initial checkout is a little slower than simply downloading
>>     a <filename>tar.gz</filename> file; expect it to take 40 minutes
>>     or so if you have a 28.8K modem.
>
> Yes, and what about all the people using carrier pigeon to download
> Postgres? I think our documentation is neglecting this substantial
> and vital portion of our user community.

Never underestimate the bandwidth of a carrier pigeon with a flash card tied to his leg. [1]

  "11-month-old bird armed with a 4GB memory stick... the carrier pigeon delivered 4GB of data 60 miles in a little over an hour"

[1] http://www.dslreports.com/shownews/Carrier-Pigeon-Officially-Beats-DSL-104393
Re: [HACKERS] ORDER BY vs. volatile functions
Andrew Gierth wrote:

This query:

select random() from generate_series(1,10) order by random();

produces sorted output. Should it?

I recall a workaround from a different thread[1], if we're specifically looking for random ordering of random numbers:

select random() from foo order by random()+1;

The thread has more odd corner cases with multiple calls to random() and sorts as well.

[1] http://archives.postgresql.org/pgsql-general/2006-11/msg01544.php
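
The two queries above can be tried side by side. This is a minimal sketch, assuming a PostgreSQL session; the alias r is added here only for readability and does not come from the thread. As far as I understand it, in the first query the ORDER BY expression is identical to the select-list expression, so the rows are sorted by the value that is displayed; in the second, random()+1 is a different expression, so the sort key comes from a separate call to random().

```sql
-- ORDER BY expression is identical to the select-list expression:
-- the output comes back sorted, as reported in the thread.
SELECT random() AS r
FROM generate_series(1,10)
ORDER BY random();

-- Workaround quoted above: random()+1 is a different expression,
-- so the displayed values are not the sort keys.
SELECT random() AS r
FROM generate_series(1,10)
ORDER BY random()+1;
```

Comparing the two result sets by eye is enough: the first should be ascending, the second should not be.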
Re: [HACKERS] next CommitFest
Robert Haas wrote:

That wasn't my intention. I really was assuming that we would just let those patches drop on the floor, and that they would not be picked up either by reviewers or committers.

Surely it should depend on the nature of the patch. For an extreme strawman, segfault fixes almost certainly shouldn't be dropped. Same for docs patches that clarify the product. I think the majority of my contributions to open source this decade have been of that nature (a few links to examples in postgres and postgis follow).

Maybe a better policy would be: if you reviewed patches, a reviewer will be assigned -- if you didn't, your patch is at the mercy of reviewers volunteering to review it based on their own interest in your patch. That way, patches that the community really wants could get in anyway.

http://postgis.refractions.net/pipermail/postgis-users/2005-April/007762.html
http://archives.postgresql.org/pgsql-performance/2009-03/msg00252.php
http://postgis.refractions.net/pipermail/postgis-users/2005-April/007704.html
http://postgis.refractions.net/pipermail/postgis-devel/2005-April/001341.html
Re: [HACKERS] EOL for 7.4?
A somewhat related question: how long are the various commercial support organizations committed to supporting 7.4?

I guess support companies might support their clients' systems for longer or shorter times than the community patches the old versions. No doubt it's easier for them if the community does the backpatching. But if any of those companies has a lot of 7.4 clients, they might be tempted to deal with backpatches for their clients even after the community stops.
[HACKERS] Could postgres be much cleaner if a future release skipped backward compatibility?
Tom Lane wrote:

What are the probabilities that the OpenACSes of the world will just set the value to backward compatible instead of touching their code?

Would postgres get considerably cleaner if a hypothetical 9.0 release skipped backward compatibility and removed anything that's only maintained for historical reasons?

I notice the docs are filled with passages like the quotes below - which suggest that there's a fair amount of stuff that might be done differently if it weren't for backward compatibility.

* For historical reasons (i.e., this is clearly wrong but it's far too late to change it), subscripting of fixed-length array types starts from zero, rather than from one as for variable-length arrays.
* Most of the alternative names listed in the Aliases column are the names used internally by PostgreSQL for historical reasons. In addition, some internally used or deprecated types are available, but are not listed here.
* Note: The name oid2name is historical, and is actually rather misleading
* Note: Native Kerberos authentication has been deprecated and should be used only for backward compatibility.
* Old-style functions are now deprecated because of portability problems and lack of functionality, but they are still supported for compatibility reasons.
* Although they still work, they are deprecated due to poor error handling, inconvenient methods of detecting end-of-data, and lack of support for binary or nonblocking transfers.
* The PostgreSQL usage of SELECT INTO to represent table creation is historical. It is best to use CREATE TABLE AS for this purpose in new code.
* regular expression metasyntax ... option...m: historical synonym for n
* Such comments are more a historical artifact than a useful facility, and their use is deprecated; use the expanded syntax instead.
* The CAST syntax conforms to SQL; the syntax with :: is historical PostgreSQL usage.
* timeofday() is a historical PostgreSQL function.
* (This does not match non-slice behavior and is done for historical reasons.)
* The SQL standard requires the use of the ISO 8601 format. The name of the SQL output format is a historical accident.
* attribute ... The historical way to specify optional pieces of information about the function.
* Caution: If the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. This is for backward compatibility with the historical
* historical alias for stddev_samp ... historical alias for var_samp
* For historical reasons, this variable contains two independent components
* For historical reasons, the same function doesn't just return a boolean result; instead it has to store the flag at the location indicated by the third argument.
* For historical reasons, there are two levels of notice handling,
* Note that subscripting is 1-based, whereas for historical reasons proargtypes is subscripted from 0
* The term attribute is equivalent to column and is used for historical reasons.
* For historical reasons, ALTER TABLE can be used with sequences too; but the only variants of ALTER TABLE that are allowed with sequences are
* While this still works, it is deprecated and the special meaning of \. can be expected to be removed in a future release.
* Use of this parameter is deprecated as of PostgreSQL 8.3;
Re: [HACKERS] Rejecting weak passwords
Bruce Momjian wrote:

Yep, this is illustrating something that is pretty basic to open source --- that is open source often provides the tools for a solution, rather than a complete solution. I often think of open source as providing a calculator with wires sticking out, rather than calculator buttons; the wires allow more flexibility, but they are harder to use.

I disagree. Open source typically provides the complete solution too - just not from the developer who programs one component of the solution.

Checklist writers intentionally use this to make straw-man arguments. People used to say "Linux doesn't even have a GUI" - noting that X11 is a separate project. Now people have database checkboxes for:

* a GUI admin tool (which we have, though it's a separate package)
* GIS data types (which we have, though it's a separate package)
* server-side password filters (which we have, through LDAP, etc)
* replication (which we have, through many packages)
* clustering (which we have, through hadoopdb)

The Linux guys successfully communicated that it isn't fair for checklists to compare an OS kernel against commercial application suites. Seems it'd be good for the postgres project to similarly communicate that the database kernel is the core of a platform that's broader than just a database kernel.

Ron
Re: [HACKERS] Rejecting weak passwords
Mark Mielke wrote:
On 10/15/2009 10:08 AM, Dave Page wrote:

...other DBMSs (and all major operating systems I can think of) offer password policy features as non-client checks...we are compared ...

Not so clear to me. If they're doing strong checks, this means they're sending passwords in the clear or only barely encoded, or using some OTHER method than 'alter role ... password ...' to change the password.

This makes it sound like a documentation problem to me. We need to educate the security-feature-checklist writers.

It seems we need to clearly spell out the security risks of sending plain text passwords in the section where we would state the reason why the checks are done in the client - and then hopefully the security checklist writers will include "only sends encrypted passwords" as a checkbox on the product comparison charts.

And if server-side checks are that important, perhaps the wiki needs an example of how to enable server-side checks for popular GSSAPI or LDAP or PAM configurations, and a description of how to make postgres use those.
Re: [HACKERS] Rejecting weak passwords
Dave Page wrote:

I never said it wasn't - in fact I said from the outset it was about box-checking, and that anyone doing things properly will use LDAP/SSPI/Kerberos etc.

I don't understand why the box-checkers can't already check that box, with the explanation stating "Yes - by using LDAP or GSSAPI or PAM configured accordingly". Or do checkbox-lists specifically say "can postgres do XYZ with all OS security features disabled"?

Anyway, as noted in the message you quoted, the current proposal will allow my colleagues to check boxes, and will be implemented in a sensible way on the server side. And it's entirely confined to a plugin, so if you trust all your users, there's no need for you to load it at all.

Note that I'm not horribly against the feature (though I wouldn't use it) --- just that ISTM we're checkbox-compliant already by working with the OS, and it's perhaps more a documentation issue than a coding issue to get those boxes checked.
Re: [HACKERS] updated hstore patch
Andrew Gierth wrote:

I'd appreciate public feedback on:

- whether conversions to/from a {key,val,key,val,...} array are needed (and if there's strong opinions in favour of making them casts; in the absence of strong support for that, I'll stick to functions)

Strikes me as an independent separate patch. It seems totally orthogonal to the features in the patch as submitted, no?

- what to do when installing the new version's .sql into an existing db; the send/recv support and some of the index support doesn't get installed if the hstore type and opclasses already exist. In the case of an upgrade (or a dump/restore from an earlier version) it would be nice to make all the functionality available; but there's no CREATE OR REPLACE for types or operator classes.

It seems similar in ways to the PostGIS upgrade issues when their types and operators change: http://postgis.refractions.net/docs/ch02.html#upgrading

It seems they've settled on a script which processes the dump file to exclude the parts that would conflict?

If the perfect solution is too complex, I'd also kinda hope this isn't a show-stopper for this patch, but rather a TODO for the future modules feature.

If there are any more potential showstoppers I'd appreciate hearing about them now rather than later.
Re: [HACKERS] Feedback on getting rid of VACUUM FULL
Robert Haas wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
Hannu Krosing wrote:
On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote:

2) Another utility that does something like UPDATE ... WHERE ctid ? to

I also wonder whether we should consider teaching regular VACUUM to do a little of this every time it's run. Right now, once your table gets

Having it be built into VACUUM would surprise me a bit, but I wonder if autovacuum could detect when such a tuple-mover would be useful, and run one before it does a VACUUM if needed.
Re: [HACKERS] Time-based Releases WAS: 8.5 release timetable, again
Andrew Dunstan wrote:

In any case, I don't accept this analogy. The mechanics of a Linux distribution are very different from the mechanics of a project like PostgreSQL. The prominent OSS project that seems to me most like ours is the Apache HTTP project.

I'd think that file systems might be more like postgres - with a shared obsession about data loss risks, and concerns about compatibility with any on-disk format changes.

I wonder if the ext4 or btrfs guys use time-based release schedules, or if they'll release "when it's ready". I see the ZFS guys have target dates for completing features that are still in beta, but also that they change as needed.[1]

[1] http://opensolaris.org/os/project/zfs-crypto/

Anyone know how the F/OSS filesystem guys schedule their releases?

I agree it's quite different than a distro - which, if I understand correctly, is mostly a matter of identifying completed and stable features rather than completing and stabilizing features.

I would argue that it would be a major setback for us if we made another release without having Hot Standby or whatever we are calling it now. I would much rather slip one month or three than ship without it.

Perhaps if sufficiently interesting features get in outside of a time-based schedule, an extra release could be made after the commit fest it gets in?

If hot-standby + streaming-replication + index_only_scans + magic-fairy-dust-powered-shared-nothing-clusters all happened to get in 3 months after a time-based release, it'd be nice to see it sooner rather than waiting 9 months for a time-based window.
Re: [HACKERS] Time-based Releases WAS: 8.5 release timetable, again
Josh Berkus wrote:

I can't find information about HTTPD release planning so I'll take your word for it. On the other hand, I have to point out that Apache is releasing HTTPD major versions an average of once every 3 years. I don't think we want to go to 3 years, do we?

I'd say it depends on the flexibility of some hypothetical future module layer. If I understand right, much of the functionality in apache comes from modules - and those modules that are under heavy development may have different release cycles.

I realize this doesn't work for many of the big features; but some of those seem to have over 1 year development cycles anyway.
Re: [HACKERS] remove flatfiles.c
Robert Haas wrote:
On Tue, Sep 1, 2009 at 9:29 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
Ron Mayer wrote:
Greg Stark wrote:

That's what I want to believe. But picture if you have, say a 1-terabyte table which is 50% dead tuples and you don't have a spare 1-terabytes to rewrite the whole table.

Could one hypothetically do

update bigtable set pk = pk where ctid in (select ctid from bigtable order by ctid desc limit 100);
vacuum;

and repeat until max(ctid) is small enough?

I remember Hannu Krosing said they used something like that to shrink really bloated tables. Maybe we should try to explicitly support a mechanism that worked in that fashion. I think I tried it at some point and found that the problem with it was that ctid was too limited in what it was able to do.

I think a way to incrementally shrink large tables would be enormously beneficial. Maybe vacuum could try to do a bit of that each time it runs.

Yet when I try it now, I'm having trouble making it work. Would you expect the ctid to be going down in the psql session shown below? I wonder why it isn't.

regression=# create table shrink_test as select * from tenk1;
SELECT
regression=# delete from shrink_test where (unique2 % 2) = 0;
DELETE 5000
regression=# create index "shrink_test(unique1)" on shrink_test(unique1);
CREATE INDEX
regression=# select max(ctid) from shrink_test;
   max
----------
 (333,10)
(1 row)

regression=# update shrink_test set unique1=unique1 where ctid in (select ctid from shrink_test order by ctid desc limit 100);
UPDATE 100
regression=# vacuum shrink_test;
VACUUM
regression=# select max(ctid) from shrink_test;
   max
----------
 (333,21)
(1 row)

regression=# update shrink_test set unique1=unique1 where ctid in (select ctid from shrink_test order by ctid desc limit 100);
UPDATE 100
regression=# vacuum shrink_test;
VACUUM
regression=# select max(ctid) from shrink_test;
   max
----------
 (333,27)
(1 row)

regression=# update shrink_test set unique1=unique1 where ctid in (select ctid from shrink_test order by ctid desc limit 100);
UPDATE 100
regression=# vacuum shrink_test;
VACUUM
regression=# select max(ctid) from shrink_test;
   max
----------
 (333,33)
(1 row)
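
One way to see what the session above is doing is to look at how many live rows sit on the last few pages, rather than only at max(ctid). The sketch below assumes the shrink_test table from that session. The ctid::text::point cast is a commonly used trick for pulling the block number out of a ctid and is not something from the thread; pg_relation_size reports the on-disk size of the heap. A plain VACUUM can only give back trailing pages that are completely empty, so if the updated row versions keep landing on page 333, as the growing offsets in the output suggest, the file would not be expected to get shorter.

```sql
-- Live rows per heap page, highest-numbered pages first.
SELECT (ctid::text::point)[0]::int AS block,
       count(*)                    AS live_rows
FROM shrink_test
GROUP BY 1
ORDER BY 1 DESC
LIMIT 5;

-- Size of the table's heap, in bytes and in pages.
SELECT pg_relation_size('shrink_test') AS bytes,
       pg_relation_size('shrink_test')
         / current_setting('block_size')::int AS pages;
```

Running both queries before and after each update/vacuum round shows whether the tail pages are actually emptying or merely being refilled.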
Re: [HACKERS] remove flatfiles.c
Greg Stark wrote:

That's what I want to believe. But picture if you have, say a 1-terabyte table which is 50% dead tuples and you don't have a spare 1-terabytes to rewrite the whole table.

Could one hypothetically do

update bigtable set pk = pk where ctid in (select ctid from bigtable order by ctid desc limit 100);
vacuum;

and repeat until max(ctid) is small enough?

Sure, it'll take longer than vacuum full; but at first glance it seems lightweight enough to do even on a live, heavily accessed table.

IIRC I tried something like this once, and it worked to some extent, but after a few loops didn't shrink the table as much as I had expected.
Re: [HACKERS] Add YAML option to explain
Peter Eisentraut wrote:
On fre, 2009-08-28 at 20:13 +, Greg Sabino Mullane wrote:

Readability and easy editing. All the power of JSON without the annoying quotes, braces, and brackets.

But these are supposed to be machine-readable formats. So readability and editability are not high priority criteria.

Greg, can we see a few examples of the YAML output compared to both JSON and text?

IMVHO, an advantage of YAML is human readability of structured data, even compared to most non-computer-parseable human-intended text formats. But maybe that's just because I read too much YAML.
Re: [HACKERS] 8.5 release timetable, again
Andrew Dunstan wrote:

I don't know of anyone who is likely to want to try out alphas in their normal development environments. The client I approached was specifically prepared to test beta releases that way.

Perhaps end-users won't, but I think companies who develop software that works on top of postgres will. Perhaps to make sure their existing software continues to work; or perhaps to get a head start working with new features. I test against CVS-head occasionally.
Re: [HACKERS] 8.5 release timetable, again
Josh Berkus wrote:

There's some very good reasons for the health of the project to have specific release dates and stick to them.

Help me understand why?

The Linux kernel seems to do fine with a "when it is ready" cycle, where some releases (2.6.24) take twice the time of others (2.6.28)[1,2]. I imagine it has similar stability and lack-of-data-loss requirements as postgres does.

I understand why commercial packagers like Ubuntu - especially public ones like Novell and Red Hat who have to forecast earnings - want to schedule their releases. And I can imagine they'd appreciate it if project releases aren't too close to their release schedules (if postgres releases right after they release, they suffer from not having the current version; if postgres releases just before, they have limited testing time).

[1] http://www.linuxfoundation.org/publications/linuxkerneldevelopment.php
[2] http://fblinux.freebase.com/view/base/fblinux/views/linux_kernel_release

So, with that in mind: what is your statement on three versus four commitfests? Does it make a difference to you?
Re: [HACKERS] 8.5 release timetable, again
Tom Lane wrote:
Josh Berkus j...@agliodbs.com writes:

That is slightly alarmist. Who are we going to lose these users to?

Drizzle. MySQL forks. CouchDB. Any database which has replication which you don't need a professional DBA to understand. Whether or not it works.

You haven't explained why we'd lose such folk next year when we haven't lost them already. MySQL has had replication (or at least has checked off the bullet point ;-)) for years.

I think it's a slow but ongoing stream of organizations that are switching away, using logic similar to the thoughts outlined here:

http://archives.postgresql.org/pgsql-hackers/2008-05/msg00955.php

...switched their bugzilla from Postgres to MySQL because the admins didn't want to deal with Slony any more. People want simple.

MySQL may not have caught postgres in a number of ways yet, but it's good enough now for many of the things it wasn't good enough for earlier. And if it's good enough and easier, it's easy to switch.
Re: [HACKERS] revised hstore patch
Robert Haas wrote:
On Wed, Jul 22, 2009 at 2:17 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote:

Unless I hear any objections I will proceed accordingly...

At this point it's been 12 days since this was written and no updated patch has been posted, so I think it's well past time to move this to Returned with Feedback. Accordingly I'm going to make that change. Hopefully, an updated patch will be ready in time for the September CommitFest.

Curious if this patch is likely for 8.5 and/or if there's a newer patch available.

I've come across an application that it seems well suited for, and would be happy to test whichever version of the patch would be most useful for me to test against.
Re: [HACKERS] Hot standby?
David Fetter wrote:
On Tue, Aug 11, 2009 at 08:56:38AM -0500, Kevin Grittner wrote:
Bruce Momjian br...@momjian.us wrote:

OK, so it is warm slave.

Why isn't it just a "read only slave"? Do some systems have read-only slave databases that can't serve as a warm standby system as well as this one could?

That is technically accurate, given the preceding definitions, but it has disturbing connotations. Enough so, in my view, to merit getting a little more creative in the naming. How about warm replica? Other ideas?

Warm Read Streamed Copy

Master/Slave Replication and Warm Standby systems are common enough terms that I can google them or look them up in many computer science books.

While coming up with creative politically correct euphemisms might be fun, I hope we stick near terms that other DBAs could already be familiar with.

ISTM the best way to refer to it formally would be a Read Only Slave / Warm Standby system, even if informally we might discuss just how hot our slaves are when hot-standby features get added down the road.
Re: [HACKERS] Table and Index compression
I'm curious what advantages there are in building compression into the database itself, rather than using filesystem-based compression.

I see ZFS articles[1] discuss how enabling compression improves performance with ZFS; for Linux, Btrfs has compression features as well[2]; and on Windows, NTFS seems to have them too.

[1] http://blogs.sun.com/observatory/entry/zfs_compression_a_win_win
[2] http://lwn.net/Articles/305697/
Re: [HACKERS] SE-PostgreSQL Specifications
Robert Haas wrote:

If you want to store intelligence data about the war in Iraq and intelligence data about the war in Afghanistan, it might not be too bad to store them in separate databases, though storing them in the same database might also make things simpler for users who have access to both sets of data. But if you have higher and lower classifications of data it's pretty handy (AIUI) to be able to let the higher-secrecy users read the lower-secrecy data

Nice example. Is this system being designed flexibly enough so that one user may have access to the higher-secrecy data of the Iraq dataset but only the lower-secrecy Afghanistan dataset; while a different user may have access to the higher-secrecy Afghanistan data but only the lower-secrecy Iraq data?

I imagine it's not uncommon for organizations to want to have total access to their data, but expose more limited access to other organizations they communicate with.
Re: [HACKERS] Hadoop backend?
Paul Sheer wrote:

Hadoop backend for PostGreSQL

Resurrecting an old thread, it seems some guys at Yale implemented something very similar to what this thread was discussing.

http://dbmsmusings.blogspot.com/2009/07/announcing-release-of-hadoopdb-longer.html

It's an open source stack that includes PostgreSQL, Hadoop, and Hive, along with some glue between PostgreSQL and Hadoop, a catalog, a data loader, and an interface that accepts queries in MapReduce or SQL and generates query plans that are processed partly in Hadoop and partly in different PostgreSQL instances spread across many nodes in a shared-nothing cluster of machines.

Their detailed paper is here: http://db.cs.yale.edu/hadoopdb/hadoopdb.pdf

According to the paper, it scales very well.

A problem that my client has, and one that I come across often, is that a database seems to always be associated with a particular physical machine, a physical machine that has to be upgraded, replaced, or otherwise maintained. Even if the database is replicated, it just means there are two or more machines. Replication is also a difficult thing to properly manage.

With a distributed data store, the data would become a logical object - no adding or removal of machines would affect the data. This is an ideal that would remove a tremendous maintenance burden from many sites; well, at least the ones I have worked at, as far as I can see.

Does anyone know of plans to implement PostGreSQL over Hadoop?

Yahoo seems to be doing this: http://glinden.blogspot.com/2008/05/yahoo-builds-two-petabyte-postgresql.html

But they store tables column-ways for their performance situation. If one is doing a lot of inserts I don't think this is most efficient - ?

Has Yahoo put the source code for their work online?

Many thanks for any pointers.

-paul
Re: [HACKERS] [PATCH] SE-PgSQL/tiny rev.2193
Joshua Brindle wrote:

How many people are you looking for? Is there a number or are you waiting for a good feeling?

Is it individuals or organizations people are looking for? I see KaiGai wrote "In addition, I (and NEC) can provide our capability to the PostgreSQL community to keep these security features work correctly." Does that imply that a larger part of NEC is interested?

The unfortunate part is that many of the people that would use it are unable to publicly say so.

Could they publicly say something softer? I see SELinux had a number of large organizations (NSA) and publicly traded companies (Secure Computing Corp, Network Associates, etc) pushing it and contributing to it. If people who could speak for those organizations were here saying "ooh, and such features in a F/OSS database would be interesting too", that would probably convince a lot of people.

Joshua - if you're still associated with Tresys - could someone who could speak for that company say what they think about this project? They seem quite in-the-loop on what SELinux customers want.
Re: [HACKERS] SE-PostgreSQL?
David Fetter wrote:

2. Apart from Kohei-san and Stephen Frost, is anybody actually interested in having this feature at all?

The features (both MAC, and row-level security) are interesting.

* I've worked with organizations where MAC was a big deal.
* I've had use cases where row-level security would be useful.
* If this feature is a right step toward getting PG onto lists of EAL-certified databases like these: http://www.niap-ccevs.org/cc-scheme/vpl/?tech_name=DBMS it could make selling PG-backed solutions to some companies easier. I guess that'd count as a sales/PR feature?

What I don't know is if this particular patch is the best step to getting any of those features.
Re: [HACKERS] Index-only scans
Heikki Linnakangas wrote:

...

CREATE TABLE manytomany (aid integer, bid integer);
CREATE INDEX a_b ON manytomany (aid, bid);
CREATE INDEX b_a ON manytomany (bid, aid);

...

new and interesting indexing strategies. Covered indexes are also one kind of materialized view. It may be better to implement mat views and gain wider benefits too.

Materialized view sure would be nice, but doesn't address quite the same use cases. Doesn't help with the many-to-many example above, for example. We should have both.

Really? I'd have thought that index is similar to materializing these views:

create view a_b as select aid,bid from manytomany order by aid,bid;
create view b_a as select bid,aid from manytomany order by bid,aid;

Or perhaps

create view a_b as select aid,array_agg(bid) from manytomany group by aid;

But I like the index-only scan better anyway because I already have the indexes, so the benefit would come to me automatically rather than having to pick and choose what views to materialize.
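
For the many-to-many example, the appeal of an index-only scan is that each of the lookups below touches only columns that are already stored in one of the two indexes. This is a minimal sketch, assuming the table and indexes quoted above; the literal values 42 and 7 are placeholders, and whether the planner actually avoids visiting the heap depends on the server version and on the visibility information it has available.

```sql
CREATE TABLE manytomany (aid integer, bid integer);
CREATE INDEX a_b ON manytomany (aid, bid);
CREATE INDEX b_a ON manytomany (bid, aid);

-- All bid values for one aid: both columns are present in index a_b.
SELECT bid FROM manytomany WHERE aid = 42;

-- All aid values for one bid: both columns are present in index b_a.
SELECT aid FROM manytomany WHERE bid = 7;

-- EXPLAIN shows which scan type the planner picked for a query.
EXPLAIN SELECT bid FROM manytomany WHERE aid = 42;
```

No extra objects need to be created or refreshed here, which is the point made above about the benefit arriving automatically with indexes that already exist.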
Re: [HACKERS] Maintenance Policy?
Josh Berkus wrote:

I'd suggest that we publish an official policy, with the following dates for EOL:

7.4 2009-08-01
...
8.4 2014-08-01

What would such forward-looking statements even mean for a community-driven project?

I assume for a commercial product, such a statement would mean something like "I could get my money back or sue for breach of contract or similar if the vendor stops providing support before such a date."

For an open source project, would such a statement really mean anything more than "we'll provide support as long as some community members feel like it, and we guess that's about 5 years"? If so, what?
Re: [HACKERS] *_collapse_limit, geqo_threshold
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: You do, but it's been pretty rare in my experience, and we're considering alternatives which give a lot less flexibility than this. I dunno about considering. We've already wasted vastly more time on this than it's worth. AFAIR there has never been one single user request for the ability to partially constrain join order. I think we should do an enable_join_ordering boolean and quit wasting brainpower on the issue.

I think I've found it useful in the past[1], but I also think we already have a way to give postgres such hints using subselects and offset 0. Instead of SAP-DB's
    select * from (t1 join t2 on whatever) join t3 on whatever;
ISTM we can already do
    select * from (select * from t1 join t2 on whatever offset 0) as a join t3 on whatever;
which seems like a reasonable way of hinting which parentheses can be reordered and which can't. Would these new proposals (GUCs or syntax hacks) give anything that I can't already do?

[1] http://archives.postgresql.org/pgsql-performance/2007-12/msg00088.php
Re: [HACKERS] First CommitFest: July 15th
Josh Berkus wrote: Folks, ... the first CF on July 15th.

Would it make the CommitFest easier if there were an additional column which indicates what CVS version of Postgres the patch cleanly applies to? Perhaps a patch submitter could indicate the CVS date/time with which he developed the patch. If a reviewer happens to apply the patch on a later version he could update it as cleanly applying at that later date. Committers could feel free to ignore patches that are sufficiently far off of HEAD, so it might make work easier for them too.
Re: [HACKERS] 8.5 development schedule
Tom Lane wrote: Joshua D. Drake j...@commandprompt.com writes: We already push and pull our release dates based on what's in the queue, we just do so informally. Why not just make it part of the process? I think we used to do it more or less like that, but people didn't like it because they couldn't do any long-range planning.

Does the current system help long-range planning? I could imagine an enterprise plan that says "we'll upgrade to the current production release in January [after christmas sales]"; or "we'll begin to upgrade the January after [feature-x] is in production". But in neither case does it help my long term planning to know if the version current in January is scheduled to be called 8.4 or 8.5 or 9.1 (which is really all that the current system helps me predict).
Re: [HACKERS] 8.5 development schedule
Bruce Momjian wrote: Where did you see 8.4 was scheduled to be released around the start of the year? I have never seen that and would have disputed anyone saying it publicly.

I think people carefully avoided the word "scheduled", but the press FAQ on www.postgresql.org did say to expect it in Q4 08.
http://archives.postgresql.org/pgsql-general/2009-02/msg01265.php
http://www.postgresql.org/about/press/faq
Q: When will 8.4 come out? A: Historically, PostgreSQL has released approximately every 12 months and there is no desire in the community to change from that pattern. So expect 8.4 sometime in the fourth quarter of 2008.
Re: [HACKERS] Query progress indication - an implementation
Greg Stark wrote: Right, that was why my proposed interface was to dump out the explain plan with the number of loops, row counts seen so far, and approximate percentage progress. My thinking was that a human could interpret that to understand where the bottleneck is if, say you're still on the first row for the top few nodes but all the nodes below a certain sort have run to completion that the query is busy running the sort...

+1. Especially if I run it a few times and I can see which counters are still moving.

Basically I disagree that imperfect progress reports annoy users. I think we can do better than reporting 250% done or having a percentage that goes backward though. It would be quite tolerable (though perhaps for no logical reason) to have a progress indicator which slows down as it gets closer to 100% and never seems to make it to 100%.

-1. A counter that slowly goes from 99% to 99.5% done is much worse than a counter that takes the same amount of time going from 1000% of estimated rows done to 2000% of estimated rows done. The former just tells me that it lies about how much is done. The latter tells me that it's processing each row quickly but that the estimate was way off.
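A minimal sketch of the kind of report argued for above: show rows done against the estimate and let the percentage exceed 100% instead of clamping it. The function name progress_report and the output wording are assumptions made for illustration; nothing here reflects an actual PostgreSQL interface.

```python
def progress_report(rows_done: int, rows_estimated: int) -> str:
    """Format progress relative to the planner's row estimate, without clamping."""
    if rows_estimated <= 0:
        # No usable estimate: report the raw counter only.
        return f"{rows_done} rows done (no estimate)"
    pct = 100.0 * rows_done / rows_estimated
    # Going past the estimate is reported honestly rather than hidden at 99.x%.
    note = " (estimate exceeded)" if rows_done > rows_estimated else ""
    return f"{rows_done} of ~{rows_estimated} estimated rows ({pct:.0f}%){note}"

print(progress_report(500, 1000))
print(progress_report(20000, 1000))
```

Seeing the counter pass the estimate tells the reader that rows are still being processed quickly and that the estimate, not the executor, was off.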
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Robert Haas wrote: On Fri, Jun 5, 2009 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: ... but I'm not at all excited about cluttering the long-term project history with a zillion micro-commits. One of the things I find most annoying about reviewing the current commit history is that Bruce has taken a micro-commit approach to managing the TODO list --- I was seldom so happy as the day that disappeared from CVS, because of the ensuing reduction in noise level.

For better or worse, git also includes a command git-rebase that can collapse such micro-commits into a larger one. Quoting the git-rebase man page:

A range of commits could also be removed with rebase. If we have the following situation:
    E---F---G---H---I---J  topicA
then the command
    git-rebase --onto topicA~5 topicA~3 topicA
would result in the removal of commits F and G:
    E---H'---I'---J'  topicA

While I wouldn't recommend using this for historical revisionism, I imagine it could be useful during code-review time when the micro-commits (from both the patch submitter and patch reviewer) are interesting. After the review, the commits could be collapsed into meaningful-sized chunks just before they're merged into the official branches.
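The man page example can be checked with a toy model. This sketch treats a branch as a plain list of commit labels, oldest first, and mimics what `git rebase --onto <branch>~A <branch>~B <branch>` does on a purely linear history. The function rebase_onto is hypothetical and ignores everything real git handles (conflicts, merges, new hashes); the trailing apostrophe only marks a commit as replayed.

```python
def rebase_onto(branch, onto_back, upstream_back):
    """Model `git rebase --onto branch~onto_back branch~upstream_back branch`.

    `branch` is a linear history, oldest commit first; its last element is the tip.
    """
    tip = len(branch) - 1
    onto = tip - onto_back          # index of the new base commit
    upstream = tip - upstream_back  # commits after this one are replayed
    if not (0 <= onto <= upstream <= tip):
        raise ValueError("offsets do not describe a removable range")
    kept = branch[: onto + 1]
    # Replayed commits get new identities, marked with a trailing apostrophe.
    replayed = [commit + "'" for commit in branch[upstream + 1:]]
    return kept + replayed

print(rebase_onto(list("EFGHIJ"), onto_back=5, upstream_back=3))
```

With topicA~5 being E and topicA~3 being G, only H, I and J are replayed onto E, so F and G drop out of the history.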
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Markus Wanner wrote: The new branches getting merged up could work. That is, applying the fix to the oldest back-branch which requires the fix first and then merge it to all newer ones, including HEAD. However, that would require some rethinking: instead of creating bugfix-patches for HEAD, then manually adjust patches for back-branches and then group committing, you'd have to create a bugfix-patch for the oldest branch first, commit that and then merge that to the newer branches.

That sounds a bit dangerous too, since I imagine there are some changes in the old release branches you wouldn't want merged into the newest releases (say, code affecting sections that got redesigned). It seems what you'd want to do is create a new branch as close as possible to the point where the bug was introduced, and then merge that forward into each of the branches. This concept was mentioned in a page linked earlier in the thread[1] and seems like the way monotone recommends people use their system[2]. See that page for more reasons why they think it's good.

[1] http://archives.postgresql.org/pgsql-hackers/2009-06/msg00153.php
[2] http://www.monotone.ca/wiki/DaggyFixes/
Re: [HACKERS] Managing multiple branches in git
Robert Haas wrote: The problem with making each release a separate directory is that, just like using separate repositories, it will defeat one of the main strengths of git, which is the ability to move around commits easily. git-new-workdir is the only solution to the problem of having multiple branches checked out simultaneously that seems like it might not suffer from that weakness.

I agree git-new-workdir is best for typical postgres workflows, so I won't dwell on separate repositories beyond this post, but I think you overstate the difficulty a bit. It seems it's not that hard to cherry-pick from a remote repository by setting up a temporary tracking branch and (optionally) removing it when you're done with it if you don't think you'll need it often. From: http://www.sourcemage.org/Git_Guide

    $ git checkout --track -b <tmp local branch> origin/<remote branch>
    $ git cherry-pick -x <sha1 refspec of commit from other (local or remote) branch>
    $ git push origin <tmp local branch>
    $ git branch -D <tmp local branch>

And if you know you'll be moving patches between external repositories like origin/<remote branch> often, ISTM you don't have to do the first and last steps (which create and remove the tracked branch) each time; but rather leave the local tracking branch there.

IMVHO, moving commits around across *different* remote repositories is also one of the main strengths of moving to a distributed VCS.
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Robert Haas wrote: But I wonder if it would make more sense to include some kind of metadata in the commit message (or some other property of the commit? does git support that?) to make it not depend on that.

From elsewhere in this thread[1], 'The git cherry-pick ... -x flag adds a note to the commit comment describing the relationship between the commits.' If the commit on the main branch had this message

    added a line on the main branch

the commit on the cherry-picked branch will have this comment

    added a line on the main branch
    (cherry picked from commit 189ef03b4f4ed5078328f7965c7bfecce318490d)

where the big hex string identifies the commit on the other branch.

[1] http://archives.postgresql.org/pgsql-hackers/2009-06/msg00191.php
Re: [HACKERS] PostgreSQL Developer meeting minutes up
Aidan Van Dyk wrote: * Markus Wanner mar...@bluegap.ch [090602 10:23]: As mentioned before, I'd personally favor *all* of the back-ports to actually be merges of some sort, because that's what they effectively are. However, that also brings up the question of how we are going to do back-patches in the future with git. Well, if people get comfortable with it, I expect that backports don't happen. Bugs are fixed where they happen, and merged forward into all affected later development based on the bugged area.

I imagine the closest thing to existing practices would be that people would use git-cherry-pick -x -n to backport only the commits they wanted from the current branch into the back branches. AFAICT, this doesn't record a merge in the git history, but looks a lot like the linear history from CVS, with the exception that the comment added by -x explicitly refers to the exact commit from the main branch.
Re: [HACKERS] Managing multiple branches in git
Tom Lane wrote: Marko Kreen mark...@gmail.com writes: They cannot be same commits in GIT as the resulting tree is different. The way I prepare a patch that has to be back-patched is first to make and test the fix in HEAD. Then apply it (using diff/patch and perhaps manual adjustments) to the first back branch, and test that. Repeat for each back branch as far as I want to go. Almost always, there is a certain amount of manual adjustment involved due to renamings, historical changes of pgindent rules, etc. Once I have all the versions tested, I prepare a commit message and commit all the branches. This results in one commit message per branch in the pgsql-committers archives, and just one commit in the cvs2cl representation of the history --- which is what I want.

I think the closest equivalent to what you're doing here is:
    git cherry-pick -n -x <the commit you want to pull>
The git cherry-pick command does work similar to the diff/patch step. The -n prevents an automatic commit, to allow for manual adjustments. The -x flag adds a note to the commit comment describing the relationship between the commits. It seems to me we could make a cvs2cl-like script that's aware of the comments git-cherry-pick -x inserts and rolls them up in a similar way to what cvs2cl does.

The way that I have things set up for CVS is that I have a checkout of HEAD, and also sticky checkouts of the back branches... Each of these is configured (using --prefix) to install into a separate installation tree. ...

I think the most similar thing here would be for you to have one normal clone of the official repository, and then use git-clone --local when you set up the back branch directories. The --local flag will use hard links to avoid wasting the space and time of maintaining multiple copies of histories.

I don't see any even-approximately-sane way to handle similar cases in git. From what I've learned so far, you can have one checkout at a time in a git working tree, which would mean N copies of the entire repository if I want N working trees

git-clone --local avoids that.

... Not to mention the impossibility of getting it to regard parallel commits as related in any way whatsoever.

Well, "related in any way whatsoever" seems possible either through the comments added by the -x flag in git-cherry-pick, or with the other workflows people described where you fix the bug in a new branch off some ancestor of all the releases (ideally near where the bug occurred) and merge it into the branches.

So how is this normally done with git?
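As a rough idea of what a cvs2cl-like roll-up could look like, the sketch below groups commits by the origin recorded in the "(cherry picked from commit ...)" note that git cherry-pick -x appends. Everything else is assumed for illustration: the roll_up function, the (sha, branch, message) tuple layout, and the branch names are hypothetical, and a real script would read this data from git log rather than a hard-coded list.

```python
import re
from collections import defaultdict

# Note appended by `git cherry-pick -x`, followed by the 40-hex origin commit id.
CHERRY_RE = re.compile(r"\(cherry picked from commit ([0-9a-f]{40})\)")

def roll_up(commits):
    """Group (sha, branch, message) tuples by the commit they originate from."""
    groups = defaultdict(list)
    for sha, branch, message in commits:
        match = CHERRY_RE.search(message)
        # A commit without the note is treated as its own origin.
        origin = match.group(1) if match else sha
        groups[origin].append(branch)
    return dict(groups)

ORIGIN = "189ef03b4f4ed5078328f7965c7bfecce318490d"
NOTE = f"added a line on the main branch\n(cherry picked from commit {ORIGIN})"
history = [
    (ORIGIN, "master", "added a line on the main branch"),
    ("a" * 40, "REL8_3_STABLE", NOTE),   # illustrative branch names
    ("b" * 40, "REL8_2_STABLE", NOTE),
]
print(roll_up(history))
```

One entry per logical change with the list of branches it reached is roughly the single-commit view described above for the cvs2cl output.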
Re: [HACKERS] Managing multiple branches in git
Robert Haas wrote: And, unfortunately, I'm not sure there's a good solution. Tom could create 1 local repository cloned from the origin and then N-1 copies cloned with --local from that one, but this sort of defeats the purpose of using git, because now if he commits a change to one of them and then wants to apply that change to each back branch, he's got to fetch that change on each one, cherry-pick it, make his changes, commit, and then push it back to his main repository. Some of this ...

Why has he got to do this pushing back to his main? How about:
    create 1 local repository from origin,
    create N-1 clones with --local from that one,
    for each of those --local ones, git-remote add the main origin.
From then on, ISTM his workflow is very similar to the way he works with CVS, pulling and pushing from those multiple repositories to the central origin. He can create the patches/diffs to apply to each the same way he does today. ISTM he'd mostly be unaware that these repositories were ever connected in some way unless he inspected that some of the files in .git had the same inodes because they came from hard links.
Re: [HACKERS] explain analyze rows=%.0f
Euler Taveira de Oliveira wrote: Robert Haas escreveu: ...EXPLAIN ANALYZE reports the number of rows as an integer... Any chance we could reconsider this decision? I often find myself wanting to know the value that is here called ntuples, but rounding ntuples/nloops off to the nearest integer loses too much precision. Don't you think it is too strange having, for example, 6.67 rows? It would confuse users and programs that parse the EXPLAIN output. However, I wouldn't object

I don't think it's that confusing. If it says 0.1 rows, I imagine most people would infer that this means typically 0, but sometimes 1 or a few rows. What I'd find strange about 6.67 rows in your example is more that on the estimated rows side, it seems to imply an unrealistically precise estimate in the same way that 667 rows would seem unrealistically precise to me. Maybe rounding to 2 significant digits would reduce confusion?
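For reference, rounding to 2 significant digits can be sketched in a few lines. The helper round_sig is hypothetical and only illustrates the arithmetic; it says nothing about how EXPLAIN would actually format its output.

```python
from math import floor, log10

def round_sig(value: float, digits: int = 2) -> float:
    """Round `value` to `digits` significant digits."""
    if value == 0:
        return 0.0
    # The position of the leading digit decides how many decimals to keep.
    magnitude = floor(log10(abs(value)))
    return round(value, digits - 1 - magnitude)

for rows in (6.67, 667.0, 0.1):
    print(rows, "->", round_sig(rows))
```

Under this rule 6.67 would be shown as 6.7 and 667 as 670, while a value like 0.1 is left alone, so small per-loop averages stay visible without implying three-digit precision.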
Re: [HACKERS] hstore improvements?
Andrew Gierth wrote: I have a patch almost done that adds some obvious but currently missing functionality to hstore... If there's any other features that people find notably missing from hstore, I could stick them in too; any requests?

Currently hstore gives me an indexed operator to query if a hstore contains a single key. It'd be nice if there were a way to extend this so that I could ask for only records that have all or any of the keys in a query.
    'a=>1, b=>1'::hstore ? 'a,b'
In one database I put ids of each of the keys in a hstore into a largely redundant intarray to be able to do fast queries for rows containing all the hstore keys in a set.

Even cooler might be extending the hstore '?' operator to allow expressions similar to intarray's queries:
    'a=>1, b=>1'::hstore ? 'a|b'
    'a=>1, b=>1'::hstore ? 'ab'
    'a=>1, b=>1'::hstore ? 'a(b|c)'
I don't have a need for the more general expressions, but if the code can be borrowed from intarray to handle both, that'd be sweet.

I once wanted a variation of hstore where a key could have multiple values (and the ability to query them).
    'a=>x, a=>y'::hstore @ 'a=>x'
I imagine most EAV systems allow multiple values for each attribute, and a hstore that supported this could probably be a pretty nice general solution for many EAV systems. IIRC other people asked about similar on the lists before.

Also, hstore has an (undocumented) limit of 65535 bytes for keys and values, and it does not behave very cleanly when given longer values (it truncates them mod 2^16, rather than erroring). That gives rise to two obvious questions: (1) are those lengths reasonable? they strike me as being rather long for keys and rather short for values; and (2) should exceeding the lengths throw an error?
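The two requests above (all-keys / any-key tests, and the question of wrapping versus erroring on over-long values) can be pictured with a plain Python dict standing in for an hstore value. All names here are hypothetical and this is not hstore's implementation; the wrap-around function simply restates the "mod 2^16" behaviour described in the post.

```python
LENGTH_MODULUS = 2 ** 16  # stored lengths wrap at this value, per the post

def has_all_keys(record: dict, keys) -> bool:
    """True when every requested key is present in the record."""
    return all(key in record for key in keys)

def has_any_key(record: dict, keys) -> bool:
    """True when at least one requested key is present in the record."""
    return any(key in record for key in keys)

def stored_length_wrapping(length: int) -> int:
    """Length as it would be recorded if over-long input silently wraps."""
    return length % LENGTH_MODULUS

def stored_length_strict(length: int) -> int:
    """Alternative behaviour: refuse over-long input instead of wrapping."""
    if length >= LENGTH_MODULUS:
        raise ValueError(f"length {length} exceeds {LENGTH_MODULUS - 1} bytes")
    return length

row = {"a": "1", "b": "1"}
print(has_all_keys(row, ["a", "b"]), has_any_key(row, ["a", "c"]))
print(stored_length_wrapping(LENGTH_MODULUS + 10))
```

The wrapping variant shows why silent truncation is unpleasant: a value 10 bytes over the limit is recorded as if it were 10 bytes long, which is the case question (2) asks about.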
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1710)
Alvaro Herrera wrote: Gregory Stark escribió: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: KaiGai Kohei wrote: * [..feature description..] This again falls into the category of trying to have more fine-grained permissions than vanilla PostgreSQL has Would it make sense to instead of removing and deferring pieces bit by bit to instead work the other way around? Extract just the part of the patch that maps SELinux capabilities to Postgres privileges as a first patch? Then discuss any other parts individually at a later date? I think that makes sense. Implement just a very basic core in a first patch, and start adding checks slowly, one patch each. We have talked about incremental patches in the past. +1 from an end-user's point of view too. I'm quite aware of the postgres privileges, and if there were a MAC system of enforcing those I'd be reasonably likely to enable them right away. On the other hand, if SEPostgres initially comes with a different set of privileges that don't map to what I'm already using, I'm much less likely to spend the time to figure out the two different systems. And I do see row-level restrictions (and the other access restrictions mentioned in this thread) as quite orthogonal to MAC vs DAC. Would it be cool to have row-level permissions in postgres? Sure, as an abstract concept. Do I have any use for them? Seeing that I'm getting by without them, the answer must be not now. We wouldn't get unbreakable PostgreSQL in a single commit, but we would at least start moving. The good thing about having started in the opposite direction is that by now we know that the foundation APIs are good enough to build the complete feature.
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1704)
Tom Lane wrote: Joshua D. Drake j...@commandprompt.com writes: I know we are a little uncomfortable here but KaiGai-San (forgive me if I type that wrong) has proven to be a contributor in his own right, Not to put too fine a point on it, but: no, he hasn't. Show me one significant patch he's contributed before/beside this one. The only

I thought Joshua was talking about his contributions to F/OSS in general. He's credited on the NSA site for SELinux kernel scalability and locking issues: http://www.nsa.gov/research/selinux/contrib.shtml Kaigai Kohei of NEC replaced the original Access Vector Cache (AVC) locking scheme with a RCU-based approach, which solved the major SELinux kernel scalability problem, and fixed other locking issues in the SELinux kernel code. He later optimized the SELinux ebitmap implementation to improve performance on AVC misses. He also developed SE PostgreSQL, and is one of the developers for the SE busybox project.

At first glance it seems it'd be valuable to have him as an active member of this community.

Frankly, what we have here is a large patch, with insanely difficult correctness requirements, written by a Postgres newbie.

I'm kinda hoping the discussion could turn to what parts (no matter how small) seem both useful and safe enough for 8.4, even if the main use of the small parts is just as hooks to make it easier for SEPostgres to live as a parallel side project. As far as I can tell, the community feels interested in the feature set, but relatively unable to contribute since few of the people have much of a security background. It seems the best way to fix that would be to get more people with a security background more involved. Not push them away.
Re: [HACKERS] One less footgun: deprecating pg_dump -d
Selena Deckelmann wrote: Tom Lane wrote: Greg Sabino Mullane g...@endpoint.com writes: ... deprecate -d by having it throw an exception when used. Deprecate does not mean break. ... While this change may break existing scripts...less painful

Why do people want a failure rather than warning messages being spewed to both stderr and the log files? If someone doesn't notice warnings there, I wonder if even throwing an exception would save them.
Re: [HACKERS] The science of optimization in practical terms?
Robert Haas wrote: experience, most bad plans are caused by bad selectivity estimates, and the #1 source of bad selectivity estimates is selectivity estimates for unknown expressions.

ISTM unknown expressions should be modeled as a range of values rather than one single arbitrary value. For example, rather than just guessing 1000 rows, if an unknown expression picked a wide range (say, 100 - 1 rows; or maybe even 1 - table_size), the planner could choose a plan which wouldn't be pathologically slow regardless of whether the guess was too low or too high. For that matter, ISTM that if all estimates used a range rather than a single value, in general we would produce less fragile plans.
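One way to read the range idea is as a worst-case comparison: cost each candidate plan at both ends of the row range and pick the plan whose worst case is cheapest. The sketch below is a toy under loud assumptions: the two cost formulas, their constants, and the function names are invented for illustration and have nothing to do with the real planner's cost model. Because both toy costs grow with the row count, checking only the two endpoints of the range is enough here.

```python
def nested_loop_cost(rows: float) -> float:
    """Toy cost: cheap to start, expensive per row (hypothetical numbers)."""
    return 10.0 * rows

def hash_join_cost(rows: float) -> float:
    """Toy cost: expensive to start, cheap per row (hypothetical numbers)."""
    return 50_000.0 + 1.0 * rows

PLANS = {"nested_loop": nested_loop_cost, "hash_join": hash_join_cost}

def pick_by_point(rows: float) -> str:
    """Choose the cheapest plan for a single guessed row count."""
    return min(PLANS, key=lambda name: PLANS[name](rows))

def pick_by_range(low: float, high: float) -> str:
    """Choose the plan whose worst case over [low, high] is cheapest."""
    return min(PLANS, key=lambda name: max(PLANS[name](low), PLANS[name](high)))

print(pick_by_point(1000))          # single guess of 1000 rows
print(pick_by_range(1, 1_000_000))  # range of 1 .. table_size
```

With these made-up numbers the single guess favours the nested loop, which would be very slow if the expression actually matched most of a million-row table, while the range-based choice accepts a higher startup cost to avoid that pathological case.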
Re: [HACKERS] new GUC var: autovacuum_process_all_tables
Joshua D. Drake wrote: On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote: My feeling is that we should be trying to eliminate use-cases for cron-driven vacuuming, not trying to make sure that cron-driven scripts can do anything autovacuum can. Agreed. IMO, the user should only have to think about vacuum in an abstract sense. +1 The main remaining use-case seems to me to make vacuuming work adhere to some business-determined schedule, hence maintenance windows seem like the next thing to do. Also agreed. Somewhat agreed - since in many cases the business-determined schedule is just a rough estimate of measurable attributes of the machine. When we say vacuum between midnight and 5am we often actually mean vacuum when the I/O subsystem has bandwidth to spare and the machine's otherwise lightly loaded, and we guess that means late at night.
Re: [HACKERS] How to get SE-PostgreSQL acceptable
Bruce Momjian wrote: Josh Berkus wrote: Bruce Momjian wrote: Josh Berkus wrote: Yea, it would take some work but it is an idea. It's *an* idea, yes. But it introduces as many (or more) problems than it solves. Ah, but my problems might be easier solved than the row-level permission problems. ;-) Or might not. Multi-partition indexes? Multi-partition uniqueness? Automated moving of rows between partitions? Are you trying to make some kind of point?

IMVHO Josh was describing a nice-to-have TODO list for a partitions feature in general. :-) Maybe he was saying that when the partitioning feature is designed, they should try to keep polyinstantiation in mind :-)
Re: [HACKERS] How to get SE-PostgreSQL acceptable
Joshua Brindle wrote: Tom Lane wrote: Joshua Brindle met...@manicmethod.com writes: I'm not sure how much it would cut to remove row level access controls, but I do have some points here. To me, row level access controls are the most important part, this is the feature that lets us put secret and top secret data in the same table and use the clients selinux context to decide what they can see,

Help me understand this. It seems to me exactly as easy/hard to make sure that the secret and top-secret rows are put into their appropriate partitions, as it is to make sure that the secret and top-secret rows are tagged with the right row-level-access-info. If the idea is that the top-secret-row-inserter magically forces the row-level tag top-secret, it seems just as easy if the top-secret row-inserter only has write permission to the top-secret partition and not the others. If the idea is that the less-secret-reader can't read rows tagged top-secret, it seems just as easy if the less-secret-reader has no read access on the top-secret partition. At first glance, partition level seems quite a bit easier to manage in all the cases I can think of immediately.

partitions don't help because, once again, the application would be making the determination about which partition to query. For mandatory access control that we need in the kind of environments to work the application doesn't get to make security relevant decisions, the database holds the data and needs to say who can access it.

I think that's not true. I would hope that the application queries the master table, and the SEPostgres ACLs prevent any data coming from the inappropriate partitions.

Further, partitioning isn't fine grained. I can't say user X can read secret rows and read/write top secret rows and get that data out in a transparent way (the applications would have to be aware of the partitions).

Why not? It seems one can define the user with read access on the partition with secret rows and read/write on top-secret rows. Queries done on the master partition should get the data out in a transparent way.

Relabeling of data also looks like a challenge with partitions (if I correctly understand how they work).

ISTM we need to have a discussion of how partitioned tables work, and what the postgres roadmap is. If they can't yet, I think they should.

I could be persuaded to get behind a patch that does Peter's step #1 (ie, use SELinux permissions as an additional filter for existing SQL permission checks). I don't believe I will ever think that row-level checks are a good idea; as long as those are in the patch I will vote against applying it.

We've already used postgresql in sensitive environments and had to make compromises because of the lack of fine grained access control. We've been telling customers for years that we hope postgresql will have said access controls in the future and that it will help them solve many of the problems they have. We've been enthusiastic about the work KaiGai has been doing with respect to the environments we operate in and it would be a shame for all that to be discarded.

AFAICT there's nowhere near a consensus that it should be discarded (or accepted); but rather that if the project can be split into two phases, parts of it could go in sooner. There seems to be much less debate about the column/table/partition level MAC parts. Once those get in, I think the next valuable discussion would be whether the fine-grained access control is best achieved through improving the partition system or by adding the row-level ACLs as proposed.
Re: [HACKERS] How to get SE-PostgreSQL acceptable
Stephen Frost wrote: And, just to go full circle, row-level access controls are exactly what the other enterprise RDBMSs have and is what is used in these security circles today. One of the major issues, as I understand it, is to be able to use stock applications with multiple security levels where the application doesn't know (or care about) the security level. Doing that through views and partitions and triggers and whatnot for each and every application that is run on these systems will be a big hurdle to those users, if it ends up being workable at all.

That seems to me to be a shortcoming of the partition system and a good TODO for the future partitioning improvements. Why shouldn't it be just as easy to make sure a row ends up in the right partition as to make sure it's tagged with the right row-level ACLs?
Re: [HACKERS] How to get SE-PostgreSQL acceptable
Joshua Brindle wrote: Nonetheless, this conversation seems moot now that Tom has walled off and won't even discuss row-level access controls.

I think that's a bit of an overstatement. He says he's against them[1] and he says that they are the sticking point on this patch[2], and that they break SQL[2] and that he believes that implementations of row level ACLs he can imagine would be buggy[2]. Elsewhere other people on the core team are suggesting that others want to see SQL-level row permissions[3].

My reading of the discussion is that row-level access controls aren't vetoed permanently, but rather that (a) it's still unclear what SQL semantics they'll break, (b) the implementations discussed so far seem at high risk of bugs to some people, and (c) some people haven't been sold on the need for them. None of those necessarily means that the feature will never get into postgres; but together they sound like a really high bar to jump over for a release that was originally scheduled to be done a while ago.

[1] http://archives.postgresql.org/pgsql-hackers/2009-01/msg02389.php
[2] http://archives.postgresql.org/pgsql-hackers/2009-01/msg02339.php
[3] http://archives.postgresql.org/pgsql-hackers/2009-01/msg02391.php
Re: [HACKERS] 8.4 release planning
Simon Riggs wrote: The process works like this: software gets developed, then it gets certified. If it's not certified, then Undercover Elephant will not be used by the secret people. We can't answer the "will it be certified?" question objectively yet. If we have someone willing to write the software and put it forward for certification then we should trust that it probably will pass certification and if it doesn't we will see further patches to allow that to happen. For what it's worth, we can see that there are indeed Postgres forks on the Common Criteria certified list. http://www.commoncriteriaportal.org/products_DB.html
PostgreSQL Certified Version V8.1.5 for Linux
Manufacturer: NTT DATA CORPORATION
Assurance level: EAL1
Certification date: 22-MAR-07
Certification report: c0089_ecvr.pdf http://www.commoncriteriaportal.org/files/epfiles/c0089_ecvr.pdf
though at EAL1 they're quite far from the EAL4+ that DB2, Oracle, etc get. That someone went through the effort suggests that there's at least some interest in getting security certifications for postgres. It'd be interesting to hear from whoever at NTT was involved with that certification, if SEPostgreSQL would have either made that process easier or helped postgres achieve a higher level.
Re: [HACKERS] 8.4 release planning
Peter Eisentraut wrote: On Tuesday 27 January 2009 00:42:32 Ron Mayer wrote: If it were just as easy for us to pull from an "all 'pending-patches' for-commit-fest-nov that pass regression tests" branch, I'd happily pull from that instead. Considering that most patches don't come with regression tests, this would accomplish very little. And even those patches that did come with regression tests (e.g., updatable views) need a design analysis much more than running an automated test suite. Ultimately, it does come down to human work. So long as the patch passes the pre-existing regression tests, it's likely to be stable enough to run on some of our development instances. I certainly don't suggest that this is a substitute for reviews. Just that more testing of patches might happen incidentally (by people who currently test their own software against CVS head) if all the pending patches for a commit fest were as easy to pull as CVS head.
Re: 8.4 release planning (was Re: [HACKERS] [COMMITTERS] pgsql: Automatic view update rules)
Dave Page wrote: On Tue, Jan 27, 2009 at 2:01 PM, Peter Eisentraut pete...@gmx.net wrote: Updatable views is reverted. I agree that we should reject the rest and prepare a release. That will send a fine message to those companies that have sponsored development work - that we will arbitrarily reject large patches that have been worked on following the procedures that we require. To some extent that seems more an issue of linguistics and tone. If Peter had written "we should defer the rest and try to help resolve specific issues identified in the reviews during commitfest 2009-First", sponsors might be happy rather than upset. I think one can make a strong argument that the features should be moved to the next commit-fest, just so the other patches in that commit fest ( http://wiki.postgresql.org/wiki/CommitFest_2009-First ) don't bit-rot too badly. Whether the community wants to release an 8.4 between commitfest 2008-11 and 2009-First seems to me a largely orthogonal question that would be based more on what demand there is for an 8.4 release and how distracting it would be to do a release between commitfests.
Re: [HACKERS] 8.4 release planning
Stephen Frost wrote: * Gregory Stark (st...@enterprisedb.com) wrote: It does seem weird to simply omit records rather than throw an error and require the user to use a where clause, even if it's something like WHERE pg_accessible(tab). It is weird from an SQL perspective, I agree with you there. On the other hand, it's what the security community is looking for, and is what's implemented by other databases (Oracle, SQL Server...) that do row-level security and security labels. Requiring a where clause or you throw an error would certainly make porting applications that depend on that mechanism somewhat difficult, and doesn't really seem like it'd gain you all that much... It seems to me that there are two different standards to which this feature might be held. Is the goal
a) SEPostgres can provide useful rules to add security to some specific applications so long as you're careful to avoid crafting policies that produce bizarre behaviors (like avoiding restricting access to foreign key data you might need). On the other hand it gives you enough rope to hang yourself and produce weird results that don't make sense from a SQL standard point of view if you aren't careful matching the SEPostgres rules with your apps.
or
b) SEPostgreSQL should only give enough rope that you cannot craft rules that produce unexpected behavior from a SQL point of view; and that it would be bad if one can produce SEPostgres policies that produce unexpected SQL behavior.
It seems to me many of the security-enhanced products seem to do the former; while it seems some of the objections to this patch are more of the latter.
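The two behaviors debated in the message above (silently omitting rows versus raising an error unless the query filters them out itself) can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not how SE-PostgreSQL is implemented: `Row`, `label`, `clearance`, `select_filtering`, and `select_strict` are hypothetical names, and the integer label stands in for a real security context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    value: str
    label: int  # hypothetical security level; higher means more restricted


class AccessError(Exception):
    """Raised when a query would return a row the user may not see."""


def select_filtering(rows, clearance):
    # Behavior described for label-based row-level security:
    # rows above the user's clearance are silently omitted.
    return [r.value for r in rows if r.label <= clearance]


def select_strict(rows, clearance, where=None):
    # Alternative floated in the thread: raise an error unless the
    # query's own WHERE clause already excludes inaccessible rows.
    matched = [r for r in rows if where is None or where(r)]
    for r in matched:
        if r.label > clearance:
            raise AccessError(f"row {r.value!r} is not accessible")
    return [r.value for r in matched]


rows = [Row("public", 0), Row("secret", 2)]
print(select_filtering(rows, clearance=1))  # the secret row is omitted
print(select_strict(rows, 1, where=lambda r: r.label <= 1))
```

Calling `select_strict(rows, 1)` without a filter raises `AccessError`, which is the porting burden Stephen Frost points out for applications written against the filtering behavior.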
Re: [HACKERS] 8.4 release planning
Tom Lane wrote: We do not consider that a shortcoming; anyone who needs to hide existence of files needs to set up their directory structure to disallow read/search/create on the directories they aren't allowed to discover filenames in. This seems to me to be exactly parallel to deciding that SELinux should control only table/column permissions within SQL; an approach that would be enormously less controversial, less expensive, and more reliable than what SEPostgres tries to do. With the table/column approach, could users who needed some row-level capabilities work around this easily by setting table-level access control on partitions? In some ways that seems like it'd be easier to manage as well.
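As a rough illustration of the workaround asked about above, the following Python toy model keeps one partition per security label and checks permissions only at the partition (table) level. It is a sketch under assumptions, not PostgreSQL behavior: `LabelPartitionedTable`, `grant_select`, and the label strings are all hypothetical names invented for the example.

```python
class LabelPartitionedTable:
    """Toy model: one partition per security label, with table-level grants."""

    def __init__(self):
        self.partitions = {}  # label -> list of rows
        self.grants = {}      # label -> set of users allowed to read that partition

    def insert(self, label, row):
        # Routing a row to the right partition replaces tagging it with an ACL.
        self.partitions.setdefault(label, []).append(row)

    def grant_select(self, label, user):
        self.grants.setdefault(label, set()).add(user)

    def select(self, user, label):
        # Only a table-level check is made; there is no per-row check.
        if user not in self.grants.get(label, set()):
            raise PermissionError(f"{user} may not read partition {label!r}")
        return list(self.partitions.get(label, []))


table = LabelPartitionedTable()
table.insert("unclassified", "lunch menu")
table.insert("secret", "launch codes")
table.grant_select("unclassified", "alice")
print(table.select("alice", "unclassified"))
```

In this model the cost moves to insert time: every row has to land in the correct partition, which is the "shortcoming of the partition system" raised earlier in the thread.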
Re: [HACKERS] 8.4 release planning
Joshua Brindle wrote: FWIW, as you know, sepostgresql is already included in Fedora. You can continue shipping it as a separate RPM set. That is non-ideal. Getting the capability in to the standard database shipped with RHEL is very important to me and my customers. Could you speak - even in general terms - about who your customers are and what kinds of needs (are row-level ACLs the most important to them? mandatory access control at the table level? both?) they have? I'm guessing a better understanding of how real-world users would use this feature would be enlightening. Since you can turn this off with a GUC I don't see why it makes sense to ship 2 databases (never mind the maintenance issues)
Re: [HACKERS] 8.4 release planning
Tom Lane wrote: Heh. The reason we wanted a short 8.3 cycle was so we could push out patches that had been held over from 8.2. We are going to have exactly no credibility if we tell Simon et al "we're pushing these patches to 8.5, but don't worry, it'll be a short release cycle". I think the best thing we could do overall is to set release dates and stick to them. On the other hand, we might be better throwing out release dates and releasing at the end of any Commit Fest where there is enough demand/interest. Then we could release 8.4 now, with few objections since people would know that if Hot Standby has enough demand it could be released within one commitfest of it being in good shape. Likewise the SE* guys would be aware that if they want that patch to drive a release, they'd need to round up more visible demand. Heck, 8.4 could have been released a whole commitfest ago if enough people thought FSM is the killer feature in that one.
Re: [HACKERS] 8.4 release planning
Gregory Stark wrote: I think a lot of people weren't aware there was anybody testing this patch other than Simon and Heikki -- I wasn't until just today. I wonder how many more people are trying it out? I've been running the patch (I think since Jan 5) on a couple dev instances that were used primarily to test if our software worked with what we expected to turn into 8.4. I can't say I've really been testing the patch specifically, but rather testing my workplace's software against a version with this patch, and hadn't noticed any problems.
Re: [HACKERS] 8.4 release planning
Gregory Stark wrote: I think a lot of people weren't aware there was anybody testing this patch ...I wonder how many more people are trying it out? I think I have an idea to improve this aspect for future commit fests. For a long time at each of my workplaces I've been running a development instance against CVS-HEAD just to make sure our software is more future-proof against up-and-coming releases. We run this system with --enable-debug, asserts, etc, and accept that it's just a development system not expected to be totally stable. If it were just as easy for us to pull from an "all 'pending-patches' for-commit-fest-nov that pass regression tests" branch, I'd happily pull from that instead. I realize in the current system (emailed patches), it would be a horrible pain to maintain such a branch; but perhaps some of the burden could be pushed down to the patch submitters (asking them to merge their own changes into this merged branch). And I hate bringing up the version control flame war again; but git really would make this easier. If all patches were on their own branches, the painful merges into this shared branch would be rare, as the source control system would remember the painful parts of the merges.
Re: [HACKERS] On SCM
Christopher Browne wrote: On Mon, Jan 26, 2009 at 5:42 PM, Ron Mayer rm...@cheapcomplexdevices.com wrote: There has been enough experimentation with git usage during the 8.4 ... I certainly didn't mean for the idea to be advocating git nor any changes in 8.4. I was hoping the main idea was the part you didn't quote: "If it were just as easy for us to pull from an 'all pending-patches for-commit-fest-nov that pass regression tests' branch, I'd happily pull from that instead." That would be very useful regardless of the source control system it's based on. And if enough people found such a 'staging branch' useful, it'd be worth maintaining even if I had to do it with 'patch' and no SCM tools whatsoever -- simply so not everyone has to merge it.
Re: [HACKERS] 8.4 release planning
Tom Lane wrote: The problem, in words of one syllable, is that we are not sure we want it. Do you see a user community clamoring for SEPostgres, or a hacker ... This is a chicken-and-egg type of problem. Security-conscious users, applications, hackers, and customers will flock towards whichever database product leads in that area. If some hypothetical database has only minimal security features, I imagine few security experts would spend a lot of time with the database. The second problem is that we're not sure it's really the right thing, because we have no one who is competent to review the design from a security standpoint. But unless we get past the first problem the second one is moot. Are we underestimating Kaigai Kohei? I seem to see him credited on the NSA's SELinux pages: http://www.nsa.gov/research/selinux/contrib.shtml and it seems his patches there related to postgresql were pretty widely discussed on the SELinux lists: http://www.nsa.gov/research/selinux/list-archive/0805/index.shtml#26163
Re: [HACKERS] 8.4 release planning
Tom Lane wrote: Ron Mayer rm...@cheapcomplexdevices.com writes: Are we underestimating Kaigai Kohei? Perhaps he walks on water, but still I'd like to have more than one person who has confidence that this design and implementation are correct. Totally fair. I know I'm totally unqualified to review his buoyancy on water, though. and it seems his patches there related to postgresql were pretty widely discussed on the SELinux lists: http://www.nsa.gov/research/selinux/list-archive/0805/index.shtml#26163 Well, a quick look through that thread shows a lot of discussion of the selinux policy code that's in the patch, which is good as far as it goes because for sure there's no one in *this* list who understands a line of that stuff. Totally fair too. Mind you, I'd like nothing better than to have some NSA database security experts (I'm sure there are some) show up here and tell us that this design is good, secure, and useful --- and why. But right now we ... What's the right way for us to ask them? No doubt there are some, but how do we expect them to find and join our email list? If we wanted more feedback would it make sense for someone who can speak for the project to call them and ask if they'd be interested in getting involved?
Re: [HACKERS] 8.4 release planning
Tom Lane wrote: Hmm, you think selinux people read pgsql-announce? But seriously, we *have* been trying to get people's attention for this patch, both inside and outside the postgres community, for well over a year now. The lack of response has been depressing and (IMHO) telling. Nowhere have we gotten anything more concrete than "ooh, that's cool ... maybe I might use it someday, but I can't be bothered right now." Ah! Then yes, that does say something about the lack of interest. It wasn't obvious to me that people were reaching out beyond these lists.
Re: [HACKERS] Pluggable Indexes
Gregory Stark wrote: Simon Riggs si...@2ndquadrant.com writes: The original design of Postgres allowed pluggable index access methods, but that capability has not been brought forward to allow for WAL. This patch would bridge that gap. Well I think what people do is what GIST did early on -- they just don't support recoverability until they get merged into core. What other constraints are there on such non-in-core indexes? Early (2005) GIST indexes were very painful in production environments because vacuuming them held locks for a *long* time (IIRC, an hour or so on my database) on the indexes, locking out queries. Was that just a shortcoming of the implementation, or was it a side-effect of them not supporting recoverability? If the latter, I think that's a good reason to try to avoid developing new index types the same way the GIST guys did.
Re: [HACKERS] Recovery Test Framework
Robert Haas wrote: 2. Start using more git... This is a red herring, unless your proposal also includes making the master CVS^H^H^Hgit repository world-writable. The complaint I have about people posting URLs is that there's no stable archive of what the patches really were, and just because it came out of someone's local git repository doesn't help that. No, git really does help with this. ... git IS a stable archive of what the patches really were. Sorry to re-ignite the flame war, but this is the *perfect* example of the single most compelling advantage of git over cvs. All of Simon's history remains visible in git on his branch. Better - any patches submitted to Simon by code reviewers that Simon accepts (pulls) into his branch - can also be seen on branches off of Simon's branch with the complete history of where they came from. When/if the patch eventually gets accepted into the master, as much (or as little, thanks to git-rebase) of the history of that branch can be pulled along with it; as can be seen with the major merges of linux branches here: http://repo.or.cz/git-browser/by-commit.html?r=linux-2.6.git There's no need for the master git to be world-writable. The few with write access choose exactly how much history from Simon's branch (and from the code reviewers' branches) they want to merge in when they pull/merge from his branch.
Re: [HACKERS] Frames vs partitions: is SQL2008 completely insane?
Hitoshi Harada wrote: 2008/12/28 Tom Lane t...@sss.pgh.pa.us: Hitoshi Harada umi.tan...@gmail.com writes: 2008/12/27 Tom Lane t...@sss.pgh.pa.us: which doesn't conform to spec AFAICS ... 4.15...says: interesting...6.10 general rule 1b, which very clearly states ... ... 4.15 does seem like evidence that the spec authors may have misspoken in 6.10 Oracle... results are: ... which means the section 4.15 is true ISTM ISO should hire you guys (or the postgres project as a whole) to proof-read their specs before they publish them.
Re: [HACKERS] incoherent view of serializable transactions
Robert Haas wrote: ... serializable transaction ... If we were to construct a database that had one giant lock for the entire database so that only a single query could execute at one time, transactions would be serializable (because they'd in fact be serialized). However, performance would suck. I wonder if this giant-lock-for-isolation-level-serializable is a mode postgres should support. ISTM it would meet the letter of the spec, and at least some of the people using transaction isolation level serializable are doing so precisely because they *want* the database to deal with all possible serialization issues, and accept the performance penalties.
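The "one giant lock" idea discussed above can be illustrated with a toy in-memory store in Python. This is a minimal sketch, assuming a hypothetical `GiantLockDatabase` class; it says nothing about how PostgreSQL would actually implement such a mode. Every transaction body runs while holding a single global lock, so the execution is literally serial.

```python
import threading


class GiantLockDatabase:
    """Toy store where every transaction runs under one global lock."""

    def __init__(self):
        self._giant_lock = threading.Lock()
        self.data = {}
        self.commit_order = []

    def run_transaction(self, name, body):
        # Holding the single lock for the whole transaction means the
        # execution is literally serial, hence trivially serializable.
        with self._giant_lock:
            result = body(self.data)
            self.commit_order.append(name)
            return result


def increment_counter(data):
    # A read-modify-write that could lose updates under weaker isolation.
    data["counter"] = data.get("counter", 0) + 1


def run_demo(n_transactions):
    db = GiantLockDatabase()
    threads = [
        threading.Thread(target=db.run_transaction,
                         args=(f"txn{i}", increment_counter))
        for i in range(n_transactions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db


db = run_demo(50)
print(db.data["counter"], len(db.commit_order))
```

The recorded `commit_order` is itself a valid serial schedule, which is the sense in which such a mode would meet the letter of the spec; the cost is that no two transactions ever overlap.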
Re: [HACKERS] Sync Rep: First Thoughts on Code
Josh Berkus wrote: Hmmm. I thought this was pretty clear. There's three levels of synch which are useful features:
1) synchronous standby which is really asynchronous, but only has a gap of 100ms.
2) Synchronous standby which guarantees that all committed transactions are on the failover node and that no data will be lost for failover, but the failover node is still in standby mode.
3) Synchronous replication where the standby node has identical transactions to the master node, and is queryable read-only.
Any of these levels would be useful
Isn't the queryable read-only feature totally orthogonal with how synchronous the replication is? For one reporting system I have, where new data is continually being added every second, I'd love to have a read-only-slave even if that system has the 100ms gap you mentioned in #1. Heck I don't care if the queries it runs even have a 100 *minute* gap; but I sure would like it to be synchronous in the sense that all the transactions survive a failure of the primary.
Re: [HACKERS] Mostly Harmless: Welcoming our C++ friends
Tom Lane wrote: I am, btw, still waiting for an actually plausible use-case for this. AFAICS the setjmp-vs-exceptions thing puts a very serious crimp in what you could hope to accomplish by importing a pile of C++ code. The one use-case I can think of that imports a pile of C++ code is the GEOS library that PostGIS uses (used?): http://postgis.refractions.net/support/wiki/index.php?GEOS "GEOS is a C++ port of the JTS Topology Suite. It is used by PostGIS to implement Topological functions." However it seems to work fine even without the C++-header project, so I must be missing something here...
Re: [HACKERS] Sync Rep: First Thoughts on Code
Robert Haas wrote: We can make the reply to a commit message when any of the following events have occurred
1. We sent the message to standby
2. We received the message on standby
3. We wrote the WAL to the WAL file
4. We fsync'd the WAL file
5. We CRC checked the WAL commit record
6. We applied the WAL commit record
Perhaps it'd be useful if the failure modes these are trying to protect against were described too. If I understand right:
1. Protects all the transactions from the failure of the master; so long as neither the network nor the slave machine die soon?
2. Protects all the transactions from the failure of the master and the network between the slave and master, so long as the slave doesn't die soon?
3. Same as #2?
4. Protects against the failure of the master, the network, and parts of the slave; so long as the slave's disk survives the failure?
5. Protects against all of the above, and bit-errors in the memories of the slave machine (except the slave's disk controller?)? Or are we reading-back the CRC from the slave's disk and comparing to the CRC computed on the master where it might protect from even more?
6. Same as 4?
If this is right, #2, #3, #4, and #6 feel similar except that they're protecting against failures of different (but still all incomplete) subsets of the hardware on the slave, right?
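The six acknowledgment points and the tentative failure-mode reading given above can be written down as a small Python table. This is only a sketch of the email's own question-marked guesses, not a statement of what any replication implementation guarantees; the failure names (`master`, `network`, `standby_crash`, `standby_memory_bit_error`) and the `commit_survives` helper are hypothetical labels chosen for the example.

```python
from enum import IntEnum


class AckPoint(IntEnum):
    SENT_TO_STANDBY = 1
    RECEIVED_ON_STANDBY = 2
    WRITTEN_TO_WAL_FILE = 3
    FSYNCED_WAL_FILE = 4
    CRC_CHECKED = 5
    APPLIED = 6


# Failures each point is guessed to survive, following the email's
# question-marked reading; the failure names are hypothetical labels.
SURVIVES = {
    AckPoint.SENT_TO_STANDBY: frozenset({"master"}),
    AckPoint.RECEIVED_ON_STANDBY: frozenset({"master", "network"}),
    AckPoint.WRITTEN_TO_WAL_FILE: frozenset({"master", "network"}),
    AckPoint.FSYNCED_WAL_FILE: frozenset(
        {"master", "network", "standby_crash"}),
    AckPoint.CRC_CHECKED: frozenset(
        {"master", "network", "standby_crash", "standby_memory_bit_error"}),
    AckPoint.APPLIED: frozenset({"master", "network", "standby_crash"}),
}


def commit_survives(ack_point, failures):
    # True when every failure that occurred is one this ack point covers.
    return set(failures) <= SURVIVES[ack_point]


print(commit_survives(AckPoint.SENT_TO_STANDBY, {"master", "network"}))
print(commit_survives(AckPoint.RECEIVED_ON_STANDBY, {"master", "network"}))
```

Writing the table out makes the email's closing observation visible: points 2 and 3 map to the same set, as do 4 and 6, so under this reading they differ only in which part of the standby is assumed to stay up.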
Re: [HACKERS] benchmarking the query planner
Gregory Stark wrote: Simon Riggs si...@2ndquadrant.com writes: The amount of I/O could stay the same, just sample all rows on block. [...] It will also introduce strange biases. For instance in a clustered table it'll think there are a lot more duplicates than there really are because it'll see lots of similar values. But for ndistinct - it seems it could only help things. If the ndistinct guesser just picks max(the-current-one-row-per-block-guess, a-guess-based-on-all-the-rows-on-the-blocks) it seems we'd be no worse off for clustered tables; and much better off for randomly organized tables. In some ways I fear *not* sampling all rows on the block also introduces strange biases by largely overlooking the fact that the table's clustered. In my tables clustered on zip-code we don't notice info like "state='AZ' is present in well under 1% of blocks in the table", while if we did scan all rows on the blocks it might guess this. But I guess a histogram of blocks would be an additional stat rather than an improved one.
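The max-of-two-guesses suggestion above can be tried on a toy clustered table in Python. This is a sketch under assumptions: the scale-up formula n*d / (n - f1 + f1*n/N) is used as a stand-in for the planner's ndistinct estimator (I believe it matches the estimator ANALYZE uses, but treat that as an assumption), and the table layout, block size, and sampling of every tenth block are invented for the example.

```python
from collections import Counter


def estimate_ndistinct(sample, total_rows):
    # Scale-up estimator n*d / (n - f1 + f1*n/N), capped at N, where
    # n = sample size, d = distinct values seen, f1 = values seen once.
    # Assumes a non-empty sample.
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return min(total_rows, n * d / (n - f1 + f1 * n / total_rows))


ROWS_PER_BLOCK = 10
# Clustered toy table: 1000 rows, 500 distinct values, each stored twice
# in adjacent rows.
table = [i // 2 for i in range(1000)]
blocks = [table[i:i + ROWS_PER_BLOCK]
          for i in range(0, len(table), ROWS_PER_BLOCK)]
sampled_blocks = blocks[::10]  # every tenth block

one_row_per_block = [block[0] for block in sampled_blocks]
all_rows_on_blocks = [v for block in sampled_blocks for v in block]

current_guess = estimate_ndistinct(one_row_per_block, len(table))
whole_block_guess = estimate_ndistinct(all_rows_on_blocks, len(table))
combined_guess = max(current_guess, whole_block_guess)
print(current_guess, whole_block_guess, combined_guess)
```

On this clustered table the whole-block sample sees every value exactly twice and so guesses far too few distinct values, which is the bias Gregory Stark describes; taking the max simply keeps the one-row-per-block guess, so the combined guess is no worse than the current one here. How the two guesses compare on a randomly ordered table depends on the data and is not shown.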
Re: [HACKERS] Mostly Harmless: Welcoming our C++ friends
Tom Lane wrote: Given the above constraints, I think the only real role for C++ here would be to allow access to third-party C++ libraries as Postgres extensions --- for instance something like an XML or numerical analysis I seem to recall that we're already able to do this. IIRC, some older postgis's wrapped some C++ library that they used internally; and some of my old scripts for installing postgres have: env LDFLAGS=-lstdc++ ./configure --prefix=$PGHOME I guess existing current C++ postgres extensions probably have a C wrapper? And I guess the main benefit of this project would be that the C wrapper could be thinner (or even nonexistent?) with these proposed changes?