[HACKERS] Data from zone.tab
Is there any reason that the zone.tab information is not included in the pg_timezone_names system view? ISTM that there is really no reason not to, as that view is really populated using that file anyway. There is a 1:1 mapping (assuming the aliases are mapped to the zone.tab entries they are aliases of) of entries in that view with entries in zone.tab. Reading an earlier thread on this matter, I think Magnus is behind the code that generates the view. What are the chances of getting at least the country code included in the pg_timezone_names system view? It'd really help out with i18n / L10n work, and given that PG already ships with that data present, it seems silly to not take advantage of it given how easy it would be to do so. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
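For reference, zone.tab is a plain tab-separated file (ISO country code, coordinates, zone name, optional comment), so the zone-name-to-country mapping being requested is cheap to derive. A minimal sketch in Python; the sample lines are made up but follow the file's documented format:

```python
# Map zone names to ISO 3166 country codes from zone.tab-format text.
# Format per line: CODE<TAB>coordinates<TAB>zone-name[<TAB>comment]
SAMPLE_ZONE_TAB = (
    "# comment lines start with '#'\n"
    "AD\t+4230+00131\tEurope/Andorra\n"
    "AU\t-3352+15113\tAustralia/Sydney\tNew South Wales (most areas)\n"
    "NO\t+5955+01045\tEurope/Oslo\n"
)

def zone_countries(text):
    mapping = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        country, _coords, zone = line.split("\t")[:3]
        mapping[zone] = country
    return mapping

print(zone_countries(SAMPLE_ZONE_TAB)["Europe/Oslo"])  # NO
```

Essentially the same parse is all the view would need, plus joining aliases back to their canonical zone.tab entries.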
Re: [HACKERS] VACUUM FULL out of memory
Tom Lane wrote:
Michael Akinde [EMAIL PROTECTED] writes:
INFO: vacuuming pg_catalog.pg_largeobject
ERROR: out of memory
DETAIL: Failed on request of size 536870912
Are you sure this is a VACUUM FULL, and not a plain VACUUM?

Very sure. Ran a VACUUM FULL again yesterday (the prior query was a VACUUM FULL ANALYZE) and received essentially the same error, simply with a different failure size.

INFO: vacuuming pg_catalog.pg_largeobject
ERROR: invalid memory alloc request size 1073741824

No changes were made on the system since the previous iteration. VACUUM ran OK on the 8.3beta2 instance I tested with before Christmas (the current setup is 8.2.5).

I suspect that it's the latter, and the reason it's failing is that you are running the postmaster under a ulimit that is less than 512MB (or at least not enough more to allow an allocation of that size).

We went over this somewhat prior to Christmas. Here's how it's currently set up:

$ ulimit -a
core file size (blocks, -c) 1
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Anything we should try to change?

Andrew Sullivan wrote:
Something is using up the memory on the machine, or (I'll bet this is more likely) your user (postgres? Whatever's running the postmaster) has a ulimit on its ability to allocate memory on the machine.

If one looks at the system resources while the VACUUM FULL is going on, it's pretty obvious that it's a postgres process going on a memory allocation rampage that eats up all the resources.

No, it's not really that big. I've never seen a problem like this.
If it were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at the OS settings first given your setup.

Have the same problem with the 8.3beta, but won't be using it anyway until it's been out for a while.

Note that you should almost never use VACUUM FULL unless you've really messed things up. I understand from the thread that you're just testing things out right now. But VACUUM FULL is not something you should _ever_ need in production, if you've set things up correctly.

That's good to hear. I'm not particularly worried about this with respect to my own system. So far, we have found Postgres amazingly robust in every other issue that we have deliberately (or unwittingly) provoked. More reason to be puzzled about this problem, though.

Holger Hoffstaette wrote:
Then why does it exist? Is it a historical leftover? If it is only needed for emergencies, should it not have a different name?

Or in this case: if VACUUM FULL is never required (except in very special circumstances), it might be a good idea not to have VACUUM recommend running it (cf. the VACUUM I ran before New Year on a similar-size table).

INFO: vacuuming pg_catalog.pg_largeobject
INFO: scanned index pg_largeobject_loid_pn_index to remove 106756133 row versions
DETAIL: CPU 38.88s/303.43u sec elapsed 2574.24 sec.
INFO: pg_largeobject: removed 106756133 row versions in 13190323 pages
DETAIL: CPU 259.42s/113.20u sec elapsed 14017.17 sec.
INFO: index pg_largeobject_loid_pn_index now contains 706303560 row versions in 2674471 pages
DETAIL: 103960219 index row versions were removed. 356977 index pages have been deleted, 77870 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: pg_largeobject: found 17489832 removable, 706303560 nonremovable row versions in 116049431 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 36000670 unused item pointers. 64493445 pages contain useful free space. 0 pages are entirely empty. CPU 1605.42s/1107.48u sec elapsed 133032.02 sec.
WARNING: relation pg_catalog.pg_largeobject contains more than max_fsm_pages pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter max_fsm_pages.

Anyway, thanks for the responses. I do have the test setup available for hopefully some weeks, so if there is anyone interested in digging further into the matter, we do have the possibility to run further test attempts for a while (it takes about a week to load all the data, so once we take it back down, it may be a while before we set it up again).

Regards, Michael Akinde Database Architect, met.no
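The HINT's alternative, raising max_fsm_pages, would look something like the fragment below for 8.2. The numbers are illustrative only and should be sized from your own VACUUM VERBOSE output (the run above reported roughly 64.5 million pages with useful free space); note that each FSM page slot costs about six bytes of shared memory, so a setting this large is itself a sizeable allocation:

```
# postgresql.conf (8.2.x) -- illustrative values, size to your own workload
max_fsm_pages = 70000000      # at least the "pages with useful free space" VACUUM reports
max_fsm_relations = 1000      # one slot per relation being tracked
```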
Re: [HACKERS] ERROR: translation failed from server encoding to wchar_t
On Jan 8, 4:14 am, [EMAIL PROTECTED] (Tom Lane) wrote:
[EMAIL PROTECTED] writes:
I am using tsearch2 with pgsql 8.2.5 and get the following error when calling to_tsvector: translation failed from server encoding to wchar_t
My database is UTF8 encoded and the data sent to to_tsvector comes from a bytea column converted to text with encode(COLUMN, 'escape').

Two likely theories:
1. Your database encoding is UTF-8, but your locale (LC_CTYPE) assumes some other encoding.
2. The encode() is yielding something that isn't valid UTF-8.
PG 8.3 contains checks that should complain about both of these scenarios, but IIRC 8.2 does not.

regards, tom lane

Dear Tom,

Thanks for your reply. This is the output of `locale` on my system:

# locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

As for your second scenario I guess you are right; it's possible encode() does not return only valid UTF-8 characters. But to_tsvector() succeeds and fails at random with this kind of characters... So how can I sanitize the output from encode() before I pipe it to to_tsvector()?

Regards, Ilan
Re: [HACKERS] Data from zone.tab
Sorry to reply to myself, but there should also be a field in the system view, is_alias, so that devs are able to tell which zone names are in the zone.tab file and which are not. That way a perfect 1:1 mapping between zone.tab and app can be made. If this were done then it'd make things like using CLDR data and other standardized data sources easier, as you could be confident that all timezone names matched the data in the CLDR. I think what I'm trying to say is that using and applying standards is a good thing. - Naz.

Naz Gassiep wrote:
Is there any reason that the zone.tab information is not included in the pg_timezone_names system view? ISTM that there is really no reason not to, as that view is really populated using that file anyway. There is a 1:1 mapping (assuming the aliases are mapped to the zone.tab entries they are aliases of) of entries in that view with entries in zone.tab. Reading an earlier thread on this matter, I think Magnus is behind the code that generates the view. What are the chances of getting at least the country code included in the pg_timezone_names system view? It'd really help out with i18n / L10n work, and given that PG already ships with that data present, it seems silly to not take advantage of it given how easy it would be to do so.
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
On Mon, Jan 07, 2008 at 04:24:13PM -0800, Darcy Buskermolen wrote:
On Monday 07 January 2008 16:06:27 Bruce Momjian wrote:
Devrim GÜNDÜZ wrote:
Hi, On Mon, 2008-01-07 at 17:12 -0500, Tom Lane wrote: if no serious bugs come up this week, core is thinking of wrapping 8.3.0 at the end of the week. Please give the packagers and other people one more week break between two releases. We all had a busy weekend.
We have thousands of people waiting for the 8.3 release. It seems odd to hold up 8.3 because the packagers are too busy. Perhaps we need more packagers, or release the source and have the binary packages ready later.
Also, to be fair, I've seen no real announcement of RC1, probably because it's been bundled in with security releases for the other branches. Can we maybe do an RC1 announcement, and let people actually test the RC before we push a gold?

Yeah, when are we going to announce RC1? We need to update the beta pages on the website at the same time.

//Magnus
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
On Mon, Jan 07, 2008 at 08:00:36PM -0500, Bruce Momjian wrote:
Joshua D. Drake wrote:
Hi, On Mon, 2008-01-07 at 17:12 -0500, Tom Lane wrote: if no serious bugs come up this week, core is thinking of wrapping 8.3.0 at the end of the week. Please give the packagers and other people one more week break between two releases. We all had a busy weekend.
We have thousands of people waiting for the 8.3 release. It seems odd to hold up 8.3 because the packagers are too busy. Perhaps we need more packagers, or release the source and have the binary packages ready later.
Perhaps we need to allow the release candidate to percolate and be tested a bit before we rush to release. If I recall, it is "release when it is done", not "release because lots of people are waiting for it".
Right, but you are mixing "source code not ready" with "packagers not ready". They are two different things. I am not saying we are ready to release, but if we determine the source code is ready I would like to avoid more delay because the packagers aren't ready.

(Yes, I read Tom's message saying we are not ready, but these points apply anyway.)

I think you greatly underestimate how important the binary distributions are. If they are not available at release time, we will get a *lot* of questions about that. I know it's certainly true for win32 packaging, and I believe it is for the others as well.

More importantly, we haven't even announced RC1 yet. Which means that we're now talking maybe 4 days of RC before we wrap release, which seems way too little. And in fact, just 4 days of warning for the packagers for something as big as this isn't enough *even if we hadn't just released the back branches*. And we normally give them more warning.

Also note that the tentative release timeline says that the presskit should be verified 6 days before release, something which hasn't even been started. Translations of presskits should be finished 10 days before. Embargoed press releases should go out 7 days before. Etc.
Bottom line, I think Devrim has a very valid complaint, for more reasons than he actually stated :-) That said, 2 or 3 weeks from now seems very doable. But let's pick an actual date around that and inform the packagers and press contacts about it so they can prepare.

//Magnus
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
On 08/01/2008, Darcy Buskermolen [EMAIL PROTECTED] wrote:
On Monday 07 January 2008 16:06:27 Bruce Momjian wrote:
Devrim GÜNDÜZ wrote:
Hi, On Mon, 2008-01-07 at 17:12 -0500, Tom Lane wrote: if no serious bugs come up this week, core is thinking of wrapping 8.3.0 at the end of the week. Please give the packagers and other people one more week break between two releases. We all had a busy weekend.
We have thousands of people waiting for the 8.3 release. It seems odd to hold up 8.3 because the packagers are too busy. Perhaps we need more packagers, or release the source and have the binary packages ready later.
Also, to be fair, I've seen no real announcement of RC1, probably because it's been bundled in with security releases for the other branches. Can we maybe do an RC1 announcement, and let people actually test the RC before we push a gold?

Indeed, the website still says we are on Beta 4. I did not even know RC1 was out until I saw this thread this morning.

Peter.
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
On Tue, Jan 08, 2008 at 09:47:15AM +0000, Peter Childs wrote:
On 08/01/2008, Darcy Buskermolen [EMAIL PROTECTED] wrote:
On Monday 07 January 2008 16:06:27 Bruce Momjian wrote:
Devrim GÜNDÜZ wrote:
Hi, On Mon, 2008-01-07 at 17:12 -0500, Tom Lane wrote: if no serious bugs come up this week, core is thinking of wrapping 8.3.0 at the end of the week. Please give the packagers and other people one more week break between two releases. We all had a busy weekend.
We have thousands of people waiting for the 8.3 release. It seems odd to hold up 8.3 because the packagers are too busy. Perhaps we need more packagers, or release the source and have the binary packages ready later.
Also, to be fair, I've seen no real announcement of RC1, probably because it's been bundled in with security releases for the other branches. Can we maybe do an RC1 announcement, and let people actually test the RC before we push a gold?
Indeed, the website still says we are on Beta 4. I did not even know RC1 was out until I saw this thread this morning.

RC1 isn't out. What's available on the ftp site is a preliminary version of it, which should not be used.

//Magnus
Re: [HACKERS] ERROR: translation failed from server encoding to wchar_t
[EMAIL PROTECTED] wrote:
NEW.idxFTI := to_tsvector('simple', encode($x$E$x$|| NEW.messageblk, 'escape'));

I strongly doubt that this does what you think it does - I would check the results if I were you. The $x$E$x$ should almost certainly not be there - if you are trying to get E'foo' behaviour, that is purely for literals. All you are doing here is to prepend a literal 'E' to your value.

cheers
andrew
Re: [HACKERS] ERROR: translation failed from server encoding to wchar_t
On Jan 8, 10:43 am, [EMAIL PROTECTED] wrote:
On Jan 8, 4:14 am, [EMAIL PROTECTED] (Tom Lane) wrote:
2. The encode() is yielding something that isn't valid UTF-8.
So how can I sanitize output from encode before I pipe it to to_tsvector()?
Regards, Ilan

Tom,

To get around the non-UTF8 chars I used the following function:

CREATE OR REPLACE FUNCTION u_messageblk_idxfti() RETURNS trigger AS $$
DECLARE
BEGIN
    RAISE NOTICE '[DBMAIL] Trying ID %', NEW.messageblk_idnr;
    BEGIN
        NEW.idxFTI := to_tsvector('simple', encode($x$E$x$|| NEW.messageblk, 'escape'));
        RAISE NOTICE '[DBMAIL] Ended ID %', NEW.messageblk_idnr;
        RETURN NEW;
    EXCEPTION WHEN character_not_in_repertoire THEN
        RAISE WARNING '[DBMAIL] character_not_in_repertoire ID %', NEW.messageblk_idnr;
        NEW.idxFTI := to_tsvector('simple', 'character_not_in_repertoire: This email contains illegal characters.');
        RETURN NEW;
    END;
END;
$$ LANGUAGE plpgsql;

Hope this helps others with DBmail and tsearch2 on postgres 8.2. Thanks for your help Tom, ilan
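For anyone hitting the same issue outside the database: the sanitizing step Ilan asked about amounts to decoding the bytes with a replacement policy before the text ever reaches to_tsvector(). A minimal Python sketch (not part of the original thread; the function name is mine):

```python
# Replace byte sequences that are not valid UTF-8 with U+FFFD so the
# result is always well-formed UTF-8 text.
def sanitize_utf8(raw: bytes) -> str:
    return raw.decode("utf-8", errors="replace")

print(sanitize_utf8(b"caf\xe9"))  # Latin-1 e-acute is invalid UTF-8 -> 'caf\ufffd'
print(sanitize_utf8("café".encode("utf-8")))  # valid input passes through unchanged
```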
[HACKERS] Psql command-line completion bug
If you hit tab on a table name containing a \ you get spammed with a series of WARNINGs and HINTs about nonstandard use of \\ in a string literal:

postgres=# select * from bar\baz<TAB>
WARNING: nonstandard use of \\ in a string literal
LINE 1: ... substring(pg_catalog.quote_ident(c.relname),1,7)='bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 3: ...ing(pg_catalog.quote_ident(n.nspname) || '.',1,7)='bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 3: ...alog.quote_ident(nspname) || '.',1,7) = substring('bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 5: ... || '.' || pg_catalog.quote_ident(c.relname),1,7)='bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 5: ...og.quote_ident(n.nspname) || '.',1,7) = substring('bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal
LINE 5: ...alog.quote_ident(nspname) || '.',1,7) = substring('bar\\baz'...
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.

There are a few options here:

1) Use E'' in all the psql completion queries. This means they won't work on older versions of postgres (but they don't in general do so anyways). It would also break anybody who set standard_conforming_strings = 'on'. Ideally we would want to use E'' and then pass false directly to PQescapeStringInternal, but that's a static function.

2) Use $$%s$$ style quoting. Then we don't need to escape the strings at all. We would probably have to move all the quoting outside the C strings and borrow the function from pg_dump to generate the quoting as part of sprintf parameter substitution.
3) Set standard_conforming_strings=on for psql tab-completion queries and then reset it afterwards. That way we can just use plain standard-conforming '' and not get any warnings.

4) Replace PQexec with PQexecParams in tab-complete.c.

Personally I think (4) is the best long-term option, but at this point that doesn't seem feasible. (3) or (2) seems the next best option.

-- Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
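To illustrate option (2): the point of dollar quoting is that nothing inside the body needs escaping, as long as the chosen tag never occurs in the value, so the quoter only has to pick a safe tag. A rough Python sketch of that tag-selection idea (pg_dump's real implementation is in C and differs in detail):

```python
# Wrap a value in PostgreSQL dollar quoting, choosing a tag that does
# not collide with the value's own contents.
def dollar_quote(value: str) -> str:
    tag = ""
    n = 0
    while f"${tag}$" in value:
        tag = f"q{n}"   # try $q0$, $q1$, ... until no collision
        n += 1
    return f"${tag}${value}${tag}$"

print(dollar_quote(r"bar\baz"))  # $$bar\baz$$ -- the backslash needs no escaping
```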
Re: [HACKERS] Psql command-line completion bug
Option 5 would be to deprecate the ability to use a \ in an object name.

Jon

-----Original Message-----
From: Gregory Stark [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 08, 2008 8:14 AM
To: pgsql-hackers list
Subject: [HACKERS] Psql command-line completion bug

[...]
[HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
We previously tried to send this proposal/patch, but it never showed up. We tried twice in a 16-hour period. It might be something with the attachment, who knows. This time, we are providing a URL instead of attaching the patch. Please disregard previous emails if they come back from never-never-land.

http://www.esilo.com/projects/postgresql/libpq/typesys-beta-0.8a.tar.gz

What was previously called PGparam is now called the libpq Type System. PGparam is just one structure within the Type System. This is an updated proposal/patch for a Type System in libpq. Basically, it allows applications to send binary-formatted parameters (put) and receive text- or binary-formatted results (get) through a printf-style interface. It also adds the ability to register user-defined types, sub-classes of existing types, as well as composites. There is full support for arrays, composites, composite arrays and nested composites.

There are four documents included within the tar:
type-system-api.txt - Documents the API functions.
type-specifiers.txt - Explains the type specifier syntax (printf-style). It also documents how to construct parameters and get result values for every supported type.
composites-arrays.txt - Documents the use of arrays, composites and arrays of composites. This proposed API has full support for nested arrays or composites.
type-handlers.txt - Explains how to register and implement a libpq type handler. Types can be sub-classed.

The patch is named typesys.patch. There is a regression test named regression-test.c and a makefile for it named makefile.typesys.

Andrew Merlin eSilo
Re: [HACKERS] VACUUM FULL out of memory
On Tue, Jan 08, 2008 at 09:50:07AM +0100, Michael Akinde wrote:
stack size (kbytes, -s) 8192

Perhaps this is the issue? (I don't know.) Also, this _is_ for the postgres user, right? That's the relevant one: the one that's actually running the back end process. Also, are you sure there's nothing else in the way? I don't remember what OS you're using. On AIX, for instance, there's some _other_ dopey setting that allows you to control user resource consumption as well, and it means that ulimit's answers are not the full story. (I learned this through painful experience, and confess it's one of the many reasons I think AIX should be pronounced as one word, rather than three letters.)

Andrew Sullivan wrote:
Something is using up the memory on the machine, or (I'll bet this is more likely) your user (postgres? Whatever's running the postmaster) has a ulimit on its ability to allocate memory on the machine.

If one looks at the system resources while the VACUUM FULL is going on, it's pretty obvious that it's a postgres process going on a memory allocation rampage that eats up all the resources.

Of course VACUUM FULL is eating up as much memory as it can: it's moving a lot of data around. But is it in fact exhausting memory on the machine? There are only two possibilities: either there's something else that is preventing that allocation, or else you've run into a case so unusual that nobody else has ever seen it. The data you're talking about isn't that big: I've run similar-sized databases on my laptop without pain.

Or in this case: if VACUUM FULL is never required (except in very special circumstances), it might be a good idea not to have VACUUM recommend running it (cf. the VACUUM I ran before New Year on a similar-size table).

The suggestion you see there, though, is in fact one of the cases where you might in fact want to run it.
That is,

WARNING: relation pg_catalog.pg_largeobject contains more than max_fsm_pages pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter max_fsm_pages.

What it is saying is that a regular vacuum can no longer recover all the dead pages in the table, and if you want that space back and marked usable on your disk, you have to run VACUUM FULL (or, in fact, CLUSTER, or else dump and reload the table; but one of these). Note that I said that, if you have things configured _correctly_, you shouldn't have to run VACUUM FULL except in unusual circumstances. That doesn't mean never. The problem here is an historical one: you have a hangover from previous missed maintenance or sub-optimal vacuum scheduling. In those cases, you may want to perform VACUUM FULL, provided you understand the potential side effects (like possibly slower inserts initially, and some possible index bloat).

A
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Hi, Andrew Chernow wrote: It might be something with the attachment, who knows. Most probably that was the case, yes. The -hackers list is limited, please use -patches to send patches. ;-) Regards Markus
Re: [HACKERS] VACUUM FULL out of memory
Michael Akinde [EMAIL PROTECTED] writes:
We went over this somewhat prior to Christmas. Here's how it's currently set up.
$ ulimit -a
core file size (blocks, -c) 1
...

What you're showing us is the conditions that prevail in your interactive session. That doesn't necessarily have a lot to do with the ulimits that init-scripts run under ...

regards, tom lane
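On Linux, one way to see the limits a running postmaster actually inherited, rather than the ones an interactive shell reports, is to read /proc/<pid>/limits. The pgrep line below is illustrative (the process name can vary by packaging); the last line is a runnable stand-in that inspects the current process:

```shell
# Limits of a running postmaster (illustrative; adjust the process name):
#   cat /proc/$(pgrep -o -x postmaster)/limits
# Runnable stand-in: read the same pseudo-file for the current process.
grep -E 'Max (address space|data size|stack size)' /proc/self/limits
```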
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
Magnus Hagander [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 09:47:15AM +0000, Peter Childs wrote: Indeed the website still says we are on Beta 4. I did not even know RC1 was out until I saw this thread this morning. RC1 isn't out. What's available on the ftp site is a preliminary version of it, that should not be used. Eh? On what do you base that statement? RC1 is what it is, there is nothing preliminary about it; and anyone who is running an 8.3beta installation that is at all security-exposed would be well advised to get onto it ASAP. We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out. regards, tom lane
Re: [HACKERS] VACUUM FULL out of memory
On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote: Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings? Are you sure? On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up. A
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
Magnus Hagander [EMAIL PROTECTED] writes:
On Tue, Jan 08, 2008 at 11:23:13AM -0500, Tom Lane wrote:
We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out.
There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, is considered preliminary until announced one way or another.

Uh, no, that isn't the project policy. If we were to find some fatal problem in RC1 at this point, we'd spin an RC2, precisely because RC1 has been up on the servers for a couple of days now and confusion would inevitably result if we tried to redefine what RC1 was. The lack of an announcement is not my bailiwick, but tarball-making is. Once a tarball appears in the public FTP directories, it's official, and there's no reason to discourage people from using it.

regards, tom lane
Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps
On Tue, Jan 08, 2008 at 01:08:52AM +0100, Markus Schiltknecht wrote: Uh, which key are you talking about? AFAIU Simon's proposal, he suggests maintaining min/max values for all columns of the table. Right, but I think that's just because that approach is automatable. Only some use cases are going to be appropriate to this. Yeah, and if only *one* tuple in the 1G segment has: some_date <= '1998-12-31' OR some_date >= '2001-01-01' Segment Exclusion can't exclude that segment. That's all I'm saying. Correct. Huh? I'm certainly not the one asking for it. Quite the opposite, I'm warning from over-estimating the use of SE. Right; I think one should be clear that there are many -- maybe most -- uses of PostgreSQL where the proposal will be of no use. I just think we need to be clear that for the areas where it _can_ be useful, it could be very useful indeed. A ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps
On Tue, Jan 08, 2008 at 02:12:28AM +, Gregory Stark wrote: Yes: it doesn't solve the problem I have, which is that I don't want to have to manage a whole bunch of tables. I want one table, and I want to be able to say, That section is closed. That's not your problem, that's the solution you're looking for. You're assuming a particular solution in your problem statement. Probably in that one, yes. I'm still waiting for permission to post my original problem statement; I suspect it's not going to be forthcoming by next Monday, so it's not going to happen. But I did outline something like what I'm talking about elsewhere in this thread. For my case, I'm thinking of the sort of data that builds up over time, and most of which happens probably not to be useful at any moment, but all of which _might_ be useful over the long haul. So what I wanted, originally, was to be able to set arbitrary ranges of tuples to be read-only, and to be able to set them offline if I wanted. Pseudo-DDL: ALTER TABLE foo SET read_only='t' WHERE created_on < '2007-01-01'; ALTER TABLE foo SET tuple_offline='t' WHERE created_on < '2006-01-01'; Now, the second segment is marked offline. If I query the table for things in that range, I get an ERROR telling me there might be data in the range, but it's not mounted at the moment. If I try to update records in the read-only (first) range, I get an error telling me the tuple is marked read only. The idea then is that these older tuples can be put off into long-term storage (wave hands here about the management of that stuff), and this keeps my online system compact but yet allows me, for just the cost of mounting a backup tape and reading the segments back, to go back and query those old ranges. The case I was particularly aiming at originally was for a case of data that cannot cost more than fractions of pennies to store, but that might represent a hugely expensive liability if the answer is not always right. 
Driving down that storage cost was mostly what I was aiming at, but people gradually convinced me that slightly more generic implementations might be useful. Simon's proposal came along, and it seems to me to be something like the generic implementation that others already convinced me was needed. I think Simon's proposal loses the very feature that makes partitioning useful. The DBA doesn't have a thing to describe, he has to define what parts of the table he's describing for every operation. And if you define a whole new object to name these things I think you'll end up with something that looks a lot like tables. I don't see how that's the case at all. In fact, I have the feeling it's the opposite, so perhaps I've misunderstood something. A ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
On Tue, Jan 08, 2008 at 11:23:13AM -0500, Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 09:47:15AM +, Peter Childs wrote: Indeed the website still says we are on Beta 4. I did not even know RC1 was out until I saw this thread this morning. RC1 isn't out. What's available on the ftp site is a preliminary version of it, that should not be used. Eh? On what do you base that statement? RC1 is what it is, there is nothing preliminary about it; and anyone who is running an 8.3beta installation that is at all security-exposed would be well advised to get onto it ASAP. We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out. There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, are considered preliminary until announced one way or another. And I was under the impression that we *were* going to announce RC1 sometime this week. And that we just didn't want to do it at the exact same time as we did the backbranch releases. If not then we really need to update the website with this information. But we don't normally release RC or beta (or any other, for that matter) releases without telling anybody, so IMHO it seems like a bad idea... //Magnus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] VACUUM FULL out of memory
Tom Lane wrote: Michael Akinde [EMAIL PROTECTED] writes: $ ulimit -a core file size (blocks, -c) 1 ... What you're showing us is the conditions that prevail in your interactive session. That doesn't necessarily have a lot to do with the ulimits that init-scripts run under ... Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings? Regards, Michael A. Database Architect, Met.no ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 11:23:13AM -0500, Tom Lane wrote: We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out. There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, are considered preliminary until announced one way or another. Uh, no, that isn't the project policy. If we were to find some fatal problem in RC1 at this point, we'd spin an RC2, precisely because RC1 has been up on the servers for a couple days now and confusion would inevitably result if we tried to redefine what RC1 was. The lack of an announcement is not my bailiwick, but tarball-making is. Once a tarball appears in the public FTP directories, it's official, and there's no reason to discourage people from using it. Ok. That's not how I thought it was, and we have previously pulled releases that were available as tarballs and re-released them with the same version number. Sure, it was a while ago, but it has happened, and if I'm not completely mistaken, more than once. Anyway. My apologies for the incorrect statement in that case, and just let me (or us, really) know when it's time to update the webpage. //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 11:23:13AM -0500, Tom Lane wrote: We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out. There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, are considered preliminary until announced one way or another. Uh, no, that isn't the project policy. If we were to find some fatal problem in RC1 at this point, we'd spin an RC2, precisely because RC1 has been up on the servers for a couple days now and confusion would inevitably result if we tried to redefine what RC1 was. For example, Martin Pitt pushed RC1 into Debian on Sat, 05 Jan 2008 19:19:46 +0100. The lack of an announcement is not my bailiwick, but tarball-making is. Once a tarball appears in the public FTP directories, it's official, and there's no reason to discourage people from using it. regards, tom lane -- Cédric Villemain Database Administrator Cell: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] VACUUM FULL out of memory
Andrew Sullivan [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote: Those are the ulimits of the db_admin account (i.e., the user that set up and runs the DB processes). Is Postgres limited by other settings? Are you sure? On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up. The only thing I find convincing is to insert ulimit -a someplace into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too). regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
Magnus Hagander wrote: Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 11:23:13AM -0500, Tom Lane wrote: We didn't include 8.3RC1 in the security announcement because Josh wanted to make a separate announcement for it, but from every perspective except the PR one, it's out. There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, are considered preliminary until announced one way or another. Uh, no, that isn't the project policy. If we were to find some fatal problem in RC1 at this point, we'd spin an RC2, precisely because RC1 has been up on the servers for a couple days now and confusion would inevitably result if we tried to redefine what RC1 was. The lack of an announcement is not my bailiwick, but tarball-making is. Once a tarball appears in the public FTP directories, it's official, and there's no reason to discourage people from using it. Ok. That's not how I thought it was, and we have previously pulled releases that were available as tarballs and re-released them with the same version number. Sure, it was a while ago, but it has happened, and if I'm not completely mistaken, more than once. yeah that is my recollection too ... Anyway. My apologies for the incorrect statement in that case, and just let me (or us, really) know when it's time to update the webpage. yeah while several of us noticed that new RC1 tarballs went up it was not immediately clear to me that we can now announce RC1 on the website. Confusion on our major release policy during the 8.2 release led to the following document on the wiki: http://developer.postgresql.org/index.php/MajorReleaseTimeline if this page (which is only a shell but we could use that as an opportunity to improve it) still has any relevance - where exactly on that timeline are we now? Stefan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] VACUUM FULL out of memory
On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote: Andrew Sullivan [EMAIL PROTECTED] writes: On one system I used many years ago, /bin/sh wasn't what I thought it was, and so the ulimit that I got when logged in was not what the postmaster was starting under. Took me many days to figure out what was up. The only thing I find convincing is to insert ulimit -a someplace into the script that starts the postmaster, adjacent to where it does so, and then reboot. There are too many systems on which daemons are launched under settings different from what interactive shells use (a policy that's often a good one, too). What about a stored procedure in a language that allows you to do system(3) calls? Sam ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Markus Schiltknecht wrote: Hi, Andrew Chernow wrote: It might be something with the attachment, who knows. Most probably that was the case, yes. The -hackers list is limited, please use -patches to send patches. ;-) Regards Markus Noted. In our case, it's a little ambiguous whether -hackers or -patches is the correct place. We are really posting a proposal that happens to have a working implementation. We are looking for feedback and/or a discussion. andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Andrew Chernow wrote: Markus Schiltknecht wrote: Hi, Andrew Chernow wrote: It might be something with the attachment, who knows. Most probably that was the case, yes. The -hackers list is limited, please use -patches to send patches. ;-) Regards Markus Noted. In our case, it's a little ambiguous whether -hackers or -patches is the correct place. We are really posting a proposal that happens to have a working implementation. We are looking for feedback and/or a discussion. You should split it and send the proposal to -hackers. Ideally, you would have had a proposal discussed before you wrote a line of code. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] VACUUM FULL out of memory
Sam Mason [EMAIL PROTECTED] writes: On Tue, Jan 08, 2008 at 12:33:34PM -0500, Tom Lane wrote: The only thing I find convincing is to insert ulimit -a someplace into the script that starts the postmaster, What about a stored procedure in a language that allows you to do system(3) calls? Yeah, that would work, if you have any untrusted languages installed. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Named vs Unnamed Partitions
Hi, IMO, the lengthy discussion about Segment Exclusion and Segment Visibility Maps has long turned into a discussion about partitioning in general. I'm thankful for all the new insights it has brought me and I want to continue sharing my view on things. What's following is highly theoretical and has brainstorming characteristics. You've been warned. There are two very distinct ways to handle partitioning. For now, I'm calling them named and unnamed partitioning. Let's have a closer look at both options from a user's point of view. I'm using Andrew's pseudo DDL example from the above mentioned thread: ALTER TABLE foo SET read_only='t' WHERE created_on < '2007-01-01'; Given all tuples were read-writeable before, that implicitly created two partitions. Giving them names could look like this: ALTER TABLE foo SPLIT INTO old_foos AND new_foos AT created_on < '2007-01-01'; ALTER PARTITION old_foos SET READ ONLY; Instead of only setting the read-only property, one could also set an alternative table space for the partition: ALTER TABLE foo SET TABLE SPACE large_but_slow_storage WHERE created_on < '2007-01-01'; vs: ALTER PARTITION old_foos SET TABLE SPACE large_but_slow_storage; Please also note that neither variant is limited to range partitioning. You can theoretically partition by pretty much anything, for example with a WHERE clause like: ... WHERE (id % 5) < 2 The primary difference I see between these two ways to declare partitions is, that the former only modifies tuple properties (read-only, storage location), while the latter also tells the database *why* it has to modify them. That has several different effects. First, newly inserted tuples are treated differently. For unnamed partitions, there must be defaults, like read-writable and a default table space. With named partitions, you define split points, so I guess one expects newly inserted tuples to end up in the right partition automatically. 
Unnamed partitioning could be equally automatic when letting a function decide where to insert the new tuple. Second, repartitioning must be treated differently. With unnamed partitioning, the admin must first adjust the defaults (if required) and then move the existing tuple properties accordingly. With named partitions, the admin only needs to adjust the split point and the database system knows what it has to do. And third, but IMO most importantly: to be able to optimize queries, the database system has to know split points, so it can exclude partitions or segments from scanning. Obviously, with named partitions, it always knows them. Otherwise, you'll have to maintain some information about the tuples in your partitions, as Simon does with the min/max tuples. As soon as required, it could also maintain additional min/max values, i.e. for (id % 5) for the above example. I hope to have shown the most relevant aspects. To conclude, I'd say that named partitioning is closer to manually managed partitioning, as already known and often used. While unnamed partitioning is closer to automated partitioning, where the DBA does *not need* to have names for partitions, which is a pretty new and interesting idea to me. Regards Markus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Andrew Dunstan wrote: Andrew Chernow wrote: Markus Schiltknecht wrote: Hi, Andrew Chernow wrote: It might be something with the attachment, who knows. Most probably that was the case, yes. The -hackers list is limited, please use -patches to send patches. ;-) Regards Markus Noted. In our case, it's a little ambiguous whether -hackers or -patches is the correct place. We are really posting a proposal that happens to have a working implementation. We are looking for feedback and/or a discussion. You should split it and send the proposal to -hackers. Ideally, you would have had a proposal discussed before you wrote a line of code. cheers andrew proposal discussed before you wrote a line of code Yeah, we realize that. In our situation, we use this code internally which is why it exists. Back in Aug 2007, we packaged it up and proposed it because we thought it would be useful to others. Since then, we have submitted several versions. Feedback was minimal. In fact, only Tom has made suggestions, which we have taken into consideration and adjusted the spec accordingly. We are interested in having a discussion about the beta-0.8a proposal and concept, not the implementation or submitting procedures. We provided the code in case someone wants to take a test drive. Andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
On Jan 8, 2008 12:57 PM, Andrew Dunstan [EMAIL PROTECTED] wrote: You should split it and send the proposal to -hackers. Ideally, you would have had a proposal discussed before you wrote a line of code. This is the latest in a long series of submissions...check the archives. We are (and have been) sensitive to the other pressures resulting from the 8.3 release. That said, we have been a little dismayed in the lack of comment. We attributed this to either lack of interest or just general busyness (we perhaps optimistically guessed the latter). We needed our extensions for our own projects and are willing to maintain them outside of the project if we have to...that is for the community to decide. For the record, we are extremely excited about the libpq changes and think others will be as well. Tom noted the lack of documentation as well as a number of technical issues. We addressed those issues (within the scope of what we wanted to accomplish). merlin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
Tom Lane wrote: Comparing the behavior of this to my patch for HEAD, I am coming to the conclusion that this is actually a *better* planning method than removing the redundant join conditions, even when they're truly redundant! The reason emerges as soon as you look at cases involving more than a single join. If we strip the join condition from just one of the joins, then we find that the planner insists on doing that join last, whether it's a good idea or not, because clauseful joins are always preferred to clauseless joins in the join search logic. Would it be a good idea to keep removing redundant clauses and rethink the preference for clauseful joins, going forward? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
I wrote: Haven't looked closely at how to fix 8.2, yet. After some study it seems that the simplest, most reliable fix for 8.2 is to dike out the code that removes redundant outer join conditions after propagating a constant across them. This gives the right answer in the cases of concern (where we actually need the join condition) and doesn't really add much overhead in the cases where we don't need it. One small problem is that the join condition is redundant with the generated constant-equality constraints (mostly so, even if not entirely so) which will cause the planner to underestimate the size of the join, since clausesel.c is not very bright at all about redundant conditions. However, we already have a hack we can use for that: we can force the cached selectivity estimate for the join clause to 1.0, so that it's not considered to reduce the join size any more than the constant conditions already did. (This is also a problem in my earlier patch for 8.3, with the same fix possible.) That leads to the attached very simple patch. There is some dead code left behind, but it doesn't seem worth removing it. I'm rather tempted to patch 8.1 similarly, even though it doesn't fail on the known test case --- I'm far from convinced that there are no related cases that will make it fail, and in any case it's getting the selectivity wrong. 8.0 and before don't try to propagate constants like this, so they're not at risk. Comparing the behavior of this to my patch for HEAD, I am coming to the conclusion that this is actually a *better* planning method than removing the redundant join conditions, even when they're truly redundant! The reason emerges as soon as you look at cases involving more than a single join. If we strip the join condition from just one of the joins, then we find that the planner insists on doing that join last, whether it's a good idea or not, because clauseful joins are always preferred to clauseless joins in the join search logic. 
What's worse, knowing that this is an outer join, is that the only available plan type for a clauseless outer join is a NestLoop with the inner side on the right, which again may be a highly nonoptimal way to do it. None of this matters a whole lot if the pushed-down constant conditions select single rows, but it does if they select multiple rows. I'm trying this in the regression database: select * from tenk1 a left join tenk1 b on (a.hundred = b.hundred) left join tenk1 c on (b.hundred = c.hundred) where a.hundred = 42; and finding patched 8.2 about 2X faster than 8.3 because it selects a better plan that avoids multiple rescans of subplans. So I'm coming around to the idea that getting rid of the redundant join conditions is foolish micro-optimization, and we should leave them in place even when we know they're redundant. The extra execution cycles paid to test the condition don't amount to much in any case, and the risk of getting a bad plan is too high. This is a reasonably simple adjustment to my prior patch for 8.3, which I will go ahead and make if there are no objections... regards, tom lane (Attachment: const-propagation-8.2.patch) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: 8.3.0 release schedule (Was:Re: [HACKERS] [BUGS] BUG #3852: Could not create complex aggregate)
All, There has been no announcement whatsoever. Our web site still claims beta4 is the current version. I was under the impression that this tarball, like all others, are considered preliminary until announced one way or another. Sorry about that. I was confused and thought we were deliberately waiting 24 hours between the security release and the RC1 announcement. It's out now. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] OUTER JOIN performance regression remains in 8.3beta4
Alvaro Herrera [EMAIL PROTECTED] writes: Would it be a good idea to keep removing redundant clauses and rethink the preference for clauseful joins, going forward? No --- it would create an exponential growth in planning time for large join problems, while not actually buying anything in the typical case. It's possible that we could do something along the lines of inserting dummy join conditions, to allow particular join paths to be explored, without generating any clause that actually requires work at runtime. I'm not convinced this complication is needed though; at least not if the only thing it's good for is this rather specialized optimization rule. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Merlin, That said, we have been a little dismayed in the lack of comment. I think most people can't really follow what functionality this would allow driver authors to access, and what the spec for that functionality would be. I know I'm not clear on it. A high-level proposal would arouse more general interest. Otherwise, you'll just get a critique and eventually it'll either get applied or rejected without much comment. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Merlin Moncure [EMAIL PROTECTED] writes: That said, we have been a little dismayed in the lack of comment. Personally, I'm ignoring this thread, as well as the one on partitioning, because I'm up to my arse in 8.3-release alligators. I'm going to try hard not to think about any 8.4 development issues until 8.3 is actually out the door and we branch for 8.4 development. I can't say how many other people are equally pressed for time ... but seeing the minimal attention that seems to be getting paid to open 8.3 issues, it doesn't look to me like the community as a whole has a lot of spare cycles right now. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
On Jan 8, 2008 4:31 PM, Josh Berkus [EMAIL PROTECTED] wrote: Merlin, That said, we have been a little dismayed in the lack of comment. I think most people can't really follow what functionality this would allow driver authors to access, and what the spec for that functionality would be. I know I'm not clear on it. A high-level proposal would arouse more general interest. Otherwise, you'll just get a critique and eventually it'll either get applied or rejected without much comment. There are over 1500 lines of documentation attached in the patch. That's a good place to start learning (or the attached regression test, if you want to get right to it). There's a lot of material to cover. Here's a very high level overview of the functionality: Goal: The ultimate goal is to be able to put parameters for command execution and get results in a consistent way. The wire format (either text or binary) is completely abstracted. We added many functions to facilitate this, but the core functionality comes from two varargs style functions, PQgetf and PQputf, and the PGparam, which complements PGresult. Features: *) Binary transfer of all built in types is supported in both directions. Basically, we marshal backend wire format to/from C types (some native, some were introduced). We also support text results so you can pull data in a consistent interface. *) For user types (with custom send/recv functions), functions can be registered to marshal them through a type registration interface. *) Arrays and composites are supported automatically (composites have to be 'registered'). *) Client side handlers can be aliased (domains) or subclassed...type handlers can be chained together for special handling and/or conversion to exotic application types. Here is a short example which demonstrates some of the major features. There are many other examples and discussions of minutia in the documentation. 
int resfmt = 1; /* binary format */

/* Put an int4 and a text */
PGparam *param = PQparamCreate(conn);
PQputf(param, "%int4 %text", 2000, "foobar");
PQparamExec(conn, param, resfmt, "insert into foo(id, user) values ($1, $2)");

/* The above as a one liner, internally 'puts' for you */
res = PQexecParamsf(conn, resfmt, "insert into foo(id, user) values (%int4, %text)", 2000, "foobar");

int i4;
char *text;
PGresult *res = PQparamExec(conn, NULL, resfmt, "select * from foo limit 1");

/* From tuple 0, get an int4 at field 0 and a text from the user field.
 * '%' denotes by field num, '#' by field name */
PQgetf(res, 0, "%int4 #text*", 0, &i4, "user", &text);

note the above line is not wired to binary, text results would be fine as well.

/* let's get an array */
PGresult *res = PQparamExec(conn, NULL, resfmt, "select current_schemas(true)");

/* pop an array object out of the result. it creates a new result
 * with one field and one 'tuple' for each array element.
 *
 * arrays of composites return one field for each attribute of the composite. */
PGarray array;
PQgetf(res, 0, "%name[]", 0, &array);
PQclear(res);

for (i = 0; i < PQntuples(array.res); i++)
{
    char *name;
    PQgetf(array.res, i, "%name*", 0, &name);
    printf("%s\n", name);
}
PQclear(array.res);

/* return data from composite type which we create and register */
CREATE TYPE merlin as (a int, t text); -- on server

PGresult *merlin;
PQregisterTypeHandler(conn, "merlin", NULL, NULL);
res = PQparamExec(conn, NULL, resfmt, "select (1, 'abc')::merlin");
PQgetf(res, 0, "%merlin", 0, &merlin);
PQclear(res);
PQgetf(merlin, 0, "%int4 #text*", 0, &i4, "t", &text);
PQclear(merlin);

merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Proposal - libpq Type System beta-0.8a (was PGparam)
Tom Lane wrote:
> Merlin Moncure [EMAIL PROTECTED] writes:
> > That said, we have been a little dismayed in the lack of comment.
>
> Personally, I'm ignoring this thread, as well as the one on partitioning, because I'm up to my arse in 8.3-release alligators. I'm going to try hard not to think about any 8.4 development issues until 8.3 is actually out the door and we branch for 8.4 development. I can't say how many other people are equally pressed for time ... but seeing the minimal attention that seems to be getting paid to open 8.3 issues, it doesn't look to me like the community as a whole has a lot of spare cycles right now.
>
> regards, tom lane

Thanks for the heads up. We had a feeling the 8.3 crunch was a factor. We don't want to slow that down, because we are waiting for 8.3 to perform yet another mysql to postgresql migration :)

andrew
Re: [HACKERS] Index trouble with 8.3b4
Gregory Stark [EMAIL PROTECTED] writes:
> Further poking around shows that the unrecognized locktag is because lmgr.c:DescribeLockTag was never taught about virtual xids.

That's fixed, thanks for the patch.

> The pid it's waiting on is long since gone but looks like it was probably an autovacuum process. I have a vague recollection that you had rigged CREATE INDEX CONCURRENTLY to ignore vacuum processes when checking for conflicting processes.

I'm still not too clear on the underlying bug though.

regards, tom lane
Re: [HACKERS] Index trouble with 8.3b4
Gregory Stark [EMAIL PROTECTED] writes:
> The pid it's waiting on is long since gone but looks like it was probably an autovacuum process. I have a vague recollection that you had rigged CREATE INDEX CONCURRENTLY to ignore vacuum processes when checking for conflicting processes. Since any such process will be blocked on our session-level ShareUpdateExclusiveLock it will always cause a deadlock, and we would rather it just hang out and wait until our index build is finished.

OK, after reading the code some more I think I've got the point. The scenario is that autovacuum is waiting to get ShareUpdateExclusiveLock (it can't already have it, because the CREATE INDEX CONCURRENTLY does) and then one of C.I.C.'s three wait steps decides it has to wait for the autovacuum. It cannot be one of the first two, because those only block for xacts that *already have* a conflicting lock. The problem must be at the third wait step, which waits out all xacts that might conceivably be interested in recently-dead tuples that are not in the index.

Now an unindexed dead tuple is not a problem from vacuum's point of view, nor does ANALYZE care, so AFAICS there is no need for this step to wait for autovacuum processes --- nor indeed for manual vacuums. So we can avoid the deadlock if we just exclude those processes from the list of ones to wait for. I suggest we extend GetCurrentVirtualXIDs() with an additional parameter includeVacuums, and have it skip vacuum procs if that's set. (Hmm, maybe a more flexible approach is to make the parameter a bitmask, and ignore any procs for which param & vacuumFlags is not zero.) Comments?

regards, tom lane
Re: [HACKERS] Index trouble with 8.3b4
Tom Lane [EMAIL PROTECTED] writes:
> It cannot be one of the first two, because those only block for xacts that *already have* a conflicting lock. The problem must be at the third wait step, which waits out all xacts that might conceivably be interested in recently-dead tuples that are not in the index.

Ah, I had missed that point.

> Now an unindexed dead tuple is not a problem from vacuum's point of view, nor does ANALYZE care, so AFAICS there is no need for this step to wait for autovacuum processes --- nor indeed for manual vacuums. So we can avoid the deadlock if we just exclude those processes from the list of ones to wait for.

That's what I had in mind.

> I suggest we extend GetCurrentVirtualXIDs() with an additional parameter includeVacuums, and have it skip vacuum procs if that's set. (Hmm, maybe a more flexible approach is to make the parameter a bitmask, and ignore any procs for which param & vacuumFlags is not zero.) Comments?

Only that the restrictions on what VACUUM is allowed to do seem to be piling up. We may have to write up a separate document explaining what specialized set of rules VACUUM operates under.

Also, ANALYZE was included in the latest security changes. Is there some way that ANALYZE could trigger some user-defined function being invoked which could in turn run some SQL using this index? I suppose a very strange expression index where the expression involved a recursive SQL query back to the same table (presumably being careful to avoid an infinite loop) could be possible.

I am hoping our other things which ignore VACUUM, such as the globalxmin calculation, are careful not to ignore VACUUM ANALYZE processes?

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
[HACKERS] Problem with CVS HEAD's handling of mergejoins
So I adjusted the patch I was working on as suggested here http://archives.postgresql.org/pgsql-hackers/2008-01/msg00251.php and things started blowing up all over the place --- Assert failures, "too few pathkeys for mergeclauses" errors, etc :-(

On investigation, the problem seems to be a bit of brain-fade on my part. The planner uses PathKeys lists both to represent what's known about the output sort order of a Path, and to represent per-merge-clause ordering data for a merge join. A PathKeys list that represents sort ordering ought to be canonical, meaning it contains no redundant keys --- a key might be redundant because it is the same as some prior key in the list (ORDER BY x,x) or because it is known equal to a constant and thus uninteresting for sorting (WHERE x = 1 ... ORDER BY x). However, there are several places in the planner that expect the pathkeys for a merge join to be one-for-one with the selected merge clauses.

I'm not sure why we didn't come across test cases exposing this problem earlier in beta. A partial explanation is that the equal-to-a-constant case was unlikely to arise before, because given WHERE x = y AND x = 1 the code up to now would get rid of x = y altogether and then never try for a mergejoin; my patch to eliminate that behavior was what exposed the problem. But there are other ways to get redundant keys in a list of candidate mergejoin clauses.

One possibility seems to be to keep a list of raw (not canonical) pathkeys for each side of a list of proposed mergejoin clauses, which is one-to-one with the clauses, with the clear understanding that the canonical list that describes the path's sort ordering might be just a subset of this list. This would mean a couple of new fields in MergePath structs, but fortunately no on-disk format changes since MergePaths never get to disk. A perhaps less invasive idea is to discard any proposed mergeclauses that are redundant in this sense.
This would still require some reshuffling of responsibility between select_mergejoin_clauses and the code in pathkeys.c, since right now select_mergejoin_clauses takes no account of that. However, I'm worried that that might result in planner failure on some FULL JOIN cases that work today, since we require all the join clauses to be mergejoinable for a FULL JOIN. People seem to complain when the planner fails, even for really stupid queries ;-). I think this would only be acceptable if we can prove that ignoring clauses that are redundant in this sense doesn't change the result --- which might be the case, but I'm not sure. I think I can fix this in a day or so, but I now definitely feel that we'll need an RC2 :-( regards, tom lane
Re: [HACKERS] Problem with CVS HEAD's handling of mergejoins
Tom Lane wrote:
> I think I can fix this in a day or so, but I now definitely feel that we'll need an RC2 :-(

Understood. :-|

-- Bruce Momjian [EMAIL PROTECTED]  http://momjian.us  EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Named vs Unnamed Partitions
Markus Schiltknecht [EMAIL PROTECTED] writes:
> There are two very distinct ways to handle partitioning. For now, I'm calling them named and unnamed partitioning.

I had most of a draft email written which I seem to have lost in a reboot. To a large degree I was on the same line of thought as you. The whole point of partitioning is to give the DBA a short-hand to allow him or her to describe certain properties of the data to the database.

The named approach is to let the DBA create objects which can then have various properties attached to them. So you can create a bucket for each month or for each financial account or whatever. Then you can attach properties to the buckets such as what tablespace to store them in, or whether to treat them as read-only or offline. The naming is precisely the useful part, in that it is how the DBA associates the properties with chunks of data. Without naming the DBA would have to specify the same ranges every time he wants to change the properties. He might do a SET read_only WHERE created_on < '2000-01-01' one day, then another SET tablespace tsslow WHERE created_on < '2000-01-01', and then later again do SET offline WHERE created_on < '2000-01-01'.

I have to admit I always found it kludgy to have objects named invoices_2000_JAN and invoices_2000_FEB and so on. It's kind of a meta-denormalization. But so is specifying where clauses repeatedly. If you don't have a first-class object which you can refer to to attach properties to, and instead are forced to redefine it repeatedly for each use, then there's nothing stopping you from creating overlapping or even conflicting sets of properties. What's the database to do if you tell it something like:

ALTER TABLE foo SET tablespace tsslow WHERE created_on < '2000-01-01'
ALTER TABLE foo SET tablespace tsfast WHERE updated_on > '2006-01-01'

Maybe you know that no record older than 2000 will be updated now but the database doesn't.
As Markus describes, the behaviour *before* you've attached any particular properties to a partition is interesting too. A big benefit of partitioning is being able to load whole partitions or drop whole partitions of data which were not in any way special prior to needing to be archived. Effectively the named objects are the DBA's way of telling the database this chunk of data here, keep it all in one place because I'll be doing something en masse to it (such as dropping it) at some later date.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] Index trouble with 8.3b4
Gregory Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
> > Now an unindexed dead tuple is not a problem from vacuum's point of view, nor does ANALYZE care, so AFAICS there is no need for this step to wait for autovacuum processes --- nor indeed for manual vacuums.
>
> Also, ANALYZE was included in the latest security changes. Is there some way that ANALYZE could trigger some user-defined function being invoked which could in turn run some SQL using this index?

Hmm. ANALYZE itself doesn't look into the indexes, but it does invoke user-defined functions that could nominally run queries. However, a function in an index that runs a query that examines its own table seems implausible, and very unlikely to work right anyway. You could hardly expect such a function to be really immutable -- consider for example that it would be unlikely to deliver the same results during CREATE INDEX on an already-filled table that it would if the rows were being inserted with the index already existing. So I'm not really worried about that scenario.

regards, tom lane
Re: [HACKERS] Index trouble with 8.3b4
Gregory Stark wrote:
> I am hoping our other things which ignore VACUUM such as the globalxmin calculation are careful not to ignore VACUUM ANALYZE processes?

It doesn't matter -- the ANALYZE is done in a separate transaction (so the VACUUM part is ignored, the ANALYZE part is not).

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] VACUUM FULL out of memory
On Tue, Jan 08, 2008 at 05:53:28PM +, Sam Mason wrote:
> What about a stored procedure in a language that allows you to do system(3) calls? PL/bash? (I think there is something like this).

But surely the ulimit before start is much easier!

A
[HACKERS] Some notes about the index-functions security vulnerability
Now that the dust has settled, I want to post some notes about CVE-2007-6600, which is to my mind the most important of the five security problems fixed in our recent security updates. There are some unfinished issues here. Itagaki Takahiro originally identified the issue. The crux of it is that VACUUM FULL and ANALYZE need to execute functions in index definitions (both expression index columns and partial index predicates). Up to now this has just happened without any special steps being taken, which means that such functions were executed with the privileges of whoever is doing VACUUM/ANALYZE, who is very likely to be a superuser. Now CREATE INDEX requires such functions to be marked IMMUTABLE, which makes them unable to write anything, so the damage is seemingly limited; but it's easy to get around that. Hence, a nefarious user need only put some trojan-horse code into a PL-language function, use the function in an index on one of his tables, and wait for the next routine vacuuming in order to get his code executed as superuser. There are a whole bunch of related scenarios involving trojan-horse code in triggers, view definitions, etc. pgsql-core wasted quite a lot of time (months, actually :-() trying to devise an all-encompassing solution for all of them. However, those other scenarios have been publicly known for years, and haven't seemed to cause a lot of problems in practice, in part because it requires intentional use of a table or view in order to expose yourself to subversion. The index function attack is more nasty than these because it can subvert a superuser during required routine maintenance (including autovacuum). Moreover we couldn't find any way to deal with these other issues that doesn't involve nontrivial semantic incompatibilities, which wouldn't be suitable for back-patching. 
So the decision was to deal with only the index function problem as a security exercise, and after that try to get people to think some more about plugging those other holes in a future release. Takahiro-san's initial suggestion for fixing this was to try to make the marking of a function as IMMUTABLE into an air-tight guarantee that it couldn't modify the database. Right now it is not air-tight for a number of reasons: you can alter the volatility marking of a function after-the-fact, you can call a volatile function from an immutable one, etc. I originally argued against this fix on the grounds that making a planner hint into a security classification was a bad idea, since people routinely want to lie to the planner, and often have good reasons for it. But there is a better argument: even if you guarantee that a function can't write the database, it'll still be able to read the database and thereby read data the user shouldn't be able to get at. At that point you are reduced to hoping that the user cannot think of any covert channel by which to transmit the interesting info; and there are *always* covert channels, eg timing or CPU usage. We'd have to try to restrict IMMUTABLE functions so that they could not read the DB either, which seems impractical, as well as likely to break a lot of existing applications. So the direction we've pursued instead is to arrange for index expressions to be evaluated as if they were being executed by the table owner, that is, there's an implicit SECURITY DEFINER property attached to them. Up to now I think we've always thought of SECURITY DEFINER functions as being a mechanism for increasing one's privilege level. However, in this context we want to use them as a mechanism for *decreasing* privilege level, and if we want to use them that way then the privilege loss has to be air-tight. 
The problem there is that so far it's been possible for a SECURITY DEFINER function to execute SET SESSION AUTHORIZATION or SET ROLE and thereby regain whatever privileges are held at the outermost level. The patch as applied disallows both these operations inside a security-definer context. One reason for doing this restrictive fix is that GUC currently isn't being told about fmgr_security_definer's manipulations of CurrentUserId. There was actually a separate bug here: if you did SET ROLE inside a sec-def function and then exited without any error, SHOW ROLE continued to report the SET value as the current role, even though in reality the session had reverted to the previous CurrentUserId. Worse yet, a subsequent ABORT could cause GUC's idea of the setting to become the reality. The thinking among core was that we'd be happy with leaving SET SESSION AUTHORIZATION disabled forever, but it would be nice to allow SET ROLE, with the modified semantics that the set of accessible roles would be determined by the innermost security-definer function's owner, rather than the session authorization; and that the effects of SET ROLE would roll back at function exit. To implement that we'd need to redo the interface between GUC and miscinit.c's tracking of privilege state, but
Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps
On Wed, Jan 02, 2008 at 05:56:14PM +, Simon Riggs wrote:
> This technique would be useful for any table with historical data keyed by date or timestamp. It would also be useful for data where a time-of-insert component is implicit, such as many major entity tables where the object ids are assigned by a sequence. e.g. an Orders table with an OrderId as PK. Once all orders placed in a period have been shipped/resolved/closed then the segments will be marked read-only.
>
> It's not really going to change the code path much for small tables, yet seems likely to work reasonably well for large tables of all shapes and sizes. If a segment is being updated, we leave it alone, and maybe never actually set the visibility map at all. So overall, this idea seems to cover the main use case well, yet with only minimal impact on the existing use cases we support. As before, I will maintain this proposal on the PG developer Wiki, once we get down to detailed design. Like it?

Simon,

A novel approach to the problem. For me, this proposal addresses some of the other problems in postgres (visibility in the heap vs. index) rather than the problems with partitioning itself. It might seem otherwise, but for me partitioning is a tool for managing large volumes of data. It allows the user to encode what they know about the nature of their data, and that's often about management. In this way, your proposal actually makes partitioning too smart: the user wants to tell the system how to organise the data, as opposed to the other way around.

At Greenplum, we've been discussing this in depth. Interestingly, we also felt that the storage layer could be much smarter with large tables with predictable layouts and predictable data patterns. But the thing is, people with large tables like partitioning, putting different partitions on different storage; they like the ability to merge and split partitions; they like truncating old partitions.
In a word, they seem to like the manageability partitions give them -- as well as the performance that comes with this. To this end, we (well, Jeff Cohen) looked at the syntax and semantics of partitioning in leading databases (Oracle, Informix, DB2) and came up with a highly expressive grammar which takes the best of each, I think (I'll post details on the grammar in a separate thread). The idea is that range (for example, a date range), list (a list of distinct values) and hash partitioning be supported on multiple columns. Partitions can be added, merged, truncated. Partitions can reside on different tablespaces. The existing issues with the rewriter, COPY support and the like go away by smartening up the backend.

To explore the grammar and semantics, Jeff and I (to a small extent) have implemented the parsing of such a grammar in the backend. At the moment, this just results in the generation of a bunch of rules (i.e., it produces the current behaviour, as if someone had written rules themselves) and debugging output. The fully fledged approach will see partitioning rules stored in a new system catalog which can be interrogated by the planner or COPY to determine which partition a given tuple belongs in or which partition(s) a WHERE clause matches. Yes, this is the traditional approach but I think it still has merit. We shall continue working on this approach because it is what our customers have asked for. We would also like to see it get into postgres too.

Thanks, Gavin