Re: [HACKERS] windows shared memory error

2009-05-05 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
 Passes my tests, but I can't really reproduce the requirement to retry,
 so I haven't been able to test that part :(
 
 The patch looks sane to me.  If you want to test, perhaps reducing the
 sleep to 1 msec or so would reproduce the need to go around the loop
 more than once.  (Don't forget to put the machine under additional
 load, too.)

I've applied this to HEAD and 8.3 so we can get some buildfarm testing
on it as well.

Andrew, any chance you can get 8.3-tip tested with your client? Or at
least in your own reproducable-environment?

I didn't backpatch to 8.2, because the code is completely different
there. We should probably consider doing it once we know if this fixes
the actual issue, but I don't want to spend the effort on backporting it
until we know it works.


//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not reattach to shared memory captured in buildfarm

2009-05-05 Thread Magnus Hagander
Alvaro Herrera wrote:
 Magnus Hagander wrote:
 
 I didn't mean race condition between backends. I meant against a
 potential other thread started by a loaded DLL for initialization.
 (Again, things like antivirus are known to do this, and we do see these
 issues more often if AV is present for example)
 
 I don't understand this.  How can memory allocated by a completely separate
 process affect what happens to a backend?  I mean, if an antivirus is running,
 surely it does not run on the backend's process?  Or does it?

Anti[something] software regularly injects code into other processes,
yes. Either by creating a thread in the process using
CreateRemoteThread() or by using techniques similar to LD_PRELOAD.

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Chris Browne
robertmh...@gmail.com (Robert Haas) writes:
 On Mon, May 4, 2009 at 10:10 AM, Andres Freund and...@anarazel.de wrote:
 Would a patch adding 'IF EXISTS' support to:
 - ALTER TABLE ... DROP COLUMN
 - ALTER TABLE ... DROP CONSTRAINT
 possibly be accepted?

 Having it makes the annoying task of writing/testing of schema-upgrade
 scripts a bit easier.

 Can't speak for the committers, but I've wished for this a time or two myself.

For constraints, it's easy enough to treat that as idempotent; it's no
big deal to drop and re-add a constraint.

For columns, I'd *much* more frequently be interested in
   ALTER TABLE ... ADD COLUMN IF NOT EXISTS ...

Note that this is distinctly NOT the same as:
   ALTER TABLE ... DROP COLUMN IF EXISTS ...
   ALTER TABLE ... ADD COLUMN ...
-- 
(format nil ~...@~s cbbrowne linuxdatabases.info)
http://linuxdatabases.info/info/lisp.html
Signs of a Klingon Programmer -  10. A TRUE  Klingon Warrior does not
comment his code!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bytea vs. pg_dump

2009-05-05 Thread Bernd Helmle


From time to time we had complains about slow dump of large tables with 
bytea columns, people often complaining about a) size and b) duration of 
the dump.


That latter occurred recently to me, a customer would like to dump large 
tables (approx. 12G in size) with pg_dump, but he was annoyed about the 
performance. Using COPY BINARY reduced the time (unsurprisingly) to a 
fraction (from 12 minutes to 3 minutes).


As discussed in the past[1], we didn't implement pg_dump to support BINARY 
to preserve portability and version independence of dumps using pg_dump. I 
would like to bring that topic up again, since implementing an option like 
--binary-copy seems interesting in use cases, where portability and version 
issues doesn't matter and someone wants to have a fast COPY of his 
documents . This would make this task much easier, especially in the 
described case, where the customer has to dump referenced tables as well.


Another approach would be to just dump bytea columns in binary format only 
(not sure how doable that is, though).


Opinions, again?


[1] http://archives.postgresql.org//pgsql-hackers/2007-12/msg00139.php
--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Values of fields in Rules

2009-05-05 Thread mito

Hi,

when i create rule(on insert) on view(select id, name from users), i 
will recieve values that were inserted in form of reference words 
new.id, new.name.


if i insert into users (id, name) values (null, null);
then new.id = null and new.name = null

if i insert into users (name) values (null);
then new.id = null and new.name = null

is there any way how to distinguish that id column wasnt explicitly named?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Peter Eisentraut
On Monday 04 May 2009 23:11:22 Archana Sundararam wrote:
 I have many views dependent on a table. So whenever I do alter table and
 change the column type I have to drop and recreate all the views. Is there
 any other easy way to propagate the changes in the table to the views. Any
 suggestion is welcome.

Consider this example:

CREATE TABLE tab1 (
a int,
b text
);

CREATE VIEW view1 AS SELECT a, foo(b) FROM tab1;

ALTER TABLE tab1 ALTER COLUMN b TYPE inet;

Now what do expect should become of the view?

CREATE VIEW view1 AS SELECT a, foo(b) FROM tab1;  -- now using foo(inet)

or

CREATE VIEW view1 AS SELECT a, foo(b::text) FROM tab1; -- still using 
foo(text)

(This becomes more entertaining if you specified a conversion function (USING) 
for the type change.)

And this could then also change the return type of foo(), thus changing the 
row type of the view and would thus propogate up to other views.  And so if 
you have many views, as you say, this could become a great mess.  You could 
probably define and implement a solution, but it would be very confusing and 
risky to use.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Peter Eisentraut
On Monday 04 May 2009 22:21:10 Chris Browne wrote:
 For constraints, it's easy enough to treat that as idempotent; it's no
 big deal to drop and re-add a constraint.

Not if the constraint is a primary key, for example.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Andrew Dunstan



Chris Browne wrote:

For columns, I'd *much* more frequently be interested in
   ALTER TABLE ... ADD COLUMN IF NOT EXISTS ...


  


We have debated CREATE ... IF NOT EXISTS in the past, and there is no 
consensus on what it should do, so we don't have it for any command. 
That is quite a different case from what's being asked for, and the two 
should not be conflated.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Andres Freund

Hi Chris,

On 05/04/2009 09:21 PM, Chris Browne wrote:

robertmh...@gmail.com (Robert Haas) writes:

On Mon, May 4, 2009 at 10:10 AM, Andres Freundand...@anarazel.de  wrote:

Would a patch adding 'IF EXISTS' support to:
- ALTER TABLE ... DROP COLUMN
- ALTER TABLE ... DROP CONSTRAINT
possibly be accepted?

Can't speak for the committers, but I've wished for this a time or two myself.

For constraints, it's easy enough to treat that as idempotent; it's no
big deal to drop and re-add a constraint.

For columns, I'd *much* more frequently be interested in
ALTER TABLE ... ADD COLUMN IF NOT EXISTS ...

Note that this is distinctly NOT the same as:
ALTER TABLE ... DROP COLUMN IF EXISTS ...
ALTER TABLE ... ADD COLUMN ...
Yes, I would like to have that myself - but this seems to open a way 
much bigger can of worms.
Also the problem solved by both suggestions are not completely congruent 
- so I thought better tackle the easier one first before starting a long 
and arduous discussion...


Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Tom Lane
Bernd Helmle maili...@oopsware.de writes:
 From time to time we had complains about slow dump of large tables with 
 bytea columns, people often complaining about a) size and b) duration of 
 the dump.

 That latter occurred recently to me, a customer would like to dump large 
 tables (approx. 12G in size) with pg_dump, but he was annoyed about the 
 performance. Using COPY BINARY reduced the time (unsurprisingly) to a 
 fraction (from 12 minutes to 3 minutes).

Seems like the right response might be some micro-optimization effort on
byteaout.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Kevin Grittner
Bernd Helmle maili...@oopsware.de wrote:
 
 Another approach would be to just dump bytea columns in binary
 format only (not sure how doable that is, though).
 
If that's not doable, perhaps a base64 option for bytea COPY?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 And this could then also change the return type of foo(), thus changing the 
 row type of the view and would thus propogate up to other views.  And so if 
 you have many views, as you say, this could become a great mess.  You could
 probably define and implement a solution, but it would be very confusing and 
 risky to use.

The SQL committee has also historically chosen to punt on such things.
Note the long-established rule that * is expanded at view definition
time (so adding columns doesn't change views).  I also see a flat
prohibition on *any* view reference in the newly added SET DATA TYPE
command (SQL:2008 11.17 alter column data type clause):

7) C shall not be referenced in the query expression of any view descriptor. 

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:

 Bernd Helmle maili...@oopsware.de writes:
 That latter occurred recently to me, a customer would like to dump large 
 tables (approx. 12G in size) with pg_dump, but he was annoyed about the 
 performance. Using COPY BINARY reduced the time (unsurprisingly) to a 
 fraction (from 12 minutes to 3 minutes).

 Seems like the right response might be some micro-optimization effort on
 byteaout.

Still, apart from lack of interest from developpers and/or resources, is
there some reason we don't have a pg_dump --binary option?

DBA would have to make sure his exports are usable, but when the routine
pg_dump backup is mainly there to be able to restore on the same machine
in case of unwanted event (DELETE bug, malicious TRUNCATE, you name it),
having a faster dump/restore even if local only would be of interest.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Still, apart from lack of interest from developpers and/or resources, is
 there some reason we don't have a pg_dump --binary option?

It seems rather antithetical to one of the main goals of pg_dump,
which is to provide a dump that can reliably be loaded onto other
machines or newer versions of Postgres.  I don't think that we
should provide such a foot-gun in hopes of getting relatively
minor performance improvements; especially when we have not
exhausted the alternatives.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:

 It seems rather antithetical to one of the main goals of pg_dump,
 which is to provide a dump that can reliably be loaded onto other
 machines or newer versions of Postgres.

You're calling for a pg_export/pg_import tool suite, or I have to learn
to read again :)

 I don't think that we should provide such a foot-gun in hopes of
 getting relatively minor performance improvements; especially when we
 have not exhausted the alternatives.

If you think improvements will be minor while alternatives are
promising, of course, I'm gonna take your word for it.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Bernd Helmle maili...@oopsware.de wrote:
 Another approach would be to just dump bytea columns in binary
 format only (not sure how doable that is, though).
 
 If that's not doable, perhaps a base64 option for bytea COPY?

I'm thinking plain old pairs-of-hex-digits might be the best
tradeoff if conversion speed is the criterion.  The main problem
in any case would be to decide how to control the format option.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Alvaro Herrera
Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  Bernd Helmle maili...@oopsware.de wrote:
  Another approach would be to just dump bytea columns in binary
  format only (not sure how doable that is, though).
  
  If that's not doable, perhaps a base64 option for bytea COPY?
 
 I'm thinking plain old pairs-of-hex-digits might be the best
 tradeoff if conversion speed is the criterion.  The main problem
 in any case would be to decide how to control the format option.

It would be great if COPY FROM could read some fields as binary while
the rest is text.  That would allow us to do something like

--bytea-column-format=binary
--bytea-column-format=hexpair
--bytea-column-format=text

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST index changes

2009-05-05 Thread Bruce Momjian
Bruce Momjian wrote:
 bruce wrote:
  Has the on-disk format changed for GiST indexes?  I know it has for hash
  and GIN indexes.
 
 Sorry, I should have clarified: did the GiST index on-disk format change
 between Postgres 8.3 and 8.4.

Would someone please answer my question:  Did the GiST index on-disk
format change between Postgres 8.3 and 8.4?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 8:56 AM, Andrew Dunstan and...@dunslane.net wrote:
 We have debated CREATE ... IF NOT EXISTS in the past, and there is no
 consensus on what it should do, so we don't have it for any command. That is
 quite a different case from what's being asked for, and the two should not
 be conflated.

I must be missing something, because the semantics of CREATE ... IF
NOT EXISTS seem pretty well-defined to me, at least for any object
that has a name.  Check whether that name is in use; if not, create
the object per the specified definition.  Now for something like ALTER
TABLE ... ADD FOREIGN KEY I can see that there could be a problem.
That having been said, it's certain that CREATE IF NOT EXISTS is a
bigger foot-gun than DROP IF EXISTS, because after a succesful DROP IF
EXISTS the state of the object is known, whereas after CREATE IF NOT
EXISTS, it isn't (yes, it exists, but the definitions might not
match).  Still, that seems no reason not to implement it.  Right now,
I have a complex set of scripts which track the state of the database
to determine which DDL statements have already been applied.
Something like this would potentially simplify those scripts quite a
bit, so I'm much in favor.

On the other hand, I also agree with the point already made that these
are two different features, and we may as well focus on one at a time.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] any idea why http://www.postgresql.org/community/survey.61 reverts to old values after a while ?

2009-05-05 Thread Hannu Krosing
Hi

I voted for (or registered my use of) Londiste on
http://www.postgresql.org/community/ User survey and the results page
( http://www.postgresql.org/community/survey.61 ) showed 11 for Londiste
after that. 

But looking back after an hour, it was Londiste 9 again

Are you perhaps missing a COMMIT; somewhere ? ;)

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] any idea why http://www.postgresql.org/community/survey.61 reverts to old values after a while ?

2009-05-05 Thread Alvaro Herrera
Hannu Krosing wrote:
 Hi
 
 I voted for (or registered my use of) Londiste on
 http://www.postgresql.org/community/ User survey and the results page
 ( http://www.postgresql.org/community/survey.61 ) showed 11 for Londiste
 after that. 
 
 But looking back after an hour, it was Londiste 9 again
 
 Are you perhaps missing a COMMIT; somewhere ? ;)

You're probably looking at a mirror that did not have the updated
results yet.  It says 11 now on the mirror I get.

The master copy that the mirrors read from is
http://wwwmaster.postgresql.org/community/survey.61
(it says 12 there)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] any idea why http://www.postgresql.org/community/survey.61 reverts to old values after a while ?

2009-05-05 Thread Dave Page
On Tue, May 5, 2009 at 3:57 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Hi

 I voted for (or registered my use of) Londiste on
 http://www.postgresql.org/community/ User survey and the results page
 ( http://www.postgresql.org/community/survey.61 ) showed 11 for Londiste
 after that.

 But looking back after an hour, it was Londiste 9 again

 Are you perhaps missing a COMMIT; somewhere ? ;)

No - you're looking at one of the static mirror sites. They'll catch
up with the master site periodically.

http://wwwmaster.postgresql.org/community/survey.61 is the master.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Bernd Helmle
--On Dienstag, Mai 05, 2009 10:00:37 -0400 Tom Lane t...@sss.pgh.pa.us 
wrote:




Seems like the right response might be some micro-optimization effort on
byteaout.


Hmm looking into profiler statistics seems to second your suspicion:

Normal COPY shows:

 %   cumulative   self  self total
time   seconds   secondscalls   s/call   s/call  name
31.29 81.3881.38   134487 0.00 0.00  CopyOneRowTo
22.88140.8959.51   134487 0.00 0.00  byteaout
13.44175.8434.95 3052797224 0.00 0.00 
appendBinaryStringInfo

12.10207.3231.48 3052990837 0.00 0.00  CopySendChar
 8.45229.3121.99 3052797226 0.00 0.00  enlargeStringInfo
 3.90239.4510.1455500 0.00 0.00  pglz_decompress
 3.28247.97 8.523 2.84 2.84  appendStringInfoChar
 1.82252.71 4.74   134489 0.00 0.00  resetStringInfo
 1.72257.18 4.47 copy_dest_destroy
 0.27257.89 0.71  5544679 0.00 0.00 
hash_search_with_hash_value

 0.09258.13 0.24 13205044 0.00 0.00  LWLockAcquire
 0.08258.35 0.22 13205044 0.00 0.00  LWLockRelease

COPY BINARY generates:

time   seconds   secondscalls   s/call   s/call  name
73.70  9.05 9.0555500 0.00 0.00  pglz_decompress
 6.03  9.79 0.74  5544679 0.00 0.00 
hash_search_with_hash_value

 2.93 10.15 0.36 13205362 0.00 0.00  LWLockAcquire
 1.87 10.38 0.23 13205362 0.00 0.00  LWLockRelease

This is PostgreSQL 8.3.7 btw.

--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 5, 2009 at 8:56 AM, Andrew Dunstan and...@dunslane.net wrote:
 We have debated CREATE ... IF NOT EXISTS in the past, and there is no
 consensus on what it should do, so we don't have it for any command. That is
 quite a different case from what's being asked for, and the two should not
 be conflated.

 I must be missing something, because the semantics of CREATE ... IF
 NOT EXISTS seem pretty well-defined to me,

Please go read the prior threads (I think searching for CINE might
help, because we pretty shortly started abbreviating it like that).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] conditional dropping of columns/constraints

2009-05-05 Thread Andrew Dunstan



Robert Haas wrote:

On Tue, May 5, 2009 at 8:56 AM, Andrew Dunstan and...@dunslane.net wrote:
  

We have debated CREATE ... IF NOT EXISTS in the past, and there is no
consensus on what it should do, so we don't have it for any command. That is
quite a different case from what's being asked for, and the two should not
be conflated.



I must be missing something, because the semantics of CREATE ... IF
NOT EXISTS seem pretty well-defined to me, at least for any object
that has a name.  Check whether that name is in use; if not, create
the object per the specified definition.  
  


And if it does exist but the definitions don't match? That's the issue 
on which there has not been consensus. You apparently thing the command 
should silently do nothing, but that's not what everyone thinks. (I have 
no very strong feelings on the subject - I'm just explaining the issue.)


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prepared transactions vs novice DBAs, again

2009-05-05 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 The pgsql-admin list has just seen another instance where careless use
 of prepared transactions brought down a database, and the DBA (who had
 no idea what a prepared transaction even was) had no idea how to fix it.

Just as a followup (and I already posted this on the pgsql-admin thread),
the check_postgres script now has a specific check for this very case.
It simply checks the age of entries in pg_prepared_xacts and gives
a warning if the number is at or over the given threshhold (defaults
to 1 second). I'm still a heavy +1 on making the default Postgres
configuration value 0, but hopefully this will help.

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200905051128
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkoAW0wACgkQvJuQZxSWSsgGRgCePjErqeAPEv4MLJzgEnh/tXtA
yLEAoPhBNvaWvcmTF9D8faZzI044zpBL
=ouXW
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prepared transactions vs novice DBAs, again

2009-05-05 Thread Tom Lane
Greg Sabino Mullane g...@turnstep.com writes:
 Just as a followup (and I already posted this on the pgsql-admin thread),
 the check_postgres script now has a specific check for this very case.
 It simply checks the age of entries in pg_prepared_xacts and gives
 a warning if the number is at or over the given threshhold (defaults
 to 1 second). I'm still a heavy +1 on making the default Postgres
 configuration value 0, but hopefully this will help.

Hmm, 1 second seems kinda tight --- it would not surprise me to have
valid situations where it takes over a second for an XA manager to
collect all the responses and decide to commit.  If you set it at a
minute or an hour you'd have very much less chance of false positives,
and not really give up much that I can see.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prepared transactions vs novice DBAs, again

2009-05-05 Thread Bernd Helmle
--On Dienstag, Mai 05, 2009 15:29:15 + Greg Sabino Mullane 
g...@turnstep.com wrote:



It simply checks the age of entries in pg_prepared_xacts and gives
a warning if the number is at or over the given threshhold (defaults
to 1 second). I'm still a heavy +1 on making the default Postgres
configuration value 0, but hopefully this will help.


1 seconds seems a very low default for me. I can imagine that most 
distributed transactions are taking longer than this to complete.


--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Serializable Isolation without blocking

2009-05-05 Thread Kevin Grittner
While discussing potential changes to PostgreSQL documentation of
transaction isolation levels, Emmanuel Cecchet pointed out an
intriguing new paper[1] on a new algorithm to provide true
serializable behavior in a MVCC based database, with no additional
blocking; although, there being no such things as a free lunch, there
is an increase in serialization failures under contention.  I have
been hesitant to raise the issue while everyone was busy trying to
wrap up release 8.4; but since that is now in beta testing and PGCon
is fast approaching, I wanted to put the issue out there so that
people have a chance to review it and discuss it.
 
Michael Cahill has given me permission to quote from the paper.  He
has also provided a URL to his personal version of the work[2], which
people may directly access for their personal use, although
redistribution is prohibited by the ACM copyright.  He has asked to be
copied on the discussion here.
 
I know that some here have questioned why anyone would want
serializable transactions.  Our environment has 21 programmers, 21
business process analysts, and four DBAs.  A major part of the time
for this staff is enhancement of existing software and development of
new software.  We have many distinct databases, the largest of which
has a schema of over 300 tables.  (That's well normalized and not
partitioned -- the structure of the data really is that complex.)  We
have over 8,700 queries against these various databases, including
OLTP, reporting, statistics, public access, web queries, etc.  If one
were to go through the well-know techniques to identify all possible
interactions between these queries against these tables, it would not
only be a massive undertaking, the results would immediately be
obsolete.
 
The nice thing about serializable transactions is that if you can show
that a transaction does the right thing when run by itself, you
automatically know that it will function correctly when run in any
mix, or it will fail with a serializable error and can be safely
retried.  Our framework is designed so that serialization errors are
automatically detected and the transaction is retried without any user
interaction or application programming needed -- a serialization
failure appears to the application code and the user the same as
simple blocking.
 
Quoting the paper's abstract:
 
Many popular database management systems offer snapshot isolation
rather than full serializability. There are well known anomalies
permitted by snapshot isolation that can lead to violations of data
consistency by interleaving transactions that individually maintain
consistency. Until now, the only way to prevent these anomalies was to
modify the applications by introducing artificial locking or update
conflicts, following careful analysis of conflicts between all pairs
of transactions.
 
This paper describes a modification to the concurrency control
algorithm of a database management system that automatically detects
and prevents snapshot isolation anomalies at runtime for arbitrary
applications, thus providing serializable isolation. The new algorithm
preserves the properties that make snapshot isolation attractive,
including that readers do not block writers and vice versa. An
implementation and performance study of the algorithm are described,
showing that the throughput approaches that of snapshot isolation in
most cases.
 
Quoting a portion of the conclusions:
 
A prototype of the algorithm has been implemented in Oracle Berkeley
DB and shown to perform significantly better that two-phase locking in
a variety of cases, and often comparably with snapshot isolation.
 
One property of Berkeley DB that simplified our implementation was
working with page level locking and versioning. In databases that
version and lock at row-level granularity (or finer), additional
effort would be required to avoid phantoms, analogous to standard two
phase locking approaches such as multigranularity locking.
 
Quoting a snippet from the implementation section:
 
Making these changes to Berkeley DB involved only modest changes to
the source code. In total, only 692 lines of code (LOC) were modified
out of a total of over 200,000 lines of code in Berkeley DB.
 
Michael J. Cahill has since implemented these techniques in InnoDB as
part of his PhD work.  While Microsoft SQL Server does provide full
serializability in an MVCC implementation, I believe they do it with
blocking rather than this newer and faster technique.
 
The paper is a very interesting read, and I fear that if we don't
pursue these techniques, InnoDB users will have legitimate bragging
rights over PostgreSQL users in an important area.
 
Oh, and I know that these issues are well known, and I know that the
solution involves predicate locks; although these won't necessarily be
locks which cause blocking.
 
-Kevin
 
[1] Michael J. Cahill, Uwe Röhm, Alan D. Fekete. Serializable
Isolation for Snapshot Databases. In the Proceedings of the 2008 ACM
SIGMOD 

Re: [HACKERS] Prepared transactions vs novice DBAs, again

2009-05-05 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 1 seconds seems a very low default for me. I can imagine that most
 distributed transactions are taking longer than this to complete.

One second means it is set by default to catch *all* prepared
transactions. It's simply checking how long the transaction has
been open via:

SELECT database, ROUND(EXTRACT(epoch FROM now()-prepared)), prepared
FROM pg_prepared_xacts ORDER BY prepared ASC;

If you *are* using prepared transactions (which most people are not),
you would want to set a specific number for your environment -
and certainly more than 1 second (perhaps 5 minutes?) At that point,
the check changes from has anybody mistakenly created a prepared
transaction to has one of our prepared transactions been open
too long?

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200905051154
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkoAYaEACgkQvJuQZxSWSsgYqACgvQOPJKMpDAIdSuGIGjvqrkxO
XA8AoKraljUOgV7JrFlv2dJR/T/IJ1iv
=QMWI
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] windows doesn't notice backend death

2009-05-05 Thread Tom Lane
I wrote:
 Attached is a proposed patch for the dead man switch idea.
 ...
 Barring objections I'll go ahead and apply this to HEAD.  I'm wondering
 whether we are sufficiently worried about the Windows task manager issue
 to risk back-patching into 8.3 and 8.2 ... comments?

For lack of response, I assume no one wants to back-patch this.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] windows doesn't notice backend death

2009-05-05 Thread Andrew Dunstan



Tom Lane wrote:

I wrote:
  

Attached is a proposed patch for the dead man switch idea.
...
Barring objections I'll go ahead and apply this to HEAD.  I'm wondering
whether we are sufficiently worried about the Windows task manager issue
to risk back-patching into 8.3 and 8.2 ... comments?



For lack of response, I assume no one wants to back-patch this.


  


No, I think it's unnecessary, now we understand what's going on.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

2009-05-05 Thread Robert Haas
On Mon, May 4, 2009 at 10:41 PM, Joshua Tolley eggyk...@gmail.com wrote:
 On Mon, May 04, 2009 at 10:13:31PM -0400, Robert Haas wrote:

 nit
 +       own analysis indicates otherwie).  When set to a negative value, 
 which
 s/otherwie/otherwise
 /nit


 A question: why does attdistinct become entry #5 instead of going at the end?
 I assume it's because the order here controls the column order, and it makes
 sense to have attdistinct next to attstattarget, since they're related. Is
 that right? Thanks in advance...

Yep, that was my thought.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

2009-05-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, May 4, 2009 at 10:41 PM, Joshua Tolley eggyk...@gmail.com wrote:
 A question: why does attdistinct become entry #5 instead of going at the end?
 I assume it's because the order here controls the column order, and it makes
 sense to have attdistinct next to attstattarget, since they're related. Is
 that right? Thanks in advance...

 Yep, that was my thought.

We generally want fixed-size columns before variable-size ones, to ease
accessing them from C code.  So it shouldn't go at the end in any case.
Beyond that it's mostly aesthetics, with maybe some thought for avoiding
unnecessary alignment padding.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, May 4, 2009 at 10:41 PM, Joshua Tolley eggyk...@gmail.com wrote:
 A question: why does attdistinct become entry #5 instead of going at the 
 end?
 I assume it's because the order here controls the column order, and it makes
 sense to have attdistinct next to attstattarget, since they're related. Is
 that right? Thanks in advance...

 Yep, that was my thought.

 We generally want fixed-size columns before variable-size ones, to ease
 accessing them from C code.  So it shouldn't go at the end in any case.
 Beyond that it's mostly aesthetics, with maybe some thought for avoiding
 unnecessary alignment padding.

I thought about that as well; it should be OK where it is, in that regard.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Andrew Dunstan



Tom Lane wrote:

Kevin Grittner kevin.gritt...@wicourts.gov writes:
  

Bernd Helmle maili...@oopsware.de wrote:


Another approach would be to just dump bytea columns in binary
format only (not sure how doable that is, though).
  
 
  

If that's not doable, perhaps a base64 option for bytea COPY?



I'm thinking plain old pairs-of-hex-digits might be the best
tradeoff if conversion speed is the criterion.  The main problem
in any case would be to decide how to control the format option.


  




Yeah.  Any ideas on how to do that? I can't think of anything very clean 
offhand.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Archana Sundararam
Thanks a lot. I thought I would go with writing a function to Drop the views , 
ALTER table and the recreate the views so as to take care of the column type 
changes in the table.

--- On Tue, 5/5/09, Tom Lane t...@sss.pgh.pa.us wrote:

From: Tom Lane t...@sss.pgh.pa.us
Subject: Re: [HACKERS] ALTER TABLE should change respective views
To: Peter Eisentraut pete...@gmx.net
Cc: pgsql-hackers@postgresql.org, Archana Sundararam archn...@yahoo.com
Date: Tuesday, May 5, 2009, 8:10 AM

Peter Eisentraut pete...@gmx.net writes:
 And this could then also change the return type of foo(), thus changing the 
 row type of the view and would thus propogate up to other views.  And so if 
 you have many views, as you say, this could become a great mess.  You could
 probably define and implement a solution, but it would be very confusing and 
 risky to use.

The SQL committee has also historically chosen to punt on such things.
Note the long-established rule that * is expanded at view definition
time (so adding columns doesn't change views).  I also see a flat
prohibition on *any* view reference in the newly added SET DATA TYPE
command (SQL:2008 11.17 alter column data type clause):

7) C shall not be referenced in the query expression of any view descriptor. 

            regards, tom lane



  

[HACKERS] Wrong stats for empty tables

2009-05-05 Thread Emmanuel Cecchet
Hi,

Here is an example showing the problem:

Welcome to psql 8.3.6, the PostgreSQL interactive terminal.
manu=# create table foo (x int);
CREATE TABLE
manu=# explain select * from foo;
  QUERY PLAN
---
 Seq Scan on foo  (cost=0.00..34.00 rows=2400 width=4)
(1 row)

manu=# analyze foo;
ANALYZE
manu=# explain select * from foo;
  QUERY PLAN
---
 Seq Scan on foo  (cost=0.00..34.00 rows=2400 width=4)
(1 row)

manu=# insert into foo values (1);
INSERT 0 1
manu=# analyze foo;
ANALYZE
manu=# explain select * from foo;
QUERY PLAN
---
 Seq Scan on foo  (cost=0.00..1.01 rows=1 width=4)
(1 row)


Now a possible cause for this might be the relpages attribute in pg_class (the 
default value 0 does not seem to be interpreted correctly):

manu=# create table bar(x int);
CREATE TABLE
manu=# explain select * from bar;
  QUERY PLAN
---
 Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
(1 row)

manu=# select relpages from pg_class where relname='bar';
 relpages 
--
0
(1 row)
manu=# update pg_class set relpages=1 where relname='bar';
UPDATE 1
manu=# explain select * from bar;
QUERY PLAN
---
 Seq Scan on bar  (cost=0.00..0.00 rows=1 width=4)
(1 row)


This is a real problem if you have a lot of empty child tables. Postgres will 
not optimize correctly queries in the presence of empty child tables.

Is this a bug?

Thanks for your help,
Emmanuel
--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 I'm thinking plain old pairs-of-hex-digits might be the best
 tradeoff if conversion speed is the criterion.  The main problem
 in any case would be to decide how to control the format option.

 Yeah.  Any ideas on how to do that? I can't think of anything very clean 
 offhand.

Well, there's nothing much wrong with a GUC setting to control output
--- we have lots of precedent, such as DateStyle.  The problem is with
figuring out what ambiguous input is meant to be.  There seems to be
an uncomfortably high risk of misinterpreting the input.

For sake of argument, suppose we define the hex format as 0x followed
by pairs of hex digits.  We could then modify byteaout so that if it
were told to print in old-style a value that happened to start with
0x, it could output 0\x instead, which means the same but would be
unambiguous.  This would fix the problem going forward, but old-style
dumps and un-updated clients would still be at risk.  The risk might
not be too high though, since the odds of successfully parsing old-style
data as hex would be relatively low, particularly if we were draconian
about case (ie the x MUST be lower case and the hex digits MUST be
upper).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong stats for empty tables

2009-05-05 Thread Tom Lane
Emmanuel  Cecchet emmanuel.cecc...@asterdata.com writes:
 Is this a bug?

No, it's intentional.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] windows doesn't notice backend death

2009-05-05 Thread Magnus Hagander
Tom Lane wrote:
 I wrote:
 Attached is a proposed patch for the dead man switch idea.
 ...
 Barring objections I'll go ahead and apply this to HEAD.  I'm wondering
 whether we are sufficiently worried about the Windows task manager issue
 to risk back-patching into 8.3 and 8.2 ... comments?
 
 For lack of response, I assume no one wants to back-patch this.

Hmm. I didn't have time to look it over :(

In general, killing server processes from task manager in windows is
less likely to be a popular thing than using kill on unix (and it still
surprises me how many people that consider themselves experts still do
kill -9 by defualt whenever they want to stop something..)

Given that it actually doesn't notice it if we do, we might have people
doing this that don't know about it. But I think we can at least keep it
HEAD only for a while until it's seen some productoin level testing...

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong stats for empty tables

2009-05-05 Thread Emmanuel Cecchet
From: Tom Lane [...@sss.pgh.pa.us]
Subject: Re: [HACKERS] Wrong stats for empty tables

Emmanuel  Cecchet emmanuel.cecc...@asterdata.com writes:
 Is this a bug?

No, it's intentional.


So what is the rationale behind not being able to use indexes and optimizing 
empty tables as in the following example:

manu=# create table father (id int, val int, tex varchar(100), primary key(id));
manu=# create table other (id1 int, id2 int, data varchar(10), primary 
key(id1,id2));
insert some data
manu=# explain select father.*,id2 from father left join other on 
father.id=other.id1 where id2=2 order by id;
   QUERY PLAN

 Sort  (cost=37.81..37.82 rows=5 width=230)
   Sort Key: father.id
   -  Hash Join  (cost=23.44..37.75 rows=5 width=230)
 Hash Cond: (father.id = other.id1)
 -  Seq Scan on father  (cost=0.00..13.10 rows=310 width=226)
 -  Hash  (cost=23.38..23.38 rows=5 width=8)
   -  Seq Scan on other  (cost=0.00..23.38 rows=5 width=8)
 Filter: (id2 = 2)
(8 rows)

manu=# create table child1() inherits(father);
manu=# create table child2() inherits(father);
manu=# create table child3() inherits(father);
manu=# create table child4() inherits(father);
manu=# create table child5() inherits(father);
manu=# create table child6() inherits(father);
manu=# create table child7() inherits(father);
manu=# create index i1 on child1(id);
manu=# create index i2 on child2(id);
manu=# create index i3 on child3(id);
manu=# create index i4 on child4(id);
manu=# create index i5 on child5(id);
manu=# create index i6 on child6(id);
manu=# create index i7 on child7(id);
manu=# explain select father.*,id2 from father left join other on 
father.id=other.id1 where id2=2 order by id;
 QUERY PLAN

 Sort  (cost=140.00..140.16 rows=62 width=230)
   Sort Key: public.father.id
   -  Hash Join  (cost=23.44..138.16 rows=62 width=230)
 Hash Cond: (public.father.id = other.id1)
 -  Append  (cost=0.00..104.80 rows=2480 width=226)
   -  Seq Scan on father  (cost=0.00..13.10 rows=310 width=226)
   -  Seq Scan on child1 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child2 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child3 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child4 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child5 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child6 father  (cost=0.00..13.10 rows=310 
width=226)
   -  Seq Scan on child7 father  (cost=0.00..13.10 rows=310 
width=226)
 -  Hash  (cost=23.38..23.38 rows=5 width=8)
   -  Seq Scan on other  (cost=0.00..23.38 rows=5 width=8)
 Filter: (id2 = 2)
(16 rows)



I must admit that I did not see what the original intention was to get this 
behavior.
Emmanuel
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Josh Berkus

All,

I was discussing this with a client who experiences this problem on a 
weekly basis, and the issue is mainly one of change management.


That is, manually dropping all of the views  functions dependant on a 
table, changing the table, and recreating the views and functions, is a 
major PITA and substantially inhibits the use of views and functions for 
security and database abstraction.  Add OID invalidation for cached 
plans into this and you have a bunch of developers taking their business 
logic out of the database and putting it into middleware.


What would solve the issue for 90% of our users would be an ALTER TABLE 
... CASCADE which would apply the changes to the table, and do a REPLACE 
VIEW and REPLACE FUNCTION for every dependant view and function, failing 
and rolling back if any REPLACE doesn't work automatically.


Incompatible table changes would still require manual drop and 
recreation, of course.  But most table changes to a production database 
are adding fields or changing constraints, which in most cases won't 
break dependant views or functions.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] community equipment

2009-05-05 Thread Josh Berkus



It has been brought to our attention that many in the PostgreSQL
community are still not aware that we have equipment which has been
donated for community use (e.g. development and testing).  As
requested we have set up an additional web page on pgfoundy and a new
mailing list to discuss usage of the equipment.  Note that the web
page points back to the wiki for information that was already been
created, and that this mailing lists is not intended to replace lists
already in place such as pgsql-hackers or pgsql-performance:


And let me thank the following sponsors for that equipment (in order of 
donation size):


HP
CommandPrompt
Hi5.com
IBM
Sun Microsystems

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Incompatible table changes would still require manual drop and 
 recreation, of course.  But most table changes to a production database 
 are adding fields or changing constraints, which in most cases won't 
 break dependant views or functions.

... as indeed they don't.  What's your point?  The question here was
about whether the DB should try to guess the right behavior for a
datatype change in an existing column.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode string literals versus the world

2009-05-05 Thread Peter Eisentraut
On Tuesday 05 May 2009 03:01:05 Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
  I think we can handle that and the cases Tom presents by erroring out
  when the U syntax is used with stdstr off.
 
  Proposed patch for that attached.

 I have not been able to think of any security hole in that proposal,
 so this patch seems acceptable to me.  I wonder though whether any
 corresponding change is needed in psql's lexer, and if so how should
 it react exactly to the rejection case.

I had thought about that as well, but concluded that no additional change is 
necessary.

Note that the *corresponding* change would be psql complaining I don't like 
what you entered, versus the just-committed behavior that psql is indifferent 
and the server complains I don't like what you sent me.

In any case, the point of the change is to prevent confusion in client 
programs, so if we had to patch psql to make sense, then the change would have 
been pointless in the first place.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Greg Stark

Sorry got top-posting -- stupid iphone mail client.

We could eliminate the problem with old dumps by doing something like  
\x to indicate a new-style hex dump.


That doesn't make us 100% safe against arbitrary user input but should  
be pretty low risk.



--
Greg


On 5 May 2009, at 18:51, Tom Lane t...@sss.pgh.pa.us wrote:


Andrew Dunstan and...@dunslane.net writes:

Tom Lane wrote:

I'm thinking plain old pairs-of-hex-digits might be the best
tradeoff if conversion speed is the criterion.  The main problem
in any case would be to decide how to control the format option.


Yeah.  Any ideas on how to do that? I can't think of anything very  
clean

offhand.


Well, there's nothing much wrong with a GUC setting to control output
--- we have lots of precedent, such as DateStyle.  The problem is with
figuring out what ambiguous input is meant to be.  There seems to be
an uncomfortably high risk of misinterpreting the input.

For sake of argument, suppose we define the hex format as 0x followed
by pairs of hex digits.  We could then modify byteaout so that if it
were told to print in old-style a value that happened to start with
0x, it could output 0\x instead, which means the same but would be
unambiguous.  This would fix the problem going forward, but old-style
dumps and un-updated clients would still be at risk.  The risk might
not be too high though, since the odds of successfully parsing old- 
style

data as hex would be relatively low, particularly if we were draconian
about case (ie the x MUST be lower case and the hex digits MUST be
upper).

   regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE should change respective views

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 2:17 PM, Josh Berkus j...@agliodbs.com wrote:
 Incompatible table changes would still require manual drop and recreation,
 of course.  But most table changes to a production database are adding
 fields or changing constraints, which in most cases won't break dependant
 views or functions.

You can already add a column to a table or change a constraint without
needing to drop and recreate dependent views or functions.  You can
also rename and drop columns.  I think the real issue is when you have
dependencies on a VIEW.

http://archives.postgresql.org/pgsql-hackers/2008-05/msg00691.php

I currently handle the problem you're describing by having a series of
scripts which automatically drop pretty much every view in the
database in reverse order of creation, and then recreate them all (all
within a single transaction).  I run it every time I do a release and
it works great, but it's definitely not ideal, and wouldn't work at
all but for the fact that my system is sufficiently lightly loaded
that taking locks on all of those views is actually possible.

8.4 will be slightly better than 8.3 in that it will allow CREATE OR
REPLACE VIEW to add additional columns to the end of the view
definition.

http://archives.postgresql.org/pgsql-committers/2008-12/msg00066.php

...but this still falls considerably short of where I'd like to be.
The next logical step would probably be to support ALTER VIEW DROP
COLUMN, but I haven't really looked at what would be required to
implement that.  Checking the dependencies is probably the easy part;
the tricky things, I think, are (a) currently, attisdropped can never
be set for any column of a view, does anything break if we change
this? and (b) how do we modify the stored view definition to remove
the dropped column from the query's target list?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 
 Unless we can think of a more bulletproof format selection mechanism
 
Would it make any sense to have an option on the COPY command to tell
it to use base64 for bytea columns?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Tom Lane wrote:
 I'm thinking plain old pairs-of-hex-digits might be the best
 tradeoff if conversion speed is the criterion.

 That's a lot less space-efficient than base64, though.

Well, base64 could give a 33% savings, but it's significantly harder
to encode/decode.  Also, since it has a much larger set of valid
data characters, it would be *much* more likely to allow old-style
formatting to be mistaken for new-style.  Unless we can think of
a more bulletproof format selection mechanism, that could be
an overriding consideration.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Andrew Dunstan



Tom Lane wrote:

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  

Tom Lane wrote:


I'm thinking plain old pairs-of-hex-digits might be the best
tradeoff if conversion speed is the criterion.
  


  

That's a lot less space-efficient than base64, though.



Well, base64 could give a 33% savings, but it's significantly harder
to encode/decode.  Also, since it has a much larger set of valid
data characters, it would be *much* more likely to allow old-style
formatting to be mistaken for new-style.  Unless we can think of
a more bulletproof format selection mechanism, that could be
an overriding consideration.


  


Hex will already provide some space savings over our current encoding 
method for most byteas anyway. It's not like we'd be making things less 
efficient space-wise. And in compressed archives the space difference is 
likely to dissolve to not very much, I suspect.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Serializable Isolation without blocking

2009-05-05 Thread Neil Conway
On Tue, May 5, 2009 at 8:50 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 While discussing potential changes to PostgreSQL documentation of
 transaction isolation levels, Emmanuel Cecchet pointed out an
 intriguing new paper[1] on a new algorithm to provide true
 serializable behavior in a MVCC based database

I agree, this is very interesting work. I blogged about it a while ago[1].

 Making these changes to Berkeley DB involved only modest changes to
 the source code. In total, only 692 lines of code (LOC) were modified
 out of a total of over 200,000 lines of code in Berkeley DB.

Tracking the read sets of each transaction would be very expensive.
Worse still, that information needs to be kept around after
end-of-transaction, which raises questions about where it should be
stored and how it should be cleaned up. Note that the benchmarks in
the paper involve transactions that perform a small number of simple
read and update operations, which reduces the bookkeeping overhead.

Neil

[1] http://everythingisdata.wordpress.com/2009/02/25/february-25-2009/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Patch to fix search_path defencies with pg_bench

2009-05-05 Thread Joshua D. Drake
Hello,

I have been doing some testing with pgbench and I realized that it
forces the use of public as its search_path. This is bad if:

* You want to run multiple pgbench instances within the same database
* You don't want to use public (for whatever reason)

This patch removes that ability and thus will defer to the default
search_path for the connecting user.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index ad20cac..1f25921 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -357,8 +357,6 @@ doConnect(void)
return NULL;
}
 
-   executeStatement(conn, SET search_path = public);
-
return conn;
 }
 

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong stats for empty tables

2009-05-05 Thread Josh Berkus

On 5/5/09 9:52 AM, Tom Lane wrote:

Emmanuel  Cecchetemmanuel.cecc...@asterdata.com  writes:

Is this a bug?


No, it's intentional.


Huh?  Why would we want wrong stats?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not reattach to shared memory captured in buildfarm

2009-05-05 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 One proposed fix is to allocate a fairly large block of memory in the
 postmaster just before we get the shared memory, and then free it right
 away. The effect should be to push down the shared memory segment
 further in the address space.

I have no enthusiasm for doing something like this when we have so
little knowledge of what's actually happening.  We have *no* idea
whether the above could help, or what size of allocation to request.
It's not very hard to imagine that the wrong size choice could make
things worse rather than better.

It seems to me that what we ought to do now is make a serious effort
to gather more data.  I came across a suggestion that one could use
VirtualQuery() to generate a map of the process address space
under Windows.  I suggest that we add some code that is executed
if the reattach attempt fails and dumps the process address space
details to the postmaster log.  Dumping the postmaster's address
space at the time it successfully creates the shmem segment might
be useful for comparison, too.

(A quick look at the VirtualQuery spec indicates that you can't tell
very much beyond free/allocated status, though.  Maybe there's some
other call that would tell more?  It'd be really good if we could get
the names of DLLs occupying memory ranges, for example.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong stats for empty tables

2009-05-05 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 5/5/09 9:52 AM, Tom Lane wrote:
 No, it's intentional.

 Huh?  Why would we want wrong stats?

Tables rarely stay empty; and a plan generated on the assumption that a
table is empty is likely to suck much more when the table stops being
empty than a plan generated on the assumption that the table contains
some data will suck when it really doesn't.  Neither case is really
attractive, but the downside of a size underestimate tends to be a
lot worse than that of an overestimate.

This decision was made before we had autovacuum/autoanalyze support
or the ability to replan automatically after a stats update, but I think
it's still good even now that we do.  You can add a hundred or so tuples
to an empty table before autovac will deign to pay attention, and that's
more than enough to blow a nestloop plan out of the water.  Also, the
most common case for this type of issue is a temp table, which autovac
can't help with at all.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-05 Thread Heikki Linnakangas

Tom Lane wrote:

Kevin Grittner kevin.gritt...@wicourts.gov writes:

Bernd Helmle maili...@oopsware.de wrote:

Another approach would be to just dump bytea columns in binary
format only (not sure how doable that is, though).
 

If that's not doable, perhaps a base64 option for bytea COPY?


I'm thinking plain old pairs-of-hex-digits might be the best
tradeoff if conversion speed is the criterion.


That's a lot less space-efficient than base64, though.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Values of fields in Rules

2009-05-05 Thread mito

mito wrote:

Hi,

when i create rule(on insert) on view(select id, name from users), i 
will recieve values that were inserted in form of reference words 
new.id, new.name.


if i insert into users (id, name) values (null, null);
then new.id = null and new.name = null

if i insert into users (name) values (null);
then new.id = null and new.name = null

is there any way how to distinguish that id column wasnt explicitly named?



is there any way how to determinate in rule that null comes from 
explicit insert or from not naming column in insert statment


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wrong stats for empty tables

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 2:03 PM, Emmanuel  Cecchet
emmanuel.cecc...@asterdata.com wrote:
 So what is the rationale behind not being able to use indexes and optimizing 
 empty tables as in the following example:

 manu=# create table father (id int, val int, tex varchar(100), primary 
 key(id));
 manu=# create table other (id1 int, id2 int, data varchar(10), primary 
 key(id1,id2));
 insert some data
 manu=# explain select father.*,id2 from father left join other on 
 father.id=other.id1 where id2=2 order by id;

Just because the table was empty at the time statistics were most
recently gathered doesn't mean it's still empty at the time the query
is executed.

ANALYZE;
PREPARE foo AS SELECT ...;
INSERT INTO ...some previously empty child table...
EXECUTE foo;

In order to rely on this for query planning, you'd need some way to
invalidate any cached plans when inserting into an empty table.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to fix search_path defencies with pg_bench

2009-05-05 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 I have been doing some testing with pgbench and I realized that it
 forces the use of public as its search_path. This is bad if:

 * You want to run multiple pgbench instances within the same database
 * You don't want to use public (for whatever reason)

 This patch removes that ability and thus will defer to the default
 search_path for the connecting user.

Hmm.  The search_path setting seems to have been added here
http://archives.postgresql.org/pgsql-committers/2002-10/msg00118.php
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pgbench/pgbench.c.diff?r1=1.20;r2=1.21

as part of a mass patch to make everything in contrib work in the public
schema.  I agree that it probably wasn't considered carefully whether
pg_bench should do that; but does anyone see a reason not to change it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Serializable Isolation without blocking

2009-05-05 Thread Kevin Grittner
Neil Conway neil.con...@gmail.com wrote: 
 
 Tracking the read sets of each transaction would be very expensive.
 Worse still, that information needs to be kept around after
 end-of-transaction, which raises questions about where it should be
 stored and how it should be cleaned up. Note that the benchmarks in
 the paper involve transactions that perform a small number of
 simple read and update operations, which reduces the bookkeeping
 overhead.
 
I know that some of the simplifying assumptions listed in 3.1 do not
currently hold for PostgreSQL.  A prerequisite for using the algorithm
would be to make them hold for PostgreSQL, or find some way to work
around their absence.  I hope people will read the paper and mull it
over, but these assumptions are probably the crux or whether this
enhancement is feasible.  I guess it would be best to throw that much
out to the list for discussion.
 
To quote:
 
 1. For any data item x, we can efficiently get the list of
locks held on x.
 2. For any lock l in the system, we can efficiently get
l.owner, the transaction object that requested the lock.
 3. For any version xt of a data item in the system, we
can efficiently get xt.creator, the transaction object
that created that version.
 4. When *nding a version of item x valid at some given
timestamp, we can efficiently get the list of other ver-
sions of x that have later timestamps.
 
Based on my limited understanding, I don't get the impression 2 or 3
are a problem.
 
I'm assuming that we would use the structures which back the pg_locks
view for the SIREAD locks implicit in 1, possibly with some scope
escalation as counts for a table rise (row to page to extent to
table).  This may require a larger allocation for this information by
those wanting to use serializable transactions.
 
I'm not sure how 4 could be handled.
 
I don't know how much work would be required to track the transaction
information listed in section 4.1 (or its functional equivalent).
 
I'd be happy to hear the opinion of those more familiar with the
internals than I.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Values of fields in Rules

2009-05-05 Thread Alvaro Herrera
mito escribió:

 is there any way how to determinate in rule that null comes from  
 explicit insert or from not naming column in insert statment

Not that I know of (and yes, this sucks).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why do we let CREATE DATABASE reassign encoding?

2009-05-05 Thread Tom Lane
I wrote:
 Peter Eisentraut pete...@gmx.net writes:
 AFAIR, the only reason that we haven't disallowed this sort of stuff
 years and years ago is that people use it; the Japanese in particular.
 I don't see what is different now.

 What's different now is that 8.4 has already established the principle
 that you have to clone template0 if you want to change the locale of a
 database.  I think this is a good time to establish the same principle
 for encodings.  (Or in other words, if we don't fix it now, when will
 be a better time?)

Attached is a proposed patch (without documentation changes as yet)
for this.  Since the code is already enforcing exact locale match when
cloning a non-template0 database, I just made it act the same for
encoding, without any strange exceptions for SQL_ASCII.

I found that mbregress.sh was already broken by the existing
restrictions, if you try to use it in a database whose default
locale isn't C.  The patch adds switches to fix that.

The patch also incidentally fixes a few ereport's that were missing
errcode values.

Last chance for objections ...

regards, tom lane

Index: src/backend/commands/dbcommands.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/dbcommands.c,v
retrieving revision 1.223
diff -c -r1.223 dbcommands.c
*** src/backend/commands/dbcommands.c   5 May 2009 23:39:55 -   1.223
--- src/backend/commands/dbcommands.c   6 May 2009 00:30:59 -
***
*** 361,367 
  #endif
  (encoding == PG_SQL_ASCII  superuser(
ereport(ERROR,
!   (errmsg(encoding %s does not match locale %s,
pg_encoding_to_char(encoding),
dbctype),
 errdetail(The chosen LC_CTYPE setting requires 
encoding %s.,
--- 361,368 
  #endif
  (encoding == PG_SQL_ASCII  superuser(
ereport(ERROR,
!   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
!errmsg(encoding %s does not match locale %s,
pg_encoding_to_char(encoding),
dbctype),
 errdetail(The chosen LC_CTYPE setting requires 
encoding %s.,
***
*** 374,402 
  #endif
  (encoding == PG_SQL_ASCII  superuser(
ereport(ERROR,
!   (errmsg(encoding %s does not match locale %s,
pg_encoding_to_char(encoding),
dbcollate),
 errdetail(The chosen LC_COLLATE setting requires 
encoding %s.,
   
pg_encoding_to_char(collate_encoding;
  
/*
!* Check that the new locale is compatible with the source database.
 *
!* We know that template0 doesn't contain any indexes that depend on
!* collation or ctype, so template0 can be used as template for
!* any locale.
 */
if (strcmp(dbtemplate, template0) != 0)
{
if (strcmp(dbcollate, src_collate) != 0)
ereport(ERROR,
!   (errmsg(new collation is incompatible 
with the collation of the template database (%s), src_collate),
 errhint(Use the same collation as in 
the template database, or use template0 as template.)));
  
if (strcmp(dbctype, src_ctype) != 0)
ereport(ERROR,
!   (errmsg(new LC_CTYPE is incompatible 
with LC_CTYPE of the template database (%s), src_ctype),
 errhint(Use the same LC_CTYPE as in 
the template database, or use template0 as template.)));
}
  
--- 375,419 
  #endif
  (encoding == PG_SQL_ASCII  superuser(
ereport(ERROR,
!   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
!errmsg(encoding %s does not match locale %s,
pg_encoding_to_char(encoding),
dbcollate),
 errdetail(The chosen LC_COLLATE setting requires 
encoding %s.,
   
pg_encoding_to_char(collate_encoding;
  
/*
!* Check that the new encoding and locale settings match the source
!* database.  We insist on this because we simply copy the source data 
---
!* any non-ASCII data would be wrongly encoded, and any indexes sorted
!* according to the source locale would be wrong.
 *
!* However, we assume 

Re: [HACKERS] create if not exists (CINE)

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 11:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, May 5, 2009 at 8:56 AM, Andrew Dunstan and...@dunslane.net wrote:
 We have debated CREATE ... IF NOT EXISTS in the past, and there is no
 consensus on what it should do, so we don't have it for any command. That is
 quite a different case from what's being asked for, and the two should not
 be conflated.

 I must be missing something, because the semantics of CREATE ... IF
 NOT EXISTS seem pretty well-defined to me,

 Please go read the prior threads (I think searching for CINE might
 help, because we pretty shortly started abbreviating it like that).

OK - done, and thanks for the search tip.

I still stand by my original comment.  I think there is no semantic
question about what CREATE IF NOT EXISTS ought to do.  It ought to
create the object if it doesn't exist.  Otherwise, it ought to do
nothing.  That leads to two questions, the first of which Andrew asked
in an email earlier today, and the second of which you asked in the
previous discussion of this issue:

1. Why should it do nothing if the object already exists (as opposed
to any other alternative)?

Answer: Because that's what CREATE IF NOT EXISTS means when
interpreted as English.  If you wanted it to take some action when the
object already exists, you'd have to call the command something like
CREATE IF NOT EXISTS OTHERWISE MUTILATE.  Actually, we pretty much
already have this in the form of CREATE OR REPLACE, but CREATE OR
REPLACE is only suitable for objects whose state can be fully defined
by the command which creates them.  This is true for views and
functions, but false for tables and sequences, which contain user
data.

2. What good is this anyway?

Answer: It's good for schema management.  Typically, you have a
development system and N0 production systems.  Periodically, you do
releases from develepment to production.  When you release to a
machine X, you want to upgrade that machine from whatever version of
the schema it has now to the one appropriate to the version of the
application you are releasing.  So suppose you have a table caled foo
that didn't exist in version 1 of the software.  In version 2 it was
added with columns id and name.  In version 3 of the software a date
column called bar was added.  You are releasing version 3.  So you
write the following SQL script:

CREATE TABLE IF NOT EXISTS foo (id serial, name varchar not null,
primary key (id));
ALTER TABLE foo ADD COLUMN IF NOT EXISTS bar date;

Observe that after running this script on EITHER a V1 or a V2
database, you now have the V3 schema.  Without CINE, you have to
either write separate upgrade scripts for V1-V3 and V2-V3, or write
a PL/pgsql function that scrutinizes the system catalogs and figures
out what needs to be done, or have some sort of bookkeeping system to
keep track of which DDL bits have previously been executed, or
something other alternative that will definitely be more complicated
than the above.  Obviously, there are more complex cases that CINE
can't handle, but this is actually enough for a pretty good percentage
of them in my experience.  You typically add a table, then as releases
go by you add more columns, then possibly at some point you decide
that whole table was a stupid idea and you rip it out (which is
already well-handled via DROP IF EXISTS).  Typically when adding a
column to an existing table you either allow nulls or set a default,
either of which will work fine with this syntax.  If you need to do
something more complicated (like compute the initial values of bar
based on the contents of some other table), well, then you're back to
where you always are today.

It seems to me that the right thing to do is to support CREATE OR
REPLACE for as many object types as possible.  But that won't be
possible for things like tables unless we can make PostgreSQL
AI-complete, so for those I think we ought to support CINE to cater to
the design pattern above.  That is of course only my opinion, but I
gather from some of the comments made earlier today that I'm not the
only one who wrestles with this problem.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create if not exists (CINE)

2009-05-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 1. Why should it do nothing if the object already exists (as opposed
 to any other alternative)?

 Answer: Because that's what CREATE IF NOT EXISTS means when
 interpreted as English.

The argument was not about whether that is the plain meaning of the
phrase; it was about whether that is a safe and useful behavior for a
command to have.  There is a pretty substantial group of people who
think that it would be quite unsafe, which is why we failed to arrive
at a consensus that this is a good thing to implement.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create if not exists (CINE)

2009-05-05 Thread Robert Haas
On Tue, May 5, 2009 at 9:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 1. Why should it do nothing if the object already exists (as opposed
 to any other alternative)?

 Answer: Because that's what CREATE IF NOT EXISTS means when
 interpreted as English.

 The argument was not about whether that is the plain meaning of the
 phrase; it was about whether that is a safe and useful behavior for a
 command to have.  There is a pretty substantial group of people who
 think that it would be quite unsafe, which is why we failed to arrive
 at a consensus that this is a good thing to implement.

Who are these people other than you, and did you read the rest of my email?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create if not exists (CINE)

2009-05-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 5, 2009 at 9:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The argument was not about whether that is the plain meaning of the
 phrase; it was about whether that is a safe and useful behavior for a
 command to have.  There is a pretty substantial group of people who
 think that it would be quite unsafe, which is why we failed to arrive
 at a consensus that this is a good thing to implement.

 Who are these people other than you,

In the thread that went into this in most detail
http://archives.postgresql.org//pgsql-hackers/2005-10/msg00632.php
it seemed that wanting CINE was a minority opinion, and in any case
a number of pretty serious issues were raised.

 and did you read the rest of my email?

Yes, I did.  I'm not any more convinced than I was before.  In
particular, the example you give is handled reasonably well without
*any* new features, if one merely ignores object already exists
errors.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create if not exists (CINE)

2009-05-05 Thread Asko Oja
It was just yesterday when i wondering why we don't have this feature (i was
trying to use it and it wasn't there :).
The group of people who think it's unsafe should not use the feature.
Clearly this feature would be useful when managing large amounts of servers
and would simplify our release process.


On Wed, May 6, 2009 at 5:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Robert Haas robertmh...@gmail.com writes:
  On Tue, May 5, 2009 at 9:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  The argument was not about whether that is the plain meaning of the
  phrase; it was about whether that is a safe and useful behavior for a
  command to have.  There is a pretty substantial group of people who
  think that it would be quite unsafe, which is why we failed to arrive
  at a consensus that this is a good thing to implement.

  Who are these people other than you,

 In the thread that went into this in most detail
 http://archives.postgresql.org//pgsql-hackers/2005-10/msg00632.php
 it seemed that wanting CINE was a minority opinion, and in any case
 a number of pretty serious issues were raised.

  and did you read the rest of my email?

 Yes, I did.  I'm not any more convinced than I was before.  In
 particular, the example you give is handled reasonably well without
 *any* new features, if one merely ignores object already exists
 errors.

It sounds pretty amazing. Ignoring errors as a suggested way to use
PostgreSQL.
We run our release scripts inside transactions (with exception of concurrent
index creation). So if something unexpected happens we are left still in
working state.
PostgreSQL ability to do DDL changes inside transaction was one of biggest
surprises/improvements when switching from Oracle. Now you try to bring us
down back to the level of Oracle :)


regards, tom lane

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers