Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Sam Mason
On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
 regression=# select '999'::interval second;
 The correct interpretation of the input value is certainly 999 seconds.

Agreed; silent truncation like this is confusing and will lead to
unnecessary bugs in users' code.

 With the attached patch we would render as
 regression=# select '1 day 1 hour'::interval hour;
  1 day 01:00:00
 
 There is some case to be made that we should throw error here,
 which we could do by putting error tests where the attached patch
 has comments suggesting an error test.

With things as they are I think it would be useful to throw an error
here; if the user means 25 hours they should say 25 hours!

 However I'm inclined to think
 that doing that would expose an implementation dependency rather more
 than we should.  It is usually not clear to novices that '1 day 1 hour'
 is different from '25 hours', and it would be even less clear why the
 latter would be acceptable input for an INTERVAL HOUR field when the
 former isn't.  So I'm proposing the patch as-is rather than with the
 extra error tests, but am open to being convinced otherwise.

It would only be different when the interval is used with values of type
timestamptz, or am I missing something?  How much sense does it make to
have a timezone-aware interval, where this distinction is true, and leave
the current interval as timezone-naive?  Not sure if that would help to
clean up the semantics at all or if it's just adding more unnecessary
complexity.  I have a feeling it's probably the latter, but thought it
may help things.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] INTERVAL SECOND limited to 59 seconds?

2009-06-01 Thread Sebastien FLAESCH

Thank you Tom for looking at this.

I would be pleased to help on testing the fix when available.

My plan is to store Informix INTERVALs (coming from the 4gl applications we
support) into PostgreSQL INTERVALs, and I have a bunch of tests for that...

I believe Informix INTERVALs (and related operators and functions) are not
100% SQL99, but they are close...

Thanks a lot!
Seb

Tom Lane wrote:

Sebastien FLAESCH s...@4js.com writes:

I would expect that an INTERVAL SECOND can store more than 59 seconds.


I took a look into the SQL spec and I think that we do indeed have a
spec compliance issue here.  SQL99 section 4.7 saith

 Within a value of type interval, the first field is constrained
 only by the interval leading field precision of the associated
 interval qualifier. Table 8, Valid values for fields in INTERVAL
 values, specifies the constraints on subsequent field values.
 [ Table 8 says about what you'd expect, eg 0..23 for HOUR ]
 Values in interval fields other than SECOND are integers and have
 precision 2 when not the first field. SECOND, however, can be
 defined to have an interval fractional seconds precision that
 indicates the number of decimal digits maintained following the
 decimal point in the seconds value. When not the first field,
 SECOND has a precision of 2 places before the decimal point.

So in other words, 999 seconds is a valid value for a field of type
INTERVAL SECOND, *and should come out the same way*, not as 00:16:39,
and certainly not as 00:00:39.

It might be a relatively easy fix to not truncate the input value
incorrectly.  I haven't looked, but I think we should look now, because
8.4 has already changed the behavior in this area and it would be good
not to change it twice.  The focus of the 8.4 work was to make sure that
we would correctly interpret the values of spec-compliant interval
literals, but this example shows we are not there yet.

We are fairly far away from being able to make it print out as the spec
would suggest, because interval_out simply doesn't have access to the
information that the field is constrained to be INTERVAL SECOND rather
than some other kind of interval.  We also have got no concept at all of
interval leading field precision, only of interval fractional seconds
precision, so constraining the leading field to only a certain number
of integral digits isn't possible either.  I don't foresee anything
getting done about either of those points for 8.4.

regards, tom lane






Re: [HACKERS] Win32 link() function

2009-06-01 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   pg_migrator needs hard link() capability on Win32 to support its --link
   option.  Can someone create that and hopefully add it to libpgport? 
  
  AFAIK hard links simply don't exist on Windows.
 
 Magnus showed me this:
 
   http://msdn.microsoft.com/en-us/library/aa363860(VS.85).aspx
 
   BOOL WINAPI CreateHardLink(
     __in        LPCTSTR lpFileName,
     __in        LPCTSTR lpExistingFileName,
     __reserved  LPSECURITY_ATTRIBUTES lpSecurityAttributes
   );

Hiroshi Saito emailed me some sample link code I was able to use for
Win32 hard links and I have committed that to pg_migrator CVS.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_standby -l might destroy the archived file

2009-06-01 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 pg_standby can use ln command to restore an archived file,
 which might destroy the archived file as follows.

Does it matter?  pg_standby's source area wouldn't normally be an
archive in the real sense of the word, it's just a temporary staging
area between master and slave.  (If it were being used as a real
archive, keeping it on the same disk as the live slave seems pretty
foolish anyway, so the case wouldn't arise.)

regards, tom lane



Re: [HACKERS] pg_standby -l might destroy the archived file

2009-06-01 Thread Heikki Linnakangas

Tom Lane wrote:

Fujii Masao masao.fu...@gmail.com writes:

pg_standby can use ln command to restore an archived file,
which might destroy the archived file as follows.


Does it matter?  pg_standby's source area wouldn't normally be an
archive in the real sense of the word, it's just a temporary staging
area between master and slave.  (If it were being used as a real
archive, keeping it on the same disk as the live slave seems pretty
foolish anyway, so the case wouldn't arise.)


It seems perfectly sane to source pg_standby directly from the archive 
to me. And we're talking about symbolic linking, so the archive 
directory might well be on an NFS mount.


(I haven't looked at the issue yet)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Robert Haas
On Sun, May 31, 2009 at 11:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 (At the risk of beating a dead horse, note if we were upgrading the
 catalog tables directly via SQL, this type of scenario could be
 handled cleanly without hacking pg_dump; I repeat my earlier critique
 that the pg_migrator approach consigns us to a never-ending series of
 pg_dump hacks, that or having it not work very well.)

 Updating the catalog tables directly via SQL?  Good luck making that
 work.  If you ever get it to work at all, it'll require a pile of hacks
 that will make pg_migrator look like paradise.

For clarity, I really mean from a standalone backend, but ideally
I'd like it to be SQL.

 (Which is not to say that pg_migrator isn't a hack; it surely is.
 But it looks like the least painful approach available.)

Maybe.  It seems that we don't have a good way of handling datatype
conversions.  The approaches that have been proposed for tsvector
wouldn't work at all but for the fact that the new output function can
handle the old internal representation, which is not something that we
can guarantee in every case.  And, even so, they leave the database in
a broken state until the post-migration scripts have been run.  The
good news is that tsvector is not a datatype that everyone uses, and
those who do probably don't use it in every table, but what happens
when we want to change numeric incompatibly?

We really need to figure out an approach that lets us keep the old
datatypes around under a different name while making the original name
be the new version of the datatype.  That way people can migrate and
be up, and deal with the need to rewrite their tables at a later time.
 Having the database come up quickly but in a state where queries may
return wrong answers is a fairly poor excuse for in-place upgrade.
This is not to say that I don't appreciate the work Bruce is putting
into it, but I fear that without some more engineering it's just going
to be a hack used only by the desperate.

...Robert



Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
Sam Mason s...@samason.me.uk writes:
 On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
 There is some case to be made that we should throw error here,
 which we could do by putting error tests where the attached patch
 has comments suggesting an error test.

 With things as they are I think it would be useful to throw an error
 here; if the user means 25 hours they should say 25 hours!

Well, maybe, but I'm not really convinced.  By the same logic, if the
column is INTERVAL MONTH and the user puts in '1 year 1 month', then
we ought to throw an error for that too.  But AdjustIntervalForTypmod
is incapable of doing that because months and years are folded into
a single field.  There is a logical reason for the difference --- 1 year
is always exactly 12 months whereas 1 day is not always exactly 24 hours
--- but that difference isn't acknowledged by the SQL standard, which
last I checked still pretends daylight savings time doesn't exist.

The real bottom line here is that our underlying implementation and
semantics for INTERVAL are considerably different from what the SQL
standard has in mind.  AFAICS they intend an interval to contain six
separate numeric fields with what you see is what you get behavior.
We could argue some other time about whether that's a better design
than what we're using, but it's surely not going to change for 8.4.
My ambitions for the moment are limited to making sure that we accept
all spec-compliant interval literals and interpret them in a fashion
reasonably compatible with what the spec says the value is.  I don't
feel that we need to throw error for stuff we used to accept in order
to meet that goal.

 It would only be different when the interval is used with values of type
 timestamptz, or am I missing something?  How much sense does it make to
 have a timezone-aware interval, where this distinction is true, and leave
 the current interval as timezone-naive?

Doesn't seem practical, certainly not for 8.4.  In any case I'm
uncomfortable with the idea that a value would be accepted at entry
and then fail later on depending on how you used it.

regards, tom lane



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 4:03 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, May 31, 2009 at 11:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Updating the catalog tables directly via SQL?  Good luck making that
 work.  If you ever get it to work at all, it'll require a pile of hacks
 that will make pg_migrator look like paradise.

 For clarity, I really mean from a standalone backend, but ideally
 I'd like it to be SQL.

Keep in mind that you have catalogs in all the databases, and even in
standalone mode you need those catalogs to find the, er, catalogs.
There's a reason bootstrap mode is so limited and even then it needs
some of the catalogs to already be in place.

 (Which is not to say that pg_migrator isn't a hack; it surely is.
 But it looks like the least painful approach available.)

 Maybe.  It seems that we don't have a good way of handling datatype
 conversions.  The approaches that have been proposed for tsvector
 wouldn't work at all but for the fact that the new output function can
 handle the old internal representation, which is not something that we
 can guarantee in every case.

Well I think all we need for that is to have pg_migrator provide the
old output function wrapped up in a migrate_foo() C function.


 And, even so, they leave the database in
 a broken state until the post-migration scripts have been run.  The
 good news is that tsvector is not a datatype that everyone uses, and
 those who do probably don't use it in every table, but what happens
 when we want to change numeric incompatibly?

Or, say, timestamp...

 We really need to figure out an approach that lets us keep the old
 datatypes around under a different name while making the original name
 be the new version of the datatype.  That way people can migrate and
 be up, and deal with the need to rewrite their tables at a later time.

I do agree that having to rewrite the whole table isn't really
upgrade-in-place.

But the work to support multiple versions of data types is more than
you're describing. You need to be concerned about things like joins
between tables when some columns are the old data type and some the
new, etc.

Ultimately people will have to convert the data types sometime.

-- 
greg



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Tom Lane
Greg Stark st...@enterprisedb.com writes:
 On Mon, Jun 1, 2009 at 4:03 PM, Robert Haas robertmh...@gmail.com wrote:
 We really need to figure out an approach that lets us keep the old
 datatypes around under a different name while making the original name
 be the new version of the datatype.  That way people can migrate and
 be up, and deal with the need to rewrite their tables at a later time.

 I do agree that having to rewrite the whole table isn't really
 upgrade-in-place.

It's certainly the case that there is a lot more work to do before
pg_migrator could support everything that we reasonably want to be
able to do in a version update.  As I see it, the reason it's getting
revived now is that 8.3->8.4 happens to be an update where most of what
it can't (yet) do isn't necessary.  That means we can get it out there,
get the bugs out of the functionality it does have, and most importantly
try to set an expectation that future updates will also have some degree
of update-in-place capability.  If we wait till it's perfect then
nothing will ever happen at all in this space.

regards, tom lane



Re: [HACKERS] pg_standby -l might destroy the archived file

2009-06-01 Thread Aidan Van Dyk
* Heikki Linnakangas heikki.linnakan...@enterprisedb.com [090601 10:56]:
 Tom Lane wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 pg_standby can use ln command to restore an archived file,
 which might destroy the archived file as follows.

 Does it matter?  pg_standby's source area wouldn't normally be an
 archive in the real sense of the word, it's just a temporary staging
 area between master and slave.  (If it were being used as a real
 archive, keeping it on the same disk as the live slave seems pretty
 foolish anyway, so the case wouldn't arise.)

 It seems perfectly sane to source pg_standby directly from the archive  
 to me. And we're talking about symbolic linking, so the archive  
 directory might well be on an NFS mount.

I would expect that any archive directly available would at least be RO
to the postgres slave... But

Something like this would stop the symlink being renamed... Not portable,
but probably portable across platforms that have symlinks...
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1b575e2..cba3f7a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3042,13 +3042,23 @@ RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
 	{
 		if (XLogArchiveCheckDone(xlde->d_name))
 		{
+			struct stat stat_buf;
 			snprintf(path, MAXPGPATH, XLOGDIR "/%s", xlde->d_name);
 
 			/*
 			 * Before deleting the file, see if it can be recycled as a
 			 * future log segment.
+			 * If it's a symlink, we don't recycle it.
 			 */
-			if (InstallXLogFileSegment(&endlogId, &endlogSeg, path,
+			if (lstat(path, &stat_buf) == 0 && S_ISLNK(stat_buf.st_mode))
+			{
+				ereport(DEBUG2,
+						(errmsg("removing transaction log symlink \"%s\"",
+								xlde->d_name)));
+				unlink(path);
+				CheckpointStats.ckpt_segs_removed++;
+			}
+			else if (InstallXLogFileSegment(&endlogId, &endlogSeg, path,
 									   true, &max_advance,
 									   true))
 			{

You can make a smaller patch if you're not interested in the DEBUG2 message
saying that it deleted a symlink...


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 11:10 AM, Greg Stark st...@enterprisedb.com wrote:
 On Mon, Jun 1, 2009 at 4:03 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, May 31, 2009 at 11:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Updating the catalog tables directly via SQL?  Good luck making that
 work.  If you ever get it to work at all, it'll require a pile of hacks
 that will make pg_migrator look like paradise.

 For clarity, I really mean from a standalone backend, but ideally
 I'd like it to be SQL.

 Keep in mind that you have catalogs in all the databases, and even in
 standalone mode you need those catalogs to find the, er, catalogs.
 There's a reason bootstrap mode is so limited and even then it needs
 some of the catalogs to already be in place.

Yep.  Changes to the schema of the bootstrap tables are definitely the
toughest nut to crack.  I haven't been convinced that it's impossible,
but given my relative level of knowledge of the code compared to Tom,
that could well be an indication that I'm overly optimistic.

 (Which is not to say that pg_migrator isn't a hack; it surely is.
 But it looks like the least painful approach available.)

 Maybe.  It seems that we don't have a good way of handling datatype
 conversions.  The approaches that have been proposed for tsvector
 wouldn't work at all but for the fact that the new output function can
 handle the old internal representation, which is not something that we
 can guarantee in every case.

 Well I think all we need for that is to have pg_migrator provide the
 old output function wrapped up in a migrate_foo() C function

Well that would be better, but it still leaves the database temporarily broken.

 And, even so, they leave the database in
 a broken state until the post-migration scripts have been run.  The
 good news is that tsvector is not a datatype that everyone uses, and
 those who do probably don't use it in every table, but what happens
 when we want to change numeric incompatibly?

 Or, say, timestamp...

Yeah.

 We really need to figure out an approach that lets us keep the old
 datatypes around under a different name while making the original name
 be the new version of the datatype.  That way people can migrate and
 be up, and deal with the need to rewrite their tables at a later time.

 I do agree that having to rewrite the whole table isn't really
 upgrade-in-place.

 But the work to support multiple versions of data types is more than
 you're describing. You need to be concerned about things like joins
 between tables when some columns are the old data type and some the
 new, etc.

True.

 Ultimately people will have to convert the data types sometime.

Yes they will, but not having to do it as part of the upgrade is
important.  What particularly bothers me is the possibility that the
database comes on line and starts letting clients in (who don't know
that it's broken) while the breakage is still present.

...Robert



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 11:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@enterprisedb.com writes:
 On Mon, Jun 1, 2009 at 4:03 PM, Robert Haas robertmh...@gmail.com wrote:
 We really need to figure out an approach that lets us keep the old
 datatypes around under a different name while making the original name
 be the new version of the datatype.  That way people can migrate and
 be up, and deal with the need to rewrite their tables at a later time.

 I do agree that having to rewrite the whole table isn't really
 upgrade-in-place.

 It's certainly the case that there is a lot more work to do before
 pg_migrator could support everything that we reasonably want to be
 able to do in a version update.  As I see it, the reason it's getting
 revived now is that 8.3->8.4 happens to be an update where most of what
 it can't (yet) do isn't necessary.  That means we can get it out there,
 get the bugs out of the functionality it does have, and most importantly
 try to set an expectation that future updates will also have some degree
 of update-in-place capability.  If we wait till it's perfect then
 nothing will ever happen at all in this space.

I agree.  I remain doubtful that dumping and reloading the schema is
the best way to go, but it's certainly a worthwhile experiment,
because (a) I might easily be wrong and (b) we'll hopefully learn some
things that will be useful going forward.

...Robert



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Bruce Momjian
Robert Haas wrote:
  It's certainly the case that there is a lot more work to do before
  pg_migrator could support everything that we reasonably want to be
  able to do in a version update.  As I see it, the reason it's getting
  revived now is that 8.3->8.4 happens to be an update where most of what
  it can't (yet) do isn't necessary.  That means we can get it out there,
  get the bugs out of the functionality it does have, and most importantly
  try to set an expectation that future updates will also have some degree
  of update-in-place capability.  If we wait till it's perfect then
  nothing will ever happen at all in this space.
 
 I agree.  I remain doubtful that dumping and reloading the schema is
 the best way to go, but it's certainly a worthwhile experiment,
 because (a) I might easily be wrong and (b) we'll hopefully learn some
 things that will be useful going forward.

Doing an upgrade without dumping/reloading the schema would be overly
complex and would have to be adjusted for every major release, with the
high likelihood that it would not get done or would fail in random ways.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-06-01 Thread Bruce Momjian
Greg Stark wrote:
 On Mon, Jun 1, 2009 at 4:03 PM, Robert Haas robertmh...@gmail.com wrote:
  On Sun, May 31, 2009 at 11:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Updating the catalog tables directly via SQL?  Good luck making that
  work.  If you ever get it to work at all, it'll require a pile of hacks
  that will make pg_migrator look like paradise.
 
  For clarity, I really mean from a standalone backend, but ideally
  I'd like it to be SQL.
 
 Keep in mind that you have catalogs in all the databases, and even in
 standalone mode you need those catalogs to find the, er, catalogs.
 There's a reason bootstrap mode is so limited and even then it needs
 some of the catalogs to already be in place.
 
  (Which is not to say that pg_migrator isn't a hack; it surely is.
  But it looks like the least painful approach available.)
 
  Maybe.  It seems that we don't have a good way of handling datatype
  conversions.  The approaches that have been proposed for tsvector
  wouldn't work at all but for the fact that the new output function can
  handle the old internal representation, which is not something that we
  can guarantee in every case.
 
 Well I think all we need for that is to have pg_migrator provide the
 old output function wrapped up in a migrate_foo() C function.

Yep, it is just a matter of using a custom function call rather than a
cast.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Feedback on writing extensible modules

2009-06-01 Thread Alvaro Herrera
Dimitri Fontaine wrote:

 On 31 May 09 at 18:21, Tom Lane wrote:

 You could maybe make this work by executing your own transaction
 to do it, but I really have to wonder if it's a good idea.  One
 point to think about is that elog(ERROR) still means elog(FATAL)
 at this point, so any brokenness in the queries you're trying to
 prepare will result in locking all users out of the database.

 Yeah that's a pretty good foot gun, yet another one. But  
 preprepare.at_init is optional and defaults to off. If you broke it all, 
 you can turn it off again and reload.

Maybe you could set a callback to be called during the first transaction
in InitPostgres ... in full knowledge that if it fails, people will be
locked out of the database.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
As of today we are three months behind the original plan for 8.4.0 release.
In a one-year release cycle that's already pretty bad slip; but there now
seems no chance of a release happening in less than a month, and if we
continue to let things drift it could easily stretch to five or six
months' slip.  Given the slow pace of bug reports there is no reason to
be waiting.  We need to refocus our energy on getting the release out.

The main thing that needs to happen now is to deal with the open items
listed at
http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items
either by fixing them or by agreeing that it's okay to let them slide
to 8.5 or beyond.

regards, tom lane



Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-06-01 Thread Kevin Field
On May 29, 1:04 pm, t...@sss.pgh.pa.us (Tom Lane) wrote:
 Kevin Field kevinjamesfi...@gmail.com writes:
  default:
      elog(ERROR, "unrecognized raise option: %d", opt->opt_type);
  Should this be changed to:
  default:
      ereport(ERROR, (errmsg_internal("unrecognized raise option: %d",
                                      opt->opt_type)));

 No, we generally don't bother with that.  The above two are exactly
 equivalent and the first is easier to write, so why complicate the code?
 ereport is needed if you want to specify a SQLSTATE, provide a
 translatable error message, etc, but for internal shouldn't-happen cases
 we customarily just use elog.

Ah, I had missed that.  I understand.  The function's comment's still
out of date though, I think, since it uses ereport at the end.



Re: [HACKERS] Feedback on writing extensible modules

2009-06-01 Thread Simon Riggs

On Mon, 2009-06-01 at 12:23 -0400, Alvaro Herrera wrote:
 Dimitri Fontaine wrote:
 
  On 31 May 09 at 18:21, Tom Lane wrote:
 
  You could maybe make this work by executing your own transaction
  to do it, but I really have to wonder if it's a good idea.  One
  point to think about is that elog(ERROR) still means elog(FATAL)
  at this point, so any brokenness in the queries you're trying to
  prepare will result in locking all users out of the database.
 
  Yeah that's a pretty good foot gun, yet another one. But  
  preprepare.at_init is optional and defaults to off. If you broke it all, 
  you can turn it off again and reload.
 
 Maybe you could set a callback to be called during the first transaction
 in InitPostgres ... in full knowledge that if it fails, people will be
 locked out of the database.

Should be possible to define a custom variable that has an assign hook
that does nothing unless called with PGC_S_DATABASE or PGC_S_USER. That
should guarantee the code only runs after connection, rather than at
server startup. Not tried it yet.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] search_path improvements

2009-06-01 Thread Josh Berkus

Greg,


Well the goal is to make them simpler. I don't know any language that
has implemented what you describe. Either you have access to the
internal methods of a class or you don't and you only have access to
the public api. That seems to work for much more sophisticated
languages than ours just fine.


Um, PostgreSQL isn't a programming language.  It's a DBMS.

You're arguing what DBAs should do using some theoretical idealized 
DBMS.  You're ignoring what DBAs *do* do currently using the real world 
PostgreSQL.  I, for one, am not interested in theory.



Organization/Maintainability: when you have hundreds or thousands of
database objects, you want folders to put them in just so that you can
keep track of them and view them in easy-to-digest groups, just as you deal
with files in a filesystem.


Well when you ls a directory or perform some operation on a file you
only work with what's in that directory, not everything in the
hierarchy.


Precisely!  And that's why DBAs often use lots of schema to divide up 
their hundreds of database objects.



I don't think that gets any easier with better tools. This is the same
reason unix systems don't put every tool in a different directory and
then insist you put every directory in your path based on which tools
each user should have access to.


What, you're telling me you never had a cron job break because of $PATH 
issues?



It doesn't get any easier if you have
every function hard coding inside it assumptions about what schemas it
will need.


When have I proposed that?


As I said before, schema combine 3 purposes: organization, visibility and
security, into one structure.  Which is why it's difficult to make them work
perfectly for all 3 purposes.  We could, however, make them work better.


How is this different from any other analogous system? The filesystem
uses directories for all three of the above, for example?


It's very similar to issues with the filesystem.  Unfortunately, while 
very familiar, Unix filesystems aren't really a positive example; $PATHs 
and UMASK are a PITA and have forced many an admin (and OS) to come up 
with complex tools to manage them.



Having three different namespaces, one for organizing your code, one
to control visibility, and one to control security would be 9 times
more complex, i think.


I didn't say I had a good answer to this problem.  You just need to be 
aware of the three purposes of schema when proposing any improvements; 
your previous e-mails kept making the assumption that schema were used 
*only* for visibility, and never for security or organization.



3rd-party vendor code is precisely what I'm thinking of when I point
out that having global state to override what the code requests is a
recipe for problems. 3rd-party vendors would be left with no way to
write their code such that they could guarantee it would work -- the
DBA would always be able to break it by setting this variable. And
some other code might require this variable to be set leaving the
hapless DBA with no right option.


"You must set search_path_suffix='information_schema' to use this tool" 
is vastly simpler than what you'd deal with currently if you had to deal 
with a Microsoftian tool which assumed that information_schema was 
automatically in your search path.


Again, I'm looking to improve what we actually *have* right now, rather 
than implement some theoretically ideal database.



Well I don't mind push but I still think pop is an error. What you
really want to do is restore it to the value you started with. You
don't want to remove the last element since that may not be the
element you added. Some function you called may have added an extra
element on the head.


Yeah, pop is a misnomer; what I'd want is 
search_path_del(search_path,'admin') ... that is, a way to remove a 
specific schema from the list.
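No search_path_del exists today; as a rough sketch of the semantics Josh is asking for, treating search_path as the comma-separated string it is at the SQL level (the function name and behavior are hypothetical):

```python
def search_path_del(search_path: str, schema: str) -> str:
    """Remove every occurrence of one schema from a search_path string."""
    parts = [p.strip() for p in search_path.split(",")]
    kept = [p for p in parts if p != schema]
    return ", ".join(kept)

print(search_path_del("admin, public, pg_catalog", "admin"))
# -> public, pg_catalog
```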


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Markus Wanner
Hi,

Kevin Grittner wrote:
 Greg Stark st...@enterprisedb.com wrote:
 I would want any serialization failure to be
 justifiable by simple inspection of the two transactions.
  
 BTW, there are often three (or more) transaction involved in creating
 a serialization failure, where any two of them alone would not fail. 
 You probably knew that, but just making sure.

I'm not that eager on the justifiable by simple inspection requirement
above. I don't think a DBA is commonly doing these inspections at all.

I think a tool to measure abort rates per transaction (type) would serve
the DBA better. Of course there may be false positives, but high abort
rates should point out the problematic transactions pretty quickly. The
DBA shouldn't need to care about rare serialization failures or their
justifiability.

But maybe that reveals another requirement: false positives should be
rare enough for the DBA to still be able to figure out which
transactions are problematic and actually lead to conflicts.

In general, getting good performance by allowing a certain
false-positive rate seems like a good approach to me.
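The measuring tool Markus has in mind could be as simple as per-transaction-type commit/abort counters. Everything here (names, the workload trace, the 10% threshold) is made up for illustration:

```python
from collections import defaultdict

# Per-transaction-type commit/abort counters.
stats = defaultdict(lambda: {"commit": 0, "abort": 0})

def record(txn_type, outcome):
    stats[txn_type][outcome] += 1

# Hypothetical trace: short OLTP writes rarely abort, a long report often does.
for outcome in ["commit"] * 98 + ["abort"] * 2:
    record("update_balance", outcome)
for outcome in ["commit"] * 60 + ["abort"] * 40:
    record("monthly_report", outcome)

def abort_rate(txn_type):
    s = stats[txn_type]
    return s["abort"] / (s["commit"] + s["abort"])

# Flag transaction types whose abort rate exceeds 10%.
problematic = sorted(t for t in stats if abort_rate(t) > 0.10)
print(problematic)  # ['monthly_report']
```

Even with some false positives mixed in, the high-rate transaction types stand out quickly, which is the point being argued.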

Regards

Markus Wanner




Re: [HACKERS] Dtrace probes documentation

2009-06-01 Thread Frank Ch. Eigler
Tom Lane t...@sss.pgh.pa.us writes:

 [...]
 See http://blog.endpoint.com/2009/05/postgresql-with-systemtap.html for
 details. Perhaps it's worth noting in the documentation that SystemTap users
 will need to use the double-underscore version?

 I think a better solution is to persuade the Systemtap guys that they
 ought to accept the single-hyphen spelling.  [...]

Will do: http://sourceware.org/PR10225.

- FChE



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-01 Thread Markus Wanner
Hi,

a newish conversion with cvs2git is available to check here:

  git://www.bluegap.ch/

(it's not incremental and will only stay for a few days)


For everybody interested, please check the committer names and emails.
I'm missing the names and email addresses for these committers:

'barry' : ('barry??', ''),
'dennis' : ('Dennis??', ''),
'inoue' : ('inoue??', ''),
'jurka' : ('jurka??', ''),
'pjw' : ('pjw??', ''),

And I'm guessing that 'peter' is the same as 'petere':

'peter' : ('Peter Eisentraut (?)', 'pete...@gmx.net'),


I've compared all branch heads and all tags with a cvs checkout. The
only differences are keyword expansion errors. Most commonly the RCS
version 1.1 is used in the resulting git repository, instead of
version 1.1.1.1. This also leads to getting dates wrong ($Date keyword).

I'm unsure on how to test Tom's requirement that every commit and its
log message is included in the resulting git repository. Feel free to
clone and inspect the mentioned git repository and propose improvements
on the cvs2git options used.

Aidan Van Dyk wrote:
 Yes, but the point is you want an exact replica of CVS, right?  Your
 git repo should have $PostgreSQL$ and the cvs export/checkout (you do
 use -kk, right?) should also have $PostgreSQL$.

No, I'm testing against cvs checkout, as that's what everybody is used to.

 But it's important, because on *some* files you *do* want expanded
 keywords (like the $OpenBSD ... Exp $.  One of the reasons pg CVS went
 to the $PostgreSQL$ keyword (I'm guessing) was so they could explicitly
 de-couple them from other keywords that they didn't want munging on.

I don't care half as much about the keyword expansion stuff - that's
doomed to disappear anyway.

What I'm much more interested in is correctness WRT historic contents,
i.e. that git log, git blame, etc.. deliver correct results. That's
certainly harder to check.

In my experience, cvs2svn (or cvs2git) does a pretty decent job at that,
even in case of some corruptions. Plus it offers lots of options to fine
tune the conversion, see the attached configuration I've used.

 So, I wouldn't consider any conversion good unless it had all these:
 
 As well as stuff like:
   parsecvs-master:src/backend/access/index/genam.c: *   $PostgreSQL$

I disagree here and find it more convenient for the git repository to
keep the old RCS versions - as in the source tarballs that got (and
still get) shipped. Just before switching over to git one can (and
should, IMO) remove these tags to avoid confusion.

Regards

Markus Wanner
# (Be in -*- mode: python; coding: utf-8 -*- mode.)

import re

from cvs2svn_lib import config
from cvs2svn_lib import changeset_database
from cvs2svn_lib.common import CVSTextDecoder
from cvs2svn_lib.log import Log
from cvs2svn_lib.project import Project
from cvs2svn_lib.git_revision_recorder import GitRevisionRecorder
from cvs2svn_lib.git_output_option import GitRevisionMarkWriter
from cvs2svn_lib.git_output_option import GitOutputOption
from cvs2svn_lib.revision_manager import NullRevisionRecorder
from cvs2svn_lib.revision_manager import NullRevisionExcluder
from cvs2svn_lib.fulltext_revision_recorder \
 import SimpleFulltextRevisionRecorderAdapter
from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader
from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader
from cvs2svn_lib.checkout_internal import InternalRevisionRecorder
from cvs2svn_lib.checkout_internal import InternalRevisionExcluder
from cvs2svn_lib.checkout_internal import InternalRevisionReader
from cvs2svn_lib.symbol_strategy import AllBranchRule
from cvs2svn_lib.symbol_strategy import AllTagRule
from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule
from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule
from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule
from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule
from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule
from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule
from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule
from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform
from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform
from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform
from cvs2svn_lib.property_setters import AutoPropsPropertySetter
from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter
from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter
from cvs2svn_lib.property_setters import CVSRevisionNumberSetter
from cvs2svn_lib.property_setters import DefaultEOLStyleSetter
from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter

Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 But at least it doesn't seem like anyone is seriously arguing that
 true serializability wouldn't be a nice feature, if hypothetically
 we had an agreed-upon implementation and a high-level developer with
 a lot of time on their hands.
 
If that's true, I think it represents a major shift in perspective on
this list.  Does everyone *really* agree with the above?
 
 - Is it feasible to think about implementing this with traditionally
 blocking locks?  Kevin seems to think it isn't, because it will suck
 too much.
 
I'm not sure it's without value to the project; I just don't know that
it would be worth using for us.  It seems to be accepted in some other
DBMS products.  Since some (like MS SQL Server) allow users to choose
snapshot isolation or blocking-based serializable transactions in
their MVCC implementation, it would be interesting to know how many
users have chosen the latter.  Has anyone seen numbers (or even have
anecdotal evidence) on this point?
 
 - Is it feasible to think about implementing this with SIREAD locks?
 
I'd be willing to bet that if we solved the predicate locking issue,
the rest of it would be minor by comparison.  But I am still trying to
get comfortable with the train of thought I got onto when responding
to Greg Stark's last email on the topic.
 
With blocking techniques you always have at least two transactions
involved, and you can pick between at least two, when you need to roll
something back.  With this new method, it is possible to discover the
dangerous structure which requires rollback when there is only one
participating transaction left active -- which might have done a lot
of work by that point.  It seems like a pretty significant weakness. 
Do others see that as a fatal flaw?
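For readers following along, the dangerous structure Kevin mentions can be sketched as a "pivot" transaction that has both an incoming and an outgoing rw-antidependency (T_in -rw-> T_pivot -rw-> T_out). This toy check is only a model of the condition, not the proposed implementation; the edge sets are invented:

```python
def has_dangerous_structure(rw_edges):
    """rw_edges: set of (reader, writer) rw-antidependency pairs."""
    sources = {a for (a, _) in rw_edges}
    targets = {b for (_, b) in rw_edges}
    # a pivot is any transaction that is both the target and the
    # source of an rw-antidependency edge
    return bool(sources & targets)

three_txns = {("T1", "T2"), ("T2", "T3")}  # T2 is the pivot
print(has_dangerous_structure(three_txns))      # True
print(has_dangerous_structure({("T1", "T2")}))  # False: any two alone pass
```

This also illustrates the earlier point that three transactions are often involved: strip any one edge and no pivot remains, so no rollback is forced.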
 
 - Why is this an all-or-nothing proposition?  Given the undeniable
 difficulty of getting large patches committed, tying the locking-of-
 existing-rows part of the solution to the predicate-locking part of
 the solution seems like a recipe for failure.
 
Agreed.  If we can get agreement on an approach, with a road map which
allows incremental progress, we might be able to contribute
programming for some parts, and might be able to draw in others for
reasonable chunks.  Requiring all-or-nothing seems to me to be the
same as a straight thumbs-down, for all practical purposes.
 
Of course, there's no point starting on coding for even an incremental
change without consensus on the type of issues we've been covering in
this thread.
 
-Kevin



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 6:27 PM, Markus Wanner mar...@bluegap.ch wrote:
 I'm not that eager on the justifiable by simple inspection requirement
 above. I don't think a DBA is commonly doing these inspections at all.

 I think a tool to measure abort rates per transaction (type) would serve
 the DBA better. Of course there may be false positives, but high abort
 rates should point out the problematic transactions pretty quickly. The
 DBA shouldn't need to care about rare serialization failures or their
 justifiability.

I don't think that's true. It might be true for OLTP transactions
where having to repeat the occasional transaction once or twice for no
reason just means a slower response time. Even there I fear it means
the DBA would never be able to guarantee his response time since there
will always be a chance the transaction will have to be repeated too
many times to fall within the guarantee.

But it's certainly insufficient in an OLAP or DSS environment where
transactions can take hours. If you can never know for sure that
you've written your transaction safely and it might randomly fail and
need to be retried any given day due to internal implementation issues
you can't predict then I would call the system just broken.

-- 
greg



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-01 Thread Tom Lane
Markus Wanner mar...@bluegap.ch writes:
 I'm missing the names and email addresses for these committers:

 'barry' : ('barry??', ''),

Barry Lind, formerly one of the JDBC bunch, been inactive for awhile

 'dennis' : ('Dennis??', ''),

I suppose this must be Dennis Björklund, but I didn't realize he
used to be a committer.

 'inoue' : ('inoue??', ''),

Hiroshi Inoue, still active, but ODBC is not part of core anymore

 'jurka' : ('jurka??', ''),

Kris Jurka, still active, but JDBC is not part of core anymore

 'pjw' : ('pjw??', ''),

Philip Warner, inactive (still reads the lists though)

 And I'm guessing that 'peter' is the same as 'petere':

 'peter' : ('Peter Eisentraut (?)', 'pete...@gmx.net'),

No, that would be Peter Mount, also a retired JDBC hacker.
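Putting these answers back into Markus's mapping gives something like the following; the archive elided the email addresses, so they are left blank here rather than guessed:

```python
# Resolved committer map, assembled from the answers above.
committers = {
    'barry':  ('Barry Lind', ''),        # formerly JDBC, inactive
    'dennis': ('Dennis Björklund', ''),  # commit privs were for translations
    'inoue':  ('Hiroshi Inoue', ''),     # ODBC, no longer part of core
    'jurka':  ('Kris Jurka', ''),        # JDBC, no longer part of core
    'pjw':    ('Philip Warner', ''),     # inactive, still reads the lists
    'peter':  ('Peter Mount', ''),       # retired JDBC hacker, not 'petere'
}
```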

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Josh Berkus

Tom,

Let me start this out by voting the things I think we can drop until 8.5:

* gettext plurals patch needs some polishing
-- revert patch, save for 8.5

#  adjust information_schema precision and scale fields?
-- save for 8.5

# instrument the Windows shared memory reattachment problem?
-- as much as I'd like to do this, the solution could be as bad as the 
problem; we'd need more testing time for new instrumentation.  Will have 
to deal with via testing patched versions.


# tweak the ordering heuristics for parallel pg_restore?
-- beta2 version is good enough; further improvements should be saved 
for 8.5.


# change or at least redocument from_collapse_limit?
-- should be doc patch only.  Robert Haas should write it.

# revisit increase of default_statistics_target?
-- No.  Still appears to be an artifact of DBT2.  Can't reproduce the issue 
using pgbench, or any other test.  Still investigating.



Other questions:

#  cost_nestloop and cost_hashjoin are broken for SEMI and ANTI joins

* tgl says: I think this is mostly dealt with but it could use 
performance testing.


Mark?  Jignesh?   Can you test this?





--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Robert Haas robertmh...@gmail.com wrote: 
 But at least it doesn't seem like anyone is seriously arguing that
 true serializability wouldn't be a nice feature, if hypothetically
 we had an agreed-upon implementation and a high-level developer with
 a lot of time on their hands.
 
 If that's true, I think it represents a major shift in perspective on
 this list.  Does everyone *really* agree with the above?

I think we'd all love to have it, if we could get it with reasonable
performance and without an undue amount of complexity.  What you're
up against is a lot of skepticism that that's going to be possible.
Which then translates into wondering whether partial solutions are
worthwhile, if they won't ever get extended to full solutions.

(So, in that sense, discussing possible implementations now is not
premature --- we need to calibrate what we believe is possible.)

regards, tom lane



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote: 
 
 But it's certainly insufficient in an OLAP or DSS environment where
 transactions can take hours. If you can never know for sure that
 you've written your transaction safely and it might randomly fail
 and need to be retried any given day due to internal implementation
 issues you can't predict then I would call the system just broken.
 
I absolutely guarantee that it means that a transaction like that
should not be run at the SERIALIZABLE transaction isolation level
without some other protection.  I don't know that I would say the
system is broken when that's true; it seems to me more a matter of
having a tool in your toolbox which isn't the right one for every job.
 
The question is, is it an unacceptably risky foot-gun?
 
-Kevin



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Joshua D. Drake
On Mon, 2009-06-01 at 11:09 -0700, Josh Berkus wrote:
 Tom,

 #  adjust information_schema precision and scale fields?
 -- save for 8.5

I read this thread. It seems that, for the changes we can make, we
should just make them. The actual amount of change is very nominal.

Joshua D. Drake


-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 #  cost_nestloop and cost_hashjoin are broken for SEMI and ANTI joins
  * tgl says: I think this is mostly dealt with but it could use 
 performance testing.

 Mark?  Jignesh?   Can you test this?

Actually I'm hoping that Kevin Grittner will have something to report
on that one soon --- he's our best test case for complicated EXISTS
queries.  I doubt that the standard benchmarks exercise this code
meaningfully at all.

regards, tom lane



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Josh Berkus

Kevin,


I'm not sure it's without value to the project; I just don't know that
it would be worth using for us.  It seems to be accepted in some other
DBMS products.  Since some (like MS SQL Server) allow users to choose
snapshot isolation or blocking-based serializable transactions in
their MVCC implementation, it would be interesting to know how many
users have chosen the latter.  Has anyone seen numbers (or even have
anecdotal evidence) on this point?


This approach allowed MSSQL to clean up on TPCE; to date their 
performance on that benchmark is so much better than anyone else's that 
nobody else wants to publish.


So, at least theoretically, anyone who had a traffic mix similar to TPCE 
would benefit.  Particularly, some long-running serializable 
transactions thrown into a mix of Read Committed and Repeatable Read 
transactions, for a stored procedure driven application.


In the field, we're not going to see a lot of requests for this because 
most applications that complex run in Java middleware with pessimistic 
locking.  To the extent, though, that we want to promote PostgreSQL as 
a 'better development platform' for transactional applications, it might 
be beneficial to support more sophisticated serializability.


Besides, I'd love to beat Microsoft on TPCE.  ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-01 Thread Alvaro Herrera
Tom Lane wrote:
 Markus Wanner mar...@bluegap.ch writes:

  'dennis' : ('Dennis??', ''),
 
 I suppose this must be Dennis Björklund, but I didn't realize he
 used to be a committer.

IIRC he was given commit privs for translation files.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-01 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 I suppose this must be Dennis Björklund, but I didn't realize he
 used to be a committer.

 IIRC he was given commit privs for translation files.

Ah, right, that does ring a bell now.

BTW, Markus: you do realize thomas is not me but Tom Lockhart?

regards, tom lane



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 7:24 PM, Josh Berkus j...@agliodbs.com wrote:
  Since some (like MS SQL Server) allow users to choose
 snapshot isolation or blocking-based serializable transactions in
 their MVCC implementation

 This approach allowed MSSQL to clean up on TPCE; to date their performance
 on that benchmark is so much better than anyone else nobody else wants to
 publish.

Are you sure you aren't thinking of some other feature? An
implementation of Serializable transactions isn't going to suddenly
make MSSQL faster than Oracle which uses snapshots anyways.

From what I remember TPC-E actually spends most of its energy testing
things like check constraints, referential integrity checks, and
complex queries. What you describe is possible, but it seems more
likely to be due to some kind of optimization like materialized views
or cached query results or something like that.

-- 
greg



Re: [HACKERS] search_path improvements

2009-06-01 Thread Alvaro Herrera
Josh Berkus wrote:

 Well I don't mind push but I still think pop is an error. What you
 really want to do is restore it to the value you started with. You
 don't want to remove the last element since that may not be the
 element you added. Some function you called may have added an extra
 element on the head.

 Yeah, pop is a misnomer; what I'd want is  
 search_path_del(search_path,'admin') ... that is, a way to remove a  
 specific schema from the list.

Except that del shouldn't delete if your push didn't add it because
it was already present.  So you actually want some sort of refcounting
there somehow.
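A sketch of the refcounting Alvaro means, modeling search_path as a plain list; the API names and semantics are invented, not a proposal for the actual GUC machinery:

```python
from collections import Counter

class SearchPath:
    """push() bumps a count; delete() only drops the schema once every
    push (and the original membership) has been accounted for."""

    def __init__(self, schemas):
        self.schemas = list(schemas)
        self.refs = Counter(schemas)

    def push(self, schema):
        self.refs[schema] += 1
        if schema not in self.schemas:
            self.schemas.insert(0, schema)

    def delete(self, schema):
        if self.refs[schema] == 0:
            return  # never present: nothing to do
        self.refs[schema] -= 1
        if self.refs[schema] == 0:
            self.schemas.remove(schema)

sp = SearchPath(["admin", "public"])
sp.push("admin")                    # already present, refcount now 2
sp.delete("admin")                  # undoes our push; 'admin' survives
still_there = "admin" in sp.schemas
sp.delete("admin")                  # last reference gone; 'admin' removed
```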

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] search_path improvements

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 7:57 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Josh Berkus wrote:

 Well I don't mind push but I still think pop is an error. What you
 really want to do is restore it to the value you started with. You
 don't want to remove the last element since that may not be the
 element you added. Some function you called may have added an extra
 element on the head.

 Yeah, pop is a misnomer; what I'd want is
 search_path_del(search_path,'admin') ... that is, a way to remove a
 specific schema from the list.

 Except that del shouldn't delete if your push didn't add it because
 it was already present.  So you actually want some sort of refcounting
 there somehow.

As I said earlier I doubt pop or delete is ever going to actually
be what you want. I suspect you're far more likely to want to restore
it to what it was before you started altering it.

As support I'll point out this is what our C API has. There's no
shortcut to strip out a single element of the path, but the normal
calling pattern is to set aside a copy of the old path, modify it in
some way -- often adding a schema to the head -- then restore the old path.

Note that you may want to make other modifications such as adding
several paths -- it would suck to have to hard-code those twice, once
to add and once to remove. Or remove a search path element and then
later restore it. Or for that matter to replace the whole search path
wholesale temporarily...
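The set-aside-and-restore pattern Greg describes could be phrased as a scoped override; here search_path is a plain Python list standing in for the GUC, and the construct's name is invented:

```python
from contextlib import contextmanager

search_path = ["public"]

@contextmanager
def override_search_path(new_path):
    global search_path
    saved = search_path            # set aside a copy of the old path
    search_path = list(new_path)   # install the replacement wholesale
    try:
        yield
    finally:
        search_path = saved        # restore exactly what we started with

with override_search_path(["admin", "public"]):
    inside = list(search_path)     # -> ['admin', 'public']
# no pop/del bookkeeping needed: the old value is simply back
```

Because the whole value is saved and restored, it doesn't matter whether code inside the scope pushed extra schemas, removed one, or replaced the path entirely.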




-- 
greg



Re: [HACKERS] search_path vs extensions

2009-06-01 Thread Alvaro Herrera
Tom Lane escribió:

 An
 alternative was to not have BEGIN/END but instead a GUC variable that
 you can SET to the name of the extension currently being added to.
 (The main advantage of that is that the state isn't hidden, but easily
 checkable via existing commands.)

With the CREATE EXTENSION you could still have a read-only GUC var
current_extension or so.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Heikki Linnakangas

Aidan Van Dyk wrote:

* Heikki Linnakangas heikki.linnakan...@enterprisedb.com [090601 10:56]:

Tom Lane wrote:

Fujii Masao masao.fu...@gmail.com writes:

pg_standby can use ln command to restore an archived file,
which might destroy the archived file as follows.

Does it matter?  pg_standby's source area wouldn't normally be an
archive in the real sense of the word, it's just a temporary staging
area between master and slave.  (If it were being used as a real
archive, keeping it on the same disk as the live slave seems pretty
foolish anyway, so the case wouldn't arise.)
It seems perfectly sane to source pg_standby directly from the archive  
to me. And we're talking about symbolic linking, so the archive  
directory might well be on an NFS mount.


I would expect that any archive directly available would at least be RO
to the postgres slave... But


Me too.

I wonder if we should just remove the symlink option from pg_standby. 
Does anyone use it? Is there a meaningful performance difference?



Something like this would stop the symlink being renamed... Not portable, but 
probably portable
across platforms that have symlinks...
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c


That seems reasonable as well.
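The hazard under discussion is easy to reproduce outside PostgreSQL on any POSIX system: renaming a symlink moves only the link, and a later write through the new name goes straight to the archived original. Only the file names below mimic pg_standby's staging convention; none of this touches PostgreSQL itself:

```python
import os
import tempfile

# An "archive" file, a staging symlink to it, a rename of the symlink,
# and a write through the renamed link.
with tempfile.TemporaryDirectory() as d:
    archive = os.path.join(d, "archived_segment")
    with open(archive, "w") as f:
        f.write("precious archived WAL data")

    staged = os.path.join(d, "RECOVERYXLOG")
    os.symlink(archive, staged)     # roughly what pg_standby -l does

    recycled = os.path.join(d, "000000010000000000000042")
    os.rename(staged, recycled)     # rename moves the link, not the data

    with open(recycled, "w") as f:  # recycling writes through the link...
        f.write("recycled segment")  # ...clobbering the archived copy

    with open(archive) as f:
        archived_now = f.read()

print(archived_now)  # 'recycled segment' -- the archive copy is destroyed
```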

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-06-01 Thread Kevin Field
On May 29, 12:43 pm, Kevin Field kevinjamesfi...@gmail.com wrote:
 On May 29, 11:48 am, Kevin Field kevinjamesfi...@gmail.com wrote:



  On May 29, 11:35 am, robertmh...@gmail.com (Robert Haas) wrote:

   On Fri, May 29, 2009 at 7:59 AM, Kevin Field kevinjamesfi...@gmail.com 
   wrote:
On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
On May 28, 2009, at 12:53 PM, Kevin Field wrote:

 Can pgTap check for a regex instead if just a string?

 That's the other option, if the pgTAP author is willing...if the
 SQLSTATE thing doesn't work out I guess we'll have to go down that
 road.

Patches welcome. ;-)

   http://github.com/theory/pgtap/tree/master/

I'm getting a new version ready to release as I type.

Thanks, great to know.  :)  Although, I do think changing plperl is
the more proper option, so I'm going to try there first...

   It seems to me that removing line numbers from PL/perl error messages
   is not a good solution to anything.  Line numbers are extremely useful
   for debugging purposes, and getting rid of them because one particular
   testing framework doesn't know how to use regular expressions is
   solving the wrong problem.

  You're right, but that's not what I'm proposing...

   I'm also a bit confused because your original post had a line number
   in the PL/pgsql output, too, just formatted slightly differently.  Why
   doesn't that one cause a problem?

  The difference is, in PL/pgsql they're in the CONTEXT: line, whereas
  in plperl they're in the error line.  This is inconsistent; if we fix
  it, we don't need to add kludge to pgTAP.

  But later in the thread the desired fix became not changing perl but
  instead making a way to report error codes from plperl, which is what
  I'm attempting to do with my rusty C skills soon.  plperl should have
  ereport() *anyway*, as I believe Tom had insinuated.

 Hmm, I'm rustier than I thought.  I might need some help with this
 later.

Actually, I'm not sure I'll be able to be of any use on this after
all.  Would someone be able to add plperl ereport to the todo list for
me at least?



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I wonder if we should just remove the symlink option from pg_standby. 

I was considering suggesting that too...

regards, tom lane



Re: [HACKERS] search_path improvements

2009-06-01 Thread Sam Mason
On Mon, Jun 01, 2009 at 08:05:33PM +0100, Greg Stark wrote:
 As I said earlier I doubt pop or delete is ever going to actually
 be what you want. I suspect you're far more likely to want to restore
 it to what it was before you started altering it.
 
 As support I'll point out this is what our C api has. There's no short
 cut to strip out a single element of the path but the normal calling
 pattern is to set aside a copy of the old path, add modify it in some
 way -- often adding a schema to the head -- then restore the old path.

Without reading much of what's been said here (I've read maybe ten of
the posts in this thread) I'll say it sounds a lot like lexical closures
are needed.  Code is free to define and generally use whatever is
in its closure, but can't affect what's outside it unless explicitly
granted.

I saw these mentioned in another post by David Wheeler[1], but my client
says it wasn't directly responded to.  He calls it "lexical scoping"
but I think "closing over the environment" seems more suitable---mainly
because it'll go wrong less often in the presence of functions defined
as SECURITY DEFINER.
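As a rough sketch of what closing over the environment could look like with existing PostgreSQL machinery, a function can pin its search_path at definition time (real CREATE FUNCTION syntax; the schema, table, and function names are invented for illustration):

```sql
-- The function resolves names against its own fixed path, regardless of
-- the caller's setting; the caller's search_path is untouched on return.
CREATE FUNCTION audit.log_event(msg text) RETURNS void AS $$
    INSERT INTO events (message) VALUES ($1);  -- found via schema "audit"
$$ LANGUAGE sql
SET search_path = audit, pg_temp;
```

This is only an approximation of a closure, but it addresses the SECURITY DEFINER hazard, since the effective path can no longer be supplied by the caller.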

-- 
  Sam  http://samason.me.uk/

 [1] 
http://archives.postgresql.org/message-id/5a1fe6b1-9857-454c-a385-ba061ded3...@kineticode.com



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Let me start this out by voting the things I think we can drop until 8.5:

 * gettext plurals patch needs some polishing
 -- revert patch, save for 8.5

Peter might think differently about that ;-).  My problem with it is
that we don't seem to have a polished API for how code uses the feature.
I wouldn't mind so much except that once we release we are going to be
stuck with the API indefinitely --- the usage will propagate into
third-party code very quickly and we won't want to break that code by
changing it.

Personally I'd prefer to fix it rather than revert it.

 #  adjust information_schema precision and scale fields?
 -- save for 8.5

No objection here.  I mainly wanted to make sure the issue doesn't get
forgotten.

 # instrument the Windows shared memory reattachment problem?
 -- as much as I'd like to do this, the solution could be as bad as the 
 problem; we'd need more testing time for new instrumentation.  Will have 
 to deal with via testing patched versions.

Yeah.  That was put on the list a month ago, when there was still plenty
of time to do something about it; but since we missed getting it into
beta2 I think it will have to wait.

 # tweak the ordering heuristics for parallel pg_restore?
 -- beta2 version is good enough; further improvements should be saved 
 for 8.5.

OK, particularly if no one is in a position to test it right away.

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 2:09 PM, Josh Berkus j...@agliodbs.com wrote:
 Tom,

 Let me start this out by voting the things I think we can drop until 8.5:

 * gettext plurals patch needs some polishing
 -- revert patch, save for 8.5

 #  adjust information_schema precision and scale fields?
 -- save for 8.5

 # instrument the Windows shared memory reattachment problem?
 -- as much as I'd like to do this, the solution could be as bad as the
 problem; we'd need more testing time for new instrumentation.  Will have to
 deal with via testing patched versions.

 # tweak the ordering heuristics for parallel pg_restore?
 -- beta2 version is good enough; further improvements should be saved for
 8.5.

 # change or at least redocument from_collapse_limit?
 -- should be doc patch only.  Robert Haas should write it.

 # revisit increase of default_statistics_target?
 -- No.  Still appears to be artifact of DBT2.  Can't reproduce the issue
 using pgbench, or any other test.  Still investigating.


 Other questions:

 #  cost_nestloop and cost_hashjoin are broken for SEMI and ANTI joins

    * tgl says: I think this is mostly dealt with but it could use
 performance testing.

 Mark?  Jignesh?   Can you test this?

+1 to all of these.  Will send extremely short doc patch tonight.  I
recommend we create
http://wiki.postgresql.org/wiki/PostgreSQL_8.5_Open_Items and start
bumping things that can't be completed for 8.4.

Basically, my opinion is that if it's not a regression, it's too late
to work on it now.  We should ship the release when we're confident
that the new features have had adequate testing and we've squashed all
the regressions.

...Robert



Re: [HACKERS] search_path improvements

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 3:27 PM, Sam Mason s...@samason.me.uk wrote:
 On Mon, Jun 01, 2009 at 08:05:33PM +0100, Greg Stark wrote:
 As I said earlier I doubt pop or delete is ever going to actually
 be what you want. I suspect you're far more likely to want to restore
 it to what it was before you started altering it.

 As support I'll point out this is what our C api has. There's no short
 cut to strip out a single element of the path but the normal calling
 pattern is to set aside a copy of the old path, add modify it in some
 way -- often adding a schema to the head -- then restore the old path.

 Without reading much of what's been said here (I've read maybe ten of
 the posts in this thread) I'll say it sounds a lot like lexical closures
 are needed.  Code is free to define and generally use whatever is
 in its closure, but can't affect what's outside it unless explicitly
 granted.

 I saw these mentioned in another post by David Wheeler[1], but my client
 says it wasn't directly responded to.  He calls it "lexical scoping"
 but I think "closing over the environment" seems more suitable---mainly
 because it'll go wrong less often in the presence of functions defined
 as SECURITY DEFINER.

+1.

...Robert



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I recommend we create
 http://wiki.postgresql.org/wiki/PostgreSQL_8.5_Open_Items and start
 bumping things that can't be completed for 8.4.

Why not just the regular TODO list?

Stuff that has plausible patches attached can go directly to the
CommitFest_2009-First page, but otherwise I don't see a need for
special treatment.  If we kick something off the 8.4 open items
list we're essentially saying it doesn't deserve special treatment.

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I recommend we create
 http://wiki.postgresql.org/wiki/PostgreSQL_8.5_Open_Items and start
 bumping things that can't be completed for 8.4.

 Why not just the regular TODO list?

 Stuff that has plausible patches attached can go directly to the
 CommitFest_2009-First page, but otherwise I don't see a need for
 special treatment.  If we kick something off the 8.4 open items
 list we're essentially saying it doesn't deserve special treatment.

That would be fine too...

...Robert



Re: [HACKERS] from_collapse_limit vs. geqo_threshold

2009-06-01 Thread Selena Deckelmann

In the spirit of helping wrap up 8.4 todo items...

Robert Haas wrote:

On Mon, May 25, 2009 at 6:15 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

Now I'm still not exactly happy with GEQO, but it's surely a lot better
than it was in the fall of 2000.  So on the whole it does seem that the
current relationships between from_collapse_limit, join_collapse_limit,
and geqo_threshold are based on obsolete information and should be
revisited.  I don't have any data at hand to suggest specific new
default values, though.


For 8.4, I'd be happy to just improve the documentation.  I think this
sentence could just be deleted from the section on
from_collapse_limit:

It is usually wise to keep this less than <xref linkend="guc-geqo-threshold">.

We could put some other explanation in place of that sentence, but I'm
not exactly sure what that explanation would say.  I guess the point
is that setting from_collapse_limit > geqo_threshold may delay GEQO
planning considerably in the face of complex subqueries, because
pulling up subqueries increases the size of the FROM list (I think).
That could be good if you want your query plans to be more
deterministic, but there's no guarantee they'll be good. Setting
from_collapse_limit > geqo_threshold is basically saying that the
standard planner will always have subqueries pulled up, so
from_collapse_limit should be based on what the pain point will be for
GEQO.
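For concreteness, the interaction involves these two GUCs (real settings; the defaults shown are the 8.4-era values, and the SET is purely illustrative):

```sql
SHOW from_collapse_limit;  -- 8 by default
SHOW geqo_threshold;       -- 12 by default
-- With from_collapse_limit raised past geqo_threshold, subqueries keep
-- being pulled up until the FROM list can cross geqo_threshold, handing
-- the whole join problem to GEQO:
SET from_collapse_limit = 20;
```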



My vote would be to provide some information.

Suggested revision of Robert's prose:

Because genetic query optimization may be triggered, increasing
from_collapse_limit should be considered relative to <xref
linkend="guc-geqo-threshold">.


-selena


--
Selena Deckelmann
End Point Corporation
sel...@endpoint.com
503-282-2512



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote:
 
 If you can never know for sure that you've written your transaction
 safely
 
Whoa!  I just noticed this phrase on a re-read.  I think there might
be some misunderstanding here.
 
You can be sure you've written your transaction safely just as soon as
your COMMIT returns without error.  Perhaps you're getting confused
because under the non-blocking approach, each transaction's read locks
(if any) continue to be tracked until all concurrent transactions
terminate in order to determine if some *other* transaction might need
to be rolled back.
 
-Kevin



[HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-01 Thread Zdenek Kotala
While integrating gcc 4.2 into Solaris, my colleague hit the following
problem with PostgreSQL compilation:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6845982

cd 
/builds/sfw-fixes/usr/src/cmd/postgres/postgresql-8.2/postgresql-8.2.13/src/pl/plperl
+ /ws/onnv-tools/SUNWspro/SS12/bin/cc -xO3 -m32 -xchip=pentium -xspace -Xa 
-xildoff -xc99=all -xc99=none -xCC -KPIC -I. -I../../../src/include 
-I/builds/sfw-fixes/proto/root_i386/usr/include 
-I/builds/sfw-fixes/proto/root_i386/usr/sfw/include -I/usr/sfw/include 
-I/usr/include/kerberosv5 -I/usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE -c -o 
plperl.o plperl.c
/usr/include/sys/list.h, line 52: identifier redeclared: list_head
current : function(pointer to struct list {unsigned int list_size, 
unsigned int list_offset, struct list_node {..} list_head}) returning pointer 
to void
previous: function(pointer to struct List {enum NodeTag 
{T_TIDBitmap(902), (blah blah)

At this moment there is no clear conclusion on how to fix it in the Solaris
system headers to satisfy all sides.  :(

My idea is to rename list_head to pg_list_head (and other functions 
analogously) to avoid name conflict.

Comments, better ideas?

Thanks Zdenek








Re: [HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 3:57 PM, Zdenek Kotala zdenek.kot...@sun.com wrote:
 While integrating gcc 4.2 into Solaris, my colleague hit the following
 problem with PostgreSQL compilation:

 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6845982

 cd 
 /builds/sfw-fixes/usr/src/cmd/postgres/postgresql-8.2/postgresql-8.2.13/src/pl/plperl
 + /ws/onnv-tools/SUNWspro/SS12/bin/cc -xO3 -m32 -xchip=pentium -xspace -Xa 
 -xildoff -xc99=all -xc99=none -xCC -KPIC -I. -I../../../src/include 
 -I/builds/sfw-fixes/proto/root_i386/usr/include 
 -I/builds/sfw-fixes/proto/root_i386/usr/sfw/include -I/usr/sfw/include 
 -I/usr/include/kerberosv5 -I/usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE -c 
 -o plperl.o plperl.c
 /usr/include/sys/list.h, line 52: identifier redeclared: list_head
        current : function(pointer to struct list {unsigned int list_size, 
 unsigned int list_offset, struct list_node {..} list_head}) returning pointer 
 to void
        previous: function(pointer to struct List {enum NodeTag 
 {T_TIDBitmap(902), (blah blah)

 At this moment there is no clear conclusion on how to fix it in the Solaris
 system headers to satisfy all sides.  :(

 My idea is to rename list_head to pg_list_head (and other functions 
 analogously) to avoid name conflict.

        Comments, better ideas?

Uh... isn't it rather bad form for the operating system headers to be
defining commonly used type names like list?  I'd be very surprised
if PostgreSQL is the only application that is going to break on this.

...Robert



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 8:55 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:

 Whoa!  I just noticed this phrase on a re-read.  I think there might
 be some misunderstanding here.

 You can be sure you've written your transaction safely just as soon as
 your COMMIT returns without error.

I think we have different definitions of safely. You only know that
you got away with it *this time* when the commit returns without
error.

I'm concerned with whether you can be sure that the 999th time you run
it the database won't randomly decide to declare a serialization
failure for reasons you couldn't predict were possible.

-- 
greg



Re: [HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-01 Thread Tom Lane
Zdenek Kotala zdenek.kot...@sun.com writes:
 My idea is to rename list_head to pg_list_head (and other functions 
 analogously) to avoid name conflict.

There is zero chance of that happening.  We have thousands of references
to those functions in the core code, and who knows how many more in
third-party code we don't control.

What is sys/list.h, and why is it being imported by the Perl headers?
Can we do something like #define list_head solaris_list_head around
the problematic #include?  (Which one is that, anyway?)

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Magnus Hagander
Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 # instrument the Windows shared memory reattachment problem?
 -- as much as I'd like to do this, the solution could be as bad as the 
 problem; we'd need more testing time for new instrumentation.  Will have 
 to deal with via testing patched versions.
 
 Yeah.  That was put on the list a month ago, when there was still plenty
 of time to do something about it; but since we missed getting it into
 beta2 I think it will have to wait.

Agreed. I did some mucking around with it, but the parts I found
reasonably easy to do were also reasonably useless :(

//Magnus




Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 4:08 PM, Greg Stark st...@enterprisedb.com wrote:
 On Mon, Jun 1, 2009 at 8:55 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:

 Whoa!  I just noticed this phrase on a re-read.  I think there might
 be some misunderstanding here.

 You can be sure you've written your transaction safely just as soon as
 your COMMIT returns without error.

 I think we have different definitions of safely. You only know that
 you got away with it *this time* when the commit returns without
 error.

 I'm concerned with whether you can be sure that the 999th time you run
 it the database won't randomly decide to declare a serialization
 failure for reasons you couldn't predict were possible.

Aren't serialization failures of any sort unpredictable, on any database?

...Robert



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote:
 On Mon, Jun 1, 2009 at 8:55 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 
 You can be sure you've written your transaction safely just as soon
 as your COMMIT returns without error.
 
 I think we have different definitions of safely. You only know
 that you got away with it *this time* when the commit returns
 without error.
 
 I'm concerned with whether you can be sure that the 999th time you
 run it the database won't randomly decide to declare a serialization
 failure for reasons you couldn't predict were possible.
 
Now you're questioning whether SERIALIZABLE transaction isolation
level is useful.  Probably not for everyone, but definitely for some.
 
As stated before, the trade-off is that you don't need to know what
all the transactions look like or which ones might be run concurrently
in order to guarantee that you avoid anomalies; but you need to be
able to handle the rollback of any serializable transaction.  Nothing
in the proposed techniques would create problems like you describe in
transactions running at other isolation levels, or preclude taking out
explicit locks to prevent this where you need additional guarantees --
like needing to be sure that a transaction won't be rolled back with a
serialization failure after 10 hours.
 
-Kevin



Re: [HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-01 Thread Zdenek Kotala

Robert Haas wrote on Mon, 01. 06. 2009 at 16:03 -0400:
 On Mon, Jun 1, 2009 at 3:57 PM, Zdenek Kotala zdenek.kot...@sun.com wrote:
  While integrating gcc 4.2 into Solaris, my colleague hit the following
  problem with PostgreSQL compilation:
 
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6845982
 
  cd 
  /builds/sfw-fixes/usr/src/cmd/postgres/postgresql-8.2/postgresql-8.2.13/src/pl/plperl
  + /ws/onnv-tools/SUNWspro/SS12/bin/cc -xO3 -m32 -xchip=pentium -xspace -Xa 
  -xildoff -xc99=all -xc99=none -xCC -KPIC -I. -I../../../src/include 
  -I/builds/sfw-fixes/proto/root_i386/usr/include 
  -I/builds/sfw-fixes/proto/root_i386/usr/sfw/include -I/usr/sfw/include 
  -I/usr/include/kerberosv5 -I/usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE 
  -c -o plperl.o plperl.c
  /usr/include/sys/list.h, line 52: identifier redeclared: list_head
 current : function(pointer to struct list {unsigned int list_size, 
  unsigned int list_offset, struct list_node {..} list_head}) returning 
  pointer to void
 previous: function(pointer to struct List {enum NodeTag 
  {T_TIDBitmap(902), (blah blah)
 
  At this moment there is no clear conclusion on how to fix it in the Solaris
  system headers to satisfy all sides.  :(
 
  My idea is to rename list_head to pg_list_head (and other functions 
  analogously) to avoid name conflict.
 
 Comments, better ideas?
 
 Uh... isn't it rather bad form for the operating system headers to be
 defining commonly used type names like list?  

It has been in Solaris since 2003, as the kernel's linked-list
structure. Looking at it more closely, it seems the problem is more in
Perl, which includes some sys/ headers; that is probably not good
practice. :(

 I'd be very surprised
 if PostgreSQL is the only application that is going to break on this.

At this moment I know only about PostgreSQL. As I understand it now, it
appears only when the source is compiled with Perl.

I will look into it more tomorrow.

thanks Zdenek





[HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Josh Berkus

All,

I just realized that even if you do this:

table foo (
id  serial,
bar varchar(200)
)

ALTER TABLE foo ALTER COLUMN bar TYPE VARCHAR(1000)

... it triggers a heap & index rebuild, even though it's completely 
unnecessary.  Is this a special case of VARCHAR, or are there other 
types where we should be allowing typemod changes without rebuilding?
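One way to confirm the rebuild is to watch the table's relfilenode, which changes whenever the heap is rewritten (pg_class.relfilenode is a real catalog column; the table is the one from the example above):

```sql
SELECT relfilenode FROM pg_class WHERE relname = 'foo';
ALTER TABLE foo ALTER COLUMN bar TYPE varchar(1000);
SELECT relfilenode FROM pg_class WHERE relname = 'foo';
-- a different relfilenode after the ALTER means the heap was rewritten
```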


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Tom Lane wrote:
 # instrument the Windows shared memory reattachment problem?
 
 Yeah.  That was put on the list a month ago, when there was still plenty
 of time to do something about it; but since we missed getting it into
 beta2 I think it will have to wait.

 Agreed. I did some mucking around with it, but the parts I found
 reasonably easy to do were also reasonably useless :(

OK, pushed that item to TODO.

regards, tom lane



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 I just realized that even if you do this:
 ALTER TABLE foo ALTER COLUMN bar TYPE VARCHAR(1000)
 ... it triggers a heap & index rebuild, even though it's completely 
 unnecessary.

Yeah, this has been discussed before; I think it's even in the TODO
list.  The stumbling block has been to identify a reasonably clean way
of determining which datatype changes don't require a scan.

 Is this a special case of VARCHAR, or are there other 
 types where we should be allowing typemod changes without rebuilding?

There are any number of other cases where it's potentially interesting.
Consider:

* NUMERIC -> NUMERIC with a larger precision and/or scale

* VARBIT lengthening

* TIMESTAMP precision increase

* VARCHAR(anything) -> TEXT

and that's without considering the potential uses for user-defined
types.  Now that we allow user-defined types to have usable typmods,
I'm sure there will be applications for them too.  There are also
cases where a change might require a scan to ensure a new constraint
is met, but not a rewrite (eg, reducing the max length of VARCHAR).
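Spelled out as DDL, the cases above look something like this (table and column names are invented; as of 8.4 each statement forces a rewrite even though the stored values would remain valid):

```sql
ALTER TABLE t ALTER COLUMN amt  TYPE numeric(12,4);  -- wider precision/scale
ALTER TABLE t ALTER COLUMN bits TYPE varbit(64);     -- VARBIT lengthening
ALTER TABLE t ALTER COLUMN ts   TYPE timestamp(6);   -- precision increase
ALTER TABLE t ALTER COLUMN name TYPE text;           -- VARCHAR(n) -> TEXT
-- and the scan-but-no-rewrite case:
ALTER TABLE t ALTER COLUMN name TYPE varchar(10);    -- narrowing needs a check
```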

We could certainly put in a quick hack that just covered a few of the
cases for built-in types, but it's not very pleasing ...

regards, tom lane



Re: [HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-01 Thread Zdenek Kotala

Tom Lane wrote on Mon, 01. 06. 2009 at 16:09 -0400:
 Zdenek Kotala zdenek.kot...@sun.com writes:


 What is sys/list.h, and why is it being imported by the Perl headers?

It seems the problem is with Perl. It includes sys/mode.h. The change
for gcc 4.2 is that mode.h now includes vnode.h, which in turn pulls in
sys/list.h, the generic list for kernel structures.

 Can we do something like #define list_head solaris_list_head around 
 the problematic #include?  (Which one is that, anyway?)

I will try it. There is the following hack:

CFLAGS=-D_SYS_VNODE_H

which disables vnode.h and breaks the include chain.

I need to install a fresh nightly build of Solaris to test it on a live
system.

Thanks Zdenek




Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 This approach allowed MSSQL to clean up on TPC-E; to date their
 performance on that benchmark is so much better than anyone else's
 that nobody else wants to publish.
 
Since they use a compatibility level setting to control whether a
request for a serializable transaction gives you snapshot isolation or
a true serializable transaction, you have to be careful interpreting
results like that.  Are you sure which one they used for this
benchmark?
 
-Kevin



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Jeff Davis
On Mon, 2009-06-01 at 13:26 -0700, Josh Berkus wrote:
 All,
 
 I just realized that even if you do this:
 
 table foo (
   id  serial,
   bar varchar(200)
 )
 
 ALTER TABLE foo ALTER COLUMN bar TYPE VARCHAR(1000)
 
  ... it triggers a heap & index rebuild, even though it's completely 
 unnecessary.  Is this a special case of VARCHAR, or are there other 
 types where we should be allowing typemod changes without rebuilding?

NUMERIC(x, y) comes to mind, although that might be a more dangerous
case. If you turn a NUMERIC(5,0) into a NUMERIC(5,1), then '1.2' may be
stored as either '1' or '1.2' depending on whether you did the insert
before or after the change. That's because, with NUMERIC, it's not
really a constraint, but a rule for rounding.
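A quick sketch of that hazard (hypothetical table; the rounding is ordinary NUMERIC coercion on assignment, and the comments assume the proposed no-rewrite optimization):

```sql
CREATE TABLE t (n numeric(5,0));
INSERT INTO t VALUES (1.2);                      -- rounded: stored as 1
ALTER TABLE t ALTER COLUMN n TYPE numeric(5,1);  -- existing rows untouched
INSERT INTO t VALUES (1.2);                      -- stored as 1.2
-- the same literal now appears as 1 and 1.2, depending on insert time
```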

There may be other interesting cases involving constraints. For
instance, if you have CHECK(i < 200), you should be able to add CHECK(i
< 1000) without an exclusive lock or recheck. Then, with an exclusive
lock, you can remove the original tighter constraint, but at least it
wouldn't have to recheck the entire table.

Not sure how much effort that is worth -- VARCHAR and NUMERIC typmods
are probably more common problems and easier to fix.

Regards,
Jeff Davis




Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 9:24 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I'm concerned with whether you can be sure that the 999th time you
 run it the database won't randomly decide to declare a serialization
 failure for reasons you couldn't predict were possible.

 Now you're questioning whether SERIALIZABLE transaction isolation
 level is useful.  Probably not for everyone, but definitely for some.

No, I'm not. I'm questioning whether a serializable transaction
isolation level that makes no guarantee that it won't fire spuriously
is useful.

Postgres doesn't take block level locks or table level locks to do
row-level operations. You can write code and know that it's safe from
deadlocks.

Heikki proposed a list of requirements which included a requirement
that you not get spurious serialization failures, and you rejected that
on the basis that that's not how MSSQL and Sybase implement it.

I'm unhappy with the idea that if I access too many rows or my query
conditions aren't written just so, then the database will forget which
rows I'm actually concerned with, lock other random unrelated
records, and possibly roll my transaction back even though I had no
way of knowing my code was at risk.



-- 
greg



[HACKERS] Re: Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 9:49 PM, Jeff Davis pg...@j-davis.com wrote:


 NUMERIC(x, y) comes to mind, although that might be a more dangerous
 case. If you turn a NUMERIC(5,0) into a NUMERIC(5,1), then '1.2' may be
 stored as either '1' or '1.2' depending on whether you did the insert
 before or after the change. That's because, with NUMERIC, it's not
 really a constraint, but a rule for rounding.

Well it's not like rewriting the table is going to accomplish anything though...


 There may be other interesting cases involving constraints. For
 instance, if you have CHECK(i < 200), you should be able to add CHECK(i
 < 1000) without an exclusive lock or recheck. Then, with an exclusive
 lock, you can remove the original tighter constraint, but at least it
 wouldn't have to recheck the entire table.

We actually have the infrastructure for this kind of check; it's the
same kind of thing we do for partition exclusion...



-- 
greg



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Jeff Davis
On Mon, 2009-06-01 at 22:12 +0100, Greg Stark wrote:
 No, I'm not. I'm questioning whether a serializable transaction
 isolation level that makes no guarantee that it won't fire spuriously
 is useful.

I am also concerned (depending on implementation, of course) that
certain situations can make it almost certain that you will get
serialization failures every time. For instance, a change in the heap
order, or data distribution, could mean that your application is unable
to make progress at all.

Is this a valid concern, or are there ways of avoiding this situation?

I would think that we'd need some way to detect that this is happening,
give it a few tries, and then resort to full serialization for a few
transactions so that the application can make progress.

Regards,
Jeff Davis




Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Guillaume Smet
On Mon, Jun 1, 2009 at 10:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We could certainly put in a quick hack that just covered a few of the
 cases for built-in types, but it's not very pleasing ...

Jonah proposed a patch for that a long time ago.

See http://archives.postgresql.org/pgsql-patches/2006-10/msg00154.php .

-- 
Guillaume



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Josh Berkus



Yeah, this has been discussed before; I think it's even in the TODO
list.


I couldn't find it.  At least, not under "data types", and also not with 
the keyword "typemod".  Anyone see it?



The stumbling block has been to identify a reasonably clean way
of determining which datatype changes don't require a scan.


Yep.  One possibility I'm thinking is supplying a function for each type 
which takes two typemods (old and new) and returns a value (none, check, 
rebuild) which defines what we need to do: nothing, check but not 
rebuild, or rebuild.  Default would be rebuild.  Then the logic is 
simple for each data type.


Note that this doesn't deal with the special case of VARCHAR -> TEXT, but 
just with changing typemods.  Are there other cases of data *type* 
conversions where no check or rebuild is required?  Otherwise we might 
just special-case VARCHAR -> TEXT.


Oh, here's a general case: changing DOMAINs on the same base type should 
only be a check, and changing from a DOMAIN to its own base type should 
be a none.
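
A sketch of what such a per-type helper might look like, written here as
a SQL function purely for illustration (nothing like this exists; a real
implementation would presumably be a C-level support function, and the
header offset baked into varchar typmods is glossed over):

```sql
-- Invented example for varchar-style length typmods; nothing like this
-- exists in the catalogs.  $1 = old typmod, $2 = new typmod.
CREATE FUNCTION varchar_typmod_change(integer, integer)
RETURNS text AS $$
  SELECT CASE
    WHEN $2 = -1   THEN 'none'    -- varchar(n) -> varchar: limit dropped, safe
    WHEN $1 = -1   THEN 'check'   -- varchar -> varchar(n): must verify lengths
    WHEN $2 >= $1  THEN 'none'    -- widening the limit is always safe
    ELSE 'check'                  -- narrowing: verify, but no rewrite
  END;
$$ LANGUAGE sql IMMUTABLE;
```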


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Jeff Davis
On Mon, 2009-06-01 at 14:39 -0700, Josh Berkus wrote:
 Note that this doesn't deal with the special case of VARCHAR -> TEXT, but 
 just with changing typemods.  Are there other cases of data *type* 
 conversions where no check or rebuild is required?  Otherwise we might 
 just special-case VARCHAR -> TEXT.

I observe that the casts (VARCHAR -> TEXT and TEXT -> VARCHAR) are
marked WITHOUT FUNCTION. If that's the case, can't we use that to say
that no heap rebuild is required? Perhaps we'll need to combine this
with the typmod checks to see if we need to check the heap. 
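
Those binary-coercible casts are visible in pg_cast as the entries with
castfunc = 0, so a first cut at the candidate list can be had with:

```sql
-- Casts performed without any conversion function (binary-coercible):
SELECT castsource::regtype AS source,
       casttarget::regtype AS target
FROM pg_cast
WHERE castfunc = 0
ORDER BY 1, 2;
```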

Regards,
Jeff Davis




Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote: 
 On Mon, Jun 1, 2009 at 9:24 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 I'm concerned with whether you can be sure that the 999th time you
 run it the database won't randomly decide to declare a
 serialization failure for reasons you couldn't predict were
 possible.

 Now you're questioning whether SERIALIZABLE transaction isolation
 level is useful.  Probably not for everyone, but definitely for
 some.
 
 No, I'm not. I'm questioning whether a serializable transaction
 isolation level that makes no guarantee that it won't fire
 spuriously is useful.
 
Well, the technique I'm advocating virtually guarantees that there
will be false positives, since it looks only for the dangerous
structure of two adjacent read-write dependences rather than building
a rigorous read-write dependency graph for every serializable
transaction.  Even if you used very fine-grained locks (i.e., what
*columns* were modified in what rows) and had totally accurate
predicate locking, you would still get spurious rollbacks with this
technique.
 
In spite of that, I believe that it will run faster than traditional
serializable transactions, and in one benchmark it ran faster than
snapshot isolation -- apparently because it rolled back conflicting
transactions before they did updates and hit the update conflict
detection phase.
 
 Postgres doesn't take block level locks or table level locks to do
 row-level operations. You can write code and know that it's safe
 from deadlocks.
 
Who's talking about deadlocks?  If you're speaking more broadly of all
serialization failures, you can certainly get them in PostgreSQL.  So
one of us is not understanding the other here.  To clarify what I'm
talking about -- this technique introduces no blocking and cannot
cause a deadlock.
 
 Heikki proposed a list of requirements which included a requirement
 that you not get spurious serialization failures and you rejected
 that on the basis that that's not how MSSQL and Sybase implement it.
 
No, I rejected that on the basis that it precludes the use of the
technique published in the paper I cited, and I believe that technique
is the best currently available.  I'm perfectly happy to get to a
point where we have something which works correctly and have people
try to make it work better by tweaking the locking, but I think that
we'll find a point of diminishing returns -- where the cost of
tracking finer locks costs more than the cost of rerunning some
transactions.  For obvious high-risk situations, where you are
expending extreme resources on one database transaction, I believe it
will be most cost-effective to count on developers to recognize the
risk and use existing techniques.
 
 I'm unhappy with the idea that if I access too many rows or my query
 conditions aren't written just so then the database will forget
 which rows I'm actually concerned with and lock other random
 unrelated records and possibly roll my transaction back even though
 I had no way of knowing my code was at risk.
 
Then you would apparently not be a good candidate for serializable
transactions, since I don't know of any implementation which performs
well which doesn't have those characteristics.  When Sybase introduced
row level locking, we benchmarked that against the page level locking,
and found that it was significantly slower for our mix.  We did
identify a small number of small tables with high update rates where
switching them to row level locking provided a small performance gain.
 
-Kevin



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Josh Berkus

Jeff,


I observe that the casts (VARCHAR -> TEXT and TEXT -> VARCHAR) are
marked WITHOUT FUNCTION. If that's the case, can't we use that to say
that no heap rebuild is required? Perhaps we'll need to combine this
with the typmod checks to see if we need to check the heap.


Yeah, you're right... that would give us a short list of conversions 
which don't require a rewrite.  However, as Tom points out, that 
doesn't mean that they might not need a reindex (as well as OID, there's 
also XML).


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Jeff Davis pg...@j-davis.com wrote: 
 On Mon, 2009-06-01 at 22:12 +0100, Greg Stark wrote:
 No, I'm not. I'm questioning whether a serializable transaction
 isolation level that makes no guarantee that it won't fire
 spuriously is useful.
 
 I am also concerned (depending on implementation, of course) that
 certain situations can make it almost certain that you will get
 serialization failures every time. For instance, a change in the
 heap order, or data distribution, could mean that your application
 is unable to make progress at all.
 
 Is this a valid concern, or are there ways of avoiding this
 situation?
 
I've been concerned about that possibility -- in the traditional
blocking implementations it is OK to attempt the retry almost
immediately, since a conflicting transaction should then block you
until one of the original transactions in the conflict completes.  It
appears to me that with the proposed technique you could jump back in
and hit exactly the same combination of read-write dependencies,
leading to repeated rollbacks.  I'm not happy with the thought of
trying to handle that with simple delays (or even escalating delays)
before retry.
 
I'm not sure how big a problem this is likely to be in practice, so
I've been trying to avoid the trap of premature optimization on this
point.  But a valid concern?  Certainly.
 
 I would think that we'd need some way to detect that this is
 happening, give it a few tries, and then resort to full
 serialization for a few transactions so that the application can
 make progress.
 
I'd hate to go to actual serial execution of all serializable
transactions.  Perhaps we could fall back to traditional blocking
techniques based on some heuristic?  That would create blocking, and
would lead to occasional deadlocks; however, it might be the optimal
fix, if this is found to actually be a problem.
 
-Kevin



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 So, at least theoretically, anyone who had a traffic mix similar to
 TPCE would benefit.  Particularly, some long-running serializable
 transactions thrown into a mix of Read Committed and Repeatable
 Read transactions, for a stored procedure driven application.
 
A belated thought.  The proposed technique does yield different
behavior from traditional techniques for Read Committed and Repeatable
Read transactions which are run concurrently with Serializable
transactions.  In traditional blocking techniques, even a Read
Committed transaction only sees the database in a state consistent
with some serial execution of the serializable transactions.  As far
as I can see, this is not required by the SQL standard, but it might
possibly be an implementation artifact upon which some software might
rely.  Any idea whether this is the case with the TPC-E benchmark?
 
-Kevin



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Yeah, you're right... that would give us a short list of conversions 
 which don't require a rewrite.  However, as Tom points out, that 
 doesn't mean that they might not need a reindex (as well as OID, there's 
 also XML).

Um.  I had actually forgotten about the reindexing point, but yup that
is a stumbling block to any "no work" conversions.  It might be best to
only handle cases where the column's base type is not changing, so that
we don't have any index semantics changes happening.  I think we could
still handle the varchar-text case (since they share index opclasses)
but that could be a hardwired special case.

regards, tom lane



Re: [HACKERS] Suggested TODO: allow ALTERing of typemods without heap/index rebuild

2009-06-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Yeah, this has been discussed before; I think it's even in the TODO
 list.

 I couldn't find it.  At least, not under data types, and also not with 
 the keyword typemod.  Anyone see it?

It's the last item under ALTER:

* Don't require table rewrite on ALTER TABLE ... ALTER COLUMN TYPE, when the 
old and new data types are binary compatible

http://archives.postgresql.org/message-id/200903040137.n241bauv035...@wwwmaster.postgresql.org
http://archives.postgresql.org/pgsql-patches/2006-10/msg00154.php

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 Josh Berkus j...@agliodbs.com writes:
 cost_nestloop and cost_hashjoin are broken for SEMI and ANTI joins
 
 Actually I'm hoping that Kevin Grittner will have something to
 report on that one soon
  
So far, I haven't found any performance regressions in the beta2
release compared to the snapshot from March 2nd on our challenging
queries.  They both perform as well or better than 8.3 for our
queries, although there is a slight increase in planning time to get
to the better plans.
 
Since there are new plans for most of these, I have had to actually
run them to confirm that there is no regression, and that takes some
time.  I'm still chipping away, but I would say that it looks good to
me; unless someone else has a query that's behaving badly, I wouldn't
hold it up for this.
 
I'll post right away if a subsequent test shows a problem.
 
-Kevin



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Greg Stark
On Mon, Jun 1, 2009 at 11:07 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Greg Stark st...@enterprisedb.com wrote:

 No, I'm not. I'm questioning whether a serializable transaction
 isolation level that makes no guarantee that it won't fire
 spuriously is useful.

 Well, the technique I'm advocating virtually guarantees that there
 will be false positives, since it looks only for the dangerous
 structure of two adjacent read-write dependences rather than building
 a rigorous read-write dependency graph for every serializable
 transaction.  Even if you used very fine-grained locks (i.e., what
 *columns* were modified in what rows) and had totally accurate
 predicate locking, you would still get spurious rollbacks with this
 technique.

Yeah, I'm OK with compromising on things like having updates on other
columns, or even no-op updates, trigger serialization failures. For one
thing they do currently; but more importantly, from my point of view,
they can be explained in documentation and make sense from a user's
point of view.

More generally any time you have a set of transactions that are
touching and selecting from the same set of records, I think it's
obvious to a user that a serialization failure might be possible.

I'm not happy having things like "where x = 5 and y = 5" randomly
choose either to lock all records in one or the other index range (or
the whole table) when only the intersection is really interesting to
the plan. That leaves a careful programmer no way to tell which of his
transactions might conflict.

And I'm *really* unhappy with having the decision on which range to
lock depend on the planner decision. That means sometime (inevitably
in the middle of a night) the database will suddenly start getting
serialization failures on transactions that never did before
(inevitably critical batch jobs) because the planner switched plans.

 In spite of that, I believe that it will run faster than traditional
 serializable transactions, and in one benchmark it ran faster than
 snapshot isolation -- apparently because it rolled back conflicting
 transactions before they did updates and hit the update conflict
 detection phase.

"I can get the answer infinitely fast if it doesn't have to be right."

I know a serialization failure isn't a fatal error and the application
has to be prepared to retry. And I agree that some compromises are
reasonable, serialization failure doesn't have to mean the database
ran a theorem prover and proved that it was impossible to serialize
these transactions. But I think a programmer has to be able to look
at the set of transactions and say yeah I can see these transactions
all depend on the same records.

 Postgres doesn't take block level locks or table level locks to do
 row-level operations. You can write code and know that it's safe
 from deadlocks.

 Who's talking about deadlocks?  If you're speaking more broadly of all
 serialization failures, you can certainly get them in PostgreSQL.  So
 one of us is not understanding the other here.  To clarify what I'm
 talking about -- this technique introduces no blocking and cannot
 cause a deadlock.

Sorry, I meant to type a second paragraph there to draw the analogy.
Just as carefully written SQL code can be written to avoid deadlocks I
would expect to be able to look at SQL code and know it's safe from
serialization failures, or at least know where they might occur.

-- 
greg



Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
I wrote:
 Sam Mason s...@samason.me.uk writes:
 On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
 There is some case to be made that we should throw error here,
 which we could do by putting error tests where the attached patch
 has comments suggesting an error test.

 With things as they are I think it would be useful to throw an error
 here; if the user means 25 hours they should say 25 hours!

 Well, maybe, but I'm not really convinced.

I've gone ahead and committed the patch as-is, without the error tests.
There's still time to change it if anyone has a killer argument, but
I thought of another issue that would have to be dealt with: consider
values such as INTERVAL '13' MONTH.  Since per spec we should not
reduce this to 1 month, what is going to happen barring significant
redesign on the output side is that the value will print out as
'1 year 1 month'.  If we were to consider that as illegal input for
INTERVAL MONTH then we'd be creating a situation where valid data
fails to dump and reload.  This won't happen for all cases (eg 60
days doesn't overflow into months) but it shows the danger of throwing
error for cases that we can't clearly distinguish on both input and
output.  So I think we should be satisfied for now with accepting
inputs that are valid per spec, and not worry too much about whether
we are rejecting all values that are a bit outside spec.
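
Concretely, the round-trip hazard described above looks like this (the
rendering is from memory, so treat the exact output spacing as
approximate):

```sql
SELECT INTERVAL '13' MONTH;
--    interval
-- --------------
--  1 year 1 mon
--
-- If INTERVAL MONTH rejected '1 year 1 mon' as input, a dump containing
-- this perfectly valid value would fail to restore.
```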

regards, tom lane



Re: [HACKERS] User-facing aspects of serializable transactions

2009-06-01 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote:
 
 Just as carefully written SQL code can be written to avoid deadlocks
 I would expect to be able to look at SQL code and know it's safe
 from serialization failures, or at least know where they might
 occur.
 
This is the crux of our disagreement, I guess.  I consider existing
techniques fine for situations where that's possible.  But, could you
give me an estimate of how much time it would take you, up front and
ongoing, to do that review in our environment?  About 8,700 queries
undergoing frequent modification, by 21 programmers, for enhancements
in our three-month release cycle.  Plus various ad hoc queries.  We
have one full-time person to run ad hoc data fixes and reports
requested by the legislature and various outside agencies, like
universities doing research.
 
The whole point of the serializable transaction isolation level is
that it is the solution where you *can't* look at all your SQL code
and know where it's safe or where conflicts might occur.  If you can
do that, it's very likely that you don't need this feature.  The
proposed implementation, unlike traditional blocking techniques, won't
affect you if you don't choose to use serializable transactions.
 
Some people might be picturing the kind of blocking inherent in 
traditional techniques where, for example, we had to run our read-only
web application at Read Uncommitted to avoid deadlocks with the
updates from replication.  I like this technique because I don't think
I'd have to do that.
 
-Kevin



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Fujii Masao
Hi,

On Mon, Jun 1, 2009 at 11:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 pg_standby can use ln command to restore an archived file,
 which might destroy the archived file as follows.

 Does it matter?  pg_standby's source area wouldn't normally be an
 archive in the real sense of the word, it's just a temporary staging
 area between master and slave.

If so, it might be deleted after triggering the warm standby. But we cannot
remove it, because the latest xlog file which is required for normal recovery
might still exist in it. This is another undesirable scenario. Is this a problem?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 8:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Sam Mason s...@samason.me.uk writes:
 On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
 There is some case to be made that we should throw error here,
 which we could do by putting error tests where the attached patch
 has comments suggesting an error test.

 With things as they are I think it would be useful to throw an error
 here; if the user means 25 hours they should say 25 hours!

 Well, maybe, but I'm not really convinced.

 I've gone ahead and committed the patch as-is, without the error tests.
 There's still time to change it if anyone has a killer argument, but
 I thought of another issue that would have to be dealt with: consider
 values such as INTERVAL '13' MONTH.  Since per spec we should not
 reduce this to 1 month, what is going to happen barring significant
 redesign on the output side is that the value will print out as
 '1 year 1 month'.  If we were to consider that as illegal input for
 INTERVAL MONTH then we'd be creating a situation where valid data
 fails to dump and reload.  This won't happen for all cases (eg 60
 days doesn't overflow into months) but it shows the danger of throwing
 error for cases that we can't clearly distinguish on both input and
 output.  So I think we should be satisfied for now with accepting
 inputs that are valid per spec, and not worry too much about whether
 we are rejecting all values that are a bit outside spec.

Well, there is the possibility that if we implement something fully
spec-compliant in the future, we might run into a situation where
someone puts 13 months in, dumps and reloads, then puts 13 months
in again, compares the two, and surprisingly they turn out to be
unequal.  But I'm having a hard time caring.  The behavior your patch
implements is clearly a lot more useful than what it replaced, and I
think it's arguably more useful than the spec behavior as well.

More to the point, it's also what 8.3.7 does:

Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

rhaas=# select '99 seconds'::interval;
 interval
--
 00:01:39
(1 row)

rhaas=# select '99 minutes'::interval;
 interval
--
 01:39:00
(1 row)

rhaas=# select '99 hours'::interval;
 interval
--
 99:00:00
(1 row)

rhaas=# select '99 days'::interval;
 interval
--
 99 days
(1 row)

rhaas=# select '99 weeks'::interval;
 interval
--
 693 days
(1 row)

rhaas=# select '99 months'::interval;
interval

 8 years 3 mons
(1 row)

I haven't checked, but hopefully these all now match the 8.4 behavior?

...Robert



Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 More to the point, it's also what 8.3.7 does:

Well, no, because the cases at issue are where an interval qualifier
is specified.  8.3 did this:

regression=# select '99 seconds'::interval second;
 interval 
--
 00:00:39
(1 row)

and even more amusingly,

regression=# select interval '99' minute;
 interval 
--
 00:01:00
(1 row)

regression=# select interval '99' hour;  
 interval 
--
 00:00:00
(1 row)

It was all pretty broken back then.

regards, tom lane



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 If so, it might be deleted after triggering the warm standby. But we cannot
 remove it, because the latest xlog file which is required for normal recovery
 might still exist in it. This is another undesirable scenario. Is this a problem?

What recovery?  In the problem case you're positing, the slave server
has executed a checkpoint and come up live.  It's never going to be
interested in the old xlog again.

regards, tom lane



Re: [HACKERS] dblink patches for comment

2009-06-01 Thread Tom Lane
Joe Conway m...@joeconway.com writes:
 Probably better if I break this up in logical chunks too. This patch 
 only addresses the refactoring you requested here:
 http://archives.postgresql.org/message-id/28719.1230996...@sss.pgh.pa.us

This looks sane to me in a quick once-over, though I've not tested it.

A small suggestion for future patches: don't bother to reindent code
chunks that aren't changing --- it just complicates the diff with a
lot of uninteresting whitespace changes.  You can either do that after
review, or leave it to be done by pgindent.  (Speaking of which, we
need to schedule that soon...)

regards, tom lane



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Fujii Masao
Hi,

On Tue, Jun 2, 2009 at 10:00 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 If so, it might be deleted after triggering the warm standby. But we cannot
 remove it, because the latest xlog file which is required for normal recovery
 might still exist in it. This is another undesirable scenario. Is this a problem?

 What recovery?  In the problem case you're positing, the slave server
 has executed a checkpoint and come up live.  It's never going to be
 interested in the old xlog again.

Yes, the old xlog itself is not used again. But, the *old file* might
be recycled
and used later. The case that I'm looking at is that the symlink to a temporary
area is recycled. Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] dblink patches for comment

2009-06-01 Thread Tom Lane
Joe Conway m...@joeconway.com writes:
 Here's a much simpler SQL/MED support patch for dblink.

 This enforces security in the same manner for FOREIGN SERVER connections 
 as that worked out over time for other dblink connections. Essentially, 
 the FOREIGN SERVER and associated user MAPPING provides the needed info 
 for the libpq connection, but otherwise behavior is the same.

 I've also attached a doc patch.

The docs patch looks okay, except this comment is a bit hazy:

 +  -- Note: local connection must require authentication for this to work 
 properly

I think what it means is

 +  -- Note: local connection must require password authentication for this to 
 work properly

If not, please clarify some other way.  It might also be good to be a
bit more clear about what fail to work properly might entail.

As far as the code goes, hopefully Peter will take a look since he's
spent more time on the SQL/MED code than I have.  The only thing I can
see that looks bogus is that get_connect_string() is failing to handle
any quoting/escaping that might be needed for the values to be inserted
into the connection string.  I don't recall offhand what rules libpq
has for that, but I hope it at least implements doubled single quotes...

regards, tom lane



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 Yes, the old xlog itself is not used again. But, the *old file* might
 be recycled and used later. The case that I'm looking at is that the
 symlink to a temporary area is recycled. Am I missing something?

Actually, I think the right fix for that would be to add defenses to
xlog.c to not try to recycle a file that is a symlink.  No good could
possibly come of that.

regards, tom lane



Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-01 Thread Kevin Grittner
Sushant Sinha sushant...@gmail.com wrote: 
 
 I think that a dot should be considered as a word delimiter because
 when a dot is not followed by a space, most of the time it is an error
 in typing. Besides, there are not many valid English words that have a
 dot in the middle.
 
It's not treating it as an English word, but as a host name.
 
select ts_debug('english', 'Mr.J.Sai Deepak');
                                 ts_debug
---------------------------------------------------------------------------
 (host,Host,Mr.J.Sai,{simple},simple,{mr.j.sai})
 (blank,"Space symbols"," ",{},,)
 (asciiword,"Word, all ASCII",Deepak,{english_stem},english_stem,{deepak})
(3 rows)
 
You could run it through a dictionary which would deal with host
tokens differently.  Just be aware of what you'll be doing to
www.google.com if you run into it.
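
To make that concrete: the host token type is assigned by the parser
itself, so a custom configuration can only change which dictionary
handles such tokens, not how they are split (the configuration name
below is invented):

```sql
CREATE TEXT SEARCH CONFIGURATION myconf ( COPY = english );
ALTER TEXT SEARCH CONFIGURATION myconf
    ALTER MAPPING FOR host WITH simple;
-- 'mr.j.sai' still comes out as a single lexeme; splitting at dots
-- would need a custom parser or a dictionary that rewrites the token.
```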
 
I hope this helps.
 
-Kevin



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote: 
 Josh Berkus j...@agliodbs.com writes:
 cost_nestloop and cost_hashjoin are broken for SEMI and ANTI joins
 
 Actually I'm hoping that Kevin Grittner will have something to
 report on that one soon
  
 So far, I haven't found any performance regressions in the beta2
 release compared to the snapshot from March 2nd on our challenging
 queries.  They both perform as well or better than 8.3 for our
 queries, although there is a slight increase in planning time to get
 to the better plans.

Thanks; I'll remove this from the Open Items list for the moment.

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On Mon, 2009-06-01 at 11:09 -0700, Josh Berkus wrote:
 #  adjust information_schema precision and scale fields?
 -- save for 8.5

 I read this thread. It seems that for the changes we can make, we
 should just make them. The actual amount of change is quite nominal.

I think the major argument against "just change it" is that we do not
wish to force an initdb now for beta testers, but if we don't there
is always going to be this niggling doubt about how an alleged 8.4
installation will actually behave.  Although I previously suggested
we could live with that, on reflection I don't think that the problem
is important enough to justify it.  The information_schema has had
this issue since day one, and we hadn't gotten complaints before.
So pushing it to 8.5 seems the best decision to me.

regards, tom lane



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-01 Thread Fujii Masao
Hi,

On Tue, Jun 2, 2009 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 Yes, the old xlog itself is not used again. But, the *old file* might
 be recycled and used later. The case that I'm looking at is that the
 symlink to a temporary area is recycled. Am I missing something?

 Actually, I think the right fix for that would be to add defenses to
 xlog.c to not try to recycle a file that is a symlink.

OK, I tweaked Aidan's patch. Thanks Aidan!
http://archives.postgresql.org/message-id/20090601152736.gl15...@yugib.highrise.ca

Changes are:
- use lstat instead of stat
- add #if HAVE_WORKING_LINK and #endif code
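A standalone sketch of the distinction the patch relies on (the helper name safe_to_recycle is ours, not from the patch): stat() follows the link and reports the target as a regular file, while lstat() exposes the symlink itself, so a segment that is really a link into a temporary area gets deleted rather than recycled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Sketch only: decide whether a WAL segment file is safe to recycle.
 * lstat() examines the link itself; stat() would follow it to the
 * target, hiding the fact that the segment really lives elsewhere.
 */
static bool
safe_to_recycle(const char *path)
{
	struct stat st;

	if (lstat(path, &st) != 0)
		return false;			/* can't examine it: don't recycle */
	return !S_ISLNK(st.st_mode);
}
```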

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Index: src/backend/access/transam/xlog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.341
diff -c -r1.341 xlog.c
*** src/backend/access/transam/xlog.c	28 May 2009 11:02:16 -	1.341
--- src/backend/access/transam/xlog.c	2 Jun 2009 02:31:34 -
***
*** 3006,3011 
--- 3006,3012 
  	struct dirent *xlde;
  	char		lastoff[MAXFNAMELEN];
  	char		path[MAXPGPATH];
+ 	struct stat stat_buf;
  
  	/*
  	 * Initialize info about where to try to recycle to.  We allow recycling
***
*** 3047,3054 
  /*
   * Before deleting the file, see if it can be recycled as a
   * future log segment.
   */
! if (InstallXLogFileSegment(&endlogId, &endlogSeg, path,
  		   true, &max_advance,
  		   true))
  {
--- 3048,3060 
  /*
   * Before deleting the file, see if it can be recycled as a
   * future log segment.
+  * If it's a symlink, we don't recycle it.
   */
! if (
! #if HAVE_WORKING_LINK
! 	lstat(path, &stat_buf) == 0 && !S_ISLNK(stat_buf.st_mode) &&
! #endif
! 	InstallXLogFileSegment(&endlogId, &endlogSeg, path,
  		   true, &max_advance,
  		   true))
  {



Re: [HACKERS] [GENERAL] trouble with to_char('L')

2009-06-01 Thread Hiroshi Inoue
Tom Lane wrote:
 Hiroshi Inoue in...@tpf.co.jp writes:
 Tom Lane wrote:
 I think what this suggests is that there probably needs to be some
 encoding conversion logic near the places we examine localeconv()
 output.
 
 Attached is a patch to the current CVS.
 It uses an approach similar to what the LC_TIME stuff does.
 
 I'm not really in a position to test/commit this, since I don't have a
 Windows machine.  However, since no one else is stepping up to deal with
 it, here's a quick review:

Thanks for the review.
I had forgotten about the patch because Japanese doesn't have trouble
with this issue (the currency symbol is ASCII \). If this is really
expected to be fixed, I will update the patch according to your
suggestions.

 * This seems to be assuming that the user has set LC_MONETARY and
 LC_NUMERIC the same.  What if they're different?

Strictly speaking, they should be handled individually.

 * What if the selected locale corresponds to Unicode (ie UTF16)
 encoding?

As far as I tested, setlocale(LC_MONETARY, "xxx.65001") causes an error.

 * #define'ing strdup() to do something rather different from strdup
 seems pretty horrid from the standpoint of code readability and
 maintainability, especially with nary a comment explaining it.

Maybe it would be better to use, instead of strdup(), a function that
calls dbstr_win32() in the Windows case.
BTW, grouping and mon_grouping seem to be left out of the encoding
conversion. Are they guaranteed to be null terminated?
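For what it's worth, C99 says yes, with a twist: grouping and mon_grouping are strings of group sizes in which '\0' terminates (the last size then repeats indefinitely), but CHAR_MAX is an additional sentinel meaning "no further grouping". So copying them with strdup() is safe, while code that interprets them must stop at either byte. A sketch:

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>

/*
 * Count the explicit group sizes in an lconv grouping string.  Per
 * C99, the sequence ends at '\0' (last size repeats) or CHAR_MAX
 * (no further grouping).  '\0' is never a valid group size, so the
 * strings are always null terminated and strdup() works; consumers
 * just have to honor CHAR_MAX as well.
 */
static int
grouping_len(const char *g)
{
	int			n = 0;

	while (g[n] != '\0' && g[n] != CHAR_MAX)
		n++;
	return n;
}
```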

 * Code will dump core on malloc failure.

I can take care of it.

 * Since this code is surely not performance critical, I wouldn't bother
 with trying to optimize it; hence drop the special case for all-ASCII.

I can take care of it.
 
 * Surely we already have a symbol somewhere that can be used in
 place of this:
#define MAX_BYTES_PER_CHARACTER 4

I can't find it.
max(pg_encoding_max_length(encoding), pg_encoding_max_length(PG_UTF8))
may be better.

regards,
Hiroshi Inoue





Re: [HACKERS] dblink patches for comment

2009-06-01 Thread Joe Conway

Tom Lane wrote:

Joe Conway m...@joeconway.com writes:
Probably better if I break this up into logical chunks too. This patch 
only addresses the refactoring you requested here:

http://archives.postgresql.org/message-id/28719.1230996...@sss.pgh.pa.us


This looks sane to me in a quick once-over, though I've not tested it.


Thanks -- committed.


A small suggestion for future patches: don't bother to reindent code
chunks that aren't changing --- it just complicates the diff with a
lot of uninteresting whitespace changes.  You can either do that after
review, or leave it to be done by pgindent.  (Speaking of which, we
need to schedule that soon...)


Sorry. cvs diff -cb seems to help (see attached). It is half the size 
and much more readable.


Joe
Index: dblink.c
===
RCS file: /opt/src/cvs/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.77
diff -c -b -r1.77 dblink.c
*** dblink.c	1 Jan 2009 17:23:31 -	1.77
--- dblink.c	2 Jun 2009 03:04:04 -
***
*** 77,83 
  /*
   * Internal declarations
   */
! static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async, bool do_get);
  static remoteConn *getConnectionByName(const char *name);
  static HTAB *createConnHash(void);
  static void createNewConnection(const char *name, remoteConn * rconn);
--- 77,83 
  /*
   * Internal declarations
   */
! static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
  static remoteConn *getConnectionByName(const char *name);
  static HTAB *createConnHash(void);
  static void createNewConnection(const char *name, remoteConn * rconn);
***
*** 689,713 
  Datum
  dblink_record(PG_FUNCTION_ARGS)
  {
! 	return dblink_record_internal(fcinfo, false, false);
  }
  
  PG_FUNCTION_INFO_V1(dblink_send_query);
  Datum
  dblink_send_query(PG_FUNCTION_ARGS)
  {
! 	return dblink_record_internal(fcinfo, true, false);
  }
  
  PG_FUNCTION_INFO_V1(dblink_get_result);
  Datum
  dblink_get_result(PG_FUNCTION_ARGS)
  {
! 	return dblink_record_internal(fcinfo, true, true);
  }
  
  static Datum
! dblink_record_internal(FunctionCallInfo fcinfo, bool is_async, bool do_get)
  {
  	FuncCallContext *funcctx;
  	TupleDesc	tupdesc = NULL;
--- 689,735 
  Datum
  dblink_record(PG_FUNCTION_ARGS)
  {
! 	return dblink_record_internal(fcinfo, false);
  }
  
  PG_FUNCTION_INFO_V1(dblink_send_query);
  Datum
  dblink_send_query(PG_FUNCTION_ARGS)
  {
! 	PGconn	   *conn = NULL;
! 	char	   *connstr = NULL;
! 	char	   *sql = NULL;
! 	remoteConn *rconn = NULL;
! 	char	   *msg;
! 	bool		freeconn = false;
! 	int			retval;
! 
! 	if (PG_NARGS() == 2)
! 	{
! 		DBLINK_GET_CONN;
! 		sql = text_to_cstring(PG_GETARG_TEXT_PP(1));
! 	}
! 	else
! 		/* shouldn't happen */
! 		elog(ERROR, "wrong number of arguments");
! 
! 	/* async query send */
! 	retval = PQsendQuery(conn, sql);
! 	if (retval != 1)
! 		elog(NOTICE, "%s", PQerrorMessage(conn));
! 
! 	PG_RETURN_INT32(retval);
  }
  
  PG_FUNCTION_INFO_V1(dblink_get_result);
  Datum
  dblink_get_result(PG_FUNCTION_ARGS)
  {
! 	return dblink_record_internal(fcinfo, true);
  }
  
  static Datum
! dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
  {
  	FuncCallContext *funcctx;
  	TupleDesc	tupdesc = NULL;
***
*** 775,788 
  /* shouldn't happen */
elog(ERROR, "wrong number of arguments");
  		}
! 		else if (is_async && do_get)
  		{
  			/* get async result */
  			if (PG_NARGS() == 2)
  			{
  /* text,bool */
  DBLINK_GET_CONN;
! fail = PG_GETARG_BOOL(2);
  			}
  			else if (PG_NARGS() == 1)
  			{
--- 797,810 
  /* shouldn't happen */
elog(ERROR, "wrong number of arguments");
  		}
! 		else /* is_async */
  		{
  			/* get async result */
  			if (PG_NARGS() == 2)
  			{
  /* text,bool */
  DBLINK_GET_CONN;
! fail = PG_GETARG_BOOL(1);
  			}
  			else if (PG_NARGS() == 1)
  			{
***
*** 793,816 
  /* shouldn't happen */
elog(ERROR, "wrong number of arguments");
  		}
- 		else
- 		{
- 			/* send async query */
- 			if (PG_NARGS() == 2)
- 			{
- DBLINK_GET_CONN;
- sql = text_to_cstring(PG_GETARG_TEXT_PP(1));
- 			}
- 			else
- /* shouldn't happen */
- elog(ERROR, "wrong number of arguments");
- 		}
  
  		if (!conn)
  			DBLINK_CONN_NOT_AVAIL;
  
- 		if (!is_async || (is_async && do_get))
- 		{
  			/* synchronous query, or async result retrieval */
  			if (!is_async)
  res = PQexec(conn, sql);
--- 815,824 
***
*** 911,929 
  			funcctx-attinmeta = attinmeta;
  
  			MemoryContextSwitchTo(oldcontext);
- 		}
- 		else
- 		{
- 			/* async query send */
- 			MemoryContextSwitchTo(oldcontext);
- 			PG_RETURN_INT32(PQsendQuery(conn, sql));
- 		}
- 	}
- 
- 	if (is_async  !do_get)
- 	{
- 		/* async query send -- should not happen */
- 		elog(ERROR, "async query send called more than once");
  
  	}
  
--- 919,924 


Re: [HACKERS] explain analyze rows=%.0f

2009-06-01 Thread Ron Mayer
Euler Taveira de Oliveira wrote:
 Robert Haas escreveu:
 ...EXPLAIN ANALYZE reports the number of rows as an integer...  Any
 chance we could reconsider this decision?  I often find myself wanting
 to know the value that is here called ntuples, but rounding
 ntuples/nloops off to the nearest integer loses too much precision.

 Don't you think it is too strange having, for example, 6.67 rows? It would
 confuse users and programs that parse the EXPLAIN output. However, I wouldn't object

I don't think it's that confusing.  If it says 0.1 rows, I imagine most
people would infer that this means typically 0, but sometimes 1 or a few rows.

What I'd find strange about 6.67 rows in your example is more that on
the estimated rows side, it seems to imply an unrealistically precise estimate
in the same way that 667 rows would seem unrealistically precise to me.
Maybe rounding to 2 significant digits would reduce confusion?



Re: [HACKERS] from_collapse_limit vs. geqo_threshold

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 3:34 PM, Selena Deckelmann sel...@endpoint.com wrote:
 In the spirit of helping wrap-up 8.4 todo items...
 Robert Haas wrote:
 For 8.4, I'd be happy to just improve the documentation.  I think this
 sentence could just be deleted from the section on
 from_collapse_limit:
 My vote would be to provide some information.

 Suggested revision of Robert's prose:

 Because genetic query optimization may be triggered, increasing
 from_collapse_limit should be considered relative to xref
 linkend=guc-geqo-threshold.

Here's my attempt.

...Robert
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2252,2259  SELECT * FROM parent WHERE key = 2400;
The planner will merge sub-queries into upper queries if the
resulting <literal>FROM</literal> list would have no more than
this many items.  Smaller values reduce planning time but might
! yield inferior query plans.  The default is eight.  It is usually
! wise to keep this less than <xref linkend="guc-geqo-threshold">.
For more information see <xref linkend="explicit-joins">.
</para>
</listitem>
--- 2252,2261 
The planner will merge sub-queries into upper queries if the
resulting <literal>FROM</literal> list would have no more than
this many items.  Smaller values reduce planning time but might
! yield inferior query plans.  The default is eight.  Note that if this
! value is less than <xref linkend="guc-geqo-threshold">, pulling up
! subqueries may trigger GEQO planning, since it will increase the
! number of <literal>FROM</literal> items.
For more information see <xref linkend="explicit-joins">.
</para>
</listitem>



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As of today we are three months behind the original plan for 8.4.0 release.
 In a one-year release cycle that's already pretty bad slip; but there now
 seems no chance of a release happening in less than a month, and if we
 continue to let things drift it could easily stretch to five or six
 months' slip.  Given the slow pace of bug reports there is no reason to
 be waiting.  We need to refocus our energy on getting the release out.

 The main thing that needs to happen now is to deal with the open items
 listed at
 http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items
 either by fixing them or by agreeing that it's okay to let them slide
 to 8.5 or beyond.

Regarding this item:

* Consider reverting preventing regular users from base type creation

You raise this point:

tgl says: whether or not we think PL/Java is bulletproof, there are
other problems, for instance this one
http://archives.postgresql.org/message-id/87zlnwnvjg@news-spur.riddles.org.uk

That's a pretty overwhelming argument for leaving it as-is.  I think
we should remove this from the list of open items.

...Robert



[HACKERS] [RFC,PATCH] Only disable sigpipe during SSL write

2009-06-01 Thread Jeremy Kerr
If the connection isn't over SSL, there's no need to do the disable.

This effectively halves the number of syscalls performed by libpq when
SSL is not in use.

Signed-off-by: Jeremy Kerr j...@ozlabs.org

---
 src/interfaces/libpq/fe-secure.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index eb579cf..2101315 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -368,13 +368,13 @@ pqsecure_write(PGconn *conn, const void *ptr, size_t len)
 {
ssize_t n;
 
-   DISABLE_SIGPIPE(return -1);
-
 #ifdef USE_SSL
if (conn->ssl)
{
int err;
 
+   DISABLE_SIGPIPE(return -1);
+
n = SSL_write(conn->ssl, ptr, len);
err = SSL_get_error(conn->ssl, n);
switch (err)
@@ -433,15 +433,14 @@ pqsecure_write(PGconn *conn, const void *ptr, size_t len)
n = -1;
break;
}
+   RESTORE_SIGPIPE();
}
else
 #endif
{
n = send(conn->sock, ptr, len, 0);
-   REMEMBER_EPIPE(n < 0 && SOCK_ERRNO == EPIPE);
}
 
-   RESTORE_SIGPIPE();
 
return n;
 }



[HACKERS] [RFC,PATCH] SIGPIPE masking in local socket connections

2009-06-01 Thread Jeremy Kerr
Currently, I'm seeing the pqsecure_{read,write} functions being invoked
when connecting to postgres via a unix domain socket. pqsecure_write
seems to alter the signal mask of the process to disable sigpipe
reporting. pqsecure_read only does this when the connection is using SSL.

When using a multithreaded client application on Linux, this can result
in poor scalability. Each change to the signal mask requires taking
current->sighand->siglock, which becomes highly contended between
the client threads. It also means we do 3 syscalls per write: mask
sigpipe, write, unmask sigpipe.

The following patch changes pqsecure_write to be more like pqsecure_read -
it only alters the signal mask if the connection is over SSL. It's only
an RFC, as I'm not entirely sure about the reasoning behind blocking
SIGPIPE for the non-SSL case - there may be other considerations here.

With this change I see the following performance improvement
during a sysbench OLTP run:

 http://ozlabs.org/~jk/projects/db/data/sigpipe-perf.png

 load: sysbench --test=oltp --oltp-read-only=on, connecting locally,
 machine: POWER6, 64-way, 4.2GHz

Comments most welcome,


Jeremy

---
Jeremy Kerr (1):
  Only disable sigpipe during SSL write




Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-01 Thread Kris Jurka



On Mon, 1 Jun 2009, Robert Haas wrote:


Regarding this item:

* Consider reverting preventing regular users from base type creation

You raise this point:

tgl says: whether or not we think PL/Java is bulletproof, there are
other problems, for instance this one
http://archives.postgresql.org/message-id/87zlnwnvjg@news-spur.riddles.org.uk

That's a pretty overwhelming argument for leaving it as-is.  I think
we should remove this from the list of open items.


Yes, that makes sense to me as the original requester of this open item. 
I thought it had been taken off a while ago.


Kris Jurka
