Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Martijn van Oosterhout
On Wed, Mar 09, 2011 at 04:49:28PM -0500, Tom Lane wrote:
 So I was moving some error checks around and all of a sudden the
 regression tests blew up on me, with lots of errors about how type X
 didn't support collations (which indeed it didn't).  After some
 investigation I realized what should have been apparent much earlier:
 the collations patch is trying to use one field for two different
 purposes.  In particular, collid in FuncExpr and related nodes is
 used in both of these ways:
 
 * as the collation to apply during execution of the function;
 
 * as the collation of the function's result.

Ouch, that is painful.

Looking back at my first attempt I see I made the same error, though I
had noted that it had an unusual failure mode, namely that:

func( a COLLATE x ) COLLATE y

would determine that x was the collation to use for func, not y,
and that y may be ignored. A bit of a corner case, but someone was
bound to try it.

I think I avoided the particular failure mode you found, because the
GetCollation on the FuncExpr didn't return the collation calculated for
the node, but the the collation derived from the collations of any
arguments that had the same type and the return value. So operators
like = and  automatically got NONE because none of their arguments are
booleans.

 regression=# create view vv as select 'z'::text  'y'::text as b;
 ERROR:  collations are not supported by type boolean

I'm my original idea, any data type was collatable, since I considered
ASC and DESC, NULLS FIRST/LAST to be collations every datatype had.
Thus the above wasn't an error. As long as the collation was a
collation appropriate for booleans it worked.

 There are basically two things we could do about this:
 
 1. Add two fields not one to nodes representing function/operator calls.
 
 2. Change exprCollation() to do a type_is_collatable check on the
 node result type before believing that the collid field is relevant.

It might be worthwhile adding an extra field, but I think I didn't do
it because you only need the information exactly once, while descending
the parse tree in parse_expr. But for clarity the extra field is a
definite win.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Nikhil Sontakke
Hi,

 Other peculiarity in the index file is that we found a lot of zeroed
 out pages. Blocks from #279 to #518 are all completely zeroed out
 without any signs of even a page header. Any ideas on how we can get
 so many zeroed out blocks? Apart from the extend code path, I fail to
 see any other. And this is an unusually large number of zero pages

 Well, if you see the root block child pointers, they go to pages
 2, 4-277,522-524.  So pages 278 to 521 seem unused for some reason,
 which is rather strange.  But apparently only page 523 is zeroed and
 shouldn't be.


Yeah, the very definition of strange all this.

 It seems hard to believe that there would be 245 unsuccessful attempts
 at extending the file.

 Page 522 is suspect too ... why does it have a single item?  Note its
 LSN is very close to the one on page 521.


If you look at my png, you will notice that there is a deleted block
chain at play around 522.

Deleted 519's previous points to Deleted 520.

Deleted 520's previous points to Deleted 521.

Deleted 521's previous points to 522

Want to know where the next of all these 4 blocks point to? Block 277.

Another interesting thing is all these Deleted blocks have next XID
set to FroxzenXID. And who sets it to that - only VACUUM FULL. That's
why my suspicions around VF..

Regards,
Nikhils

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 7:14 PM, Greg Stark gsst...@mit.edu wrote:
 On Wed, Mar 9, 2011 at 11:28 PM, Nikhil Sontakke
 nikhil.sonta...@enterprisedb.com wrote:
 Other peculiarity in the index file is that we found a lot of zeroed
 out pages. Blocks from #279 to #518 are all completely zeroed out
 without any signs of even a page header. Any ideas on how we can get
 so many zeroed out blocks? Apart from the extend code path, I fail to
 see any other. And this is an unusually large number of zero pages


 What does stat say for the index data file? Are the Size and Blocks
 values the same (modulo block size)? Or are these blocks actually not
 allocated?

 Postgres always forces blocks to be allocated but if they were lost
 due to filesystem corruption maybe they're not allocated any more.

Hmm, that gives me an idea.  What if the index relation were extended
repeatedly, but the blocks are still all zero on disk because they've
never been written out of shared_buffers?  Then, before the next
checkpoint, the system crashes.  All those blocks are now zero on
disk.  In theory, WAL replay should fix most of it, but there could be
portions of the WAL that haven't been flushed; or there could be some
other problem in WAL replay.  For example, it strikes me that this
situation would be consistent with:

1. Somebody inserts a bunch of new tuples into the relation, causing
growth in the index.
2. Before the blocks are written to disk, system crash.
3. pg_resetxlog

But it wouldn't have to be exactly that.  Like what if during replay,
you hit a WAL record that had a bit flipped so it failed the checksum.
 I think the system would just treat that as end-of-WAL and start up.
Now you could have some corruption, and a big chunk of zero blocks in
the index.  Now you go along in normal running, maybe not realizing
that those zero blocks are there, and start extending the relation as
you need to update the index...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update of replication/README

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 9:01 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Mar 10, 2011 at 2:00 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Mar 9, 2011 at 6:11 AM, Fujii Masao masao.fu...@gmail.com wrote:
 The attached patch updates replication/README to reflect current
 walsender/walreceiver behavior. It doesn't include any description
 about sync rep. That would need to be added later.

 Hrm.  What about this hunk?

 -Each walsender allocates an entry from the WalSndCtl array, and advertises
 -there how far it has streamed WAL already. This is used at checkpoints, to
 -avoid recycling WAL that hasn't been streamed to a slave yet. However,
 -that doesn't stop such WAL from being recycled when the connection is not
 -established.
 +Each walsender allocates an entry from the WalSndCtl array, and tracks
 +information about replication progress. User can monitor them via
 +statistics views.

 Is the deleted text not (or no longer) true?

 Yes. But, in fact, the deleted text is false in not only 9.1dev but
 also 9.0. IIRC,
 though my original patch of streaming replication prevented checkpoint from
 recycling unsent WAL files, that behavior was cut out and we introduced
 wal_keep_segments parameter before release of 9.0. But unfortunately I had
 not noticed that text until I read README yesterday...

OK, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Theory of operation of collation patch

2011-03-10 Thread Greg Stark
On Wed, Mar 9, 2011 at 1:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another interesting item ... I see that you added a collation field to
 TypeName, apparently on the grounds that the SQL spec includes collation
 in data type.  However, it seems to me that that is nonsense up with
 which we should not put.

The SQL committee has demonstrably awful taste but they're usually not
entirely nutty. Usually whatever they do they had a reason. Is there
any hint at all of what they were trying to accomplish here? When you
say basically only used in CAST and column definitions are there
other less common cases where it's convenient?

Or was this just a case of some existing database allowed COLLATE
clauses in column definitions as part of the type and they preferred
to have it in a different syntax so they did this to allow either to
work? If we allow collate clause in ColQualList are we covering the
very case they were trying to deal with?

Can you give an example of what a column definition would look like if
you put the COLLATE clause in the data type in a way that wouldn't
be parsed according to your plan?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Header comments in the recently added files

2011-03-10 Thread Bruce Momjian
Tom Lane wrote:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
  I found trivial mistakes in the recently added files.
  Will they fixed by some automated batches, or by manual?
 
  - Copyright (c) xxx-*2010*, PostgreSQL Global Development Group
in pg_collation.h, pg_foreign_table.h, basebackup.h, syncrep.h,
pg_backup_directory.c and auth_delay.c.
  - IDENTIFICATION $PostgreSQL$ in pg_collation.h, syncrep.h, and syncrep.c
Other files has their actual paths in the same place.
 
 It might be worth Bruce making another run of his copyright-update
 script to fix the former problems.  As for the latter problems,

I ran it just now and nothing was changed, so we are OK now.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 9:21 PM, Bruce Momjian br...@momjian.us wrote:
 Simon Riggs wrote:
 On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote:

  postgres=# SELECT application_name, state, sync_priority, sync_state
  FROM pg_stat_replication;
   application_name |   state   | sync_priority | sync_state
  --+---+---+
   one              | STREAMING |             1 | POTENTIAL
   two              | STREAMING |             2 | SYNC
  (2 rows)

 Bug! Thanks.

 Is there a reason these status are all upper-case?

Not that I know of.

However, I think that some more fundamental rethinking of the state
mechanism may be in order.  When Magnus first committed this, it would
say CATCHUP whenever you were behind (even if only momentarily) and
STREAMING if you were caught up.  Simon then changed it so that it
says CATCHUP until you catch up the first time, and then STREAMING
afterward (even if you fall behind again).  Neither behavior seems
completely adequate to me.  I think we should have a way to know
whether we've ever been caught up, and if so when the most recent time
was.  So you could then say things like is the most recent time at
which the standby was caught up within the last 30 seconds?, which
would be a useful thing to monitor, and right now there's no way to do
it.  There's also a BACKUP state, but I'm not sure it makes sense to
lump that in with the others.  Some day it might be possible to stream
WAL and take a backup at the same time, over the same connection.
Maybe that should be a separate column or something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 10:21 AM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
  I found trivial mistakes in the recently added files.
  Will they fixed by some automated batches, or by manual?

  - Copyright (c) xxx-*2010*, PostgreSQL Global Development Group
    in pg_collation.h, pg_foreign_table.h, basebackup.h, syncrep.h,
    pg_backup_directory.c and auth_delay.c.
  - IDENTIFICATION $PostgreSQL$ in pg_collation.h, syncrep.h, and syncrep.c
    Other files has their actual paths in the same place.

 It might be worth Bruce making another run of his copyright-update
 script to fix the former problems.  As for the latter problems,

 I ran it just now and nothing was changed, so we are OK now.

Speaking of running scripts, I think we should run pgindent now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Wed, Mar 09, 2011 at 04:49:28PM -0500, Tom Lane wrote:
 There are basically two things we could do about this:
 
 1. Add two fields not one to nodes representing function/operator calls.
 
 2. Change exprCollation() to do a type_is_collatable check on the
 node result type before believing that the collid field is relevant.

 It might be worthwhile adding an extra field, but I think I didn't do
 it because you only need the information exactly once, while descending
 the parse tree in parse_expr. But for clarity the extra field is a
 definite win.

Hmm.  That suggests a third solution: revert the addition of *all* the
collid fields except the ones that represent collation-to-apply-during-
function-execution.  (So they'd still be there in FuncExpr/OpExpr, but
not most other places.)  Then we'd have to dig down more deeply in the
expression tree during select_common_collation, but we'd save space
and avoid confusion over the meaning of the fields.

I suspect this is probably not a good idea because of the added cost in
select_common_collation: aside from probably needing more syscache
lookups, there's a potential for worse-than-linear cost behavior if we
have to repeatedly dig through a deep expression tree to find out
collations.  We had a similar case in the past [ checks archives ... see
http://archives.postgresql.org/pgsql-performance/2005-06/msg00075.php
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=ba4200246
] so I'm hesitant to go down that road again.  Still, I'll throw it out
for comment.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Native XML

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 7:03 PM, Josh Berkus j...@agliodbs.com wrote:
 Then I think the answer is that we need both data types.  One for
 text-XML and one for binary-XML.

That's what I think, too.  I'm not sure whether we want both of them
in core, but I think the binary-XML one would, at a minimum, make an
awfully nice extension to ship in contrib.  I'd also like to have text
and binary JSON types... very MongoDB-ish...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Theory of operation of collation patch

2011-03-10 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Can you give an example of what a column definition would look like if
 you put the COLLATE clause in the data type in a way that wouldn't
 be parsed according to your plan?

Column definitions look and act the same.  The point of the change is to
not accept COLLATE in all the places where type names are used that are
*not* column definition lists.  The original patch accepted nonsense like

create cast (text collate C as text collate de_DE) with ...

and we'd have had to do quite a lot of additional hacking to plug those
holes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Bruce Momjian
Robert Haas wrote:
 On Thu, Mar 10, 2011 at 10:21 AM, Bruce Momjian br...@momjian.us wrote:
  Tom Lane wrote:
  Itagaki Takahiro itagaki.takah...@gmail.com writes:
   I found trivial mistakes in the recently added files.
   Will they fixed by some automated batches, or by manual?
 
   - Copyright (c) xxx-*2010*, PostgreSQL Global Development Group
   ? in pg_collation.h, pg_foreign_table.h, basebackup.h, syncrep.h,
   ? pg_backup_directory.c and auth_delay.c.
   - IDENTIFICATION $PostgreSQL$ in pg_collation.h, syncrep.h, and 
   syncrep.c
   ? Other files has their actual paths in the same place.
 
  It might be worth Bruce making another run of his copyright-update
  script to fix the former problems. ?As for the latter problems,
 
  I ran it just now and nothing was changed, so we are OK now.
 
 Speaking of running scripts, I think we should run pgindent now.

We usually do it during a late beta.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 Speaking of running scripts, I think we should run pgindent now.

 We usually do it during a late beta.

Last time we did it early and then again late, and that seemed to work
well.  I wouldn't object to a pgindent run now, but please sync with me
before you do --- I've got some heavy hacking to do on the collations
patch, and don't want to find myself trying to merge changes after a
pgindent run.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] collecting open items for PG 9.1

2011-03-10 Thread Robert Haas
Now that alpha4 is out the door (and the bug reports have begun to
roll in), we should probably give some more serious thought to the
road from here to beta1.  There's a partial list of open items here:

http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items

Many of those items related to synchronous replication, but I think
that's because Fujii Masao just made a big update rather than due to
any lack of open items elsewhere - in particular, it seems like there
may be some open items related to collation support, and perhaps other
things.  I think it would be helpful if anyone who is aware of other
things that ought to be addressed before we go to beta could add them
there - that way, we have a clear list that everyone can see of what
we need to hammer through, and we can start hammering it.

I am also curious what people think would be a realistic date to shoot
for in terms of beta1.  My first thought would be about a month from
now, i.e. the second full week in April, but I have no idea whether
that matches anyone else's thoughts on the matter.  I think if it's
going to take any longer than that, though, we probably ought to put
out another alpha around the end of March.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Robert Haas wrote:
  Speaking of running scripts, I think we should run pgindent now.
 
  We usually do it during a late beta.
 
 Last time we did it early and then again late, and that seemed to work
 well.  I wouldn't object to a pgindent run now, but please sync with me
 before you do --- I've got some heavy hacking to do on the collations
 patch, and don't want to find myself trying to merge changes after a
 pgindent run.

Andrew needs to update the typedef list in our GIT tree first.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 Speaking of running scripts, I think we should run pgindent now.

 We usually do it during a late beta.

 Last time we did it early and then again late, and that seemed to work
 well.  I wouldn't object to a pgindent run now, but please sync with me
 before you do --- I've got some heavy hacking to do on the collations
 patch, and don't want to find myself trying to merge changes after a
 pgindent run.

Yeah, +1 for doing it as soon as Tom is at a good stopping point.  It
makes things a lot simpler later on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [DOCS] Sync rep doc corrections

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 11:43 PM, Bruce Momjian br...@momjian.us wrote:
 Thom Brown wrote:
 On 7 March 2011 22:31, Robert Haas robertmh...@gmail.com wrote:
  On Mon, Mar 7, 2011 at 6:16 AM, Thom Brown t...@linux.com wrote:
  On 7 March 2011 15:27, Thom Brown t...@linux.com wrote:
  I've attached a small patch with a bit of clarification and a typo fix
  in the synchronous_standby_names parameter info.
 
  Okay, I've noticed that the main documentation also needed some fixes,
  so those have been included in this new patch.
 
  I don't think it's necessary to cross-post these emails

 I've received conflicting information in the past on this, but I'm
 assuming you're suggesting to send this just to the -docs list in
 future?

 I send pure woring changes only to the docs list, and items that are
 related to behavior and docs to both.  Does that help?

Well, there's basically no point in posting patches you've already
committed.  We have a list where those get posted, and it's
-committers.  If you're referring to patches you haven't committed
yet, then the effect of sending them to both -docs and -hackers is
presumably to encourage even fewer people to read and respond to
traffic on -docs than already do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Magnus Hagander
On Thu, Mar 10, 2011 at 16:50, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 10, 2011 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 Speaking of running scripts, I think we should run pgindent now.

 We usually do it during a late beta.

 Last time we did it early and then again late, and that seemed to work
 well.  I wouldn't object to a pgindent run now, but please sync with me
 before you do --- I've got some heavy hacking to do on the collations
 patch, and don't want to find myself trying to merge changes after a
 pgindent run.

 Yeah, +1 for doing it as soon as Tom is at a good stopping point.  It
 makes things a lot simpler later on.

Agreed.

With git in play, it should be quite possible to merge with head just
before the pgindent run, then run pgindent on your local topic branch,
and then merge with head after the pgindent run - that should take
care of *most* of the conflicts, I think - as long as you use the same
typdef list.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Speaking of running scripts, I think we should run pgindent now.

 Yeah, +1 for doing it as soon as Tom is at a good stopping point.  It
 makes things a lot simpler later on.

IIRC the argument for an early pgindent run was to standardize the new
code for easier review.  I expect to be spending a whole lot of time
reading collate and SSI code over the next few weeks, so I'm in favor
of pgindent'ing that stuff first.  But I guess we need the typedef
list update before anything can happen.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Andrew Dunstan



On 03/10/2011 10:33 AM, Robert Haas wrote:

On Thu, Mar 10, 2011 at 10:21 AM, Bruce Momjianbr...@momjian.us  wrote:

Tom Lane wrote:

Itagaki Takahiroitagaki.takah...@gmail.com  writes:

I found trivial mistakes in the recently added files.
Will they fixed by some automated batches, or by manual?
- Copyright (c) xxx-*2010*, PostgreSQL Global Development Group
   in pg_collation.h, pg_foreign_table.h, basebackup.h, syncrep.h,
   pg_backup_directory.c and auth_delay.c.
- IDENTIFICATION $PostgreSQL$ in pg_collation.h, syncrep.h, and syncrep.c
   Other files has their actual paths in the same place.

It might be worth Bruce making another run of his copyright-update
script to fix the former problems.  As for the latter problems,

I ran it just now and nothing was changed, so we are OK now.

Speaking of running scripts, I think we should run pgindent now.



Please wait a few days at least. I am just setting up a new server at 
this very moment (now that SL6 is released) and will get a new FBSD 
buildfarm member extracting typedefs running.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 10:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Speaking of running scripts, I think we should run pgindent now.

 Yeah, +1 for doing it as soon as Tom is at a good stopping point.  It
 makes things a lot simpler later on.

 IIRC the argument for an early pgindent run was to standardize the new
 code for easier review.  I expect to be spending a whole lot of time
 reading collate and SSI code over the next few weeks, so I'm in favor
 of pgindent'ing that stuff first.  But I guess we need the typedef
 list update before anything can happen.

That's one good reason.  Another is that this is presumably the time
of the cycle when there are the fewest outstanding patches, making it
a good time for changes that are likely to conflict with lots of other
things.

At any rate, it sounds like Andrew needs a few days to get the typedef
list together, so let's wait for that to happen and then we'll see
where we are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Read uncommitted ever possible?

2011-03-10 Thread hans wulf
Hi,

if you want to do dirty counts or sums or any aggreate stuff, you will always 
have to visit the table. For many applications nobody cares about 0,01% 
inaccuracy.

If you could keep the data that has to be aggregated in the index you could 
approximate values really fast.

But because Read uncommitted is not implemented you will always have to visit 
the table. This is one reason why people have to still buy oracle.

I don't know the postgres code, but I don't thing it is a big deal, not to care 
about consistancy. The code for executing such a query should be quite basic, 
because no MVCC-Stuff has to be done.

Will this feature come any time soon? Even if Read uncommitted is a could 
read all sorts of old and dirty stuff it is still better than nothing.
-- 
NEU: FreePhone - kostenlos mobil telefonieren und surfen!   
Jetzt informieren: http://www.gmx.net/de/go/freephone

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Prefered Types

2011-03-10 Thread Zotov

Hello, i  have an old system where used implicit casting
float-integer
numeric-float
numeric-integer

I want define implicit casts, but postgresql don`t know cast priority
now postgresql have PREFERRED flag, but only flag
I can`t define prefer level like
Integer=0
Numeric=1
Float=2
Maybe
text = 2 or 3
and other to define My prefer cast more detail than just flag
i understand what it more dificult tuning, but more flexible
now i can only create duplicate operators like
numeric+integer, integer+numeric, integernumeric and many other
What can i do? Can i wait for prefer flag changed to prefer level?

--
С уважением,
Зотов Роман Владимирович
руководитель Отдела разработки
ЗАО НПО Консультант
г.Иваново, ул. Палехская, д. 10
тел./факс: (4932) 41-01-21
mailto: zo...@oe-it.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: pgindent (was Re: [HACKERS] Header comments in the recently added files)

2011-03-10 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I expect to be spending a whole lot of time reading collate and
 SSI code over the next few weeks, so I'm in favor of pgindent'ing
 that stuff first.
 
I've been running that throughout development, but it hasn't been
run after the last few changes.  If you want the SSI files in
pgindent format, you can get there by applying the attached patch.
 
-Kevin

*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***
*** 746,755  OldSerXidAdd(TransactionId xid, SerCommitSeqNo 
minConflictCommitSeqNo)
Assert(TransactionIdIsValid(tailXid));
  
/*
!* If the SLRU is currently unused, zero out the whole active region
!* from tailXid to headXid before taking it into use. Otherwise zero
!* out only any new pages that enter the tailXid-headXid range as we
!* advance headXid.
 */
if (oldSerXidControl-headPage  0)
{
--- 746,755 
Assert(TransactionIdIsValid(tailXid));
  
/*
!* If the SLRU is currently unused, zero out the whole active region 
from
!* tailXid to headXid before taking it into use. Otherwise zero out only
!* any new pages that enter the tailXid-headXid range as we advance
!* headXid.
 */
if (oldSerXidControl-headPage  0)
{
***
*** 855,862  OldSerXidSetActiveSerXmin(TransactionId xid)
/*
 * When no sxacts are active, nothing overlaps, set the xid values to
 * invalid to show that there are no valid entries.  Don't clear 
headPage,
!* though.  A new xmin might still land on that page, and we don't want
!* to repeatedly zero out the same page.
 */
if (!TransactionIdIsValid(xid))
{
--- 855,862 
/*
 * When no sxacts are active, nothing overlaps, set the xid values to
 * invalid to show that there are no valid entries.  Don't clear 
headPage,
!* though.  A new xmin might still land on that page, and we don't 
want to
!* repeatedly zero out the same page.
 */
if (!TransactionIdIsValid(xid))
{
***
*** 901,907  OldSerXidSetActiveSerXmin(TransactionId xid)
  void
  CheckPointPredicate(void)
  {
!   int tailPage;
  
LWLockAcquire(OldSerXidLock, LW_EXCLUSIVE);
  
--- 901,907 
  void
  CheckPointPredicate(void)
  {
!   int tailPage;
  
LWLockAcquire(OldSerXidLock, LW_EXCLUSIVE);
  
***
*** 1317,1328  SummarizeOldestCommittedSxact(void)
/*
 * This function is only called if there are no sxact slots available.
 * Some of them must belong to old, already-finished transactions, so
!* there should be something in FinishedSerializableTransactions list
!* that we can summarize. However, there's a race condition: while we
!* were not holding any locks, a transaction might have ended and 
cleaned
!* up all the finished sxact entries already, freeing up their sxact
!* slots. In that case, we have nothing to do here. The caller will find
!* one of the slots released by the other backend when it retries.
 */
if (SHMQueueEmpty(FinishedSerializableTransactions))
{
--- 1317,1328 
/*
 * This function is only called if there are no sxact slots available.
 * Some of them must belong to old, already-finished transactions, so
!* there should be something in FinishedSerializableTransactions list 
that
!* we can summarize. However, there's a race condition: while we were 
not
!* holding any locks, a transaction might have ended and cleaned up all
!* the finished sxact entries already, freeing up their sxact slots. In
!* that case, we have nothing to do here. The caller will find one of 
the
!* slots released by the other backend when it retries.
 */
if (SHMQueueEmpty(FinishedSerializableTransactions))
{
***
*** 2207,2213  PredicateLockTuple(const Relation relation, const HeapTuple 
tuple)
 */
if (relation-rd_index == NULL)
{
!   TransactionId   myxid;
  
targetxmin = HeapTupleHeaderGetXmin(tuple-t_data);
  
--- 2207,2213 
 */
if (relation-rd_index == NULL)
{
!   TransactionId myxid;
  
targetxmin = HeapTupleHeaderGetXmin(tuple-t_data);
  
***
*** 2217,  PredicateLockTuple(const Relation relation, const HeapTuple 
tuple)
--- 2217,2223 
if (TransactionIdFollowsOrEquals(targetxmin, 
TransactionXmin))
{
TransactionId xid = 
SubTransGetTopmostTransaction(targetxmin);
+ 
if (TransactionIdEquals(xid, myxid))
{

Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Bruce Momjian
hans wulf wrote:
 Hi,
 
 if you want to do dirty counts or sums or any aggreate stuff, you will
 always have to visit the table. For many applications nobody cares
 about 0,01% inaccuracy.
 
 If you could keep the data that has to be aggregated in the index you
 could approximate values really fast.
 
 But because Read uncommitted is not implemented you will always have
 to visit the table. This is one reason why people have to still buy
 oracle.
 
 I don't know the postgres code, but I don't thing it is a big deal,
 not to care about consistancy. The code for executing such a query
 should be quite basic, because no MVCC-Stuff has to be done.
 
 Will this feature come any time soon? Even if Read uncommitted is a
 could read all sorts of old and dirty stuff it is still better than
 nothing.

Dirty reads are unlikely to be implemented.  We do have a TODO item and
wiki page about how to allow index scans without heap access:

http://wiki.postgresql.org/wiki/Index-only_scans

--
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Greg Stark
On Thu, Mar 10, 2011 at 12:31 PM, hans wulf lo...@gmx.net wrote:
 I don't know the postgres code, but I don't thing it is a big deal, not to 
 care about consistancy. The code for executing such a query should be quite 
 basic, because no MVCC-Stuff has to be done.

I remember I used to think this would be simple -- just return all
rows regardless of visibility.

However I later learned things were more complex than that. You
probably want to return the latest version of any row regardless of
whether it's committed but *not* return two or more copies of the same
row which would really make the results entirely meaningless. That
alone would make it prohibitively hard to do.

I think I remember issues with potentially running into old rows that
don't even match the current definition of the table. That would
potentially cause you to crash or output garbage. However offhand I
don't see how that would be possible so perhaps I'm misremembering
this issue.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Merlin Moncure
On Thu, Mar 10, 2011 at 6:31 AM, hans wulf lo...@gmx.net wrote:
 Hi,

 if you want to do dirty counts or sums or any aggreate stuff, you will always 
 have to visit the table. For many applications nobody cares about 0,01% 
 inaccuracy.

 If you could keep the data that has to be aggregated in the index you could 
 approximate values really fast.

 But because Read uncommitted is not implemented you will always have to 
 visit the table. This is one reason why people have to still buy oracle.

 I don't know the postgres code, but I don't thing it is a big deal, not to 
 care about consistancy. The code for executing such a query should be quite 
 basic, because no MVCC-Stuff has to be done.

 Will this feature come any time soon? Even if Read uncommitted is a could 
 read all sorts of old and dirty stuff it is still better than nothing.

Oracle has a different mvcc implementation than postgres. We keep a
lot more records of questionable visibility around in the heap so in
most real world cases your 0.01% could be 50% inaccuracy or worse.

As Bruce noted the direction the postgres project has taken has been
to limit the downsides of our mvcc implementation.  A lot of the work
in the 8.x cycle (HOT, visibility map, etc) has been laying the
groundwork for the performance benefits you want without
cheating...and covering index scans (such that they are possible) are
on the radar.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Greg Stark
On Thu, Mar 10, 2011 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote:
 1. Somebody inserts a bunch of new tuples into the relation, causing
 growth in the index.

In case it's not obvious VACUUM FULL would do precisely that.

 2. Before the blocks are written to disk, system crash.
 3. pg_resetxlog

Do you have any record of the VACUUM FULL committing? Do you know what
LSN its commit record is at?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Jesper Krogh

On 2011-03-10 18:00, Bruce Momjian wrote:


 Dirty reads are unlikely to be implemented. We do have a TODO item
 and wiki page about how to allow index scans without heap access:

 http://wiki.postgresql.org/wiki/Index-only_scans


I think we (the company I work for) would help co-sponsor such
a feature. Would it be ok to add a section on the wiki with
a list of potential sponsors that might in total be able to sponsor
development of such a feature? Then perhaps a developer would
drop by.

... it would be easier if there was a feeling about how much actually
is required.

... can anyone create wiki accounts?

--
Jesper



Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Bruce Momjian
Jesper Krogh wrote:
 On 2011-03-10 18:00, Bruce Momjian wrote:
 
   Dirty reads are unlikely to be implemented. We do have a TODO item
   and wiki page about how to allow index scans without heap access:
 
   http://wiki.postgresql.org/wiki/Index-only_scans
 
 I think we (the company I work for) would help co-sponsor such
 a feature. Would it be ok to add a section on the wiki with
 a list of potential sponsors that might in total be able to sponsor
 development of such a feature? Then perhaps a developer would
 drop by.
 
 ... it would be easier if there was a feeling about how much actually
 is required.
 
 ... can anyone create wiki accounts?

Sure, anyone can add text to that wiki;  you create a community account
here:

http://www.postgresql.org/community/signup

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 12:52 PM, Greg Stark gsst...@mit.edu wrote:
 On Thu, Mar 10, 2011 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote:
 1. Somebody inserts a bunch of new tuples into the relation, causing
 growth in the index.

 In case it's not obvious VACUUM FULL would do precisely that.

Oh, I didn't even think about that.  Yeah, that could be it, too.  So
maybe VACUUM FULL - crash before checkpoint - problem with recovery?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Jesper Krogh

On 2011-03-10 19:25, Bruce Momjian wrote:


Sure, anyone can add text to that wiki;  you create a community account
here:

http://www.postgresql.org/community/signup


Suggestion: Add this url to the login box on the wiki.

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collecting open items for PG 9.1

2011-03-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Now that alpha4 is out the door (and the bug reports have begun to
 roll in), we should probably give some more serious thought to the
 road from here to beta1.  There's a partial list of open items here:
 http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items

 Many of those items related to synchronous replication, but I think
 that's because Fujii Masao just made a big update rather than due to
 any lack of open items elsewhere - in particular, it seems like there
 may be some open items related to collation support, and perhaps other
 things.  I think it would be helpful if anyone who is aware of other
 things that ought to be addressed before we go to beta could add them
 there - that way, we have a clear list that everyone can see of what
 we need to hammer through, and we can start hammering it.

Yeah, I currently have a list of about two dozen things I don't like
about collations, though some of those may reduce to this needs to be
commented better once I understand the code more fully.  That list is in
no shape to be put on the wiki though; it's mostly not intelligible to
anybody but me, and it's changing too fast anyway.  I'm currently hoping
to be done with that topic in a week or so.

 I am also curious what people think would be a realistic date to shoot
 for in terms of beta1.  My first thought would be about a month from
 now, i.e. the second full week in April, but I have no idea whether
 that matches anyone else's thoughts on the matter.  I think if it's
 going to take any longer than that, though, we probably ought to put
 out another alpha around the end of March.

Historically we've declared it beta once we think we are done with
initdb-forcing problems.  There are certainly some catversion bumps that
are going to come out of the collation stuff, because of changes in
expression node contents affecting stored rules.  But the other areas
that seem likely to be pretty buggy, like SSI and sync rep, operate
mostly below the level of anything that might require a catversion bump.
In any case, the existence of pg_upgrade means that might we need
another initdb? is not as strong a consideration as it once was, so
I'm not sure if we should still use that as a criterion.  I don't know
quite what ready for beta should mean otherwise, though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Simon Riggs
On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote:
 Simon Riggs wrote:
  On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote:
  
   postgres=# SELECT application_name, state, sync_priority, sync_state
   FROM pg_stat_replication;
application_name |   state   | sync_priority | sync_state
   --+---+---+
one  | STREAMING | 1 | POTENTIAL
two  | streaming | 2 | sync
   (2 rows)
  
  Bug! Thanks.
 
 Is there a reason these status are all upper-case?

NOT AS FAR AS I KNOW.

I'll add it to the list of changes for beta.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collecting open items for PG 9.1

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Now that alpha4 is out the door (and the bug reports have begun to
 roll in), we should probably give some more serious thought to the
 road from here to beta1.  There's a partial list of open items here:
 http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items

 Many of those items related to synchronous replication, but I think
 that's because Fujii Masao just made a big update rather than due to
 any lack of open items elsewhere - in particular, it seems like there
 may be some open items related to collation support, and perhaps other
 things.  I think it would be helpful if anyone who is aware of other
 things that ought to be addressed before we go to beta could add them
 there - that way, we have a clear list that everyone can see of what
 we need to hammer through, and we can start hammering it.

 Yeah, I currently have a list of about two dozen things I don't like
 about collations, though some of those may reduce to this needs to be
 commented better once I understand the code more fully.  That list is in
 no shape to be put on the wiki though; it's mostly not intelligible to
 anybody but me, and it's changing too fast anyway.  I'm currently hoping
 to be done with that topic in a week or so.

OK, that sounds promising.

 I am also curious what people think would be a realistic date to shoot
 for in terms of beta1.  My first thought would be about a month from
 now, i.e. the second full week in April, but I have no idea whether
 that matches anyone else's thoughts on the matter.  I think if it's
 going to take any longer than that, though, we probably ought to put
 out another alpha around the end of March.

 Historically we've declared it beta once we think we are done with
 initdb-forcing problems.  There are certainly some catversion bumps that
 are going to come out of the collation stuff, because of changes in
 expression node contents affecting stored rules.  But the other areas
 that seem likely to be pretty buggy, like SSI and sync rep, operate
 mostly below the level of anything that might require a catversion bump.
 In any case, the existence of pg_upgrade means that might we need
 another initdb? is not as strong a consideration as it once was, so
 I'm not sure if we should still use that as a criterion.  I don't know
 quite what ready for beta should mean otherwise, though.

Well, I guess what I really trying to estimate was how much time will
it take us to fix the problems we already know about.  It's not
impossible for replication-related items to force a catversion bump
if, for example, we change the definition of pg_stat_replication, but
in any event I've never been that excited about avoiding initdb
between beta and release.  The main issue in my book is that we're not
going to actually release until we've fixed all the known regressions,
and we're not going to start finding those until we push things that
we THINK are relatively stable and then wait for the complaints to
start rolling in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Basic Recovery Control functions for use in Hot Standby. Pause,

2011-03-10 Thread Robert Haas
On Tue, Mar 8, 2011 at 5:48 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Mar 8, 2011 at 12:04 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Feb 9, 2011 at 5:12 PM, Magnus Hagander mag...@hagander.net wrote:
 I was also worried about the non-hot-standby case, but I see that the
 patch makes sure you can't enable pause when not in hot standby mode.
 Which in itself might be surprising - perhaps we need a NOTICE for
 when that happens as well?

 I didn't include this fix in the patch because I prefer FATAL to
 NOTICE for that.
 NOTICE doesn't stop recovery. So we might be unable to notice such a NOTICE
 message and stop the recovery before it's too late, i.e., the recovery has
 completed at the undesirable point. So I think that emitting FATAL is safer.

 I included this fix in the patch, which emits FATAL if 
 pause_at_recovery_target
 is enabled while hot standby is disabled and the recovery target is set.

Eh, this is problematic, because you can't claim in the documentation
(and the comments in recovery.conf.sample) that the parameter has no
effect when Hot Standby is not enabled, and then at the same time make
that combination a FATAL error.  I don't have a strong opinion on
whether to change the docs or make it not FATAL, but the two have to
match.

Committing the rest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 was.  So you could then say things like is the most recent time at
 which the standby was caught up within the last 30 seconds?, which
 would be a useful thing to monitor, and right now there's no way to do

Well in my experience with replication, that's not what I want to
monitor.  If the standby is synchronous, then it's not catching up, it's
streaming.  If it were not, it would not be a synchronous standby.

When a standby is asynchronous then what I want to monitor is its lag.

So the CATCHUP state is useful to see that a synchronous standby
candidate can not yet be a synchronous standby.  When it just lost its
synchronous status (and hopefully another standby is now the sync one),
then it's just asynchronous and I want to know its lag.

 it.  There's also a BACKUP state, but I'm not sure it makes sense to
 lump that in with the others.  Some day it might be possible to stream
 WAL and take a backup at the same time, over the same connection.
 Maybe that should be a separate column or something.

BACKUP is still meaningful if you stream WAL at the same time, because
you're certainly *not* applying them while doing the base backup, are
you?  So you're not yet a standby, that's what BACKUP means.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 2:42 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 was.  So you could then say things like is the most recent time at
 which the standby was caught up within the last 30 seconds?, which
 would be a useful thing to monitor, and right now there's no way to do

 Well in my experience with replication, that's not what I want to
 monitor.  If the standby is synchronous, then it's not catching up, it's
 streaming.  If it were not, it would not be a synchronous standby.

 When a standby is asynchronous then what I want to monitor is its lag.

 So the CATCHUP state is useful to see that a synchronous standby
 candidate can not yet be a synchronous standby.  When it just lost its
 synchronous status (and hopefully another standby is now the sync one),
 then it's just asynchronous and I want to know its lag.

Yeah, maybe.  The trick is how to measure the lag.  I proposed the
above scheme mostly as a way of giving the user some piece of
information that can be measured in seconds rather than WAL position,
but I'm open to better ideas.  Monitoring is pretty hard to do at all
in 9.0; in 9.1, we'll be able to tell them how many *bytes* behind
they are, but there's no easy way to figure out what that means in
terms of wall-clock time, which I think would be useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.

2011-03-10 Thread Robert Haas
On Mon, Mar 7, 2011 at 6:21 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Mar 7, 2011 at 5:27 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Mar 7, 2011 at 7:51 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Efficient transaction-controlled synchronous replication.
 If a standby is broadcasting reply messages and we have named
 one or more standbys in synchronous_standby_names then allow
 users who set synchronous_replication to wait for commit, which
 then provides strict data integrity guarantees. Design avoids
 sending and receiving transaction state information so minimises
 bookkeeping overheads. We synchronize with the highest priority
 standby that is connected and ready to synchronize. Other standbys
 can be defined to takeover in case of standby failure.

 This version has very strict behaviour; more relaxed options
 may be added at a later date.

 Pretty cool! I'd appreciate very much your efforts and contributions.

 Here are another comments:

        if ((wrote_xlog  XactSyncCommit) || forceSyncCommit || nrels  0 ||
 SyncRepRequested())

 Whenever synchronous_replication is TRUE, we disable synchronous_commit.
 But, before disabling that, we should check also max_wal_senders and
 synchronous_standby_names? Otherwise, synchronous_commit can
 be disabled unexpectedly even in non replication case.

Yeah, that's bad.  At the risk of repeating myself, I don't think this
code should be checking SyncRepRequested() in the first place.  If the
user has turned off synchronous_commit, then we should just commit
asynchronously, even if sync rep is otherwise in force.  Otherwise,
this if statement is going to get really complicated.   The logic is
already at least mildly wrong here anyway: clearly we do NOT need to
commit synchronously if the transaction has not written xlog, even if
sync rep is enabled.

 -                       /* Let the master know that we received some data. */
 -                       XLogWalRcvSendReply();
 -                       XLogWalRcvSendHSFeedback();

 This change completely eliminates the difference between write_location
 and flush_location in pg_stat_replication. If this change is reasoable, we
 should get rid of write_location from pg_stat_replication since it's useless.
 If not, this change should be reverted. I'm not sure whether monitoring
 the difference between write and flush locations is useful. But I guess that
 someone thought so and that code was added.

I could go either way on this but clearly we need to do one or the other.

 +       /*
 +        * Current location of the head of the queue. All waiters should have
 +        * a waitLSN that follows this value, or they are currently being 
 woken
 +        * to remove themselves from the queue. Protected by SyncRepLock.
 +        */
 +       XLogRecPtr      lsn;

 The comment , or they are currently being woken to remove themselves
 from the queue is no longer required because the proc is currently removed
 by walsender.

Fixed.

 I found some typos. The attached patch fixes them.

Committed with minor changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] select_common_collation callers way too ready to throw error

2011-03-10 Thread Peter Eisentraut
On ons, 2011-03-09 at 18:07 -0500, Tom Lane wrote:
 The first of these errors is OK, but surely the second is not, because
 ||
 doesn't give a fig about collations.  I think instead of this:
 
 /* XXX: If we knew which functions required collation
 information,
  * we could selectively set the last argument to true here. */
 funccollid = select_common_collation(pstate, fargs, false);
 
 we need:
 
 /*
  * If we knew which functions required collation information,
  * we could selectively set the last argument to false here,
  * allowing the error to be reported at parse time not
 runtime.
  */
 funccollid = select_common_collation(pstate, fargs, true);
 
 Now the downside of that is that a runtime failure won't give you an
 parse error pointer to indicate which function is having trouble ...
 but having an error pointer for an error that shouldn't be thrown in
 the first place is small consolation.

Sounds reasonable.

Btw., the ultimate plan here was that functions or operators that would
care about collation would be marked as such in pg_proc.  That plan
basically just ran out of time, but if you think it'd be useful, maybe
we could reactivate it quickly.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] configure gaps

2011-03-10 Thread Bruce Momjian
Andrew Dunstan wrote:
 
 
 On 02/21/2011 09:33 PM, Bruce Momjian wrote:
  Tom Lane wrote:
  Andrew Dunstanand...@dunslane.net  writes:
  I propose that we add the following test for this case:
   AC_CHECK_HEADER(Python.h, [], [AC_MSG_ERROR([header filePython.h
   is required for Python])])
  You'd need to pay attention to python_includespec, but otherwise seems
  reasonable.
  Did this get done?  If so, I don't see it.
 
 Oh, no. It skipped my TODO list. I'll try to get it done in the next day 
 or so.

I have added this to the 9.1 open items list:

http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items#Code_Issues


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] select_common_collation callers way too ready to throw error

2011-03-10 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On ons, 2011-03-09 at 18:07 -0500, Tom Lane wrote:
 Now the downside of that is that a runtime failure won't give you an
 parse error pointer to indicate which function is having trouble ...
 but having an error pointer for an error that shouldn't be thrown in
 the first place is small consolation.

 Sounds reasonable.

 Btw., the ultimate plan here was that functions or operators that would
 care about collation would be marked as such in pg_proc.  That plan
 basically just ran out of time, but if you think it'd be useful, maybe
 we could reactivate it quickly.

I think it's worth doing at some point, but not right now.  By the time
you get done adding support for the flag to CREATE FUNCTION, pg_dump,
etc, it's not a small task.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Theory of operation of collation patch

2011-03-10 Thread Peter Eisentraut
On tis, 2011-03-08 at 20:52 -0500, Tom Lane wrote:
 I think we should drop collate clause from TypeName and just have it
 in columnDef and the expression syntax.

Yes, that sounds better in retrospect.  It's easier to see that now that
we see all the cases where it's used and not used.

 This might also ease the
 ambiguity problem that evidently led you to restrict the expression
 production's argument to c_expr.

Maybe, but I seem to recall that I did actually check and concluded that
c_expr covers all cases where collate clause is allowed.  We could of
course allow more cases, but maybe it's not necessary.

 It would also allow us to meet the
 letter of the spec for column definition, in that collate clause
 is not required to immediately follow data type.

Note that that case is listed under a separate feature.  I'm not sure if
it's worth supporting, but if they bothered putting it in it's probably
for compatibility with some existing implementation.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 they are, but there's no easy way to figure out what that means in
 terms of wall-clock time, which I think would be useful.

Jan Wieck had a detailed proposal to make that happen at last developper
meeting, but then ran out of time to implement it for 9.1 it seems.  The
idea was basically to have a ticker in core, an SRF that would associate
txid_snapshot with wall clock time.  Lots of good things would come from
that.

  http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php

Of course if you think that's important enough for you to implement it
between now and beta, that would be great :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] configure gaps

2011-03-10 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Andrew Dunstan wrote:
 
 
 On 02/21/2011 09:33 PM, Bruce Momjian wrote:
 Tom Lane wrote:
 Andrew Dunstanand...@dunslane.net  writes:
 I propose that we add the following test for this case:
 AC_CHECK_HEADER(Python.h, [], [AC_MSG_ERROR([header filePython.h
 is required for Python])])
 You'd need to pay attention to python_includespec, but otherwise seems
 reasonable.
 Did this get done?  If so, I don't see it.
 
 Oh, no. It skipped my TODO list. I'll try to get it done in the next day 
 or so.

 I have added this to the 9.1 open items list:

   http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items#Code_Issues

Waste of time, because that got done some time ago.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] configure gaps

2011-03-10 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Andrew Dunstan wrote:
  
  
  On 02/21/2011 09:33 PM, Bruce Momjian wrote:
  Tom Lane wrote:
  Andrew Dunstanand...@dunslane.net  writes:
  I propose that we add the following test for this case:
  AC_CHECK_HEADER(Python.h, [], [AC_MSG_ERROR([header filePython.h
  is required for Python])])
  You'd need to pay attention to python_includespec, but otherwise seems
  reasonable.
  Did this get done?  If so, I don't see it.
  
  Oh, no. It skipped my TODO list. I'll try to get it done in the next day 
  or so.
 
  I have added this to the 9.1 open items list:
 
  http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items#Code_Issues
 
 Waste of time, because that got done some time ago.

OK, thanks.  Removed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree parent pointer and checkpoints

2011-03-10 Thread Bruce Momjian

Has this been addressed?

---

Heikki Linnakangas wrote:
 On 13.11.2010 00:34, Greg Stark wrote:
  On Fri, Nov 12, 2010 at 7:20 PM, Heikki Linnakangas
  heikki.linnakan...@enterprisedb.com  wrote:
  I think we can work around that with a small modification to the page split
  algorithm. In a nutshell, when the child page is split, put a flag on the
  left half indicating that the rightlink must always be followed, regardless
  of the NSN. When the downlink is inserted to the parent, clear the flag.
  Setting and clearing of these flags need to be performed during WAL replay
  as well.
 
  Does this not cause duplicate results? Or does GIST already have to be
  prepared to deal with duplicate results?
 
 The GiST search algorithm avoids duplicate results by remembering the 
 LSN on the parent page when it follows a downlink. The split currently 
 happens like this:
 
 0. (the child page is locked)
 1. The parent page is locked.
 2. The child page is split. The original page becomes the left half, and 
 new buffers are allocated for the right halves.
 3. The downlink is inserted on the parent page (and the original 
 downlink is updated to reflect only the keys that stayed on the left 
 page). While keeping the child pages locked, the NSN field on the 
 children are updated with the new LSN of the parent page.
 
 To avoid duplicates, when a scan looks at the child page, it needs to 
 know if it saw the parent page before or after the downlink was 
 inserted. If it saw it before, the scan needs to follow the rightlink to 
 the right half, otherwise it will follow the downlink as usual (if it 
 matched). The scan checks that by comparing the LSN it saw on the parent 
 page with the NSN on the child page. If parent LSN  NSN, we saw the 
 parent before the downlink was inserted.
 
 Now, the problem with crash recovery is that the above algorithm depends 
 on the split to keep the parent and child locked until the downlink is 
 inserted in the parent. If you crash between steps 2 and 3, the locks 
 are gone. If a later insert then updates the parent page, because of a 
 split on some unrelated child page, that will bump the LSN of the parent 
 above the NSN on the child. Scans will see that the parent LSN  child 
 NSN, and will no longer follow the rightlink.
 
 And the fix for that is to set a flag on the child page indicating that 
 rightlink has to be always followed regardless of the LSN/NSN, because 
 the downlink hasn't been inserted yet. When the downlink is inserted, 
 the flag is cleared and we rely on the existing LSN/NSN mechanism to 
 avoid duplicate results.
 
 -- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] configure gaps

2011-03-10 Thread Andrew Dunstan



On 03/10/2011 03:14 PM, Bruce Momjian wrote:

Did this get done?  If so, I don't see it.

Oh, no. It skipped my TODO list. I'll try to get it done in the next day
or so.

I have added this to the 9.1 open items list:

http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items#Code_Issues



It's been done. See 
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=42e5223648e7e56f8041bed329929dbd5529ea92


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree parent pointer and checkpoints

2011-03-10 Thread Bruce Momjian
Bruce Momjian wrote:
 
 Has this been addressed?

I see we have with this commit:

9de3aa65f01fb51cbc725e8508ea233e4e92c46c

---


 
 ---
 
 Heikki Linnakangas wrote:
  On 13.11.2010 00:34, Greg Stark wrote:
   On Fri, Nov 12, 2010 at 7:20 PM, Heikki Linnakangas
   heikki.linnakan...@enterprisedb.com  wrote:
   I think we can work around that with a small modification to the page 
   split
   algorithm. In a nutshell, when the child page is split, put a flag on the
   left half indicating that the rightlink must always be followed, 
   regardless
   of the NSN. When the downlink is inserted to the parent, clear the flag.
   Setting and clearing of these flags need to be performed during WAL 
   replay
   as well.
  
   Does this not cause duplicate results? Or does GIST already have to be
   prepared to deal with duplicate results?
  
  The GiST search algorithm avoids duplicate results by remembering the 
  LSN on the parent page when it follows a downlink. The split currently 
  happens like this:
  
  0. (the child page is locked)
  1. The parent page is locked.
  2. The child page is split. The original page becomes the left half, and 
  new buffers are allocated for the right halves.
  3. The downlink is inserted on the parent page (and the original 
  downlink is updated to reflect only the keys that stayed on the left 
  page). While keeping the child pages locked, the NSN field on the 
  children are updated with the new LSN of the parent page.
  
  To avoid duplicates, when a scan looks at the child page, it needs to 
  know if it saw the parent page before or after the downlink was 
  inserted. If it saw it before, the scan needs to follow the rightlink to 
  the right half, otherwise it will follow the downlink as usual (if it 
  matched). The scan checks that by comparing the LSN it saw on the parent 
  page with the NSN on the child page. If parent LSN  NSN, we saw the 
  parent before the downlink was inserted.
  
  Now, the problem with crash recovery is that the above algorithm depends 
  on the split to keep the parent and child locked until the downlink is 
  inserted in the parent. If you crash between steps 2 and 3, the locks 
  are gone. If a later insert then updates the parent page, because of a 
  split on some unrelated child page, that will bump the LSN of the parent 
  above the NSN on the child. Scans will see that the parent LSN  child 
  NSN, and will no longer follow the rightlink.
  
  And the fix for that is to set a flag on the child page indicating that 
  rightlink has to be always followed regardless of the LSN/NSN, because 
  the downlink hasn't been inserted yet. When the downlink is inserted, 
  the flag is cleared and we rely on the existing LSN/NSN mechanism to 
  avoid duplicate results.
  
  -- 
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com
  
  -- 
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + It's impossible for everything to be true. +
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 they are, but there's no easy way to figure out what that means in
 terms of wall-clock time, which I think would be useful.

 Jan Wieck had a detailed proposal to make that happen at last developper
 meeting, but then ran out of time to implement it for 9.1 it seems.  The
 idea was basically to have a ticker in core, an SRF that would associate
 txid_snapshot with wall clock time.  Lots of good things would come from
 that.

  http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php

 Of course if you think that's important enough for you to implement it
 between now and beta, that would be great :)

I think that's actually something a little different, and more
complicated, but I do think it'd be useful.  I was hoping there was a
simple way to get some kind of time-based information into
pg_stat_replication, but if there isn't, there isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql execute vs. SELECT ... INTO

2011-03-10 Thread Bruce Momjian
Andrew Dunstan wrote:
 
 
 On 11/05/2010 06:54 PM, Tom Lane wrote:
  Andrew Dunstanand...@dunslane.net  writes:
  The comment on the commit says:
   EXECUTE of a SELECT ... INTO now draws a 'not implemented' error,
   rather than executing the INTO clause with non-plpgsql semantics
   as it was doing for the last few weeks/months.  This keeps our options
   open for making it do the right plpgsql-ish thing in future without
   creating a backwards compatibility problem.  There is no loss of
   functionality since people can get the same behavior with CREATE 
  TABLE AS.
  Do we really still need to keep out options open on this after all that
  time?
  I think it's still a good idea that it won't do something that is very
  much different from what a non-EXECUTE'd SELECT INTO will do.
 
  I forget, is there a HINT there suggesting CREATE TABLE AS?  Maybe we
  should add one if not.
 
 No, (see below) we should certainly improve that and document the 
 behavior, if we're going to keep it.
 
  if (*ptr == 'S' || *ptr == 's')
  ereport(ERROR,
  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
   errmsg(EXECUTE of SELECT ... INTO is not 
 implemented),
   errhint(You might want to use EXECUTE ... 
 INTO instead.)));

Can someone suggest updated hint text, like this?

  errhint(You might want to use EXECUTE ... INTO or EXECUTE CREATE TABLE AS 
instead.)));

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Overlap operators for ARRAY and NULLs

2011-03-10 Thread Bruce Momjian

I have added a link to this from the exiting TODO item:

Improve handling of NULLs in arrays

* BUG #4509: array_cat's null behaviour is inconsistent
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01040.php 

---

Itagaki Takahiro wrote:
 There might be inconsistency in overlap operators for array
 types ( @, @, and  ) when the argument arrays contain NULLs.
 
 - SELECT 2 = ANY (ARRAY[1, NULL])   = NULL
 - SELECT ARRAY[2] @ ARRAY[1, NULL] = false
 
 NULL means unknown in definition, so should it return NULL
 rather than false?
 
 I found the issue when I read spec of MULTISET. In the SQL
 standard, the following expression returns NULL. I was thinking
 to make SUBMULTISET OF to be an alias of @ operator, but
 they seems to be incompatible.
 
 - MULTISET[2] SUBMULTISET OF MULTISET[1, NULL] = NULL
 
 Will we change the operator's behavior? It would be more
 consistent, but incompatible with older server versions.
 If impossible, I'll add separated submultiset_of() function
 instead of @ operator for MULTISET supports.
 
 -- 
 Itagaki Takahiro
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Martijn van Oosterhout
On Thu, Mar 10, 2011 at 10:34:00AM -0500, Tom Lane wrote:
 Hmm.  That suggests a third solution: revert the addition of *all* the
 collid fields except the ones that represent collation-to-apply-during-
 function-execution.  (So they'd still be there in FuncExpr/OpExpr, but
 not most other places.)  Then we'd have to dig down more deeply in the
 expression tree during select_common_collation, but we'd save space
 and avoid confusion over the meaning of the fields.

Yeah, it occurred to me if you made each collate clause translate to a
collate node that changes the collation, a bit like casts, then the
parse nodes don't need to know about collation at all.

 I suspect this is probably not a good idea because of the added cost in
 select_common_collation: aside from probably needing more syscache
 lookups, there's a potential for worse-than-linear cost behavior if we
 have to repeatedly dig through a deep expression tree to find out
 collations.  We had a similar case in the past [ checks archives ... see
 http://archives.postgresql.org/pgsql-performance/2005-06/msg00075.php
 http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=ba4200246
 ] so I'm hesitant to go down that road again.  Still, I'll throw it out
 for comment.

Two things can make a difference here:

- If you knew which operators/functions cared about the collation, the
  cost could be manageable. We don't so...

- ISTM that in theory any algorithm that is defined by recursion at
  each node, should be calculatable via a single pass of the tree by
  something like parse_expr. That's essentially what the variables are
  doing in the Expr nodes, though whether you need one or two is
  ofcourse another question.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] Indent authentication overloading

2011-03-10 Thread Bruce Momjian

Added to TODO:

Rename unix domain socket 'ident' connections to 'peer', to avoid
confusion with TCP 'ident'

* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01053.php 

---

Magnus Hagander wrote:
 Currently, we overload indent meaning both unix socket
 authentication and ident over tcp, depending on what type of
 connection it is. This is quite unfortunate - one of them being one of
 the most secure options we have, the other one being one of the most
 *insecure* ones (really? ident over tcp? does *anybody* use that
 intentionally today?)
 
 Should we not consider naming those two different things?
 
 If not now, then at least put it on the TODO of things to do the next
 time we need to break backwards compatibility with the format of
 pg_hba.conf? Though if we're going to break backwards compatibility
 anywhere, pg_hba is probably one of the least bad places to do it...
 
 -- 
 ?Magnus Hagander
 ?Me: http://www.hagander.net/
 ?Work: http://www.redpill-linpro.com/
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.

2011-03-10 Thread Robert Haas
On Mon, Mar 7, 2011 at 4:47 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Anyway, the reload of the configuration file should not
 cause the server to end unexpectedly. IOW, GUC assign hook should
 use GUC_complaint_elevel instead of FATAL, in ereport. The attached
 patch fixes that, and includes two typo fixes.

Committed the typo fixes, and the GUC assign hook fix as a separate commit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Read uncommitted ever possible?

2011-03-10 Thread Nicolas Barbier
2011/3/10 Jesper Krogh jes...@krogh.cc:

 On 2011-03-10 19:25, Bruce Momjian wrote:

 Sure, anyone can add text to that wiki;  you create a community account
 here:

        http://www.postgresql.org/community/signup

 Suggestion: Add this url to the login box on the wiki.

+1, Adrian von Bidder had the same problem just two days ago.

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Indent authentication overloading

2011-03-10 Thread Magnus Hagander
On Thu, Mar 10, 2011 at 22:22, Bruce Momjian br...@momjian.us wrote:

 Added to TODO:

        Rename unix domain socket 'ident' connections to 'peer', to avoid
        confusion with TCP 'ident'

Should we consider adding peer as an alias for ident already in
9.1 (and change the default pg_hba.conf template), and then deprecate
ident for 9.2 and remove it in 9.3 or something? By adding the alias
now (yes, I know it's not in the last CF :P), we can move what's going
to be a long process up one release...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Thu, Mar 10, 2011 at 10:34:00AM -0500, Tom Lane wrote:
 I suspect this is probably not a good idea because of the added cost in
 select_common_collation: aside from probably needing more syscache
 lookups, there's a potential for worse-than-linear cost behavior if we
 have to repeatedly dig through a deep expression tree to find out
 collations.

 Two things can make a difference here:

 - If you knew which operators/functions cared about the collation, the
   cost could be manageable. We don't so...

Yeah, the possibility of skipping select_common_collation altogether for
most operators is pretty attractive.  Maybe we'll get to that before
we're done, but I don't want to assume it'll be done for 9.1.

 - ISTM that in theory any algorithm that is defined by recursion at
   each node, should be calculatable via a single pass of the tree by
   something like parse_expr. That's essentially what the variables are
   doing in the Expr nodes, though whether you need one or two is
   ofcourse another question.

We could do that if we were willing to go back and fill in the collation
fields after the whole expression tree is built.  If you want to fill in
at the time the FuncExpr/OpExpr is first built, then you will get O(N^2)
behavior from repeated calculations in a deep tree if you don't cache
the results for the lower levels.  Which is what the output-collation
fields would do for us.

A post-pass is not out of the question, but it's enough unlike
everything else the parser does that I'm not too thrilled about it.

Also, there's the issue that started the whole discussion, which is that
sometimes we *do* need to know, post-parse-analysis, what the result
collation of an expression tree is.  See CREATE VIEW.  If that's the
*only* thing that ever needed to know it, I wouldn't mind accepting a
double calculation of the collation for CREATE VIEW ... but somehow it
doesn't seem real likely that no other uses for the information will
emerge, and some of them might be more performance-critical.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Tom Lane
I wrote:
 A post-pass is not out of the question, but it's enough unlike
 everything else the parser does that I'm not too thrilled about it.

On the other hand ... one thing that's been bothering me is that
select_common_collation assumes that explicit collation derivation
doesn't bubble up in the tree, ie a COLLATE is only a forcing function
for the immediate parent expression node.  It's not at all clear to me
that that's a correct reading of the spec.  If it's not, the only way
we could make it work correctly in the current design is to keep
*two* additional fields, both the collation OID and an explicit/implicit
derivation flag.  Which would be well past the level of annoying.
But in a post-pass implementation it would be no great trouble to do
either one, and we'd not be looking at a forced initdb to change our
minds either.

Maybe a post-pass, with only collation-to-apply fields actually stored
in the tree, is the way to go.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Martijn van Oosterhout
On Thu, Mar 10, 2011 at 05:16:52PM -0500, Tom Lane wrote:
 On the other hand ... one thing that's been bothering me is that
 select_common_collation assumes that explicit collation derivation
 doesn't bubble up in the tree, ie a COLLATE is only a forcing function
 for the immediate parent expression node.  It's not at all clear to me
 that that's a correct reading of the spec.  If it's not, the only way
 we could make it work correctly in the current design is to keep
 *two* additional fields, both the collation OID and an explicit/implicit
 derivation flag.  Which would be well past the level of annoying.
 But in a post-pass implementation it would be no great trouble to do
 either one, and we'd not be looking at a forced initdb to change our
 minds either.

I beleive the current interpretation, that is the COLLATE only applies
to levels above, is the correct interpretation. COLLATE binds tightly,
so

A op B COLLATE C  parses as  A op (B COLLATE C)

which is why it works. It's actually the only way that makes sense,
otherwise it becomes completely impossible to specify different
collations for a function and its return value.

For example in your example with a view:

CREATE VIEW foo AS func(x COLLATE A) COLLATE B;

B is the collation for the output column, A is the collation for the
function.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-10 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Thu, Mar 10, 2011 at 05:16:52PM -0500, Tom Lane wrote:
 On the other hand ... one thing that's been bothering me is that
 select_common_collation assumes that explicit collation derivation
 doesn't bubble up in the tree, ie a COLLATE is only a forcing function
 for the immediate parent expression node.  It's not at all clear to me
 that that's a correct reading of the spec.

 I beleive the current interpretation, that is the COLLATE only applies
 to levels above, is the correct interpretation. COLLATE binds tightly,
 so

 A op B COLLATE C  parses as  A op (B COLLATE C)

 which is why it works.

No, that's not what I'm on about.  Consider

(((A COLLATE X) || B) || (C COLLATE Y))  (D COLLATE Z)

(I've spelled out the parenthesization in full for clarity, but most
of these parens could be omitted.)  Is this expression legal, or
should the  operator be throwing an error for conflicting
explicitly-derived collations?  Our code as it stands will take it,
because no individual operator sees more than one COLLATE among its
arguments.  But I'm not sure this is right.  The only text I can find
in SQL2008 that seems to bear on the point is in 4.2.2:

Anything that has a declared type can, if that type is a
character string type, be associated with a collation applicable
to its character set; this is known as a declared type
collation. Every declared type that is a character string type
has a collation derivation, this being either none, implicit, or
explicit. The collation derivation of a declared type with a
declared type collation that is explicitly or implicitly
specified by a data type is implicit. If the collation
derivation of a declared type that has a declared type collation
is not implicit, then it is explicit. The collation derivation
of an expression of character string type that has no declared
type collation is none.

As I read this, the collation attached to any Var clause is implicit
(because it came from the Var's data type), and the collation attached
to a CollateClause is presumably explicit, but where does it say what
happens at higher levels in the expression tree?  It's at least arguable
that the result collation of an expression is explicit if its input
collation was explicit.  The fact that the default in case of doubt
apparently is supposed to be explicit doesn't give any aid or comfort
to your position either.  If explicitness comes only from the immediate
use of COLLATE, why don't they say that?  This is worded to make one
think that most cases will have explicit derivation, not only COLLATE.

I wonder if anyone can check the behavior of nested collate clauses in
DB2 or some other probably-spec-conforming database.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Can't specify default collation?

2011-03-10 Thread Tom Lane
This seems a tad unfriendly:

db1=# create table foo (f1 text collate default);
ERROR:  collation default for current database encoding UTF8 does not exist
LINE 1: create table foo (f1 text collate default);
  ^

Not being able to explicitly specify the default behavior is a no-no
according to most people who have thought about language design for more
than a moment.

The reason it's failing is that default is entered into pg_collation
with collencoding = 0 (SQL_ASCII), and the lookup code is designed to
ignore all entries with collencoding different from the current
database's encoding.  So, in fact, the above command *will* work if
you're in a SQL_ASCII database.  Just not elsewhere.

What I'm inclined to do about this is set default's collencoding to
-1, with the semantics of works for any encoding, and fix the lookup
routines to try -1 if they don't get a match with the database encoding.
Having done that, we could also use -1 for C and POSIX, thus
avoiding having to make a bunch of duplicate entries for them.

BTW, I would like to eventually have C and POSIX in there all the
time (ie created by pg_collation.h), so that they can be used even in
machines that don't have locale_t support.  I haven't yet gotten around
to reading the parts of the collation patch that might need to change
to support this, so I'm not sure how much work it'd be.  But I'd say
that being able to do COLLATE C in an otherwise non-C database would
cover a very large fraction of the user requests I've read about this,
so being able to handle that case even without locale_t support would be
really useful IMO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git diff script is not portable

2011-03-10 Thread Bruce Momjian
Bruce Momjian wrote:
 Magnus Hagander wrote:
  On Wed, Nov 17, 2010 at 19:54, Bruce Momjian br...@momjian.us wrote:
   Looking here:
  
   ? ? ? ?http://wiki.postgresql.org/wiki/Working_with_Git
  
   the script we recommend for doing context diffs for git,
   git-external-diff, is not portable:
  
   ? ? ? ?http://anarazel.de/pg/git-external-diff
  
   It uses diff -L, which is not supported by FreeBSD, and I imagine many
   other operating systems.
  
   If we want people to use this to produce context diffs, we should
   provide a portable script. ?I can modify it to be portable, but it is
   currently hosted on some other site. ?How should I handle this? ? ?How
   do I contact the author, or perhaps I should create a new on on
   ftp.postgresql.org and link to that.
  
  Do it the git way - fork it and put it on your github page.
  
  Or do it the pg way - fork it and put it in src/tools.
 
 src/tools is a very good idea.  Objections?

Done, and linked to from our git wiki page:

http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Indent authentication overloading

2011-03-10 Thread Bruce Momjian
Magnus Hagander wrote:
 On Thu, Mar 10, 2011 at 22:22, Bruce Momjian br...@momjian.us wrote:
 
  Added to TODO:
 
  ? ? ? ?Rename unix domain socket 'ident' connections to 'peer', to avoid
  ? ? ? ?confusion with TCP 'ident'
 
 Should we consider adding peer as an alias for ident already in
 9.1 (and change the default pg_hba.conf template), and then deprecate
 ident for 9.2 and remove it in 9.3 or something? By adding the alias
 now (yes, I know it's not in the last CF :P), we can move what's going
 to be a long process up one release...

Well, we can certainly do that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-03-10 Thread Bruce Momjian

What happened with this patch?  Alvaro saw a 25x speedup.

---

Pavel Stehule wrote:
 Hello
 
 this patch remove a multiple detoasting of varlena values in plpgsql.
 
 It is usable mainly for iteration over longer array directly loaded
 from relation.
 
 It's doesn't have a impact on semantic or behave - it's just eliminate
 some performance trap.
 
 sample: table 1 rows one column with array with 1000 string fields:
 
 patched pl time: 6 sec
 unpatched pl time: 170 sec
 
 This doesn't change my opinion on FOR-IN-ARRAY cycle (is still
 important for readability) - just remove one critical performance
 issue
 
 Regards
 
 Pavel Stehule

[ Attachment, skipping... ]

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Indent authentication overloading

2011-03-10 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Thu, Mar 10, 2011 at 22:22, Bruce Momjian br...@momjian.us wrote:
 
 Added to TODO:
 
        Rename unix domain socket 'ident' connections to 'peer', to avoid
        confusion with TCP 'ident'

 Should we consider adding peer as an alias for ident already in
 9.1 (and change the default pg_hba.conf template), and then deprecate
 ident for 9.2 and remove it in 9.3 or something? By adding the alias
 now (yes, I know it's not in the last CF :P), we can move what's going
 to be a long process up one release...

It doesn't strike me as urgent enough to be worth pushing through in a
hurry.  We have got plenty to do to get 9.1 out the door already ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-03-10 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 What happened with this patch?  Alvaro saw a 25x speedup.

It got bounced.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(contants) evaluated for every row

2011-03-10 Thread Bruce Momjian
Robert Haas wrote:
 2010/11/25 pasman pasma?ski pasma...@gmail.com:
 I've seen this as well be a performance issue, in particular with 
 partitioned tables.
 Out of habit I now write functions that always cache the value of the 
 function in
 a variable and use the variable in the actual query to avoid this 
 particular gotcha.
 
  subquery may be used to cache constants:
 
  SELECT a_date
  ?FROM test
  ?WHERE a_date=(SELECT current_date);
 
 
  Seq Scan on test1 ?(cost=0.01..10.76 rows=5 width=4)
   ?Filter: (a_date = $0)
   ?InitPlan 1 (returns $0)
   ? ?- ?Result ?(cost=0.00..0.01 rows=1 width=0)
 
 Interesting.  So we pull a subquery out and treat it as an initplan,
 but not a stable function?  Hmm...

I assume this is _not_ a TODO.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(contants) evaluated for every row

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 7:32 PM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 2010/11/25 pasman pasma?ski pasma...@gmail.com:
 I've seen this as well be a performance issue, in particular with 
 partitioned tables.
 Out of habit I now write functions that always cache the value of the 
 function in
 a variable and use the variable in the actual query to avoid this 
 particular gotcha.
 
  subquery may be used to cache constants:
 
  SELECT a_date
  ?FROM test
  ?WHERE a_date=(SELECT current_date);
 
 
  Seq Scan on test1 ?(cost=0.01..10.76 rows=5 width=4)
   ?Filter: (a_date = $0)
   ?InitPlan 1 (returns $0)
   ? ?- ?Result ?(cost=0.00..0.01 rows=1 width=0)

 Interesting.  So we pull a subquery out and treat it as an initplan,
 but not a stable function?  Hmm...

 I assume this is _not_ a TODO.

Dunno, not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unnecessary limit on max_standby_streaming_delay

2011-03-10 Thread Bruce Momjian
Peter Eisentraut wrote:
 On fre, 2010-12-17 at 12:57 +0100, Magnus Hagander wrote:
  The limit on max_standby_streaming_delay is currently 35 minutes
  (around) - or you have to set it to unlimited. This is because the GUC
  is limited to MAX_INT/1000, unit milliseconds.
  
  Is there a reason for the /1000, or is it just an oversight thinking
  the unit was in seconds?
 
 Yeah, I actually noticed this week that log_min_duration_statement is
 limited to the same 35 minutes for the same reason.  Might be good to
 address that, too.  Note that statement_timeout does not have that
 limit.

Added to TODO:

Increase maximum values for max_standby_streaming_delay and
log_min_duration_statement

* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01517.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql expanded auto

2011-03-10 Thread Bruce Momjian

I have added a link to this on the TODO list:

Add auto expanded mode that outputs in expanded format if wrapped
mode can't wrap the output to the screen width

* Re: psql wrapped format default for backslash-d commands
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01638.php 

---

Peter Eisentraut wrote:
 I have often found myself wanting that psql automatically switch between
 normal and \x mode depending on the width of the output.  Would others
 find this useful?
 
 Attached is a crude demo patch.  Enable with \pset expanded auto.
 

[ Attachment, skipping... ]

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump -X

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 10:36 PM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 Back in 2006, we have this commit:

 commit 2b25e1169f44368c120931787628d51731b5cc8c
 Author: Peter Eisentraut pete...@gmx.net
 Date:   Sat Oct 7 20:59:05 2006 +

     The -X option in pg_dump was supposed to be a workaround for the lack of
     portable long options.  But we have had portable long options for a long
     time now, so this is obsolete.  Now people have added options which 
 *only*
     work with -X but not as regular long option, so I'm putting a stop to 
 this:
     -X is deprecated; it still works, but it has been removed from the
     documentation, and please don't add more of them.

 Since then, two additional -X options have crept in, doubtless due to
 mimicry of the existing options without examination of the commit
 logs.  I think we should either (a) remove the -X option altogether or
 (b) change the comment so that it clearly states the same message that
 appears here in the commit log, namely, that no new -X options are to
 be created.  The existing comment says that -X is deprecated, but that
 doesn't make it entirely 100% clear that the code isn't intended to be
 further updated, at least judging by the results.

 Code comment added with attached, applied patch.

At a minimum, we should probably also remove -X no-security-label and
-X no-unlogged-table-data, which don't exist in any released versions
(unless you want to count alphas).  But considering that this has been
deprecated and undocumented since 8.2, I think it might be time to
pull the plug on -X altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2011-03-10 Thread Fujii Masao
On Thu, Dec 9, 2010 at 4:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, 2010-12-09 at 00:16 +0100, Heikki Linnakangas wrote:
 On 09.12.2010 00:10, Heikki Linnakangas wrote:
  On 08.12.2010 16:00, Simon Riggs wrote:
  Heikki pointed out to me that the btree delete record processing does
  not respect vacuum_defer_cleanup_age. It should.
 
  Attached patch to implement that.
 
  Looking to commit in next few hours barring objections/suggestions, to
  both HEAD and 9_0_STABLE, in time for next minor release.
 
  Please note that it was Noah Misch that raised this a while ago:
 
  http://archives.postgresql.org/pgsql-hackers/2010-11/msg01919.php

 On closer look, that's not actually the same issue, sorry for the noise..

 Heikki, this one *is* important. Will fix. Thanks for the analysis Noah.

Is this an open item for 9.1?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-03-10 Thread Pavel Stehule
2011/3/11 Bruce Momjian br...@momjian.us:

 What happened with this patch?  Alvaro saw a 25x speedup.

There is not conformance about form in this patch. But there are a
FOREACH statement - so if somebody uses this statement, then he will
not have a problems with performance.

Regards

Pavel


 ---

 Pavel Stehule wrote:
 Hello

 this patch remove a multiple detoasting of varlena values in plpgsql.

 It is usable mainly for iteration over longer array directly loaded
 from relation.

 It's doesn't have a impact on semantic or behave - it's just eliminate
 some performance trap.

 sample: table 1 rows one column with array with 1000 string fields:

 patched pl time: 6 sec
 unpatched pl time: 170 sec

 This doesn't change my opinion on FOR-IN-ARRAY cycle (is still
 important for readability) - just remove one critical performance
 issue

 Regards

 Pavel Stehule

 [ Attachment, skipping... ]


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2011-03-10 Thread Bruce Momjian
Josh Berkus wrote:
 On 12/6/10 6:10 PM, Tom Lane wrote:
  Robert Haas robertmh...@gmail.com writes:
  On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus j...@agliodbs.com wrote:
  Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available.
  From my run, it looks like even so regular fsync might be better than
  open_sync.
  
  But I think you need to use fsync_writethrough if you actually want 
  durability.
  
  Yeah.  Unless your laptop contains an SSD, those numbers are garbage on
  their face.  So that's another problem with test_fsync: it omits
  fsync_writethrough.
 
 Yeah, the issue with test_fsync appears to be that it's designed to work
 without os-specific switches no matter what, not to accurately reflect
 how we access wal.

I have now modified pg_test_fsync to use O_DIRECT for O_SYNC/O_FSYNC,
and O_DSYNC, if supported, so it now matches how we use WAL (except we
don't use O_DIRECT when in 'archive' and 'hot standby' mode).  Applied
patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_test_fsync/pg_test_fsync.c b/contrib/pg_test_fsync/pg_test_fsync.c
new file mode 100644
index d075483..49a7b3c
*** a/contrib/pg_test_fsync/pg_test_fsync.c
--- b/contrib/pg_test_fsync/pg_test_fsync.c
***
*** 23,29 
  #define XLOG_BLCKSZ_K	(XLOG_BLCKSZ / 1024)
  
  #define LABEL_FORMAT		%-32s
! #define NA_FORMAT			LABEL_FORMAT %18s
  #define OPS_FORMAT			%9.3f ops/sec
  
  static const char *progname;
--- 23,29 
  #define XLOG_BLCKSZ_K	(XLOG_BLCKSZ / 1024)
  
  #define LABEL_FORMAT		%-32s
! #define NA_FORMAT			%18s
  #define OPS_FORMAT			%9.3f ops/sec
  
  static const char *progname;
*** handle_args(int argc, char *argv[])
*** 134,139 
--- 134,144 
  	}
  
  	printf(%d operations per test\n, ops_per_test);
+ #if PG_O_DIRECT != 0
+ 	printf(O_DIRECT supported on this platform for open_datasync and open_sync.\n);
+ #else
+ 	printf(Direct I/O is not supported on this platform.\n);
+ #endif
  }
  
  static void
*** test_sync(int writes_per_op)
*** 184,226 
  	/*
  	 * Test open_datasync if available
  	 */
! #ifdef OPEN_DATASYNC_FLAG
! 	printf(LABEL_FORMAT, open_datasync
! #if PG_O_DIRECT != 0
! 		 (non-direct I/O)*
! #endif
! 		);
  	fflush(stdout);
  
! 	if ((tmpfile = open(filename, O_RDWR | O_DSYNC, 0)) == -1)
! 		die(could not open output file);
! 	gettimeofday(start_t, NULL);
! 	for (ops = 0; ops  ops_per_test; ops++)
! 	{
! 		for (writes = 0; writes  writes_per_op; writes++)
! 			if (write(tmpfile, buf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
! die(write failed);
! 		if (lseek(tmpfile, 0, SEEK_SET) == -1)
! 			die(seek failed);
! 	}
! 	gettimeofday(stop_t, NULL);
! 	close(tmpfile);
! 	print_elapse(start_t, stop_t);
! 
! 	/*
! 	 * If O_DIRECT is enabled, test that with open_datasync
! 	 */
! #if PG_O_DIRECT != 0
  	if ((tmpfile = open(filename, O_RDWR | O_DSYNC | PG_O_DIRECT, 0)) == -1)
  	{
! 		printf(NA_FORMAT, o_direct, n/a**\n);
  		fs_warning = true;
  	}
  	else
  	{
! 		printf(LABEL_FORMAT, open_datasync (direct I/O));
! 		fflush(stdout);
! 
  		gettimeofday(start_t, NULL);
  		for (ops = 0; ops  ops_per_test; ops++)
  		{
--- 189,207 
  	/*
  	 * Test open_datasync if available
  	 */
! 	printf(LABEL_FORMAT, open_datasync);
  	fflush(stdout);
  
! #ifdef OPEN_DATASYNC_FLAG
  	if ((tmpfile = open(filename, O_RDWR | O_DSYNC | PG_O_DIRECT, 0)) == -1)
  	{
! 		printf(NA_FORMAT, n/a*\n);
  		fs_warning = true;
  	}
  	else
  	{
! 		if ((tmpfile = open(filename, O_RDWR | O_DSYNC | PG_O_DIRECT, 0)) == -1)
! 			die(could not open output file);
  		gettimeofday(start_t, NULL);
  		for (ops = 0; ops  ops_per_test; ops++)
  		{
*** test_sync(int writes_per_op)
*** 234,252 
  		close(tmpfile);
  		print_elapse(start_t, stop_t);
  	}
- #endif
- 
  #else
! 	printf(NA_FORMAT, open_datasync, n/a\n);
  #endif
  
  /*
   * Test fdatasync if available
   */
- #ifdef HAVE_FDATASYNC
  	printf(LABEL_FORMAT, fdatasync);
  	fflush(stdout);
  
  	if ((tmpfile = open(filename, O_RDWR, 0)) == -1)
  		die(could not open output file);
  	gettimeofday(start_t, NULL);
--- 215,231 
  		close(tmpfile);
  		print_elapse(start_t, stop_t);
  	}
  #else
! 	printf(NA_FORMAT, n/a\n);
  #endif
  
  /*
   * Test fdatasync if available
   */
  	printf(LABEL_FORMAT, fdatasync);
  	fflush(stdout);
  
+ #ifdef HAVE_FDATASYNC
  	if ((tmpfile = open(filename, O_RDWR, 0)) == -1)
  		die(could not open output file);
  	gettimeofday(start_t, NULL);
*** test_sync(int writes_per_op)
*** 263,269 
  	close(tmpfile);
  	print_elapse(start_t, stop_t);
  #else
! 	printf(NA_FORMAT, fdatasync, n/a\n);
  #endif
  
  /*
--- 242,248 
  	close(tmpfile);
  	print_elapse(start_t, stop_t);
  #else
! 	printf(NA_FORMAT, n/a\n);
  #endif
  
  /*
*** test_sync(int writes_per_op)
*** 

Re: [HACKERS] Default mode for shutdown

2011-03-10 Thread Bruce Momjian
Josh Kupershmidt wrote:
 On Wed, Dec 15, 2010 at 10:11 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  It occurs to me that we may need a new mode, which disconnects sessions
  that are not in a transaction (or as soon as they are) but leaves
  in-progress transactions alone; this could be the new default. ?Of
  course, this is much more difficult to implement than the current modes.
 
 I like this idea, if it's feasible. Might I also suggest that the
 smart-mode shutdown give a HINT to the user that he can forcibly kill
 off existing sessions using -m fast. Right now, we  show something
 like this:
 
 $ pg_ctl -D PGDATA stop
 waiting for server to shut down
 ... failed
 pg_ctl: server does not shut down
 
 And it's not immediately obvious to the user why the server didn't
 shut down, or how to fix things.

I have applied the attached patch to mention -m fast when a smart
shutdown or restart fails.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 2fab5c9..4b9fb84
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*** do_stop(void)
*** 865,870 
--- 865,873 
  			print_msg(_( failed\n));
  
  			write_stderr(_(%s: server does not shut down\n), progname);
+ 			if (shutdown_mode == SMART_MODE)
+ write_stderr(_(TIP: the \-m fast\ option immediately disconnects sessions rather than\n
+ 			waiting for session-initiated disconnection.\n));
  			exit(1);
  		}
  		print_msg(_( done\n));
*** do_restart(void)
*** 952,957 
--- 955,963 
  			print_msg(_( failed\n));
  
  			write_stderr(_(%s: server does not shut down\n), progname);
+ 			if (shutdown_mode == SMART_MODE)
+ write_stderr(_(TIP: the \-m fast\ option immediately disconnects sessions rather than\n
+ 			waiting for session-initiated disconnection.\n));
  			exit(1);
  		}
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump -X

2011-03-10 Thread Bruce Momjian
Robert Haas wrote:
 Back in 2006, we have this commit:
 
 commit 2b25e1169f44368c120931787628d51731b5cc8c
 Author: Peter Eisentraut pete...@gmx.net
 Date:   Sat Oct 7 20:59:05 2006 +
 
 The -X option in pg_dump was supposed to be a workaround for the lack of
 portable long options.  But we have had portable long options for a long
 time now, so this is obsolete.  Now people have added options which *only*
 work with -X but not as regular long option, so I'm putting a stop to 
 this:
 -X is deprecated; it still works, but it has been removed from the
 documentation, and please don't add more of them.
 
 Since then, two additional -X options have crept in, doubtless due to
 mimicry of the existing options without examination of the commit
 logs.  I think we should either (a) remove the -X option altogether or
 (b) change the comment so that it clearly states the same message that
 appears here in the commit log, namely, that no new -X options are to
 be created.  The existing comment says that -X is deprecated, but that
 doesn't make it entirely 100% clear that the code isn't intended to be
 further updated, at least judging by the results.

Code comment added with attached, applied patch.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
new file mode 100644
index 0884517..feeeae8
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*** main(int argc, char **argv)
*** 469,475 
  break;
  
  			case 'X':
! /* -X is a deprecated alternative to long options */
  if (strcmp(optarg, disable-dollar-quoting) == 0)
  	disable_dollar_quoting = 1;
  else if (strcmp(optarg, disable-triggers) == 0)
--- 469,478 
  break;
  
  			case 'X':
! /*
!  *	-X is a deprecated alternative to long options;
!  *	no new -X options are to be added.
!  */
  if (strcmp(optarg, disable-dollar-quoting) == 0)
  	disable_dollar_quoting = 1;
  else if (strcmp(optarg, disable-triggers) == 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump -X

2011-03-10 Thread Bruce Momjian
Robert Haas wrote:
 On Thu, Mar 10, 2011 at 10:36 PM, Bruce Momjian br...@momjian.us wrote:
  Robert Haas wrote:
  Back in 2006, we have this commit:
 
  commit 2b25e1169f44368c120931787628d51731b5cc8c
  Author: Peter Eisentraut pete...@gmx.net
  Date: ? Sat Oct 7 20:59:05 2006 +
 
  ? ? The -X option in pg_dump was supposed to be a workaround for the lack 
  of
  ? ? portable long options. ?But we have had portable long options for a 
  long
  ? ? time now, so this is obsolete. ?Now people have added options which 
  *only*
  ? ? work with -X but not as regular long option, so I'm putting a stop to 
  this:
  ? ? -X is deprecated; it still works, but it has been removed from the
  ? ? documentation, and please don't add more of them.
 
  Since then, two additional -X options have crept in, doubtless due to
  mimicry of the existing options without examination of the commit
  logs. ?I think we should either (a) remove the -X option altogether or
  (b) change the comment so that it clearly states the same message that
  appears here in the commit log, namely, that no new -X options are to
  be created. ?The existing comment says that -X is deprecated, but that
  doesn't make it entirely 100% clear that the code isn't intended to be
  further updated, at least judging by the results.
 
  Code comment added with attached, applied patch.
 
 At a minimum, we should probably also remove -X no-security-label and
 -X no-unlogged-table-data, which don't exist in any released versions
 (unless you want to count alphas).  But considering that this has been
 deprecated and undocumented since 8.2, I think it might be time to
 pull the plug on -X altogether.

I remove the new -X options with the attached, applied patch.  The
existing options are not really costing us anything except a few lines
of code.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
new file mode 100644
index feeeae8..546a04c
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*** main(int argc, char **argv)
*** 481,490 
  	outputNoTablespaces = 1;
  else if (strcmp(optarg, use-set-session-authorization) == 0)
  	use_setsessauth = 1;
- else if (strcmp(optarg, no-security-label) == 0)
- 	no_security_label = 1;
- else if (strcmp(optarg, no-unlogged-table-data) == 0)
- 	no_unlogged_table_data = 1;
  else
  {
  	fprintf(stderr,
--- 481,486 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-the-fly index tuple deletion vs. hot_standby

2011-03-10 Thread Heikki Linnakangas

On 10.12.2010 19:55, Noah Misch wrote:

On Thu, Dec 09, 2010 at 09:48:25AM +, Simon Riggs wrote:

On Fri, 2010-12-03 at 21:43 +0200, Heikki Linnakangas wrote:

Seems reasonable. HeapTupleHeaderAdvanceLatestRemovedXid() will need
similar treatment. Actually, btree_xlog_delete_get_latestRemovedXid()
could just call HeapTupleHeaderAdvanceLatestRemoveXid().


Yes, it applies to other cases also. Thanks for the suggestion.

Fix committed. Please double-check my work, committed early since I'm
about to jump on a plane.


Thanks for making that change.  For my understanding, why does the xmin == xmax
special case in HeapTupleHeaderAdvanceLatestRemoveXid not require !HEAP_UPDATED,
as the corresponding case in HeapTupleSatisfiesVacuum requires?  I can neither
think of a recipe for triggering a problem as the code stands, nor come up with
a sound explanation for why no such recipe can exist.


The difference is in the purpose of those two functions. 
HeapTupleSatisfiesVacuum decides if a tuple can be safely vacuumed away. 
For that purpose, you can't remove a tuple from the middle of an update 
chain, even if that tuple was never visible to any other transaction, 
because someone might still want to follow the update chain to find the 
latest version of the row. HeapTupleHeaderAdvanceLatestRemoveXid is used 
to decide if removing a tuple would conflict with in-progress hot 
standby queries. For that purpose, you don't need to care about breaking 
update chains, as Hot Standby is only used for read-only queries and 
read-only queries never follow update chains.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2011-03-10 Thread Heikki Linnakangas

On 11.03.2011 06:21, Fujii Masao wrote:

On Thu, Dec 9, 2010 at 4:55 PM, Simon Riggssi...@2ndquadrant.com  wrote:

On Thu, 2010-12-09 at 00:16 +0100, Heikki Linnakangas wrote:

On 09.12.2010 00:10, Heikki Linnakangas wrote:

On 08.12.2010 16:00, Simon Riggs wrote:

Heikki pointed out to me that the btree delete record processing does
not respect vacuum_defer_cleanup_age. It should.

Attached patch to implement that.

Looking to commit in next few hours barring objections/suggestions, to
both HEAD and 9_0_STABLE, in time for next minor release.


Please note that it was Noah Misch that raised this a while ago:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01919.php


On closer look, that's not actually the same issue, sorry for the noise..


Heikki, this one *is* important. Will fix. Thanks for the analysis Noah.


Is this an open item for 9.1?


Simon fixed it, commit b9075a6d2f9b07a00262a670dd60272904c79dce.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers