Re: [HACKERS] patch for new feature: Buffer Cache Hibernation

2011-05-09 Thread Mitsuru IWASAKI
Hi,
Sorry, I missed these messages because I didn't subscribe to this list.
# I've just subscribed temporary

  I think that all the complexity with CRCs etc. is unlikely to lead anywhere
  too, and those two issues are not completely unrelated.  The simplest,
  safest thing here is the right way to approach this, not the most
  complicated one, and a simpler format might add some flexibility here to
  reload more cache state too.  The bottleneck on reloading the cache state is
  reading everything from disk.  Trying to micro-optimize any other part of
  that is moving in the wrong direction to me.  I doubt you'll ever measure a
  useful benefit that overcomes the expense of maintaining the code.  And you
  seem to be moving to where someone can't restore cache state when they
  change shared_buffers.  A simpler implementation might still work in that
  situation; reload until you run out of buffers if shared_buffers shrinks,
  reload until you're done with the original size.
 
 Yeah, I'm pretty well convinced this whole approach is a dead end.
 Priming the OS buffer cache seems way more useful.  I also think
 saving the blocks to be read rather than the actual blocks makes a lot
 more sense.

OK, there are two your suggestions here IIUC.
# if not, please correct me.
1. restore buffer blocks based on buffer descriptors, not from the saved file.
2. support restoring cache state even if shared_buffers had changed.

For 1, I've just finish my work.  The latest patch is available at:
http://people.freebsd.org/~iwasaki/postgres/buffer-cache-hibernation-postgresql-20110507.patch
 

On my box, shared_buffers can be set up to only 200MB.
Elapsed time for starting up is almost the same, about 3 sec (w/o
hibernation takes about 1 sec).
For shutdown, writing buffer blocks takes about 10 sec, otherwise
about 1 sec.

Well, it seems you were right :)
By restoring buffer blocks based on buffer descriptors, the OS buffer
cache will be filled too.  This can help buffer updating performance
I believe.

I think saving buffer blocks is still useful for debugging or portability,
so I would like to remain the support code in my patch.


For 2, I'm not sure how to implement this.
The problem is that freelist.c:StrategyControl is also restored at
startup, but I have no idea currently how to adjust StrategyControl
when shared_buffer had changed.
StrategyControl has important data on buffer allocation, so this should be
matched with shared_buffer, I belive.

Changing shared_buffer is not so often on production environment.
Current implementation like this;
If shared_buffer had changed, restoring is aborted only on that time
and saving is executed with new shared_buffer at shutdown, restoring
is executed at startup on next time.

I have one more day for working on this, but I may give up...

Thanks

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: Smooth replication during VACUUM FULL

2011-05-09 Thread Simon Riggs
On Mon, May 2, 2011 at 6:15 PM, Gabriele Bartolini
gabriele.bartol...@2ndquadrant.it wrote:

 You can easily spot in the graphs the point where VACUUM FULL terminates,
 then it is just a matter of flushing the WAL delay for replication.

Agreed.

 Anyway, I hope I can give you more detailed information tomorrow. Thanks.

Did you find anything else of note, or is your patch ready to commit?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clog_redo causing very long recovery time

2011-05-09 Thread Simon Riggs
On Fri, May 6, 2011 at 4:22 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 The attached fix-clogredo diff is my proposal for a fix for this.

 That seems pretty grotty :-(

 I think a more elegant fix might be to just swap the order of the
 ExtendCLOG and ExtendSUBTRANS calls in GetNewTransactionId.  The
 reason that would help is that pg_subtrans isn't WAL-logged, so if
 we succeed doing ExtendSUBTRANS and then fail in ExtendCLOG, we
 won't have written any XLOG entry, and thus repeated failures will not
 result in repeated XLOG entries.  I seem to recall having considered
 exactly that point when the clog WAL support was first done, but the
 scenario evidently wasn't considered when subtransactions were stuck
 in :-(.

I agree with Tom about the need for a fix that prevents generation of
repeated WAL records.

OTOH, I also like Joe's fix in the recovery code to avoid responding
to repeated records.

Can we have both please?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process wakeups when idle and power consumption

2011-05-09 Thread Peter Geoghegan
I've taken a look into it, and I'm not optimistic about the likelihood
of the way I've suggested that we can register a callback on process
termination on windows being acceptable. It seems to be a kludge too
far. It does work on Vista, just not very well. There is a
considerable delay on closing the above console application that uses
this technique, for example, and there seems to be an unpredictable
delay in the callback occurring.

A simpler solution on Windows might be to make the timeout on
auxiliary processes much smaller, but have it increase on each
subsequent timeout (starting from scratch if we wakeup for any reason
other than timeout) until eventually it maxes out at something like
the current value for PGARCH_AUTOWAKE_INTERVAL. If backends are
sleeping for increasing periods of time, the chance of the postmaster
crashing goes down, so denial of service is much less of a concern.

An alternative might be to just not do this on Windows. Certainly,
idle wakeups are likely to be less important on that platform, which
is not a very popular choice for virtual machines deployed on cloudy
infrastructure, the use case that will benefit from these enhancements
the most, by some margin.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not write block xlog flush request 3FD/0 is not satisfied

2011-05-09 Thread Yves Weißig
All right, what would we be the best way to debug such a problem?

Yves

-Original Message-
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Greg Stark
Sent: Sunday, May 08, 2011 6:36 PM
To: weis...@rbg.informatik.tu-darmstadt.de
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] could not write block  xlog flush request 3FD/0 is
not satisfied

On Sun, May 8, 2011 at 4:18 PM, Yves Weißig
weis...@rbg.informatik.tu-darmstadt.de wrote:
 ERROR:  xlog flush request 3FD/0 is not satisfied --- flushed only to
 0/20E2DC4


That's a pretty big difference in log positions. It seems likely you've
overwritten the block header writing garbage to the LSN.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process wakeups when idle and power consumption

2011-05-09 Thread Heikki Linnakangas

On 09.05.2011 12:20, Peter Geoghegan wrote:

I've taken a look into it, and I'm not optimistic about the likelihood
of the way I've suggested that we can register a callback on process
termination on windows being acceptable. It seems to be a kludge too
far. It does work on Vista, just not very well. There is a
considerable delay on closing the above console application that uses
this technique, for example, and there seems to be an unpredictable
delay in the callback occurring.


Can't we use the pipe trick on Windows? The API is different, but we use 
pipes on Windows for other things already. When a process is launched, 
open a pipe between postmaster and the child process. In the child, 
spawn a thread that just calls ReadFile() on the pipe, which blocks. If 
postmaster dies, the ReadFile() call will return with an error.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: Smooth replication during VACUUM FULL

2011-05-09 Thread Gabriele Bartolini

Il 09/05/11 09:14, Simon Riggs ha scritto:

Anyway, I hope I can give you more detailed information tomorrow. Thanks.

Did you find anything else of note, or is your patch ready to commit?


Unfortunately I did not have much time to run further tests.

The ones I have done so far show that it mostly works (see attached 
graph), but there are some unresolved spikes that will require further 
work in 9.2.


Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

attachment: smooth-replication-vacuum-full-s50-600.png
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process wakeups when idle and power consumption

2011-05-09 Thread Peter Geoghegan
On 9 May 2011 11:19, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

 Can't we use the pipe trick on Windows? The API is different, but we use
 pipes on Windows for other things already. When a process is launched, open
 a pipe between postmaster and the child process. In the child, spawn a
 thread that just calls ReadFile() on the pipe, which blocks. If postmaster
 dies, the ReadFile() call will return with an error.

Alright. I'm currently working on a proof-of-concept implementation of
that. In the meantime, any thoughts on how this should meld with the
existing latch implementation?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] superusers are members of all roles?

2011-05-09 Thread Robert Haas
On Sat, May 7, 2011 at 11:42 PM, Bruce Momjian br...@momjian.us wrote:
 Is this a TODO?

I think so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make check in src/test/isolation is unworkable

2011-05-09 Thread Andrew Dunstan



On 05/08/2011 09:54 PM, Andrew Dunstan wrote:



On 05/08/2011 07:35 PM, Tom Lane wrote:

I believe that the make check target in src/test/isolation is
fundamentally unportable, as is illustrated by the fact that buildfarm
member coypu is currently choking on it.  The reason is that the
pg_isolation_regress program depends on libpq, and in particular it
depends on having an *installed* libpq.  Anyplace where it appears to
work, it's because you already installed Postgres, or at least libpq.


darn, you're right.




OK, I have crake running the installation checks: 
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crakedt=2011-05-09%2004%3A17%3A01stg=isolation-check, 
so I have checked in a hot fix for the buildfarm client: 
https://github.com/PGBuildFarm/client-code/commit/c3c20a6457f57efcdcecb83e9c8168791f33f699


What's a bit annoying is that these tests were checked in without a 
vestige of MSVC support, and nobody pinged the usual suspects (i.e. 
Magnus and me) to ask for help in providing it, unless my memory is even 
worse than usual. We have a bit of work to do to enable that, which I'll 
try to get done before pgcon.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [pgsql-advocacy] New Canadian nonprofit for trademark, postgresql.org domain, etc.

2011-05-09 Thread Roberto Mello
On Sat, May 7, 2011 at 2:01 PM, Peter Eisentraut pete...@gmx.net wrote:

 On fre, 2011-05-06 at 21:53 +0200, Cédric Villemain wrote:
 
  I think it might be better if the association don't need (or have )
  activity other than 'technical' and to set up another nonprofit
  association for real activity.

 If instead you limit yourselves to holding and maintaining the mentioned
 assets, the board meets once a year to approve last year's minutes, file
 the paperwork, and go home, you can't do much wrong.


I quite agree on this.

The NPO new is excellent, btw.

Roberto


Re: [HACKERS] clog_redo causing very long recovery time

2011-05-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 I agree with Tom about the need for a fix that prevents generation of
 repeated WAL records.

 OTOH, I also like Joe's fix in the recovery code to avoid responding
 to repeated records.

 Can we have both please?

Why?  The patch in the recovery code is seriously ugly, and it won't
do anything useful once we've fixed the other end.  Please notice also
that we'd need several instances of that kluge if we want to cover all
the SLRU-based cases.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make check in src/test/isolation is unworkable

2011-05-09 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 What's a bit annoying is that these tests were checked in without a 
 vestige of MSVC support, and nobody pinged the usual suspects (i.e. 
 Magnus and me) to ask for help in providing it,

Speaking of pinging Windows people, have either of you noticed the
reports that CREATE/ALTER USER VALID UNTIL 'infinity' is crashing on
Windows?

http://archives.postgresql.org/pgsql-bugs/2011-05/msg9.php
http://archives.postgresql.org/pgsql-bugs/2011-05/msg00030.php

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching

2011-05-09 Thread Robert Haas
On Sat, May 7, 2011 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On the flip side, the risk of it flat-out blowing up seems pretty
 small.  For someone to invent their own version of wchar_t that uses
 something other than Unicode code points would be pretty much pure
 masochism, wouldn't it?

 Well, no, that's not clear.  The C standard has pretty carefully avoided
 constraining the wchar_t representation, so implementors are free to do
 whatever is most convenient from the standpoint of their library routines.
 I could easily see somebody deciding to do something that wasn't quite
 Unicode because it let him re-use lookup tables designed for some other
 encoding, or some such.

 Now it's also perfectly possible, maybe even likely, that nobody's done
 that on any platform we care about.  But I don't believe we know that
 with any degree of certainty.  We definitely have not made any effort to
 establish whether it's true --- for example, we have no regression tests
 that address the point.  (I think that collate.linux.utf8 touches on it,
 but we're a long way from being able to run that on non-glibc
 platforms...)

Well, since any problems in this are are going to bite us eventually
in 9.0+ even without any further action on our part, maybe it would be
wise to think up something we could add to the regression tests.  That
would give us some immediate feedback from the buildfarm, and also
significantly improve the odds of someone compiling on a weird
platform noticing if things are broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf error checking strategy

2011-05-09 Thread Robert Haas
On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 On Wed, Apr 6, 2011 at 5:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 So I'm thinking we should adopt a strategy that's less likely to result
 in divergent behavior among different backends. ?The idea I have in mind
 is to have the first validation pass only check that each name is a
 legal GUC variable name, and not look at the values at all. ?If so, try
 to apply all the values. ?Any that fail to apply we log as usual, but
 still apply the others. ?ISTM that verifying the names should be enough
 protection against broken files for practical purposes, and it should be
 something that all backends will agree on even if there are individual
 values that are not valid for all.

 Comments?

 I don't think now is a good time for a major behavior change in this
 area, and I'm not convinced this is the best possible design.

 There are a number of parameters which are currently PGC_POSTMASTER
 rather than PGC_SIGHUP precisely because of the possibility of
 backends being out of step with each other.  wal_level is an obvious
 example, and one that it would be *really* nice to be able to change
 without a server restart.  It would be nice to have a real solution to
 that problem, but this isn't it, and I don't want to engineer it right
 now.

 Is this a TODO?

 Yes, definitely.  Perhaps summarize as rethink how we handle partially
 correct postgresql.conf files.  Or maybe Robert sees it as rethink
 approach to making sure all backends share the same value of critical
 settings?  Or maybe those are two different TODOs?

The second is what I had in mind.  I'm thinking that at least for
critical GUCs we need a different mechanism for making sure everything
stays in sync, like having the postmaster write a precompiled file and
convincing the backends to read it in some carefully synchronized
fashion.  However, it's not clear to me whether something along those
lines (or some other lines) would solve the problem you were
complaining about; therefore it's possible, as you say, that there are
two separate action items here.  Or maybe not: maybe someone can come
up with an approach that swats both problems in one go.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching

2011-05-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, since any problems in this are are going to bite us eventually
 in 9.0+ even without any further action on our part, maybe it would be
 wise to think up something we could add to the regression tests.  That
 would give us some immediate feedback from the buildfarm, and also
 significantly improve the odds of someone compiling on a weird
 platform noticing if things are broken.

No objection here, but how will we do that?  The regression tests are
designed to work in any locale/encoding, and would become significantly
less useful if they weren't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Robert Haas
On Sun, May 8, 2011 at 12:02 AM, Greg Smith g...@2ndquadrant.com wrote:
 Attached patch is a first cut at what moving one contrib module (in this
 case pg_buffercache) to a new directory structure might look like.  The idea
 is that src/extension could become a place for first-class extensions to
 live.  Those are ones community is committed to providing in core, but are
 just better implemented as extensions than in-database functions, for
 reasons that include security.  This idea has been shared by a lot of people
 for a while, only problem is that it wasn't really practical to implement
 cleanly until the extensions code hit.  I think it is now, this attempts to
 prove it.

 Since patches involving file renaming are clunky, the changes might be
 easier to see at
 https://github.com/greg2ndQuadrant/postgres/commit/507923e21e963c873a84f1b850d64e895776574f
 where I just pushed this change too.  The install step for the modules looks
 like this now:

 gsmith@grace:~/pgwork/src/move-contrib/src/extension/pg_buffercache$ make
 install
 /bin/mkdir -p '/home/gsmith/pgwork/inst/move-contrib/lib/postgresql'
 /bin/mkdir -p
 '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension'
 /bin/sh ../../../config/install-sh -c -m 755  pg_buffercache.so
 '/home/gsmith/pgwork/inst/move-contrib/lib/postgresql/pg_buffercache.so'
 /bin/sh ../../../config/install-sh -c -m 644 ./pg_buffercache.control
 '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension/'
 /bin/sh ../../../config/install-sh -c -m 644 ./pg_buffercache--1.0.sql
 ./pg_buffercache--unpackaged--1.0.sql
  '/home/gsmith/pgwork/inst/move-contrib/share/postgresql/extension/'
 $ psql -c create extension pg_buffercache
 CREATE EXTENSION

 The only clunky bit I wasn't really happy with is the amount of code
 duplication that comes from having a src/extension/Makefile that looks
 almost, but not quite, identical to contrib/Makefile.  The rest of the
 changes don't seem too bad to me, and even that's really only 36 lines that
 aren't touched often.  Yes, the paths are different, so backports won't
 happen without an extra step.  But the code changes required were easier
 than I was expecting, due to the general good modularity of the extensions
 infrastructure.  So long as the result ends up in
 share/postgresql/extension/ , whether they started in contrib/module or
 src/extension/module doesn't really matter to CREATE EXTENSION.  But
 having them broke out this way makes it easy for the default Makefile to
 build and install them all.  (I recognize I didn't do that last step yet
 though)

 I'll happily go covert pgstattuple and the rest of the internal diagnostics
 modules to this scheme, and do the doc cleanups, this upcoming week if it
 means I'll be able to use those things without installing all of contrib one
 day.  Ditto for proposing RPM and Debian packaging changes that match them.
  All that work will get paid back the first time I don't have to fill out a
 bunch of paperwork (again) at a customer site justifying why they need to
 install the contrib [RPM|deb] package (which has some scary stuff in it) on
 all their servers, just so I can get some bloat or buffer inspection module.

I would really like to see us try to group things by topic, and not
just by whether or not we can all agree that the extension is
important enough to be first-class (which is bound to be a bit
tendentious).  We probably can't completely avoid some bikeshedding on
that topic, but even there it strikes me that sorting by topic might
make things a bit more clear.  For example, if we could somehow group
together all the diagnostic tools, maybe something like the list
below, I think that would be a start.  Now then we might go on to
argue about which are the more useful diagnostic tools, but I think
it's easier to argue about that category than it is to argue in the
abstract about whether you'd rather have hstore or pgstattuple, to
which the answer can only be that depends.

auto_explain
oid2name
pageinspect
pg_buffercache
pg_freespacemap
pg_stat_statements
pg_test_fsync (perhaps)
pgrowlocks
pgstattuple

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 10:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, since any problems in this are are going to bite us eventually
 in 9.0+ even without any further action on our part, maybe it would be
 wise to think up something we could add to the regression tests.  That
 would give us some immediate feedback from the buildfarm, and also
 significantly improve the odds of someone compiling on a weird
 platform noticing if things are broken.

 No objection here, but how will we do that?  The regression tests are
 designed to work in any locale/encoding, and would become significantly
 less useful if they weren't.

I'm just shooting from the hip here, but maybe we could have a
separate (probably smaller) set of tests that are only designed to
work in a limited range of locales and/or encodings.  I'm really
pleased that we now have the src/test/isolation stuff, and I think
some more auxilliary test suites would be quite excellent.  Even if
people didn't always want to run every single one when doing things
manually, the buildfarm certainly could.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Bruce Momjian
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  ... Maybe someone out there is under the impression
  that I get high off of rejecting patches; but the statistics you cite
  from the CF app don't exactly support the contention that I'm going
  around looking for reasons to reject things, or if I am, I'm doing a
  pretty terrible job finding them.
 
 Hm ... there are people out there who think *I* get high off rejecting
 patches.  I have a t-shirt to prove it.  But I seem to be pretty
 ineffective at it too, judging from these numbers.

Late reply, but almost all the things Tom rejects I would have rejected
too.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 10:58 AM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  ... Maybe someone out there is under the impression
  that I get high off of rejecting patches; but the statistics you cite
  from the CF app don't exactly support the contention that I'm going
  around looking for reasons to reject things, or if I am, I'm doing a
  pretty terrible job finding them.

 Hm ... there are people out there who think *I* get high off rejecting
 patches.  I have a t-shirt to prove it.  But I seem to be pretty
 ineffective at it too, judging from these numbers.

 Late reply, but almost all the things Tom rejects I would have rejected
 too.

Well, I think I've been guilty more than once of leaning on Tom to try
to get him to accept patches that he might've been inclined to reject.
 I think that my standards for code quality are similar to Tom's
(though sometimes I let through things he would have caught, woops)
but I think I am more inclined to commit feature changes that he might
not find entirely worthwhile.  Like Tom, I'm reasonably wary of random
knickknacks that are extremely special-purpose or will slow down
common cases, but on the average I think I'm slightly more
new-feature-positive than he is.  Not without some exceptions, of
course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf error checking strategy

2011-05-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yes, definitely.  Perhaps summarize as rethink how we handle partially
 correct postgresql.conf files.  Or maybe Robert sees it as rethink
 approach to making sure all backends share the same value of critical
 settings?  Or maybe those are two different TODOs?

 The second is what I had in mind.  I'm thinking that at least for
 critical GUCs we need a different mechanism for making sure everything
 stays in sync, like having the postmaster write a precompiled file and
 convincing the backends to read it in some carefully synchronized
 fashion.  However, it's not clear to me whether something along those
 lines (or some other lines) would solve the problem you were
 complaining about; therefore it's possible, as you say, that there are
 two separate action items here.  Or maybe not: maybe someone can come
 up with an approach that swats both problems in one go.

Well, the thing that was annoying me was that because a backend saw one
value in postgresql.conf as incorrect, it was refusing to apply any
changes at all from postgresql.conf.  And worse, there was no log entry
to give any hint what was going on.  This doesn't seem to me to have
much to do with the problem you're on about.  I agree it's conceivable
that someone might think of a way to solve both issues at once, but
I think we'd better list them as separate TODOs.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Bruce Momjian
Greg Smith wrote:
 On 04/21/2011 12:39 PM, Robert Haas wrote:
  In fact, I've been wondering if we shouldn't consider extending the
  support window for 8.2 past the currently-planned December 2011.
  There seem to be quite a lot of people running that release precisely
  because the casting changes in 8.3 were so painful, and I think the
  incremental effort on our part to extend support for another year
  would be reasonably small.
 
 The pending EOL for 8.2 is the only thing that keeps me sane when 
 speaking with people who refuse to upgrade, yet complain that their 8.2 
 install is slow.  This last month, that seems to be more than usual why 
 does autovacuum suck so much? complaints that would all go away with an 
 8.3 upgrade.  Extending the EOL is not doing any of these users a 
 favor.  Every day that goes by when someone is on a version of 
 PostgreSQL that won't ever allow in-place upgrade is just making worse 
 the eventual dump and reload they face worse.  The time spent porting to 
 8.3 is a one-time thing; the suffering you get trying to have a 2011 
 sized database on 2006's 8.2 just keeps adding up the longer you 
 postpone it.

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument I can't dump/reload my data would be gone.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching

2011-05-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, May 9, 2011 at 10:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 No objection here, but how will we do that?  The regression tests are
 designed to work in any locale/encoding, and would become significantly
 less useful if they weren't.

 I'm just shooting from the hip here, but maybe we could have a
 separate (probably smaller) set of tests that are only designed to
 work in a limited range of locales and/or encodings.  I'm really
 pleased that we now have the src/test/isolation stuff, and I think
 some more auxilliary test suites would be quite excellent.  Even if
 people didn't always want to run every single one when doing things
 manually, the buildfarm certainly could.

Hmm.  We don't need new infrastructure like the isolation tests do,
so another subdirectory seems like overkill.  I am thinking about a new
target installcheck-collations in src/test/regress/GNUmakefile that
creates a UTF8-encoding database and runs a different test schedule than
the regular tests.

The problem we'd have is that there's no way (at present) to make such
a test pass on every platform.  Windows has its own set of locale names
(which initdb fails to install as collations anyway) and we also have
the problem that OS X can be counted on to get UTF8 sorting wrong.
(It might be okay for case-folding though; not sure.)  Possibly we could
just provide an alternate expected file for OS X, but I don't see a
decent workaround for Windows --- it would pretty much have to have its
very own test case.

Andrew, what kinds of options have we got for persuading the buildfarm
to run such tests only on a subset of platforms?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] switch UNLOGGED to LOGGED

2011-05-09 Thread Alvaro Herrera
Excerpts from Robert Haas's message of vie may 06 23:25:09 -0300 2011:
 On Fri, Apr 22, 2011 at 4:13 AM, Leonardo Francalanci m_li...@yahoo.it 
 wrote:
  Maybe you should change  xl_act_commit to have a separate list of rels to
  drop the init fork for  (instead of mixing those with the list of files to
  drop as a  whole).
 
  I tried to follow your suggestion, thank you very much.
 
 I have to admit I don't like this approach very much.  I can't see
 adding 4 bytes to every commit record for this feature.

Hmm, yeah.  Maybe we can add a flags int8 somewhere in that struct and
set a bit in it if nrels, nsubxacts, nmsgs and respective arrays are present.
That would save some int's that are already in there.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Joshua Berkus
All,

 I agree that we should not reduce the support window. The fact that we
 can do in place upgrades of the data only addresses one pain point in
 upgrading. Large legacy apps require large retesting efforts when
 upgrading, often followed by lots more work renovating the code for
 backwards incompatibilities.

Definitely.  Heck, I can't get half our clients to apply *update* releases 
because they have a required QA process which takes a month.  And a lot of 
companies are just now deploying the 8.4 versions of their products. 

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Greg Smith

On 05/09/2011 12:06 PM, Andrew Dunstan wrote:
The fact that we can do in place upgrades of the data only addresses 
one pain point in upgrading. Large legacy apps require large retesting 
efforts when upgrading, often followed by lots more work renovating 
the code for backwards incompatibilities. This can be a huge cost for 
what the suits see as little apparent gain, and making them do it more 
frequently in order to stay current will not win us any friends.


I just had a why a new install on 8.3? conversation today, and it was 
all about the application developer not wanting to do QA all over again 
for a later release.


Right now, one of the major drivers for why upgrade? has been the 
performance improvements in 8.3, relative to any older version.  The 
main things pushing happy 8.3 sites to 8.4 or 9.0 that I see are either 
VACUUM issues (improved with partial vacuum in 8.4) or wanting real-time 
replication (9.0).  I predict many sites that don't want either are 
likely to sit on 8.3 for a really long time.  The community won't be 
able to offer a compelling reason why smaller sites in particular should 
go through the QA an upgrade requires.  The fact that the app QA time is 
now the main driver--not the dump and reload time--is good, because it 
makes it does make it easier for the people with the biggest data sets 
to move.  They're the ones that need the newer versions the most anyway, 
and in that regard having in-place upgrade start showing up as of 8.3 
was really just in time.


I think 8.3 is going to be one of those releases like 7.4, where people 
just keep running it forever.  At least shortening the upgrade path has 
made that concern a little bit better.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backpatching of Teach the regular expression functions to do case-insensitive matching

2011-05-09 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I am thinking about a new target installcheck-collations in
 src/test/regress/GNUmakefile that creates a UTF8-encoding database
 and runs a different test schedule than the regular tests.
 
I don't know the best way to do this (or how many people agree we
should), but I found that running the standard `make installcheck`
against a database where the postgresql.conf file set
default_transaction_isolation = serializable helped provide some
baseline testing of SSI.  None of the results change, but it helps
protect against certain classes of dumb error.  (I know this from
personal experience.)  I know *I'd* feel better if at least a few
buildfarm animals were set up to do this.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Josh Berkus
Greg,

 [There were complaints upthread about things like how Aster's patch 
 submissions were treated.  Those were WIP patches that half implemented 
 some useful ideas. 

There are two reasons why I think we failed with the Aster patches:

1) I passed Aster along to Bruce, who said he would review the patches
and give them a private response on them before they put them on
-hackers (which response would be these aren't nearly ready) Bruce
punted on this instead, passing their submissions straight through to
-hackers without review.

2) Our process for reviewing and approving patches, and what criteria
such patches are required to meet, is *very* opaque to a first-time
submitter (as in no documentation the submitter knows about), and does
not become clearer as they go through the process.  Aster, for example,
was completely unable to tell the difference between hackers who were
giving them legit feedback, and random list members who were
bikeshedding.  As a result, they were never able to derive a concrete
list of these are the things we need to fix to make the patch
acceptable, and gave up.

While the first was specific to the Aster submissions, I've seen the
second problem with lots of first-time submissions to this list.  Our
feedback to submitters of big patches requires a lot of comprehension of
project personalities and politics to make any sense of.  As I don't
think we can change this, I think the best answer is to tell people
Don't submit a big patch to PostgreSQL until you've done a few small
patches first.  You'll regret it.

 That goes double for some of the people complaining in this thread about 
 dissatisfaction with the current process.

The problem is not the process itself, but that there is little
documentation of that process, and that much of that documentation does
not match the defacto process.  Obviously, the onus is on me as much as
anyone else to fix this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fsync reliability

2011-05-09 Thread Bruce Momjian

FYI, does wal.c need updated comments to explain the file system
semantics we expect, and how our code triggers it?

---

Greg Smith wrote:
 On 04/23/2011 09:58 AM, Matthew Woodcraft wrote:
  As far as I can make out, the current situation is that this fix (the
  auto_da_alloc mount option) doesn't work as advertised, and the ext4
  maintainers are not treating this as a bug.
 
  See https://bugzilla.kernel.org/show_bug.cgi?id=15910
 
 
 I agree with the resolution that this isn't a bug.  As pointed out 
 there, XFS does the same thing, and this behavior isn't going away any 
 time soon.  Leaving behind zero-length files in situations where 
 developers tried to optimize away a necessary fsync happens.
 
 Here's the part where the submitter goes wrong:
 
 We first added a fsync() call for each extracted file. But scattered 
 fsyncs resulted in a massive performance degradation during package 
 installation (factor 10 or more, some reported that it took over an hour 
 to unpack a linux-headers-* package!) In order to reduce the I/O 
 performance degradation, fsync calls were deferred...
 
 Stop right there; the slow path was the only one that had any hope of 
 being correct.  It can actually slow things by a factor of 100X or more, 
 worst-case.  So, we currently have the choice between filesystem 
 corruption or major performance loss:  yes, you do.  Writing files is 
 tricky and it can either be slow or safe.  If you're going to avoid even 
 trying to enforce the right thing here, you're really going to get 
 really burned.
 
 It's unfortunate that so many people are used to the speed you get in 
 the common situation for a while now with ext3 and cheap hard drives:  
 all writes are cached unsafely, but the filesystem resists a few bad 
 behaviors.  Much of the struggle where people say this is so much 
 slower, I won't put up with it and try to code around it is futile, and 
 it's hard to separate out the attempts to find such optimizations from 
 the legitimate complaints.
 
 Anyway, you're right to point out that the filesystem is not necessarily 
 going to save anyone from some of the tricky rename situations even with 
 the improvements made to delayed allocation.  They've fixed some of the 
 worst behavior of the earlier implementation, but there are still 
 potential issues in that area it seems.
 
 -- 
 Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
 PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
 
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 1:53 PM, Josh Berkus j...@agliodbs.com wrote:
 While the first was specific to the Aster submissions, I've seen the
 second problem with lots of first-time submissions to this list.  Our
 feedback to submitters of big patches requires a lot of comprehension of
 project personalities and politics to make any sense of.

Ah ha!  Now we're getting somewhere.  As was doubtless obvious from my
previous responses, I don't agree that the process is as broken as I
felt you were suggesting, and I think we've made a lot of
improvements.  However, I am in complete agreement with you on this
point.  Unfortunately, people often come into our community with
incorrect assumptions about how it works, including:

- someone's in charge
- there's one right answer
- it's our job to fix your problem

Now if you read a few hundred emails (which is not that much calendar
time, if you read them all) it's not too hard to figure out what the
real dynamic is, and I think that real dynamic is increasingly
positive (with some unfortunate exceptions).  But if the first thing
you do is post (no doubt about some large or controversial change),
yeah, serious culture shock.

 That goes double for some of the people complaining in this thread about
 dissatisfaction with the current process.

 The problem is not the process itself, but that there is little
 documentation of that process, and that much of that documentation does
 not match the defacto process.  Obviously, the onus is on me as much as
 anyone else to fix this.

I can't disagree with this, either.  I'm not sure where it would be
possible for us to document this that people would actually see and
read, and I think it's a tough to understand just from reading a wiki
page or a blog post: if you've never been part of a community that
operates this way, then it's kind of strange and it takes a while to
adjust.  Of course from the inside it seems to make a fair amount of
sense, but what good is that?  Anyhow, whatever we can do to help
people get into the swing of things I'm highly in favor of.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 1:14 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 05/09/2011 10:53 AM, Robert Haas wrote:

 I would really like to see us try to group things by topic, and not
 just by whether or not we can all agree that the extension is
 important enough to be first-class (which is bound to be a bit
 tendentious).

 Having played around with the prototype, I think it doesn't actually matter
 if there's a further division below the new one I introduced.  The main
 thing I think is worth pointing out is that I only feel extensions with no
 external dependencies are worth the trouble of re-classifying here.  If it
 were worth reorganizing contrib just for the sake of categorizing it, that
 would have been done years ago.  The new thing is that extensions make it
 really easy to make some tools available in the server's extensions
 subdirectly, without actually activating them in the default install.

 Looking at your list:

 auto_explain
 oid2name
 pageinspect
 pg_buffercache
 pg_freespacemap
 pg_stat_statements
 pg_test_fsync (perhaps)
 pgrowlocks
 pgstattuple


 oid2name and pg_test_fsync would be out because those are real executables.
  I'd rather not introduce the risk/complexity of playing around with moving
 standalone utilities of such marginal value.  Whereas I think it sets an
 excellent precedent if the server is shipping with some standard add-ons,
 built using the same extension mechanism available to external code, in the
 core server package.  I'd certainly be happy to add auto_explain and
 pg_stat_statements (also extremely popular things to install for me) to that
 list.

I'm happy enough with that set of guidelines: namely, that we'd use
src/extension only for things that don't require additional
dependencies, and not for things that build standalone executables.
If we're going to move things around, I think we should take the
trouble to categorize them along the way, and your idea of inserting
one more subdirectory under src/extension for grouping seems fine to
me.

I don't think we should be too obstinate about trying to twist the arm
of packagers who (as Tom points out) will do whatever they want in
spite of us, but the current state of contrib, with all sorts of
things of varying type, complexity, and value mixed together cannot
possibly be a good thing.  Even if the effect of all this is that some
distributions end up with postgresql-server-instrumentation and
postgresql-server-datatypes packages, rather than putting everything
in postgresql-server, I still think that'd be better than having a
monolithic lump called postgresql-contrib.  Heaven only knows what
could be in there (says the sys admin)...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Josh Berkus
Robert,

 I can't disagree with this, either.  I'm not sure where it would be
 possible for us to document this that people would actually see and
 read, and I think it's a tough to understand just from reading a wiki
 page or a blog post:

Still, if we had a wiki page which was a really comprehensive guide to
submitting patches, then we could send people a link after they submit
their first patch.   As well as having it in the header for the
commitfest page.

While it wouldn't do everything, it would help.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures

2011-05-09 Thread Bruce Momjian
Josh Berkus wrote:
 Peter,
 
  I would like to collect some specs on this feature.  So does anyone have
  links to documentation of existing implementations, or their own spec
  writeup?  A lot of people appear to have a very clear idea of this
  concept in their own head, so let's start collecting those.
 
 Delta between SPs and Functions for PostgreSQL:
 
 * SPs are executed using CALL or EXECUTE, and not SELECT.
 
 * SPs do not return a value
 ** optional: SPs *may* have OUT parameters.

[ Late reply.]

What is it about stored procedures that would require it not to return a
value or use CALL?  I am trying to understand what part of this is
procedures (doesn't return a values, we decided there isn't much value
for that syntax vs. functions), and anonymous transactions.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
 Greg Smith wrote:
 On 04/21/2011 12:39 PM, Robert Haas wrote:
  In fact, I've been wondering if we shouldn't consider extending the
  support window for 8.2 past the currently-planned December 2011.
  There seem to be quite a lot of people running that release precisely
  because the casting changes in 8.3 were so painful, and I think the
  incremental effort on our part to extend support for another year
  would be reasonably small.

 The pending EOL for 8.2 is the only thing that keeps me sane when
 speaking with people who refuse to upgrade, yet complain that their 8.2
 install is slow.  This last month, that seems to be more than usual why
 does autovacuum suck so much? complaints that would all go away with an
 8.3 upgrade.  Extending the EOL is not doing any of these users a
 favor.  Every day that goes by when someone is on a version of
 PostgreSQL that won't ever allow in-place upgrade is just making worse
 the eventual dump and reload they face worse.  The time spent porting to
 8.3 is a one-time thing; the suffering you get trying to have a 2011
 sized database on 2006's 8.2 just keeps adding up the longer you
 postpone it.

 Interesting.  You could argue that once 8.3 is our earliest supported
 release that we could even shrink the support window because the
 argument I can't dump/reload my data would be gone.

Personally, I think the support window is on the borderline of being
too short already.  There are several Linux distributions out there
that offer 5-year support for certain releases.  Even assuming they
incorporate the latest version of PostgreSQL at the time they wrap the
final release, it'll already be some months since we released that
version, and that means we'll stop supporting that version of
PostgreSQL before they stop supporting that release.  I regularly have
systems that run for 3 or 4 years without needing to be reinstalled,
and they're not necessarily running the bleeding-edge version of
PostgreSQL when first installed.  So they, too, are on the trailing
edge of our support.  As much as I believe that 9.0 (and, now, 9.1)
are the future and people should move to them, we can't enforce that.
EOL doesn't necessarily drive people to move.  If they're just running
yum update they're going to get 8.whatever.latest, and that's out of
support and missing relevant bug fixes, then it is.  I haven't run
into much 8.1 recently, but it seems there is still a decent chunk of
8.2 out there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.conf error checking strategy

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 11:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, May 8, 2011 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yes, definitely.  Perhaps summarize as rethink how we handle partially
 correct postgresql.conf files.  Or maybe Robert sees it as rethink
 approach to making sure all backends share the same value of critical
 settings?  Or maybe those are two different TODOs?

 The second is what I had in mind.  I'm thinking that at least for
 critical GUCs we need a different mechanism for making sure everything
 stays in sync, like having the postmaster write a precompiled file and
 convincing the backends to read it in some carefully synchronized
 fashion.  However, it's not clear to me whether something along those
 lines (or some other lines) would solve the problem you were
 complaining about; therefore it's possible, as you say, that there are
 two separate action items here.  Or maybe not: maybe someone can come
 up with an approach that swats both problems in one go.

 Well, the thing that was annoying me was that because a backend saw one
 value in postgresql.conf as incorrect, it was refusing to apply any
 changes at all from postgresql.conf.  And worse, there was no log entry
 to give any hint what was going on.  This doesn't seem to me to have
 much to do with the problem you're on about.  I agree it's conceivable
 that someone might think of a way to solve both issues at once, but
 I think we'd better list them as separate TODOs.

OK by me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Bruce Momjian
Robert Haas wrote:
  Interesting. ?You could argue that once 8.3 is our earliest supported
  release that we could even shrink the support window because the
  argument I can't dump/reload my data would be gone.
 
 Personally, I think the support window is on the borderline of being
 too short already.  There are several Linux distributions out there
 that offer 5-year support for certain releases.  Even assuming they
 incorporate the latest version of PostgreSQL at the time they wrap the
 final release, it'll already be some months since we released that
 version, and that means we'll stop supporting that version of
 PostgreSQL before they stop supporting that release.  I regularly have
 systems that run for 3 or 4 years without needing to be reinstalled,
 and they're not necessarily running the bleeding-edge version of
 PostgreSQL when first installed.  So they, too, are on the trailing
 edge of our support.  As much as I believe that 9.0 (and, now, 9.1)
 are the future and people should move to them, we can't enforce that.
 EOL doesn't necessarily drive people to move.  If they're just running
 yum update they're going to get 8.whatever.latest, and that's out of
 support and missing relevant bug fixes, then it is.  I haven't run
 into much 8.1 recently, but it seems there is still a decent chunk of
 8.2 out there.

I agree we don't want to shorten the window --- I was just pointing out
that we have more upgrade options than in the past.  One big push for
shortening was the Win32 issues on 8.0 and perhaps 8.1 that were
unfixable, which helped push retiring, at least on that platforms, and
once you retire on one platform, there is momentum to retire all
platforms for that release.

With Win32 stable on 8.2, we could say we don't need to shorten the
window as much, but pg_upgrade would allow us to keep it the same as now
because upgrades are potentially easier.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Bruce Momjian
Greg Smith wrote:
 [There were complaints upthread about things like how Aster's patch 
 submissions were treated.  Those were WIP patches that half implemented 
 some useful ideas.  But they were presented as completed features, and 
 they seemed to expect the community would pick those up and commit in 
 that not quite right state without extended additional work on their 
 side.  Not doing that sort of thing is part of the reason the PostgreSQL 
 code isn't filled with nothing but the fastest hack to get any given job 
 done.  Anyone who thinks I'm misrepresenting that view of history should 
 revisit the lengthy feedback provided to them at 
 https://commitfest.postgresql.org/action/patch_view?id=173 and 
 https://commitfest.postgresql.org/action/patch_view?id=205 -- it 
 actually goes back even further than that because the first versions of 
 these patches were even less suitable for commit.]

[ Again, sorry for my late reply.]

Greg hits a big item above --- it takes 3-4x more work to get a patch to
merge cleanly into our code (look like it was always there) than to
write the initial patch.  If the author isn't willing to do that 3-4x
work, it is not something the community is going to do on a regular
basis, so it is not surprising the patches are dropped.  This is very
often true of academicly-developed patches too.  (I know I rewrite my
patches 4-5 times, and some feel even that is not enough interations for
me.  ;-) )

 That goes double for some of the people complaining in this thread about 
 dissatisfaction with the current process.  If you're not helping review 
 patches already, you're not participating in the thing that needs the 
 most help.  This is not a problem you make better with fuzzy management 
 directives to be nicer to people.  There are real software engineering 
 issues about how to ensure good code quality at its core.

I agree on this one too.  It is good for people outside the patch review
group to make suggestions (external review is good), but when those
external people can't give clear examples of problems, it is impossible
for the patch review group to react or improve, and the complaints do
more harm than good.  The complaints did spark discussion to reevaluate
our development process, so something good did come out of it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 2:41 PM, Josh Berkus j...@agliodbs.com wrote:
 Robert,

 I can't disagree with this, either.  I'm not sure where it would be
 possible for us to document this that people would actually see and
 read, and I think it's a tough to understand just from reading a wiki
 page or a blog post:

 Still, if we had a wiki page which was a really comprehensive guide to
 submitting patches, then we could send people a link after they submit
 their first patch.   As well as having it in the header for the
 commitfest page.

 While it wouldn't do everything, it would help.

I'm all in favor.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Collation mega-cleanups

2011-05-09 Thread Bruce Momjian

Tom this collation stuff has seen more post-feature-commit cleanups than
I think any patch I remember.  Is there anything we can learn from this?

Yes, this is coming from me, who some consider to be the king of
post-commit cleanups, namely, cleaning up my own commits.

---

Tom Lane wrote:
 I just noticed that the collation patch has modified char2wchar and
 wchar2char to accept a collation OID as argument ... but it hasn't done
 anything to make those arguments actually work.  Since those functions
 depend on wcstombs and mbstowcs, which respond to LC_CTYPE and nothing
 else, this flat out does not work in non-default collations.  What's
 more, there doesn't seem to be any such thing as wcstombs_l or
 mbstowcs_l (at least my Fedora box hasn't got them), so this can't be
 fixed within the available glibc API.
 
 Right at the moment this only affects str_tolower, str_toupper, and
 str_initcap; there are other uses of these functions in the text search
 code, but those always pass DEFAULT_COLLATION_OID.
 
 It's possible that things are not too broken in practice, because it's
 likely that the transformations done by these functions only depend on
 the encoding indicated by LC_CTYPE, and we (try to) enforce that all
 locales used in a given database match the database encoding.  Still,
 that's a rather shaky chain of reasoning.
 
 The complete lack of code comments on this doesn't make me any happier
 --- in fact, the comments for char2wchar and wchar2char still claim that
 they have the same API as wcstombs and mbstowcs, which can hardly be
 considered true when they don't even have the same argument lists.
 
 Any thoughts what to do about this?
 
   regards, tom lane
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
 Interesting.  You could argue that once 8.3 is our earliest supported
 release that we could even shrink the support window because the
 argument I can't dump/reload my data would be gone.

 Personally, I think the support window is on the borderline of being
 too short already.  There are several Linux distributions out there
 that offer 5-year support for certain releases.

Keep in mind that at least some contributors are paid to do exactly that
long-term support (and if you've not heard, Red Hat is up to seven years
support on RHEL ...).  So the work is going to get done, and if it
doesn't get committed to the community SCM, I'm not sure that really
helps anybody.

Although whether we do formal releases is a different question.  Maybe
it would be sensible to continue patching an old branch but not bother
wrapping up release tarballs?  But the incremental work to do one more
set of release notes and one more tarball build is not that large.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] switch UNLOGGED to LOGGED

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 12:51 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of vie may 06 23:25:09 -0300 2011:
 On Fri, Apr 22, 2011 at 4:13 AM, Leonardo Francalanci m_li...@yahoo.it 
 wrote:
  Maybe you should change  xl_act_commit to have a separate list of rels to
  drop the init fork for  (instead of mixing those with the list of files to
  drop as a  whole).
 
  I tried to follow your suggestion, thank you very much.

 I have to admit I don't like this approach very much.  I can't see
 adding 4 bytes to every commit record for this feature.

 Hmm, yeah.  Maybe we can add a flags int8 somewhere in that struct and
 set a bit in it if nrels, nsubxacts, nmsgs and respective arrays are present.
 That would save some int's that are already in there.

Yes, that seems like a very appealing approach.  There is plenty of
bit-space available in xinfo, and we could reserve a bit each for
nrels, nsubxacts, and nmsgs, with set meaning that an integer count of
that item is present and clear meaning that the count is omitted from
the structure (and zero).  This will probably require a bit of tricky
code reorganization so I think it should be done separately from the
main patch.  With that done, then it's not a big deal for the main
patch to add in one more array that will normally get omitted.  And in
the process, we can save 12 bytes on every commit record in the common
case, which is quite appealing: I don't expect a huge performance
gain, but a penny saved is a penny earned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Inconsistent treatment of serials in pg_dump

2011-05-09 Thread Josh Berkus
All,

Just encountered this:

create table josh ( id serial not null, desc text );

pg_dump -Fc -T josh -f no_josh_dump postgres

pg_dump -Fc -t josh -f josh_dump postgres

pg_restore -d new no_josh_dump
pg_restore -d new josh_dump

pg_restore: [archiver (db)] Error from TOC entry 2645; 1259 49910
SEQUENCE josh_id_seq postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  relation
josh_id_seq already exists

It seems that if I exclude a table using -T, its dependant sequences do
not get excluded.  But if I include it using -t, its dependent sequences
*do* get included.

Is there a reason this is a good idea, or is it just an oversight?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Bruce Momjian
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
  Interesting. ?You could argue that once 8.3 is our earliest supported
  release that we could even shrink the support window because the
  argument I can't dump/reload my data would be gone.
 
  Personally, I think the support window is on the borderline of being
  too short already.  There are several Linux distributions out there
  that offer 5-year support for certain releases.
 
 Keep in mind that at least some contributors are paid to do exactly that
 long-term support (and if you've not heard, Red Hat is up to seven years
 support on RHEL ...).  So the work is going to get done, and if it
 doesn't get committed to the community SCM, I'm not sure that really
 helps anybody.
 
 Although whether we do formal releases is a different question.  Maybe
 it would be sensible to continue patching an old branch but not bother
 wrapping up release tarballs?  But the incremental work to do one more
 set of release notes and one more tarball build is not that large.

I think the big reason we trimmed the support window was to push people
off of old releases, not to lighten our workload.  Until we stated that
a release was not supported, we didn't give administrators ammunition to
force upgrades within their organizations.

Yeah, that is a lousy reason, but it was the stated case when we shrunk
to five years.  You can argue that our more recent releases are not as
stop using them bad as previous ones, but Greg Smith's statement about
autovacuum badness reinforces that goal.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Andrew Dunstan



On 05/09/2011 11:43 AM, Robert Haas wrote:

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument I can't dump/reload my data would be gone.

Personally, I think the support window is on the borderline of being
too short already.  There are several Linux distributions out there
that offer 5-year support for certain releases.


Some (RH?) offer significantly longer periods.

I agree that we should not reduce the support window. The fact that we 
can do in place upgrades of the data only addresses one pain point in 
upgrading. Large legacy apps require large retesting efforts when 
upgrading, often followed by lots more work renovating the code for 
backwards incompatibilities. This can be a huge cost for what the suits 
see as little apparent gain, and making them do it more frequently in 
order to stay current will not win us any friends. I often want to wait 
a while after a release for certain customers, while it beds down, and 
to get them to start moving towards upgrading well before it's the last 
minute. That makes an effective life of four years or less per release 
as things are now. That's plenty short enough.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] hint bit cache v5

2011-05-09 Thread Merlin Moncure
Attached is an updated version of the 'hint bit cache'.

What's changed:
*) 'bucket' was changed to 'page' everywhere
*) rollup array is now gets added during 'set', not the 'get' (pretty
dumb the way it was before -- wasn't really dealing with non-commit
bits yet)
*) more source comments, including a description of the cache in the intro
*) now caching 'invalid' bits.

I went back and forth several times whether to store invalid bits in
the same cache, a separate cache, or not at all.  I finally settled
upon storing them in the same cache which has some pros and cons.  It
makes it more or less exactly like the clog cache (so I could
copy/pasto some code out from there), but adds some overhead because 2
bit addressing is more expensive than 1 bit addressing -- this is
showing up in profiling...i'm upping the estimate of cpu bound scan
overhead from 1% to 2%.   Still fairly cheap, but i'm running into the
edge of where I can claim the cache is 'free' for most workloads --
any claim is worthless without real world testing though.  Of course,
if tuple hint bits are set or PD_ALL_VISIBLE is set, you don't have to
pay that price.

What's not:
*) Haven't touched any satisfies routine besides
HeapTupleSatisfiesMVCC (should they be?)
*) Haven't pushed the cache data into CacheMemoryContext.  I figure
this is the way to go, but requires extra 'if' on every cache 'get'.
*) Didn't abstract the clog bit addressing macros.  I'm leaning on not
doing this, but maybe they should be.  My reasoning is that there is
no requirement for hint bit cache that pages should be whatever block
size is, and I'd like to reserve the ability to adjust the cache page
size independently.

I'd like to know if this is a strategy that merits further work...If
anybody has time/interest that is.  It's getting close to the point
where I can just post it to the commit fest for review.  In
particular, I'm concerned if Tom's earlier objections can be
satisfied. If not, it's back to the drawing board...

merlin


hbache_v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Alvaro Herrera
Excerpts from Robert Haas's message of lun may 09 14:31:33 -0400 2011:

 I'm happy enough with that set of guidelines: namely, that we'd use
 src/extension only for things that don't require additional
 dependencies, and not for things that build standalone executables.
 If we're going to move things around, I think we should take the
 trouble to categorize them along the way, and your idea of inserting
 one more subdirectory under src/extension for grouping seems fine to
 me.

For executables we already have src/bin.  Do we really need a separate
place for, say, pg_standby or pg_upgrade?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collation mega-cleanups

2011-05-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom this collation stuff has seen more post-feature-commit cleanups than
 I think any patch I remember.  Is there anything we can learn from this?

The pre-commit review was obviously woefully inadequate.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures

2011-05-09 Thread Merlin Moncure
On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian br...@momjian.us wrote:
 Josh Berkus wrote:
 Peter,

  I would like to collect some specs on this feature.  So does anyone have
  links to documentation of existing implementations, or their own spec
  writeup?  A lot of people appear to have a very clear idea of this
  concept in their own head, so let's start collecting those.

 Delta between SPs and Functions for PostgreSQL:

 * SPs are executed using CALL or EXECUTE, and not SELECT.

 * SPs do not return a value
 ** optional: SPs *may* have OUT parameters.

 [ Late reply.]

 What is it about stored procedures that would require it not to return a
 value or use CALL?  I am trying to understand what part of this is
 procedures (doesn't return a values, we decided there isn't much value
 for that syntax vs. functions), and anonymous transactions.

FWICT the sql standard.  The only summary of standard behaviors I can
find outside of the standard itself is here:
http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html.
 Peter's synopsis of how the standard works is murky at best and
competing implementations are all over the place...SQL server's
'CALL'  feature is basically what I personally would like to see. It
would complement our functions nicely.

Procedures return values and are invoked with CALL.  Functions return
values and are in-query callable.

The fact that 'CALL' is not allowed inside a query seems to make it
pretty darn convenient to make the additional distinction of allowing
transactional control statements there and not in functions.  You
don't *have* to allow transactional control statements and could offer
this feature as an essentially syntax sugar enhancement, but then run
the risk of boxing yourself out of a useful properties of this feature
later on because of backwards compatibility issues (in particular, the
assumption that your are in a running transaction in the procedure
body).

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collation mega-cleanups

2011-05-09 Thread Robert Haas
On Mon, May 9, 2011 at 2:58 PM, Bruce Momjian br...@momjian.us wrote:
 Tom this collation stuff has seen more post-feature-commit cleanups than
 I think any patch I remember.  Is there anything we can learn from this?

How about don't commit all the large patches at the end of the cycle?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inconsistent treatment of serials in pg_dump

2011-05-09 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 It seems that if I exclude a table using -T, its dependant sequences do
 not get excluded.  But if I include it using -t, its dependent sequences
 *do* get included.

 Is there a reason this is a good idea, or is it just an oversight?

It's not immediately clear to me that those switches ought to be exact
inverses.

As a counterexample, consider the case where multiple tables share the
same sequence.  Suppressing one of the tables with -T ought not lead to
suppressing the sequence.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Robert Haas's message of lun may 09 14:31:33 -0400 2011:
 I'm happy enough with that set of guidelines: namely, that we'd use
 src/extension only for things that don't require additional
 dependencies, and not for things that build standalone executables.
 If we're going to move things around, I think we should take the
 trouble to categorize them along the way, and your idea of inserting
 one more subdirectory under src/extension for grouping seems fine to
 me.

 For executables we already have src/bin.  Do we really need a separate
 place for, say, pg_standby or pg_upgrade?

Putting them in there implies we think they are of core-code quality.
I'm definitely *not* ready to grant that status to pg_upgrade, for
instance.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inconsistent treatment of serials in pg_dump

2011-05-09 Thread Josh Berkus

 As a counterexample, consider the case where multiple tables share the
 same sequence.  Suppressing one of the tables with -T ought not lead to
 suppressing the sequence.

Now, that's a good point.  And I don't expect that pg_dump can
distinguish between a serial and an sequence with a dependency?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Sure, but that's a documentation issue, which again is not going to be
 helped by a source-tree rearrangement.

So we have several problem to solve here, and I agree that source code
rearrangement is fixing none of them.  Maybe it would ease maintaining
down the road, though, but I'll leave that choice up to you.

Which contribs are ready (safe) for production?  We could handle that in
the version numbers, having most of contrib at version 0.9.1 (say) and
some of them at version 1.0.  We could also stop distributing examples
in binary form, only ship them in source package.

Then we need to include inspection extensions into the core packaging,
but still as extensions.  That's more of a packager problem, except that
they need a clear and strong message about it.  Maybe have a new
Makefile and build those extensions as part of the server build, and
install them as part as the normal install.

Another mix of those ideas is to ship inspection extensions and ready
for production ones all into a new package postgresql-server-extensions
that the main server would depend on, and the ones that are adding more
dependencies still in contrib, where not-ready for production extensions
would not get built.

This kind of ideas would also allow to quite easily remove things from
the main server and have them as extensions, available on any install
but a CREATE EXTENSION (WITH SCHEMA pg_catalog) command away.  And still
maintained at the same quality level in the source tree.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inconsistent treatment of serials in pg_dump

2011-05-09 Thread Alvaro Herrera
Excerpts from Josh Berkus's message of lun may 09 16:43:10 -0400 2011:

 Now, that's a good point.  And I don't expect that pg_dump can
 distinguish between a serial and an sequence with a dependency?

They're the same thing, so no.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays

2011-05-09 Thread J. Greg Davidson

E.1.2.2. Casting

  * Tighten casting checks for domains based on arrays (Tom Lane) 

When a domain is based on an array type,..., such a domain type
is no longer allowed to match an anyarray parameter of a
polymorphic function, except by explicitly downcasting it to the
base array type. 

This will require me to add hundreds of casts to my code.  I do not get
how this will Tighten casting checks.  It will certainly not tighten
my code!  Could you explain how it is good to not be able to do array
operations with a type which is an array?

BTW: All of my DOMAINs which are array types exist because of
PostgreSQL's inability to infer array types for DOMAINs, so I
have lots of code like this:

CREATE DOMAIN foo_ids AS integer;
CREATE DOMAIN foo_id_arrays AS integer[];

I would love to be able to simply use foo_ids[] instead of
having to have the second DOMAIN foo_id_arrays.

If there is some value which I'm missing in the above Tightening,
perhaps it could be put in *after* PostgreSQL were given the ability
to understand foo_ids[] as an array of foo_ids.

Thanks,

_Greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Server Programming Interface underspecified in 4.1beta1

2011-05-09 Thread J. Greg Davidson
The documentation for the Server Programming Interface
is underspecified such that it is impossible to create
real extensions using only the documented interface.
For example, in the example 43.5

http://www.postgresql.org/docs/9.1/static/spi-examples.html

a variable of type SPITupleTable is being dereferenced
as tupdesc-natts but this is not documented.  The SPI
documentation suggests studying the contributed extension
code for further examples but that code is full of hundreds
of features which are not in the SPI at all.

My own SPI code uses dozens of macros and field names which
I found in existing source and in conversations on the
mailing lists.  I'm always nervous when a new release
comes out because the API I am using is not official and
might therefore change and break my code.  This is especially
bad since much of the code is not type-safe and problems
will tend to be hidden by all of the casts in the
pre-ISO-C-style macro code.

It would be good if the Server Programming Interface were
sufficiently documented that most of the contributed
extensions which ship with PostgreSQL and most of the
SPI-based modules in the backend were using ONLY the
documented features of the SPI.  Macros hiding casts and
typedefs hiding void * types should replaced with
inline functions and pointers to specific strong types.

To be very clear: Yes, I can always rummage around in
the include files and source to find out how to do things
IN THIS RELEASE.  But I can't expect PostgreSQL development
to avoid breaking the idioms I happen upon, nor can
the developers write unit tests to ensure that proper
extensions using the SPI will not break.

I would like to be able to program to a C or C++ SPI
which is clean, complete and type-safe.  I am good at
reading API documentation in C or C++ and would be happy
to review any proposed improvements.

_Greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Greg Smith

On 05/09/2011 03:31 PM, Alvaro Herrera wrote:

For executables we already have src/bin. Do we really need a separate
place for, say, pg_standby or pg_upgrade?
   


There's really no executables in contrib that I find myself regularly 
desperate for/angry at because they're not installed as an integral part 
of the server, the way I regularly find myself cursing some things that 
are now extensions.  The only contrib executable I use often is pgbench, 
and that's normally early in server life--when it's still possible to 
get things installed easily most places.  And it's something that can be 
removed when finished, in cases where people are nervous about the 
contrib package.


Situations where pg_standby or pg_upgrade suddenly pop up as emergency 
needs seem unlikely too, which is also the case with oid2name, 
pg_test_fsync, pg_archivecleanup, and vacuumlo.  I've had that happen 
with pg_archivecleanup exactly once since it appeared--running out of 
space and that was the easiest way to make the problem go away 
immediately and permanently--but since it was on an 8.4 server we had to 
download the source and build anyway.


Also, my experience is that people are not that paranoid about running 
external binaries, even though they could potentially do harm to the 
database.  Can always do a backup beforehand.  But the idea of loading a 
piece of code that lives in the server all the time freaks them out.  
The way the word contrib implies (and sometimes is meant to mean) low 
quality, while stuff that ships with the main server package does not, 
has been beaten up here for years already.  It's only a few cases where 
that's not fully justified, and the program can easily become an 
extension, that I feel are really worth changing here.  There are 49 
directories in contrib/ ; at best maybe 20% of them will ever fall into 
that category.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why not install pgstattuple by default?

2011-05-09 Thread Greg Smith

On 05/09/2011 02:31 PM, Robert Haas wrote:

I don't think we should be too obstinate about trying to twist the arm
of packagers who (as Tom points out) will do whatever they want in
spite of us, but the current state of contrib, with all sorts of
things of varying type, complexity, and value mixed together cannot
possibly be a good thing.


I think the idea I'm running with for now means that packagers won't 
actually have to do anything.  I'd expect typical packaging for 9.1 to 
include share/postgresql/extension from the build results without being 
more specific.  You need to grab 3 files from there to get the plpgsql 
extension, and I can't imagine any packager listing them all by name.  
So if I dump the converted contrib extensions to put files under there, 
and remove them from the contrib build area destination, I suspect they 
will magically jump from the contrib to the extensions area of the 
server package at next package build; no packager level changes 
required.  The more I look at this, the less obtrusive of a change it 
seems to be.  Only people who will really notice are users who discover 
more in the basic server package, and of course committers with 
backporting to do.


Since packaged builds of 9.1 current with beta1 seem to be in short 
supply still, this theory looks hard to prove just yet.  I'm very 
excited that it's packaging week here however (rolls eyes), so I'll 
check it myself.  I'll incorporate the suggestions made since I posted 
that test patch and do a bigger round of this next, end to end with an 
RPM set as the output.  It sounds like everyone who has a strong opinion 
on what this change might look like has sketched a similar looking 
bikeshed.  Once a reasonable implementation is hammered out, I'd rather 
jump to the big argument between not liking change vs. the advocacy 
benefits to PostgreSQL of doing this; they are considerable in my mind.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Server Programming Interface underspecified in 4.1beta1

2011-05-09 Thread Andrew Dunstan



On 05/09/2011 05:23 PM, J. Greg Davidson wrote:

It would be good if the Server Programming Interface were
sufficiently documented that most of the contributed
extensions which ship with PostgreSQL and most of the
SPI-based modules in the backend were using ONLY the
documented features of the SPI.



Docs patches welcome.

(BTW, 4.1beta)

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Greg Stark
On Mon, May 9, 2011 at 7:18 PM, Robert Haas robertmh...@gmail.com wrote:
 Ah ha!  Now we're getting somewhere.  As was doubtless obvious from my
 previous responses, I don't agree that the process is as broken as I
 felt you were suggesting, and I think we've made a lot of
 improvements.  However, I am in complete agreement with you on this
 point.  Unfortunately, people often come into our community with
 incorrect assumptions about how it works, including:

 - someone's in charge
 - there's one right answer
 - it's our job to fix your problem

 Now if you read a few hundred emails (which is not that much calendar
 time, if you read them all) it's not too hard to figure out what the
 real dynamic is, and I think that real dynamic is increasingly
 positive (with some unfortunate exceptions).  But if the first thing
 you do is post (no doubt about some large or controversial change),
 yeah, serious culture shock.

Honestly it's not even that clear. It took me years to realize that
when Tom says There's problems x, y, z he doesn't mean give up now
there are all these fatal flaws but rather think about these things
and maybe they're problems and maybe they're not, but we need to
figure that out.

To be fair that's true for everyone on th4 list depending on the
audience. We have a tendency to state general concerns as pretty
black-and-white statements that would read to a newbie as fatal flaws
that aren't worth investigating.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unfriendly handling of pg_hba SSL options with SSL off

2011-05-09 Thread Bruce Momjian
Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On m??n, 2011-04-25 at 14:18 -0400, Tom Lane wrote:
  In the particular case at hand, if someone is trying to use the same
  hostssl-containing pg_hba.conf across multiple systems, is it not
  reasonable to suppose that he should have SSL turned on in
  postgresql.conf on all those systems?  If he doesn't, it's far more
  likely to be a configuration mistake that he'd appreciate being pointed
  out to him, instead of having to reverse-engineer why some of the
  systems aren't working like others.
 
  I think, people use and configure PostgreSQL in all kinds of ways, so we
  shouldn't assume what they might be thinking.  Especially if an
  artificial boundary has the single purpose of being helpful.
 
 Well, it's not just to be helpful, it's to close off code paths that
 are never going to be sufficiently well-tested to not have bizarre
 failure modes.  That helps both developers (who don't have to worry
 about testing/fixing such code paths) and users (who won't have to deal
 with the bizarre failure modes).
 
 But in any case, I think that the presence of a hostssl line in
 pg_hba.conf is pretty strong evidence that the admin intends to use SSL,
 so we should tell him about it if he's forgotten the other piece of
 setup he needs.

Late reply, but we are basically ignoring 'local' lines if the build
doesn't support unix domain sockets (windows), but throwing an error for
hostssl usage if ssl is not compiled in.  Is the only logic here that
'local' is part of the default pg_hba.conf and hostssl is not?  Is that
good logic?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Kevin Grittner wrote:
 Peter Eisentraut pete...@gmx.net wrote:
  
  what would be the actual use cases of any of these features? 
  Let's collect some, so we can think of ways to make them work.
  
 The two things which leap to mind for me are:
  
 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.  The \d commands would be changed to
 call the SPs for releases recent enough to support this.  Eventually
 psql would be free of worrying about which release contained which
 columns in which system tables, because it would just be passing the
 parameters in and displaying whatever results came back.
  
 I have used products which implemented something like this, and
 found it quite useful.

Uh, why does this require stored procedures?  Seems our existing
function capabilities are even better suited to this.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On m??n, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
  (1)  All the \d commands in psql should be implemented in SPs so
  that they are available from any client, through calling one SP
  equivalent to one \d command.
 
  You don't need stored procedures with special transaction behavior for
  this.
 
 No, but what you *would* need is the ability to return multiple result
 sets from one call.  Even then, you could not exactly duplicate the
 current output of \d; but you could duplicate the functionality.

Oh, good point.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Alvaro Herrera
Excerpts from Greg Stark's message of lun may 09 19:44:15 -0400 2011:

 Honestly it's not even that clear. It took me years to realize that
 when Tom says There's problems x, y, z he doesn't mean give up now
 there are all these fatal flaws but rather think about these things
 and maybe they're problems and maybe they're not, but we need to
 figure that out.

These things may seem trivial but I think they are worth documenting.
It feels weird to document something that's inherently social rather
than technical (to me at least), but if that's what we need to help
others to collaborate, then we should.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Andrew Dunstan



On 05/09/2011 08:20 PM, Bruce Momjian wrote:

Tom Lane wrote:

Peter Eisentrautpete...@gmx.net  writes:

On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:

(1)  All the \d commands in psql should be implemented in SPs so
that they are available from any client, through calling one SP
equivalent to one \d command.

You don't need stored procedures with special transaction behavior for
this.

No, but what you *would* need is the ability to return multiple result
sets from one call.  Even then, you could not exactly duplicate the
current output of \d; but you could duplicate the functionality.

Oh, good point.  Thanks.


Multiple resultsets in one call would be a good thing, though, no?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Greg Smith

Josh Berkus wrote:

As I don't think we can change this, I think the best answer is to tell people
Don't submit a big patch to PostgreSQL until you've done a few small
patches first.  You'll regret it.
  


When I last did a talk about getting started writing patches, I had a 
few people ask me afterwards if I'd ever run into problems with having 
patch submissions rejected.  I said I hadn't.  When asked what my secret 
was, I told them my first serious submission modified exactly one line 
of code[1].  And *that* I had to defend in regards to its performance 
impact.[2]


Anyway, I think the intro message should be Don't submit a big patch to 
PostgreSQL until you've done a small patch and some patch review 
instead though.  It's both a good way to learn what not to do, and it 
helps with one of the patch acceptance bottlenecks.



The problem is not the process itself, but that there is little
documentation of that process, and that much of that documentation does
not match the defacto process.  Obviously, the onus is on me as much as
anyone else to fix this.
  


I know the documentation around all this has improved a lot since then.  
Unfortunately there's plenty of submissions done by people who never 
read it.  Sometimes it's because people didn't know about it; in others 
I suspect it was seen but some hard parts ignored because it seemed like 
too much work.


One of these days I'm going to write the Formatting Curmudgeon Guide to 
Patch Submission, to give people an idea just how much diff reading and 
revision a patch should go through in order to keep common issues like 
spurious whitespace diffs out of it.  Submitters can either spent a 
block of time sweating those details out themselves, or force the job 
onto a reviewer/committer; they're always there.  And every minute it's 
sitting in someone else's hands who is doing that job instead of reading 
the real code, the odds of the patch being kicked back go up.


[1] http://archives.postgresql.org/pgsql-patches/2007-03/msg00553.php
[2] http://archives.postgresql.org/pgsql-patches/2007-02/msg00222.php

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays

2011-05-09 Thread Merlin Moncure
On Mon, May 9, 2011 at 6:25 PM, J. Greg Davidson g...@ngender.net wrote:

 E.1.2.2. Casting

      * Tighten casting checks for domains based on arrays (Tom Lane)

        When a domain is based on an array type,..., such a domain type
        is no longer allowed to match an anyarray parameter of a
        polymorphic function, except by explicitly downcasting it to the
        base array type.

 This will require me to add hundreds of casts to my code.  I do not get
 how this will Tighten casting checks.  It will certainly not tighten
 my code!  Could you explain how it is good to not be able to do array
 operations with a type which is an array?

 BTW: All of my DOMAINs which are array types exist because of
 PostgreSQL's inability to infer array types for DOMAINs, so I
 have lots of code like this:

 CREATE DOMAIN foo_ids AS integer;
 CREATE DOMAIN foo_id_arrays AS integer[];

 I would love to be able to simply use foo_ids[] instead of
 having to have the second DOMAIN foo_id_arrays.

 If there is some value which I'm missing in the above Tightening,
 perhaps it could be put in *after* PostgreSQL were given the ability
 to understand foo_ids[] as an array of foo_ids.

I didn't read the thread that led up to this change (see:
http://postgresql.1045698.n5.nabble.com/Domains-versus-arrays-versus-typmods-td3227701.html)
but if I had, I would have argued that the problem cases listed, the
worst being the failed constraint check, do not justify the
compatibility break :(.  In the pre-unnest() world, you might have
gotten away with it, but it's been out for two released versions (and
some ad hoc approaches for longer than that) and perhaps we were too
quick on the trigger.  Considering we've already got a report during
beta1, this does not exactly inspire confidence.

We've got other cases of known bugs where a good solution is unclear
or breaks unknown amounts of user code.  The giant clusterfark
surrounding is null and composites comes to mind.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] crash-safe visibility map, take five

2011-05-09 Thread Rob Wultsch
On Fri, May 6, 2011 at 2:47 PM, Robert Haas robertmh...@gmail.com wrote:
 Comments?

At my day job there is saying: Silence is consent.

I am surprised there has not been more discussion of this change,
considering the magnitude of the possibilities it unlocks.


-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process wakeups when idle and power consumption

2011-05-09 Thread Fujii Masao
On Mon, May 9, 2011 at 8:27 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 9 May 2011 11:19, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:

 Can't we use the pipe trick on Windows? The API is different, but we use
 pipes on Windows for other things already. When a process is launched, open
 a pipe between postmaster and the child process. In the child, spawn a
 thread that just calls ReadFile() on the pipe, which blocks. If postmaster
 dies, the ReadFile() call will return with an error.

 Alright. I'm currently working on a proof-of-concept implementation of
 that. In the meantime, any thoughts on how this should meld with the
 existing latch implementation?

How about making WaitLatch monitor the file descriptor for the pipe
by using select()?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Andrew Dunstan



On 05/09/2011 09:43 PM, Greg Smith wrote:


When I last did a talk about getting started writing patches, I had a 
few people ask me afterwards if I'd ever run into problems with having 
patch submissions rejected.  I said I hadn't.


Part of the trouble is in the question. Having a patch rejected is not 
really a problem; it's something you should learn from. I know it can be 
annoying. I get annoyed when it happens to me too. But I try to get over 
it as quickly as possible, and either fix the patch, or find another 
(and better) way to do the same thing, or move on. Everybody here is 
acting in good faith, and nobody's on a power trip. That's one of the 
good things about working on Postgres. If it were otherwise I would have 
moved on to something else long ago.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Christopher Browne
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 05/09/2011 08:20 PM, Bruce Momjian wrote:

 Tom Lane wrote:

 Peter Eisentrautpete...@gmx.net  writes:

 On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:

 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.

 You don't need stored procedures with special transaction behavior for
 this.

 No, but what you *would* need is the ability to return multiple result
 sets from one call.  Even then, you could not exactly duplicate the
 current output of \d; but you could duplicate the functionality.

 Oh, good point.  Thanks.

 Multiple resultsets in one call would be a good thing, though, no?

 cheers

I *thought* the purpose of having stored procedures was to allow a
substrate supporting running multiple transactions, so it could do
things like:
- Managing vacuums
- Managing transactions
- Replacing some of the need for dblink.
- Being an in-DB piece that could manage LISTENs

It seems to be getting bikeshedded into something with more
functional argument functionality than stored functions.

I think we could have a perfectly successful implementation of stored
procedures that supports ZERO ability to pass arguments in or out.
That's quite likely to represent a good start.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] the big picture for index-only scans

2011-05-09 Thread Robert Haas
So, what do we need in order to find our way to index-only scans?

1. The visibility map needs to be crash-safe.  The basic idea of
index-only scans is that, instead of checking the heap to find out
whether each tuple is visible, we first check the visibility map.  If
the visibility map bit is set, then we know all tuples on the page are
visible to all transactions, and therefore the tuple of interest is
visible to our transaction.  Assuming that a significant number of
visibility map bits are set, this should enable us to avoid a fair
amount of I/O, especially on large tables, because the visibility map
is roughly 8000 times smaller than the heap, and therefore far more
practical to keep in cache.  However, before we can rely on the
visibility map for this purpose, we need to fix the problems that can
leave bits set inappropriately in the face of an inconveniently-timed
crash.  I've been working on a patch for this on and off for a few
months now; my latest version is in need of review[1].

2. Crash safe visibility map vs. pg_upgrade.  Even if we make the
visibility map crash-safe in 9.2, people are going to want to use
pg_upgrade to migrate from older versions, bringing their
possibly-not-quite-correct visibility map forks along with them.  How
should we handle that?  We could (2A) arrange to have pg_upgrade nuke
all visibility forks when upgrading from a release where the
visibility map is not crash-safe to one where it is; (2B) keep a piece
of state somewhere indicating, for each relation, whether or not the
visibility map can be trusted, set it to false only if pg_upgrade
brings the relation over from and older version, and set it to true
after a successful vacuum that skips no intervening pages; or (2C)
advise the user to do a VACUUM FULL on each of their tables
pre-upgrade, and if they don't, treat wrong answers as their own
fault.  (I doubt anyone will advocate for this option, but for the
sake of completeness...).  (2A) seems like the simplest solution,
especially because it also avoids the overhead of checking the is the
visibility map bit reliable? flag every time we want to plan a query.

3. Statistics.  I believe that in order to accurately estimate the
cost of an index-only scan, we're going to need to know the fraction
of tuples that are on pages whose visibility map bits are set.  I
believe it should be fairly straightforward to have ANALYZE collect
this information; and I'm inclined to do that as a separate patch.  It
seems like it would also be nice to know what fraction of tuples are
on pages that don't have the visibility map set but where, in fact,
all tuples on the page are visible to all transactions, so it would be
legal to set the bit.  A large discrepancy between these two
percentages might be a good reason to auto-vacuum the table (perhaps
using a really lazy vacuum[2]).  I'm not sure if this can be added
cheaply, though, and in any case, any change to the set of criteria
that will trigger an auto-vacuum is probably a can of worms.
Thoughts?

4. Planner and executor changes.  In contrast to Heikki's original
implementation, I'm inclined to not to try to split the Index Scan
node into index scan and heap fetch.  Since there are many choices for
where to put the heap fetch node (any level of the join tree between
the index scan and the root), this seems likely to result in a
combinatorial explosion of paths[3], and I'm not real sure that the
payback will be adequate.  Furthermore, the idea of allowing user code
to see tuples that will only later be determined not to have been
visible to that MVCC snapshot in the first place seems pretty scary
from a security perspective, though certainly there are possible
benefits[4].  Instead, I'm inclined to just have the planner evaluate
whether the necessary columns can be extracted purely from the index.
If not, we proceed as now.  If so, we can use the index only
approach of using the visibility map to decide which heap fetches can
be skipped.  It's not clear to me whether we need to compare the cost
of the standard approach with the cost of the index only approach:
in theory, if there aren't any visibility map bits anyway, the index
only approach could be slower.  But I'm not sure whether that problem
is significant or common enough to be worth expending a lot of code
on.  Either way, the number of actual paths doesn't need to increase,
because in this design, even if we apply a costing model, one approach
will dominate the other.  Heikki also suggested considering index
scans in cases where we don't now[4, again] but I'm inclined to leave
this, too, for a later optimization, again because balancing the
increase in paths against the possible performance benefits of using
indexes in more situations seems finicky.  In short, for a first cut
at this, I just want to look at this as a way to get cheaper index
scans, and leave everything else to future work.

Any thoughts welcome.  Incidentally, if anyone else feels like working
on this, feel 

Re: [HACKERS] crash-safe visibility map, take five

2011-05-09 Thread Merlin Moncure
On Fri, May 6, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Mar 30, 2011 at 8:52 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Another question:
 To address the problem in
 http://archives.postgresql.org/pgsql-hackers/2010-02/msg02097.php
 , should we just clear the vm before the log of insert/update/delete?
 This may reduce the performance, is there another solution?

 Yeah, that's a straightforward way to fix it. I don't think the performance
 hit will be too bad. But we need to be careful not to hold locks while doing
 I/O, which might require some rearrangement of the code. We might want to do
 a similar dance that we do in vacuum, and call visibilitymap_pin first, then
 lock and update the heap page, and then set the VM bit while holding the
 lock on the heap page.

 Here's an attempt at implementing the necessary gymnastics.

Is there a quick synopsis of why you have to do (sometimes) the
pin-lock-unlock-pin-lock mechanic? How come you only can fail to
get the pin at most once?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-05-09 Thread Merlin Moncure
On Mon, May 9, 2011 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote:
 So, what do we need in order to find our way to index-only scans?

 1. The visibility map needs to be crash-safe.  The basic idea of
 index-only scans is that, instead of checking the heap to find out
 whether each tuple is visible, we first check the visibility map.  If
 the visibility map bit is set, then we know all tuples on the page are
 visible to all transactions, and therefore the tuple of interest is
 visible to our transaction.  Assuming that a significant number of
 visibility map bits are set, this should enable us to avoid a fair
 amount of I/O, especially on large tables, because the visibility map
 is roughly 8000 times smaller than the heap, and therefore far more
 practical to keep in cache.

hm, what are the implications for tuple hint bits, short and long
term?  I'm particularly interested if you think any hint bit i/o
mitigation strategies are worth pursuing.

 2. Crash safe visibility map vs. pg_upgrade.  Even if we make the
 visibility map crash-safe in 9.2, people are going to want to use
 pg_upgrade to migrate from older versions, bringing their
 possibly-not-quite-correct visibility map forks along with them.  How
 should we handle that?  We could (2A) arrange to have pg_upgrade nuke
 all visibility forks when upgrading from a release where the
 visibility map is not crash-safe to one where it is;

+1 on 2A.

 3. Statistics.  I believe that in order to accurately estimate the
 cost of an index-only scan, we're going to need to know the fraction
 of tuples that are on pages whose visibility map bits are set.

It would be helpful to know the performance benefit of index only
scans before knowing how much benefit to attribute here.  Maybe a
system wide kludge would for starters anyway, like assuming 60% of
pages can be vis checked from the VM, or a single GUC, Then again,
maybe not.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XML with invalid chars

2011-05-09 Thread Noah Misch
On Sun, May 08, 2011 at 06:25:27PM -0400, Andrew Dunstan wrote:
 On 04/27/2011 11:41 PM, Noah Misch wrote:
 On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote:
 On 04/27/2011 05:30 PM, Noah Misch wrote:
 To make things worse, the dump/reload problems seems to depend on your 
 version
 of libxml2, or something.  With git master, a CentOS 5 system with
 2.6.26-2.1.2.8.el5_5.1 accepts the ^A byte, but an Ubuntu 8.04 LTS system 
 with
 2.6.31.dfsg-2ubuntu rejects it.  Even with a patch like this, systems with 
 a
 lenient libxml2 will be liable to store XML data that won't restore on a 
 system
 with a strict libxml2.  Perhaps we should emit a build-time warning if the 
 local
 libxml2 is lenient?
 No, I think we need to be strict ourselves.
 Then I suppose we'd also scan for invalid characters in xml_parse()?  Or, at
 least, do so when linked to a libxml2 that neglects to do so itself?

 Yep.

I see you've gone with doing it unconditionally.  I'd lean toward testing the
library in pg_xml_init and setting a flag indicating whether we need the extra
pass.  However, a later patch can always optimize that.

 Injecting the check here aids xmlelement and xmlforest , but 
 xmlcomment
 and xmlpi still let the invalid byte through.  You can also still inject 
 the
 byte into an attribute value via xmlelement.  I wonder if it wouldn't 
 make
 more sense to just pass any XML that we generate from scratch through 
 libxml2.
 There are a lot of holes to plug, otherwise.
 Maybe there are, but I'd want lots of convincing that we should do that
 at this stage. Maybe for 9.2. I think we can plug the holes fairly
 simply for xmlpi and xmlcomment, and catch the attributes by moving this
 check up into map_sql_value_to_xml_value().
 I don't have much convincing to offer -- hunting down the holes seem fine, 
 too.

 I think I've done that. Here's the patch I have now. It looks like we  
 can catch pretty much everything by putting checks in four places, which  
 isn't too bad.

 Please review and try to break.

Here are the test cases I tried:

-- caught successfully
SELECT E'\x01'::xml;
SELECT xmlcomment(E'\x01');
SELECT xmlelement(name foo, xmlattributes(E'\x01' AS bar), '');
SELECT xmlelement(name foo, NULL, E'\x01');
SELECT xmlforest(E'\x01' AS foo);
SELECT xmlpi(name foo, E'\x01');
SELECT query_to_xml($$SELECT E'\x01'$$, true, false, '');

-- not caught
SELECT xmlroot('root/', version E'\x01');
SELECT xmlcomment(E'\ufffe');

-- not directly related, but also wrongly accepted
SELECT xmlroot('root/', version ' ');
SELECT xmlroot('root/', version 'foo');

Offhand, I don't find libxml2's handling of XML declarations particularly
consistent.  My copy's xmlCtxtReadDoc() API (used by xml_in when xmloption =
document) accepts '?xml version=foo?' but rejects '?xml version= ?'.
Its xmlParseBalancedChunkMemory() API (used by xml_in when xmloption = content)
accepts anything, even control characters.  The XML 1.0 standard is stricter:
the version must match ^1\.[0-9]+$.  We might want to tighten this at the same
time.

 diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
 index ee82d46..12cfd56 100644
 --- a/src/backend/utils/adt/xml.c
 +++ b/src/backend/utils/adt/xml.c
 @@ -142,6 +142,20 @@ static void SPI_sql_row_to_xmlelement(int rownum, 
 StringInfo result,
  #define NAMESPACE_XSI http://www.w3.org/2001/XMLSchema-instance;
  #define NAMESPACE_SQLXML http://standards.iso.org/iso/9075/2003/sqlxml;
  
 +/* forbidden C0 control chars */
 +#define FORBIDDEN_C0  \
 + \x01\x02\x03\x04\x05\x06\x07\x08\x0B\x0C\x0E\x0F\x10\x11 \
 + \x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F
 +
 +static inline void
 +check_forbidden_c0(char * str)
 +{
 + if (strpbrk(str,FORBIDDEN_C0) != NULL)
 + ereport(ERROR,
 + (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 +  errmsg(character out of range),
 +  errdetail(XML does not support control 
 characters.)));

This would be an errhint, I think.  However, the message seems to emphasize
the wrong thing.  XML 1.0 defines a lexical production called Char that
includes various Unicode character ranges.  Control characters as we know them
happen to not fall in any of those ranges.  The characters aren't unsupported
in the sense of being missing features; the language simply forbids them.

libxml2's error message for this case is PCDATA invalid Char value 1
(assuming \x01).  Mentioning PCDATA seems redundant, since no other context
offers greater freedom.  How about:

ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg(invalid XML 1.0 Char \\U%08x, char_val)));

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Christopher Browne wrote:
  Multiple resultsets in one call would be a good thing, though, no?
 
  cheers
 
 I *thought* the purpose of having stored procedures was to allow a
 substrate supporting running multiple transactions, so it could do
 things like:
 - Managing vacuums
 - Managing transactions
 - Replacing some of the need for dblink.
 - Being an in-DB piece that could manage LISTENs
 
 It seems to be getting bikeshedded into something with more
 functional argument functionality than stored functions.
 
 I think we could have a perfectly successful implementation of stored
 procedures that supports ZERO ability to pass arguments in or out.
 That's quite likely to represent a good start.

I am kind of confused too, particularly with the CALL syntax.  I thought
our function call usage was superior in every way to CALL, so why
implement CALL?  I assume for SQL-standards compliance, right?  Does
multiple result sets require CALL?  I assume autonomous transactions
don't require CALL.

Are we assuming no one is going to want a function that allows multiple
result sets or autonomous transactions?  That seems unlikely.  I would
think CALL is independent of those features.  Maybe we need those
features to support SQL-standard CALL, and we will just add those
features to functions too.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 4.1beta1: ANYARRAY disallowed for DOMAIN types which happen to be arrays

2011-05-09 Thread Tom Lane
J. Greg Davidson g...@ngender.net writes:
   * Tighten casting checks for domains based on arrays (Tom Lane) 

 When a domain is based on an array type,..., such a domain type
 is no longer allowed to match an anyarray parameter of a
 polymorphic function, except by explicitly downcasting it to the
 base array type. 

 This will require me to add hundreds of casts to my code.  I do not get
 how this will Tighten casting checks.  It will certainly not tighten
 my code!  Could you explain how it is good to not be able to do array
 operations with a type which is an array?

The discussion that led up to that decision is in this thread:
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01362.php
specifically here:
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01545.php

The previous behavior was clearly broken.  The new behavior is at least
consistent.  It might be more user-friendly if we did automatic
downcasts in these cases, but we were not (and still are not) doing
automatic downcasts for domains over scalar types in comparable cases,
so it's not very clear why domains over array types should be treated
differently.

To be concrete, consider the function array_append(anyarray, anyelement)
yielding anyarray.  Suppose we have a domain D over int[] and the call
array_append(var_of_type_D, 42).  If we automatically downcast the
variable to int[], should the result of the function be considered to be
of type D, or type int[]?  This isn't a trivial distinction because
choosing to consider it of type D means we have to re-check D's domain
constraints, which might or might not be satisfied by the modified
array.  Previous releases considered the result to be of type D,
*without* rechecking the domain constraints, which was flat out wrong.

So we basically had three alternatives to make it better:
* downcast to the array type, which would possibly silently
  break applications that were relying on the function result
  being considered of the domain type
* re-apply domain checks on the function result, which would be
  a performance hit and possibly again result in unobvious
  breakage
* explicitly break it by throwing a parse error until you
  downcast (and then upcast the function result if you want)
I realize that #3 is a bit unpleasant, but are either of the other two
better?  At least #3 shows you where you need to check for problems.

There is another issue that wasn't really mentioned in the previous
thread, which is that if we are matching a domain-over-array to a
function's ANYARRAY argument, what exactly should be allowed to match to
ANYELEMENT --- or if the function returns ANYELEMENT, what should the
imputed result type be?  AFAICS it's impossible to give an answer to
that without effectively deciding that function argument matching
smashes the domain to its base type (the array type).  It's not very
clear what's the point of a domain type if every operation on it is
going to neglect its domain-ness.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unfriendly handling of pg_hba SSL options with SSL off

2011-05-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Late reply, but we are basically ignoring 'local' lines if the build
 doesn't support unix domain sockets (windows), but throwing an error for
 hostssl usage if ssl is not compiled in.  Is the only logic here that
 'local' is part of the default pg_hba.conf and hostssl is not?  Is that
 good logic?

I wouldn't have a problem with making the Windows port throw an error
for local lines.  We'd have to fix initdb to remove that line from the
sample file (if it doesn't already), but that's surely not hard.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Pavel Stehule
2011/5/10 Bruce Momjian br...@momjian.us:
 Christopher Browne wrote:
  Multiple resultsets in one call would be a good thing, though, no?
 
  cheers

 I *thought* the purpose of having stored procedures was to allow a
 substrate supporting running multiple transactions, so it could do
 things like:
 - Managing vacuums
 - Managing transactions
 - Replacing some of the need for dblink.
 - Being an in-DB piece that could manage LISTENs

 It seems to be getting bikeshedded into something with more
 functional argument functionality than stored functions.

 I think we could have a perfectly successful implementation of stored
 procedures that supports ZERO ability to pass arguments in or out.
 That's quite likely to represent a good start.

 I am kind of confused too, particularly with the CALL syntax.  I thought
 our function call usage was superior in every way to CALL, so why
 implement CALL?  I assume for SQL-standards compliance, right?  Does
 multiple result sets require CALL?  I assume autonomous transactions
 don't require CALL.


no - you are little bit confused :). CALL and function execution
shares nothing. There is significant differences between function and
procedure. Function is called only from executor - from some plan, and
you have to know a structure of result before run. The execution of
CALL is much simple - you just execute code - without plan and waiting
for any result - if there is.

 Are we assuming no one is going to want a function that allows multiple
 result sets or autonomous transactions?  That seems unlikely.  I would
 think CALL is independent of those features.  Maybe we need those
 features to support SQL-standard CALL, and we will just add those
 features to functions too.


We can use a SETOF cursors for returning a multiple result sets now.
But there are a few complications:

a) The client should to wait for finish of all sets from multiple
result sets - minimally in PL/pgSQL
b) client 'psql' doesn't support a unpacking result when result is
multiple result set
c) The using cursors isn't too comfortable - in comparation to MS SQL or MySQL

Regards

Pavel




 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers