Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Heikki Linnakangas
Dimitri Fontaine wrote:
 On 15 Jul 09 at 23:03, Heikki Linnakangas wrote:
 2. The primary should have no business reading back from the archive.
 The standby can read from the archive, as it can today.
 
 Sorry to insist, but I'm not sold on your consensus here, yet:
   http://archives.postgresql.org/pgsql-hackers/2009-07/msg00486.php
 
 There's a true need for the solution to be simple to install, and
 providing a side channel for the standby to go read the archives itself
 isn't it.

I think a better way to address that need is to provide a built-in
mechanism for the standby to request a base backup and have it sent over
the wire. That makes the initial setup very easy.

 Furthermore, the counter-argument against having the primary
 able to send data from the archives to some standby is that it should
 still work when primary's dead, but as this is only done in the setup
 phase, I don't see that being able to continue preparing a not-yet-ready
 standby against a dead primary is buying us anything.

The situation arises also when the standby falls badly behind. A simple
solution to that is to add a switch in the master to specify "always
keep X MB of WAL in pg_xlog". The standby will then still find it in
pg_xlog, making it harder for a standby to fall so much behind that it
can't find the WAL it needs in the primary anymore. Tom suggested that
we can just give up and re-sync with a new base backup, but that really
requires built-in base backup capability, and is only practical for
small databases.
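
For illustration, a minimal C sketch of what such a switch could look
like; the GUC name (wal_keep_mb) and the helper are hypothetical, not
from any patch, and XLogSegSize is the 16MB segment size from
xlog_internal.h:

    /* Would-be GUC: how many MB of finished WAL to retain in pg_xlog. */
    static int  wal_keep_mb = 0;

    /* Sketch: may a finished segment be recycled by a checkpoint? */
    static bool
    SegmentIsRecyclable(uint32 segno, uint32 current_segno)
    {
        uint32  keep_segs;

        keep_segs = (uint32) (((uint64) wal_keep_mb * 1024 * 1024) / XLogSegSize);

        /* keep the most recent keep_segs finished segments around */
        return segno + keep_segs < current_segno;
    }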

I think we should definitely have both those features, but it's not
urgent. The replication works without them, although it requires that
you set up traditional archiving as well.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread Joshua Tolley
On Thu, Jul 16, 2009 at 01:49:14PM +0900, KaiGai Kohei wrote:
 Joshua,
 
 I found your name as a reviewer at the commitfest.postgresql.org.
 
 However, I don't think the initial proposal of the largeobject
 security is currently in a state to be reviewed seriously.
 
 My preference is that it should be reworked based on the TOASTing
 approach discussed previously in this thread, if we can
 find a reasonable solution for the issue I raised before.

For whatever it's worth, I consider my capability limited to making sure
the patch applies and coming up with a few interesting ways of testing it
out, not to seriously studying the code or knowing when there might be a
competing alternative implementation. I don't yet understand the issues
that have been raised, but I will stop reviewing the patch if you feel
it's not ready.
Thanks for letting me know; I hope a solution to the problems you've brought
up is forthcoming.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] Mostly Harmless: c++reserved - patch 1 of 4

2009-07-16 Thread Peter Eisentraut
On Friday 05 December 2008 11:13:37 Kurt Harriman wrote:
 1. c++reserved

I have applied (an extended version of) this patch.



Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Dimitri Fontaine
Hi,

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I think a better way to address that need is to provide a built-in
 mechanism for the standby to request a base backup and have it sent over
 the wire. That makes the initial setup very easy.

Great idea :) 

So I'll reproduce the sketch I did in that other mail, adding the 'base'
state where the prerequisite base backup is handled; that should help
clarify the next points (a small C enum capturing these states follows
the list):

 0. base: slave asks the master for a base backup; at the end of this it
reaches the base-lsn

 1. init: slave asks the master for the current LSN and starts streaming WAL

 2. setup: slave asks the master for the missing WALs from its base-lsn to
the LSN it just got, and applies them all to reach the initial LSN (this
happens in parallel to 1.)

 3. catchup: slave has replayed the missing WALs and is now replaying the
stream it received in parallel, which applies from the init LSN
(just reached)

 4. sync: slave is applying the stream as it gets it, either as part of
the master transaction or not depending on the GUC settings
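
A minimal C sketch of these states (names invented here purely to pin
the proposal down; this is not code from any patch):

    /* Hypothetical standby states, mirroring the sketch above. */
    typedef enum StandbyState
    {
        STANDBY_BASE,    /* 0. receiving the base backup, up to base-lsn */
        STANDBY_INIT,    /* 1. got the current LSN, streaming has started */
        STANDBY_SETUP,   /* 2. fetching WAL from base-lsn to the init LSN */
        STANDBY_CATCHUP, /* 3. replaying the stream received in parallel */
        STANDBY_SYNC     /* 4. applying the stream as it arrives */
    } StandbyState;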

 The situation arises also when the standby falls badly behind. A simple
 solution to that is to add a switch in the master to specify "always
 keep X MB of WAL in pg_xlog". The standby will then still find it in
 pg_xlog, making it harder for a standby to fall so much behind that it
 can't find the WAL it needs in the primary anymore. Tom suggested that
 we can just give up and re-sync with a new base backup, but that really
 requires built-in base backup capability, and is only practical for
 small databases.

I think that when the standby is back in business after a connection
glitch (or any other transient error), its current internal state is
still 'sync' and walreceiver asks for the next LSN (RedoPTR?). Now, two cases
are possible:

 a. primary still has it handy, so the standby is still in sync but
lagging behind (and primary knows how much)

 b. primary is not able to provide the requested WAL entry, so the slave
is back to 'setup' state, with base-lsn the point reached just
before losing sync (the one walreceiver just asked for).

Now, a standby in 'setup' state isn't ready (yet), and for example
synchronous replication won't be possible in this state: we can't ask
the primary to refuse to COMMIT any transaction (holding it, e.g.) while a
standby hasn't reached 'sync' state.

The way you're talking about the issue makes me think there's a mix-up
between how to handle a lagging standby and an out-of-sync standby. For
clarity, I think we should have very distinct states and responses. And
yes, as Tom and you keep saying, a synced standby by definition should
not need any access to its primary's archives. So if it does, it's no
longer in sync.

 I think we should definitely have both those features, but it's not
 urgent. The replication works without them, although it requires that
 you set up traditional archiving as well.

Agreed, it's not essential for the feature as far as hackers are
concerned.

Regards,
-- 
dim



[HACKERS] Review: support for multiplexing SIGUSR1

2009-07-16 Thread Jaime Casanova
Hi,

I'm reviewing this patch:
http://archives.postgresql.org/message-id/3f0b79eb0907022341m1d36a841x19c3e2a5a6906...@mail.gmail.com

This one applies almost cleanly, except for a minor hunk in elog.c and
postinit.c. It compiles and passes the regression tests (I tried both
steps on Debian Lenny on an AMD Turion X2 64-bit box and on Windows XP
SP2).

About the patch itself: Tom objected to a previous patch for this here:
http://archives.postgresql.org/message-id/14969.1228835...@sss.pgh.pa.us
This new patch doesn't use the PGPROC struct anymore; instead it uses a
ProcSignalSlot struct defined as:

typedef struct
{
    pid_t        pss_pid;                           /* PID of the target backend */
    sig_atomic_t pss_signalFlags[NUM_PROCSIGNALS];  /* pending signal reasons */
} ProcSignalSlot;

which, AFAIU, seems to be in sync with Tom's advice here:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00556.php

Something that makes me nervous is this:
/*
 * Note: Since there's no locking, it's possible that the target
 * process detaches from shared memory and exits right after this
 * test, before we set the flag and send signal. And the signal slot
 * might even be recycled by a new process, so it's remotely possible
 * that we set a flag for a wrong process. That's OK, all the signals
 * are such that no harm is done if they're mistakenly fired.
 */

can we signal a wrong process and still be fine?
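
For context, the unlocked path the comment describes presumably looks
roughly like this (a sketch only; the slot-lookup helper is assumed for
illustration and is not the patch's actual code):

    /* Sketch of the flow the comment above describes: flag, then signal.
     * Needs <signal.h> and <sys/types.h>; ProcSignalSlot is the struct
     * quoted earlier, FindProcSignalSlot() is an assumed helper. */
    static int
    SendProcSignalSketch(pid_t pid, int reason)
    {
        volatile ProcSignalSlot *slot = FindProcSignalSlot(pid); /* assumed */

        if (slot != NULL && slot->pss_pid == pid)
        {
            slot->pss_signalFlags[reason] = 1;  /* sig_atomic_t store */
            return kill(pid, SIGUSR1);          /* may hit a recycled slot */
        }
        return -1;
    }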

Besides, it seems like SendProcSignal is still attached to SIGUSR1 only;
is this fine?

The rest of the patch (or better ways of testing it) is beyond my
knowledge... I think another reviewer should take a look at it,
especially Tom, because he rejected the other one...

-- 
Regards,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157



Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Fujii Masao
Hi,

On Thu, Jul 16, 2009 at 6:03 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I don't think there's much point assigning more reviewers to Synch Rep
 at this point. I believe we have consensus on four major changes:

Thanks for clarifying the issues! Okay, I'll rework the patch.

 1. Change the way synchronization is done when standby connects to
 primary. After authentication, standby should send a message to primary,
 stating the "begin" point (where "begin" is an XLogRecPtr, not a WAL
 segment name). Primary starts streaming WAL starting from that point,
 and keeps streaming forever. pg_read_xlogfile() needs to be removed.

I assume that "begin" should indicate the location of the last valid
record. In other words, at first the standby tries to recover by using
only the XLOG files which exist in its archive or pg_xlog. When it has
reached the last valid record, it requests the XLOG records which follow
"begin" from the primary. Is my understanding OK?

http://archives.postgresql.org/pgsql-hackers/2009-07/msg00475.php
As I described before, the XLOG file which the standby creates should be
recoverable. So, when "begin" indicates the middle of an XLOG file, the
primary should start sending the records from the head of the file that
includes "begin". Is this OK?
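
Computing that starting point is just a matter of rounding down; a
sketch, assuming the 8.4-era two-part XLogRecPtr (xlogid/xrecoff) and
the XLogSegSize constant:

    /* Round an XLogRecPtr down to the start of its 16MB segment, so
     * streaming can begin at the head of the containing XLOG file. */
    static XLogRecPtr
    StartOfContainingSegment(XLogRecPtr begin)
    {
        XLogRecPtr  start = begin;

        start.xrecoff -= start.xrecoff % XLogSegSize;
        return start;
    }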

Or should the primary start from "begin"? In this case, since we can
expect that the incomplete file including "begin" would also exist on
the standby, the records following "begin" need to be appended to it.
And, if that incomplete file is the one restored from the archive, it
would need to be renamed from a temporary name before being appended.

A timeline/backup history file is also required for recovery, but it's
not found on the standby. So those files need to be shipped from the
primary, and this capability is currently provided by pg_read_xlogfile().
If we remove the function, how should we transfer those history files?
Is a function similar to pg_read_xlogfile(), to which the filename must
be specified, still necessary?

 2. The primary should have no business reading back from the archive.
 The standby can read from the archive, as it can today.

In this case, a backup history file should be stored in pg_xlog for a
while, because it might be requested by the standby. Until now,
pg_start_backup() has removed the previous backup history file right
away. Should we introduce a new GUC parameter to determine how many
backup history files should be kept in pg_xlog?

CHECKPOINT should not recycle the XLOG files following the file which is
being requested by the standby at that moment. So, we need to tweak the
recycling policy.

 3. Need to support multiple WALSenders. While multiple slave support
 isn't 1st priority right now, it's not acceptable that a new WALSender
 can't connect while one is active already. That can cause trouble in
 case of network problems etc.

Sorry, I didn't get your point. You think multiple slave support isn't
1st priority, and yet why should a multiple-walsender mechanism be
necessary? Can you describe the problem cases in more detail?

 4. It is not acceptable that normal backends have to wait for walsender
 to send data.

Umm... this is true in the asynchronous replication case, and also true
while the standby is catching up with the primary. After those servers
get into synchronization, the backend should wait for walsender to send
the data (and also for walreceiver to write/fsync it) before returning
success of the COMMIT to the client. Is my understanding right?

In the current Synch Rep patch, the backend basically doesn't wait for
walsender in asynchronous mode. Only when wal_buffers is filled with
unsent data does the backend wait for walsender to send data, because
there is no room to insert new data. Are you suggesting only that this
problem case should be solved?

 That means that connecting a standby behind a slow
 connection to the primary can grind the primary to a halt.

This is the fate of *synchronous* replication, isn't it? If a user wants
to get around such a problem, asynchronous mode should be chosen, I think.

 walsender
 needs to be able to read data from disk, not just from shared memory. (I
 raised this back in December
 http://archives.postgresql.org/message-id/495106fa.1050...@enterprisedb.com)

OK, I'll try it.

 As a hint, I think you'll find it a lot easier if you implement only
 asynchronous replication at first. That reduces the amount of
 inter-process communication a lot. You can then add synchronous
 capability in a later commitfest. I would also suggest that for point 4,
 you implement WAL sender so that it *only* reads from disk at first, and
 only add the capability send from wal_buffers later on, and only if
 performance testing shows that it's needed.

Sounds good. I'll advance development in stages as you suggested.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?

2009-07-16 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:

 Bruce Momjian br...@momjian.us writes:
 A larger question is what do we do with pg_migrator now.  I am
 embarrassed I didn't find these errors before, but now that they are
 known, and will probably need an 8.4.1 to fix, should I remove the
 pg_migrator 8.4 source code from pgfoundry?

 Well, we'd still like people to test it, but at the very least it
 should be marked as having known bugs.

What about having pg_migrator bail out if the destination cluster
version is < 80401?

Regards,
-- 
dim



[HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
Hi Robert, Hi all,

The patch applies cleanly and works as intended - no surprise here. After
the changes the documentation is at least as easy to understand as
before, and the code changes look sensible. Not surprisingly, that's not
the area where I expected problems, I guess ;-)

For performance testing I replayed query logs from sites I could easily
get my hands on (3 different, halfway interesting ones). I found no
relevant differences on the first site, which makes sense because
{from,join}_collapse_limit wasn't reached anyway.
More interesting are the queries from the two sites having reporting queries:

On the first, simpler schema I found on average a 30% plan time increase
and a 40% execution time decrease. Most of the queries stayed the same;
only a few changed radically (in both directions). No big differences
between geqo=on/off.

The queries on the second reporting schema unfortunately are different.
It's the one where I copied the crazy example I attached in the original
thread. With geqo=off, a good part of the queries used daily need too
much memory to plan sensibly, and geqo=on outright fails on some with:
Error: Failed to make a valid plan
I stopped trying to make performance measurements there.
Notably, even some plans which were plannable in reasonable time before
are now problematic with enable_join_ordering=false!

I agree that those queries are crazy, but I am not sure how many of those are 
out there...

So, while I think the changes are in principle a good idea, as
{from,join}_collapse_limit are somewhat confusing options, I personally
(!) do not think geqo is ready for it today, especially as the benefit
is relatively small.

If I am the only one with access to such complicated queries, that's
fine - I am working on that site's query generation/schema anyway.

Could perhaps some other people with complicated queries check how they
work out with these changes? It should be enough to test with a very big
{join,from}_collapse_limit. Kevin?

I also have to admit that I somewhat like the current behaviour in
theory. Currently you can have a view with a hand-optimized JOIN order
which will not get inlined and/or reordered; use it together with
something unoptimized, and the unoptimized part will still be reordered
in many cases...

I found it somewhat hard to review a patch where my opinion was biased
from the beginning. As Tom listed himself as a reviewer, I will happily
(err?) concede to his and your judgement.


Andres



Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Thu, Jul 16, 2009 at 6:03 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 1. Change the way synchronization is done when standby connects to
 primary. After authentication, standby should send a message to primary,
 stating the "begin" point (where "begin" is an XLogRecPtr, not a WAL
 segment name). Primary starts streaming WAL starting from that point,
 and keeps streaming forever. pg_read_xlogfile() needs to be removed.
 
 I assume that "begin" should indicate the location of the last valid
 record. In other words, at first the standby tries to recover by using
 only the XLOG files which exist in its archive or pg_xlog. When it has
 reached the last valid record, it requests the XLOG records which
 follow "begin" from the primary. Is my understanding OK?

Yes.

 http://archives.postgresql.org/pgsql-hackers/2009-07/msg00475.php
 As I described before, the XLOG file which the standby creates should be
 recoverable. So, when "begin" indicates the middle of an XLOG file, the
 primary should start sending the records from the head of the file that
 includes "begin". Is this OK?
 
 Or should the primary start from "begin"? In this case, since we can
 expect that the incomplete file including "begin" would also exist on
 the standby, the records following "begin" need to be appended to it.

I would expect the standby to append to the partial XLOG file.

 And, if that incomplete file is the one restored from the archive, it
 would need to be renamed from a temporary name before being appended.

The archive should not normally contain partial XLOG files - only if you
manually copy one there after the primary has crashed. So I don't think
that's something we need to support.

 A timeline/backup history file is also required for recovery, but it's
 not found on the standby. So those files need to be shipped from the
 primary, and this capability is currently provided by
 pg_read_xlogfile(). If we remove the function, how should we transfer
 those history files? Is a function similar to pg_read_xlogfile(), to
 which the filename must be specified, still necessary?

Hmm. You only need the timeline history file if the base backup was
taken on an earlier timeline. That situation would only arise if you
(manually) take a base backup, restore it to a server (which creates a
new timeline), and then create a slave against that server. At least in
the 1st phase, I think we can assume that the standby has access to the
same archive and will find the history file there. If not, throw an
error. We can add more bells and whistles later.

 CHECKPOINT should not recycle the XLOG files following the file which
 is being requested by the standby at that moment. So, we need to tweak
 the recycling policy.

Yep.

 3. Need to support multiple WALSenders. While multiple slave support
 isn't 1st priority right now, it's not acceptable that a new WALSender
 can't connect while one is active already. That can cause trouble in
 case of network problems etc.
 
 Sorry, I didn't get your point. You think multiple slave support isn't
 1st priority, and yet why should a multiple-walsender mechanism be
 necessary? Can you describe the problem cases in more detail?

As the patch stands, new walsender connections are refused when one is
already active. What if the existing walsender connection is in a zombie
state? For example, it's trying to send WAL to the slave, but the
network connection is down and the packets are going to a black hole. It
will take a while for the TCP layer to declare the connection dead and
close the socket. During that time, you can't connect a new slave to the
master, or reconnect the same slave over a better network connection.

The most robust way to fix that is to support multiple walsenders. The
zombie walsender can take its time to die while the new walsender serves
the new connection. You could tweak SO_TIMEOUTs and such, but even then
the standby process could be in some weird hung state.

And of course, when we get around to adding support for multiple slaves,
we'll have to do that anyway. Better get it right to begin with.
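
A shared-memory array of sender slots is the obvious shape for this; a
hypothetical sketch (all names assumed here, not from the patch):

    #define MAX_WAL_SENDERS 8           /* would-be compile-time or GUC limit */

    typedef struct WalSndSlot
    {
        pid_t       pid;                /* walsender PID, 0 if the slot is free */
        XLogRecPtr  sentPtr;            /* how far this standby has been sent */
    } WalSndSlot;

    typedef struct WalSndCtlData
    {
        WalSndSlot  slots[MAX_WAL_SENDERS];
    } WalSndCtlData;

A new connection claims any free slot, so a zombie can keep its own slot
until the TCP layer finally kills it, without blocking anyone else.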

 4. It is not acceptable that normal backends have to wait for walsender
 to send data.
 
 Umm... this is true in the asynchronous replication case, and also true
 while the standby is catching up with the primary. After those servers
 get into synchronization, the backend should wait for walsender to send
 the data (and also for walreceiver to write/fsync it) before returning
 success of the COMMIT to the client. Is my understanding right?

Even in synchronous replication, a backend should only have to wait when
it commits. You would only see the difference with very large
transactions that write more WAL than fits in wal_buffers, such as data
loading.
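
In other words, the synchronous wait would live only in the commit path;
roughly (a sketch with hypothetical names, not actual patch code):

    /* Sketch: wait for replication only at commit, never per WAL insert. */
    void
    RecordTransactionCommitSketch(void)
    {
        XLogRecPtr  commitLSN;

        commitLSN = XLogInsertCommitRecord();   /* assumed helper */
        XLogFlush(commitLSN);                   /* local durability first */

        if (synchronous_replication)            /* assumed GUC */
            SyncRepWaitForLSN(commitLSN);       /* block until sent/fsynced */
    }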

 In the current Synch Rep patch, the backend basically doesn't wait for
 walsender in asynchronous mode. Only when wal_buffers is filled with
 unsent data does the backend wait for walsender to send data, because
 there is no room to insert new data. Are you suggesting only that this
 problem case should be solved?

[HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz

Hi folks,

Today I got bitten a bit trying to write a C function for PostgreSQL
that also includes some of my company's internal stuff. Needless to say,
our stuff defines BOOL, as does PostgreSQL's c.h include file.

Now, for us, we will probably change it, but is there any reason for
PostgreSQL nowadays not to use stdbool.h, apart from the fact that no
one has made the effort? Having said that, wouldn't it be as easy as
just replacing all TRUE/FALSE/BOOL with the lowercase ones, and
including stdbool.h from c.h?
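
The clash is easy to reproduce; a minimal example of the kind of
conflict involved (the in-house header is invented for illustration):

    /* company.h (invented): a typical in-house boolean */
    typedef int bool;           /* 4-byte boolean */
    #define TRUE  1
    #define FALSE 0

    /* somefunc.c */
    #include "company.h"
    #include "postgres.h"       /* c.h does "typedef char bool;" unless bool
                                 * is already a macro, so the two typedefs
                                 * collide and compilation fails here */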





Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?

2009-07-16 Thread Peter Eisentraut
On Thursday 16 July 2009 07:09:22 Bruce Momjian wrote:
 Uh, how is this going to behave in 8.5?  Do we still dump sequences, and
 if so, aren't we heading down the road of dumping stuff only because a
 previous release needed it?

Which leads me to a related question:  Do you plan to maintain one version of 
pg_migrator that can upgrade any version to any other version (within reason), 
or will there be separate binaries, say pg_migrator-8.4 and pg_migrator-8.5, 
that each can only upgrade from $selfversion-1 to $selfversion?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow count in window query

2009-07-16 Thread Greg Stark
On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 postgres=# select avg(a) from (select a, row_number() over (order by
 a) as r, count(*) over () as rc from x ) p where r in
 ((rc+1)/2,(rc+2)/2) ;

How does this compare to the plain non-windowing SQL implementation:

select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

(except that it only works if count(*) is odd).

Interestingly, finding the median is actually O(n) using quickselect.
Maybe we should provide a C implementation of quickselect as a window
function. I'm not sure how to wedge in the concept that the sort is
unnecessary even though the ORDER BY is specified, though.
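
For reference, quickselect itself is short; a self-contained sketch over
an in-memory array, after Wirth's classic formulation (average O(n),
rearranges the array as it goes):

    /* Return the k-th smallest element (0-based) of v[0..n-1]. */
    static double
    quickselect(double *v, int n, int k)
    {
        int     lo = 0, hi = n - 1;

        while (lo < hi)
        {
            double  pivot = v[k];
            int     i = lo, j = hi;

            do
            {
                while (v[i] < pivot) i++;
                while (pivot < v[j]) j--;
                if (i <= j)
                {
                    double  t = v[i]; v[i] = v[j]; v[j] = t;
                    i++;
                    j--;
                }
            } while (i <= j);

            if (j < k) lo = i;      /* answer lies to the right of j */
            if (k < i) hi = j;      /* answer lies to the left of i */
        }
        return v[k];
    }

The median of n values is then quickselect(v, n, n / 2), averaged with
the (n/2 - 1)-th element when n is even.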

I'm also not sure how to handle this if the set has to be spooled to
disk. Quicksort and quickselect do a lot of scans through the data and
wouldn't perform well on disk.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] slow count in window query

2009-07-16 Thread Pavel Stehule
 I'm also not sure how to handle this if the set has to be spooled to
 disk. Quicksort and quickselect do a lot of scans through the data and
 wouldn't perform well on disk.

I think the problem is in an aggregate function used as a window
function - or some missing optimization.

When I replaced count(*) over () with a subselect (SELECT count(*) FROM
...), I got the expected speed.

Pavel






Re: [HACKERS] Why does pg_standby require libpq.so.5?

2009-07-16 Thread Heikki Linnakangas
Committed.

Fujii Masao wrote:
 Hi,
 
 On Thu, Jun 25, 2009 at 10:09 PM, Magnus Hagander mag...@hagander.net wrote:
 The second question is, is it worth doing this so extremely late in the
 8.4 development? After mentioning it quickly in an offlist discussion
 with Heikki, I think our conclusion was that we should wait with this
 until the tree opens for 8.5.
 
 That's OK. I'll wait for 8.5.
 
 It's not a very likely scenario that
 anybody actually has pg_standby on a machine that doesn't have libpq on
 it - since it needs to have a PostgreSQL server on it to make any sense
 
 My first concern was whether the latest pg_standby can work with old postgres
 (old libpq.so).
 
 Regards,
 


-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH] DefaultACLs

2009-07-16 Thread Nikhil Sontakke
Hi Petr,

 this is first public version of our DefaultACLs patch as described on
 http://wiki.postgresql.org/wiki/DefaultACL .

I have been assigned by Robert to do an initial review of your GRANT
ON ALL patch mentioned here
(http://archives.postgresql.org/pgsql-hackers/2009-07/msg00207.php)

Does this new DefaultACL patch supersede the earlier one? Or is it
different, and should the earlier one be looked at first, since it was
added to the commitfest before the DefaultACL patch? It is a bit
confusing. Please clarify.

Regards,
Nikhils

 It allows GRANT/REVOKE permissions to be inherited by objects based on
 schema permissions at create time by use of the ALTER SCHEMA foo SET
 DEFAULT PRIVILEGES ON TABLE SELECT TO bar syntax. There is also ADD and
 DROP for appending and removing those default privileges. It works for
 tables, views, sequences and functions. More info about the syntax and
 some previous discussion is on the wiki.

 There is also GRANT DEFAULT PRIVILEGES ON tablename, which *replaces*
 current object privileges with the default ones. Only the owner can run
 both of those commands (ALTER SCHEMA can be done only by the schema
 owner, and GRANT can be done only by the object owner).

 It adds a new catalog table which stores the default permissions for a
 given schema and object type. We didn't add a syscache entry for that,
 as Stephen Frost didn't feel we should do that (yet). Three functions
 were also exported from aclchk.c, because most of the ALTER SCHEMA
 stuff is done in schemacmds.c.

 The current version is fully working and includes some regression tests.
 There is, however, no documentation at this moment.
 The patch is against current Git HEAD (it is a context diff).

 --
 Regards
 Petr Jelinek (PJMODOS)








-- 
http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?

2009-07-16 Thread Bruce Momjian
Dimitri Fontaine wrote:
 Tom Lane t...@sss.pgh.pa.us writes:
 
  Bruce Momjian br...@momjian.us writes:
  A larger question is what do we do with pg_migrator now.  I am
  embarrassed I didn't find these errors before, but now that they are
  known, and will probably need an 8.4.1 to fix, should I remove the
  pg_migrator 8.4 source code from pgfoundry?
 
  Well, we'd still like people to test it, but at the very least it
  should be marked as having known bugs.
 
 What about having pg_migrator bail out if the destination cluster
 version is < 80401?

We can do that, but because 8.4.1 is not released yet, we might as well
just remove the source code.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] boolean in C

2009-07-16 Thread Bernd Helmle



--On 16 July 2009 11:12:34 +0100 Grzegorz Jaskiewicz g...@pointblue.com.pl wrote:



Now, for us, we will probably change it, but is there any reason for
PostgreSQL nowadays not to use stdbool.h, apart from the fact that no
one has made the effort? Having said that, wouldn't it be as easy as
just replacing all TRUE/FALSE/BOOL with the lowercase ones, and
including stdbool.h from c.h?


If I remember correctly, older Sun compilers don't have it. I don't know
whether that is still an issue, but we need to take care of non-gcc
compilers, I think.


--
Thanks

Bernd



Re: [HACKERS] [PATCH] DefaultACLs

2009-07-16 Thread Petr Jelinek

Nikhil Sontakke wrote:

Does this new DefaultACL patch supersede the earlier one? Or is it
different, and should the earlier one be looked at first, since it was
added to the commitfest before the DefaultACL patch? It is a bit
confusing. Please clarify.
  
No, DefaultACLs applies to objects created in the future, while GRANT ON
ALL affects existing objects.
DefaultACLs is the more important functionality, so it should probably
take precedence in the review process.

There is, however, one thing that needs some attention. Both patches add
a distinction between VIEW and TABLE objects for ACLs into the parser,
and they do it differently. GRANT ON ALL works by adding ACL_OBJECT_VIEW
and tracks that object type in code (that was my original method in both
patches), while DefaultACLs uses the method suggested by Stephen Frost,
which is creating a new enum with relation, view, function and sequence
members (those are the object types for which both DefaultACLs and GRANT
ON ALL are applicable). The second method has the advantage of minimal
changes to existing code.
It's pointless to use both methods, so one of the patches will have to
be adjusted. The problem is that most people seem to dislike the
addition of ACL_OBJECT_VIEW, but on the other hand I don't like the idea
of adding another object-type variable into the GrantStmt struct, which
would be needed if we adjusted GRANT ON ALL to Stephen Frost's method.


--
Regards
Petr Jelinek (PJMODOS)




Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Rick Gigger

On Jul 16, 2009, at 12:07 AM, Heikki Linnakangas wrote:


Dimitri Fontaine wrote:
 On 15 Jul 09 at 23:03, Heikki Linnakangas wrote:
 Furthermore, the counter-argument against having the primary
 able to send data from the archives to some standby is that it should
 still work when primary's dead, but as this is only done in the setup
 phase, I don't see that being able to continue preparing a
 not-yet-ready standby against a dead primary is buying us anything.

The situation arises also when the standby falls badly behind. A simple
solution to that is to add a switch in the master to specify "always
keep X MB of WAL in pg_xlog". The standby will then still find it in
pg_xlog, making it harder for a standby to fall so much behind that it
can't find the WAL it needs in the primary anymore. Tom suggested that
we can just give up and re-sync with a new base backup, but that really
requires built-in base backup capability, and is only practical for
small databases.


If you use an rsync-like algorithm for doing the base backups, wouldn't
that increase the size of the database for which it would still be
practical to just re-sync? Couldn't you in fact sync a very large
database if the amount of actual change in the files was a small
percentage of the total size?
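
The intuition is block-level delta transfer; a toy sketch of the
comparison step (the block size, checksum helper and shipping helper
are all assumptions for illustration):

    #include <stdint.h>

    extern uint32_t checksum_block(const char *path, int blkno);  /* assumed */
    extern void     ship_block(const char *path, int blkno);      /* assumed */

    /* Ship only the blocks whose contents differ from the standby's copy. */
    static void
    sync_file(const char *path, const uint32_t *standby_checksums, int nblocks)
    {
        int     blkno;

        for (blkno = 0; blkno < nblocks; blkno++)
        {
            if (checksum_block(path, blkno) != standby_checksums[blkno])
                ship_block(path, blkno);
        }
    }

The transfer cost then scales with the changed fraction rather than the
total database size, which is exactly the property the question is after.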



Re: [HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz


On 16 Jul 2009, at 12:52, Bernd Helmle wrote:

 --On 16 July 2009 11:12:34 +0100 Grzegorz Jaskiewicz
 g...@pointblue.com.pl wrote:

 Now, for us, we will probably change it, but is there any reason for
 PostgreSQL nowadays not to use stdbool.h, apart from the fact that no
 one has made the effort? Having said that, wouldn't it be as easy as
 just replacing all TRUE/FALSE/BOOL with the lowercase ones, and
 including stdbool.h from c.h?

 If I remember correctly, older Sun compilers don't have it. I don't
 know whether that is still an issue, but we need to take care of
 non-gcc compilers, I think.

For those guys, we could just keep c.h as it is now...




Re: [HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz

Oh, another thing: stdbool.h is a C99 standard feature, not a gcc
extension. Just in case someone thinks otherwise.




Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?

2009-07-16 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Which leads me to a related question: Do you plan to maintain one
 version of pg_migrator that can upgrade any version to any other
 version (within reason), or will there be separate binaries, say
 pg_migrator-8.4 and pg_migrator-8.5, that each can only upgrade from
 $selfversion-1 to $selfversion?

I think we should plan on the latter, at least until we have a few
versions under our belts and can see what we might be letting ourselves
in for.  My guess is that n-1 to n will be plenty challenging enough.

regards, tom lane



Re: [HACKERS] slow count in window query

2009-07-16 Thread Hitoshi Harada
2009/7/16 Greg Stark gsst...@mit.edu:
 On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 postgres=# select avg(a) from (select a, row_number() over (order by
 a) as r, count(*) over () as rc from x ) p where r in
 ((rc+1)/2,(rc+2)/2) ;

 How does this compare to the plain non-windowing SQL implementation:

 select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

 (except that it only works if count(*) is odd).

 Interestingly, finding the median is actually O(n) using quickselect.
 Maybe we should provide a C implementation of quickselect as a window
 function. I'm not sure how to wedge in the concept that the sort is
 unnecessary even though the ORDER BY is specified, though.

median() should be an aggregate, not a window function, shouldn't it?


 I'm also not sure how to handle this if the set has to be spooled to
 disk. Quicksort and quickselect do a lot of scans through the data and
 wouldn't perform well on disk.

The WindowAgg node spools rows into a tuplestore, which holds the data
in memory as long as it fits. Do you have any idea how it could store
millions of rows without the tuplestore?

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 The queries on the second reporting schema unfortunately are different.
 It's the one where I copied the crazy example I attached in the
 original thread. With geqo=off, a good part of the queries used daily
 need too much memory to plan sensibly, and geqo=on outright fails on
 some with:
 Error: Failed to make a valid plan

We're not going to be able to fix this unless you show us examples.

 Notably, even some plans which were plannable in reasonable time before
 are now problematic with enable_join_ordering=false!

And this even more so --- it doesn't make any sense at all.

 So, while I think the changes are in principle a good idea, as
 {from,join}_collapse_limit are somewhat confusing options, I personally
 (!) do not think geqo is ready for it today, especially as the benefit
 is relatively small.

In general I think that any such bugs are there anyway and need to be
fixed anyway.

regards, tom lane



Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  The queries on the second reporting schema unfortunately are
  different. It's the one where I copied the crazy example I attached in
  the original thread. With geqo=off, a good part of the queries used
  daily need too much memory to plan sensibly, and geqo=on outright
  fails on some with:
  Error: Failed to make a valid plan
 We're not going to be able to fix this unless you show us examples.
In the other thread I attached a schema similar to the real one, plus an
example query. Is that not enough? And why not?

  Notably, even some plans which were plannable in reasonable time
  before are now problematic with enable_join_ordering=false!
 And this even more so --- it doesn't make any sense at all.
Why? With a high from_collapse_limit more subqueries get inlined -
before inlining, they are planned separately.

  So, while I think the changes are principally a good idea, as
  {from,join}_collapse_limit are a bit confusing options, I personally! do
  not think geqo is ready for it today, especially as the benefit is
  relatively small.
 In general I think that any such bugs are there anyway and need to be
 fixed anyway.
Understandable.

Andres



Re: [HACKERS] boolean in C

2009-07-16 Thread Tom Lane
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
 oh, another thing.
 stdbool.h is a C99 standard feature.

We are still targeting C89, not C99.

Another reason not to depend on stdbool.h is that, so far as I can see,
the standard does not promise that type _Bool has size == 1 byte.
We have to have that because of on-disk compatibility requirements.
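
That requirement can at least be checked mechanically; a sketch of a
C89-compatible compile-time guard (the negative-array-size trick):

    /* Fails to compile unless bool is exactly one byte, as the
     * on-disk format requires. */
    typedef char bool_size_check[sizeof(bool) == 1 ? 1 : -1];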

regards, tom lane



Re: [HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz


On 16 Jul 2009, at 14:20, Tom Lane wrote:


Grzegorz Jaskiewicz g...@pointblue.com.pl writes:

oh, another thing.
stdbool.h is a C99 standard feature.


We are still targeting C89, not C99.

Another reason not to depend on stdbool.h is that, so far as I can see,
the standard does not promise that type _Bool has size == 1 byte.
We have to have that because of on-disk compatibility requirements.


I think the latter is easily fixable, or forceable to be one byte.
Why C89 and not C99? Virtually all compilers for the last 4 years have
had C99 support.






Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 15:18:01 Andres Freund wrote:
 On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   The queries on the second reporting schema unfortunately are
   different. It's the one where I copied the crazy example I attached
   in the original thread. With geqo=off, a good part of the queries
   used daily need too much memory to plan sensibly, and geqo=on
   outright fails on some with:
   Error: Failed to make a valid plan
 
  We're not going to be able to fix this unless you show us examples.

 In the other thread I attached a schema similar to the real one, plus
 an example query. Is that not enough? And why not?
For reference:
http://archives.postgresql.org/message-id/200907091700.43411.and...@anarazel.de

Andres



Re: [HACKERS] boolean in C

2009-07-16 Thread Tom Lane
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
 Why C89 and not C99? Virtually all compilers for the last 4 years have
 had C99 support.

Not everybody is running a compiler released within the last 4 years.

The short answer is that C99 doesn't appear to offer enough advantage
over C89, *for our purposes*, to justify freezing out older systems.
stdbool.h is a perfect example of an addition that offers precisely
zero actual functional improvement.  All it would be for us is an
additional autoconf headache and portability hazard.

regards, tom lane



Re: [HACKERS] slow count in window query

2009-07-16 Thread Hitoshi Harada
2009/7/16 Pavel Stehule pavel.steh...@gmail.com:
 I'm also not sure how to handle this if the set has to be spooled to
 disk. Quicksort and quickselect do a lot of scans through the data and
 wouldn't perform well on disk.

 I think the problem is in an aggregate function used as a window
 function - or some missing optimization.

 When I replaced count(*) over () with a subselect (SELECT count(*)
 FROM ...), I got the expected speed.


WindowAgg always spools its input into the tuplestore, though (in your
case) it throws rows away one by one, so compared with a pure aggregate
it has overhead. I think this is a reasonable approach for the
large-data situation and for different types of windows. But yes, we
must improve the current model.

1) There should be some kind of lightweight approach for such
small-data/simple-window situations.

2) tuplestore_puttupleslot() seems heavy to me (copy, check, etc.), even
if the data fits in memory after trimming rows. We want more flexible
temporary storage on the fly.


Regards,

-- 
Hitoshi Harada



Re: [HACKERS] [PATCH] Psql List Languages

2009-07-16 Thread Peter Eisentraut
On Thursday 16 July 2009 00:38:31 Fernando Ike de Oliveira wrote:
   I applied Tom Lane's and Peter's suggestions, but I had to remove one
 column (Owner) from the output of the \dL command for compatibility
 with the 7.4 version.

The mandate is to work as well as possible with older versions, not to
provide only the feature set that works the same across all old
versions. The correct behavior is to show the owner column when the
server version supports it.



Re: [HACKERS] boolean in C

2009-07-16 Thread Peter Eisentraut
On Thursday 16 July 2009 16:23:31 Grzegorz Jaskiewicz wrote:
 On 16 Jul 2009, at 14:20, Tom Lane wrote:
  Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
  oh, another thing.
  stdbool.h is a C99 standard feature.
 
  We are still targeting C89, not C99.
 
  Another reason not to depend on stdbool.h is that, so far as I can see,
  the standard does not promise that type _Bool has size == 1 byte.
  We have to have that because of on-disk compatibility requirements.

 I think the latter is easily fixable, or forceable to be one byte.

How do you plan to do that?

 Why C89 and not C99? Virtually all compilers for the last 4 years have
 had C99 support.

Well, I think we want to run on systems that are older than 4 years, too.



Re: [HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz


On 16 Jul 2009, at 14:53, Peter Eisentraut wrote:

 On Thursday 16 July 2009 16:23:31 Grzegorz Jaskiewicz wrote:
 On 16 Jul 2009, at 14:20, Tom Lane wrote:
 Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
 oh, another thing.
 stdbool.h is a C99 standard feature.

 We are still targeting C89, not C99.

 Another reason not to depend on stdbool.h is that, so far as I can see,
 the standard does not promise that type _Bool has size == 1 byte.
 We have to have that because of on-disk compatibility requirements.

 I think the latter is easily fixable, or forceable to be one byte.

 How do you plan to do that?

By casting it to a 1-byte type such as char? I don't think anyone will
add a 3rd state to boolean in stdbool.h, at least not any time soon :)

And it is pretty annoying when your product also has its own BOOLean
defined...

 Why C89 and not C99? Virtually all compilers for the last 4 years have
 had C99 support.

 Well, I think we want to run on systems that are older than 4 years,
 too.

Sure, but that's probably less than 1% of all systems.
The 4 years was a guess; I think it's much more than that.




Re: [HACKERS] Mostly Harmless: c++bookends - patch 2 of 4

2009-07-16 Thread Robert Haas
On Mon, Jul 13, 2009 at 5:51 PM, Peter Eisentraut pete...@gmx.net wrote:
 So I think either decoration is added to all of these files or none of them.
 And I think the former is not going to go over well.

We do have some things that are conditioned on __cplusplus already,
such as c.h, pg_config.h.in, and postgres_ext.h.  So at some
point we at least thought about supporting inclusion of our header
files from C++.  But I agree that if we're going to do it at all, we
ought to do it everywhere.

...Robert



Re: [HACKERS] [PATCH] plpythonu datatype conversion improvements

2009-07-16 Thread Peter Eisentraut
On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:
 Patch for plpythonu

This patch doesn't apply; I think it got mangled during email transport.  
(Tabs changed to spaces, it looks like.)  Could you resend the patch as a 
separate attachment in a way that it doesn't get mangled?




[HACKERS] revised hstore patch

2009-07-16 Thread Andrew Gierth
Revision to previous hstore patch to fix (and add tests for) some edge
case bugs with nulls or empty arrays.

-- 
Andrew (irc:RhodiumToad)



hstore_20090716.patch.gz
Description: hstore patch



Re: [HACKERS] boolean in C

2009-07-16 Thread Tom Lane
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
 On 16 Jul 2009, at 14:53, Peter Eisentraut wrote:
 the standard does not promise that type _Bool has size == 1 byte.
 We have to have that because of on-disk compatibility requirements.

 I think the latter is easily fixable, or forceable to be one byte.

 How do you plan to do that?

 By casting it to a 1-byte type such as char?

That's hardly going to improve readability for anyone.  Also, it will
flat out not work for the catalog struct declarations.  When we say
"bool relhasindex;" the compiler had better think that that's a
one-byte field.

 And it is pretty annoying when your product also has its own BOOLean
 defined...

IOW you're not using stdbool either?

regards, tom lane



Re: [HACKERS] Mostly Harmless: c++bookends - patch 2 of 4

2009-07-16 Thread Peter Eisentraut
On Thursday 16 July 2009 17:00:03 Robert Haas wrote:
 On Mon, Jul 13, 2009 at 5:51 PM, Peter Eisentrautpete...@gmx.net wrote:
  So I think either decoration is added to all of these files or none of
  them. And I think the former is not going to go over well.

 We do have some things that are conditioned on __cplusplus already,
 such as c.h, pg_config.h.in, and postgres_ext.h.  So at some
 point we at least thought about supporting inclusion of our header
 files from C++.  But I agree that if we're going to do it at all, we
 ought to do it everywhere.

We do support using the frontend headers (libpq, ecpg) from C++.  That's what 
postgres_ext.h is about.  The code in pg_config.h.in is autogenerated by 
Autoconf.  The stuff in c.h is probably still there from before we rearranged 
the header files so that the frontend includes don't use c.h.



Re: [HACKERS] boolean in C

2009-07-16 Thread Grzegorz Jaskiewicz


On 16 Jul 2009, at 15:17, Tom Lane wrote:

 That's hardly going to improve readability for anyone.  Also, it will
 flat out not work for the catalog struct declarations.  When we say
 "bool relhasindex;" the compiler had better think that that's a
 one-byte field.

Sure, but I would certainly hope there are not too many places where you
actually convert it from the disk representation to the internal one and
vice versa.

 And it is pretty annoying when your product also has its own BOOLean
 defined...

 IOW you're not using stdbool either?

Well, saying that I don't is quite an overstatement. It was decided long
before I started working for that customer, and it is full of problems
like that. But still, it would be nice for PostgreSQL to at least not
cause problems like that. Having said that, I will probably fix it on
the customer's side, but I wanted to see if you guys would be happy with
a patch that changes that in PostgreSQL.

Thanks.




Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
 Error: Failed to make a valid plan

 We're not going to be able to fix this unless you show us examples.

 In the other thread I attached a schema similar to the real one, plus
 an example query. Is that not enough? And why not?

I tried the example query and couldn't get "Failed to make a valid plan"
out of it ... what settings do you need for that?

However, I do observe that this seems a sufficient counterexample
against the theory that we can just remove the collapse limits and let
GEQO save us on very complex queries.  On my machine, the example query
takes about 22 seconds to plan using CVS HEAD w/ all default settings.
If I set both collapse_limit variables to very high values (I used 999),
it takes ... um ... not sure; I gave up waiting after half an hour.
I also tried with geqo_effort reduced to the minimum of 1, but that
didn't produce a plan in reasonable time either (I gave up after ten
minutes).  So if we remove the collapse limits, Postgres will completely
fail on this query --- the only way out would be enable_join_ordering =
off, which is hardly likely to produce a decent plan.

Maybe we should leave the collapse_limit logic alone and address
Robert's gripes by just raising the default values a lot (I'm thinking
100 or so).  That way there's an escape hatch for anyone who has
pathological queries to deal with --- just dial the settings down.

 Notably, even some plans which were plannable in reasonable time before
 are now problematic with enable_join_ordering=false!

 And this even more so --- it doesn't make any sense at all.

 Why? With a high from_collapse_limit more subqueries get inlined -
 before inlining, they are planned separately.

Okay, I misunderstood which two cases you were comparing there.

regards, tom lane



[HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 4:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, I do observe that this seems a sufficient counterexample
 against the theory that we can just remove the collapse limits and let
 GEQO save us on very complex queries.  On my machine, the example query
 takes about 22 seconds to plan using CVS HEAD w/ all default settings.
 If I set both collapse_limit variables to very high values (I used 999),
 it takes ... um ... not sure; I gave up waiting after half an hour.

What's the point of GEQO if it doesn't guarantee to produce the optimal
plan and *also* doesn't guarantee to produce some plan - any plan -
within some reasonable amount of time? Either we need to fix that, or I
don't see what it's buying us over our regular planner, which also might
not produce a plan within a reasonable amount of time, but at least if
it does, it'll be the right plan.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] boolean in C

2009-07-16 Thread Zdenek Kotala

Grzegorz Jaskiewicz wrote on Thu 16. 07. 2009 at 14:59 +0100:

 
  Why C89 and not C99? Virtually all compilers for the last 4 years
  have had C99 support.
 
  Well, I think we want to run on systems that are older than 4 years,
  too.
 
 Sure, but that's probably less than 1% of all systems.
 The 4 years was a guess; I think it's much more than that.

For example, Solaris 8 is 9 years old and is still used in production. I
guess HP-UX is in the same situation, and so on. I am not able to say
how many PostgreSQL installations run on them, but as Tom mentioned,
there is no significant reason to break old platforms.

Zdenek




Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
I wrote:
 If I set both collapse_limit variables to very high values (I used 999),
 it takes ... um ... not sure; I gave up waiting after half an hour.
 I also tried with geqo_effort reduced to the minimum of 1, but that
 didn't produce a plan in reasonable time either (I gave up after ten
 minutes).

After I gave up letting the machine be idle to get a fair timing,
I turned on oprofile monitoring.  It looks a bit interesting:

samples  %image name   symbol name
886498   53.8090  postgres have_relevant_eclass_joinclause
460596   27.9574  postgres bms_overlap
142764    8.6655  postgres bms_is_subset
126274    7.6646  postgres have_join_order_restriction
14205 0.8622  postgres list_nth_cell
2721  0.1652  postgres generate_join_implied_equalities
2445  0.1484  libc-2.9.so  memset
2202  0.1337  postgres have_relevant_joinclause
1678  0.1019  postgres make_canonical_pathkey
1648  0.1000  postgres pfree
884   0.0537  postgres bms_union
762   0.0463  postgres gimme_tree
660   0.0401  libc-2.9.so  memcpy
571   0.0347  postgres AllocSetFree
475   0.0288  postgres AllocSetAlloc
431   0.0262  postgres has_relevant_eclass_joinclause
389   0.0236  postgres check_list_invariants
260   0.0158  postgres join_is_legal
238   0.0144  postgres bms_copy

So maybe a redesign of the equivalence-class joinclause mechanism is in
order.  Still, this is unlikely to fix the fundamental issue that the
time for large join problems grows nonlinearly.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Kenneth Marshall
On Thu, Jul 16, 2009 at 04:27:39PM +0100, Greg Stark wrote:
 On Thu, Jul 16, 2009 at 4:16 PM, Tom Lanet...@sss.pgh.pa.us wrote:
  However, I do observe that this seems a sufficient counterexample
  against the theory that we can just remove the collapse limits and let
  GEQO save us on very complex queries.  On my machine, the example query
  takes about 22 seconds to plan using CVS HEAD w/ all default settings.
  If I set both collapse_limit variables to very high values (I used 999),
  it takes ... um ... not sure; I gave up waiting after half an hour.
 
 What's the point of GEQO if it doesn't guarantee to produce the
 optimal plan and *also* doesn't guarantee to produce some plan, any
 plan, within some reasonable amount of time? Either we need to fix
 that or else I don't see what it's buying us over our regular planner
 which also might not produce a plan within a reasonable amount of time
 but at least if it does it'll be the right plan.
 

I do agree that we should have an actual time-limit cap for
GEQO that would have it return the best plan found so far when the cap
is reached. Then you can at least bound your planning time.
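
Something along these lines, as a rough sketch only -- the Pool/Chromosome
stand-ins and run_one_generation() are illustrative, not the actual geqo
code:

#include <stddef.h>
#include <time.h>

/* Minimal stand-ins for GEQO's pool structures (sketch only). */
typedef int Gene;
typedef struct { Gene *string; double worth; } Chromosome;
typedef struct { Chromosome *data; int size; } Pool;

/* Hypothetical hook: one generation of selection/crossover/mutation,
 * keeping pool->data sorted by ascending cost ("worth"). */
extern void run_one_generation(Pool *pool);

/* Sketch: bound the genetic search by wall-clock time, returning the
 * best tour found so far once the deadline expires. */
static Gene *
geqo_search_with_deadline(Pool *pool, int max_generations, int seconds)
{
    time_t  deadline = time(NULL) + seconds;
    int     gen;

    for (gen = 0; gen < max_generations; gen++)
    {
        run_one_generation(pool);
        if (time(NULL) >= deadline)
            break;              /* out of time: stop evolving */
    }
    return pool->data[0].string;    /* best plan found so far */
}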

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] boolean in C

2009-07-16 Thread Tom Lane
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:
 On 16 Jul 2009, at 15:17, Tom Lane wrote:
 That's hardly going to improve readability for anyone.  Also, it will
 flat out not work for the catalog struct declarations.  When we say
 bool relhasindex; the compiler had better think that that's a
 one-byte field.

 Sure, but I would certainly hope there are not too many places where
 you actually convert it from disk representation to internal and vice
 versa.

We don't convert --- it's expected to be the same representation.
As for not too many of them, I think grepping for references to bool
catalog fields will show you differently ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 17:16:31 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
  Error: Failed to make a valid plan
 
  We're not going to be able to fix this unless you show us examples.
 
  In the other thread I attached a similar to the real schema + example
  query. Not enough? And why?

 I tried the example query and couldn't get Failed to make a valid plan
 out of it ... what settings do you need for that?
It unfortunately depends on settings and luck. This dependence on luck was
the reason why I would have liked geqo to behave somewhat deterministically...

With {join,from}_collapse_limit = 100 it seems to be triggered reliably.
With lower values it seems harder to trigger; with bigger ones it simply
takes too long to even get there.

Efficiency-wise, with higher limits nearly all of geqo's time is spent in:

geqo
gimme_tree 
have_join_order_restriction
has_legal_joinclause
have_relevant_joinclause
have_relevant_eclass (30% self)
bms_overlap (50%self)

I do not yet fully understand geqo, but it looks like there are some
possibilities to improve this.
Such efficiency improvements would not explain the completely failing
plans, though...

Do you have an idea which kinds of plans benefit most from using geqo? I had a
somewhat hard time finding any query where geqo was substantially faster than
the standard join search.

That also somewhat explains why I saw improvements with 64-bit bitmapsets...


 However, I do observe that this seems a sufficient counterexample
 against the theory that we can just remove the collapse limits and let
 GEQO save us on very complex queries.  On my machine, the example query
 takes about 22 seconds to plan using CVS HEAD w/ all default settings.
 If I set both collapse_limit variables to very high values (I used 999),
 it takes ... um ... not sure; I gave up waiting after half an hour.
 I also tried with geqo_effort reduced to the minimum of 1, but that
 didn't produce a plan in reasonable time either (I gave up after ten
 minutes).  So if we remove the collapse limits, Postgres will completely
 fail on this query --- the only way out would be enable_join_ordering =
 off, which is hardly likely to produce a decent plan.
 Maybe we should leave the collapse_limit logic alone and address
 Robert's gripes by just raising the default values a lot (I'm thinking
 100 or so).  That way there's an escape hatch for anyone who has
 pathological queries to deal with --- just dial the settings down.
Yes, I think that's sensible. I don't know if there are any queries out there
that benefit from higher limits.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 4:32 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 samples  %        image name               symbol name
 886498   53.8090  postgres                 have_relevant_eclass_joinclause
 460596   27.9574  postgres                 bms_overlap

 So maybe a redesign of the equivalence-class joinclause mechanism is in
 order.  Still, this is unlikely to fix the fundamental issue that the
 time for large join problems grows nonlinearly.

Perhaps it's GEQO's fault that it's using these functions
inappropriately, calling them often to calculate these answers
whenever it needs them instead of looking once for join clauses and
then optimizing based on the results. But I've never actually looked
at geqo, maybe that's inherent in the design?


-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Heikki Linnakangas
Rick Gigger wrote:
 If you use an rsync like algorithm for doing the base backups wouldn't
 that increase the size of the database for which it would still be
 practical to just re-sync?  Couldn't you in fact sync a very large
 database if the amount of actual change in the files was a small
 percentage of the total size?

It would certainly help to reduce the network traffic, though you'd
still have to scan all the data to see what has changed.
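
For illustration, a minimal sketch of the block-checksum idea (every name
here is made up for the example; real rsync additionally uses rolling
checksums so it can handle data that has merely shifted):

#include <stdio.h>
#include <stddef.h>

#define BLCKSZ 8192                 /* PostgreSQL's default block size */

/* Toy checksum (FNV-1a); a real implementation would use a strong hash. */
static unsigned int
block_checksum(const unsigned char *buf, size_t len)
{
    unsigned int h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++)
        h = (h ^ buf[i]) * 16777619u;
    return h;
}

/*
 * Both sides still read every block (the full scan mentioned above),
 * but only blocks whose checksums differ from the standby's copy need
 * to cross the network.
 */
static void
sync_relation_file(FILE *local, const unsigned int *standby_sums,
                   long nblocks,
                   void (*send_block)(long blkno, const unsigned char *buf))
{
    unsigned char buf[BLCKSZ];
    long    blkno;

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        if (fread(buf, 1, BLCKSZ, local) != BLCKSZ)
            break;                          /* EOF or short read */
        if (block_checksum(buf, BLCKSZ) != standby_sums[blkno])
            send_block(blkno, buf);         /* ship only changed blocks */
    }
}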

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Thu, Jul 16, 2009 at 4:32 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 So maybe a redesign of the equivalence-class joinclause mechanism is in
 order.  Still, this is unlikely to fix the fundamental issue that the
 time for large join problems grows nonlinearly.

 Perhaps it's GEQO's fault that it's using these functions
 inappropriately, calling them often to calculate these answers
 whenever it needs them instead of looking once for join clauses and
 then optimizing based on the results. But I've never actually looked
 at geqo, mabe that's inherent in the design?

geqo isn't doing anything the regular planner wouldn't do under similar
conditions.  It might well be that better caching is the answer to this
particular problem, but I don't have time to look closer today.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 17:27:39 Greg Stark wrote:
 On Thu, Jul 16, 2009 at 4:16 PM, Tom Lanet...@sss.pgh.pa.us wrote:
  However, I do observe that this seems a sufficient counterexample
  against the theory that we can just remove the collapse limits and let
  GEQO save us on very complex queries.  On my machine, the example query
  takes about 22 seconds to plan using CVS HEAD w/ all default settings.
  If I set both collapse_limit variables to very high values (I used 999),
  it takes ... um ... not sure; I gave up waiting after half an hour.
 What's the point of GEQO if it doesn't guarantee to produce the
  optimal plan and *also* doesn't guarantee to produce some plan, any
 plan, within some reasonable amount of time? Either we need to fix
 that or else I don't see what it's buying us over our regular planner
 which also might not produce a plan within a reasonable amount of time
 but at least if it does it'll be the right plan.
Well, I could not find a plan where it errored out with the old limits. So one
could argue it's just not adapted.
Although I also could not find a single case where geqo was noticeably faster
with the default settings, even when it was used.
The default settings currently make it relatively hard to trigger geqo at all.


Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 The default settings currently make it relatively hard to trigger geqo at all.

Yes, and that was intentional.  One of the implications of what we're
discussing here is that geqo would get used a lot more for typical
complex queries (if there is any such thing as a typical one).  So
it's fully to be expected that the fallout would be pressure to improve
geqo in various ways.

Given that we are at the start of the development cycle, that prospect
doesn't scare me --- there's plenty of time to fix whatever needs
fixing.  However, I am leaning to the feeling that I don't want to be
putting people in a position where they have no alternative but to use
geqo.  So adjusting rather than removing the collapse limits is seeming
like a good idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow count in window query

2009-07-16 Thread Pavel Stehule
2009/7/16 Hitoshi Harada umi.tan...@gmail.com:
 2009/7/16 Greg Stark gsst...@mit.edu:
 On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehulepavel.steh...@gmail.com 
 wrote:
 postgres=# select avg(a) from (select a, row_number() over (order by
 a) as r, count(*) over () as rc from x ) p where r in
 ((rc+1)/2,(rc+2)/2) ;

 How does this compare to the plain non-windowing SQL implementation:

 select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

 (except that that only works if count(*) is odd).

 Interestingly, finding the median is actually O(n) using Quickselect.
 Maybe we should provide a C implementation of quickselect as a window
 function. I'm not sure how to wedge in the concept that the sort is
 unnecessary even though the ORDER BY is specified, though.
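
For reference, a minimal in-memory quickselect sketch (Lomuto partitioning
with a random pivot; expected O(n), worst case O(n^2)). This is only an
illustration of the algorithm, not proposed executor code:

#include <stdlib.h>

static void
swap(double *a, double *b)
{
    double t = *a;
    *a = *b;
    *b = t;
}

/*
 * Return the k-th smallest element (0-based) of v[0..n-1], partially
 * reordering the array in place.
 */
static double
quickselect(double *v, size_t n, size_t k)
{
    size_t  lo = 0, hi = n - 1;

    while (lo < hi)
    {
        size_t  p = lo + (size_t) rand() % (hi - lo + 1);
        double  pivot = v[p];
        size_t  store = lo, i;

        swap(&v[p], &v[hi]);                /* move pivot out of the way */
        for (i = lo; i < hi; i++)
            if (v[i] < pivot)
                swap(&v[i], &v[store++]);
        swap(&v[store], &v[hi]);            /* pivot into final position */

        if (k == store)
            return v[k];
        else if (k < store)
            hi = store - 1;
        else
            lo = store + 1;
    }
    return v[lo];
}

/* The median of n values (n odd) is then quickselect(v, n, n / 2). */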

 median() should be aggregate, not window function, shouldn't it?

yes - the core of my topic is the significant slowness of queries that use
window functions where an aggregate function could be used. This case could
be easily optimized.

This case isn't important for me. I simply played with window functions and
found Celko's query - and I was surprised, because this query was faster
than the others - I expected something else.


 I'm also not sure how to handle this if the set has to be spooled to
 disk. Quicksort and Quickselect do a lot of scans through the data and
 wouldn't perform well on disk.

 The WindowAgg spools rows into the tuplestore, which holds the data in
 memory as long as it fits. Do you have any idea how it would store
 millions of rows without the tuplestore?

 Regards,


 --
 Hitoshi Harada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Thursday 16 July 2009 17:16:31 Tom Lane wrote:
 I tried the example query and couldn't get Failed to make a valid plan
 out of it ... what settings do you need for that?

 It unfortunately depends on settings and luck. This dependence on luck was
 the reason why I would have liked geqo to behave somewhat deterministically...

 With {join,from}_collapse_limit = 100 it seems to be triggered reliably.
 With lower values it seems harder to trigger; with bigger ones it simply
 takes too long to even get there.

OK, I see it at 100.  Would you confirm that what you get is the failure
in random_init_pool (geqo_pool.c) not the identically-phrased message
elsewhere?  (If you have VERBOSITY = verbose you should see the error
location info.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 18:23:06 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On Thursday 16 July 2009 17:16:31 Tom Lane wrote:
  I tried the example query and couldn't get Failed to make a valid plan
  out of it ... what settings do you need for that?
 
  It unfortunately depends on settings and luck. This dependence on luck
  was the reason why I liked geqo to behave somewhat deterministically...
 
  With  {join,from}_collapse_limit = 100 it seems to be triggered reliably.
  With lower values it seems harder trigger, with bigger it simply takes
  too long to even get there.

 OK, I see it at 100.  Would you confirm that what you get is the failure
 in random_init_pool (geqo_pool.c) not the identically-phrased message
 elsewhere?  (If you have VERBOSITY = verbose you should see the error
 location info.)
Yes. I should have seen that. It's not exactly surprising...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 17:59:58 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  The default settings currently make it relatively hard to trigger geqo at
  all.

 Yes, and that was intentional.  One of the implications of what we're
 discussing here is that geqo would get used a lot more for typical
 complex queries (if there is any such thing as a typical one).  So
 it's fully to be expected that the fallout would be pressure to improve
 geqo in various ways.

 Given that we are at the start of the development cycle, that prospect
 doesn't scare me --- there's plenty of time to fix whatever needs
 fixing.  However, I am leaning to the feeling that I don't want to be
 putting people in a position where they have no alternative but to use
 geqo.  So adjusting rather than removing the collapse limits is seeming
 like a good idea.
Hm. I see a somewhat more fundamental problem with geqo:
I tried several queries, and I found not a single one where the whole
genetic process did any significant improvements to the 'worth'.
It seems that the best variant out of the pool is always either the path
chosen in the end, or at least the cost difference is _really_ low.


Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 4:41 PM, Heikki
Linnakangasheikki.linnakan...@enterprisedb.com wrote:
 Rick Gigger wrote:
 If you use an rsync like algorithm for doing the base backups wouldn't
 that increase the size of the database for which it would still be
 practical to just re-sync?  Couldn't you in fact sync a very large
 database if the amount of actual change in the files was a small
 percentage of the total size?

 It would certainly help to reduce the network traffic, though you'd
 still have to scan all the data to see what has changed.

The fundamental problem with pushing users to start over with a new
base backup is that there's no relationship between the size of the
WAL and the size of the database.

You can plausibly have a system with extremely high transaction rate
generating WAL very quickly, but where the whole database fits in a
few hundred megabytes. In that case you could be behind by only a few
minutes and have it be faster to take a new base backup.

Or you could have a petabyte database which is rarely updated. In
which case it might be faster to apply weeks' worth of logs than to
try to take a base backup.

Only the sysadmin is actually going to know which makes more sense.
Unless we start tying WAL parameters to the database size or
something like that.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Robert Haas
On Thu, Jul 16, 2009 at 12:49 PM, Andres Freundand...@anarazel.de wrote:
 On Thursday 16 July 2009 17:59:58 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  The default settings currently make it relatively hard to trigger geqo at
  all.

 Yes, and that was intentional.  One of the implications of what we're
 discussing here is that geqo would get used a lot more for typical
 complex queries (if there is any such thing as a typical one).  So
 it's fully to be expected that the fallout would be pressure to improve
 geqo in various ways.

 Given that we are at the start of the development cycle, that prospect
 doesn't scare me --- there's plenty of time to fix whatever needs
 fixing.  However, I am leaning to the feeling that I don't want to be
 putting people in a position where they have no alternative but to use
 geqo.  So adjusting rather than removing the collapse limits is seeming
 like a good idea.
 Hm. I see a somewhat more fundamental problem with geqo:
 I tried several queries, and I found not a single one where the whole
 genetic process did any significant improvements to the 'worth'.
 It seems that the best variant out of the pool is always either the path
 chosen in the end, or at least the cost difference is _really_ low.

Ouch.  Did you insert some debugging code to get that information, or
how did you come to that conclusion?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-07-16 Thread Jeff Davis
On Thu, 2009-07-16 at 15:22 +1000, Brendan Jurd wrote:
 I had a play around with the feature in psql.  I think the syntax is
 okay, but using ALTER TABLE ... ADD as you mentioned upthread could
 be a better option.

Ok, I think we're pretty much settled on that option then.

Another idea that I thought about is that:

   ALTER TABLE foo ADD UNIQUE (a, b) USING foo_idx;

could be a shorthand for:

   ALTER TABLE foo ADD INDEX CONSTRAINT (a =, b =) USING foo_idx;

The benefit is that it could go over GiST indexes or hash indexes, not
just btrees. The syntax could also be useful to turn an existing btree
into a unique btree.

 I noticed that there's no change to the output of \d in psql to show
 the constraint, so when I do a \d on my test table, I can see that
 there's a gist index there, but I can't tell that there is also a
 constraint on it.  This seems like a pretty significant shortcoming.
 Essentially once you've created one of these index constraints, it
 vanishes into the catalogs and becomes invisible to the user.  This
 might call for a modification of pg_get_indexdef()?

I agree, that's important. Psql support, regression tests, and docs are
all intertwined somewhat with the syntax, so I held off on that work
until I got a little feedback. I will get to work and see if I can put
together a more complete version in the next few days.

If you happen to have time, you can see if you can break my current
patch. I expect the basic algorithm to remain about the same for my next
version, so if you see any problems with that, please let me know. Also,
if you see any possible improvements that could make it useful for more
situations, that would be helpful, too.

But I think I have enough information to move forward, so if you want to
move on to a more complete patch, feel free.

Thanks for the review!

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Robert Haas
On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote:
 I wrote:
 If I set both collapse_limit variables to very high values (I used 999),
 it takes ... um ... not sure; I gave up waiting after half an hour.
 I also tried with geqo_effort reduced to the minimum of 1, but that
 didn't produce a plan in reasonable time either (I gave up after ten
 minutes).

 After I gave up letting the machine be idle to get a fair timing,
 I turned on oprofile monitoring.  It looks a bit interesting:

That is interesting, but there's not really enough detail here to see
what is going on.  I'm more interested in what the high-level
functions are doing that's causing these guys to be called so many
times.  As Greg says, if the planning time curve for GEQO isn't better
than the one for the standard planner, it's the epitome of pointless.

 So maybe a redesign of the equivalence-class joinclause mechanism is in
 order.  Still, this is unlikely to fix the fundamental issue that the
 time for large join problems grows nonlinearly.

Nonlinear is one thing, but this looks more like exponential.  I
understand that the standard planner is exponential; GEQO should not
be.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 19:13:55 Robert Haas wrote:
 On Thu, Jul 16, 2009 at 12:49 PM, Andres Freundand...@anarazel.de wrote:
  On Thursday 16 July 2009 17:59:58 Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   The default settings currently make it relatively hard to trigger geqo
   at all.
 
  Yes, and that was intentional.  One of the implications of what we're
  discussing here is that geqo would get used a lot more for typical
  complex queries (if there is any such thing as a typical one).  So
  it's fully to be expected that the fallout would be pressure to improve
  geqo in various ways.
 
  Given that we are at the start of the development cycle, that prospect
  doesn't scare me --- there's plenty of time to fix whatever needs
  fixing.  However, I am leaning to the feeling that I don't want to be
  putting people in a position where they have no alternative but to use
  geqo.  So adjusting rather than removing the collapse limits is seeming
  like a good idea.
 
  Hm. I see a somewhat more fundamental problem with geqo:
  I tried several queries, and I found not a single one where the whole
  genetic process did any significant improvements to the 'worth'.
  It seems that the best variant out of the pool is always either the path
  chosen in the end, or at least the cost difference is _really_ low.
 Ouch.  Did you insert some debugging code to get that information, or
 how did you come to that conclusion?
Yes, I enabled GEQO_DEBUG and added some more debugging output.  

Btw, a higher generation count does not change that.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote:
 So maybe a redesign of the equivalence-class joinclause mechanism is in
 order.  Still, this is unlikely to fix the fundamental issue that the
 time for large join problems grows nonlinearly.

 Nonlinear is one thing, but this looks more like exponential.  I
 understand that the standard planner is exponential; GEQO should not
 be.

Well, the equivclass code is new as of 8.3.  It's possible that this
got broken relatively recently ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 19:22:30 Robert Haas wrote:
 On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote:
  I wrote:
  If I set both collapse_limit variables to very high values (I used 999),
  it takes ... um ... not sure; I gave up waiting after half an hour.
  I also tried with geqo_effort reduced to the minimum of 1, but that
  didn't produce a plan in reasonable time either (I gave up after ten
  minutes).
 
  After I gave up letting the machine be idle to get a fair timing,
  I turned on oprofile monitoring.  It looks a bit interesting:
 That is interesting, but there's not really enough detail here to see
 what is going on.  I'm more interested in what the high-level
 functions are doing that's causing these guys to be called so many
 times.  As Greg says, if the planning time curve for GEQO isn't better
 than the one for the standard planner, it's the epitome of pointless.
It is not the actual genetic search, I found out now (or more precisely,
now that I read the trace correctly).

At the start of the query GEQO fills a pool with random paths through the
search space. Unfortunately a random path is not very likely to be valid,
so it checks and checks and...

That's why the problem is not visible with a simple join of 100 or so
tables - all paths are valid there...
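
Roughly, the initialization has this shape (a sketch with made-up helper
names, not the actual random_init_pool() code in geqo/geqo_pool.c):

/* With many join-order restrictions, most random tours are invalid;
 * this retry loop is where the time goes -- and, past the cap, where
 * "failed to make a valid plan" comes from. */
#define MAX_ATTEMPTS_PER_SLOT 10000     /* illustrative cap */

extern void make_random_tour(int *tour, int ntables);
extern int  tour_is_valid(const int *tour, int ntables);   /* would call
                            have_join_order_restriction() and friends */

static int
fill_pool(int **pool, int pool_size, int ntables)
{
    int     slot;

    for (slot = 0; slot < pool_size; slot++)
    {
        int     attempts = 0;

        do
        {
            if (++attempts > MAX_ATTEMPTS_PER_SLOT)
                return 0;           /* give up: no valid plan found */
            make_random_tour(pool[slot], ntables);
        } while (!tour_is_valid(pool[slot], ntables));
    }
    return 1;
}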

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-07-16 Thread Robert Haas
2009/7/16 KaiGai Kohei kai...@ak.jp.nec.com:
 However, I don't think the initial proposal of the largeobject
 security is now on the state to be reviewed seriously.

OK, I am moving this patch to returned with feedback.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] plpythonu datatype conversion improvements

2009-07-16 Thread Caleb Welton
Sorry about that.  Here it is again as an attachment.

-Caleb


On 7/16/09 7:16 AM, Peter Eisentraut pete...@gmx.net wrote:

On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:
 Patch for plpythonu

This patch doesn't apply; I think it got mangled during email transport.
(Tabs changed to spaces, it looks like.)  Could you resend the patch as a
separate attachment in a way that it doesn't get mangled?




plpython_bytea.patch
Description: plpython_bytea.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Josh Berkus

All,

Well, after an hour of tinkering with docbook DTDs and openjade I've 
given up on building docs for the patch I was reviewing on my Mac.


If I'm encountering this difficulty building docs, so are many of the 
other new patch reviewers.  Which means we're *not* reviewing docs for 
completeness, correctness, or correspondence to the actual feature 
syntax until beta time.


This seems like a serious issue for development.  Reviewers, how many of 
you are able to build docs with each patch?


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW

2009-07-16 Thread Josh Berkus

All,

1) Patch applies cleanly against CVS head.

2) Patch compiles and builds cleanly.

3) Unable to check docs because of general doc build problems.

4) Tested the following commands, using a 10MB table of PostgreSQL log data:

postgres=# COPY marchlog TO '/tmp/marchlog1.csv' with csv header;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog2.csv' with csv header force 
quote *;

COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog3.csv' with csv header force 
quote process_id;

COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog4.csv' with csv force quote *;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog5.csv' with force quote *;
ERROR:  COPY force quote available only in CSV mode
STATEMENT:  COPY marchlog TO '/tmp/marchlog5.csv' with force quote *;
ERROR:  COPY force quote available only in CSV mode

postgres=# COPY reloadlog FROM '/tmp/marchlog2.csv' with csv header;
COPY 81097

postgres-# \copy marchlog TO '/tmp/marchlog5.csv' with csv force quote *;
postgres-#

Per discussion, I did not test FORCE QUOTE NOT NULL *.

All output looked as expected.  This patch did not seem to change 
earlier functionality, and seems to quote as specified.


Unless there are other things we want to test (CLOBs?) I think the patch 
is probably ready for code review of the FORCE QUOTE * portion.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?

2009-07-16 Thread Bruce Momjian
Peter Eisentraut wrote:
 On Thursday 16 July 2009 07:09:22 Bruce Momjian wrote:
  Uh, how is this going to behave in 8.5?  Do we still dump sequences, and
  if so, aren't we heading down the road of dumping stuff only because a
  previous release needed it?
 
 Which leads me to a related question:  Do you plan to maintain one version of
 pg_migrator that can upgrade any version to any other version (within reason),
 or will there be separate binaries, say pg_migrator-8.4 and pg_migrator-8.5,
 that each can only upgrade from $selfversion-1 to $selfversion?

One binary/source tree.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE *

2009-07-16 Thread Josh Berkus

Andrew,


FORCE NOT NULL is in any case a fairly blunt instrument - it doesn't
work for a column of any type that doesn't accept an empty string as
valid input, such as numeric types.


Con: this allows COPY to produce output which cannot be reloaded into 
PostgreSQL.


Pro: there is a lot of extremely broken external software which expects 
nulls to be expressed as "".  This improves compatibility with them.



--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Kevin Grittner
On the admin list there was a request for an application name
column in pg_stat_activity.
 
http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php
 
This is available in a lot of other DBMS products, can be useful to
DBAs, and seems pretty cheap and easy.  Could we get that onto the
TODO list?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Jaime Casanova
On Thu, Jul 16, 2009 at 2:08 PM, Kevin
Grittnerkevin.gritt...@wicourts.gov wrote:
 On the admin list there was a request for an application name
 column in pg_stat_activity.

 http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php

 This is available in a lot of other DBMS products, can be useful to
 DBAs, and seems pretty cheap and easy.

ah? how do you implement that? and what's the use case for it?

-- 
Atentamente,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] navigation menu for documents

2009-07-16 Thread David E. Wheeler

On Jul 14, 2009, at 3:21 PM, Andrew Dunstan wrote:

Yes, really. What you suggest here is just not adequate, IMNSHO. I  
don't want to have to scroll to the top or bottom of the page to get  
navigation, and I want to be able to see the navigation and go where  
I want directly.


Hey Andrew,

Check out what I've done for the Bricolage documentation:

  http://www.bricolagecms.org/docs/devel/api/

I'm using jQuery to pull the proper doc into a div. I'm still noodling  
with it, trying to fix encoding issues on Windows, but it's pretty  
close to done.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Rick Gigger

On Jul 16, 2009, at 11:09 AM, Greg Stark wrote:


On Thu, Jul 16, 2009 at 4:41 PM, Heikki
Linnakangasheikki.linnakan...@enterprisedb.com wrote:

Rick Gigger wrote:
If you use an rsync like algorithm for doing the base backups  
wouldn't

that increase the size of the database for which it would still be
practical to just re-sync?  Couldn't you in fact sync a very large
database if the amount of actual change in the files was a small
percentage of the total size?


It would certainly help to reduce the network traffic, though you'd
still have to scan all the data to see what has changed.


The fundamental problem with pushing users to start over with a new
base backup is that there's no relationship between the size of the
WAL and the size of the database.

You can plausibly have a system with extremely high transaction rate
generating WAL very quickly, but where the whole database fits in a
few hundred megabytes. In that case you could be behind by only a few
minutes and have it be faster to take a new base backup.

Or you could have a petabyte database which is rarely updated. In
which case it might be faster to apply weeks' worth of logs than to
try to take a base backup.

Only the sysadmin is actually going to know which makes more sense.
Unless we start tieing WAL parameters to the database size or
something like that.


Once again, wouldn't an rsync-like algorithm help here?  Couldn't you  
have the default be to just create a new base backup for them, but  
then allow you to specify an existing base backup if you've already  
got one?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE *

2009-07-16 Thread Andrew Dunstan



Josh Berkus wrote:

Andrew,


FORCE NOT NULL is in any case a fairly blunt instrument - it doesn't
work for a column of any type that doesn't accept an empty string as
valid input, such as numeric types.


Con: this allows COPY to produce output which cannot be reloaded into 
PostgreSQL.


Pro: there is a lot of extremely broken external software which 
expects nulls to be expressed as "".  This improves compatibility 
with them.




FORCE NOT NULL is only valid when we import data, not when we export 
data, so what other programs expect to receive is irrelevant to any 
argument about FORCE NOT NULL.


AFAICT on a brief look at the patch, it doesn't affect the quoting of 
nulls on export, it just allows * as an alias for all columns for FORCE 
QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced 
quoting of null values, only non-null values. We have never quoted null 
values, and I'm fairly resistant to any suggestion that we should.


As for importing data from programs that produce all values in quotes 
including null/missing values (your pro case above), arguably what we 
need is another flag that would turn an empty string into a null.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Kevin Grittner
Jaime Casanova jcasa...@systemguards.com.ec wrote: 
 Kevin Grittnerkevin.gritt...@wicourts.gov wrote:
 On the admin list there was a request for an application name
 column in pg_stat_activity.
 
 ah? how do you implement that? and what's the use case for?
 
It would be passed as a connection property.  (If that's not feasible,
perhaps a session GUC, which could map to a client-side connection
property is JDBC, etc.)  For many environments, it is at least as
important as an IP address or PID to help someone managing a database
with a lot of connections.  It would not do anything on the server
except show up in pg_stat_activity as another piece of information
about each connection.
 
We would probably want to modify psql, pg_dump, etc. to put the
application name into this connection property, at least by default. 
We may want to add a command-line switch to allow user override -- to
provide something more detailed.  For example,
--application-name='Weekly Purge' could be specified on the psql
command line.
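
For example, a client might then do something like this -- the
application_name parameter is hypothetical at this point, but the rest is
standard libpq:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* "application_name" is the proposed (not yet existing) parameter;
     * everything else here is ordinary conninfo syntax. */
    PGconn *conn = PQconnectdb(
        "host=localhost dbname=mydb application_name='Weekly Purge'");

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

    PQfinish(conn);
    return 0;
}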
 
Yes, this is only as useful as the compliance of the client
applications make it.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 8:08 PM, Kevin
Grittnerkevin.gritt...@wicourts.gov wrote:
 On the admin list there was a request for an application name
 column in pg_stat_activity.

 http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php

 This is available in a lot of other DBMS products, can be useful to
 DBAs, and seems pretty cheap and easy.  Could we get that onto the
 TODO list?

I think you should just add it.

Ok, we probably need some kind of policy for what to do before just
adding things to the TODO but I think it should be relatively liberal.
Something like, you should post that you're going to add it to the
-hackers list, get at least one person agreeing with the item and no
fatal flaws. Oh, and you should check for duplicates or for the same
item on the things-we-don't-want list. But having done that, you
should assume it's up to you to just go ahead and add it.

In this case I don't see any harm in having an opaque application
identifier. Dangers (but surmountable ones I assume) would be:

1) The authenticity of the application identifier needs to be
downplayed -- don't even think of using it for security for example.

2) encoding issues if different connections are in different encodings...

3) backwards compatibility both in the library api and protocol

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synch Rep for CommitFest 2009-07

2009-07-16 Thread Robert Haas
On Thu, Jul 16, 2009 at 1:09 PM, Greg Starkgsst...@mit.edu wrote:
 On Thu, Jul 16, 2009 at 4:41 PM, Heikki
 Linnakangasheikki.linnakan...@enterprisedb.com wrote:
 Rick Gigger wrote:
 If you use an rsync like algorithm for doing the base backups wouldn't
 that increase the size of the database for which it would still be
 practical to just re-sync?  Couldn't you in fact sync a very large
 database if the amount of actual change in the files was a small
 percentage of the total size?

 It would certainly help to reduce the network traffic, though you'd
 still have to scan all the data to see what has changed.

 The fundamental problem with pushing users to start over with a new
 base backup is that there's no relationship between the size of the
 WAL and the size of the database.

 You can plausibly have a system with extremely high transaction rate
 generating WAL very quickly, but where the whole database fits in a
 few hundred megabytes. In that case you could be behind by only a few
 minutes and have it be faster to take a new base backup.

 Or you could have a petabyte database which is rarely updated. In
 which case it might be faster to apply weeks' worth of logs than to
 try to take a base backup.

 Only the sysadmin is actually going to know which makes more sense.
 Unless we start tieing WAL parameters to the database size or
 something like that.

I think we need a way for the master to know who its slaves are and
keep any given bit of WAL available until all slaves have succesfully
read it, just as we keep each WAL file until we successfully copy it
to the archive.  Otherwise, there's no way to be sure that a
connection break won't result in the need for a new base backup.  (In
a way, a slave is very similar to an additional archive.)
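
A sketch of the bookkeeping that implies (entirely hypothetical; nothing
like this exists yet):

/*
 * Each registered slave reports the WAL position it has safely
 * received; the master must not recycle WAL at or beyond the minimum
 * of those positions.
 */
typedef unsigned long long XLogPos;     /* simplified WAL byte position */

typedef struct
{
    XLogPos confirmed;      /* everything before this has been received */
} StandbySlot;

static XLogPos
oldest_needed_wal(const StandbySlot *slots, int nslots, XLogPos archive_done)
{
    XLogPos keep = archive_done;    /* the archive is one more "slave" */
    int     i;

    for (i = 0; i < nslots; i++)
        if (slots[i].confirmed < keep)
            keep = slots[i].confirmed;
    return keep;    /* WAL segments wholly before this can be recycled */
}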

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] navigation menu for documents

2009-07-16 Thread Andrew Dunstan



David E. Wheeler wrote:

On Jul 14, 2009, at 3:21 PM, Andrew Dunstan wrote:

Yes, really. What you suggest here is just not adequate, IMNSHO. I 
don't want to have to scroll to the top or bottom of the page to get 
navigation, and I want to be able to see the navigation and go where 
I want directly.


Hey Andrew,

Check out what I've done for the Bricolage documentation:

  http://www.bricolagecms.org/docs/devel/api/

I'm using jQuery to pull the proper doc into a div. I'm still noodling 
with it, trying to fix encoding issues on Windows, but it's pretty 
close to done.





Yes, that's nice, it's just the sort of thing I had in mind - if you can 
do it with a div instead of frames I'm fine with that.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW

2009-07-16 Thread Robert Haas
On Thu, Jul 16, 2009 at 2:47 PM, Josh Berkusj...@agliodbs.com wrote:
 Unless there are other things we want to test (CLOBs?) I think the patch is
 probably ready for code review of the FORCE QUOTE * portion.

I think perhaps we should ask the patch author to remove the NOT NULL
stuff first?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote:
 Kevin Grittnerkevin.gritt...@wicourts.gov wrote:
 On the admin list there was a request for an application name
 column in pg_stat_activity.

 http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php

 This is available in a lot of other DBMS products, can be useful to
 DBAs, and seems pretty cheap and easy.  Could we get that onto the
 TODO list?
 
 I think you should just add it.
 
Done.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW

2009-07-16 Thread Josh Berkus

On 7/16/09 12:53 PM, Robert Haas wrote:

On Thu, Jul 16, 2009 at 2:47 PM, Josh Berkusj...@agliodbs.com  wrote:

Unless there are other things we want to test (CLOBs?) I think the patch is
probably ready for code review of the FORCE QUOTE * portion.


I think perhaps we should ask the patch author to remove the NOT NULL
stuff first?


Yes, current status is Waiting on Author.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_activity.application_name

2009-07-16 Thread Heikki Linnakangas
Kevin Grittner wrote:
 We would probably want to modify psql, pg_dump, etc. to put the
 application name into this connection property, at least by default. 
 We may want to add a command-line switch to allow user override -- to
 provide something more detailed.  For example,
  --application-name='Weekly Purge' could be specified on the psql
 command line.

I've seen it set to argv[0] on other DBMSs, if not set explicitly. That
would be a reasonable default, and would handle psql, pg_dump etc.
without changes.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering

2009-07-16 Thread Kenneth Marshall
On Thu, Jul 16, 2009 at 06:49:08PM +0200, Andres Freund wrote:
 On Thursday 16 July 2009 17:59:58 Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   The default settings currently make it relatively hard to trigger geqo at
   all.
 
  Yes, and that was intentional.  One of the implications of what we're
  discussing here is that geqo would get used a lot more for typical
  complex queries (if there is any such thing as a typical one).  So
  it's fully to be expected that the fallout would be pressure to improve
  geqo in various ways.
 
  Given that we are at the start of the development cycle, that prospect
  doesn't scare me --- there's plenty of time to fix whatever needs
  fixing.  However, I am leaning to the feeling that I don't want to be
  putting people in a position where they have no alternative but to use
  geqo.  So adjusting rather than removing the collapse limits is seeming
  like a good idea.
 Hm. I see a somewhat more fundamental problem with geqo:
 I tried several queries, and I found not a single one where the whole
 genetic process did any significant improvements to the 'worth'.
 It seems that the best variant out of the pool is always either the path
 chosen in the end, or at least the cost difference is _really_ low.
 
 
 Andres
 

Hi Andres,

From some of my reading of the literature on join order
optimization via random sampling, such as what would establish
the initial GEQO pool, there is a very good possibility of having
a pretty good plan in the first pool, especially for our larger
initial pool sizes of 100-1000. And in fact, the final plan has
a good chance of being of approximately the same cost as a member
of the initial pool. Uniform sampling alone can give you a close
to optimum plan 80% of the time with an initial sample size of
100. And using biased sampling raises that to 99% or better.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE *

2009-07-16 Thread Josh Berkus

Andrew,


AFAICT on a brief look at the patch, it doesn't affect the quoting of
nulls on export, it just allows * as an alias for all columns for FORCE
QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced
quoting of null values, only non-null values. We have never quoted null
values, and I'm fairly resistant to any suggestion that we should.


See?  That's what happens when I can't build the docs.  ;-)  (and 
there's no previous discussion of the feature).




As for importing data from programs that produce all values in quotes
including null/missing values (your pro case above), arguably what we
need is another flag that would turn an empty string into a null.


Hmmm, TODO, please?  There's a lot of this out there, and I've had to 
build sed into a lot of import routines.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Merlin Moncure
On Thu, Jul 16, 2009 at 2:34 PM, Josh Berkusj...@agliodbs.com wrote:
 All,

 Well, after an hour of tinkering with docbook DTDs and openjade I've given
 up on building docs for the patch I was reviewing on my Mac.

 If I'm encountering this difficulty building docs, so are many of the other
 new patch reviewers.  Which means we're *not* reviewing docs for
 completeness, correctness, or correspondence to the actual feature syntax
 until beta time.

 This seems like a serious issue for development.  Reviewers, how many of you
 are able to build docs with each patch?

Isn't it possible though to write and/or review the documentation
patch without building it?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE *

2009-07-16 Thread Chris Spotts

Josh Berkus wrote:

Andrew,


AFAICT on a brief look at the patch, it doesn't affect the quoting of
nulls on export, it just allows * as an alias for all columns for FORCE
QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced
quoting of null values, only non-null values. We have never quoted null
values, and I'm fairly resistant to any suggestion that we should.


See?  That's what happens when I can't build the docs.  ;-)  (and 
there's no previous discussion of the feature).




As for importing data from programs that produce all values in quotes
including null/missing values (your pro case above), arguably what we
need is another flag that would turn an empty string into a null.


Hmmm, TODO, please?  There's a lot of this out there, and I've had to 
build sed into a lot of import routines.



+1 For that on the TODO, happens all the time...

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for TODO Item: Add prompt escape to display the client and server versions

2009-07-16 Thread Peter Eisentraut
On Thursday 07 May 2009 05:23:41 Dickson S. Guedes wrote:
 This is a WIP patch (for the TODO item in the subject) that I'm putting
 in the Commit Fest queue for 8.5.

The problem I'm seeing with this is that currently it resolves

%v (client) = 8.5devel
%V (server) = 8.5.0

Besides being inconsistent, it's also pretty misleading.  I'd imagine a 
significant use of this feature would be to know that one is connected to, 
say, a beta version.

More generally, does anyone actually need this feature?  psql complains loudly 
enough if the version numbers are not the right ones.  I don't know why this 
would need to be repeated in the prompt.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Greg Smith

On Thu, 16 Jul 2009, Josh Berkus wrote:

Well, after an hour of tinkering with docbook DTDs and openjade I've given up 
on building docs for the patch I was reviewing on my Mac.


It's easier to get the whole chain working under Linux, but even that 
isn't trivial.  I think one useful step here would be to write up some 
practical docs on the package setup side here for various popular 
platforms on the wiki.  I can probably find where I have the RedHat and 
Ubuntu recipes I use around here somewhere, to kick that off as part of 
the review I'm doing for the multi-threaded pgbench.  It's been my 
experience that everybody runs into pretty much the same problems here 
getting started, but said problems are unique to the OS.


If someone writes up something similar for OS X, so there's a recipe for 
getting the standard docs built on all the major development platforms 
where this could be straightforward (I shudder to think what a Cygwin 
guide would look like), that would make it much easier to push toward 
having more people do doc review.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY WITH CSV FORCE QUOTE *

2009-07-16 Thread Andrew Dunstan



Chris Spotts wrote:


As for importing data from programs that produce all values in quotes
including null/missing values (your pro case above), arguably what we
need is another flag that would turn an empty string into a null.


Hmmm, TODO, please?  There's a lot of this out there, and I've had 
to build sed into a lot of import routines.



+1 For that on the TODO, happens all the time...



Well, somebody had better suggest a syntax for it, preferably without 
adding yet another keyword.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Make planning via GEQO repeatable

2009-07-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 Query planning via GEQO currently can yield a different plan on every
 invocation of the planner due to its non-exhaustive nature.
 This often can be inconvenient because at times there may be a very
 bad plan. It also makes it very hard to reproduce a problem with GEQO.

Applied with some editorialization.  Mainly, I didn't see the point of
preserving the ability to have nondeterministic planning, and I
especially didn't care for having that still be the default behavior.
So I just made it unconditionally initialize the seed.  It would of
course take only minor tweaking to do things differently.
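
For the archives, a hypothetical repro recipe -- assuming the GUC is
spelled geqo_seed:

    SET geqo = on;
    SET geqo_threshold = 2;   -- force GEQO even for small join problems
    SET geqo_seed = 0.5;      -- any fixed value in [0, 1]
    EXPLAIN SELECT * FROM t1, t2, t3
        WHERE t1.id = t2.id AND t2.id = t3.id;
    -- repeated runs with the same seed should now yield the same plan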

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: support for multiplexing SIGUSR1

2009-07-16 Thread Jaime Casanova
On Thu, Jul 16, 2009 at 2:57 AM, Jaime Casanova
jcasa...@systemguards.com.ec wrote:
 Hi,

 I'm reviewing this patch:
 http://archives.postgresql.org/message-id/3f0b79eb0907022341m1d36a841x19c3e2a5a6906...@mail.gmail.com

Another thing that caught my attention: I don't think this is safe (it
assumes only one auxiliary process of any given type; I don't know
whether we ever run several of the same kind, but...):

+   /*
+    * Assign backend IDs to auxiliary processes like backends, in order
+    * to allow multiplexing signals to auxiliary processes. Since
+    * backends use IDs in the range from 1 to MaxBackends (inclusive),
+    * we assign each auxiliary process MaxBackends + AuxProcType + 1 as
+    * a unique ID.
+    */
+   MyBackendId = MaxBackends + auxType + 1;
+   MyProc->backendId = MyBackendId;
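
To make the hazard concrete (my illustration, not code from the patch):

    /*
     * Hypothetically, if two auxiliary processes ran with the same
     * AuxProcType (say auxType = 3 and MaxBackends = 100), both would
     * compute MyBackendId = 100 + 3 + 1 = 104, and a signal multiplexed
     * on that backend ID could be delivered to the wrong process.
     */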


-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Make planning via GEQO repeatable

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 23:04:58 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  Query planning via GEQO currently can yield a different plan on every
  invocation of the planner due to its non-exhaustive nature.
  This often can be inconvenient because at times there may be a very
  bad plan. It also makes it very hard to reproduce a problem with GEQO.

 Applied with some editorialization.  Mainly, I didn't see the point of
 preserving the ability to have nondeterministic planning, and I
 especially didn't care for having that still be the default behavior.
 So I just made it unconditionally initialize the seed.  It would of
 course take only minor tweaking to do things differently.
Nice. 
Mainly I did not have the guts to change the behaviour completely...

archive.org has a copy of the dead comp.ai.genetic FAQ link at
http://web.archive.org/web/20051226001402/http://www.cs.bham.ac.uk/Mirrors/ftp.de.uu.net/EC/clife/www/location.htm
which is the same as the one referenced by Alvaro in
http://archives.postgresql.org/pgsql-docs/2009-07/msg4.php

If considered relevant enough, you can update the link...


Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] boolean in C

2009-07-16 Thread Bernd Helmle



--On 16 July 2009 13:32:03 +0100 Grzegorz Jaskiewicz g...@pointblue.com.pl 
wrote:



oh, another thing.
stdbool is a C99 standard feature, not a gcc extension.


There might be compiler versions out there which claim to be C99 but do 
not provide fully compliant include headers. Sun Studio 12, at least, has 
the following in its documentation, as a quick search brings up:


Though the compiler defaults to supporting the features of C99 listed 
below, standard headers provided by the Solaris software in /usr/include do 
not yet conform with the 1999 ISO/IEC C standard
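
Which is why many projects keep a configure-time fallback. A minimal
sketch of the usual pattern (not PostgreSQL's actual c.h; HAVE_STDBOOL_H
stands in for whatever the configure probe defines):

    #if defined(HAVE_STDBOOL_H)
    #include <stdbool.h>            /* trust the (claimed) C99 header */
    #else
    typedef char bool;              /* single-byte fallback */
    #define true    ((bool) 1)
    #define false   ((bool) 0)
    #endif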



--
Thanks

Bernd


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Euler Taveira de Oliveira
Merlin Moncure wrote:
 Isn't it possible though to write and/or review the documentation
 patch without building it?
 
cd pgsql/doc/src/sgml && gmake check


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] boolean in C

2009-07-16 Thread Dann Corbit
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Bernd Helmle
 Sent: Thursday, July 16, 2009 8:47 AM
 To: Grzegorz Jaskiewicz
 Cc: pgsql-hackers Hackers
 Subject: Re: [HACKERS] boolean in C
 
 --On 16 July 2009 13:32:03 +0100 Grzegorz Jaskiewicz
 g...@pointblue.com.pl
 wrote:
 
  oh, another thing.
  stdbool is a C99 standard feature, not a gcc extension.
 
 There might be compiler versions out there which claim to be C99 but
 do not provide fully compliant include headers. Sun Studio 12, at
 least, has the following in its documentation, as a quick search
 brings up:
 
 Though the compiler defaults to supporting the features of C99 listed
 below, standard headers provided by the Solaris software in
 /usr/include do not yet conform with the 1999 ISO/IEC C standard

It's more or less a generic problem.  There are only a handful of fully
functional C99 compilers[0], and all the others have some C99 features
to one degree or another.

Microsoft's compiler is particularly abysmal, but then again, they make
no claims of C99 compliance, so there is nothing to complain about there.
The few features that they do implement are implemented in a
non-standard way.

GCC is also only partially compliant[1].  

I believe that the Dinkum library is the only certified C99 standard
library[2] as well.

[0] see: http://www.peren.com/pages/branding_set.htm 
[1] see: http://gcc.gnu.org/c99status.html
[2] see: http://www.dinkumware.com/manuals/

I don't think it is even possible to write a product that is expected to
run on as many platforms as PostgreSQL in C99 code, because there are
not enough compliant compilers available.

IMO-YMMV


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status report: getting plpgsql to use the core lexer

2009-07-16 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 
 One problem that wasn't obvious when I started is that if you are
 trying to use a reentrant lexer, Bison insists on including its
 YYSTYPE union in the call signature of the lexer.  Of course,
 YYSTYPE means different things to the core grammar and plpgsql's
 grammar.  I tried to work around that by having an interface layer
 that would (among other duties) translate as needed.  It turned out
 to be a real PITA, not least because you can't include both
 definitions into the same C file.  The scheme I have has more or
 less failed --- I think I'd need *two* interface layers to make it
 work without unmaintainable kluges.  It would probably be better to
 try to adjust the core lexer's API some more so that it does not
 depend on the core YYSTYPE, but I'm not sure yet how to get Bison to
 play along without injecting an interface layer (and hence wasted
 cycles) into the core grammar/lexer interface.
 
 Another pretty serious issue is that the current plpgsql lexer
 treats various sorts of qualified names as single tokens.  I had
 thought this could be worked around in the interface layer by doing
 more lookahead.  You can do that, and it mostly works, but it's
 mighty tedious.  The big problem is that yytext gets out of step
 --- it will point at the last token the core lexer has processed,
 and there's no good way to back it up after lookahead.  I spent a
 fair amount of time trying to work around that by eliminating uses
 of yytext in plpgsql, and mostly succeeded, but there are still
 some left.  (Some of the remaining regression failures are error
 messages that point at the wrong token because they rely on yytext.)
 
 Now, having name lookup happen at the lexical level is pretty bogus
 anyhow.  The long-term solution here is probably to avoid doing
 lookup in the plpgsql lexer and move it into some sort of callback
 hook in the main parser, as we've discussed before.  I didn't want
 to get into that right away, but I'm now thinking it has to happen
 before not after refactoring the lexer code.  One issue that has to
 be surmounted before that can happen is that plpgsql currently
 throws away all knowledge of syntactic scope after initial
 processing of a function --- the name stack is no longer available
 when we want to parse individual SQL commands.  We can probably
 rearrange that design but it's another bit of work I don't have time
 for right now.
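
To make the signature clash Tom describes concrete -- a sketch assuming
flex's reentrant/bison-bridge options, which are what force the
grammar's YYSTYPE into the lexer's call signature:

    /* prototype generated with %option reentrant bison-bridge: */
    int core_yylex(YYSTYPE *yylval_param, yyscan_t yyscanner);

    /*
     * The core grammar and plpgsql's grammar each define their own
     * YYSTYPE union, so no single C file can include both definitions;
     * hence the need for an interface layer, or for a core lexer API
     * that does not mention the core YYSTYPE at all.
     */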
 
All of this sounds pretty familiar to me.  As you may recall, our
framework includes a SQL parser which parses the subset of standard
SQL we feel is portable enough, and generates Java classes to
implement the code in lowest common denominator SQL with all
procedural code for triggers and stored procedures handled in Java
(which runs in our middle tier database service).  We use ANTLR, and
initially had a three-phase process: lexer, parser, and tree-walkers
to generate code.  We were doing way too much in the parser phase --
checking for table names, column names, data types, etc.  The syntax
of SQL forced us to do a lot of scanning forward and remembering where
we were (especially to get the FROM clause information so we could
process columns in the result list).
 
We were able to get to much cleaner code by rewriting the parser to
have a dumb phase to get the overall structure into an AST, and then
use a tree-walker phase to do all the lookups and type resolution
after we had the rough structure, writing another AST to walk for code
generation.  Besides making the code cleaner and easier to maintain,
it helped us give better error messages pointing more accurately to
the source of the problem.  I don't know if a similar approach is
feasible in flex/bison, but if it is, refactoring for an extra pass
might be worth the trouble.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status report: getting plpgsql to use the core lexer

2009-07-16 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 ...
 We were able to get to much cleaner code by rewriting the parser to
 have a dumb phase to get the overall structure into an AST, and then
 use a tree-walker phase to do all the lookups and type resolution
 after we had the rough structure, writing another AST to walk for code
 generation.  Besides making the code cleaner and easier to maintain,
 it helped us give better error messages pointing more accurately to
 the source of the problem.  I don't know if a similar approach is
 feasible in flex/bison, but if it is, refactoring for an extra pass
 might be worth the trouble.

That's actually what we have in the core parser.  plpgsql is trying to
take shortcuts, and this whole project is exactly about weaning it away
from that.  The bottom line is I tried to tackle the sub-projects in the
wrong order...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Tom Lane
Greg Smith gsm...@gregsmith.com writes:
 On Thu, 16 Jul 2009, Josh Berkus wrote:
 Well, after an hour of tinkering with docbook DTDs and openjade I've
 given up on building docs for the patch I was reviewing on my Mac.

 It's easier to get the whole chain working under Linux, but even that 
 isn't trivial.

Really?  It's just worked for me on the last several Fedora releases.
You do need to install the docbook packages of course ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Andrew Dunstan



Tom Lane wrote:
 Greg Smith gsm...@gregsmith.com writes:
 On Thu, 16 Jul 2009, Josh Berkus wrote:
 Well, after an hour of tinkering with docbook DTDs and openjade I've
 given up on building docs for the patch I was reviewing on my Mac.

 It's easier to get the whole chain working under Linux, but even that
 isn't trivial.

 Really?  It's just worked for me on the last several Fedora releases.
 You do need to install the docbook packages of course ...


Yes, that's my experience also.

In any case, you really don't need to build the docs to read them. You 
might not like SGML, but it's not *that* hard to understand. Surely our 
patch reviewers can read the SGML text.


Of course, we should check that the docs build cleanly after the patch 
is applied, but that's a different issue. As far as building goes, the 
CVS HEAD docs at 
http://developer.postgresql.org/pgdocs/postgres/index.html are rebuilt 
frequently, so we effectively get a build check soon after each patch 
is applied.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: generalized index constraints

2009-07-16 Thread Brendan Jurd
2009/7/17 Jeff Davis pg...@j-davis.com:
 Another idea that I thought about is that:

   ALTER TABLE foo ADD UNIQUE (a, b) USING foo_idx;

 could be a shorthand for:

   ALTER TABLE foo ADD INDEX CONSTRAINT (a =, b =) USING foo_idx;

 The benefit is that it could go over GiST indexes or hash indexes, not
 just btrees. The syntax could also be useful to turn an existing btree
 into a unique btree.

I like that idea ... although how would this interact (if at all) with
the existing pg_index.isunique flag?  Would it become deprecated in
favour of using indconstrats, or would you actually look at switching
isunique to TRUE if somebody applies a constraint which is made up
entirely of equality ops?
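
For concreteness, a hypothetical session under Jeff's proposed syntax
(nothing here is settled, and foo_idx is made up):

    CREATE INDEX foo_idx ON foo USING hash (a);
    -- an equality-only index constraint: semantically UNIQUE (a),
    -- but enforced through a hash index rather than a btree
    ALTER TABLE foo ADD INDEX CONSTRAINT (a =) USING foo_idx;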

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Docbook toolchain interfering with patch review?

2009-07-16 Thread Brendan Jurd
2009/7/17 Josh Berkus j...@agliodbs.com:
 This seems like a serious issue for development.  Reviewers, how many of you
 are able to build docs with each patch?

Being able to build docs did require some fiddling with the docbook
packages (on Gentoo).  The only trick was working out exactly which
packages I needed to install.  Since getting past that, I've not had
any problems building the docs, although it is pretty slow.

As Merlin and Andrew have noted, being able to build the docs is a
nice-to-have for documentation review, not a genuine requirement.  You
*can* review changes to SGML right there in the diff, especially if
the changes are not extensive and/or don't alter the structure of the
document.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >