Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-24 Thread Shigeru HANADA
On Tue, 23 Nov 2010 10:22:02 -0800
Joshua D. Drake j...@commandprompt.com wrote:
 On Tue, 2010-11-23 at 20:18 +0200, Heikki Linnakangas wrote:
  On 23.11.2010 14:22, Shigeru HANADA wrote:
 
   OID is supported, to get the oid from the source table (yes, it works only
   for postgresql_fdw, but it seems useful to support).
  
  I don't think that's worthwhile. OIDs on user tables are a legacy 
  feature, not recommended for new applications.
 
 Agreed. We should do everything we can to NOT encourage their use.

Agreed.  I'll remove OID support and repost the patch in a new thread.

Regards,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas

On 24.11.2010 06:56, Joachim Wieland wrote:

On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 19.11.2010 23:46, Joachim Wieland wrote:


FATAL:  too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978,
pArray->maxKnownAssignedXids: 6890


Hmm, that's a lot of entries in KnownAssignedXids.

Can you recompile with WAL_DEBUG, and run the recovery again with
wal_debug=on ? That will print all the replayed WAL records, which is a lot
of data, but it might give a hint what's going on.


Sure, but this gives me only one more line:

[...]
LOG:  redo starts at 1F8/FC00E978
LOG:  REDO @ 1F8/FC00E978; LSN 1F8/FC00EE90: prev 1F8/FC00E930; xid
385669; len 21; bkpb1: Heap - insert: rel 1663/16384/18373; tid
3829898/23
FATAL:  too many KnownAssignedXids
CONTEXT:  xlog redo insert: rel 1663/16384/18373; tid 3829898/23
LOG:  startup process (PID 4587) exited with exit code 1
LOG:  terminating any other active server processes


Thanks, I can reproduce this now. This happens when you have a wide gap 
between the oldest still active xid and the latest xid.


When recovery starts, we fetch the oldestActiveXid from the checkpoint 
record. Let's say that it's 100. We then start replaying WAL records 
from the Redo pointer, and the first record (heap insert in your case) 
contains an Xid that's much larger than 100, say 10000. We call 
RecordKnownAssignedXids() to make note that all xids between that range 
are in-progress, but there isn't enough room in the array for that.


We normally get away with a smallish array because the array is trimmed 
at commit and abort records, and the special xid-assignment record to 
handle the case of a lot of subtransactions. We initialize the array 
from the running-xacts record that's written at a checkpoint. That 
mechanism fails in this case because the heap insert record is seen 
before the running-xacts record, causing all those xids in the range 
100-10000 to be considered in-progress. The running-xacts record that 
comes later would prune them, but we don't have enough slots to hold 
them until then.


Hmm. I'm not sure off the top of my head how to fix that. Perhaps stash 
the xids we see during WAL replay in private memory instead of putting 
them in the KnownAssignedXids array until we see the running-xacts record.
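
For illustration, a minimal C sketch of that idea (every name below is
invented for the sketch, not existing code; the real thing would live in
procarray.c and would need to handle stash overflow):

#include <stdbool.h>

typedef unsigned int TransactionId;

#define MAX_STASHED_XIDS 65536		/* arbitrary, for the sketch */

static TransactionId stashedXids[MAX_STASHED_XIDS];
static int  nStashedXids = 0;
static bool seenRunningXacts = false;

/* Stub standing in for insertion into the shared KnownAssignedXids array. */
static void
KnownAssignedXidsAdd(TransactionId xid)
{
	(void) xid;
}

/* Called for each xid observed in a replayed WAL record. */
static void
ObserveXidDuringReplay(TransactionId xid)
{
	if (seenRunningXacts)
		KnownAssignedXidsAdd(xid);			/* normal shared tracking */
	else if (nStashedXids < MAX_STASHED_XIDS)
		stashedXids[nStashedXids++] = xid;	/* startup-local stash */
}

/* Called when the first running-xacts record is replayed. */
static void
SawRunningXactsRecord(void)
{
	/*
	 * The snapshot in the record is authoritative: initialize the shared
	 * tracking from it and discard the stash, which by now contains only
	 * xids the snapshot already accounts for.
	 */
	nStashedXids = 0;
	seenRunningXacts = true;
}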


To reproduce this, I did this in the master:

postgres=# CREATE FUNCTION insertfunc(n integer) RETURNS VOID AS $$
declare
  i integer;
begin
  FOR i IN 1..n LOOP
BEGIN
  INSERT INTO foo VALUES (1);
EXCEPTION WHEN division_by_zero THEN RAISE NOTICE 'divbyzero';
END;
  END LOOP;
end;
$$ LANGUAGE plpgsql;
postgres=# SELECT insertfunc(1);

After letting that run for a while, so that a couple of checkpoints have 
occurred, kill the master and start standby to recover that from 
archive. After it has replayed all the WAL, stop the standby and restart it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas

On 24.11.2010 12:48, Heikki Linnakangas wrote:

On 24.11.2010 06:56, Joachim Wieland wrote:

On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

On 19.11.2010 23:46, Joachim Wieland wrote:


FATAL: too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978,
pArray->maxKnownAssignedXids: 6890


Hmm, that's a lot of entries in KnownAssignedXids.

Can you recompile with WAL_DEBUG, and run the recovery again with
wal_debug=on ? That will print all the replayed WAL records, which is
a lot
of data, but it might give a hint what's going on.


Sure, but this gives me only one more line:

[...]
LOG: redo starts at 1F8/FC00E978
LOG: REDO @ 1F8/FC00E978; LSN 1F8/FC00EE90: prev 1F8/FC00E930; xid
385669; len 21; bkpb1: Heap - insert: rel 1663/16384/18373; tid
3829898/23
FATAL: too many KnownAssignedXids
CONTEXT: xlog redo insert: rel 1663/16384/18373; tid 3829898/23
LOG: startup process (PID 4587) exited with exit code 1
LOG: terminating any other active server processes


Thanks, I can reproduce this now. This happens when you have a wide gap
between the oldest still active xid and the latest xid.

When recovery starts, we fetch the oldestActiveXid from the checkpoint
record. Let's say that it's 100. We then start replaying WAL records
from the Redo pointer, and the first record (heap insert in your case)
contains an Xid that's much larger than 100, say 10000. We call
RecordKnownAssignedXids() to make note that all xids between that range
are in-progress, but there isn't enough room in the array for that.

We normally get away with a smallish array because the array is trimmed
at commit and abort records, and the special xid-assignment record to
handle the case of a lot of subtransactions. We initialize the array
from the running-xacts record that's written at a checkpoint. That
mechanism fails in this case because the heap insert record is seen
before the running-xacts record, causing all those xids in the range
100-10000 to be considered in-progress. The running-xacts record that
comes later would prune them, but we don't have enough slots to hold
them until then.

Hmm. I'm not sure off the top of my head how to fix that. Perhaps stash
the xids we see during WAL replay in private memory instead of putting
them in the KnownAssignedXids array until we see the running-xacts record.


Looking closer at RecordKnownAssignedTransactionIds(), there's a related, 
much more serious bug there too. When latestObservedXid is initialized 
to the oldest still-running xid, oldestActiveXid, at WAL recovery, we 
zero the CLOG starting from the oldestActiveXid. That will zap away the 
clog bits of any old transactions that had already committed before the 
checkpoint started, but were younger than the oldest still running 
transaction. The transactions will be lost :-(.


It's dangerous to initialize latestObservedXid to anything to an older 
value. The idea of keeping the seen xids in a temporary list private to 
the startup process until the running-xacts record would solve that 
problem too. ProcArrayInitRecoveryInfo() would not be needed anymore, 
the KnownAssignedXids tracking would start at the first running-xacts 
record (or shutdown checkpoint) we see, not any sooner than that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas

On 24.11.2010 13:38, Heikki Linnakangas wrote:

It's dangerous to initialize latestObservedXid to anything to an older
value.


older value than the nextXid-1 from the checkpoint record, I meant to say.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Instrument checkpoint sync calls

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 1:02 AM, Greg Smith g...@2ndquadrant.com wrote:
 Robert Haas wrote:
 Did this get eaten by the email goblin, or are you still working on it?

 Fell behind due to an unfortunately timed bit of pneumonia.  Hurray for the
 health benefits of cross country flights.  Will fix this up, rebase my other
 patch, and head toward some more review/'Fest cleanup now that I'm feeling
 better.

Ouch.  Fringe benefits of consulting.  Thanks for the update.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 2:10 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Anything we can do about this?  That's a lot of overhead, and it'd be
 a lot worse on a big machine with 8GB of shared_buffers.

 Micro-optimizing that search for the non-zero value helps a little bit
 (attached). Reduces the percentage shown by oprofile from about 16% to 12%
 on my laptop.

 For bigger gains,

The first optimization that occurred to me was to remove the loop
altogether.  I could maybe see needing to do something like this if
we're recovering from an error, but why do we need to do this (except
perhaps to fail an assertion) if we're exiting cleanly?  Even a
session-lifetime buffer-pin leak would be quite disastrous, one would
think.

 Now, the other question is if this really matters. Even if we eliminate that
 loop in AtProcExit_Buffers altogether, is connect/disconnect still so
 slow that you have to use a connection pooler if you do that a lot?

Oh, I'm sure this isn't going to be nearly enough to fix that problem,
but every little bit helps; and if we never do the first optimization,
we'll never get to #30 or wherever it is that it really starts to move
the needle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 Well, very little about pg_dump is very [E], IMNSHO. The question in my
 mind here is what format the list file will take

 I was thinking same format as pg_restore -l, only without the dumpIDs.

 Nope ... those strings are just helpful comments, they aren't really
 guaranteed to be unique identifiers.  In any case, it seems unlikely
 that a user could expect to get the more complicated cases exactly right
 other than by consulting pg_dump | pg_restore -l output.  Which makes
 the use-case kind of dubious to me.

 I don't say that this wouldn't be a useful feature, but you need a
 better spec than this.

One thing I've often wished for is the ability to dump a specific
function (usually right after I accidentally rm the file the
source code was in).  pg_dump has -t to pick a table, but there's no
analogous way to select an object that isn't a relation.  I think the
first step here would be to design a system that lets you use a
command-line argument to dump an arbitrary object, and after that you
could work on reading the object descriptors from a file rather than
the command line.

As a first attempt at syntax, I might suggest something along the
lines of <object type>: <object name>, where the types and names might
look to COMMENT ON for inspiration.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Cédric Villemain cedric.villemain.deb...@gmail.com writes:
 I think you (Robert) misunderstood dramatically what Pavel try to do.
 Pavel did an excellent optimization work for a specific point. This
 specific point looks crucial for me in the current behavior of
 PostgreSQL[1]. AFAIS Pavel didn't want to implement a genious syntax,
 but an optimization feature.

 As near as I can tell, Pavel is bullheadedly insisting on adding new
 syntax, not on the optimization aspect of it.  I already pointed out
 how he could get 100% of the performance benefit using the existing
 syntax, but he doesn't appear to be willing to pursue that route.

Right, that was my impression, too.  But, I think this may be partly a
case of people talking past each other.  My impression of this
conversation was a repetition of this sequence:

A: This syntax is bad.
B: But it's way faster!

...which makes no sense.  However, what I now think is going on here
is that there are really two separate things that are wished for here
- a more compact syntax, and a performance improvement.  And taken
separately, I agree with both of those desires.  PL/pgsql is an
incredibly clunky language syntactically, and it's also slow.  A patch
that improves either one of those things has value, whether or not it
also does the other one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Dmitriy Igrishin
Hey hackers,

Completely agree with Robert!
It would be nice to dump function definitions, e.g. to make it
possible to keep them in git separately.
I also want to propose making it possible to dump function definitions
as CREATE OR REPLACE FUNCTION rather than just CREATE
FUNCTION (as pg_dump dumps them now). That would be as useful
as psql's \ef.

2010/11/24 Robert Haas robertmh...@gmail.com

 On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Josh Berkus j...@agliodbs.com writes:
  Well, very little about pg_dump is very [E], IMNSHO. The question in my
  mind here is what format the list file will take
 
  I was thinking same format as pg_restore -l, only without the dumpIDs.
 
  Nope ... those strings are just helpful comments, they aren't really
  guaranteed to be unique identifiers.  In any case, it seems unlikely
  that a user could expect to get the more complicated cases exactly right
  other than by consulting pg_dump | pg_restore -l output.  Which makes
  the use-case kind of dubious to me.
 
  I don't say that this wouldn't be a useful feature, but you need a
  better spec than this.

 One thing I've often wished for is the ability to dump a specific
 function (usually right after I accidentally rm the file the
 source code was in).  pg_dump has -t to pick a table, but there's no
 analogous way to select an object that isn't a relation.  I think the
 first step here would be to design a system that lets you use a
 command-line argument to dump an arbitrary object, and after that you
 could work on reading the object descriptors from a file rather than
 the command line.

 As a first attempt at syntax, I might suggest something along the
 lines of <object type>: <object name>, where the types and names might
 look to COMMENT ON for inspiration.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
// Dmitriy.


Re: [HACKERS] Extensions, this time with a patch

2010-11-24 Thread Itagaki Takahiro
On Tue, Nov 23, 2010 at 18:19, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Please find that done in attached v4 of the cfparser patch.

RECOVERY_COMMAND_FILE is opened twice in the patch. The first time
is for checking its existence, and the second time is for parsing.
Instead of repeating that, how about adding a FILE* version of the
parser? It could also be called from ParseConfigFile() as a subroutine.

  bool ParseConfigFd(FILE *fd, const char *config_file, int depth, ...)


BTW, the parser supports include and custom_variable_classes
not only for postgresql.conf but for all files. Is that intended
behavior?  I think they are harmless, so we don't have to change
the code; include might be useful even in recovery.conf, and
custom_variable_classes will just produce an unrecognized recovery
parameter error after all.
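
A minimal sketch of the shape this refactoring could take, with
simplified signatures (the real prototype above takes more arguments);
illustration only, not the actual cfparser code:

#include <stdio.h>
#include <stdbool.h>

/* FILE* worker: parse name = 'value' lines from an already-open handle. */
static bool
ParseConfigFd(FILE *fd, const char *config_file, int depth)
{
	/* ... tokenize the stream and apply each option ... */
	(void) fd; (void) config_file; (void) depth;
	return true;
}

/*
 * The existing filename-based entry point becomes a thin wrapper, so a
 * caller that has already opened the file (e.g. to check that
 * recovery.conf exists) can parse it without a second open.
 */
static bool
ParseConfigFile(const char *config_file, int depth)
{
	FILE   *fd = fopen(config_file, "r");
	bool	ok;

	if (fd == NULL)
		return false;
	ok = ParseConfigFd(fd, config_file, depth);
	fclose(fd);
	return ok;
}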

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Andrew Dunstan



On 11/24/2010 07:29 AM, Robert Haas wrote:

  As a first attempt at syntax, I might suggest something along the
lines of <object type>: <object name>, where the types and names might
look to COMMENT ON for inspiration.



pg_dump already uses a list of object types (e.g. as seen in the output 
from pg_restore --list). Let's not invent something new if we don't need 
to. But we'll need more than that. We'll need enough for disambiguation, 
especially for functions, which are your stated use case. So, something 
like this might work:


FUNCTION:myschema.myfunc:integer,text,timestamp with time zone

We could allow the object type to be compared case-insensitively. For extra 
credit, allow type aliases, and don't require argument types if no 
disambiguation is needed.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 8:41 AM, Andrew Dunstan and...@dunslane.net wrote:
 On 11/24/2010 07:29 AM, Robert Haas wrote:
  As a first attempt at syntax, I might suggest something along the
 lines of <object type>: <object name>, where the types and names might
 look to COMMENT ON for inspiration.

 pg_dump already uses a list of object types (e.g. as seen in the output from
 pg_restore --list). Let's not invent something new if we don't need to. But
 we'll need more than that. We'll need enough for disambiguation, especially
 for functions which is your stated use case. So, something like this might
 work:

    FUNCTION:myschema.myfunc:integer,text,timestamp with time zone

Actually, that's pretty much exactly what I was proposing, except that
I would've kept the existing convention for how to write a function's
arguments:

FUNCTION:myschema.myfunc(integer,text,timestamp with time zone)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] requested feature: limit for text or bytea parameters in log

2010-11-24 Thread Pavel Stehule
Hello

We use very rich logging, for pgFouine processing. But sometimes 1MB
parameters get logged, which is absolutely useless. These long values
could be replaced by something like:

 first n chars ... (truncated, original length: 223636, md5:
jhagjkafhskdjfhdsf)
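
A self-contained sketch of what such truncation could look like (a
hypothetical helper, not an existing API; the md5 in the example above
would come from the server's own md5 routines):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a log-safe copy of value: at most limit leading characters,
 * plus a note giving the original length.  The caller frees the result.
 */
static char *
abbreviate_param_for_log(const char *value, size_t limit)
{
	size_t	len = strlen(value);
	char   *buf;

	if (len <= limit)
		return strdup(value);

	buf = malloc(limit + 64);
	if (buf == NULL)
		return NULL;
	snprintf(buf, limit + 64,
			 "%.*s ... (truncated, original length: %zu)",
			 (int) limit, value, len);
	return buf;
}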

Regards

Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Joachim Wieland
On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Nope ... those strings are just helpful comments, they aren't really
 guaranteed to be unique identifiers.  In any case, it seems unlikely
 that a user could expect to get the more complicated cases exactly right
 other than by consulting pg_dump | pg_restore -l output.  Which makes
 the use-case kind of dubious to me.

In which case would the catalogId, i.e. (tableoid, oid) not be unique?
Or do you rather mean that it does not necessarily refer to the same
object if that object got somehow recreated or that it could be
different on different installations of the same database?

Thanks,
Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Andrew Dunstan



On 11/24/2010 09:05 AM, Joachim Wieland wrote:

On Wed, Nov 24, 2010 at 1:15 AM, Tom Lanet...@sss.pgh.pa.us  wrote:

Nope ... those strings are just helpful comments, they aren't really
guaranteed to be unique identifiers.  In any case, it seems unlikely
that a user could expect to get the more complicated cases exactly right
other than by consulting pg_dump | pg_restore -l output.  Which makes
the use-case kind of dubious to me.

In which case would the catalogId, i.e. (tableoid, oid) not be unique?
Or do you rather mean that it does not necessarily refer to the same
object if that object got somehow recreated or that it could be
different on different installations of the same database?


It would be unique, but a pain in the neck for users to get. Robert's 
idea will have more traction with users.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Nope ... those strings are just helpful comments, they aren't really
 guaranteed to be unique identifiers.  In any case, it seems unlikely
 that a user could expect to get the more complicated cases exactly right
 other than by consulting pg_dump | pg_restore -l output.  Which makes
 the use-case kind of dubious to me.

 In which case would the catalogId, i.e. (tableoid, oid) not be unique?

Catalog OID + object OID would be unique, but surely we don't want to
make users deal with that when specifying the objects to be dumped.

Actually, what occurs to me to wonder is whether the facility has to be
guaranteed unique at all.  If for instance you have a group of overloaded
functions, is there really a big use-case for dumping just one and not
the whole group?  Even if you think there's some use for it, is it big
enough to justify a quantum jump in the complexity of the feature?

Here's a radically simplified proposal: provide a switch
--object-name=pattern
where pattern follows the same rules as in psql \d commands (just
to use something users will already know).  Dump every object,
of any type, whose qualified name matches the pattern, ie the same
objects that would be shown by \d (of the relevant type) using the
pattern.  Accept multiple occurrences of the switch and dump the
union of the matched objects.

(Now that I think about it, this is the same as the existing --table
switch, just generalized to match any object type.)

There would be some cases where this'd dump more than you really want,
but I think it'd make up for that in ease-of-use.  It's not clear to me
that dumping a few extra objects is a big problem except for the case
where the objects are large tables, and in that case if you aren't
specifying a sufficiently exact pattern, it's your own fault not a
limitation of the feature.

BTW, what about dependencies?  One of the main complaints we've heard
about pg_restore's filtering features is that they are not smart about
including things like the indexes of a selected table, or the objects it
depends on (eg, functions referred to in CHECK constraints).  I'm not
sure that a pure name-based filter will be any more usable than
pg_restore's filter, if there is no accounting for dependencies.
The risk of not including dependencies at dump time is vastly higher
than in pg_restore, too, since by the time you realize you omitted
something critical it may be too late to go back and get another dump.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Actually, what occurs to me to wonder is whether the facility has to be
 guaranteed unique at all.  If for instance you have a group of overloaded
 functions, is there really a big use-case for dumping just one and not
 the whole group?  Even if you think there's some use for it, is it big
 enough to justify a quantum jump in the complexity of the feature?

Well, I think that being able to dump one specific function is a
pretty important use case.  But I don't see why that's necessarily
irreconcilable with your suggested syntax of --function=pattern, if we
assume that the pattern is being matched against
pg_proc.oid::regprocedure and define the matching rules such that
foo(text) will not match sfoo(text).  Nothing anyone has proposed
sounds like a quantum jump in complexity over your proposal.

 BTW, what about dependencies?  One of the main complaints we've heard
 about pg_restore's filtering features is that they are not smart about
 including things like the indexes of a selected table, or the objects it
 depends on (eg, functions referred to in CHECK constraints).  I'm not
 sure that a pure name-based filter will be any more usable than
 pg_restore's filter, if there is no accounting for dependencies.

I am 100% positive that it will be a big step forward.  I think the
dependency issue is vaguely interesting, but less important and
orthogonal.  This will be very useful for cherry-picking an object
from one server or database and loading it into another, a need that
comes up with some frequency.  Sure, it'd be handy to be able to
cherry-pick the dependencies automatically, but you don't always need
or even want that - you may know that they are already present in the
target DB.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Simon Riggs
On Wed, 2010-11-24 at 12:48 +0200, Heikki Linnakangas wrote:
 When recovery starts, we fetch the oldestActiveXid from the checkpoint
 record. Let's say that it's 100. We then start replaying WAL records 
 from the Redo pointer, and the first record (heap insert in your case)
 contains an Xid that's much larger than 100, say 10000. We call 
 RecordKnownAssignedXids() to make note that all xids between that
 range are in-progress, but there isn't enough room in the array for
 that.

Agreed.

 Hmm. I'm not sure off the top of my head how to fix that. Perhaps
stash 
 the xids we see during WAL replay in private memory instead of
 putting 
 them in the KnownAssignedXids array until we see the running-xacts
 record.

Moving LogStandbySnapshot() earlier will help but won't solve it fully.

Will think.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 2:10 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Micro-optimizing that search for the non-zero value helps a little bit
 (attached). Reduces the percentage shown by oprofile from about 16% to 12%
 on my laptop.

That micro-optimization looks to me like your compiler leaves
something to be desired.

 The first optimization that occurred to me was to remove the loop
 altogether.

Or make it execute only in assert-enabled mode, perhaps.

This check had some use back in the bad old days, but the ResourceOwner
mechanism has probably removed a lot of the argument for it.

The counter-argument might be that failing to remove a buffer pin would
be disastrous; but I can't see that it'd be worse than failing to remove
an LWLock, and we have no belt-and-suspenders-too loop for those.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Joachim Wieland
On Wed, Nov 24, 2010 at 9:38 AM, Andrew Dunstan and...@dunslane.net wrote:
 It would be unique, but a pain in the neck for users to get. Robert's idea
 will have more traction with users.

Whatever approach we use, we need to think about the use case where 1%
of the objects should be dumped, but we should also make sure that you
can more or less easily dump 99% of the objects. Robert's use case is
the 1% use case. Especially for the 99% case, however, pg_dump needs to
create a full list of all available objects, in whatever format, as a
proposal, so that you could just save & edit this list and then delete
what you don't want instead of writing such a list from scratch.

Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Right, that was my impression, too.  But, I think this may be partly a
 case of people talking past each other.  My impression of this
 conversation was a repetition of this sequence:

 A: This syntax is bad.
 B: But it's way faster!

 ...which makes no sense.  However, what I now think is going on here
 is that there are really two separate things that are wished for here
 - a more compact syntax, and a performance improvement.  And taken
 separately, I agree with both of those desires.  PL/pgsql is an
 incredibly clunky language syntactically, and it's also slow.  A patch
 that improves either one of those things has value, whether or not it
 also does the other one.

I understand the desire for nicer syntax, in the abstract.  I'm just
unimpressed by this particular change, mainly because I'm afraid that
it will make syntax-error behaviors worse and foreclose future options
for other changes to FOR.  If it were necessary to change the syntax
to get the performance benefit, I might think that on balance we should
do so; but it isn't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Tom Lane
Dmitriy Igrishin dmit...@gmail.com writes:
 I also want to propose making it possible to dump function definitions
 as CREATE OR REPLACE FUNCTION rather than just CREATE
 FUNCTION (as pg_dump dumps them now).

It's intentional that pg_dump doesn't do that.  Please don't think
that pg_dump is a substitute for \ef.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Actually, what occurs to me to wonder is whether the facility has to be
 guaranteed unique at all.  If for instance you have a group of overloaded
 functions, is there really a big use-case for dumping just one and not
 the whole group?  Even if you think there's some use for it, is it big
 enough to justify a quantum jump in the complexity of the feature?

 Well, I think that being able to dump one specific function is a
 pretty important use case.  But I don't see why that's necessarily
 irreconcilable with your suggested syntax of --function=pattern, if we
 assume that the pattern is being matched against
 pg_proc.oid::regprocedure and define the matching rules such that
 foo(text) will not match sfoo(text).  Nothing anyone has proposed
 sounds like a quantum jump in complexity over your proposal.

It *will* be manifestly harder to use if users have to spell the
argument types just so.  Consider int4 versus integer, varchar versus
character varying (and not character varying(N)), etc etc.  I think
that leaving out the argument types is something we should very strongly
consider here.

 BTW, what about dependencies?  One of the main complaints we've heard
 about pg_restore's filtering features is that they are not smart about
 including things like the indexes of a selected table, or the objects it
 depends on (eg, functions referred to in CHECK constraints).  I'm not
 sure that a pure name-based filter will be any more usable than
 pg_restore's filter, if there is no accounting for dependencies.

 I am 100% positive that it will be a big step forward.

Apparently you haven't been reading pgsql-bugs and pgsql-novice for the
last five or ten years.  These are large problems in practice.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 Whatever approach we use, we need to think about the use case where 1%
 of the objects should be dumped, but we should also make sure that you
 can more or less easily dump 99% of the objects. Robert's use case is
 the 1% use case. Especially for the 99% case, however, pg_dump needs to
 create a full list of all available objects, in whatever format, as a
 proposal, so that you could just save & edit this list and then delete
 what you don't want instead of writing such a list from scratch.

For that I'd suggest an --exclude=pattern switch.  I'm really not
happy with the idea of applying pg_restore's -l then -L workflow
to dumps from a live database.  It's workable for pg_restore because
the dump file doesn't change underneath you between the two runs.
But having to make a list for pg_dump seems like a foot-gun.  Imagine
a DBA who wants to exclude a large log table from his dumps, so he
makes a -l-like list and removes that table, sets up the cron job to
use that list, and forgets about it.  Months later, he finds out that
his backups don't contain $critical-object-added-later.  A static
external list of objects to be dumped just doesn't make sense to me.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Actually, what occurs to me to wonder is whether the facility has to be
 guaranteed unique at all.  If for instance you have a group of overloaded
 functions, is there really a big use-case for dumping just one and not
 the whole group?  Even if you think there's some use for it, is it big
 enough to justify a quantum jump in the complexity of the feature?

 Well, I think that being able to dump one specific function is a
 pretty important use case.  But I don't see why that's necessarily
 irreconcilable with your suggested syntax of --function=pattern, if we
 assume that the pattern is being matched against
 pg_proc.oid::regprocedure and define the matching rules such that
 foo(text) will not match sfoo(text).  Nothing anyone has proposed
 sounds like a quantum jump in complexity over your proposal.

 It *will* be manifestly harder to use if users have to spell the
 argument types just so.  Consider int4 versus integer, varchar versus
 character varying (and not character varying(N)), etc etc.  I think
 that leaving out the argument types is something we should very strongly
 consider here.

I don't see why this is an either/or question.  Can't we make them optional?

 BTW, what about dependencies?  One of the main complaints we've heard
 about pg_restore's filtering features is that they are not smart about
 including things like the indexes of a selected table, or the objects it
 depends on (eg, functions referred to in CHECK constraints).  I'm not
 sure that a pure name-based filter will be any more usable than
 pg_restore's filter, if there is no accounting for dependencies.

 I am 100% positive that it will be a big step forward.

 Apparently you haven't been reading pgsql-bugs and pgsql-novice for the
 last five or ten years.  These are large problems in practice.

That seems like a cheap shot, since you already know that I haven't
been reading any of the mailing lists for that long.  I have, however,
been using PostgreSQL for that long, and I've hit this problem myself.
 I don't say that being able to dump an exact object and nothing more
will solve every use case, but I do say it's useful, at least to me.
I've wanted it many times.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The first optimization that occurred to me was to remove the loop
 altogether.

 Or make it execute only in assert-enabled mode, perhaps.

 This check had some use back in the bad old days, but the ResourceOwner
 mechanism has probably removed a lot of the argument for it.

Yeah, that's what I was thinking - this code would have been a good
backstop when our cleanup mechanisms were not as robust as they seem
to be today.  But making the check execute only in assert-enabled mode
doesn't seem right, since the check actually acts to mask other coding
errors, rather than reveal them.  Maybe we replace the check with one
that only occurs in an Assert-enabled build and just loops through and
does Assert(PrivateRefCount[i] == 0).  I'm not sure exactly where this
gets called in the shutdown sequence, though - is it sensible to
Assert() here?
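
For concreteness, a sketch of the assert-only form being discussed (with
stand-in globals so it compiles on its own; the real loop would operate
on bufmgr.c's PrivateRefCount array and use the backend's Assert macro):

#include <assert.h>

/* Stand-ins for bufmgr.c's globals, for the sketch only. */
static int	NBuffers = 16384;
static int *PrivateRefCount;

static void
AtProcExit_Buffers_check(void)
{
#ifdef USE_ASSERT_CHECKING
	int		i;

	/*
	 * Any pin still held at process exit is a coding error, so report
	 * it with an assertion failure instead of silently releasing it
	 * as the old loop did.
	 */
	for (i = 0; i < NBuffers; i++)
		assert(PrivateRefCount[i] == 0);
#endif
}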

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or make it execute only in assert-enabled mode, perhaps.

 But making the check execute only in assert-enabled mode
 doesn't seem right, since the check actually acts to mask other coding
 errors, rather than reveal them.  Maybe we replace the check with one
 that only occurs in an Assert-enabled build and just loops through and
 does Assert(PrivateRefCount[i] == 0).

Yeah, that would be sensible.  There is precedent for this elsewhere
too; I think there's a similar setup for checking buffer refcounts
during transaction cleanup.

 I'm not sure exactly where this
 gets called in the shutdown sequence, though - is it sensible to
 Assert() here?

Assert is sensible anywhere.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested easy TODO: pg_dump --from-list

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It *will* be manifestly harder to use if users have to spell the
 argument types just so.  Consider int4 versus integer, varchar versus
 character varying (and not character varying(N)), etc etc.  I think
 that leaving out the argument types is something we should very strongly
 consider here.

 I don't see why this is an either/or question.  Can't we make them optional?

That might work.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-24 Thread Pavel Stehule
Hello

2010/11/24 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 Right, that was my impression, too.  But, I think this may be partly a
 case of people talking past each other.  My impression of this
 conversation was a repetition of this sequence:

 A: This syntax is bad.
 B: But it's way faster!

 ...which makes no sense.  However, what I now think is going on here
 is that there are really two separate things that are wished for here
 - a more compact syntax, and a performance improvement.  And taken
 separately, I agree with both of those desires.  PL/pgsql is an
 incredibly clunky language syntactically, and it's also slow.  A patch
 that improves either one of those things has value, whether or not it
 also does the other one.

 I understand the desire for nicer syntax, in the abstract.  I'm just
 unimpressed by this particular change, mainly because I'm afraid that
 it will make syntax-error behaviors worse and foreclose future options
 for other changes to FOR.  If it were necessary to change the syntax
 to get the performance benefit, I might think that on balance we should
 do so; but it isn't.


I am for any readable syntax. It doesn't have to be FOR-IN-ARRAY. I
understand the problem with syntax-error checking. But I am not sure
that some different loop with a control variable would be any less ugly
in the language. Can't we rewrite the parsing of the FOR clause to be
more robust? I agree with you that there may be other requests in the
future. And if I remember well, there were only a few changes to other
statements (at the parser level) and significant changes to FOR.

Probably some hypothetical statement would be (my opinion):

FOR var [, vars]
UNKNOWN SYNTAX
LOOP
  ...
END LOOP;

PL/SQL uses an enhanced FOR:

FOR var IN collection.first .. collection.last
LOOP
END LOOP;

In my view, introducing a new keyword would be a higher risk, so I
would not do that.

Regards

Pavel Stehule



                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements

2010-11-24 Thread Maciek Sakrejda
Result is oid=23, format=(0) T, value = 0x00,0x00,0x00,0x02

What do you mean regarding the format? Are you just inferring that
from the data? If memory serves, the format of a particular column is
not specified anywhere other than the RowDescription, and according to
your JDBC log output above, the server is telling you the format is
text (1) (which is your point--it doesn't match the resulting
data--but I want to make sure we're clear on what's actually going
on).

Also, can you narrow this down to a simple, self-contained test case
(with code)? Even if it's against a custom driver build, that would be
easier to investigate.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 11:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or make it execute only in assert-enabled mode, perhaps.

 But making the check execute only in assert-enabled mode
 doesn't seem right, since the check actually acts to mask other coding
 errors, rather than reveal them.  Maybe we replace the check with one
 that only occurs in an Assert-enabled build and just loops through and
 does Assert(PrivateRefCount[i] == 0).

 Yeah, that would be sensible.  There is precedent for this elsewhere
 too; I think there's a similar setup for checking buffer refcounts
 during transaction cleanup.

 I'm not sure exactly where this
 gets called in the shutdown sequence, though - is it sensible to
 Assert() here?

 Assert is sensible anywhere.

OK, patch attached.  Here's what oprofile output looks like with this applied:

3505 10.4396  libc-2.11.2.so   memset
2051  6.1089  libc-2.11.2.so   memcpy
1686  5.0217  postgres AllocSetAlloc
1642  4.8907  postgres hash_search_with_hash_value
1247  3.7142  libc-2.11.2.so   _int_malloc
1096  3.2644  libc-2.11.2.so   fread
855   2.5466  ld-2.11.2.so do_lookup_x
723   2.1535  ld-2.11.2.so _dl_fixup
645   1.9211  ld-2.11.2.so strcmp
620   1.8467  postgres MemoryContextAllocZero

Somehow I don't think I'm going to get much further with this without
figuring out how to get oprofile to cough up a call graph.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


AtProcExit_Buffers.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 OK, patch attached.

Two comments:

1. A comment would help, something like "Assert we released all buffer pins".

2. AtProcExit_LocalBuffers should be redone the same way, for
consistency (it likely won't make any performance difference).
Note the comment for AtProcExit_LocalBuffers, too; that probably
needs to be changed along the lines of "If we missed any, and
assertions aren't enabled, we'll fail later in DropRelFileNodeBuffers
while trying to drop the temp rels".
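
The local-buffer counterpart might then look like this sketch (again
with stand-in globals; the real array is localbuf.c's LocalRefCount):

#include <assert.h>

/* Stand-ins for localbuf.c's globals, for the sketch only. */
static int	NLocBuffer = 1024;
static int *LocalRefCount;

static void
AtProcExit_LocalBuffers_check(void)
{
#ifdef USE_ASSERT_CHECKING
	int		i;

	/*
	 * A leaked local-buffer pin that slips past this would otherwise
	 * surface later as a failure in DropRelFileNodeBuffers while
	 * dropping the temp rels.
	 */
	for (i = 0; i < NLocBuffer; i++)
		assert(LocalRefCount[i] == 0);
#endif
}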

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 10:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Right, that was my impression, too.  But, I think this may be partly a
 case of people talking past each other.  My impression of this
 conversation was a repetition of this sequence:

 A: This syntax is bad.
 B: But it's way faster!

 ...which makes no sense.  However, what I now think is going on here
 is that there are really two separate things that are wished for here
 - a more compact syntax, and a performance improvement.  And taken
 separately, I agree with both of those desires.  PL/pgsql is an
 incredibly clunky language syntactically, and it's also slow.  A patch
 that improves either one of those things has value, whether or not it
 also does the other one.

 I understand the desire for nicer syntax, in the abstract.  I'm just
 unimpressed by this particular change, mainly because I'm afraid that
 it will make syntax-error behaviors worse and foreclose future options
 for other changes to FOR.  If it were necessary to change the syntax
 to get the performance benefit, I might think that on balance we should
 do so; but it isn't.

It'd be nicer syntax if there were some way to have the keyword not
adjacent to the arbitrary expression.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.4-vintage problem in postmaster.c

2010-11-24 Thread Stefan Kaltenbrunner

On 11/15/2010 03:24 PM, Alvaro Herrera wrote:

Excerpts from Tom Lane's message of sáb nov 13 19:07:50 -0300 2010:

Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

On 11/13/2010 06:58 PM, Tom Lane wrote:

Just looking at it, I think that the logic in canAcceptConnections got
broken by somebody in 8.4, and then broken some more in 9.0: in some
cases it will return an okay to proceed status without having checked
for TOOMANY children.  Was this system possibly in PM_WAIT_BACKUP or
PM_HOT_STANDBY state?  What version was actually running?



I don't have too many details on the actual setup (working on that) but
the box in question is running 8.4.2 and had no issues before the
upgrade to 8.4 (ie 8.3 was reported to work fine - so a 8.4+ breakage
looks plausible).


Well, this failure would certainly involve a flood of connection
attempts, so it's possible it's a pre-existing bug that they just did
not happen to trip over before.  But the sequence of events that I'm
thinking about is a smart shutdown attempt (SIGTERM to postmaster)
while an online backup is in progress, followed by a flood of
near-simultaneous connection attempts while the backup is still active.


As far as I could gather from Stefan's description, I think this is
pretty unlikely.  It seems to me that the "too many children" error
message is very common in the 8.3 setup already, and the only reason
they have a problem on 8.4 is that it crashes instead.


not sure if that is true - but 8.4 crashes whereas 8.3 just (seems to) 
work - the issue is still there with 8_4_STABLE...


DEBUG3 level output (last few hours - 7MB in size) is available under 
http://www.kaltenbrunner.cc/files/postgresql-2010-11-24_143513.log


From looking at the code I'm not immediately seeing what is going wrong 
here, but maybe somebody else has an idea.



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements

2010-11-24 Thread Radosław Smogura
I didn't describe the log correctly: the 1st attached response is the
normal execution (flags QUERY_SUPPRESS_BEGIN | QUERY_ONESHOT); the 2nd
is the compiled statement (QUERY_SUPPRESS_BEGIN only).

Text format is marked as 0, binary format is 1. 

The 1st shown execution (flags=17) is good: it shows that the result is
sent in binary format, as int4. But in the 2nd one (flags=16) (compiled
statement) the result is bad, because the server described the row as text:

07:52:06.061 (54)  <=BE RowDescription(1)
07:52:06.061 (54) Field(,INT4,4,T)

but the received tuple was clearly in binary format: 0x00, 0x00, 0x00, 0x02.
(Look at the length: it would be 1 or 2 if it were text format with value 2.)
Put simply, the server said you will receive text data, but sent it as a
binary-encoded int.

I've checked this against 8.4 and 9.0.1.

Maciek Sakrejda msakre...@truviso.com Wednesday 24 November 2010 18:02:27
 Result is oid=23, format=(0) T, value = 0x00,0x00,0x00,0x02
 
 What do you mean regarding the format? Are you just inferring that
 from the data? If memory serves, the format of a particular column is
 not specified anywhere other than the RowDescription, and according to
 your JDBC log output above, the server is telling you the format is
 text (1) (which is your point--it doesn't match the resulting
 data--but I want to make sure we're clear on what's actually going
 on).

It's jdbc2.PreparedStatementTest, from the JDBC driver unit tests. I've
exposed the sources here: http://www.rsmogura.net/pgsql/pgjdbc_exp_20101124.tar.gz
(the compiled driver and unit tests are in the parent directory).
All tests not related to this bug have been commented out there; just
run ant test.
 Also, can you narrow this down to a simple, self-contained test case
 (with code)? Even if it's against a custom driver build, that would be
 easier to investigate.
 ---
 Maciek Sakrejda | System Architect | Truviso
 
 1065 E. Hillsdale Blvd., Suite 215
 Foster City, CA 94404
 (650) 242-3500 Main
 www.truviso.com


Kind regards,
Radek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 1:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 OK, patch attached.

 Two comments:

Revised patch attached.

I tried configuring oprofile with --callgraph=10 and then running
oprofile with -c, but it gives kooky looking output I can't interpret.
 For example:

  6  42.8571  postgres record_in
  8  57.1429  postgres pg_perm_setlocale
17035 5.7219  libc-2.11.2.so   memcpy
  17035 100.000  libc-2.11.2.so   memcpy [self]

Not that helpful.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


AtProcExit_Buffers-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements

2010-11-24 Thread Maciek Sakrejda
 Text format is marked as 0, binary format is 1.

Sorry--I misread the docs. This is consistent and something does look fishy.

Thanks for the tarball. I can take a look tonight.
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Gerhard Heift
On Wed, Nov 24, 2010 at 01:20:36PM -0500, Robert Haas wrote:
 On Wed, Nov 24, 2010 at 1:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  OK, patch attached.
 
  Two comments:
 
 Revised patch attached.
 
 I tried configuring oprofile with --callgraph=10 and then running
 oprofile with -c, but it gives kooky looking output I can't interpret.
  For example:
 
    6  42.8571  postgres record_in
    8  57.1429  postgres pg_perm_setlocale
  17035 5.7219  libc-2.11.2.so   memcpy
    17035 100.000  libc-2.11.2.so   memcpy [self]
 
 Not that helpful.  :-(

Have a look at the wiki:
http://wiki.postgresql.org/wiki/Profiling_with_OProfile#Additional_analysis

 Robert Haas

Regards,
  Gerhard Heift


signature.asc
Description: Digital signature


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 19:01:32 Robert Haas wrote:
 Somehow I don't think I'm going to get much further with this without
 figuring out how to get oprofile to cough up a call graph.
I think to do that sensibly you need CFLAGS=-O2 -fno-omit-frame-pointer...

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] function(contants) evaluated for every row

2010-11-24 Thread Bruce Momjian
Someone offlist reported query slowness because we don't convert
function calls with all-constant parameters to constants before we
start a sequential scan:

EXPLAIN SELECT * FROM test WHERE
        x >= to_date('2001-01-01', 'YYYY-MM-DD') AND
        x <= to_date('2001-01-01', 'YYYY-MM-DD');

                          QUERY PLAN
-----------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..58.00 rows=12 width=4)
   Filter: ((x >= to_date('2001-01-01'::text, 'YYYY-MM-DD'::text)) AND
(x <= to_date('2001-01-01'::text, 'YYYY-MM-DD'::text)))
(2 rows)

Notice the to_date()'s were not converted to constants in EXPLAIN so
they are evaluated for every row.  to_date() is marked STABLE.

Is this something we should improve?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(constants) evaluated for every row

2010-11-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Notice the to_date()'s were not converted to constants in EXPLAIN so
 they are evaluated for every row.  to_date() is marked STABLE.

 Is this something we should improve?

No.  This is per expectation.  Only IMMUTABLE functions can be folded to
constants in advance of the query.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(constants) evaluated for every row

2010-11-24 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Notice the to_date()'s were not converted to constants in EXPLAIN so
  they are evaluated for every row.  to_date() is marked STABLE.
 
  Is this something we should improve?
 
 No.  This is per expectation.  Only IMMUTABLE functions can be folded to
 constants in advance of the query.

Well CREATE FUNCTION says about STABLE:

   STABLE indicates that the function cannot modify the
   database, and that within a single table scan it will
   consistently return the same result for the same
   argument values, but that its result could change
   across SQL statements. This is the appropriate
   selection for functions whose results depend on
   database lookups, parameter variables (such as the
   current time zone), etc. (It is inappropriate for
   AFTER triggers that wish to query rows modified by the
   current command.) Also note that the current_timestamp
   family of functions qualify as stable, since their
   values do not change within a transaction.

I realize they can't be converted to constants before the query starts
but is there a reason we can't convert those functions to constants in
the executor before a table scan?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Gerhard Heift ml-postgresql-20081012-3...@gheift.de writes:
 On Wed, Nov 24, 2010 at 01:20:36PM -0500, Robert Haas wrote:
 I tried configuring oprofile with --callgraph=10 and then running
 oprofile with -c, but it gives kooky looking output I can't interpret.

 Have a look at the wiki:
 http://wiki.postgresql.org/wiki/Profiling_with_OProfile#Additional_analysis

The critical piece of information is there now, but it wasn't a minute
ago.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, this time with a patch

2010-11-24 Thread Dimitri Fontaine
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 RECOVERY_COMMAND_FILE is opened twice in the patch. The first time
 is for checking its existence, and the second time is for parsing.
 Instead of the repeat, how about adding a FILE* version of the parser?
 It would also be called from ParseConfigFile() as a subroutine.

   bool ParseConfigFd(FILE *fd, const char *config_file, int depth, ...)

Something like the attached, version 5 of the patch? I've been using the
function name ParseConfigFp because the internal parameter was called fp
in the previous function body. I suppose that could easily be changed at
commit time if necessary.

 BTW, the parser supports "include" and "custom_variable_classes"
 not only for postgresql.conf but for all files. Is that the
 intended behavior?  I think they are harmless, so we don't have
 to change the code; "include" might be useful even in recovery.conf,
 and "custom_variable_classes" will just give an "unrecognized recovery
 parameter" error after all.

Extensions will need the support for custom_variable_classes as it is
done now, and as you say, recovery will just error out. You have to
clean your recovery.conf file and then try again (I just tried and can
confirm).

I personally don't see any harm in having the features in all currently
known use-cases, and I don't see any point in walking an extra mile
here to remove a feature in some cases.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 5024,5200  str_time(pg_time_t tnow)
  }
  
  /*
-  * Parse one line from recovery.conf. 'cmdline' is the raw line from the
-  * file. If the line is parsed successfully, returns true, false indicates
-  * syntax error. On success, *key_p and *value_p are set to the parameter
-  * name and value on the line, respectively. If the line is an empty line,
-  * consisting entirely of whitespace and comments, function returns true
-  * and *keyp_p and *value_p are set to NULL.
-  *
-  * The pointers returned in *key_p and *value_p point to an internal buffer
-  * that is valid only until the next call of parseRecoveryCommandFile().
-  */
- static bool
- parseRecoveryCommandFileLine(char *cmdline, char **key_p, char **value_p)
- {
- 	char	   *ptr;
- 	char	   *bufp;
- 	char	   *key;
- 	char	   *value;
- 	static char *buf = NULL;
- 
- 	*key_p = *value_p = NULL;
- 
- 	/*
- 	 * Allocate the buffer on first use. It's used to hold both the parameter
- 	 * name and value.
- 	 */
- 	if (buf == NULL)
- 		buf = malloc(MAXPGPATH + 1);
- 	bufp = buf;
- 
- 	/* Skip any whitespace at the beginning of line */
- 	for (ptr = cmdline; *ptr; ptr++)
- 	{
- 		if (!isspace((unsigned char) *ptr))
- 			break;
- 	}
- 	/* Ignore empty lines */
- 	if (*ptr == '\0' || *ptr == '#')
- 		return true;
- 
- 	/* Read the parameter name */
- 	key = bufp;
- 	while (*ptr && !isspace((unsigned char) *ptr) &&
- 		   *ptr != '=' && *ptr != '\'')
- 		*(bufp++) = *(ptr++);
- 	*(bufp++) = '\0';
- 
- 	/* Skip to the beginning quote of the parameter value */
- 	ptr = strchr(ptr, '\'');
- 	if (!ptr)
- 		return false;
- 	ptr++;
- 
- 	/* Read the parameter value to *bufp. Collapse any '' escapes as we go. */
- 	value = bufp;
- 	for (;;)
- 	{
- 		if (*ptr == '\'')
- 		{
- 			ptr++;
- 			if (*ptr == '\'')
- 				*(bufp++) = '\'';
- 			else
- 			{
- 				/* end of parameter */
- 				*bufp = '\0';
- 				break;
- 			}
- 		}
- 		else if (*ptr == '\0')
- 			return false;		/* unterminated quoted string */
- 		else
- 			*(bufp++) = *ptr;
- 
- 		ptr++;
- 	}
- 	*(bufp++) = '\0';
- 
- 	/* Check that there's no garbage after the value */
- 	while (*ptr)
- 	{
- 		if (*ptr == '#')
- 			break;
- 		if (!isspace((unsigned char) *ptr))
- 			return false;
- 		ptr++;
- 	}
- 
- 	/* Success! */
- 	*key_p = key;
- 	*value_p = value;
- 	return true;
- }
- 
- /*
   * See if there is a recovery command file (recovery.conf), and if so
   * read in parameters for archive recovery and XLOG streaming.
   *
!  * XXX longer term intention is to expand this to
!  * cater for additional parameters and controls
!  * possibly use a flex lexer similar to the GUC one
   */
  static void
  readRecoveryCommandFile(void)
  {
  	FILE	   *fd;
- 	char		cmdline[MAXPGPATH];
  	TimeLineID	rtli = 0;
  	bool		rtliGiven = false;
! 	bool		syntaxError = false;
  
  	fd = AllocateFile(RECOVERY_COMMAND_FILE, "r");
  	if (fd == NULL)
  	{
  		if (errno == ENOENT)
! 			return;				/* not there, so no archive recovery */
  		ereport(FATAL,
  (errcode_for_file_access(),
   errmsg("could not open recovery command file \"%s\": %m",
  		RECOVERY_COMMAND_FILE)));
  	}
  
! 	/*
! 	 * Parse the file...
! 	 */
! 	while (fgets(cmdline, sizeof(cmdline), fd) != NULL)
! 	{
! 		char	   *tok1;
! 		char	   *tok2;
  
! 		if (!parseRecoveryCommandFileLine(cmdline, &tok1, &tok2))
! 		{
! 			syntaxError = true;
! 			break;
! 		}
! 		if (tok1 == NULL)
! 			continue;
! 
! 		if 

Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Revised patch attached.

The asserts in AtProcExit_LocalBuffers are a bit pointless since
you forgot to remove the code that forcibly zeroes LocalRefCount[]...
otherwise +1.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Full results, and call graph, attached.  The first obvious fact is
 that most of the memset overhead appears to be coming from
 InitCatCache.

AFAICT that must be the palloc0 calls that are zeroing out (mostly)
the hash bucket headers.  I don't see any real way to make that cheaper
other than to cut the initial sizes of the hash tables (and add support
for expanding them later, which is lacking in catcache ATM).  Not
convinced that that creates any net savings --- it might just save
some cycles at startup in exchange for more cycles later, in typical
backend usage.

Making those hashtables expansible wouldn't be a bad thing in itself,
mind you.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(constants) evaluated for every row

2010-11-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I realize they can't be converted to constants before the query starts
 but is there a reason we can't convert those functions to constants in
 the executor before a table scan?

Other than the significant number of cycles that would be wasted (in
most cases) checking for the possibility, probably not.  I'm dubious
that it would average out to a win though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-11-24 Thread Peter Eisentraut
On mån, 2010-11-22 at 11:58 +0900, Itagaki Takahiro wrote:
 * Did you see any performance regression by collation?
 I found a bug in lc_collate_is_c(); result >= 0 should be
 checked before any other checks. SearchSysCache1() here
 would be a performance regression.

That code turned out to be buggy anyway, because it was using the
result cache variable independent of the collation parameter.

I did some profiling with this now.  The problem is that this function
lc_collate_is_c() would need to cache the C-ness property for any
number of collations.  Depending on what call pattern you expect or want
to optimize for, you might end up caching most of the pg_collation
catalog, which is actually the mandate of SearchSysCache, but the
profile shows that SearchSysCache takes a large chunk of the additional
run time.

If I remove that branch altogether, that is, don't treat the C locale
specially at all in the nondefault collation case, then using non-C
locales as nondefault collation is almost as fast as using non-C locales
as the default collation.  However, using the C locale as a nondefault
collation would then be quite slow (still faster than non-C locales).

The solution would perhaps be a custom, lightweight caching system, but
I haven't thought of one yet.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(constants) evaluated for every row

2010-11-24 Thread Marti Raudsepp
On Wed, Nov 24, 2010 at 21:52, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Notice the to_date()'s were not converted to constants in EXPLAIN so
 they are evaluated for every row.  to_date() is marked STABLE.

 No.  This is per expectation.  Only IMMUTABLE functions can be folded to
 constants in advance of the query.

This is something that has bit me in the past.

I realize that STABLE functions cannot be constant-folded at
planning-time. But are there good reasons why it cannot called only
once at execution-time?

As long as *only* STABLE or IMMUTABLE functions are used in a query,
we can assume that settings like timezone won't change in the middle
of the execution of a function, thus STABLE function calls can be
collapsed -- right?

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation, work in progress

2010-11-24 Thread Peter Eisentraut
On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote:
 It seems you've falsified the header comment in
 pathkeys_useful_for_merging(), although I guess it's already false
 because it doesn't seem to have been updated for the NULLS ASC/DESC
 stuff, and the interior comment in right_merge_direction() also needs
 adjusting.  But this might be more than a documentation problem,
 because the choice of merge direction really *is* arbitrary in the
 case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's
 actually true for collation.  If collation affects the definition of
 equality then it certainly isn't true.

I did check that again and didn't arrive at the conclusion that the
comments would need updating either with respect to this patch or some
previous change.  Could you check again and possibly provide a
suggestion?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation, work in progress

2010-11-24 Thread Peter Eisentraut
On ons, 2010-09-22 at 19:44 +0900, Itagaki Takahiro wrote:
 * CREATE TABLE (LIKE table_with_collation) doesn't inherit collations.

This was fixed in the CF2010-11 patch.

 * psql \d needs a separator between collate and not null modifiers.

And this as well.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Full results, and call graph, attached.  The first obvious fact is
 that most of the memset overhead appears to be coming from
 InitCatCache.

 AFAICT that must be the palloc0 calls that are zeroing out (mostly)
 the hash bucket headers.  I don't see any real way to make that cheaper
 other than to cut the initial sizes of the hash tables (and add support
 for expanding them later, which is lacking in catcache ATM).  Not
 convinced that that creates any net savings --- it might just save
 some cycles at startup in exchange for more cycles later, in typical
 backend usage.

 Making those hashtables expansible wouldn't be a bad thing in itself,
 mind you.

The idea I had was to go the other way and say, hey, if these hash
tables can't be expanded anyway, let's put them on the BSS instead of
heap-allocating them.  Any new pages we request from the OS will be
zeroed anyway, but with palloc we then have to re-zero the allocated
block anyway because palloc can return memory that's been used,
freed, and reused.  However, for anything that only needs to be
allocated once and never freed, and whose size can be known at compile
time, that's not an issue.

In fact, it wouldn't be that hard to relax the known at compile time
constraint either.  We could just declare:

char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];

...and then peel off chunks.
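
A minimal standalone sketch of that idea (illustrative only -- the arena
size and helper name are invented, and this is not code from any patch):

/*
 * Toy sketch of a BSS-resident zeroed arena -- not PostgreSQL code.
 * The static array lives in BSS, so the OS guarantees it starts out
 * zeroed; a bump pointer then peels off chunks without any memset.
 * (Error handling and thread safety are omitted.)
 */
#include <stddef.h>
#include <stdio.h>

#define NUM_ZERO_BYTES_WE_NEED (256 * 1024)		/* invented budget */

static char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];
static size_t used;

static void *
peel_off_chunk(size_t size)
{
	void	   *p;

	size = (size + 7) & ~(size_t) 7;	/* keep 8-byte alignment */
	if (used + size > sizeof(lotsa_zero_bytes))
		return NULL;			/* arena exhausted */
	p = lotsa_zero_bytes + used;
	used += size;
	return p;
}

int
main(void)
{
	long	   *counters = peel_off_chunk(128 * sizeof(long));

	printf("first counter starts at %ld\n", counters[0]);	/* prints 0 */
	return 0;
}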

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 21:47:32 Robert Haas wrote:
 On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  Full results, and call graph, attached.  The first obvious fact is
  that most of the memset overhead appears to be coming from
  InitCatCache.
  
  AFAICT that must be the palloc0 calls that are zeroing out (mostly)
  the hash bucket headers.  I don't see any real way to make that cheaper
  other than to cut the initial sizes of the hash tables (and add support
  for expanding them later, which is lacking in catcache ATM).  Not
  convinced that that creates any net savings --- it might just save
  some cycles at startup in exchange for more cycles later, in typical
  backend usage.
  
  Making those hashtables expansible wouldn't be a bad thing in itself,
  mind you.
 
 The idea I had was to go the other way and say, hey, if these hash
 tables can't be expanded anyway, let's put them on the BSS instead of
 heap-allocating them.  Any new pages we request from the OS will be
 zeroed anyway, but with palloc we then have to re-zero the allocated
 block anyway because palloc can return memory that's been used,
 freed, and reused.  However, for anything that only needs to be
 allocated once and never freed, and whose size can be known at compile
 time, that's not an issue.
 
 In fact, it wouldn't be that hard to relax the known at compile time
 constraint either.  We could just declare:
 
 char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];
 
 ...and then peel off chunks.
Won't this just cause loads of additional pagefaults after fork() when those 
pages are used the first time and then a second time when first written to (to 
copy it)?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote:
 On Wednesday 24 November 2010 21:47:32 Robert Haas wrote:
 On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  Full results, and call graph, attached.  The first obvious fact is
  that most of the memset overhead appears to be coming from
  InitCatCache.
 
  AFAICT that must be the palloc0 calls that are zeroing out (mostly)
  the hash bucket headers.  I don't see any real way to make that cheaper
  other than to cut the initial sizes of the hash tables (and add support
  for expanding them later, which is lacking in catcache ATM).  Not
  convinced that that creates any net savings --- it might just save
  some cycles at startup in exchange for more cycles later, in typical
  backend usage.
 
  Making those hashtables expansible wouldn't be a bad thing in itself,
  mind you.

 The idea I had was to go the other way and say, hey, if these hash
 tables can't be expanded anyway, let's put them on the BSS instead of
 heap-allocating them.  Any new pages we request from the OS will be
 zeroed anyway, but with palloc we then have to re-zero the allocated
 block anyway because palloc can return memory that's been used,
 freed, and reused.  However, for anything that only needs to be
 allocated once and never freed, and whose size can be known at compile
 time, that's not an issue.

 In fact, it wouldn't be that hard to relax the known at compile time
 constraint either.  We could just declare:

 char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];

 ...and then peel off chunks.
 Won't this just cause loads of additional pagefaults after fork() when those
 pages are used the first time and then a second time when first written to (to
 copy it)?

Aren't we incurring those page faults anyway, for whatever memory
palloc is handing out?  The heap is no different from bss; we just
move the pointer with sbrk().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Regex code versus Unicode chars beyond codepoint 255

2010-11-24 Thread Tom Lane
Bug #5766 points out that we're still not there yet in terms of having
sane behavior for locale-specific regex operations in Unicode encoding.
The reason it's not working is that regc_locale does this to expand
the set of characters that are considered to match [[:alnum:]] :

/*
 * Now compute the character class contents.
 *
 * For the moment, assume that only char codes < 256 can be in these
 * classes.
 */
...
        case CC_ALNUM:
            cv = getcvec(v, UCHAR_MAX, 0);
            if (cv)
            {
                for (i = 0; i <= UCHAR_MAX; i++)
                {
                    if (pg_wc_isalnum((chr) i))
                        addchr(cv, (chr) i);
                }
            }
            break;

This is a leftover from when we weren't trying to behave sanely for
multibyte encodings.  Now that we are, it's clearly not good enough.
But iterating up to many thousands in this loop isn't too appetizing
from a performance standpoint.

I looked at the equivalent place in Tcl, and I see that what they're
currently doing is they have a hard-wired list of all the Unicode
code points that are classified as alnum, punct, etc.  We could
duplicate that (and use it only if encoding is UTF8), but it seems
kind of ugly, and it doesn't respect the idea that the locale setting
ought to control which characters are considered to be in each class.

Another possibility is to take those lists but apply iswpunct() and
friends to the values, including only code points that pass in the
finished set.  So what you get is the intersection of the Unicode list
and the locale behavior.
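
A standalone sketch of that intersection approach (illustrative only --
sample_alnum[] stands in for the hard-wired Unicode list, and this is
not the actual regc_locale.c code):

/*
 * Keep only the code points from a precomputed "alnum" list that the
 * current locale's iswalnum() also accepts.
 */
#include <locale.h>
#include <stdio.h>
#include <wctype.h>

static const wint_t sample_alnum[] = {
	0x0041, 0x0061, 0x00C5, 0x0391, 0x0430, 0x4E00	/* A a Å Α а 一 */
};

int
main(void)
{
	int			i;
	int			n = sizeof(sample_alnum) / sizeof(sample_alnum[0]);

	setlocale(LC_ALL, "");		/* classification follows the environment */

	for (i = 0; i < n; i++)
	{
		/* intersection: the list's opinion AND the locale's opinion */
		if (iswalnum(sample_alnum[i]))
			printf("U+%04X accepted\n", (unsigned) sample_alnum[i]);
	}
	return 0;
}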

Some of the performance pressure could be taken off if we cached
the results instead of recomputing them every time a regex uses
the character classification; but I'm not sure how much that would
save.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote:
 The idea I had was to go the other way and say, hey, if these hash
 tables can't be expanded anyway, let's put them on the BSS instead of
 heap-allocating them.

 Won't this just cause loads of additional pagefaults after fork() when those
 pages are used the first time and then a second time when first written to 
 (to
 copy it)?

 Aren't we incurring those page faults anyway, for whatever memory
 palloc is handing out?  The heap is no different from bss; we just
 move the pointer with sbrk().

I think you're missing the real point, which that the cost you're
measuring here probably isn't so much memset() as faulting in large
chunks of address space.  Avoiding the explicit memset() likely will
save little in real runtime --- it'll just make sure the initial-touch
costs are more distributed and harder to measure.  But in any case I
think this idea is a nonstarter because it gets in the way of making
those hashtables expansible, which we *do* need to do eventually.

(You might be able to confirm or disprove this theory if you ask
oprofile to count memory access stalls instead of CPU clock cycles...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 21:54:53 Robert Haas wrote:
 On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote:
  On Wednesday 24 November 2010 21:47:32 Robert Haas wrote:
  On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Robert Haas robertmh...@gmail.com writes:
   Full results, and call graph, attached.  The first obvious fact is
   that most of the memset overhead appears to be coming from
   InitCatCache.
   
   AFAICT that must be the palloc0 calls that are zeroing out (mostly)
   the hash bucket headers.  I don't see any real way to make that
   cheaper other than to cut the initial sizes of the hash tables (and
   add support for expanding them later, which is lacking in catcache
   ATM).  Not convinced that that creates any net savings --- it might
   just save some cycles at startup in exchange for more cycles later,
   in typical backend usage.
   
   Making those hashtables expansible wouldn't be a bad thing in itself,
   mind you.
  
  The idea I had was to go the other way and say, hey, if these hash
  tables can't be expanded anyway, let's put them on the BSS instead of
  heap-allocating them.  Any new pages we request from the OS will be
  zeroed anyway, but with palloc we then have to re-zero the allocated
  block anyway because palloc can return memory that's been used,
  freed, and reused.  However, for anything that only needs to be
  allocated once and never freed, and whose size can be known at compile
  time, that's not an issue.
  
  In fact, it wouldn't be that hard to relax the known at compile time
  constraint either.  We could just declare:
  
  char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED];
  
  ...and then peel off chunks.
  
  Won't this just cause loads of additional pagefaults after fork() when
  those pages are used the first time and then a second time when first
  written to (to copy it)?
 
 Aren't we incurring those page faults anyway, for whatever memory
 palloc is handing out?  The heap is no different from bss; we just
 move the pointer with sbrk().
Yes, but only once. Also scrubbing a page is faster than copying it... (and 
there were patches floating around to do that in advance, not sure if they got 
integrated into mainline linux)

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I did some profiling of pgbench -j 36 -c 36 -T 500 banging on my
 two-core desktop box - with synchronous_commit turned off to keep the
 fsyncs from dominating the profile - and got these results:

 29634 4.7124  postgres base_yyparse

Seems like pgbench is a poster child for the value of prepared
statements.  Have you tried it with -M prepared?

I'd take this with a grain of salt as to whether it's representative of
real applications, of course.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:
 I'd like to get access to a box with (a lot) more cores, to see
 whether the lock stuff moves up in the profile.  A big chunk of that
 hash_search_with_hash_value overhead is coming from
 LockAcquireExtended.  The __strcmp_sse2 is almost entirely parsing
 overhead.  In general, I'm not sure there's much hope for reducing the
 parsing overhead, although ScanKeywordLookup() can certainly be done
 better.  XLogInsert() is spending a lot of time doing CRC's.
 LWLockAcquire() is dropping cycles in many different places.
I can get you profiles of machines with up to 24 real cores; unfortunately I 
can't give access away.

Regarding CRCs:
I spent some time optimizing these, as you might remember. The wall I hit 
optimizing it benefit-wise is that the single CRC calls (4 for a non-indexed 
single-row insert on a table with 1 column inside a transaction) are just too 
damn small to get more efficient. It's causing pipeline stalls all over...
(21, 5, 1, 28 bytes).

I have a very preliminary patch calculating the CRC over the whole thing in 
one go if it can do so (no switch, no xl buffers wraparound), but it's highly 
ugly as it needs to read from the xl insert buffers and then reinsert the crc 
at the correct position.
While it shows a noticeable improvement, that doesn't seem to be a good way to 
go. It could be made to work properly though.

I played around with some ideas to do that more nicely, but none were 
gratifying.
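
For what it's worth, the property that makes a one-pass computation
possible -- chunked CRC updates and a single pass over the whole record
agree -- is easy to demonstrate in isolation. A sketch with a generic
bitwise CRC-32 (not PostgreSQL's WAL CRC implementation), borrowing the
fragment sizes mentioned above:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* generic reflected CRC-32; chaining works because the final and
 * initial complements cancel between consecutive calls */
static uint32_t
crc32_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	int			k;

	crc = ~crc;
	while (len--)
	{
		crc ^= *p++;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1)));
	}
	return ~crc;
}

int
main(void)
{
	uint8_t		rec[21 + 5 + 1 + 28];
	uint32_t	chunked = 0;
	uint32_t	oneshot;

	memset(rec, 0xAB, sizeof(rec));

	/* four small updates, as the insert path does today... */
	chunked = crc32_update(chunked, rec, 21);
	chunked = crc32_update(chunked, rec + 21, 5);
	chunked = crc32_update(chunked, rec + 26, 1);
	chunked = crc32_update(chunked, rec + 27, 28);

	/* ...versus a single pass over the whole record */
	oneshot = crc32_update(0, rec, sizeof(rec));

	printf("chunked=%08x one-shot=%08x\n", chunked, oneshot);	/* equal */
	return 0;
}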

Regarding LWLockAcquire costs: 
Yes, it's pretty noticeable - on loads of different usages. On a bunch of 
production machines it's the second (behind XLogInsert), on some the most 
expensive function. Most of the time 

All of those machines are Nehalems though, so the image may be a bit 
distorted.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote:
 
 Won't this just cause loads of additional pagefaults after fork() when
 those pages are used the first time and then a second time when first
 written to (to copy it)?
 
 Aren't we incurring those page faults anyway, for whatever memory
 palloc is handing out?  The heap is no different from bss; we just
 move the pointer with sbrk().
 Yes, but only once. Also scrubbing a page is faster than copying it... (and 
 there were patches floating around to do that in advance, not sure if they 
 got 
 integrated into mainline linux)

I'm not following - can you elaborate?

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 22:14:04 Andres Freund wrote:
 On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:
  I'd like to get access to a box with (a lot) more cores, to see
  whether the lock stuff moves up in the profile.  A big chunk of that
  hash_search_with_hash_value overhead is coming from
  LockAcquireExtended.  The __strcmp_sse2 is almost entirely parsing
  overhead.  In general, I'm not sure there's much hope for reducing the
  parsing overhead, although ScanKeywordLookup() can certainly be done
  better.  XLogInsert() is spending a lot of time doing CRC's.
  LWLockAcquire() is dropping cycles in many different places.
 
 I can get you profiles of machines with up to 24 real cores; unfortunately
 I can't give access away.
 
 Regarding CRCs:
 I spent some time optimizing these, as you might remember. The wall I hit
 optimizing it benefit-wise is that the single CRC calls (4 for a
 non-indexed single-row insert on a table with 1 column inside a
 transaction) are just too damn small to get more efficient. It's causing
 pipeline stalls all over... (21, 5, 1, 28 bytes).
 
 I have a very preliminary patch calculating the CRC over the whole thing in
 one go if it can do so (no switch, no xl buffers wraparound), but it's
 highly ugly as it needs to read from the xl insert buffers and then
 reinsert the crc at the correct position.
 While it shows a noticeable improvement, that doesn't seem to be a good way
 to go. It could be made to work properly though.
 
 I played around with some ideas to do that more nicely, but none were
 gratifying.
 
 Regarding LWLockAcquire costs:
 Yes, it's pretty noticeable - on loads of different usages. On a bunch of
 production machines it's the second (behind XLogInsert), on some the most
 expensive function. Most of the time
AllocSetAlloc is the third, battling with hash_search_with_hash_value. To 
complete that sentence...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote:
 Yes, but only once. Also scrubbing a page is faster than copying it... (and 
 there were patches floating around to do that in advance, not sure if they 
 got 
 integrated into mainline linux)

 I'm not following - can you elaborate?

I think Andres is saying that bss space isn't optimized during a fork
operation: it'll be propagated to the child as copy-on-write pages.
Dunno if that's true or not, but if it is, it'd be a good reason to
avoid the scheme you're suggesting.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 22:18:08 Robert Haas wrote:
 On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote:
  Won't this just cause loads of additional pagefaults after fork() when
  those pages are used the first time and then a second time when first
  written to (to copy it)?
  
  Aren't we incurring those page faults anyway, for whatever memory
  palloc is handing out?  The heap is no different from bss; we just
  move the pointer with sbrk().
  
  Yes, but only once. Also scrubbing a page is faster than copying it...
  (and there were patches floating around to do that in advance, not sure
  if they got integrated into mainline linux)
 I'm not following - can you elaborate?
When forking, the memory mapping of the process is copied - the actual pages 
are not. When a page is first accessed, the page fault handler will set up a 
mapping to the old page and mark it as shared. When it is later written to, it 
will fault again and copy the page.

In contrast, if you access a page for the first time after an sbrk (or mmap, it 
doesn't matter), a new page will get scrubbed and a mapping will get set up.
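
A toy way to observe the difference, assuming Linux (minor-fault counts
via getrusage(); error checks omitted; a standalone demo, not PostgreSQL
code):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define ARENA (64 * 1024 * 1024)

static double
now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

static long
minor_faults(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_minflt;
}

int
main(void)
{
	char	   *arena = mmap(NULL, ARENA, PROT_READ | PROT_WRITE,
							 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	memset(arena, 1, ARENA);	/* parent touches every page before fork */

	if (fork() == 0)
	{
		long		f = minor_faults();
		double		t = now();

		memset(arena, 2, ARENA);	/* COW: fault and copy each page */
		printf("inherited pages: %.3fs, %ld faults\n", now() - t,
			   minor_faults() - f);

		char	   *fresh = mmap(NULL, ARENA, PROT_READ | PROT_WRITE,
								 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		f = minor_faults();
		t = now();
		memset(fresh, 2, ARENA);	/* fresh: fault and zero-fill each page */
		printf("fresh pages:     %.3fs, %ld faults\n", now() - t,
			   minor_faults() - f);
		_exit(0);
	}
	wait(NULL);
	return 0;
}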

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 22:25:45 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote:
  Yes, but only once. Also scrubbing a page is faster than copying it...
  (and there were patches floating around to do that in advance, not sure
  if they got integrated into mainline linux)
  
  I'm not following - can you elaborate?
 
 I think Andres is saying that bss space isn't optimized during a fork
 operation: it'll be propagated to the child as copy-on-write pages.
 Dunno if that's true or not, but if it is, it'd be a good reason to
 avoid the scheme you're suggesting.
AFAIR nearly all pages are propagated with copy-on-write semantics.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 (You might be able to confirm or disprove this theory if you ask
 oprofile to count memory access stalls instead of CPU clock cycles...)

I don't see an event for that.

# opcontrol --list-events | grep STALL
INSTRUCTION_FETCH_STALL: (counter: all)
DISPATCH_STALLS: (counter: all)
DISPATCH_STALL_FOR_BRANCH_ABORT: (counter: all)
DISPATCH_STALL_FOR_SERIALIZATION: (counter: all)
DISPATCH_STALL_FOR_SEGMENT_LOAD: (counter: all)
DISPATCH_STALL_FOR_REORDER_BUFFER_FULL: (counter: all)
DISPATCH_STALL_FOR_RESERVATION_STATION_FULL: (counter: all)
DISPATCH_STALL_FOR_FPU_FULL: (counter: all)
DISPATCH_STALL_FOR_LS_FULL: (counter: all)
DISPATCH_STALL_WAITING_FOR_ALL_QUIET: (counter: all)
DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC: (counter: all)

# opcontrol --list-events | grep MEMORY
MEMORY_REQUESTS: (counter: all)
MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS: (counter: all)
MEMORY_CONTROLLER_SLOT_MISSED: (counter: all)
MEMORY_CONTROLLER_TURNAROUNDS: (counter: all)
MEMORY_CONTROLLER_BYPASS_COUNTER_SATURATION: (counter: all)
CPU_IO_REQUESTS_TO_MEMORY_IO: (counter: all)
MEMORY_CONTROLLER_REQUESTS: (counter: all)

Ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 (You might be able to confirm or disprove this theory if you ask
 oprofile to count memory access stalls instead of CPU clock cycles...)

 I don't see an event for that.

You probably want something involving cache misses.  The event names
vary depending on just which CPU you've got.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] duplicate connection failure messages

2010-11-24 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   I assume you are suggesting to use our inet_net_ntop() even if the
   system has inet_ntop().
  
  If you're going to have code to do the former, it doesn't seem to be
  worth the trouble to also have code that does the latter ...
 
 OK, we will not call inet_ntop() at all.  I moved the CIDR part of
 adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining
 net part to /port/inet_net_ntop.c.
 
 I then changed all uses of inet_ntoa to use inet_net_ntop().  While this
 churn would perhaps not be warranted just to allow for better error
 messages, I found pg_getaddrinfo_all() being called from
 libpq::connectDBStart(), which makes it not thread-safe.  I am not
 excited about backpatching it but it is a threading bug.
 
 The output is as expected:
 
	  $ psql -h localhost test
	  psql: could not connect to server: Connection refused
	  	Is the server running on host "localhost" (127.0.0.1) and accepting
	  	TCP/IP connections on port 5432?
	  $ psql -h 127.0.0.1 test
	  psql: could not connect to server: Connection refused
	  	Is the server running on host "127.0.0.1" and accepting
	  	TCP/IP connections on port 5432?

Applied.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 23:03:48 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  (You might be able to confirm or disprove this theory if you ask
  oprofile to count memory access stalls instead of CPU clock cycles...)
  
  I don't see an event for that.
 
 You probably want something involving cache misses.  The event names
 vary depending on just which CPU you've got.
Or some BUS OUTSTANDING event.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 5:15 PM, Andres Freund and...@anarazel.de wrote:
 On Wednesday 24 November 2010 23:03:48 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  (You might be able to confirm or disprove this theory if you ask
  oprofile to count memory access stalls instead of CPU clock cycles...)
 
  I don't see an event for that.

 You probably want something involving cache misses.  The event names
 vary depending on just which CPU you've got.
 Or some BUS OUTSTANDING event.

I don't see anything for BUS OUTSTANDING.  For CACHE and MISS I have
some options:

# opcontrol --list-events | grep CACHE
DATA_CACHE_ACCESSES: (counter: all)
DATA_CACHE_MISSES: (counter: all)
DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE: (counter: all)
DATA_CACHE_REFILLS_FROM_NORTHBRIDGE: (counter: all)
DATA_CACHE_LINES_EVICTED: (counter: all)
LOCKED_INSTRUCTIONS_DCACHE_MISSES: (counter: all)
L2_CACHE_MISS: (counter: all)
L2_CACHE_FILL_WRITEBACK: (counter: all)
INSTRUCTION_CACHE_FETCHES: (counter: all)
INSTRUCTION_CACHE_MISSES: (counter: all)
INSTRUCTION_CACHE_REFILLS_FROM_L2: (counter: all)
INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM: (counter: all)
INSTRUCTION_CACHE_VICTIMS: (counter: all)
INSTRUCTION_CACHE_INVALIDATED: (counter: all)
CACHE_BLOCK_COMMANDS: (counter: all)
READ_REQUEST_L3_CACHE: (counter: all)
L3_CACHE_MISSES: (counter: all)
IBS_FETCH_ICACHE_MISSES: (ext: ibs_fetch)
IBS_FETCH_ICACHE_HITS: (ext: ibs_fetch)
IBS_OP_DATA_CACHE_MISS: (ext: ibs_op)
IBS_OP_NB_LOCAL_CACHE: (ext: ibs_op)
IBS_OP_NB_REMOTE_CACHE: (ext: ibs_op)
IBS_OP_NB_CACHE_MODIFIED: (ext: ibs_op)
IBS_OP_NB_CACHE_OWNED: (ext: ibs_op)
IBS_OP_NB_LOCAL_CACHE_LAT: (ext: ibs_op)
IBS_OP_NB_REMOTE_CACHE_LAT: (ext: ibs_op)

# opcontrol --list-events | grep MISS | grep -v CACHE
L1_DTLB_MISS_AND_L2_DTLB_HIT: (counter: all)
L1_DTLB_AND_L2_DTLB_MISS: (counter: all)
L1_ITLB_MISS_AND_L2_ITLB_HIT: (counter: all)
L1_ITLB_MISS_AND_L2_ITLB_MISS: (counter: all)
MEMORY_CONTROLLER_SLOT_MISSED: (counter: all)
IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS: (ext: ibs_fetch)
IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES: (ext: ibs_fetch)
IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT: (ext: ibs_op)
IBS_OP_L1_L2_DTLB_MISS: (ext: ibs_op)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Jeff Janes
On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
 On Wednesday 24 November 2010 22:14:04 Andres Freund wrote:
 On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:

 Regarding LWLockAcquire costs:
 Yes, it's pretty noticeable - on loads of different usages. On a bunch of
 production machines it's the second (behind XLogInsert), on some the most
 expensive function. Most of the time

 AllocSetAlloc is the third, battling with hash_search_with_hash_value. To
 complete that sentence...

I've played a bit with hash_search_with_hash_value and found that most
of the time is spent on shared hash tables, not private ones.  And the
time attributed to it for the shared hash tables mostly seems to be
due to the time it takes to fight cache lines away from other CPUs.  I
suspect the same thing is true of LWLockAcquire.
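
That cache-line fighting is easy to reproduce in isolation. A toy
false-sharing demo (C11 atomics plus pthreads, build with -pthread; it
has nothing to do with PostgreSQL's hash tables themselves), in which
two threads bump counters that either share a 64-byte line or sit on
separate lines:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ITERS 50000000L

/* adjacent counters: almost certainly on one 64-byte line */
static struct { atomic_long a; atomic_long b; } same;
/* padded counters: forced onto different 64-byte lines */
static struct { atomic_long a; char pad[64]; atomic_long b; } apart;

static void *
bump(void *p)
{
	atomic_long *ctr = p;
	long		i;

	for (i = 0; i < ITERS; i++)
		atomic_fetch_add_explicit(ctr, 1, memory_order_relaxed);
	return NULL;
}

static double
run(atomic_long *x, atomic_long *y)
{
	pthread_t	t1, t2;
	struct timespec s, e;

	clock_gettime(CLOCK_MONOTONIC, &s);
	pthread_create(&t1, NULL, bump, x);
	pthread_create(&t2, NULL, bump, y);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	clock_gettime(CLOCK_MONOTONIC, &e);
	return (e.tv_sec - s.tv_sec) + (e.tv_nsec - s.tv_nsec) / 1e9;
}

int
main(void)
{
	printf("same cache line: %.2fs\n", run(&same.a, &same.b));
	printf("separate lines:  %.2fs\n", run(&apart.a, &apart.b));
	return 0;
}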


Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
 On Wednesday 24 November 2010 22:14:04 Andres Freund wrote:
 On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:

 Regarding LWLockAcquire costs:
 Yes, it's pretty noticeable - on loads of different usages. On a bunch of
 production machines it's the second (behind XLogInsert), on some the most
 expensive function. Most of the time

 AllocSetAlloc is the third, battling with hash_search_with_hash_value. To
 complete that sentence...

 I've played a bit with hash_search_with_hash_value and found that most
 of the time is spent on shared hash tables, not private ones.  And the
 time attributed to it for the shared hash tables mostly seems to be
 due to the time it takes to fight cache lines away from other CPUs.  I
 suspect the same thing is true of LWLockAcquire.

How do you get stats on that?

How big is a typical cache line on modern CPUs?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Andres Freund
On Wednesday 24 November 2010 23:34:46 Robert Haas wrote:
 On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
  On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
  On Wednesday 24 November 2010 22:14:04 Andres Freund wrote:
  On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:
  
  Regarding LWLockAcquire costs:
  Yes, it's pretty noticeable - on loads of different usages. On a bunch
  of production machines it's the second (behind XLogInsert), on some the
  most expensive function. Most of the time
  
  AllocSetAlloc is the third, battling with hash_search_with_hash_value.
  To complete that sentence...
  
  I've played a bit with hash_search_with_hash_value and found that most
  of the time is spent on shared hash tables, not private ones.  And the
  time attributed to it for the shared hash tables mostly seems to be
  due to the time it takes to fight cache lines away from other CPUs.  I
  suspect the same thing is true of LWLockAcquire.
 
 How do you get stats on that?
BTW, if you have some recent kernel I would try to use perf - I find the event 
mappings easier to understand there, but maybe that's just me jumping onto 
bandwagons.

 How big is a typical cache line on modern CPUs?
for the line size
cat /sys/devices/system/cpu/cpu0/cache/index*/coherency_line_size

for the overall size
cat /sys/devices/system/cpu/cpu0/cache/index*/size
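
(The same figure is also available programmatically through a glibc
extension; a minimal sketch assuming Linux/glibc, where 64 bytes is the
typical answer on current x86 hardware:)

#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	/* glibc extension, not portable POSIX */
	printf("L1d line: %ld bytes\n", sysconf(_SC_LEVEL1_DCACHE_LINESIZE));
	return 0;
}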

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I don't see anything for BUS OUTSTANDING.  For CACHE and MISS I have
 some options:

 DATA_CACHE_MISSES: (counter: all)
 L3_CACHE_MISSES: (counter: all)

Those two look promising, though I can't claim to be an expert.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 5:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I don't see anything for BUS OUTSTANDING.  For CACHE and MISS I have
 some options:

 DATA_CACHE_MISSES: (counter: all)
 L3_CACHE_MISSES: (counter: all)

 Those two look promising, though I can't claim to be an expert.

OK.  Thanksgiving is about to interfere with my access to this
machine, but I'll pick this up next week.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines

2010-11-24 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010:
 Remove useless whitespace at end of lines

 This was stuck in the moderation queue because of message size limit (30
 kB).  Is it worth increasing that value?

Evidently we should.  pgindent and copyright-update commits are likely
to be at least this long.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines

2010-11-24 Thread Magnus Hagander
On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010:
 Remove useless whitespace at end of lines

 This was stuck in the moderation queue because of message size limit (30
 kB).  Is it worth increasing that value?

 Evidently we should.  pgindent and copyright-update commits are likely
 to be at least this long.

That's twice a year only - I don't see a big problem moderating those
when it happens...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines

2010-11-24 Thread Bruce Momjian
Magnus Hagander wrote:
 On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote:
  Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010:
  Remove useless whitespace at end of lines
 
  This was stuck in the moderation queue because of message size limit (30
  kB).  Is it worth increasing that value?
 
  Evidently we should.  pgindent and copyright-update commits are likely
  to be at least this long.
 
 That's twice a year only - I don't see a big problem moderating those
 when it happens...

I do love to see my big pgindent commits come through.  You can hear the
wind rustling through the code.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 I've played a bit with hash_search_with_hash_value and found that most
 of the time is spent on shared hash tables, not private ones.  And the
 time attributed to it for the shared hash tables mostly seems to be
 due to the time it takes to fight cache lines away from other CPUs.  I
 suspect the same thing is true of LWLockAcquire.

That squares with some behavior I've seen.  If you run opannotate
you often see ridiculously high time percentages attributed to extremely
trivial C statements.  The explanation seems to be that those places are
where chunks of memory are first touched, and have to be pulled into the
CPU's cache (and, if in shared memory, pulled away from some other CPU).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines

2010-11-24 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 This was stuck in the moderation queue because of message size limit (30
 kB).  Is it worth increasing that value?
 
 Evidently we should.  pgindent and copyright-update commits are likely
 to be at least this long.

 That's twice a year only - I don't see a big problem moderating those
 when it happens...

It's not so much the moderation load, as I don't like being blindsided by
commits that touch everything in sight.  Finding out only when you try
to do git push (as indeed happened to me just this afternoon because of
this patch) is annoying.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Jeff Janes
On Wed, Nov 24, 2010 at 2:34 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 I've played a bit with hash_search_with_hash_value and found that most
 of the time is spent on shared hash tables, not private ones.  And the
 time attributed to it for the shared hash tables mostly seems to be
 due to the time it takes to fight cache lines away from other CPUs.  I
 suspect the same thing is true of LWLockAcquire.

 How do you get stats on that?

I replicated the hash_search_with_hash_value function definition many
times with different names, and changed different parts of the code to
invoke different function names.  (I don't trust profilers to
correctly distribute times over call graphs.  I think most of them
just assume all function calls take the same time, and don't
separately measure the time of calls coming from different parts of
the code.)

Then I took one of them that is heavily used and leaves the hash table
unchanged, and had it invoke the same call several times in a row.
The profiling confirmed that it was called 3 times more often, but the
time spent in it increased by far less than 3 times; I think the
increase in time in that function was only 10% or so (and in
non-profiling code, the total run time did not increase by a
noticeable amount).

The only way I could explain this, other than redundant calls being
optimized away (which the profiler call counts dispute), is caching
effects.


--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -95,7 +95,21 @@ BufTableLookup(BufferTag *tagPtr, uint32 hashcode)
 	BufferLookupEnt *result;
 
 	result = (BufferLookupEnt *)
-		hash_search_with_hash_value(SharedBufHash,
+		hash_search_with_hash_value5(SharedBufHash,
+									 (void *) tagPtr,
+									 hashcode,
+									 HASH_FIND,
+									 NULL);
+
+	result = (BufferLookupEnt *)
+		hash_search_with_hash_value5(SharedBufHash,
+									 (void *) tagPtr,
+									 hashcode,
+									 HASH_FIND,
+									 NULL);
+
+	result = (BufferLookupEnt *)
+		hash_search_with_hash_value5(SharedBufHash,
 									(void *) tagPtr,
 									hashcode,
 									HASH_FIND,

 How big is a typical cache line on modern CPUs?

That I don't know.  I'm more of an experimentalist.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] function(constants) evaluated for every row

2010-11-24 Thread Michael Glaesemann

On Nov 24, 2010, at 15:28 , Marti Raudsepp wrote:

 On Wed, Nov 24, 2010 at 21:52, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Notice the to_date()'s were not converted to constants in EXPLAIN so
 they are evaluated for every row.  to_date() is marked STABLE.
 
 No.  This is per expectation.  Only IMMUTABLE functions can be folded to
 constants in advance of the query.
 
 This is something that has bit me in the past.
 
 I realize that STABLE functions cannot be constant-folded at
 planning-time. But are there good reasons why it cannot called only
 once at execution-time?
 
 As long as *only* STABLE or IMMUTABLE functions are used in a query,
 we can assume that settings like timezone won't change in the middle
 of the execution of a function, thus STABLE function calls can be
 collapsed -- right?

I've seen this as well be a performance issue, in particular with partitioned 
tables. Out of habit I now write functions that always cache the value of the 
function in a variable and use the variable in the actual query to avoid this 
particular gotcha.

Michael Glaesemann
grzm seespotcode net




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Bruce Momjian
Robert Haas wrote:
 With respect to the syntax itself, I have mixed feelings.  On the one
 hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS
 precisely because I believe they handle many common cases that people
 want in real life without much hullabaloo.  But, there's clearly some
 limit to what can reasonably be done this way.  At some point, what
 you really want is some kind of meta-language where you can write
 things like:
 
 IF EXISTS TABLE t1 THEN
ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr;
 END IF;

FYI, I have felt this way for a while.  IF EXISTS seemed like something
that should never have been added as an inline SQL command option; it
just crept in, and kept growing.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] duplicate connection failure messages

2010-11-24 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Wed Nov 24 19:04:30 -0300 2010:
 Bruce Momjian wrote:

  OK, we will not call inet_ntop() at all.  I moved the CIDR part of
  adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining
  net part to /port/inet_net_ntop.c.

 Applied.

This broke dugong in the ecpg tests.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Daniel Farina
On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 With respect to the syntax itself, I have mixed feelings.  On the one
 hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS
 precisely because I believe they handle many common cases that people
 want in real life without much hullabaloo.  But, there's clearly some
 limit to what can reasonably be done this way.  At some point, what
 you really want is some kind of meta-language where you can write
 things like:

 IF EXISTS TABLE t1 THEN
    ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr;
 END IF;

 FYI, I have felt this way for a while.  IF EXISTS seemed like something
 that should never have been added as an inline SQL command option; it
 just crept in, and kept growing.

Okay, that being the case: would it make sense to have pg_dump emit DO
blocks? I have a feeling this might draw fire, but I don't see any
reason why the mechanism would not work to more or less equivalent
effect. Certainly making dumps harder to use for those who insist on
disabling PL/PGSQL is probably a negative side effect, if one can
identify this hypothetical class of person.
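
To be concrete, for the ALTER ... DROP CONSTRAINT case the dump could
emit something along these lines (object names invented):

DO $$
begin
  ALTER TABLE t1 DROP CONSTRAINT t1_constr;
exception
  when undefined_table then null;   -- t1 itself is gone
  when undefined_object then null;  -- the constraint is gone
end;
$$;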

fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Document that a CHECKPOINT before taking a file system snapshot

2010-11-24 Thread Cédric Villemain
2010/11/25 Bruce Momjian br...@momjian.us:
 Document that a CHECKPOINT before taking a file system snapshot can
 reduce recovery time.

I didn't follow that on -hackers, but:

* a checkpoint already takes place inside pg_start_backup (before it
returns control, so before you can start snapshotting)

* we used to issue a checkpoint *before* pg_start_backup to reduce the
pg_start_backup duration in case you have WAL archiving turned off and
full_page_writes off; it reduces the IO contention a bit (and even
if not, I hate seeing pg_start_backup take seconds or minutes to
finish)

How would a checkpoint beforehand reduce recovery time?



 Branch
 --
 master

 Details
 ---
 http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=7276ab5888d85782d988fc297ad2e176c7ad1bca

 Modified Files
 --
 doc/src/sgml/backup.sgml |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)


 --
 Sent via pgsql-committers mailing list (pgsql-committ...@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-committers




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling pgbench

2010-11-24 Thread Robert Haas
On Nov 24, 2010, at 5:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 I've played a bit with hash_search_with_hash_value and found that most
 of the time is spent on shared hash tables, not private ones.  And the
 time attributed to it for the shared hash tables mostly seems to be
 due to the time it takes to fight cache lines away from other CPUs.  I
 suspect the same thing is true of LWLockAcquire.
 
 That squares with some behavior I've seen.  If you run opannotate
 you often see ridiculously high time percentages attributed to extremely
 trivial C statements.  The explanation seems to be that those places are
 where chunks of memory are first touched, and have to be pulled into the
 CPU's cache (and, if in shared memory, pulled away from some other CPU).

Does it hurt that, for example, the BufMappingLocks are consecutive in memory?  
They appear to be among the more heavily contended locks even on my 2-core box.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] duplicate connection failure messages

2010-11-24 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of nov 24 19:04:30 -0300 2010:
  Bruce Momjian wrote:
 
   OK, we will not call inet_ntop() at all.  I moved the CIDR part of
   adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining
   net part to /port/inet_net_ntop.c.
 
  Applied.
 
 This broke dugong in the ecpg tests.

I stopped checking the buildfarm page after a few hours, but I see the
failure now.

I have reviewed the libpq Makefile and I believe I have now properly
added port/inet_net_ntop.c.  I improved the comments as well.

Patch attached and applied.  Thanks for the heads-up.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b327ee5..74ae79a 100644
*** /tmp/mhssze_Makefile	Wed Nov 24 21:42:22 2010
--- src/interfaces/libpq/Makefile	Wed Nov 24 21:04:00 2010
*** endif
*** 27,39 
  # Need to recompile any libpgport object files because we need these
  # object files to use the same compile flags as libpq.  If we used
  # the object files from libpgport, this would not be true on all
! # platforms.
  LIBS := $(LIBS:-lpgport=)
  
  OBJS=	fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \
  	fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o \
  	libpq-events.o \
! 	md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o \
  	$(filter crypt.o getaddrinfo.o inet_aton.o open.o snprintf.o strerror.o strlcpy.o win32error.o, $(LIBOBJS))
  
  ifeq ($(PORTNAME), cygwin)
--- 27,40 
  # Need to recompile any libpgport object files because we need these
  # object files to use the same compile flags as libpq.  If we used
  # the object files from libpgport, this would not be true on all
! # platforms.  We filter some object files so we only use object
! # files configure says we need.
  LIBS := $(LIBS:-lpgport=)
  
  OBJS=	fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \
  	fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o \
  	libpq-events.o \
! 	md5.o ip.o wchar.o encnames.o inet_net_ntop.o noblock.o pgstrcasecmp.o thread.o \
  	$(filter crypt.o getaddrinfo.o inet_aton.o open.o snprintf.o strerror.o strlcpy.o win32error.o, $(LIBOBJS))
  
  ifeq ($(PORTNAME), cygwin)
*** backend_src = $(top_srcdir)/src/backend
*** 80,86 
  # For port modules, this only happens if configure decides the module
  # is needed (see filter hack in OBJS, above).
  
! crypt.c getaddrinfo.c inet_aton.c noblock.c open.c pgstrcasecmp.c snprintf.c strerror.c strlcpy.c thread.c win32error.c pgsleep.c: % : $(top_srcdir)/src/port/%
	rm -f $@ && $(LN_S) $< .
  
  md5.c ip.c: % : $(backend_src)/libpq/%
--- 81,87 
  # For port modules, this only happens if configure decides the module
  # is needed (see filter hack in OBJS, above).
  
! crypt.c getaddrinfo.c inet_aton.c inet_net_ntop.c noblock.c open.c pgstrcasecmp.c snprintf.c strerror.c strlcpy.c thread.c win32error.c pgsleep.c: % : $(top_srcdir)/src/port/%
	rm -f $@ && $(LN_S) $< .
  
  md5.c ip.c: % : $(backend_src)/libpq/%

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Document that a CHECKPOINT before taking a file system snapshot

2010-11-24 Thread Bruce Momjian
Cédric Villemain wrote:
 2010/11/25 Bruce Momjian br...@momjian.us:
  Document that a CHECKPOINT before taking a file system snapshot can
  reduce recovery time.
 
 I didn't follow that on -hackers, but:
 
 * a checkpoint already takes place inside pg_start_backup (before it
 returns control, so before you can start snapshotting)
 
 * we used to issue a checkpoint *before* pg_start_backup to reduce the
 pg_start_backup duration in case you have WAL archiving turned off and
 full_page_writes off; it reduces the IO contention a bit (and even
 if not, I hate seeing pg_start_backup take seconds or minutes to
 finish)
 
 How would a checkpoint beforehand reduce recovery time?

This is for using file system snapshots, not PITR or continuous
archiving.  The checkpoint reduces how much WAL has to be replayed.
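
In other words, right before taking the snapshot you can issue:

CHECKPOINT;  -- flush dirty buffers, so crash recovery on the
             -- snapshot starts from a recent checkpoint

and then trigger the file system snapshot immediately afterwards.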

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Bruce Momjian
Daniel Farina wrote:
 On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote:
  Robert Haas wrote:
  With respect to the syntax itself, I have mixed feelings.  On the one
  hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS
  precisely because I believe they handle many common cases that people
  want in real life without much hullabaloo.  But, there's clearly some
  limit to what can reasonably be done this way.  At some point, what
  you really want is some kind of meta-language where you can write
  things like:
 
  IF EXISTS TABLE t1 THEN
     ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr;
  END IF;
 
  FYI, I have felt this way for a while.  IF EXISTS seemed like something
  that should never have been added as an inline SQL command option; it
  just crept in, and kept growing.
 
 Okay, that being the case: would it make sense to have pg_dump emit DO
 blocks? I have a feeling this might draw fire, but I don't see any
 reason why the mechanism would not work to more or less equivalent
 effect. Certainly making dumps harder to use for those who insist on
 disabling PL/PGSQL is probably a negative side effect, if one can
 identify this hypothetical class of person.

Not being able to recover a dump is a serious problem for a user.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Daniel Farina
On Wed, Nov 24, 2010 at 7:03 PM, Bruce Momjian br...@momjian.us wrote:
 Daniel Farina wrote:
 On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote:
  Robert Haas wrote:
  With respect to the syntax itself, I have mixed feelings.  On the one
  hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS
  precisely because I believe they handle many common cases that people
  want in real life without much hullabaloo.  But, there's clearly some
  limit to what can reasonably be done this way.  At some point, what
  you really want is some kind of meta-language where you can write
  things like:
 
  IF EXISTS TABLE t1 THEN
     ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr;
  END IF;
 
  FYI, I have felt this way for a while.  IF EXISTS seemed like something
  that should never have been added as an inline SQL command option; it
  just crept in, and kept growing.

 Okay, that being the case: would it make sense to have pg_dump emit DO
 blocks? I have a feeling this might draw fire, but I don't see any
 reason why the mechanism would not work to more or less equivalent
 effect. Certainly making dumps harder to use for those who insist on
 disabling PL/PGSQL is probably a negative side effect, if one can
 identify this hypothetical class of person.

 Not being able to recover a dump is a serious problem for a user.

Even if it only involves enabling PLPGSQL to do the restore? Also take
into consideration that plpgsql is enabled by default. A user would
have to change the template database (which, in general, can cause
restores to fail in at least a few other ways) or drop the procedural
language explicitly to make that mechanism not work with a fresh and
normal-looking createdb.

--
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Bruce Momjian
Daniel Farina wrote:
 On Wed, Nov 24, 2010 at 7:03 PM, Bruce Momjian br...@momjian.us wrote:
  Daniel Farina wrote:
  On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote:
   Robert Haas wrote:
   With respect to the syntax itself, I have mixed feelings.  On the one
   hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS
   precisely because I believe they handle many common cases that people
   want in real life without much hullabaloo.  But, there's clearly some
   limit to what can reasonably be done this way.  At some point, what
   you really want is some kind of meta-language where you can write
   things like:
  
   IF EXISTS TABLE t1 THEN
      ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr;
   END IF;
  
   FYI, I have felt this way for a while.  IF EXISTS seemed like something
   that should never have been added as an inline SQL command option; it
   just crept in, and kept growing.
 
  Okay, that being the case: would it make sense to have pg_dump emit DO
  blocks? I have a feeling this might draw fire, but I don't see any
  reason why the mechanism would not work to more or less equivalent
  effect. Certainly making dumps harder to use for those who insist on
  disabling PL/PGSQL is probably a negative side effect, if one can
  identify this hypothetical class of person.
 
  Not being able to recover a dump is a serious problem for a user.
 
 Even if it only involves enabling PLPGSQL to do the restore? Also take
 into consideration that plpgsql is enabled by default. A user would
 have to change the template database (which, in general, can cause
 restores to fail in at least a few other ways) or drop the procedural
 language explicitly to make that mechanism not work with a fresh and
 normal-looking createdb.

What are we adding a pl/pgsql dependency for?  What is the benefit that
will warrant requiring people who disable plpgsql to enable it for
restores?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] duplicate connection failure messages

2010-11-24 Thread Bruce Momjian
bruce wrote:
 Alvaro Herrera wrote:
  Excerpts from Bruce Momjian's message of nov 24 19:04:30 -0300 2010:
   Bruce Momjian wrote:
  
OK, we will not call inet_ntop() at all.  I moved the CIDR part of
adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining
net part to /port/inet_net_ntop.c.
  
   Applied.
  
  This broke dugong in the ecpg tests.
 
 I stopped checking the buildfarm page after a few hours, but I see the
 failure now.
 
 I have reviewed the libpq Makefile and I believe I have now properly
 added port/inet_net_ntop.c.  I improved the comments as well.
 
 Patch attached and applied.  Thanks for the heads-up.

OK, dugong is green now, so I think I got it.  :-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?

2010-11-24 Thread Daniel Farina
On Wed, Nov 24, 2010 at 7:21 PM, Bruce Momjian br...@momjian.us wrote:
 What are we adding a pl/pgsql dependency for?  What is the benefit that
 will warrant requiring people who disable plpgsql to enable it for
 restores?

There are two use cases I want to cover:

1) It should be possible to restore a dump made with --clean on an
empty database without error, so it can be run in a transaction and
the error code can be usefully monitored.

2) It should be possible for a database to be dumped and restored by a
non-superuser, again cleanly, as per 1.

It was easy enough to change all the DROP ... statements to DROP
... IF EXISTS, but the ALTER statements have no equivalent, and thus
the only way for a dump created with --clean to run without error is
to ensure that all table and domain constraints exist prior to
restore.

The obvious mechanisms that have come to mind in this thread are:

* An IF EXISTS variant of ALTER, at least for TABLE and DOMAIN
(although it may be strange to only support it on a couple of types;
see the sketch below)

* Use of anonymous-DO code blocks (the prototype uses this, and this
depends on plpgsql)

* Bizarre things involving creative queries that drop objects as a side
effect; I have not spelled these out because I thought they were too
gross to be given serious consideration. But they might be plpgsql-less,
which would be nice.

Note that in the case where one wants to dump/restore as a
non-superuser, one may not be in a position to conveniently do a
(DROP|CREATE) DATABASE statement to work around the problem.
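
To make the first option concrete, the --clean section of a dump could
then emit something like this (proposed syntax, nothing that exists
today; names invented):

ALTER TABLE IF EXISTS t1 DROP CONSTRAINT IF EXISTS t1_constr;
DROP TABLE IF EXISTS t1;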

--
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation, work in progress

2010-11-24 Thread Robert Haas
On Wed, Nov 24, 2010 at 3:37 PM, Peter Eisentraut pete...@gmx.net wrote:
 On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote:
 It seems you've falsified the header comment in
 pathkeys_useful_for_merging(), although I guess it's already false
 because it doesn't seem to have been updated for the NULLS ASC/DESC
 stuff, and the interior comment in right_merge_direction() also needs
 adjusting.  But this might be more than a documentation problem,
 because the choice of merge direction really *is* arbitrary in the
 case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's
 actually true for collation.  If collation affects the definition of
 equality then it certainly isn't true.

 I did check that again and didn't arrive at the conclusion that the
 comments would need updating either with respect to this patch or some
 previous change.  Could you check again and possibly provide a
 suggestion?

I think that you are right and that my previous comment was erroneous.
 Sorry for the noise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tab completion for view triggers in psql

2010-11-24 Thread Josh Kupershmidt
On Tue, Nov 23, 2010 at 10:21 PM, David Fetter da...@fetter.org wrote:
 Please find attached a patch changing both this and updateable to
 updatable, also per the very handy git grep I just learned about :)

I looked a little more at this patch today. I didn't find any serious
problems, though it would have helped me (and maybe other reviewers)
to have a comprehensive list of the SQL statements which the patch
implements autocompletion for.

Not sure how hard this would be to add in, but the following gets
autocompleted with an INSERT:
  CREATE TRIGGER mytrg BEFORE IN[TAB]
while the following doesn't find any autocompletions:
  CREATE TRIGGER mytrg BEFORE INSERT OR UP[TAB]

Also, this existed before the patch, but this SQL:
  CREATE TRIGGER mytrg INSTEAD OF INSERT ON [TAB]
autocompletes to:
  CREATE TRIGGER mytrg INSTEAD OF INSERT ON SET

not sure how that SET is getting in there -- some incorrect tab
completion match?

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security hooks on object creation

2010-11-24 Thread KaiGai Kohei
The attached patch is a revised version.

- utils/hooks.h was renamed to catalog/objectaccess.h
- The numeric suffix at the tail of InvokeObjectAccessHook0() is gone.
- Fixed a bug in ATExecAddColumn; it gave AttributeRelationId
  to the hook instead of RelationRelationId.

In addition, I found that, as an exception, we didn't put the
post-creation hook on foreign data wrappers, foreign servers and user
mappings. So, I put this hook around their command handlers like any
other object classes.

Thanks,

(2010/11/24 12:07), Robert Haas wrote:
 2010/11/23 KaiGai Koheikai...@ak.jp.nec.com:
 What
 I'm not quite sure about is where to put the definitions you've added
 to a new file utils/hooks.h; I don't feel that's a very appropriate
 location.  It's tempting to put them in utils/acl.h just because this
 is vaguely access-control related and that header is already included
 in most of the right places, but maybe that's too much of a stretch;
 or perhaps catalog/catalog.h, although that doesn't feel quite right
 either.  If we are going to add a new header file, I still don't like
 utils/hooks.h much - it's considerably more generic than can be
 justified by its contents.

 I don't think utils/acl.h is the right long-term place, because we
 intended not to restrict the purpose of these hooks to access controls,
 as you mentioned.

 I think somewhere under the catalog/ directory is a good idea, because
 it hooks events where the user (eventually) wants to modify system
 catalogs.  How about catalog/hooks.h, instead of utils/hooks.h?
 
 Well, if we're going to create a new header file for this, I think it
 should be called something like catalog/objectaccess.h, rather than
 just hooks.h.  But I'd rather reuse something that's already there,
 all things being equal.
 


-- 
KaiGai Kohei kai...@ak.jp.nec.com
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index dcc53e1..9a38207 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -40,6 +40,7 @@
 #include "catalog/index.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_attrdef.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_inherits.h"
@@ -1188,6 +1189,10 @@ heap_create_with_catalog(const char *relname,
 		}
 	}
 
+	/* Post creation of new relation */
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+		   RelationRelationId, relid, 0);
+
 	/*
 	 * Store any supplied constraints and defaults.
 	 *
diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index 8b4f8c6..1ed108f 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -18,6 +18,7 @@
 #include "access/heapam.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
@@ -360,6 +361,10 @@ CreateConstraintEntry(const char *constraintName,
 		DEPENDENCY_NORMAL);
 	}
 
+	/* Post creation of a new constraint */
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+		   ConstraintRelationId, conOid, 0);
+
 	return conOid;
 }
 
diff --git a/src/backend/catalog/pg_conversion.c b/src/backend/catalog/pg_conversion.c
index 9578184..853ed0e 100644
--- a/src/backend/catalog/pg_conversion.c
+++ b/src/backend/catalog/pg_conversion.c
@@ -18,6 +18,7 @@
 #include "access/sysattr.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_conversion.h"
 #include "catalog/pg_conversion_fn.h"
 #include "catalog/pg_namespace.h"
@@ -131,6 +132,10 @@ ConversionCreate(const char *conname, Oid connamespace,
 	recordDependencyOnOwner(ConversionRelationId, HeapTupleGetOid(tup),
 			conowner);
 
+	/* Post creation of a new conversion */
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+		   ConversionRelationId, HeapTupleGetOid(tup), 0);
+
 	heap_freetuple(tup);
 	heap_close(rel, RowExclusiveLock);
 
diff --git a/src/backend/catalog/pg_namespace.c b/src/backend/catalog/pg_namespace.c
index 71ebd7a..978be51 100644
--- a/src/backend/catalog/pg_namespace.c
+++ b/src/backend/catalog/pg_namespace.c
@@ -17,6 +17,7 @@
 #include "access/heapam.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_namespace.h"
 #include "utils/builtins.h"
 #include "utils/rel.h"
@@ -75,5 +76,9 @@ NamespaceCreate(const char *nspName, Oid ownerId)
 	/* Record dependency on owner */
 	recordDependencyOnOwner(NamespaceRelationId, nspoid, ownerId);
 
+	/* Post creation of new schema */
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+		   NamespaceRelationId, nspoid, 0);
+
 	return nspoid;
 }
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 73de672..cd169a6 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -22,6 +22,7 @@
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"

Re: [HACKERS] Label switcher function

2010-11-24 Thread KaiGai Kohei
The attached patch is a revised one.

It provides two hooks: one informs core PG whether the supplied
function needs to be hooked or not; the other is the actual hook,
invoked on prepare, start, end and abort of function invocations.

  typedef bool (*needs_function_call_type)(Oid fn_oid);

  typedef void (*function_call_type)(FunctionCallEventType event,
 FmgrInfo *flinfo, Datum *private);

The hook prototype is slightly modified from Robert's suggestion:
because the FmgrInfo structure contains the OID of the function, it
would be redundant to deliver the OID individually.

The rest of the patch is revised according to the comments.

I also fixed up source code comments that had become incorrect.

Thanks,

(2010/11/18 11:30), Robert Haas wrote:
 2010/11/17 KaiGai Koheikai...@ak.jp.nec.com:
 I revised my patch as I attached.

 The hook function is modified and consolidated as follows:

   typedef enum FunctionCallEventType
   {
  FCET_BE_HOOKED,
  FCET_PREPARE,
  FCET_START,
  FCET_END,
  FCET_ABORT,
   } FunctionCallEventType;

   typedef Datum (*function_call_event_type)(Oid functionId,
 FunctionCallEventType event,
 Datum event_arg);
   extern PGDLLIMPORT function_call_event_type function_call_event_hook;

 Unlike the subject of this e-mail, now it does not focus on only switching
 security labels during the execution of certain functions.
 For example, we may use this hook to track certain functions for security
 auditing, performance tuning, and others.

 In the case of SE-PgSQL, it shall return BoolGetDatum(true), if the target
 function is configured as a trusted procedure, then, this invocation will
 be hooked by fmgr_security_definer. In the first call, it shall compute
 the security context to be assigned during execution on FCET_PREPARE event.
 Then, it switches to the computed label on the FCET_START event, and
 restore it on the FCET_END or FCET_ABORT event.
 
 This seems like it's a lot simpler than before, which is good.  It
 looks to me as though there should really be two separate hooks,
 though, one for what is now FCET_BE_HOOKED and one for everything
 else.  For FCET_BE_HOOKED, you want a function that takes an Oid and
 returns a bool.  For the other event types, the functionId and event
 arguments are OK, but I think you should forget about the save_datum
 stuff and just always pass &fcache->flinfo and &fcache->private.  The
 plugin can get the effect of save_datum by passing around whatever
 state it needs to hold on to using fcache->private.  So:
 
 bool (*needs_function_call_hook)(Oid fn_oid);
 void (*function_call_hook)(Oid fn_oid, FunctionCallEventType event,
 FmgrInfo *flinfo, Datum *private);
 
 Another general comment is that you've not done a very complete job
 updating the comments; there are several of them in fmgr.c that are no
 longer accurate.  Also, please zap the unnecessary whitespace changes.
 
 Thanks,
 


-- 
KaiGai Kohei kai...@ak.jp.nec.com
 contrib/dummy_seclabel/dummy_seclabel.c   |  102 -
 src/backend/optimizer/util/clauses.c  |8 ++
 src/backend/utils/fmgr/fmgr.c |   44 ---
 src/include/fmgr.h|   55 +
 src/test/regress/input/security_label.source  |   28 +++
 src/test/regress/output/security_label.source |   43 ++-
 6 files changed, 267 insertions(+), 13 deletions(-)

diff --git a/contrib/dummy_seclabel/dummy_seclabel.c b/contrib/dummy_seclabel/dummy_seclabel.c
index 8bd50a3..7acb512 100644
--- a/contrib/dummy_seclabel/dummy_seclabel.c
+++ b/contrib/dummy_seclabel/dummy_seclabel.c
@@ -12,14 +12,105 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_proc.h"
 #include "commands/seclabel.h"
 #include "miscadmin.h"
 
 PG_MODULE_MAGIC;
 
+PG_FUNCTION_INFO_V1(dummy_client_label);
+
 /* Entrypoint of the module */
 void _PG_init(void);
 
+/* SQL functions */
+Datum dummy_client_label(PG_FUNCTION_ARGS);
+
+static const char *client_label = "unclassified";
+
+typedef struct {
+	const char *old_label;
+	const char *new_label;
+	Datum		next_private;
+} dummy_stack;
+
+static needs_function_call_type	needs_function_call_next = NULL;
+static function_call_type		function_call_next = NULL;
+
+static bool
+dummy_needs_function_call(Oid fn_oid)
+{
+	char		   *label;
+	bool			result = false;
+	ObjectAddress	object = { .classId = ProcedureRelationId,
+			   .objectId = fn_oid,
+			   .objectSubId = 0 };
+
+	if (needs_function_call_next &&
+		(*needs_function_call_next)(fn_oid))
+		return true;
+
+	label = GetSecurityLabel(&object, "dummy");
+	if (label && strcmp(label, "trusted") == 0)
+		result = true;
+
+	if (label)
+		pfree(label);
+
+	return result;
+}
+
+static void
+dummy_function_call(FunctionCallEventType event,
+	FmgrInfo *flinfo, Datum *private)
+{
+	dummy_stack	   *stack;
+
+	switch (event)
+	{
+		case FCET_PREPARE:
+			

Re: [HACKERS] Extensions, this time with a patch

2010-11-24 Thread Itagaki Takahiro
On Thu, Nov 25, 2010 at 05:02, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Something like the attached, version 5 of the patch? I've been using the
 function name ParseConfigFp because the internal parameter was called fp
 in the previous function body. I suppose that could easily be changed at
 commit time if necessary.

Thanks. I'll move the patch to Ready for Committer.

Here is a list of items for committers, including only cosmetic changes.

- Comments in recovery.conf.sample need to be adjusted:
# (The quotes around the value are NOT optional, but the = is.)
  That seems to have been the only place stating that the quotes
  cannot be omitted.
- We might not need to check the result of ParseConfigFp() because
  it always raises FATAL on errors.
- We could remove the unused argument calling_file in ParseConfigFp().
- I feel struct name_value_pair and ConfigNameValuePair are a bit
  redundant as names. I'd prefer something like ConfigVariable.
- fp might be a better name than FILE *fd. There are 4 usages in xlog.c.


 Extensions will need the support for custom_variable_classes as it is
 done now, and as you say, the recovery will just error out. You have to
 clean your recovery.conf file then try again (I just tried and confirmed).

 I personally don't see any harm to have the features in all currently
 known uses-cases, and I don't see any point in walking an extra mile
 here to remove a feature in some cases.

Sure. We will leave them.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib: auth_delay module

2010-11-24 Thread KaiGai Kohei

(2010/11/19 16:57), KaiGai Kohei wrote:

(2010/11/18 2:17), Robert Haas wrote:

On Wed, Nov 17, 2010 at 10:32 AM, Ross J. Reedstromreeds...@rice.edu wrote:

On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote:

On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Koheikai...@ak.jp.nec.com wrote:

If we don't need a PoC module for each new hooks, I'm not strongly
motivated to push it into contrib tree.
How about your opinion?


I'd say let it go, unless someone else feels strongly about it.


I would use this module (rate limiting new connection attempts) as soon
as I could. Putting a cap on the potential CPU usage a production DB can
suffer from either a blackhat or a developer's configuration mistake
(leaving the port accessible) is definitely useful, even in the face of
max_connections. My production apps already have their connections and
seldom need new ones. They all use CPU though.


If KaiGai updates the code per previous discussion, would you be
willing to take a crack at adding documentation?

P.S. Your email client seems to be setting the Reply-To address to a
ridiculous value.


OK, I'll revise my patch according to the previous discussion.


The attached patch is revised version.

- The logging part within auth_delay was removed. This module now
  focuses on injecting a few seconds' delay after failed authentication.
- Documentation was added, like any other contrib module.
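
For reference, enabling the module would look like this in
postgresql.conf (the value is just an example):

    shared_preload_libraries = 'auth_delay'
    auth_delay.seconds = 2          # delay after each failed attempt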

Thanks,
--
KaiGai Kohei kai...@ak.jp.nec.com
 contrib/Makefile|1 +
 contrib/README  |5 +++
 contrib/auth_delay/Makefile |   14 +++
 contrib/auth_delay/auth_delay.c |   71 
 doc/src/sgml/auth-delay.sgml|   76 +++
 doc/src/sgml/contrib.sgml   |1 +
 doc/src/sgml/filelist.sgml  |1 +
 7 files changed, 169 insertions(+), 0 deletions(-)

diff --git a/contrib/Makefile b/contrib/Makefile
index e1f2a84..5747bcc 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -6,6 +6,7 @@ include $(top_builddir)/src/Makefile.global
 
 SUBDIRS = \
 		adminpack	\
+		auth_delay	\
 		auto_explain	\
 		btree_gin	\
 		btree_gist	\
diff --git a/contrib/README b/contrib/README
index 6d29cfe..a6abd94 100644
--- a/contrib/README
+++ b/contrib/README
@@ -28,6 +28,11 @@ adminpack -
 	File and log manipulation routines, used by pgAdmin
 	by Dave Page dp...@vale-housing.co.uk
 
+auth_delay
+	Add a few seconds' delay after failed authentication, to make
+	brute-force attacks on database passwords more difficult.
+	by KaiGai Kohei kai...@ak.jp.nec.com
+
 auto_explain -
 	Log EXPLAIN output for long-running queries
 	by Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp
diff --git a/contrib/auth_delay/Makefile b/contrib/auth_delay/Makefile
new file mode 100644
index 000..09d2d54
--- /dev/null
+++ b/contrib/auth_delay/Makefile
@@ -0,0 +1,14 @@
+# contrib/auth_delay/Makefile
+
+MODULES = auth_delay
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/auth_delay
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c
new file mode 100644
index 000..746ac4b
--- /dev/null
+++ b/contrib/auth_delay/auth_delay.c
@@ -0,0 +1,71 @@
+/* -
+ *
+ * auth_delay.c
+ *
+ * Copyright (C) 2010, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/auth_delay/auth_delay.c
+ *
+ * -
+ */
+#include "postgres.h"
+
+#include "libpq/auth.h"
+#include "utils/guc.h"
+#include "utils/timestamp.h"
+
+#include <unistd.h>
+
+PG_MODULE_MAGIC;
+
+void _PG_init(void);
+
+/* GUC Variables */
+static int	auth_delay_seconds;
+
+/* Original Hook */
+static ClientAuthentication_hook_type	original_client_auth_hook = NULL;
+
+/*
+ * Check authentication
+ */
+static void
+auth_delay_checks(Port *port, int status)
+{
+	/*
+	 * Call any other plugins that use the ClientAuthentication_hook.
+	 */
+	if (original_client_auth_hook)
+		original_client_auth_hook(port, status);
+
+	/*
+	 * Inject a few seconds of delay when authentication fails.
+	 */
+	if (status != STATUS_OK)
+	{
+		sleep(auth_delay_seconds);
+	}
+}
+
+/*
+ * Module Load Callback
+ */
+void
+_PG_init(void)
+{
+	/* Define custom GUC variables */
+	DefineCustomIntVariable("auth_delay.seconds",
+			"Seconds to delay when authentication fails",
+			NULL,
+			&auth_delay_seconds,
+			2,
+			0, INT_MAX,
+			PGC_POSTMASTER,
+			GUC_UNIT_S,
+			NULL,
+			NULL);
+	/* Install Hooks */
+	original_client_auth_hook = ClientAuthentication_hook;
+	ClientAuthentication_hook = auth_delay_checks;
+}
diff --git a/doc/src/sgml/auth-delay.sgml b/doc/src/sgml/auth-delay.sgml
new file mode 100644
index 000..372702d
--- 

[HACKERS] pg_execute_from_file review

2010-11-24 Thread Joshua Tolley
I've just looked at pg_execute_from_file[1]. The idea here is to execute all
the SQL commands in a given file. My comments:

* It applies well enough, and builds fine
* It seems to work, and I've not come up with a way to make it break
* It seems useful, and to follow the limited design discussion relevant to it
* It looks more or less like the rest of the code base, so far as I know

Various questions and/or nits:

* I'd like to see the docs slightly expanded, specifically to describe
  parameter replacement. I wondered for a while if I needed to set up
  parameters in any specific way, before reading the code and realizing they
  can be whatever I want.
* Does anyone think it needs representation in the test suite?
* Is it at all bad to include spi.h in genfile.c? IOW should this function
  live elsewhere? It seems reasonable to me to do it as written, but I thought
  I'd ask.
* In the snippet below, it seems best just to use palloc0():
query_string = (char *)palloc((fsize+1)*sizeof(char));
memset(query_string, 0, fsize+1);
* Shouldn't it include SPI_push() and SPI_pop()?

[1] http://archives.postgresql.org/message-id/m262wf6fnz@2ndquadrant.fr

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

