Re: [HACKERS] changing MyDatabaseId
Robert,

On 11/15/2010 05:39 AM, Robert Haas wrote:
> I've spent a few hours poring over the source code with a
> coarse-toothed comb, trying to figure out just exactly what might
> break if we changed MyDatabaseId after backend startup time, or in
> other words, allowed a backend to unbind from the database to which it
> was originally bound and rebind to a new one. This is related to the
> periodic conversations we've had about a built-in connection pooler,
> and/or maintaining a pool of worker threads that could be used to
> service parallel query, replication sets, etc. What follows is not
> meant to be a concrete design proposal; it's basic research that may
> lead to a proposal at some time in the future. Still, comments are
> welcome.

Thanks a lot for doing that; it saved me a couple of hours (presumably
more than it cost you :-)

> Thoughts?

The question obviously is whether or not this is faster than just
terminating one backend and starting a new one, which basically costs
an additional termination and re-creation of a process (i.e. fork())
AFAICS. Or what other savings do you envision? If that's it, it
certainly seems like a huge amount of work for very little benefit. Or
does this feature enable something that's impossible to do otherwise?

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wednesday 17 November 2010 11:04:04 Markus Wanner wrote:
> On 11/15/2010 05:39 AM, Robert Haas wrote:
>> I've spent a few hours poring over the source code with a
>> coarse-toothed comb, trying to figure out just exactly what might
>> break if we changed MyDatabaseId after backend startup time [...]
>
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one, which basically costs
> an additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?

Well, one could optimize most of the resetting away if the old
MyDatabaseId and the new one are the same - an optimization which is
hardly possible with forking new backends.

Also I think it could lower the impact due to locking the procarray and
related areas.

Andres
Re: [HACKERS] multi-platform, multi-locale regression tests
Kevin,

On 11/13/2010 01:28 AM, Kevin Grittner wrote:
> Should anyone else run into this, it's controlled by this in the test
> scheduling definitions (the tdef values):
>
>     'xfail': True
>
> There are other test flags you can override here, like 'skip' to skip
> a test.

Correct. Looks like dtester urgently needs documentation...

Regards

Markus Wanner
Re: [HACKERS] changing MyDatabaseId
Andres,

On 11/17/2010 11:38 AM, Andres Freund wrote:
> Well, one could optimize most of the resetting away if the old
> MyDatabaseId and the new one are the same - an optimization which is
> hardly possible with forking new backends.

Uh? Why not simply re-use the same backend, then? Or do you think of
re-connecting to the same database as a way to reset your connection?

> Also I think it could lower the impact due to locking the procarray
> and related areas.

That may be, yes. But as pointed out by Tom and Alvaro, you'd have to
adhere to a compatible sequence of changes to shared memory to avoid
race conditions. That possibly requires using a very similar locking
sequence as the combination of a detaching and a newly starting backend
would use.

Regards

Markus Wanner
Re: [HACKERS] changing MyDatabaseId
On Wednesday 17 November 2010 11:58:33 Markus Wanner wrote:
> On 11/17/2010 11:38 AM, Andres Freund wrote:
>> Well, one could optimize most of the resetting away if the old
>> MyDatabaseId and the new one are the same - an optimization which is
>> hardly possible with forking new backends.
>
> Uh? Why not simply re-use the same backend, then? Or do you think of
> re-connecting to the same database as a way to reset your connection?

I am thinking of a connection-pooler like setup. Quite often your main
load goes towards a single database - in that situation you don't have
to reset the database id most of the time.

Andres
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 12:09 PM, Andres Freund wrote:
> I am thinking of a connection-pooler like setup. Quite often your main
> load goes towards a single database - in that situation you don't have
> to reset the database id most of the time.

Okay, so that's what I'd call a connection reset, or probably a
re-authentication feature which allows switching users (and thus
permissions on that same database). Could make sense for a connection
pooler, yes. Not having to flush caches in that case could even turn
out to be a good win. And I think it's a lot simpler to just switch the
user than to switch the database. Such a feature looks more promising
to me. (Do we have other connection-related and unchangeable state?)

As side notes: for the bgworkers in Postgres-R, I'm currently re-using
existing backends. As they only do change set application, the amount
of connection resetting required is minimal (well, there isn't a client
connection for these kinds of backends in the first place, but that's
another story). Plus they are always acting as superuser, so no
authentication or user switching is required in that case.

For parallel querying as well as async transactions, it might make
sense to be able to switch users but remain connected to the same
database.

Regards

Markus Wanner
[HACKERS] Overlap operators for ARRAY and NULLs
There might be an inconsistency in the overlap operators for array
types (@>, <@, and &&) when the argument arrays contain NULLs.

- SELECT 2 = ANY (ARRAY[1, NULL])    => NULL
- SELECT ARRAY[2] <@ ARRAY[1, NULL]  => false

NULL means unknown by definition, so should it return NULL rather than
false?

I found the issue when I read the spec of MULTISET. In the SQL
standard, the following expression returns NULL. I was thinking to make
SUBMULTISET OF an alias of the <@ operator, but they seem to be
incompatible.

- MULTISET[2] SUBMULTISET OF MULTISET[1, NULL]  => NULL

Will we change the operator's behavior? It would be more consistent,
but incompatible with older server versions. If that is impossible,
I'll add a separate submultiset_of() function instead of the <@
operator for MULTISET support.

--
Itagaki Takahiro
Re: [HACKERS] changing MyDatabaseId
Excerpts from Markus Wanner's message of mié nov 17 07:04:04 -0300 2010:
>> Thoughts?
>
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one. Which basically costs
> an additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?

I don't think it's a speed thing only. It would be a great thing to
have in autovacuum, for example, where we have constant problem reports
because the system failed to fork a new backend. If we could simply
reuse an already existing one, it would be a lot more robust.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
> I don't think it's a speed thing only. It would be a great thing to
> have in autovacuum, for example, where we have constant problem
> reports because the system failed to fork a new backend. If we could
> simply reuse an already existing one, it would be a lot more robust.

Hm, that's an interesting point. To actually increase robustness, it
would have to be a failure scenario that (temporarily) prevents
forking, but still allows an existing backend to continue to do work
(i.e. the ability to allocate memory or open files comes to mind).

Any idea about what's usually causing these fork() failures? I'm asking
because I'm afraid that, in case of an out-of-memory condition for
example, we'd just hit an OOM error later on, without being able to
perform the VACUUM job either.

Regards

Markus Wanner
[HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Hi,

Here is my review of the 'rollback sequence reset for TRUNCATE ...
RESTART IDENTITY' patch.

- Is the patch in context diff format?
  It's in git diff format. I guess it's OK?
- Does it apply cleanly to the current git master?
  Yes
- Does it include reasonable tests, necessary doc patches, etc?
  Doc: Yes, it removes the warning about TRUNCATE ... RESTART IDENTITY,
  which is the point of the patch
  Tests: There is a new regression test added for restart identity. And
  'make check' passes (tested on linux).

- Usability review (skills needed: test-fu, ability to find and read spec)
  - Read what the patch is supposed to do, and consider:
    - Does the patch actually implement that? Yes.
    - Do we want that? I think so, it removes a trap limitation of
      truncate
    - Do we already have it? No
    - Does it follow SQL spec, or the community-agreed behavior? I
      think so
    - Does it include pg_dump support (if applicable)? Not applicable
    - Are there dangers? Not that I think of
    - Have all the bases been covered? I think so

- Feature test (skills needed: patch, configure, make, pipe errors to log)
  - Apply the patch, compile it and test:
    - Does the feature work as advertised? Yes. It works consistently,
      isn't fooled by savepoints or multiple serials in a table, or
      concurrent transactions
    - Are there corner cases the author has failed to consider? I don't
      think so
    - Are there any assertion failures or crashes? No

Performance review (skills
Re: [HACKERS] changing MyDatabaseId
Excerpts from Markus Wanner's message of mié nov 17 09:57:18 -0300 2010:
> On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
>> I don't think it's a speed thing only. It would be a great thing to
>> have in autovacuum, for example, where we have constant problem
>> reports because the system failed to fork a new backend. If we could
>> simply reuse an already existing one, it would be a lot more robust.
>
> Hm, that's an interesting point. To actually increase robustness, it
> would have to be a failure scenario that (temporarily) prevents
> forking, but allows an existing backend to continue to do work (i.e.
> the ability to allocate memory or open files come to mind).

Well, the autovacuum mechanism involves a lot of back-and-forth between
launcher and postmaster, which includes some signals, a fork() and
backend initialization. The failure possibilities are endless. Fork
failure communication is similarly brittle.

> Any idea about what's usually causing these fork() failures? I'm
> asking because I'm afraid that for example, in case of an out of
> memory condition, we'd just hit an OOM error later on, without being
> able to perform the VACUUM job, either.

To be honest I have no idea. Sometimes the server is just too loaded.
Right now we have this delay: if the process is not up and running in
60 seconds then we have to assume that something happened, and we no
longer wait for it. If we knew the process was already there, we could
leave it alone; we'd know it would get to its duty eventually.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] describe objects, as in pg_depend
Hi,

A customer of ours (Enova Financial) requested the ability to describe
objects in pg_depend. The wiki contains a simplistic SQL snippet that
does the task, but only for some of the object types, and it's rather
ugly. It struck me that we could fulfill this very easily by exposing
the getObjectDescription() function at the SQL level, as in the
attached module.

I propose we include this as a builtin function. Opinions?

--
Álvaro Herrera alvhe...@alvh.no-ip.org

#include "postgres.h"

#include "catalog/dependency.h"
#include "fmgr.h"
#include "utils/builtins.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum describe_object(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(describe_object);

Datum
describe_object(PG_FUNCTION_ARGS)
{
    Oid         classid = PG_GETARG_OID(0);
    Oid         objid = PG_GETARG_OID(1);
    int32       subobjid = PG_GETARG_INT32(2);
    ObjectAddress address;
    char       *description = NULL;

    if (classid != InvalidOid)
    {
        address.classId = classid;
        address.objectId = objid;
        address.objectSubId = subobjid;
        description = getObjectDescription(&address);
    }

    if (!description || description[0] == '\0')
        PG_RETURN_NULL();

    PG_RETURN_TEXT_P(cstring_to_text(description));
}
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 02:19 PM, Alvaro Herrera wrote:
> Well, the autovacuum mechanism involves a lot of back-and-forth
> between launcher and postmaster, which includes some signals, a fork()
> and backend initialization. The failure possibilities are endless.
> Fork failure communication is similarly brittle.

I certainly agree to that. However, a re-connecting mechanism wouldn't
allow us to get rid of the existing avworker startup infrastructure
entirely. And for increased robustness, we'd require a less brittle
re-connecting mechanism. Given Robert's list, that doesn't seem
trivial, either. (But still doable, yes.)

> Right now we have this delay, if the process is not up and running in
> 60 seconds then we have to assume that something happened, and we no
> longer wait for it. If we knew the process was already there, we could
> leave it alone; we'd know it would get to its duty eventually.

You are assuming the presence of a pool here. Which is fine, it's just
not something that a re-connecting feature would solve per se. (Says he
who coded the bgworkers pool thingie.)

Regards

Markus Wanner
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wed, Nov 17, 2010 at 8:13 AM, Marc Cousin cousinm...@gmail.com wrote:
> - Does the feature work as advertised? Yes. It works consistently,
>   isn't fooled by savepoints or multiple serials in a table, or
>   concurrent transactions

I haven't tested this nor read the patch, but I'm wondering what
happens in the presence of a prepared transaction (2PC). Did you try
concurrent transactions with different serialization levels?

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wednesday 17 November 2010 15:50:36, Jaime Casanova wrote:
> I haven't tested this nor read the patch, but I'm wondering what
> happens in the presence of a prepared transaction (2PC). Did you try
> concurrent transactions with different serialization levels?

I haven't tested with 2PC. I didn't check with different isolation
levels either. I just verified that locking was happening as it should:
truncate is blocked by a transaction already locking the table with an
AccessShareLock and vice versa. And that rolling back, and rolling back
to a savepoint, restore the sequence to the correct state: the sequence
isn't restored to its value at the savepoint, but to its last value
before the truncate.

I don't see a special test case with different isolation levels or 2PC.
What do you have in mind?
Re: [HACKERS] unlogged tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> fsync()ing the file at shutdown doesn't seem too bad to me from a
> performance point of view, we tolerate that for all other tables. And
> you can always truncate the table yourself before shutdown.

The objection to that was not about performance. It was about how to
find out what needs to be fsync'd.

			regards, tom lane
Re: [HACKERS] describe objects, as in pg_depend
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
> A customer of ours (Enova Financial) requested the ability to describe
> objects in pg_depend. The wiki contains a simplistic SQL snippet that
> does the task, but only for some of the object types, and it's rather
> ugly. It struck me that we could fulfill this very easily by exposing
> the getObjectDescription() function at the SQL level, as in the
> attached module.

What's the point of the InvalidOid check? It seems like you're mostly
just introducing a corner case: sometimes, but not always, the function
will return NULL instead of failing for bad input. I think it should
just fail always.

			regards, tom lane
Re: [HACKERS] Isn't HANDLE 64 bits on Win64?
Magnus Hagander mag...@hagander.net writes:
> Do you still have a reference to the page that said they will never be
> assigned that high?

http://msdn.microsoft.com/en-us/library/ms810720.aspx
which says, under the section title "USER and GDI handles are sign
extended 32b values":

    To facilitate the porting, a decision has been made that these
    system handles should stay as 32b values, sign extended to 64b on
    the 64b platform. That is, the individual handle types are still
    based on the HANDLE type, which maps to void *, and so the size of
    the handle is the size of the pointer, i.e. 4 bytes on 32b and 8
    bytes on 64b. However, the actual value of the handle on the 64b
    platform, (i.e. the meaningful bits), fits within the lower 32b,
    while the upper bits just carry the sign. This should make it easy
    to port the majority of the application code. Handling of the
    special values, like -1, should be fairly transparent. It also
    should agree nicely with all the cases where the handles had been
    remoted with the help of the IDL definitions from the public file
    wtypes.idl. However, care needs to be taken when remoting the
    handles was done via a DWORD, as the upper long should be properly
    sign extended on the 64b side. The app should use HandleToLong()
    and LongToHandle() macros (inline functions) to do the casting
    right.

What's not clear to me is whether the section title means that only
certain handles have this guarantee, and if so whether we have to worry
about running into ones that don't.

			regards, tom lane
Re: [HACKERS] changing MyDatabaseId
Markus Wanner mar...@bluegap.ch writes:
> On 11/17/2010 02:19 PM, Alvaro Herrera wrote:
>> Well, the autovacuum mechanism involves a lot of back-and-forth
>> between launcher and postmaster, which includes some signals, a
>> fork() and backend initialization. The failure possibilities are
>> endless. Fork failure communication is similarly brittle.
>
> I certainly agree to that. However, a re-connecting mechanism wouldn't
> allow us to get rid of the existing avworker startup infrastructure
> entirely.

I'm afraid that any such change would trade a visible, safe failure
mechanism (no avworker) for invisible, impossible-to-debug data
corruption scenarios (due to failure to reset some bit of cached
state). It certainly won't give me any warm fuzzy feeling that I can
trust autovacuum.

			regards, tom lane
[HACKERS] Ident authentication overloading
Currently, we overload "ident" to mean both unix socket authentication
and ident over tcp, depending on what type of connection it is. This is
quite unfortunate - one of them being one of the most secure options we
have, the other one being one of the most *insecure* ones (really?
ident over tcp? does *anybody* use that intentionally today?)

Should we not consider naming those two different things? If not now,
then at least put it on the TODO of things to do the next time we need
to break backwards compatibility with the format of pg_hba.conf?
Though if we're going to break backwards compatibility anywhere, pg_hba
is probably one of the least bad places to do it...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Ident authentication overloading
Magnus Hagander mag...@hagander.net writes:
> Currently, we overload "ident" to mean both unix socket authentication
> and ident over tcp, depending on what type of connection it is. This
> is quite unfortunate - one of them being one of the most secure
> options we have, the other one being one of the most *insecure* ones
> (really? ident over tcp? does *anybody* use that intentionally
> today?) Should we not consider naming those two different things?

Maybe, but it seems like the time to raise the objection was six or
eight years ago :-(. Renaming now will do little except to introduce
even more confusion.

			regards, tom lane
Re: [HACKERS] describe objects, as in pg_depend
Excerpts from Tom Lane's message of mié nov 17 12:20:06 -0300 2010:
> Alvaro Herrera alvhe...@alvh.no-ip.org writes:
>> A customer of ours (Enova Financial) requested the ability to
>> describe objects in pg_depend. [...]
>
> What's the point of the InvalidOid check? It seems like you're mostly
> just introducing a corner case: sometimes, but not always, the
> function will return NULL instead of failing for bad input. I think
> it should just fail always.

If the check is not there, the calling query will have to prevent the
function from being called on rows having OID=0 in pg_depend. (These
rows show up in the catalog for pinned objects.) The query becomes
either incomplete (because you don't report pinned objects) or awkward
(because you have to insert a CASE expression to avoid calling the
function in that case).

I don't think it's all that necessary anyway. If the function goes in
without that check, it will still be a huge improvement over the status
quo.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 16:39, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> Currently, we overload "ident" to mean both unix socket
>> authentication and ident over tcp, depending on what type of
>> connection it is. [...] Should we not consider naming those two
>> different things?
>
> Maybe, but it seems like the time to raise the objection was six or
> eight years ago :-(. Renaming now will do little except to introduce
> even more confusion.

For existing users, yes. For new users, no. I certainly get comments on
it pretty much every time I do training that includes explaining
pg_hba options.

The question is if it's worth confusing our existing users a little,
at the advantage of not confusing new users.

We could of course also just drop ident-over-tcp completely, but there
might be some poor guy out there who actually *uses* it :-)

And I agree it would've been much better to do it years ago. That
doesn't mean we shouldn't at least *consider* doing it at some point.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] describe objects, as in pg_depend
Alvaro Herrera alvhe...@commandprompt.com writes:
> Excerpts from Tom Lane's message of mié nov 17 12:20:06 -0300 2010:
>> What's the point of the InvalidOid check?
>
> If the check is not there, the calling query will have to prevent the
> function from being called on rows having OID=0 in pg_depend. (These
> rows show up in the catalog for pinned objects.)

Hmm. It would be good to document that motivation somewhere. Also, for
my own taste it would be better to do

    /* for pinned items in pg_depend, return null */
    if (!OidIsValid(catalogId))
        PG_RETURN_NULL();

    ... straight line code here ...

rather than leave the reader wondering whether there are any other
cases where the function is intended to return null.

Oh, one other gripe: probably better to name it pg_describe_object.

			regards, tom lane
Re: [HACKERS] changing MyDatabaseId
On 11/17/2010 04:25 PM, Tom Lane wrote:
> I'm afraid that any such change would trade a visible, safe failure
> mechanism (no avworker) for invisible, impossible-to-debug data
> corruption scenarios (due to failure to reset some bit of cached
> state). It certainly won't give me any warm fuzzy feeling that I can
> trust autovacuum.

Well, Alvaro doesn't quite seem to have a warm fuzzy feeling with the
status quo, either. And I can certainly understand his concerns.

But yes, the OS-level process separation and cache state reset
guarantee that an exit() / fork() pair provides is hard to match up
against in user space. So Alvaro's argument for robustness only stands
under the assumption that we can achieve a perfect cache state reset
mechanism. Now, how feasible is that? Are there any kind of tools that
could help us check?

Regards

Markus Wanner
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> fsync()ing the file at shutdown doesn't seem too bad to me from a
>> performance point of view, we tolerate that for all other tables.
>> And you can always truncate the table yourself before shutdown.
>
> The objection to that was not about performance. It was about how to
> find out what needs to be fsync'd.

Just a crazy brainstorming thought, but if this is a clean shutdown
then all the non-unlogged tables have been checkpointed, so they should
have no dirty pages in them anyway. So we could just fsync everything.

--
greg
Re: [HACKERS] contrib: auth_delay module
On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote:
> On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote:
>> If we don't need a PoC module for each new hook, I'm not strongly
>> motivated to push it into the contrib tree. How about your opinion?
>
> I'd say let it go, unless someone else feels strongly about it.

I would use this module (rate limit new connection attempts) as soon as
I could. Putting a cap on potential CPU usage on a production DB by
either a blackhat or a configuration mistake by a developer (leaving
the port accessible) is definitely useful, even in the face of
max_connections. My production apps already have their connections and
seldom need new ones. They all use CPU, though.

Ross

--
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer  Admin, Research Scientist    phone: 713-348-6166
Connexions http://cnx.org                      fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
Re: [HACKERS] duplicate connection failure messages
Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of sáb nov 13 22:36:31 -0300 2010:
>> OK, I found out how to get the IP address with the attached patch.
>> The problem is that only pghost is set, never pghostaddr. I am not
>> even sure how that would get set for this code because my tests show
>> it is not:
>
> This doesn't work for IPv6 addresses, though. pghostaddr is specified
> by the user on the command line as an optimization to avoid DNS
> lookups IIRC, which is why you don't see the code setting it.

OK, I doubt we want to add complexity to improve this, so I see our
options as:

    o ignore the problem
    o display IPv4/IPv6 labels
    o display only an IPv6 label
    o something else

Comments?

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 04:43:00PM +0100, Magnus Hagander wrote: On Wed, Nov 17, 2010 at 16:39, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Currently, we overload indent meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? Maybe, but it seems like the time to raise the objection was six or eight years ago :-(. Renaming now will do little except to introduce even more confusion. For existing users, yes. For new users, no. Yep. If we're to be a successful project, the vast majority of our users are future users, not current or past ones. I certainly get comments on it pretty much every time I do training that includes explaining pg_hba options. The question is if it's worth confusing our existing users a little, at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) +1 for dropping it completely. We have dropped features--automatic cast to TEXT, for example--that a good deal more of our user base relied on, for reasons less compelling than this. And I agree it would've been much better to do it years ago. That doesn't mean we shouldn't at least *consider* doing it at some point. The sooner, the better, IMHO. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Ident authentication overloading
On 2010-11-17 22:43, Magnus Hagander wrote: at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) As far as I know, companies do use it in their internal networks where they do have a reasonable shot at full control over ident connections. I don't know how easy it would be for them to switch to other methods. Jeroen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ident authentication overloading
On Wed, 2010-11-17 at 16:35 +0100, Magnus Hagander wrote: Currently, we overload ident, meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? The original patch called the Unix domain socket version peer (whereas the name ident comes from the official name of the TCP/IP protocol used). You can look it up in the archives, but I believe the argument for using the name ident for both was that ident was established and the new feature would provide the same functionality. That said, I completely agree with you. Every time I look through a pg_hba.conf I think, that's a terrible name, we should rename this. We could perhaps introduce an alternative name and slowly deprecate the original one.
Re: [HACKERS] unlogged tables
Greg Stark gsst...@mit.edu wrote: If this is a clean shutdown then all the non-unlogged tables have been checkpointed so they should have no dirty pages in them anyways. So we could just fsync everything. Or just all the unlogged tables. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 5:04 AM, Markus Wanner mar...@bluegap.ch wrote: The question obviously is whether or not this is faster than just terminating one backend and starting a new one. I agree. Which basically costs an additional termination and re-creation of a process (i.e. fork()) AFAICS. Or what other savings do you envision? I think the thing I'd like to do (or see somebody else do) is write a test harness that connects to the database, runs a single SELECT * FROM dual (one row, one column table), and then disconnects; and then oprofile the daylights out of the backend processes. In other words, I'd like to measure as exactly as we can the overhead of each part of the startup process. I think that would give us a clearer picture of where the overhead is, and then we could look more directly at which line items might be avoidable for rebinding to a new database. However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. It also occurs to me to wonder whether there's some way that we can speed up backend startup, period. One of the frequent complaints levied against PostgreSQL is that our connections are too expensive. AFAICT, this is partly because backend startup costs are high, and partly because of internal contention, especially around ProcArrayLock. Squeezing on the startup overhead will be valuable no matter what we decide to do about database switching. A variant on this theme would be to try to adjust the startup sequence in some way so that less of it needs to be redone if we switch databases, which might be possible even if a more general solution isn't. 
Unfortunately, I'm not sure how feasible this is, but maybe there's a way... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
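The measurement Robert proposes - time a full connect, run a trivial query, disconnect, and attribute where the overhead goes - can be scripted along these lines. This is a hedged sketch, not an actual harness from the thread: the `connect` callable and its `execute`/`close` interface are placeholders (in practice you would pass, say, a libpq or psycopg2 connection factory and run oprofile against the backends while the loop spins):

```python
import time

def time_connect_cycle(connect, query="SELECT * FROM dual", n=100):
    """Time n full connect -> query -> disconnect cycles and return the
    mean per-cycle wall-clock overhead in seconds.  `connect` is any
    callable returning an object with execute() and close() methods."""
    start = time.perf_counter()
    for _ in range(n):
        conn = connect()
        conn.execute(query)
        conn.close()
    return (time.perf_counter() - start) / n
```

Run against a real server, the per-cycle figure minus the steady-state query cost approximates the startup overhead being discussed.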
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 17:10, Jeroen Vermeulen j...@xs4all.nl wrote: On 2010-11-17 22:43, Magnus Hagander wrote: at the advantage of not confusing new users. We could of course also just drop ident-over-tcp completely, but there might be some poor guy out there who actually *uses* it :-) As far as I know, companies do use it in their internal networks where they do have a reasonable shot at full control over ident connections. I don't know how easy it would be for them to switch to other methods. Yea, I think deleting it is going a bit overboard. If it was a matter of changing it for those who use ident over tcp, I really wouldn't hesitate - they're few :-) But the problem is that it's the ident-over-tcp that's correctly named, not the other one... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 11:00 AM, Greg Stark gsst...@mit.edu wrote: On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: fsync()ing the file at shutdown doesn't seem too bad to me from a performance point of view; we tolerate that for all other tables. And you can always truncate the table yourself before shutdown. The objection to that was not about performance. It was about how to find out what needs to be fsync'd. Just a crazy brainstorming thought, but if this is a clean shutdown, then all the non-unlogged tables have been checkpointed, so they should have no dirty pages in them anyway. So we could just fsync everything. Hmm, that reminds me: checkpoints should really skip writing buffers belonging to unlogged relations altogether; and any fsync against an unlogged relation should be skipped. I need to go take a look at what's required to make that happen, either as part of this patch or as a follow-on commit. It might be interesting to have a kind of semi-unlogged table where we write a special xlog record for the first access after each checkpoint but otherwise don't xlog. On redo, we truncate the tables mentioned, but not any others, since they're presumably OK. But that's not what I'm trying to design here. I'm trying to optimize it for the case where you DON'T care about durability and you just want it to be as fast as possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
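The buffer-filtering Robert describes can be modeled abstractly. This is a toy sketch with invented names, not the actual bufmgr code: a checkpoint writes and fsyncs only dirty buffers whose relation is logged, with an optional flag for flushing unlogged relations too at a clean shutdown:

```python
def checkpoint_targets(dirty_buffers, unlogged_rels, shutdown=False):
    """Return the buffers a checkpoint should write/fsync.
    dirty_buffers: iterable of (relation, block) pairs.
    unlogged_rels: set of relation names that are unlogged.
    shutdown: if True, flush unlogged relations as well, so that a
    clean shutdown/restart preserves their contents."""
    return [(rel, blk) for rel, blk in dirty_buffers
            if shutdown or rel not in unlogged_rels]
```

The `shutdown` flag captures the trade-off in this thread: skipping unlogged buffers keeps checkpoint I/O spikes down during normal operation, while a shutdown checkpoint may still choose to flush them.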
Re: [HACKERS] Ident authentication overloading
On Wed, Nov 17, 2010 at 17:31, Peter Eisentraut pete...@gmx.net wrote: On Wed, 2010-11-17 at 16:35 +0100, Magnus Hagander wrote: Currently, we overload ident, meaning both unix socket authentication and ident over tcp, depending on what type of connection it is. This is quite unfortunate - one of them being one of the most secure options we have, the other one being one of the most *insecure* ones (really? ident over tcp? does *anybody* use that intentionally today?) Should we not consider naming those two different things? The original patch called the Unix domain socket version peer (whereas the name ident comes from the official name of the TCP/IP protocol used). You can look it up in the archives, but I believe the argument for using the name ident for both was that ident was established and the new feature would provide the same functionality. Yeah, I vaguely recall that discussion - too lazy to actually look it up :-) I think the argument was definitely wrong, but it didn't seem so at the time... That said, I completely agree with you. Every time I look through a pg_hba.conf I think, that's a terrible name, we should rename this. We could perhaps introduce an alternative name and slowly deprecate the original one. That seems reasonable. Maybe even have the server emit a warning when it sees it (since we now read/parse pg_hba.conf on server start, it would only show up once per server reload, not on every connect). Or maybe just doc-deprecate in 9.1, warning in 9.2, drop in 9.3 or something? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] contrib: auth_delay module
On Wed, Nov 17, 2010 at 10:32 AM, Ross J. Reedstrom reeds...@rice.edu wrote: On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote: On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote: If we don't need a PoC module for each new hook, I'm not strongly motivated to push it into the contrib tree. How about your opinion? I'd say let it go, unless someone else feels strongly about it. I would use this module (rate-limiting new connection attempts) as soon as I could. Putting a cap on the potential CPU usage that a blackhat, or a developer's configuration mistake (leaving the port accessible), can inflict on a production DB is definitely useful, even in the face of max_connections. My production apps already have their connections and seldom need new ones. They all use CPU though. If KaiGai updates the code per previous discussion, would you be willing to take a crack at adding documentation? P.S. Your email client seems to be setting the Reply-To address to a ridiculous value. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
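The idea behind the module - make each failed authentication attempt cost the client a fixed pause, so brute-force password guessing and runaway reconnect loops are throttled - can be sketched in a few lines. This is an illustrative model only, with invented names; the real contrib module would be C code invoked from the server's client-authentication hook:

```python
import time

def authenticate_with_delay(check_password, user, password, delay_ms=500):
    """Run the real credential check; on failure, sleep before
    returning, so each failed attempt takes at least delay_ms.  The
    delay caps how fast an attacker can cycle through guesses."""
    ok = check_password(user, password)
    if not ok:
        time.sleep(delay_ms / 1000.0)
    return ok
```

Note the delay is applied only on failure, so legitimate clients with correct credentials pay nothing.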
Re: [HACKERS] Extensible executor nodes for preparation of SQL/MED
On Wed, Nov 17, 2010 at 2:13 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Wed, Nov 17, 2010 at 10:51, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Wed, Nov 17, 2010 at 03:36, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 16, 2010 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: I am of the opinion that a run-time-extensible set of plan node types is merest fantasy. We will never have that, so putting in place 5% of the infrastructure for it is a waste of time and notational complexity. I think I agree; and moreover there's been no compelling argument made why we would need that for SQL/MED anyway. I see. I'll cut useless parts from my patch. I tested simplified version, but I cannot see measurable performance improvement at this time. So, I'll turn down the whole proposal to use function pointer calls. I'm sorry for all the fuss. Wait a minute... I'm confused. Didn't you have a measurable performance improvement with an earlier version of this patch? If taking out the useless parts removed the performance benefit, maybe they weren't useless? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HOT updates in index-less tables
On Sun, Nov 14, 2010 at 1:12 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 14.11.2010 00:29, Robert Haas wrote: On Sat, Nov 13, 2010 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hannu Krosing ha...@2ndquadrant.com writes: On Sat, 2010-11-13 at 10:51 -0500, Tom Lane wrote: If a table has no indexes, we will always decide that any same-page update operation is a HOT update, since obviously it isn't modifying any indexed columns. But is there any benefit to doing so? If we do the in-page mini vacuum even without HOT, then there should be no benefit from index-less HOT updates. AFAICS we do: heap_update marks the page as prunable whether it's a HOT update or not. The only difference between treating the update as HOT vs. not-HOT is that if there was more than one HOT update, the intermediate tuples could be completely reclaimed by page pruning (ie, their line pointers go away too). With not-HOT updates, the intermediate line pointers would have to remain in DEAD state until vacuum, since page pruning wouldn't know if there were index entries pointing at them. But that seems like a pretty tiny penalty. I'm not at all convinced that's a tiny penalty. Me neither. It's a tiny penalty when you consider one update, but if you repeatedly update the same tuple, you accumulate dead line pointers until the next real vacuum runs. With HOT updates, you reach a steady state where page pruning is all you need. Then again, if you're repeatedly updating a row in a table with no indexes, presumably it's a very small table or you would create an index on it. And frequently autovacuuming a small index is quite cheap too. The case here is when you have, say, a control table that is managing a gapless sequence, or a materialization table with a very small number of records. These types of tables get updated very frequently, perhaps in every transaction.
People without detailed implementation knowledge of PostgreSQL might assume that leaving an index off the table is faster in these situations. The danger here is that if autovacuum is stalled for whatever reason, you get exponentially bad behavior as the table gets stuffed with dead records. Index-less HOT was put in intentionally. As autovacuum gets smarter and smarter, the reasoning for doing this gets weaker. merlin
Re: [HACKERS] GiST insert algorithm rewrite
Hmm, will have to do some benchmarking on that. I'm using the Consistent function when walking down to check if the downlink needs to be updated, and assumed that it would be insignificant compared to the cost of calling Penalty on all the keys on the page. Why Consistent?! That's impossible - you don't know the right strategy number, and an index whose storage type differs from the indexed type may not accept the same type as the query. An index over tsvector is an example. There should be no difference in performance here AFAICS. The children need to be updated a second time to clear the flag, but we don't release the locks on them in the middle, and we're only talking about setting a single flag, so it should make no difference. Agree with that -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 4:52 PM, Robert Haas robertmh...@gmail.com wrote: However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. Everything you said is true but there's more. A freshly created backend needs to build relcache entries and for every relation in your query. A reused connection eventually warms up the relcache and syscaches and can plan new queries using them without doing any syscalls. And of course if it's a query that's already been planned might be able to reuse the entire plan structure without replanning it. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST insert algorithm rewrite
Sorry, I missed the beginning of the discussion on GiST, so I read it in the web mail archive. You wrote: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00939.php [skip] 0. (the child page is locked) 1. The parent page is locked. 2. The child page is split. The original page becomes the left half, and new buffers are allocated for the right halves. 3. The downlink is inserted on the parent page (and the original downlink is updated to reflect only the keys that stayed on the left page). While keeping the child pages locked, the NSN field on the children is updated with the new LSN of the parent page. ... The scan checks that by comparing the LSN it saw on the parent page with the NSN on the child page. If parent LSN < NSN, we saw the parent before the downlink was inserted. Now, the problem with crash recovery is that the above algorithm depends on the split keeping the parent and child locked until the downlink is inserted in the parent. If you crash between steps 2 and 3, the locks are gone. If a later insert then updates the parent page, because of a split on some unrelated child page, that will bump the LSN of the parent above the NSN on the child. Scans will see that the parent LSN > child NSN, and will no longer follow the rightlink. [skip] I disagree with that opinion: if we crash between 2 and 3, then why would anybody update the parent before WAL replay? The WAL replay process in this case should complete the child split by inserting an invalid pointer, and the tree becomes correct again, although it needs to repair the invalid pointers. The same situation holds for b-tree: WAL replay repairs incomplete splits before any other processing. Or do I miss something important? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
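The invariant being debated can be made concrete with a toy model (hypothetical names; the real logic lives in the GiST scan code): a split stamps the children's NSN with the parent page's new LSN, and a scan follows a child's rightlink only when the parent version it saw predates that split, i.e. parent LSN < child NSN. A crash between steps 2 and 3 followed by an unrelated update that bumps the parent's LSN past the NSN silently breaks the test:

```python
def must_follow_rightlink(parent_lsn_seen, child_nsn):
    """A scan saw the parent page at LSN parent_lsn_seen.  If that
    predates the split that stamped child_nsn, the downlink for the
    right half was not on the parent yet, so the scan must chase the
    child's rightlink to find the moved tuples."""
    return parent_lsn_seen < child_nsn
```

With a split at LSN 100 (children get NSN = 100), a scan that saw the parent at LSN 90 correctly follows the rightlink; after a crash plus an unrelated parent update at LSN 110, a new scan sees 110 > 100 and wrongly skips it - the hazard Heikki describes.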
Re: [HACKERS] unlogged tables
Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ident authentication overloading
Magnus Hagander mag...@hagander.net writes: If it was a matter of changing it for those who use ident over tcp, I really wouldn't hesitate - they're few :-) But the problem is that it's the ident-over-tcp that's correctly named, not the other one... Yeah, renaming the TCP version would be quite wrong. If we're going to do something about this, I agree with Peter's suggestion: add peer as the preferred name for the Unix-socket method, and deprecate but don't remove ident. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
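Under the naming Tom endorses, a pg_hba.conf could eventually read as follows (an illustrative fragment only; "peer" is the proposed spelling and did not yet exist as a method name at the time of this thread, while "ident" would remain the name of the RFC 1413 TCP method):

```
# TYPE  DATABASE  USER  ADDRESS       METHOD
local   all       all                 peer    # Unix-socket credential check (proposed name)
host    all       all   10.0.0.0/8    ident   # ident protocol lookup over TCP
```

The deprecated spelling would keep accepting "ident" on local lines for compatibility while documentation steers new users toward "peer".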
Re: [HACKERS] GiST insert algorithm rewrite
On 17.11.2010 19:46, Teodor Sigaev wrote: I disagree with that opinion: if we crash between 2 and 3 then why will somebody update parent before WAL replay? WAL replay process in this case should complete child split by inserting invalid pointer and tree become correct again, although it needs to repair invalid pointers. The same situation with b-tree: WAL replay repairs incomplete split before any other processing. Or do I miss something important? Yeah, see the thread that started this: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php http://archives.postgresql.org/message-id/12375.1289429...@sss.pgh.pa.us The code currently relies on the end-of-recovery processing to finish the incomplete split, but I'm trying to get rid of that end-of-recovery processing altogether. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] unlogged tables
On 17.11.2010 17:11, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: fsync()ing the file at shutdown doesn't seem too bad to me from a performance point of view, we tolerate that for all other tables. And you can always truncate the table yourself before shutdown. The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 12:42 PM, Greg Stark gsst...@mit.edu wrote: On Wed, Nov 17, 2010 at 4:52 PM, Robert Haas robertmh...@gmail.com wrote: However, that test doesn't capture everything. For example, imagine a connection pooler sitting in front of PG. Rebinding to a new database means disconnecting a TCP connection and establishing a new one. Switching databases might save some latency there even if we don't actually save much in terms of CPU instructions. Maybe that's not important, though. I don't know. I don't want to let my theorizing get too far ahead of the data. Everything you said is true but there's more. A freshly created backend needs to build relcache entries and for every relation in your query. A reused connection eventually warms up the relcache and syscaches and can plan new queries using them without doing any syscalls. And of course if it's a query that's already been planned might be able to reuse the entire plan structure without replanning it. I think you're missing the point. If we switch databases, all cached relations and plans have to be flushed anyway. We're talking about what might NOT need to be flushed on switching databases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
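Tom's proposed invalidation rule - stamp each SeqTable entry with the sequence's relfilenode and throw away cached unissued values whenever the stamp no longer matches - can be modeled like this (a simplified sketch with invented names, not the actual sequence.c structures; TRUNCATE ... RESTART IDENTITY would rewrite the sequence into a new relfilenode, and a rollback would put the old relfilenode back, so a stale cache is detected either way on the next nextval):

```python
class SeqCacheEntry:
    """Per-backend cached state for one sequence, keyed by relfilenode."""
    def __init__(self, relfilenode):
        self.relfilenode = relfilenode
        self.cached_values = []   # unissued nextval values

def next_value(entry, current_relfilenode, refill):
    """Return the next value, discarding cached state that belongs to a
    rewritten (or rolled-back) version of the sequence."""
    if entry.relfilenode != current_relfilenode:
        entry.cached_values = []              # cache is for a dead version
        entry.relfilenode = current_relfilenode
    if not entry.cached_values:
        entry.cached_values = refill()        # fetch a fresh batch
    return entry.cached_values.pop(0)
```

The open question in the mail - what to do about currval - is untouched here; this only covers the unissued-value cache.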
Re: [HACKERS] unlogged tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 17.11.2010 17:11, Tom Lane wrote: The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? Hmm ... that's a good point. If we simply treat unlogged tables the same as regular for checkpointing purposes, don't we end up having flushed them all correctly during a shutdown checkpoint? I was thinking that WAL-logging had some influence on that logic, but it doesn't. Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote: Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. I haven't been; I'm not subscribed to general; it'd be useful to CC me next time. Reading through the thread in the archives, it seems like people are mostly confused. Some are confused about the current behavior of the patch (no, it really does always truncate your tables, I swear); others are confused about how WAL logging works (of course a backend crash doesn't truncate an ordinary table - that's because it's WAL LOGGED); and still others are maybe not exactly confused but hoping that unlogged table = MyISAM (try not to corrupt your data, but don't get too bent out of shape about the possibility that it may get corrupted anyway). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander wrote: I basically report the connection error string if it starts with FATAL:. I originally tried to check for an ERRCODE_INVALID_PASSWORD error field (see // comments), but it seems there is no way to access this, i.e. PQgetResult(conn) on a connection failure is always NULL. Anyway, perhaps FATAL is a better test because it will report any major failure, not just a .pgpass one. Patch attached. Bad Bruce, using C++ comments like that :P And non-context diff ;) That comment use was to highlight that those are not for commit, but there if people want to test. As for the diff, it seems git-external-diff isn't portable to non-Linux systems; I will post a separate email on that. Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] git diff script is not portable
Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Comments? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
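One portable approach is to sidestep diff -L entirely and generate the labeled context diff in the script itself, e.g. with Python's stdlib difflib. This is a hypothetical replacement sketch, not the script under discussion; git invokes a GIT_EXTERNAL_DIFF program with the arguments path, old-file, old-hex, old-mode, new-file, new-hex, new-mode:

```python
#!/usr/bin/env python
import sys
import difflib

def labeled_context_diff(old_file, new_file, label):
    """Produce a context diff carrying a/ and b/ labels without relying
    on GNU diff's non-portable -L option."""
    with open(old_file) as f:
        old = f.readlines()
    with open(new_file) as f:
        new = f.readlines()
    return "".join(difflib.context_diff(old, new,
                                        fromfile="a/" + label,
                                        tofile="b/" + label))

if __name__ == "__main__":
    # git passes: path old-file old-hex old-mode new-file new-hex new-mode
    path, old_file, new_file = sys.argv[1], sys.argv[2], sys.argv[5]
    sys.stdout.write(labeled_context_diff(old_file, new_file, path))
```

Since only the Python standard library is used, the same script runs unchanged on FreeBSD and anywhere else git and Python are available.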
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 17.11.2010 17:11, Tom Lane wrote: The objection to that was not about performance. It was about how to find out what needs to be fsync'd. I must be missing something: we handle that just fine with normal tables, why is it a problem for unlogged tables? Hmm ... that's a good point. If we simply treat unlogged tables the same as regular for checkpointing purposes, don't we end up having flushed them all correctly during a shutdown checkpoint? I was thinking that WAL-logging had some influence on that logic, but it doesn't. Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. I seriously do not want to have a conversation with someone about why their unlogged tables are exacerbating their checkpoint I/O spikes. I'd be happy to have two modes, though. We should probably revisit the syntax, though. One, it seems that CREATE UNLOGGED TABLE is not as clear as I thought it was. Two, when (not if) we add more durability levels, we don't want to create keywords for all of them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. That was another part of the discussion. Right now we report any FATAL, so it might be a password problem, or something else, and it seems doing all FATALs is the best idea because it will catch any other cases like this. Is FATAL, in general, enough to conclude the server is running? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git diff script is not portable
On Wed, Nov 17, 2010 at 19:54, Bruce Momjian br...@momjian.us wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Do it the git way - fork it and put it on your github page. Or do it the pg way - fork it and put it in src/tools. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] MergeAppend costing
See the attached test case. With that setup, this uses MergeAppend: explain select * from ma_parent order by id limit 10; But this one does not: explain select * from ma_parent order by name limit 10; ...which seems odd, because the index on ma_child1 and sorting the other two ought to still be better than appending all three and sorting the whole thing. If you drop ma_child2, you get MergeAppend again: begin; drop table ma_child2; explain select * from ma_parent order by name limit 10; rollback; ...which makes me wonder if our costing model is off? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company inh.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
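The attached inh.sql is not preserved in the archive, but the tradeoff Robert describes can be modeled outside the planner. A MergeAppend can stream the top-k rows by merging presorted child streams (stopping after k rows), while Append + Sort must materialize and sort every row first. A toy model in plain Python (not planner code; the child data is hypothetical):

```python
import heapq
from itertools import islice

def merge_append_topk(sorted_children, k):
    """The MergeAppend strategy: merge presorted child streams and
    stop after k rows, never touching the rest of the data."""
    return list(islice(heapq.merge(*sorted_children), k))

def append_sort_topk(children, k):
    """The Append + Sort strategy: concatenate all children, sort the
    whole thing, then take the first k rows."""
    rows = [r for child in children for r in child]
    return sorted(rows)[:k]

# Three presorted "child tables" standing in for ma_child1..3.
kids = [sorted(range(0, 30, 3)),
        sorted(range(1, 30, 3)),
        sorted(range(2, 30, 3))]

# Both strategies return the same answer; MergeAppend just does
# far less work when the LIMIT is small relative to the data.
assert merge_append_topk(kids, 10) == append_sort_topk(kids, 10)
```

Which plan actually costs less depends on whether each child can supply sorted output cheaply (e.g. from an index), which is exactly what the costing model has to weigh.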
Re: [HACKERS] git diff script is not portable
On Wednesday 17 November 2010 19:54:49 Bruce Momjian wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. If you have changes I am happy to change the script - but hosting it on a *.postgresql.org domain might be a good idea anyway. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:50, Bruce Momjian br...@momjian.us wrote: Magnus Hagander wrote: Does this actually solve the *problem*, though? The problem is not what is reported on stdout/stderr, the problem is that the net result is that the server is reported as not started (by the service control manager) when it actually *is* started. In this case, stderr doesn't even go anywhere. What happens if you *don't* Ctrl-C it? I was just going to post on that. :-) Right now, it prints the FATAL and keeps printing 60 times, then says not running. Should we just exit on FATAL and output a special exit string, or say running? From the perspective of the service control manager, it should say running. That might break other scenarios though, but I'm not sure - I think we can safely say the server is running when we try to log in and get a password failure. That was another part of the discussion. Right now we report any FATAL, so it might be a password problem, or something else, and it seems doing all FATALs is the best idea because it will catch any other cases like this. Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] changing MyDatabaseId
On Wed, Nov 17, 2010 at 6:33 PM, Robert Haas robertmh...@gmail.com wrote: I think you're missing the point. If we switch databases, all cached relations and plans have to be flushed anyway. We're talking about what might NOT need to be flushed on switching databases. Oh sorry, yes, I missed that point. I will mention that your point about TCP connection establishment latency is real. TCP connection establishment adds several milliseconds of latency (for low latency connections -- obviously it's much much worse for long-haul connections) and then the congestion control slow start adds more if there's a significant amount of data to transfer. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Magnus Hagander mag...@hagander.net writes: On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. I believe that the above argument is exactly backwards. What we want here is to check the result of postmaster.c's canAcceptConnections(), and there are only a finite number of error codes that can result from rejections there. If we get past that, there are a large number of possible failures, but all of them indicate that the postmaster is in principle willing to accept connections. Checking for password errors only is utterly wrong: any other type of auth failure would be the same for this purpose, as would no such database, no such user, too many connections, etc etc etc. What we actually want here, and don't have, is the fabled pg_ping protocol... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
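Tom's rule can be stated as a tiny classifier: only the rejection codes that canAcceptConnections() can produce mean "not ready"; any other response, including an auth failure, proves a postmaster is alive and willing in principle. A sketch of that logic (the SQLSTATE values are PostgreSQL's documented error codes; which exact set should count as "not ready" is precisely the open question in this thread):

```python
# SQLSTATEs a postmaster sends while NOT yet accepting sessions.
# 57P03 is "cannot_connect_now": starting up, shutting down, or in
# recovery -- the cases Magnus points out must not read as "running".
NOT_READY_STATES = {"57P03"}

def server_is_up(sqlstate):
    """Tom's argument: a clean connection (no error), or any error
    OTHER than a startup-phase rejection, means the server is up.
    Bad password (28P01), no such database (3D000), no such user --
    all of these prove a live postmaster answered us."""
    return sqlstate not in NOT_READY_STATES

assert server_is_up(None)          # connected and authenticated
assert server_is_up("28P01")       # invalid_password: server is up
assert server_is_up("3D000")       # invalid_catalog_name: server is up
assert not server_is_up("57P03")   # cannot_connect_now: still starting
```

Checking only for the password error, as first proposed, would misclassify every other post-startup failure as "not running".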
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. regards, tom lane I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Cheers, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 02:22 PM, Kenneth Marshall wrote: On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain. I think that's absolutely a bad idea. The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I also agree with Tom's sentiment above. To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. It's really pretty robust, unless you're doing crazy stuff. So that makes it all the more important that we can restart a server cleanly (say, to change a config setting) without losing the unlogged tables. If we don't allow that we'll make a laughing stock of ourselves. Honestly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 15:48:56 -0300 2010: On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote: Robert, All: I hope you're following the thread on -general about this feature. We're getting a lot of feedback. I haven't been; I'm not subscribed to general; it'd be useful to CC me next time. FWIW I've figured that being subscribed to the lists is good even if I have my mail client configured to hide these emails by default. It's a lot easier for searching stuff that someone else references. (I made the mistake of having it hide all pg-general email even though I was CC'ed, though, which is the trivial way to implement this. I don't recommend repeating this mistake.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Andrew Dunstan and...@dunslane.net writes: On 11/17/2010 02:22 PM, Kenneth Marshall wrote: I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I don't think you can fsync only in the shutdown checkpoint and assume your data is safe, if you didn't fsync a write a few moments earlier. Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
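Tom's two proposed variants can be summarized as a small behavior table. The mode names and semantics below come from his email; none of this is actual PostgreSQL code or syntax, just a toy model of the guarantees each mode could make:

```python
# Toy model of Tom's degraded-reliability table variants.
MODES = {
    #  mode:     (wal_logged, fsynced_at_checkpoints)
    "logged":    (True,  True),
    "unlogged":  (False, True),   # no xlog, still checkpoint/fsync
    "unsynced":  (False, False),  # no xlog, no fsync either
}

def survives_clean_restart(mode):
    # Per Tom: the shutdown checkpoint only makes the data durable if
    # the table's writes were also fsync'd at every earlier checkpoint;
    # you can't skip fsync all along and trust the final one.
    wal_logged, fsynced = MODES[mode]
    return fsynced

def survives_crash(mode):
    # Only WAL-logged tables survive a crash; both degraded modes get
    # truncated during crash recovery.
    wal_logged, fsynced = MODES[mode]
    return wal_logged

assert survives_clean_restart("unlogged") and not survives_crash("unlogged")
assert not survives_clean_restart("unsynced")
```

This is the "pick your poison" menu: the -general crowd's use case maps to "unlogged", while the memcached-replacement use case maps to "unsynced".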
Re: [HACKERS] unlogged tables
Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. We're assuming here that the checkpoint activity for the unlogged table causes significant load on a production system. Maybe we should do some testing before we try to make this overly complex? I wouldn't be surprised to find that on most filesystems the extra checkpointing of the unlogged tables adds only minor overhead. Shouldn't be hard to build out pgbench into something which will test this ... if only I had a suitable test machine available. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
[ forgot to comment on this part ] Andrew Dunstan and...@dunslane.net writes: To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. Well, if you mean the postmaster darn near never goes down, that's true, because we go out of our way to ensure it does as little as possible. But that has got zip to do with this discussion, because a backend crash has to be assumed to have corrupted unlogged tables. There are some folk over in -general who are wishfully thinking that only a postmaster crash would lose their unlogged data, but that's simply wrong. Backend crashes *will* truncate those tables; there is no way around that. The comment I made was that my experience as to how often backends crash might not square with production experience --- but you do have to draw the distinction between a backend crash and a postmaster crash. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Nov 17, 2010 at 19:57, Bruce Momjian br...@momjian.us wrote: Is FATAL, in general, enough to conclude the server is running? No - specifically, we will send FATAL when the database system is starting up, which is exactly the one we want to *avoid*. I think we should only exclude the password case. I guess we could also do all fatal *except* list, but that seems more fragile. I believe that the above argument is exactly backwards. What we want here is to check the result of postmaster.c's canAcceptConnections(), and there are only a finite number of error codes that can result from rejections there. If we get past that, there are a large number of possible failures, but all of them indicate that the postmaster is in principle willing to accept connections. Checking for password errors only is utterly wrong: any other type of auth failure would be the same for this purpose, as would no such database, no such user, too many connections, etc etc etc. Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring out how to implement that with this fix, whether it is part of pg_ctl or a separate binary. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git diff script is not portable
Magnus Hagander wrote: On Wed, Nov 17, 2010 at 19:54, Bruce Momjian br...@momjian.us wrote: Looking here: http://wiki.postgresql.org/wiki/Working_with_Git the script we recommend for doing context diffs for git, git-external-diff, is not portable: http://anarazel.de/pg/git-external-diff It uses diff -L, which is not supported by FreeBSD, and I imagine many other operating systems. If we want people to use this to produce context diffs, we should provide a portable script. I can modify it to be portable, but it is currently hosted on some other site. How should I handle this? How do I contact the author, or perhaps I should create a new one on ftp.postgresql.org and link to that. Do it the git way - fork it and put it on your github page. Or do it the pg way - fork it and put it in src/tools. src/tools is a very good idea. Objections? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 02:44 PM, Tom Lane wrote: [ forgot to comment on this part ] Andrew Dunstan and...@dunslane.net writes: To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. Well, if you mean the postmaster darn near never goes down, that's true, because we go out of our way to ensure it does as little as possible. But that has got zip to do with this discussion, because a backend crash has to be assumed to have corrupted unlogged tables. There are some folk over in -general who are wishfully thinking that only a postmaster crash would lose their unlogged data, but that's simply wrong. Backend crashes *will* truncate those tables; there is no way around that. The comment I made was that my experience as to how often backends crash might not square with production experience --- but you do have to draw the distinction between a backend crash and a postmaster crash. OK. I'd missed that. Thanks for clarifying. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan and...@dunslane.net wrote: The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. Could we slow down here a bit and talk through the ideas here in a logical fashion? The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. 
In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. Now, second use case. Consider someone who is currently running PostgreSQL in a non-durable configuration, with fsync=off, full_page_writes=off, and synchronous_commit=off. This person - who is based on someone I spoke with at PG West - is doing a large amount of data processing using PostGIS. Their typical workflow is to load a bunch of data, run a simulation, and then throw away the entire database. They don't want to pay the cost of durability because if they crash in mid-simulation they will simply rerun it. Being fast is more important. Whether or not this person will be happy with the proposed behavior is a bit harder to say. If it kills performance, they will definitely hate it. But if the performance penalty is only modest, they may enjoy the convenience of being able to shut down the database and start it up again later without losing data. Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. I also agree with Tom's sentiment above. 
To answer another point I see Tom made on the -general list: while individual backends may crash from time to time, crashes of the whole Postgres server are very rare in my experience in production environments. It's really pretty robust, unless you're doing crazy stuff. So that makes it all the more important that we can restart a server cleanly (say, to change a config setting) without losing the unlogged tables. If we don't allow that we'll make a laughing stock of ourselves. Honestly. Let's please not assume that there is only one reasonable option here, or that I have not thought about some of these issues. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
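The arithmetic behind Robert's memcached scenario is worth making explicit, since it is the crux of his first use case. A back-of-envelope check using his stated figures (60 GB of data, 25% dirtied per hour):

```python
# Robert's memcached-replacement example: data that previously never
# touched disk at all starts getting checkpointed.
data_gb = 60
dirty_fraction_per_hour = 0.25

checkpoint_gb_per_hour = data_gb * dirty_fraction_per_hour
assert checkpoint_gb_per_hour == 15.0  # the 15 GB/hour in the email

# Even spread perfectly evenly across the hour, that is a sustained
# write rate the memcached deployment never had to absorb:
mb_per_second = checkpoint_gb_per_hour * 1024 / 3600
assert round(mb_per_second, 1) == 4.3
```

In practice checkpoint writes are bursty rather than perfectly smoothed, so the instantaneous I/O spikes would be worse than the average suggests.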
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Bruce Momjian br...@momjian.us writes: Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. pg_controldata seems 100% unrelated to this problem. You cannot even tell if the postmaster is alive just by inspecting pg_control. What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring how to implement that with this fix, whether it is part of pg_ctl or a separate binary. Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
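The distinction Tom wants the ping to draw - connection refused versus any response from a live postmaster - can be sketched without libpq at all. The sketch below hand-builds a v3 protocol StartupMessage and classifies the outcome of a raw socket probe; it is an illustration of the classification, not the eventual libpq API, and the host/port/role names are placeholders:

```python
import socket
import struct

def startup_packet(user, database):
    """PostgreSQL v3 StartupMessage: int32 total length, int32
    protocol version 3.0 (196608), nul-terminated key/value pairs,
    then a terminating nul byte."""
    body = b""
    for key, val in (("user", user), ("database", database)):
        body += key.encode() + b"\0" + val.encode() + b"\0"
    body += b"\0"
    return struct.pack("!ii", len(body) + 8, 196608) + body

def pg_ping(host, port):
    """Refused connection => no postmaster at that address.  ANY bytes
    back -- even a FATAL ErrorResponse -- prove a postmaster answered,
    which is the signal pg_ctl actually needs."""
    try:
        with socket.create_connection((host, port), timeout=3) as s:
            s.sendall(startup_packet("ping", "postgres"))
            return "alive" if s.recv(1) else "down"
    except OSError:  # includes ConnectionRefusedError and timeouts
        return "down"

pkt = startup_packet("ping", "postgres")
assert struct.unpack("!i", pkt[:4])[0] == len(pkt)   # self-describing length
assert struct.unpack("!i", pkt[4:8])[0] == 196608    # protocol 3.0
```

A real implementation inside libpq would additionally parse the ErrorResponse's SQLSTATE to separate "starting up" from "up but rejecting this login", per Tom's canAcceptConnections() point upthread.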
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On Wednesday 17 November 2010 19:41:19, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. Yes, I completely missed testing with a non-default cache value. And it fails, of course: some values are generated a second time after a rollback We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. 
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
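Tom's proposed fix - stamp each SeqTable cache entry with the sequence's relfilenode and discard cached-but-unissued values whenever the stamp no longer matches - can be modeled in a few lines. This is a toy model of the invalidation rule only; the names are illustrative, not actual backend structures:

```python
# Toy model of relfilenode-keyed invalidation of a sequence's cache of
# preallocated-but-unissued values (Tom's SeqTable suggestion).
class SeqEntry:
    def __init__(self, relfilenode):
        self.relfilenode = relfilenode
        self.cached = []  # values fetched ahead under CACHE > 1

def nextval(entry, current_relfilenode, fetch_from_sequence):
    # If the sequence's relfilenode changed (TRUNCATE ... RESTART
    # IDENTITY gave it a new one, or a rollback reverted it), any
    # values cached under the old filenode belong to dead state.
    if entry.relfilenode != current_relfilenode:
        entry.cached = []
        entry.relfilenode = current_relfilenode
    if not entry.cached:
        entry.cached = fetch_from_sequence()  # refill from the sequence
    return entry.cached.pop(0)

e = SeqEntry(relfilenode=100)
e.cached = [4, 5, 6]                    # preallocated under filenode 100
assert nextval(e, 100, lambda: [7]) == 4   # same filenode: cache is valid
assert nextval(e, 101, lambda: [1, 2]) == 1  # filenode changed: cache dropped
```

This handles the duplicate-values-after-rollback failure Marc observed with a non-default CACHE setting; the separate currval question is a policy choice the model deliberately leaves open.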
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Agreed. So how do we pass that info to libpq without exceeding the value of fixing this problem? Should we parse pg_controldata output? pg_upgrade could use machine-readable output from that too. pg_controldata seems 100% unrelated to this problem. You cannot even tell if the postmaster is alive just by inspecting pg_control. I was thinking of this: $ pg_controldata /u/pg/data ... Database cluster state: shut down What we actually want here, and don't have, is the fabled pg_ping protocol... Well, we are basically figuring how to implement that with this fix, whether it is part of pg_ctl or a separate binary. Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. OK, so a new libpq function --- got it. Would we just pass the status from the backend or can it be done without backend modifications? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE recursion to typed tables
Peter Eisentraut pete...@gmx.net writes: Here is the patch that adds [RESTRICT|CASCADE] to ALTER TYPE ... ADD/ALTER/DROP/RENAME ATTRIBUTE, so that it recurses to typed tables. And here's my commitfest review of it: - patch applies cleanly - adds regression tests - passes them - is useful and needed, and something we don't already have - doesn't generate warnings (or I missed them) :) Code-wise, though, I wonder about the name of the recursing parameter of the renameatt_internal function in src/backend/commands/tablecmds.c, which seems to only get used to detect erroneous attempts at renaming the table column directly. Maybe it's only me not being used enough to PostgreSQL code yet, but here it distracts the code reader. Having another parameter called recurse is not helping either, but I don't see that this one needs to be changed. I'm not sure what a good name would be here; alter_type_cascade is an example that comes to mind, on the verbose side. As I think the issue is to be decided by a committer, I will go and mark this patch as ready for committer! Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:42 PM, Josh Berkus j...@agliodbs.com wrote: Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. We're assuming here that the checkpoint activity for the unlogged table causes significant load on a production system. Maybe we should do some testing before we try to make this overly complex? I wouldn't be surprised to find that on most filesystems the extra checkpointing of the unlogged tables adds only small minority overhead. Shouldn't be hard to build out pgbench into something which will test this ... if only I had a suitable test machine available. I guess the point I'd make here is that checkpoint I/O will be a problem for unlogged tables in exactly the same situations in which it is a problem for regular tables. There is some amount of I/O that your system can handle before the additional I/O caused by checkpoints starts to become a problem. If unlogged tables (or one particular variant of unlogged tables) don't need to participate in checkpoints, then you will be able to use unlogged tables, in situations where they are appropriate to the workload, to control your I/O load and hopefully keep it below the level where it causes a problem. Of course, there will also be workloads where your system has plenty of spare capacity (in which case it won't matter) or where your system is going to be overwhelmed anyway (in which case it doesn't really matter either). But if you are somewhere between those two extremes, this has to matter. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
On 10-11-17 03:00 PM, Marc Cousin wrote: On Wednesday 17 November 2010 19:41:19, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: - Does the feature work as advertised? Yes. It works consistently, isn't fooled by savepoints or multiple serials in a table, or concurrent transactions. I think there's a rather nasty problem here, which is what to do with the cached nextval/currval state. As submitted, the patch does the same thing as ALTER SEQUENCE RESTART (to wit, clear any cached unissued nextval values, but don't touch currval) at the time of resetting the sequence. That's fine, but what if the transaction later rolls back? The cached state is untouched by rollback, so if the transaction had done any nextval()s meanwhile, the cache will be out of step with the rolled-back sequence contents. Yes, I completely missed testing with a non-default cache value. And it fails, of course: some values are generated a second time after a rollback. I will look at addressing this in an updated patch. We never had to worry about this before because sequence operations didn't roll back, by definition. If we're going to add a situation where they do roll back, we need to consider the case. I think we can arrange to clear cached unissued values on the next attempt to nextval() the sequence, by dint of adding the relfilenode to SeqTable entries and clearing cached state whenever we note that it doesn't match the current relfilenode of the sequence. However, I'm unsure what ought to happen to currval. It doesn't seem too practical to try to roll it back to its pre-transaction value. Should we leave it alone (ie, possibly reflecting a value that was assigned inside the failed transaction)? The other alternative would be to clear it as though nextval had never been issued at all in the session. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. 
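Tom's suggested invalidation - remember which relfilenode the cached values came from, and discard them on the next nextval() when the sequence's file has changed (as it does when TRUNCATE ... RESTART IDENTITY rolls back) - might look roughly like the following sketch. The struct layout and names are illustrative, not the actual SeqTable code:

```c
typedef unsigned int Oid;

/* Hypothetical, simplified per-sequence cache entry */
typedef struct SeqTableEntry
{
    Oid  filenode;        /* relfilenode the cache was filled from */
    long cached[32];      /* unissued values fetched in one WAL-logged batch */
    int  ncached;         /* how many of them remain unissued */
} SeqTableEntry;

/*
 * Called at the top of a hypothetical nextval(); curnode is the sequence's
 * current relfilenode as seen in the catalogs.  If it differs from the one
 * the cache was filled from, the sequence file has been rewritten (e.g. by
 * a rolled-back TRUNCATE ... RESTART IDENTITY), so the cache is stale.
 */
static void
seq_check_cache(SeqTableEntry *entry, Oid curnode)
{
    if (entry->filenode != curnode)
    {
        entry->ncached = 0;        /* drop stale unissued values */
        entry->filenode = curnode;
    }
}
```

Note this deliberately leaves currval alone, matching the "leave it alone" position above.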
I agree applications probably shouldn't be using currval after a failed transaction, which is why having it return as if it hadn't been issued sounds like the more reasonable behaviour. If an application tries a currval following the rollback, then at least the application won't get a bogus value. It is better to return an error than to let the application continue on thinking it has a sequence value that will be (or already has been) assigned to some other session. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wednesday 17 November 2010 20:54:14 Robert Haas wrote: On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan and...@dunslane.net wrote: The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want. I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. Could we slow down here a bit and talk through the ideas in a logical fashion? The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. How can you get a buffer which was not written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. 
When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. To really support that use case we would first need to make shared_buffers properly scale to 64GB - which unfortunately, in my experience, is not yet the case. Also, see the issues in the previous paragraph - I have severe doubts you can support such a memcached scenario with pg. Either you spill to disk if your buffers overflow (fine with me) or you need to throw away data the way memcached does. I doubt there is a sensible implementation in pg for the latter. So you will have to write to disk at some point... Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. Maybe I am missing something - but why does this imply we have to write data at checkpoints? 
Just fsyncing every file belonging to a persistently-unlogged (or whatever sensible name anyone can come up with) table is not prohibitively expensive - in fact, doing that on a local $PGDATA with approx. 300GB and loads of tables takes less than 15s on a system with a hot inode/dentry cache and no dirty files. (just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files being a tiny C program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files and then fsyncing every one). The assumption of a hot inode cache is realistic I think. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 11/17/2010 02:22 PM, Kenneth Marshall wrote: I would be fine with only having a safe shutdown with unlogged tables and skip the checkpoint I/O all other times. Yeah, I was just thinking something like that would be good, and should overcome Robert's objection to the whole idea. I don't think you can fsync only in the shutdown checkpoint and assume your data is safe, if you didn't fsync a write a few moments earlier. Now, a few minutes ago Robert was muttering about supporting more than one kind of degraded-reliability table. I could see inventing unlogged tables, which means exactly that (no xlog support, but we still checkpoint/fsync as usual), and unsynced tables which also/instead suppress fsync activity. The former type could be assumed to survive a clean shutdown/restart, while the latter wouldn't. This would let people pick their poison. OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). Permanent tables are WAL-logged, participate in checkpoints, and are fsync'd. They survive crashes and clean restarts, and are replicated. 2. UNLOGGED (what this patch currently implements). Unlogged tables are not WAL-logged, but they do participate in checkpoints and they are fsync'd on request. They survive clean restarts, but on a crash they are truncated. They are not replicated. 3. UNSYNCED (future work). Unsynced tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. After any sort of crash or shutdown, clean or otherwise, they are truncated. They are not replicated. 4. GLOBAL TEMPORARY (future work). Global temporary tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. The contents of each global temporary table are private to that session, so that they can use the local buffer manager rather than shared buffers. 
Multiple sessions can use a global temporary table at the same time, and each sees separate contents. At session exit, any contents inserted by the owning backend are lost; since all sessions exit on crash or shutdown, all contents are also lost at that time. 5. LOCAL TEMPORARY (our current temp tables). Local temporary tables are not WAL-logged, do not participate in checkpoints, and are never fsync'd. The table definition and all of its contents are private to the session, so that they are dropped at session exit (or at transaction end if ON COMMIT DROP is used). Since all sessions exit on crash or shutdown, all table definitions and all table contents are lost at that time. It's possible to imagine a few more stops on this hierarchy. For example, you could have an ASYNCHRONOUS table between (1) and (2) that always acts as if synchronous_commit=off, but is otherwise replicated and durable over crashes; or a MINIMALLY LOGGED table that is XLOG'd as if wal_level=minimal even when the actual value of wal_level is otherwise, and is therefore crash-safe but not replication-safe; or a level that is similar to unlogged but we XLOG the first event that dirties a page after each checkpoint, and therefore even on a crash we need only remove the tables for which such an XLOG record has been written. All of those are a bit speculative perhaps but we could jam them in there if there's demand, I suppose. I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sound a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. 
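The five levels above can be summarized as a small property table. This is just a restatement of the list for easy comparison; the struct and field names are illustrative, not anything in the backend:

```c
#include <stdbool.h>

/* One row per proposed persistence level (illustrative only) */
typedef struct PersistenceLevel
{
    const char *name;
    bool wal_logged;       /* changes written to WAL */
    bool checkpointed;     /* buffers written/fsync'd at checkpoints */
    bool survives_restart; /* contents kept across a clean shutdown */
    bool survives_crash;   /* contents kept after a crash */
    bool replicated;
} PersistenceLevel;

static const PersistenceLevel levels[] = {
    /* name                wal    ckpt   restart crash  repl */
    {"PERMANENT",          true,  true,  true,   true,  true},
    {"UNLOGGED",           false, true,  true,   false, false},
    {"UNSYNCED",           false, false, false,  false, false},
    {"GLOBAL TEMPORARY",   false, false, false,  false, false},
    {"LOCAL TEMPORARY",    false, false, false,  false, false},
};
```

The last three rows differ not in these durability flags but in visibility: UNSYNCED contents are shared, while the two TEMPORARY flavors are per-session (and LOCAL TEMPORARY additionally makes the table definition itself session-private).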
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 03:37 PM, Robert Haas wrote: I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are a reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sounds a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund and...@anarazel.de wrote: The customer is always right, but the informed customer makes better decisions than the uninformed customer. This idea, as proposed, does not work. If you only include dirty buffers at the final checkpoint before shutting down, you have no guarantee that any buffers that you either didn't write or didn't fsync previously are actually on disk. Therefore, you have no guarantee that the table data is not corrupted. So you really have to decide between including the unlogged-table buffers in EVERY checkpoint and not ever including them at all. Which one is right depends on your use case. How can you get a buffer which was not written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... Oh, you're right. We always have to write buffers before kicking them out of shared_buffers, but if we don't fsync them we have no guarantee they're actually on disk. For example, consider the poster who said that, when this feature is available, they plan to try ripping out their memcached instance and replacing it with PostgreSQL running unlogged tables. Suppose this poster (or someone else in a similar situation) has a 64 GB machine and is currently running a 60 GB memcached instance on it, which is not an unrealistic scenario for memcached. Suppose further that he dirties 25% of that data each hour. memcached is currently doing no writes to disk. When he switches to PostgreSQL and sets checkpoint_segments to a gazillion and checkpoint_timeout to the maximum, he's going to start writing 15 GB of data to disk every hour - data which he clearly doesn't care about losing, or preserving across restarts, because he's currently storing it in memcached. In fact, with memcached, he'll not only lose data at shutdown - he'll lose data on a regular basis when everything is running normally. 
We can try to convince ourselves that someone in this situation will not care about needing to get 15GB of disposable data per hour from memory to disk in order to have a feature that he doesn't need, but I think it's going to be pretty hard to make that credible. To really support that use case we would first need to make shared_buffers properly scale to 64GB - which unfortunately, in my experience, is not yet the case. Well, that's something to aspire to. :-) Also, see the issues in the former paragraph - I have severe doubts you can support such a memcached scenario with pg. Either you spill to disk if your buffers overflow (fine with me) or you need to throw away data the way memcached does. I doubt there is a sensible implementation in pg for the latter. So you will have to write to disk at some point... I agree that there are difficulties, but again, doing checkpoint I/O for data that the user was willing to throw away is going in the wrong direction. Third use case. Someone on pgsql-general mentioned that they want to write logs to PG, and can abide losing them if a crash happens, but not on a clean shutdown and restart. This person clearly shuts down their production database a lot more often than I do, but that is OK. By explicit stipulation, they want the survive-a-clean-shutdown behavior. I have no problem supporting that use case, providing they are willing to take the associated performance penalty at checkpoint time, which we don't know because we haven't asked, but I'm fine with assuming it's useful even though I probably wouldn't use it much myself. Maybe I am missing something - but why does this imply we have to write data at checkpoints? Just fsyncing every file belonging to a persistently-unlogged (or whatever sensible name anyone can come up with) table is not prohibitively expensive - in fact, doing that on a local $PGDATA with approx. 300GB and loads of tables takes less than 15s on a system with a hot inode/dentry cache and no dirty files. 
(just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files being a tiny C program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files and then fsyncing every one). The assumption of a hot inode cache is realistic I think. Hmm. I don't really want to try to do it in this patch because it's complicated enough already, but if people don't mind the shutdown sequence potentially being slowed down a bit, that might allow us to have the best of both worlds without needing to invent multiple durability levels. I was sort of assuming that people wouldn't want to slow down the shutdown sequence to avoid losing data they've already declared isn't that valuable, but evidently I underestimated the demand for kinda-durable tables. If the overhead of doing this isn't too severe, it might be the way to go. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 03:48:52PM -0500, Andrew Dunstan wrote: On 11/17/2010 03:37 PM, Robert Haas wrote: I don't particularly care for the name UNSYNCED, and I'm starting not to like UNLOGGED much either, although at least that one is an actual word. PERMANENT and the flavors of TEMPORARY are a reasonably comprehensible as a description of user-visible behavior, but UNLOGGED and UNSYNCED sounds a lot like they're discussing internal details that the user might not actually understand or care about. I don't have a better idea right off the top of my head, though. Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. +1 for describing the end-user-visible behavior. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 12:48 PM, Andrew Dunstan wrote: Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED. UNSAFE and EXTREMELY_UNSAFE?? :) Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Nov 17, 2010, at 4:00 PM, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear Unless one thinks that the types could be combined- perhaps a table declaration could use both UNLOGGED and UNSYNCED? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Andrew's suggestion of VOLATILE is pretty good. It's hard to come up with multiple words that express gradations of "we might decide to chuck your data if things go South", though. Then again if we go with Andres's suggestion maybe we can get by with one level. Or if we still end up with multiple levels, maybe it's best to use VOLATILE for everything between 1 and 4, and then have a subordinate clause to specify gradations. CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 04:00 PM, Kevin Grittner wrote: Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Robert's right. These names won't convey much to someone not steeped in our technology. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 17:51:37 -0300 2010: On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund and...@anarazel.de wrote: How can you get a buffer which was no written out *at all*? Do you want to force all such pages to stay in shared_buffers? That sounds quite a bit more complicated than what you proposed... Oh, you're right. We always have to write buffers before kicking them out of shared_buffers, but if we don't fsync them we have no guarantee they're actually on disk. You could just open all the segments and fsync them. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 04:05:56PM -0500, Robert Haas wrote: On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED EVANESCENT? I'm starting not to like UNLOGGED much either EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear I think Andrew's suggestion of VOLATILE is pretty good. It's hard to come up with multiple words that express gradations of "we might decide to chuck your data if things go South", though. Then again if we go with Andres's suggestion maybe we can get by with one level. Or if we still end up with multiple levels, maybe it's best to use VOLATILE for everything between 1 and 4, and then have a subordinate clause to specify gradations. CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; How about something like: OPTIONS (SYNC=no, LOG=no, ... ) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
Excerpts from Robert Haas's message of mié nov 17 18:05:56 -0300 2010: CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS THAT EXPLAIN THE DETAILS GO HERE; What about some reloptions? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On 11/17/2010 11:44 AM, Tom Lane wrote: ...because a backend crash has to be assumed to have corrupted unlogged tables... So in a typical use-case, say storing session data on a web-site, one crashed backend could wreck sessions for some or all of the site? Is there a mechanism in the proposal that would allow a client to determine the state of a table (good, truncated, wrecked, etc.)? Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY
Steve Singer ssin...@ca.afilias.info writes: I will look at addressing this in an updated patch. I've nearly finished revising the patch already, don't worry about it. Should currval really be used after a failed transaction? Right now, we can have a value that has been generated inside a rolled-back transaction too. I'd vote for leave it alone. I agree applications probably shouldn't be using currval after a failed transaction Well, people can do that now, and it doesn't throw an error. I'm inclined to agree with Marc that just leaving it alone (ie, it returns the last value produced, whether the transaction rolls back or not) is the best thing. There's inherently going to be some inconsistency here, since there's no such thing as a transactional sequence change otherwise. I don't see the point of going way out of our way to move the inconsistencies around. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Possibly the cleanest fix is to implement pg_ping as a libpq function. You do have to distinguish connection failures (ie connection refused) from errors that came back from the postmaster, and the easiest place to be doing that is inside libpq. OK, so a new libpq function --- got it. Would we just pass the status from the backend or can it be done without backend modifications? It would definitely be better to do it without backend mods, so that the functionality would work against back-branch postmasters. To my mind, the entire purpose of such a function is to classify the possible errors so that the caller doesn't have to. So I wouldn't consider that it ought to pass back the status from the backend. I think what we basically want is a function that takes a conninfo string (or one of the variants of that) and returns an enum defined more or less like this: * failed to connect to postmaster * connected, but postmaster is not accepting sessions * postmaster is up and accepting sessions I'm not sure those are exactly the categories we want, but something close to that. In particular, I don't know if there's any value in subdividing the not accepting sessions status --- pg_ctl doesn't really care, but other use-cases might want to tell the difference between the various canAcceptConnections failure states. BTW, it is annoying that we can't definitively distinguish postmaster is not running from a connectivity problem, but I can't see a way around that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
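The classification Tom proposes could be sketched as follows. In a real libpq function the raw outcome would come from the connection attempt itself; here the attempt result is modeled as an enum so that only the mapping is shown, and all names (PingStatus, classify_ping, and the ConnOutcome values) are hypothetical, not existing libpq API:

```c
/* the three results the hypothetical pg_ping function would report */
typedef enum PingStatus
{
    PING_NO_RESPONSE,  /* failed to connect to the postmaster at all */
    PING_REJECTED,     /* postmaster answered, but is not accepting sessions */
    PING_OK            /* postmaster is up and accepting sessions */
} PingStatus;

/* modeled raw outcomes of the underlying connection attempt */
typedef enum ConnOutcome
{
    CONN_REFUSED,      /* connect() failed: down, or a connectivity problem */
    CONN_SERVER_ERROR, /* server sent an error, e.g. still starting up */
    CONN_ACCEPTED      /* server was willing to proceed with a session */
} ConnOutcome;

/* collapse the raw outcome into the coarse status the caller wants */
static PingStatus
classify_ping(ConnOutcome outcome)
{
    switch (outcome)
    {
        case CONN_REFUSED:
            return PING_NO_RESPONSE;
        case CONN_SERVER_ERROR:
            return PING_REJECTED;
        default:
            return PING_OK;
    }
}
```

As Tom notes, the CONN_REFUSED case inherently conflates "postmaster not running" with a network problem, and a finer-grained version might split CONN_SERVER_ERROR by the different canAcceptConnections failure states.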
Re: [HACKERS] We need to log aborted autovacuums
Josh Berkus j...@agliodbs.com writes: I've been trying to diagnose in a production database why certain tables never get autovacuumed despite having a substantial % of updates. The obvious reason is locks blocking autovacuum from vacuuming the table ... but the trick is we don't log such blocking behavior, at all. This means that there is no possible way for a user to figure out *why* his tables aren't getting autovacuumed. And yes, this is a common problem. A quick survey on IRC found 3 active users on channel (out of 20 or so) who'd encountered it. The current case I'm looking at is a table with over 70% bloat which hasn't been autovacuumed since the database was upgraded a month ago. What I'd like to do is add some logging code to autovacuum.c so that if log_autovacuum is any value other than -1, failure to vacuum due to locks gets logged. Does this make sense? It's hard to tell, because you're just handwaving about what it is you think isn't being logged; nor is it clear whether you have any evidence that locks are the problem. Offhand I'd think it at least as likely that autovacuum thinks it doesn't need to do anything, perhaps because of a statistics issue. There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MergeAppend costing
Robert Haas robertmh...@gmail.com writes: See the attached test case. With that setup, this uses MergeAppend: explain select * from ma_parent order by id limit 10; But this one does not: explain select * from ma_parent order by name limit 10; ...which seems odd, because the index on ma_child1 and sorting the other two ought to still be better than appending all three and sorting the whole thing. Not really; what you're not accounting for is that the top-level sort is a lot cheaper than a full sort of the large child relation would be, because it gets hacked to do a top-N sort instead of a full sort. What this example suggests is that we should consider ways to pass down the top-N-ness to sorts executed as part of a MergeAppend tree. That seems a tad messy though, both in the executor and the planner. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
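For readers unfamiliar with why a top-N sort is so much cheaper than a full sort: it only ever keeps the N best rows seen so far, so memory stays bounded and most rows are rejected after a single comparison. A minimal sketch of the idea using a bounded sorted array (the executor actually uses a heap; this is illustrative, not the tuplesort code):

```c
#include <stddef.h>

/*
 * Maintain the n smallest values seen so far in best[0..*count-1],
 * kept sorted ascending.  With n fixed (the LIMIT), each incoming
 * value costs O(n) at worst and O(1) when it is too large to keep.
 */
static void
topn_add(int *best, size_t *count, size_t n, int value)
{
    size_t i;

    if (*count == n && value >= best[n - 1])
        return;                 /* worse than everything kept: reject */

    if (*count < n)
        (*count)++;             /* room left: grow the kept set */

    /* shift larger kept values right, then insert in order */
    for (i = *count - 1; i > 0 && best[i - 1] > value; i--)
        best[i] = best[i - 1];
    best[i] = value;
}
```

Passing the LIMIT down to each sort under a MergeAppend would let every child keep only N rows instead of sorting its whole input, which is the optimization Tom describes.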
Re: [HACKERS] We need to log aborted autovacuums
> It's hard to tell, because you're just handwaving about what it is you think isn't being logged; nor is it clear whether you have any evidence that locks are the problem. Offhand I'd think it at least as likely that autovacuum thinks it doesn't need to do anything, perhaps because of a statistics issue. There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible.

Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3. On a busy production server, using any of the DEBUG levels is pretty much out of the question ... they can produce up to 1GB/minute in output.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] We need to log aborted autovacuums
Josh Berkus j...@agliodbs.com writes:

>> There *is* an elog(DEBUG3) in autovacuum.c that reports whether autovac thinks a table needs vacuumed/analyzed ... maybe that needs to be a tad more user-accessible.
>
> Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3. On a busy production server, using any of the DEBUG levels is pretty much out of the question ... they can produce up to 1GB/minute in output.

Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?

regards, tom lane
Re: [HACKERS] We need to log aborted autovacuums
On Thu, Nov 18, 2010 at 08:35, Tom Lane t...@sss.pgh.pa.us wrote:

>> Yeah, it would be really good to be able to log that without bumping the log levels of the server in general to DEBUG3.
>
> Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?

How about adding a special autovacuum role, and running autovacuum as that role instead of as the database owner? If we had such a role for the autovacuum processes, we could set log_min_messages to DEBUG3 for only that role via per-user configuration.

--
Itagaki Takahiro
Re: [HACKERS] We need to log aborted autovacuums
Itagaki Takahiro itagaki.takah...@gmail.com writes:

> On Thu, Nov 18, 2010 at 08:35, Tom Lane t...@sss.pgh.pa.us wrote:
>> Well, the way to deal with that would be to add a GUC that enables reporting of those messages at LOG level. But it's a bit hard to argue that we need such a thing without more evidence. Maybe you could just locally modify the DEBUG3 to LOG and see whether it teaches you anything?
>
> How about adding a special autovacuum role, and running autovacuum as that role instead of as the database owner? If we had such a role for the autovacuum processes, we could set log_min_messages to DEBUG3 for only that role via per-user configuration.

That seems like a major kluge ... and anyway it doesn't fix the problem, because DEBUG3 is still going to result in a lot of unwanted log output, even if it's confined to autovacuum.

regards, tom lane