Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY

2010-11-17 Thread Marc Cousin
On Wednesday 17 November 2010 15:50:36, Jaime Casanova wrote:
> On Wed, Nov 17, 2010 at 8:13 AM, Marc Cousin  wrote:
> > - Does the feature work as advertised?
> > 
> > Yes. It works consistently, isn't fooled by savepoints or multiple
> > serials in a table, or concurrent transactions
> 
> I haven't tested this nor read the patch, but I'm wondering what
> happens in the presence of a prepared transaction (2PC). Did you try
> concurrent transactions with different isolation levels?
I haven't tested with 2PC.

I didn't check with different isolation levels either.

I just verified that locking happens as it should: TRUNCATE is blocked by a 
transaction already holding an AccessShareLock on the table, and vice versa.

And that rolling back, or rolling back to a savepoint, restores the sequence to 
the correct state: the sequence isn't restored to its value at the savepoint, 
but to its last value before the truncate.

I don't see a special test case for different isolation levels or 2PC. What 
do you have in mind?
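For reference, the behavior I verified can be sketched like this (table and
column names are made up for illustration, against a running server):

```sql
-- Hypothetical session sketching the behavior under review.
CREATE TABLE t (id serial, v text);
INSERT INTO t (v) VALUES ('a'), ('b');      -- sequence is now at 2

BEGIN;
SAVEPOINT sp;
TRUNCATE t RESTART IDENTITY;                -- sequence reset to 1
ROLLBACK TO SAVEPOINT sp;
ROLLBACK;

-- The sequence continues from its last value before the TRUNCATE,
-- not from its value at the savepoint.
INSERT INTO t (v) VALUES ('c');             -- gets id 3
```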

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY

2010-11-17 Thread Jaime Casanova
On Wed, Nov 17, 2010 at 8:13 AM, Marc Cousin  wrote:
>
> - Does the feature work as advertised?
>
> Yes. It works consistently, isn't fooled by savepoints or multiple serials
> in a table, or concurrent transactions
>

I haven't tested this nor read the patch, but I'm wondering what
happens in the presence of a prepared transaction (2PC). Did you try
concurrent transactions with different isolation levels?


-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Markus Wanner
On 11/17/2010 02:19 PM, Alvaro Herrera wrote:
> Well, the autovacuum mechanism involves a lot of back-and-forth between
> launcher and postmaster, which includes some signals, a fork() and
> backend initialization.  The failure possibilities are endless.
> 
> Fork failure communication is similarly brittle.

I certainly agree to that. However, a re-connecting mechanism wouldn't
allow us to get rid of the existing avworker startup infrastructure
entirely.

And for increased robustness, we'd require a less brittle re-connecting
mechanism. Given Robert's list, that doesn't seem trivial, either. (But
still doable, yes).

> Right now we have this "delay": if the process is not up and running in
> 60 seconds, we have to assume that "something" happened, and we no
> longer wait for it.  If we knew the process was already there, we could
> leave it alone; we'd know it would get to its duty eventually.

You are assuming the presence of a pool here. Which is fine; it's just not
something that a re-connecting feature would solve per se. (Says he who
coded the bgworkers pool thingie.)

Regards

Markus Wanner



[HACKERS] describe objects, as in pg_depend

2010-11-17 Thread Alvaro Herrera
Hi,

A customer of ours (Enova Financial) requested the ability to describe
objects in pg_depend.  The wiki contains a simplistic SQL snippet that
does the task, but only for some of the object types, and it's rather
ugly.  It struck me that we could fulfill this very easily by exposing
the getObjectDescription() function at the SQL level, as in the attached
module.

I propose we include this as a builtin function.

Opinions?

-- 
Álvaro Herrera 
#include "postgres.h"

#include "catalog/dependency.h"
#include "fmgr.h"
#include "utils/builtins.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum describe_object(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(describe_object);

Datum
describe_object(PG_FUNCTION_ARGS)
{
	Oid			classid = PG_GETARG_OID(0);
	Oid			objid = PG_GETARG_OID(1);
	int32		subobjid = PG_GETARG_INT32(2);
	ObjectAddress address;
	char	   *description = NULL;

	if (classid != InvalidOid)
	{
		address.classId = classid;
		address.objectId = objid;
		address.objectSubId = subobjid;

		description = getObjectDescription(&address);
	}

	if (!description || description[0] == '\0')
		PG_RETURN_NULL();

	PG_RETURN_TEXT_P(cstring_to_text(description));
}
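For illustration, assuming the function is exposed at the SQL level as
describe_object(oid, oid, int4) returning text, usage against pg_depend might
look like:

```sql
-- Hypothetical usage: human-readable descriptions for pg_depend entries.
SELECT classid, objid, objsubid,
       describe_object(classid, objid, objsubid) AS referencing,
       describe_object(refclassid, refobjid, refobjsubid) AS referenced
FROM pg_depend
LIMIT 10;
```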



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Alvaro Herrera
Excerpts from Markus Wanner's message of Wed Nov 17 09:57:18 -0300 2010:
> On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
> > I don't think it's a speed thing only.  It would be a great thing to
> > have in autovacuum, for example, where we have constant problem reports
> > because the system failed to fork a new backend.  If we could simply
> > reuse an already existing one, it would be a lot more robust.
> 
> Hm, that's an interesting point.
> 
> To actually increase robustness, it would have to be a failure scenario
> that (temporarily) prevents forking, but allows an existing backend to
> continue to do work (i.e. the ability to allocate memory or open files
> come to mind).

Well, the autovacuum mechanism involves a lot of back-and-forth between
launcher and postmaster, which includes some signals, a fork() and
backend initialization.  The failure possibilities are endless.

Fork failure communication is similarly brittle.

> Any idea about what's usually causing these fork() failures? I'm asking
> because I'm afraid that for example, in case of an out of memory
> condition, we'd just hit an OOM error later on, without being able to
> perform the VACUUM job, either.

To be honest I have no idea.  Sometimes the server is just too loaded.
Right now we have this "delay": if the process is not up and running in
60 seconds, we have to assume that "something" happened, and we no
longer wait for it.  If we knew the process was already there, we could
leave it alone; we'd know it would get to its duty eventually.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Review: rollback sequence reset for TRUNCATE ... RESTART IDENTITY

2010-11-17 Thread Marc Cousin
Hi, Here is my review of 'rollback sequence reset for TRUNCATE ... RESTART 
IDENTITY' patch.

- Is the patch in context diff format?
It's in git diff format. I guess that's OK?

- Does it apply cleanly to the current git master?
Yes

- Does it include reasonable tests, necessary doc patches, etc?
Doc: Yes, it removes the warning about TRUNCATE ... RESTART IDENTITY, which is 
the point of the patch
Tests: There is a new regression test added for restart identity. And 'make 
check' passes (tested on linux).

- Usability review (skills needed: test-fu, ability to find and read spec)

- Read what the patch is supposed to do, and consider:

  - Does the patch actually implement that?
Yes.

  - Do we want that?
I think so; it removes a surprising limitation of TRUNCATE

  - Do we already have it?
No

  - Does it follow SQL spec, or the community-agreed behavior?
I think so

  - Does it include pg_dump support (if applicable)?
Not applicable

  - Are there dangers?
Not that I can think of.

  - Have all the bases been covered?
I think so.

- Feature test (skills needed: patch, configure, make, pipe errors to log)

- Apply the patch, compile it and test:

  - Does the feature work as advertised?
Yes. It works consistently, isn't fooled by savepoints or multiple serials in
a table, or concurrent transactions.

  - Are there corner cases the author has failed to consider?
I don't think so.

  - Are there any assertion failures or crashes?
No.

Performance review (skills n

Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Markus Wanner
On 11/17/2010 01:27 PM, Alvaro Herrera wrote:
> I don't think it's a speed thing only.  It would be a great thing to
> have in autovacuum, for example, where we have constant problem reports
> because the system failed to fork a new backend.  If we could simply
> reuse an already existing one, it would be a lot more robust.

Hm, that's an interesting point.

To actually increase robustness, it would have to be a failure scenario
that (temporarily) prevents forking, but allows an existing backend to
continue to do work (i.e. the ability to allocate memory or open files
come to mind).

Any idea about what's usually causing these fork() failures? I'm asking
because I'm afraid that for example, in case of an out of memory
condition, we'd just hit an OOM error later on, without being able to
perform the VACUUM job, either.

Regards

Markus Wanner



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Alvaro Herrera
Excerpts from Markus Wanner's message of Wed Nov 17 07:04:04 -0300 2010:

> > Thoughts?
> 
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one. Which basically costs an
> additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?

I don't think it's a speed thing only.  It would be a great thing to
have in autovacuum, for example, where we have constant problem reports
because the system failed to fork a new backend.  If we could simply
reuse an already existing one, it would be a lot more robust.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Overlap operators for ARRAY and NULLs

2010-11-17 Thread Itagaki Takahiro
There might be inconsistency in overlap operators for array
types ( <@, @>, and && ) when the argument arrays contain NULLs.

- SELECT 2 = ANY (ARRAY[1, NULL])   => NULL
- SELECT ARRAY[2] <@ ARRAY[1, NULL] => false

NULL means "unknown" by definition, so should it return NULL
rather than false?

I found this issue when I read the spec for MULTISET. In the SQL
standard, the following expression returns NULL. I was thinking of
making SUBMULTISET OF an alias of the <@ operator, but they seem
to be incompatible.

- MULTISET[2] SUBMULTISET OF MULTISET[1, NULL] => NULL

Will we change the operator's behavior? It would be more
consistent, but incompatible with older server versions.
If that's impossible, I'll add a separate submultiset_of() function
instead of the <@ operator for MULTISET support.

-- 
Itagaki Takahiro



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Markus Wanner
On 11/17/2010 12:09 PM, Andres Freund wrote:
> I am thinking of a connection-pooler-like setup. Quite often your main load
> goes towards a single database - in that situation you don't have to reset
> the database id most of the time.

Okay, so that's what I'd call a connection-reset or something. Or
probably a re-authentication feature which allows switching users (and
thus permissions on that same database). Could make sense for a
connection pooler, yes. Not having to flush caches in that case could
even turn out to be a good win. And I think it's a lot simpler to just
switch the user than to switch the database. Such a feature looks more
promising to me. (Do we have other connection related and unchangeable
state?)

As side notes: for the bgworkers in Postgres-R, I'm currently re-using
existing backends. As they only do change set application, the amount of
connection-reset required is minimal (well, there isn't a client
connection for these kinds of backends, in the first place, but that's
another story). Plus they are always acting as superuser, no
authentication or user switching required in that case.

For parallel querying as well as async transactions, it might make sense
to be able to switch users but remain connected to the same database.

Regards

Markus Wanner



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Andres Freund
On Wednesday 17 November 2010 11:58:33 Markus Wanner wrote:
> Andreas,
> 
> On 11/17/2010 11:38 AM, Andres Freund wrote:
> > Well, one could optimize most of the resetting away if the old
> > MyDatabaseId and the new one are the same - an optimization which is
> > hardly possible with forking new backends.
> Uh? Why not simply re-use the same backend, then? Or do you think of
> re-connecting to the same database as a way to reset your connection?
I am thinking of a connection-pooler-like setup. Quite often your main load 
goes towards a single database - in that situation you don't have to reset the 
database id most of the time.

Andres



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Markus Wanner
Andreas,

On 11/17/2010 11:38 AM, Andres Freund wrote:
> Well, one could optimize most of the resetting away if the old
> MyDatabaseId and the new one are the same - an optimization which is hardly
> possible with forking new backends.

Uh? Why not simply re-use the same backend, then? Or do you think of
re-connecting to the same database as a way to reset your connection?

> Also I think it could lower the impact due to locking the procarray and
> related areas.

That may be, yes. But as pointed out by Tom and Alvaro, you'd have to
adhere to a compatible sequence of changes to shared memory to avoid
race conditions. That possibly requires using a very similar locking
sequence as the combination of a detaching and a newly starting backend
would use.

Regards

Markus Wanner



Re: [HACKERS] multi-platform, multi-locale regression tests

2010-11-17 Thread Markus Wanner
Kevin,

On 11/13/2010 01:28 AM, Kevin Grittner wrote:
> Should anyone else run into this, it's controlled by this in the test
> scheduling definitions (the tdef values):
>  
> 'xfail': True
>  
> There are other test flags you can override here, like 'skip' to skip
> a test.

Correct. Looks like dtester urgently needs documentation...

Regards

Markus Wanner



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Andres Freund
On Wednesday 17 November 2010 11:04:04 Markus Wanner wrote:
> Robert,
> 
> On 11/15/2010 05:39 AM, Robert Haas wrote:
> > I've spent a few hours poring over the source code with a
> > coarse-toothed comb, trying to figure out just exactly what might
> > break if we changed MyDatabaseId after backend startup time, or in
> > other words, allowed a backend to unbind from the database to which it
> > was originally bound and rebind to a new one.  This is related to the
> > periodic conversations we've had about a built-in connection pooler,
> > and/or maintaining a pool of worker threads that could be used to
> > service parallel query, replication sets, etc.  What follows is not
> > meant to be a concrete design proposal; it's basic research that may
> > lead to a proposal at some time in the future.  Still, comments are
> > welcome.
> 
> Thanks a lot for doing that, saved me a couple hours (presumably more
> than it cost you :-)
> 
> > Thoughts?
> 
> The question obviously is whether or not this is faster than just
> terminating one backend and starting a new one. Which basically costs an
> additional termination and re-creation of a process (i.e. fork())
> AFAICS. Or what other savings do you envision?
Well, one could optimize most of the resetting away if the old 
MyDatabaseId and the new one are the same - an optimization which is hardly 
possible with forking new backends.

Also I think it could lower the impact due to locking the procarray and 
related areas.

Andres



Re: [HACKERS] changing MyDatabaseId

2010-11-17 Thread Markus Wanner
Robert,

On 11/15/2010 05:39 AM, Robert Haas wrote:
> I've spent a few hours poring over the source code with a
> coarse-toothed comb, trying to figure out just exactly what might
> break if we changed MyDatabaseId after backend startup time, or in
> other words, allowed a backend to unbind from the database to which it
> was originally bound and rebind to a new one.  This is related to the
> periodic conversations we've had about a built-in connection pooler,
> and/or maintaining a pool of worker threads that could be used to
> service parallel query, replication sets, etc.  What follows is not
> meant to be a concrete design proposal; it's basic research that may
> lead to a proposal at some time in the future.  Still, comments are
> welcome.

Thanks a lot for doing that, saved me a couple hours (presumably more
than it cost you :-)

> Thoughts?

The question obviously is whether or not this is faster than just
terminating one backend and starting a new one. Which basically costs an
additional termination and re-creation of a process (i.e. fork())
AFAICS. Or what other savings do you envision?

If that's it, it certainly seems like a huge amount of work for very
little benefit. Or does this feature enable something that's impossible
to do otherwise?

Regards

Markus Wanner


