Re: [HACKERS] [PATCH] V3: Idle in transaction cancellation

2010-10-30 Thread Andres Freund
Hi,

On Tuesday 19 October 2010 16:18:29 Kevin Grittner wrote:
 Andres Freund and...@anarazel.de wrote:
  Here is a proposed patch which enables cancellation of $subject.
 
 Cool.  Some enhancements we'd like to do to Serializable Snapshot
 Isolation (SSI), should the base patch make it in, would require
 this capability.
 
  Currently it does *not* report any special error message to the
  client if it starts sending commands in an (unbeknownst to it)
  failed transaction, but just the normal "25P02: current
  transaction is aborted..." message.
  
  It shouldn't be hard to add that and I will propose a patch if
  people would like it (I personally am not very interested, but I
  can see people validly wanting it)
 
 For SSI purposes, it would be highly desirable to be able to set the
 SQLSTATE and message generated when the canceled transaction
 terminates.
Ok, I implemented that capability, but the patch feels somewhat wrong to me, 
so it's a separate patch on top of the others:

* it intermingles logic between elog.c and postgres.c far too much, requiring 
the export of variables which should not get exported. And it would still need 
more of that to allow some sensible Assert()s; i.e. DoingCommandRead would need 
to be available in elog.c so that Assert(DoingCommandRead) can keep somebody 
from using the re-throwing capability out of place

* You won't see an error if the next command after the IDLE in transaction is a 
COMMIT/ROLLBACK. I don't see any sensible way around that.

* the copied edata lives in TopMemoryContext and only gets reset by the next 
reported error, after the re-raised error has been reported.

Here is what raising one such error looks like right now:

int error = ERROR;

if (DoingCommandRead)
{
	silent_error_while_idle = true;
	error |= LOG_NO_CLIENT | LOG_RE_THROW_AFTER_SYNC;
}

...

ereport(error,
		(errcode(ERRCODE_QUERY_CANCELED),
		 errmsg("canceling statement due to user request")));



One could mingle LOG_NO_CLIENT | LOG_RE_THROW_AFTER_SYNC together into a macro 
and set silent_error_while_idle in elog.c, but I don't see many call sites 
coming up, so I think it's better to be explicit.
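
For illustration only, such a macro (name hypothetical, not part of the posted 
patches) could be as small as:

#define ERROR_SILENT_WHILE_IDLE	(ERROR | LOG_NO_CLIENT | LOG_RE_THROW_AFTER_SYNC)

but setting silent_error_while_idle would then have to happen inside elog.c, 
adding to the intermingling complained about above.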

I'll set this up for the next commitfest; I don't think I can do much more 
without further input.

Andres
From 06541b25fc11a8f17ec401de5a17eeae1bad57d1 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Fri, 21 May 2010 20:17:11 +0200
Subject: [PATCH 2/3] Implement cancellation of backends in IDLE IN TRANSACTION state.

Not having support for this was the reason for HS FATALing backends
which had a lock conflict for longer than
max_standby_(archive|standby)_delay, as it couldn't cancel them without
them losing sync with the frontend. As this is not the case anymore,
fail the transaction silently. Possibly one day the protocol can
cope with this...
---
 src/backend/tcop/postgres.c |   80 +++
 1 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index cba90a9..505d136 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*** static bool RecoveryConflictPending = fa
*** 177,182 
--- 177,188 
  static bool RecoveryConflictRetryable = true;
  static ProcSignalReason RecoveryConflictReason;
  
+ /*
+  * Are we disallowed from sending a ready for query message right
+  * now because it would confuse the frontend?
+  */
+ static bool silent_error_while_idle = false;
+ 
  /* 
   *		decls for routines only used in this file
   * 
*** RecoveryConflictInterrupt(ProcSignalReas
*** 2877,2882 
--- 2883,2890 
  void
  ProcessInterrupts(void)
  {
+ 	int error = ERROR;
+ 
  	/* OK to accept interrupt now? */
  	if (InterruptHoldoffCount != 0 || CritSectionCount != 0)
  		return;
*** ProcessInterrupts(void)
*** 2949,2966 
  			RecoveryConflictPending = false;
  			DisableNotifyInterrupt();
  			DisableCatchupInterrupt();
! 			if (DoingCommandRead)
! 				ereport(FATAL,
! 						(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
! 						 errmsg("terminating connection due to conflict with recovery"),
! 						 errdetail_recovery_conflict(),
! 						 errhint("In a moment you should be able to reconnect to the"
! 								 " database and repeat your command.")));
! 			else
! 				ereport(ERROR,
! 						(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
! 						 errmsg("canceling statement due to conflict with recovery"),
! 						 errdetail_recovery_conflict()));
  		}
  
  		/*
--- 2957,2980 
  			RecoveryConflictPending = false;
  			DisableNotifyInterrupt();
  			DisableCatchupInterrupt();
! 
! 			if (DoingCommandRead){
! 				/*
! 				 * We can't issue a normal ERROR here because the
! 				 * client doesn't expect the server to send an error at
! 				 * that point.
! 				 * We also may not send a ready for query/Z message
! 				 * because 

Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-10-30 Thread Andres Freund
Hi,

This thread died after I did not get around to implementing a new version, plus 
some potential license problems.

I still think it's worthwhile (and I used it in production for some time), so I 
would like to implement a version fit for the next commitfest.

The code I started out from is under the zlib license - which is to my 
knowledge compatible with PG's license. What's the position of HACKERS there? 
There already is some separately licensed code around, and we're already linking 
to zlib-licensed code...

For simplicity I asked Mark Adler (the original copyright owner) if he would 
be willing to relicense - he is not.

For anybody not hoarding all old mail like me, here is a link to the archives 
for my old patch:

http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de


Andres



[HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
I was reminded today that I'd promised to do $SUBJECT awhile back.
It's worth having so that hash joins and hash aggregation can work
on array values.  I believe this is a fairly straightforward
exercise:

1. Add a hash opclass that accepts ANYARRAY, similar to the existing
btree opclass for ANYARRAY.  It will work for any array type whose
element type has a hash opclass.

2. Coding is much like the array_cmp support code, including caching
the lookup of the element type's hash function.

3. To hash, apply the element type's hash function to each array
element.  Combine these values by rotating the accumulator left
one bit at each step and xor'ing in the next element's hash value.
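
As a minimal standalone sketch of that combining step (plain C; the per-element
hashing is elided and the helper name is made up here, with uint32 as in c.h):

static uint32
combine_array_hashes(const uint32 *elthash, int nelems)
{
	uint32		result = 0;
	int			i;

	for (i = 0; i < nelems; i++)
	{
		/* rotate the accumulator left one bit ... */
		result = (result << 1) | (result >> 31);
		/* ... then xor in the next element's hash value */
		result ^= elthash[i];
	}
	return result;
}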

Thoughts?  In particular, is anyone aware of a better method
for combining the element hash values?

regards, tom lane



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread marcin mank
On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 3. To hash, apply the element type's hash function to each array
 element.  Combine these values by rotating the accumulator left
 one bit at each step and xor'ing in the next element's hash value.

 Thoughts?  In particular, is anyone aware of a better method
 for combining the element hash values?


This would make the hash the same for arrays with elements 32 apart swapped.

This is what boost does:
http://www.systomath.com/include/Boost-1_34/doc/html/boost/hash_combine.html
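
(For reference, the combining step documented there boils down to the following;
the constant and formula are Boost's, the C spelling is mine:)

static uint32
hash_combine(uint32 seed, uint32 value)
{
	/* golden-ratio constant plus shifted feedback, per Boost's hash_combine */
	seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
	return seed;
}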

Greetings



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes:
 On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 3. To hash, apply the element type's hash function to each array
 element.  Combine these values by rotating the accumulator left
 one bit at each step and xor'ing in the next element's hash value.
 
 Thoughts?  In particular, is anyone aware of a better method
 for combining the element hash values?

 This would make the hash the same for arrays with elements 32 apart swapped.

Well, there are *always* going to be cases where you get the same hash
value for two different inputs; it's unavoidable given that you have to
combine N 32-bit hash values into only one 32-bit output.

 This is what boost does:
 http://www.systomath.com/include/Boost-1_34/doc/html/boost/hash_combine.html

Hmm.  I am reminded of Knuth's famous dictum: never generate random
numbers with a method chosen at random.  Is there any actual theory
behind that algorithm, and if so what is it?  The combination of
shifting with addition (not xor) seems more likely to lead to weird
cancellations than any improvement in the hash behavior.

regards, tom lane



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thoughts?  In particular, is anyone aware of a better method
 for combining the element hash values?


The obvious thing to do seems like it would be to feed the individual
values back into the regular hash function.

-- 
greg



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thoughts?  In particular, is anyone aware of a better method
 for combining the element hash values?

 The obvious thing to do seems like it would be to feed the individual
 values back into the regular hash function.

Hmm, you mean use hash_any on the sequence of hash values?  Interesting
idea; but hash_any doesn't look very amenable to being invoked
incrementally, and I don't think I want to construct a temp array with
enough space for the hashes of all the items in the input array ...
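
(For clarity, the non-incremental variant being dismissed here would look roughly
like the sketch below; variable names are placeholders and setup of the element
hash function is elided:)

	/* collect all per-element hash values, then hash that buffer in one go */
	uint32	   *elthashes = (uint32 *) palloc(nelems * sizeof(uint32));
	Datum		result;
	int			i;

	for (i = 0; i < nelems; i++)
		elthashes[i] = DatumGetUInt32(FunctionCall1(&eltflinfo, elems[i]));

	result = hash_any((unsigned char *) elthashes, nelems * sizeof(uint32));
	pfree(elthashes);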

regards, tom lane



[HACKERS] 9.1alpha2 bundled -- please verify

2010-10-30 Thread Peter Eisentraut
Alpha2 has been bundled and is available at

http://developer.postgresql.org/~petere/alpha/

Please check that it is sane.  I have also put a draft announcement at
the same place.

If there are no concerns, I will move them to the FTP site tomorrow
(Sunday) and send out announcements on Monday.





[HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-10-30 Thread Dimitri Fontaine
Hi,

In the road to the extension patch, we already found some parts that
have to be separated into their own patch. Here's another one. It
occurred to me while implementing the pg_extension_objects() SRF that if
we can list all objects that belong to an extension, certainly we also
are able to move them to another schema.

As soon as we have that ability, we are able to provide for relocatable
extensions with the following command:

  ALTER EXTENSION ext SET SCHEMA name;
  ALTER EXTENSION ext SET SCHEMA foo TO bar;

I think that would end the open debate about search_path vs extension,
because each user would be able to relocate his local extensions easily,
wherever the main script has installed them (often enough, public).

Please find attached a work-in-progress patch (it's missing
documentation) implementing support for setting a new schema to SQL
objects of types conversion, operator, operator class, operator family,
text search parser, dictionary, template and configuration.

If there's will to apply such a patch, I'll finish it by writing the
necessary documentation for the 8 new SQL commands.

Note: CreateCommandTag() already has support for tags for ALTER TEXT
  SEARCH OBJECT … SET SCHEMA …, but I could not find the implementation,
  neither in the grammar nor in tsearchcmds.c. It's in the patch.

As usual, you can also get to the development version by using git:
  
http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/set_schema

git --no-pager diff master..|diffstat
 backend/catalog/pg_namespace.c|   38 
 backend/commands/alter.c  |   32 
 backend/commands/conversioncmds.c |   84 ++
 backend/commands/opclasscmds.c|  215 +++
 backend/commands/operatorcmds.c   |   90 +++
 backend/commands/tsearchcmds.c|  295 ++
 backend/parser/gram.y |   67 
 backend/tcop/utility.c|   12 +
 include/catalog/pg_namespace.h|2 
 include/commands/conversioncmds.h |5 
 include/commands/defrem.h |   23 ++
 11 files changed, 863 insertions(+)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Training and Support

*** a/src/backend/catalog/pg_namespace.c
--- b/src/backend/catalog/pg_namespace.c
***
*** 17,24 
--- 17,26 
  #include access/heapam.h
  #include catalog/dependency.h
  #include catalog/indexing.h
+ #include catalog/namespace.h
  #include catalog/pg_namespace.h
  #include utils/builtins.h
+ #include utils/lsyscache.h
  #include utils/rel.h
  #include utils/syscache.h
  
***
*** 77,79  NamespaceCreate(const char *nspName, Oid ownerId)
--- 79,117 
  
  	return nspoid;
  }
+ 
+ /*
+  * Check new namespace validity in ALTER OBJECT ... SET SCHEMA ... and
+  * ereport(ERROR, ...) in case of any problem.
+  */
+ void
+ CheckSetNamespace(Oid oldNspOid, Oid nspOid,
+   const char *name, const char *objtype)
+ {
+ 	if (oldNspOid == nspOid)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DUPLICATE_OBJECT),
+ 				 errmsg("%s \"%s\" already exists in schema \"%s\"",
+ 						objtype, name, get_namespace_name(nspOid))));
+ 
+ 	/* disallow renaming into or out of temp schemas */
+ 	if (isAnyTempNamespace(nspOid) || isAnyTempNamespace(oldNspOid))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("cannot move objects into or out of temporary schemas")));
+ 
+ 	/* same for TOAST schema */
+ 	if (nspOid == PG_TOAST_NAMESPACE || oldNspOid == PG_TOAST_NAMESPACE)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("cannot move objects into or out of TOAST schema")));
+ 
+ 	/* check for duplicate name (more friendly than unique-index failure) */
+ 	if (SearchSysCacheExists2(TYPENAMENSP,
+ 			  CStringGetDatum(name),
+ 			  ObjectIdGetDatum(nspOid)))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DUPLICATE_OBJECT),
+ 				 errmsg("%s \"%s\" already exists in schema \"%s\"",
+ 						objtype, name, get_namespace_name(nspOid))));
+ }
*** a/src/backend/commands/alter.c
--- b/src/backend/commands/alter.c
***
*** 182,192  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 182,208 
  								   stmt->newschema);
  			break;
  
+ 		case OBJECT_CONVERSION:
+ 			AlterConversionNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_FUNCTION:
  			AlterFunctionNamespace(stmt->object, stmt->objarg, false,
  								   stmt->newschema);
  			break;
  
+ 		case OBJECT_OPERATOR:
+ 			AlterOperatorNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPCLASS:
+ 			AlterOpClassNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPFAMILY:
+ 			AlterOpFamilyNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_SEQUENCE:
  		case OBJECT_TABLE:
  		case OBJECT_VIEW:
***
*** 195,200  

Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-10-30 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 As soon as we have that ability, we are able to provide for relocatable
 extensions with the following command:

   ALTER EXTENSION ext SET SCHEMA name;
   ALTER EXTENSION ext SET SCHEMA foo TO bar;

 I think that would end the open debate about search_path vs extension,
 because each user would be able to relocate his local extensions easily,
 wherever the main script has installed them (often enough, public).

I'm not sure whether that really fixes anything, or just provides people
with a larger-caliber foot-gun.  See for example recent complaints about
citext misbehaving if it's not in the public schema (or more generally,
any schema not in the search path).  I think we'd need to think a bit
harder about the behavior of objects that aren't in the search path
before creating a facility like this, since it seems to be tantamount
to promising that extensions won't break when pushed around to different
schemas.

I'm also a bit less than enthused about the implementation approach.
If we're going to have a policy that every object type must support
ALTER SET SCHEMA, I think it might be time to refactor, rather than
copying-and-pasting similar boilerplate code for every one.

regards, tom lane



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 1:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hmm, you mean use hash_any on the sequence of hash values?  Interesting
 idea; but hash_any doesn't look very amenable to being invoked
 incrementally, and I don't think I want to construct a temp array with
 enough space for the hashes of all the items in the input array ...

I had rather hoped without evidence that it would be easy enough to
use incrementally. Looking now I see your point; it doesn't lend
itself to that at all.

I suppose you could use CRC, where our interface does allow incremental use.
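
(Roughly, with the incremental CRC macros already in src/include/utils/pg_crc.h;
treating each element hash as a 4-byte chunk, as below, is just my assumption:)

	pg_crc32	crc;
	uint32		elthash;
	int			i;

	INIT_CRC32(crc);
	for (i = 0; i < nelems; i++)
	{
		/* elthash = the element type's hash value for element i */
		elthash = DatumGetUInt32(FunctionCall1(&eltflinfo, elems[i]));
		COMP_CRC32(crc, &elthash, sizeof(elthash));
	}
	FIN_CRC32(crc);
	/* crc now holds the combined value */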

I'm not really familiar enough with the math to know whether CRC is
any better than your XOR mixing. I suspect they're pretty similar. I
suppose if you have arrays of various lengths containing repetitions
of a single value then the non-CRC would produce a hash of 0 for all
arrays with a length that's a multiple of 32. I'm not sure what CRC
would do.

Oh, one question that occurred to me you didn't mention including the
bounds of the array or the null bitmap in the hash function. I suppose
it's correct with or without them (or in the case of the null bitmap
another way to put it is the choice of the hash value to represent
null is arbitrary).


-- 
greg



Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 I suppose you could use crc where our interface does allow incremental use.

Hm, that's a thought.  I'm not sure though whether CRC really has the
behavior you want from a hash function, in particular that all the bits
are independent (a/k/a taking N low-order bits is a reasonable way to
cut the hash down to a suitable bucket index).  Anybody know?

 I'm not really familiar enough with the math to know whether CRC is
 any better than your XOR mixing. I suspect they're pretty similar. I
 suppose if you have arrays of various lengths containing repetitions
 of a single value than the non-CRC would produce a hash of 0 for all
 arrays with a length that's a multiple of 32. I'm not sure what CRC
 would do.

Some quick experimenting suggests that you get -1 from an array of 32 of
the same value, then 0 from 64 of the same, etc.  This doesn't bother me
particularly though.  There are always going to be some situations that
produce degenerate hash values.

 Oh, one question that occurred to me you didn't mention including the
 bounds of the array or the null bitmap in the hash function. I suppose
 it's correct with or without them (or in the case of the null bitmap
 another way to put it is the choice of the hash value to represent
 null is arbitrary).

Yeah.  I have it coded to treat a null element as if it had a hash value
of zero.  I don't see a great need to consider the array bounds per se.

regards, tom lane



[HACKERS] Maximum function call nesting depth for regression tests

2010-10-30 Thread Tom Lane
A few days ago I added a regression test that involves a plpgsql
function calling a sql function, which recurses back to the plpgsql
function, etc, to a depth of 10 cycles (ie 10 plpgsql function calls
and 10 sql function calls).  There are three buildfarm members that
are failing with "stack depth limit exceeded" errors on this test.
What should we do about that?  Possibilities include:

1. Back off the recursion nesting depth of the test to whatever
it takes to get those buildfarm critters happy.

2. Lobby the buildfarm owners to increase their ulimit -s settings.

3. Chisel things enough to get the case to pass, eg by reducing the
no-doubt-generous value of STACK_DEPTH_SLOP.

I don't especially care for choice #1.  To me, one of the things that
the regression tests ought to flag is whether a machine is so limited
that reasonable coding might fail.  If you can't do twenty or so
levels of function call you've got a mighty limited machine.  For
comparison, the parallel regression tests will probably fail if you
can't support twenty concurrent sessions, and nobody's seriously
advocated cutting that.

One point worth noting is that the failing machines are running on
IA64 or PPC64, and some of them seem to be only failing in some
branches.  So maybe there is some platform-specific effect here
that could be fixed with a narrow hack.  I'm not too hopeful though.

Thoughts?

regards, tom lane
