Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Dmitriy Igrishin
Hey Florian,

Thank you very much!

2010/12/8 Florian Pflug f...@phlo.org

 On Dec 8, 2010, at 11:35, Dmitriy Igrishin wrote:
  Is it guaranteed that the names of array types in the pg_type system
  catalog will always be prefixed by an underscore, or can this convention
  change in the future?

 What's the advantage of letting your code depend on this?

 Within SQL, I suggest you write type[] to denote type's array type. In
 the catalog, each pg_type row contains a reference to the corresponding
 array type (by OID) in the field typarray.

 BTW, when querying pg_type, instead of adding another join to pg_type to
 get the array type's name, you can simply cast the typarray field to
 regtype. That way, should the array type happen to lie in a schema not in
 your search_path, the name will even be correctly schema-qualified. (In
 fact, it's not the cast which does the translation but rather the implicit
 conversion from regtype to cstring that happens when the result is
 transferred to the client. For further information, you might want to check
 out the documentation of the various reg* types provided by postgres).

 Hope that helps,
 Florian Pflug
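
For readers following along, Florian's suggestion can be sketched like this (the type name foo is purely illustrative, and this assumes a live database where such a type exists):

```sql
-- Look up a type's array type by OID via typarray, and let the regtype
-- cast render a name that is schema-qualified when necessary.
SELECT t.typname,
       t.typarray::regtype AS array_type
FROM pg_type t
WHERE t.typname = 'foo';
```

The cast avoids a second join against pg_type and handles schema qualification for free, as described above.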




-- 
// Dmitriy.


[HACKERS] Solving sudoku using SQL

2010-12-08 Thread Tatsuo Ishii
Hi,

I found an interesting article which explains how to solve sudoku
using SQL (unfortunately, it is written in Japanese):
http://codezine.jp/article/detail/1629?p=2

In the article example sudoku is:
http://static.shoeisha.jp/cz/static/images/article/1629/problem4.gif

To solve the example, you create the following table:

CREATE TABLE nums (n INT NOT NULL PRIMARY KEY);
INSERT INTO nums VALUES (1);
INSERT INTO nums VALUES (2);
INSERT INTO nums VALUES (3);
INSERT INTO nums VALUES (4);
INSERT INTO nums VALUES (5);
INSERT INTO nums VALUES (6);
INSERT INTO nums VALUES (7);
INSERT INTO nums VALUES (8);
INSERT INTO nums VALUES (9);

Then execute the huge SELECT:
http://codezine.jp/static/images/article/1629/html/sql.html

On that page, the first query takes effectively infinite time on
PostgreSQL 9.0.1.  The next one executes very quickly because the join
order is explicitly specified with CROSS JOIN syntax. This seems to be a
limitation of the PostgreSQL optimizer and I would like it to be
removed. Comments?
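
For context, the planner's treatment of explicit join syntax is controlled by join_collapse_limit; a minimal sketch of how the article's second query sidesteps the join-order search (the nums table follows the example above):

```sql
-- With join_collapse_limit = 1, the planner keeps the join order exactly
-- as written instead of searching all possible orders, which is what
-- makes the explicit CROSS JOIN formulation fast.
SET join_collapse_limit = 1;

SELECT a.n, b.n, c.n
FROM nums a
     CROSS JOIN nums b
     CROSS JOIN nums c;  -- joined strictly in this written order
```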

BTW, according to the article, Oracle executes the first SELECT (the one
on which PostgreSQL takes infinite time) in 10 seconds. It seems Oracle's
optimizer is superior in this regard.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:
 What about unaccent? Or lo (1 domain, 2 functions)?

 Sure. Doesn't have to actually do anything.

Ok, so that's already in the patch :)

 That's called a shared catalog. I don't see any benefit of having to
 maintain that when we do already have a directory containing the files
 and the restriction that extension names are the file names.

 Because then you don't need control files at all.

 Again, if you really want to have that, you have to first detail how
 you fill in the shared catalog, and how you update it.

 `make install` should do it. From variables in the Makefile.

I see that you're not too much into packaging, but here, we don't ever
use `make install` on a production machine. This step happens on the
packaging server, then we install and QA the stuff, then the package
gets installed on the servers where we need it.

Also, I don't see how make install is going to know which cluster it
should talk to — it's quite easy and typical to run this command on a
server where you have several major versions installed, and several
clusters per major version.

So, again, I have no idea at all where else to put the data that you
so want to remove from the control files.

 Possibly. I'm not going to do it this week; seems like there are some
 issues that still need shaking out in the implementation, to judge
 from the pg_execute_from_file review thread.

Yeah, dust ain't settled completely yet… working on that.

 Each would get a separate control file. The mapping of one version
 number to multiple extensions is potentially confusing.

Funny, each already gets a separate control file now.

$ ls contrib/spi/*control.in
autoinc.control.in  auto_username.control.in  moddatetime.control.in
refint.control.in  timetravel.control.in

Then the idea behind the version number in the Makefile is that you
generally are maintaining it there and don't want to have to maintain it
in more than one place.

 Why is that? We currently manage multiple script files, test files,
 etc. in a single Makefile. Wildcard operators are very useful for this
 sort of thing.

Well, that was you saying just above that using the same EXTVERSION Make
variable for more than one control file is potentially confusing. What
about using all the other variables in the same way?

 Well, before that you had to explicitly write public in there, which IMO
 is so much worse. Then again, I now think that the right way to approach
 that is to remove this feature. The user would have a two-step operation
 instead, but that always works correctly.

 Yes, that would be preferable, but a one-step operation would of
 course be ideal.

Thinking about it, as proposed in the other thread, I now think that the
two-step operation could be internal and not exposed to the user.

  ALTER EXTENSION name SET SCHEMA foo TO bar, baz TO quux;

 Perhaps. v2, eh? ;-P

Yes please :)

 Some do require shared_preload_libraries, no?

One of them only, pg_stat_statements.

 SET client_min_messages TO warning;
 SET log_min_messages TO warning;

 Though I think I'd rather that the warning still went to the log.

(that's about hstore verbosity) ok will see about changing
client_min_messages around the CREATE OPERATOR =.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I think this so-called two-step approach is pretty ugly.

Well, it does not need to be exposed to the user, thinking about it, as
proposed in the other thread. Other than that, your argument here is
exactly the same as the ones saying that VACUUM or hint bits are
bad. It's just that if you want correctness, you don't have anything
better.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-08 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Er ... what good is that?  A non-relocatable extension doesn't *need*
 any such substitution, because it knows perfectly well what schema it's
 putting its stuff into.  Only the relocatable case has use for it.  So
 you might as well drop the substitution mechanism entirely.

Funnily enough, I see it the exact opposite way.

  relocatable is true

A relocatable extension installs all its objects into the first
schema of the search_path. As an extension's script author, you have
to make sure you're not schema-qualifying any object that you
create.

Note that contrib/ works this way, but forces the first entry of the
search_path to be public.

  relocatable is false

An extension that needs to know where some of its objects are
installed is not relocatable. As the user won't be able to change
the schema where the extension gets installed, he's given the
possibility to specify the schema at installation time. The
extension installation script is then required to use the
@extschema@ placeholder as the schema to work with. That will
typically mean the script's first line is:

  SET search_path TO @extschema@;

Then you can also CREATE FUNCTION … SET search_path TO @extschema@
for security reasons.
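
As a sketch of what such an installation script might look like under the proposed mechanism (the extension name myext and its function are made up for illustration, not taken from the patch):

```sql
-- Hypothetical install script for a non-relocatable extension.
-- @extschema@ is replaced at CREATE EXTENSION time with the target schema.
SET search_path TO @extschema@;

CREATE FUNCTION myext_version() RETURNS text
    LANGUAGE sql
    SET search_path TO @extschema@   -- pinned per-function, for security
    AS $$ SELECT 'myext 1.0'::text $$;
```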

With that support in, CREATE EXTENSION foo WITH SCHEMA bar could
simply run ALTER EXTENSION foo SET SCHEMA bar internally, so that's
not a burden for the user. That way, simple things are simple both for
the extension's author and the users, but complex things are still
possible and supported.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final(?) proposal for wal_sync_method changes

2010-12-08 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Wed, Dec 8, 2010 at 02:07, Tom Lane t...@sss.pgh.pa.us wrote:
 [ win32.h says ]
 #define fsync(fd) _commit(fd)

 What this means is that switching to a simple preference order
 fdatasync, then fsync will result in choosing fsync on Windows (since
 it hasn't got fdatasync), meaning _commit, meaning Windows users see
 a behavioral change after all.

 _commit() != fsync()

Um, the macro quoted above makes them the same, no?  One of us
is confused.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Andrew Dunstan



On 12/08/2010 05:35 AM, Dmitriy Igrishin wrote:

Hey hackers@,

Is it guaranteed that the names of array types in the pg_type system
catalog will always be prefixed by an underscore, or can this convention
change in the future?





It is not guaranteed today, let alone in the future, that the array type 
for x will be _x for any x.


Consider:

   andrew=# create type _foo as (x int); create type foo as (y
   text);select typname from pg_type where oid = (select typarray from
   pg_type where typname = 'foo');
   CREATE TYPE
   CREATE TYPE
    typname
   ---------
    ___foo
   (1 row)


cheers

andrew




Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I agree that letting it be changed back to read/write after that
 is surprising and unnecessary.  Perhaps locking down the setting
 at the time of first grabbing a snapshot would be appropriate. 
 IIRC that's how it works for transaction isolation level, and this
 seems like it ought to work the same.
 
Agreed.  I can create a patch today to implement this.  The thing
which jumps out first is that assign_transaction_read_only probably
needs to move to variable.c so that it can reference
FirstSnapshotSet as the transaction isolation code does.  The
alternative would be to include snapmgr.h in guc.c, which seems less
appealing.  Agreed?  Other ideas?
 
-Kevin



Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Tom Lane
Dmitriy Igrishin dmit...@gmail.com writes:
 Is it guaranteed that the names of array types in the pg_type system
 catalog will always be prefixed by an underscore

No.  Read the code, or the documentation.

regards, tom lane



Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Dmitriy Igrishin
Hey Andrew,

Finally convinced. Thanks!

2010/12/8 Andrew Dunstan and...@dunslane.net



 On 12/08/2010 05:35 AM, Dmitriy Igrishin wrote:

 Hey hackers@,

 Is it guaranteed that the names of array types in the pg_type system
 catalog will always be prefixed by an underscore, or can this convention
 change in the future?




 It is not guaranteed today, let alone in the future, that the array type
 for x will be _x for any x.

 Consider:

   andrew=# create type _foo as (x int); create type foo as (y
   text);select typname from pg_type where oid = (select typarray from
   pg_type where typname = 'foo');
   CREATE TYPE
   CREATE TYPE
    typname
   ---------
    ___foo
   (1 row)


 cheers

 andrew




-- 
// Dmitriy.


Re: [HACKERS] unlogged tables

2010-12-08 Thread Marti Raudsepp
A very useful feature for unlogged tables would be the ability to
switch them back to normal tables -- this way you could do bulk
loading into an unlogged table and then turn it into a regular table
using just fsync(), bypassing all the WAL-logging overhead. It seems
this could even be implemented in pg_restore itself.

Which brings me to:

On Tue, Dec 7, 2010 at 20:44, Robert Haas robertmh...@gmail.com wrote:
 2. Any crash truncates the table, but a clean shutdown does not.

Seems that syncing on a clean shutdown could use the same
infrastructure as the above functionality.

Have you thought about switching unlogged tables back to logged? Are
there any significant obstacles?

Regards,
Marti



Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Dmitriy Igrishin
Hey Tom,

Thank you too.
I always read the documentation, but I don't want (yes, don't want)
to read a lot of code to get the answer to a simple question, because
life is too short for that. I think people should help each other :-)

2010/12/8 Tom Lane t...@sss.pgh.pa.us

 Dmitriy Igrishin dmit...@gmail.com writes:
  Is it guaranteed that the names of array types in the pg_type system
  catalog will always be prefixed by an underscore

 No.  Read the code, or the documentation.

regards, tom lane




-- 
// Dmitriy.


[HACKERS] plperlu problem with utf8

2010-12-08 Thread Oleg Bartunov

Hi there,

below is the problem, which I don't have when running in shell. The database is 
in UTF-8 encoding.


CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar  AS $$
use strict;
use URI::Escape;
return uri_unescape($_[0]); 
$$ LANGUAGE plperlu;

CREATE FUNCTION
Time: 1.416 ms
select url_decode('comment%20passer%20le%20r%C3%A9veillon');
          url_decode
------------------------------

 comment passer le rveillon
  ^
 non-printed

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-08 Thread Marti Raudsepp
On Tue, Dec 7, 2010 at 03:34, Tom Lane t...@sss.pgh.pa.us wrote:
 To my mind, O_DIRECT is not really the key issue here, it's whether to
 prefer O_DSYNC or fdatasync.

Since different platforms implement these primitives differently, and
it's not always clear from the header file definitions which options
are actually implemented, how about simply hard-coding a default value
for each platform?

1. This would be quite straightforward to code and document (a table
of platforms and their default wal_sync_method setting)

2. The best performing (or safest) method can be chosen on every
platform. From the above discussion it seems that Windows and OSX
should default to fdatasync_writethrough even if other methods are
available

3. It would pre-empt similar surprises if other platforms change their
header files, like what happened on Linux now.

Sounds like the simple and foolproof solution.

Regards,
Marti



Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Merlin Moncure
On Wed, Dec 8, 2010 at 8:57 AM, Tatsuo Ishii is...@postgresql.org wrote:
 Hi,

 I found an interesting article which explains how to solve sudoku
 using SQL (unfortunately, it is written in Japanese):
 http://codezine.jp/article/detail/1629?p=2

 In the article example sudoku is:
 http://static.shoeisha.jp/cz/static/images/article/1629/problem4.gif

 To solve the example, you create the following table:

 CREATE TABLE nums (n INT NOT NULL PRIMARY KEY);
 INSERT INTO nums VALUES (1);
 INSERT INTO nums VALUES (2);
 INSERT INTO nums VALUES (3);
 INSERT INTO nums VALUES (4);
 INSERT INTO nums VALUES (5);
 INSERT INTO nums VALUES (6);
 INSERT INTO nums VALUES (7);
 INSERT INTO nums VALUES (8);
 INSERT INTO nums VALUES (9);

 Then execute the huge SELECT:
 http://codezine.jp/static/images/article/1629/html/sql.html

 On that page, the first query takes effectively infinite time on
 PostgreSQL 9.0.1.  The next one executes very quickly because the join
 order is explicitly specified with CROSS JOIN syntax. This seems to be a
 limitation of the PostgreSQL optimizer and I would like it to be
 removed. Comments?

 BTW, according to the article, Oracle executes the first SELECT (the one
 on which PostgreSQL takes infinite time) in 10 seconds. It seems Oracle's
 optimizer is superior in this regard.

benchmark what you've got against this (ported to postgres by marcin mank):
http://www.pastie.org/684163

merlin



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-08 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 On Tue, Dec 7, 2010 at 03:34, Tom Lane t...@sss.pgh.pa.us wrote:
 To my mind, O_DIRECT is not really the key issue here, it's whether to
 prefer O_DSYNC or fdatasync.

 Since different platforms implement these primitives differently, and
 it's not always clear from the header file definitions which options
 are actually implemented, how about simply hard-coding a default value
 for each platform?

There's not a fixed finite list of platforms we support.  In general
we prefer to avoid designing things that way at all.  If we have to have
specific exceptions for specific platforms, we grin and bear it, but for
the most part behavioral differences ought to be driven by configure's
probes for platform features.

regards, tom lane



[HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Simon Riggs

Heikki pointed out to me that the btree delete record processing does
not respect vacuum_defer_cleanup_age. It should.

Attached patch to implement that.

Looking to commit in next few hours barring objections/suggestions, to
both HEAD and 9_0_STABLE, in time for next minor release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 0822f5c..5b2d58a 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -683,6 +683,12 @@ btree_xlog_delete_get_latestRemovedXid(XLogRecord *record)
 	UnlockReleaseBuffer(ibuffer);
 
 	/*
+	 * Apply vacuum_defer_cleanup_age, if we have a valid xid.
+	 */
+	if (TransactionIdIsValid(latestRemovedXid))
+		TransactionIdRetreatMany(latestRemovedXid, vacuum_defer_cleanup_age);
+
+	/*
 	 * Note that if all heap tuples were LP_DEAD then we will be returning
 	 * InvalidTransactionId here. That can happen if we are re-replaying this
 	 * record type, though that will be before the consistency point and will
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 6e7a6db..c16a287 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1129,8 +1129,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 	LWLockRelease(ProcArrayLock);
 
 	/*
-	 * Compute the cutoff XID, being careful not to generate a permanent
-	 * XID.
+	 * Compute the cutoff XID, being careful not to generate a reserved XID.
 	 *
 	 * vacuum_defer_cleanup_age provides some additional slop for the
 	 * benefit of hot standby queries on slave servers.  This is quick and
@@ -1140,9 +1139,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 	 * wraparound --- so guc.c should limit it to no more than the
 	 * xidStopLimit threshold in varsup.c.
 	 */
-	result -= vacuum_defer_cleanup_age;
-	if (!TransactionIdIsNormal(result))
-		result = FirstNormalTransactionId;
+	TransactionIdRetreatMany(result, vacuum_defer_cleanup_age);
 
 	return result;
 }
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index a7ae752..2f7070e 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -58,6 +58,19 @@
 		(dest)--; \
 	} while ((dest) < FirstNormalTransactionId)
 
+#define TransactionIdRetreatMany(dest, many)	\
+{ \
+	if ((dest) >= (many)) \
+	{ \
+		(dest) -= (many); \
+		while ((dest) < FirstNormalTransactionId) \
+		{ \
+			(dest)--; \
+		} \
+	} \
+	else \
+		(dest) = MaxTransactionId - (many) + (dest); \
+}
 
 /* --
  *		Object ID (OID) zero is InvalidOid.

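For illustration only, the wraparound arithmetic that TransactionIdRetreatMany performs can be modeled in plain SQL, with bigint standing in for 32-bit unsigned XIDs (FirstNormalTransactionId = 3 and MaxTransactionId = 4294967295, per transam.h; this models the macro as quoted above, not necessarily the committed behavior):

```sql
-- Retreat an XID by `many`, skipping the reserved XIDs 0-2 and wrapping
-- past MaxTransactionId, mirroring the macro's two branches.
SELECT xid, many,
       CASE
         WHEN xid >= many AND xid - many >= 3 THEN xid - many
         WHEN xid >= many THEN 4294967295   -- retreated into reserved XIDs
         ELSE 4294967295 - many + xid       -- wrapped past zero
       END AS retreated
FROM (VALUES (100::bigint, 10::bigint),
             (3::bigint,   10::bigint)) AS t(xid, many);
```
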


Re: [HACKERS] unlogged tables

2010-12-08 Thread Simon Riggs
On Tue, 2010-12-07 at 13:17 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I'm also going to go through and change all instances of the word
  unlogged to volatile, per previous discussion.  If this seems like
  a bad idea to anyone, please object now rather than afterwards.
 
 Hm... I thought there had been discussion of a couple of different
 flavors of table volatility.  Is it really a good idea to commandeer
 the word volatile for this particular one?

Note that DB2 uses the table modifier VOLATILE to indicate a table that
has a widely fluctuating table size, for example a queue table. It's
used as a declarative optimizer hint. So the term has many possible
meanings.

Prefer UNLOGGED or similar descriptive term.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Spread checkpoint sync

2010-12-08 Thread Simon Riggs
On Mon, 2010-12-06 at 23:26 -0300, Alvaro Herrera wrote:

 Why would multiple bgwriter processes worry you?

Because it complicates the tracking of files requiring fsync.

As Greg says, the last attempt to do that was a lot of code.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] plperlu problem with utf8

2010-12-08 Thread Andrew Dunstan



On 12/08/2010 10:13 AM, Oleg Bartunov wrote:

Hi there,

below is the problem, which I don't have when running in shell. The 
database is in UTF-8 encoding.


CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar  AS $$
use strict;
use URI::Escape;
return uri_unescape($_[0]); $$ LANGUAGE plperlu;
CREATE FUNCTION
Time: 1.416 ms
select url_decode('comment%20passer%20le%20r%C3%A9veillon');
  url_decode --
 comment passer le rveillon
  ^
 non-printed



I get: (platform is Fedora 13, git tip, perl 5.10.1):

   andrew=# CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS
   varchar  AS $$
   andrew$# use strict;
   andrew$# use URI::Escape;
   andrew$# return uri_unescape($_[0]); $$ LANGUAGE plperlu;
   CREATE FUNCTION
   andrew=# select url_decode('comment%20passer%20le%20r%C3%A9veillon');
            url_decode
   -----------------------------
    comment passer le réveillon
   (1 row)

   andrew=#

which makes it look like we might have some double escaping going on 
here, but at least I don't get nothing :-)


Further experimentation shows even more weirdness. There's definitely 
something odd about the utf8 handling. Will dig further.


cheers

andrew






Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Tom Lane
Dmitriy Igrishin dmit...@gmail.com writes:
 I always read the documentation, but I don't want (yes, don't want)
 to read a lot of code to get the answer to a simple question, because
 life is too short for that. I think people should help each other :-)

Fine, but that sort of question doesn't belong on pgsql-hackers.

regards, tom lane



Re: [HACKERS] plperlu problem with utf8

2010-12-08 Thread Oleg Bartunov

adding utf8::decode($_[0]) solves the problem:

knn=# CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar  AS $$
use strict;
use URI::Escape;
utf8::decode($_[0]);
return uri_unescape($_[0]); 
$$ LANGUAGE plperlu;


Oleg
On Wed, 8 Dec 2010, Andrew Dunstan wrote:




On 12/08/2010 10:13 AM, Oleg Bartunov wrote:

Hi there,

below is the problem, which I don't have when running in shell. The 
database is in UTF-8 encoding.


CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar  AS $$
use strict;
use URI::Escape;
return uri_unescape($_[0]); $$ LANGUAGE plperlu;
CREATE FUNCTION
Time: 1.416 ms
select url_decode('comment%20passer%20le%20r%C3%A9veillon');
  url_decode --
 comment passer le rveillon
  ^
 non-printed



I get: (platform is Fedora 13, git tip, perl 5.10.1):

  andrew=# CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS
  varchar  AS $$
  andrew$# use strict;
  andrew$# use URI::Escape;
  andrew$# return uri_unescape($_[0]); $$ LANGUAGE plperlu;
  CREATE FUNCTION
  andrew=# select url_decode('comment%20passer%20le%20r%C3%A9veillon');
 url_decode
  --
comment passer le réveillon
  (1 row)

  andrew=#

which makes it look like we might have some double escaping going on here, 
but at least I don't get nothing :-)


Further experimentation shows even more weirdness. There's definitely 
something odd about the utf8 handling. Will dig further.


cheers

andrew





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: [HACKERS] random write in xlog?

2010-12-08 Thread Jeff Janes
On Tue, Dec 7, 2010 at 2:06 PM, flyusa2010 fly flyusa2...@gmail.com wrote:
 Hi, folks,
 I traced the running postgres, and I found that there are some random
 writes in xlog files!
 My impression was that log file writing is always sequential, so how do
 random writes happen?
 Thanks in advance!

Just because it does an lseek doesn't mean it is random.

Anyway, the writes are logically sequential, but not physically.  If
I remember correctly, it always writes out full blocks, even if the
last part of the block has not yet been filled with new data.  When
the remainder gets filled, it then writes out the full block again,
both the already-written part and the new part.

Cheers,

Jeff



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
I wrote: 
 Tom Lane t...@sss.pgh.pa.us wrote:
  
 I agree that letting it be changed back to read/write after that
 is surprising and unnecessary.  Perhaps locking down the setting
 at the time of first grabbing a snapshot would be appropriate. 
 IIRC that's how it works for transaction isolation level, and
 this seems like it ought to work the same.
  
 Agreed.  I can create a patch today to implement this.
 
Attached.
 
Accomplished more through mimicry (based on setting transaction
isolation level) than profound understanding of the code involved;
but it passes all regression tests on both `make check` and `make
installcheck-world`.  This includes a new regression test confirming that
an attempt to change the setting after a query fails.  I've poked at it with
various ad hoc tests, and it is behaving as expected in those.
 
I wasn't too confident how to word the new failure messages.
 
-Kevin

*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 544,549  show_log_timezone(void)
--- 544,580 
  
  
  /*
+  * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+  *
+  * These should be transaction properties which can be set in exactly the
+  * same points in time that transaction isolation may be set.
+  */
+ bool
+ assign_transaction_read_only(bool value, bool doit, GucSource source)
+ {
+ 	if (FirstSnapshotSet)
+ 	{
+ 		ereport(GUC_complaint_elevel(source),
+ 				(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+ 				 errmsg("read-only property must be set before any query")));
+ 		/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+ 		if (source != PGC_S_OVERRIDE)
+ 			return false;
+ 	}
+ 	else if (IsSubTransaction())
+ 	{
+ 		ereport(GUC_complaint_elevel(source),
+ 				(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+ 				 errmsg("read-only property may not be changed in a subtransaction")));
+ 		/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+ 		if (source != PGC_S_OVERRIDE)
+ 			return false;
+ 	}
+ 
+ 	return true;
+ }
+ 
+ /*
   * SET TRANSACTION ISOLATION LEVEL
   */
  
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 169,175  static bool assign_bonjour(bool newval, bool doit, GucSource source);
  static bool assign_ssl(bool newval, bool doit, GucSource source);
  static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
  static bool assign_log_stats(bool newval, bool doit, GucSource source);
- static bool assign_transaction_read_only(bool newval, bool doit, GucSource source);
  static const char *assign_canonical_path(const char *newval, bool doit, GucSource source);
  static const char *assign_timezone_abbreviations(const char *newval, bool doit, GucSource source);
  static const char *show_archive_command(void);
--- 169,174 
***
*** 7837,7870  assign_log_stats(bool newval, bool doit, GucSource source)
return true;
  }
  
- static bool
- assign_transaction_read_only(bool newval, bool doit, GucSource source)
- {
-   /* Can't go to r/w mode inside a r/o transaction */
-   if (newval == false  XactReadOnly  IsSubTransaction())
-   {
-   ereport(GUC_complaint_elevel(source),
-   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg(cannot set transaction read-write mode 
inside a read-only transaction)));
-   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact 
abort */
-   if (source != PGC_S_OVERRIDE)
-   return false;
-   }
- 
-   /* Can't go to r/w mode while recovery is still active */
-   if (newval == false  XactReadOnly  RecoveryInProgress())
-   {
-   ereport(GUC_complaint_elevel(source),
-   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg(cannot set transaction read-write mode during 
recovery)));
-   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact 
abort */
-   if (source != PGC_S_OVERRIDE)
-   return false;
-   }
- 
-   return true;
- }
- 
  static const char *
  assign_canonical_path(const char *newval, bool doit, GucSource source)
  {
--- 7836,7841 
*** a/src/include/commands/variable.h
--- b/src/include/commands/variable.h
***
*** 21,26  extern const char *show_timezone(void);
--- 21,28 
  extern const char *assign_log_timezone(const char *value,
bool doit, GucSource source);
  extern const char *show_log_timezone(void);
+ extern bool assign_transaction_read_only(bool value,
+   bool doit, GucSource source);
  extern 

Re: [HACKERS] unlogged tables

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 9:52 AM, Marti Raudsepp ma...@juffo.org wrote:
 Have you thought about switching unlogged tables back to logged? Are
 there any significant obstacles?

I think it can be done, and I think it's useful, but I didn't want to
tackle it for version one, because it's not trivial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 10:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-12-07 at 13:17 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I'm also going to go through and change all instances of the word
  "unlogged" to "volatile", per previous discussion.  If this seems like
  a bad idea to anyone, please object now rather than afterwards.

 Hm... I thought there had been discussion of a couple of different
 flavors of table volatility.  Is it really a good idea to commandeer
 the word "volatile" for this particular one?

 Note that DB2 uses the table modifier VOLATILE to indicate a table that
 has a widely fluctuating table size, for example a queue table. It's
 used as a declarative optimizer hint. So the term has many possible
 meanings.

 Prefer UNLOGGED or similar descriptive term.

Hrm.  The previous consensus seemed to be in favor of trying to
describe the behavior ("your contents might disappear") rather than the
implementation (we don't WAL-log those contents).  However, the fact
that DB2 uses that word to mean something entirely different is
certainly a bit awkward, so maybe we should reconsider.  Or maybe not.
 I'm not sure.  Anyone else want to weigh in here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dmitriy Igrishin dmit...@gmail.com writes:
 I always read the documentation, but don't want (yes, don't want)
 to read a lot of code to get the answer to a simple question because
 life is too short for it. I think that people should help each other :-)

 Fine, but that sort of question doesn't belong on pgsql-hackers.

Right.  Perhaps it's useful to quote the description of the list[1]:

The PostgreSQL developers team lives here. Discussion of current
development issues, problems and bugs, and proposed new features. If
your question cannot be answered by people in the other lists, and it
is likely that only a developer will know the answer, you may re-post
your question in this list. You must try elsewhere first!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://archives.postgresql.org/pgsql-hackers/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 4:19 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think this so-called two-step approach is pretty ugly.

 Well it does not need to be exposed to the user, thinking about it, as
 proposed in the other thread. Other than that, your argument here is
 exactly the same as the ones saying that VACUUM or Hint Bits are
 bad. It's just that if you want correctness, you don't have anything
 better.

Exposing it to the user is what I think is ugly.

It's also worth noting that ALTER EXTENSION .. SET SCHEMA does NOT
guarantee a correct relocation, because someone might have done ALTER
FUNCTION .. SET search_path = @extschema@, and that's not going to get
properly fixed up.  I'm coming to the conclusion more and more that
ALTER EXTENSION .. SET SCHEMA just can't work reliably.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-08 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 
 Note that DB2 uses the table modifier VOLATILE to indicate a
 table that has a widely fluctuating table size, for example a
 queue table.
 
 the fact that DB2 uses that word to mean something entirely
 different is certainly a bit awkward
 
It would be especially awkward should someone port their DB2
database to PostgreSQL without noticing the semantic difference, and
then find their data missing.
 
 so maybe we should reconsider.
 
+1 for choosing terminology without known conflicts with other
significant products.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_type.typname of array types.

2010-12-08 Thread Dmitriy Igrishin
Okay, I understand your hint, Tom and Robert. Sorry for the trouble.
I asked here because I thought that only a developer would know
the answer to my question: "is it guaranteed ...?".

Many thanks to Florian and Andrew for the detailed explanations,
advice, etc. without pointing me to the sources. Respect! :-)
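For the archives, Florian's typarray/regtype advice boils down to something like the following sketch (assuming a server new enough to have pg_type.typarray, i.e. 8.3 or later):

```sql
-- Find a type's array type without relying on the underscore naming
-- convention.  Casting typarray to regtype makes the server render the
-- array type's name on output, schema-qualified if it is not on the
-- search_path.
SELECT t.typname,
       t.typarray::regtype AS array_type
FROM pg_type t
WHERE t.oid = 'integer'::regtype;
```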

2010/12/8 Robert Haas robertmh...@gmail.com

 On Wed, Dec 8, 2010 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Dmitriy Igrishin dmit...@gmail.com writes:
  I always read the documentation, but don't want (yes, don't want)
   to read a lot of code to get the answer to a simple question because
   life is too short for it. I think that people should help each other
  :-)
 
  Fine, but that sort of question doesn't belong on pgsql-hackers.

 Right.  Perhaps it's useful to quote the description of the list[1]:

 The PostgreSQL developers team lives here. Discussion of current
 development issues, problems and bugs, and proposed new features. If
 your question cannot be answered by people in the other lists, and it
 is likely that only a developer will know the answer, you may re-post
 your question in this list. You must try elsewhere first!

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 [1] http://archives.postgresql.org/pgsql-hackers/




-- 
// Dmitriy.


Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 8:57 AM, Tatsuo Ishii is...@postgresql.org wrote:
 On that page, the first query takes effectively infinite time on
 PostgreSQL 9.0.1.  The next one can be executed very quickly because the
 join order is explicitly specified by CROSS JOIN syntax. This seems to be
 a limitation of the PostgreSQL optimizer and I would like it removed. Comments?

It's not easy to make the optimizer degrade gracefully when confronted
with a very large number of tables, but I agree it would be worthwhile
if we could figure out how to do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 Note that DB2 uses the table modifier VOLATILE to indicate a
 table that has a widely fluctuating table size, for example a
 queue table.
 
 the fact that DB2 uses that word to mean something entirely
 different is certainly a bit awkward
 
 It would be especially awkward should someone port their DB2
 database to PostgreSQL without noticing the semantic difference, and
 then find their data missing.

Not to mention that DB2 syntax tends to appear in the standard a few
years later.

 so maybe we should reconsider.
 
 +1 for choosing terminology without known conflicts with other
 significant products.

Yeah.  Given this info I'm strongly inclined to stick with UNLOGGED.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimize PL/Perl function argument passing [PATCH]

2010-12-08 Thread Tim Bunce
On Tue, Dec 07, 2010 at 10:00:28AM -0500, Andrew Dunstan wrote:
 
 
 On 12/07/2010 09:24 AM, Tim Bunce wrote:
 Changes:
 
  Sets the local $_TD via C instead of passing an extra argument.
  So functions no longer start with our $_TD; local $_TD = shift;
 
  Pre-extend stack for trigger arguments for slight performance gain.
 
 Passes installcheck.
 
 Please add it to the January commitfest.

Done. https://commitfest.postgresql.org/action/patch_view?id=446

 Have you measured the speedup gained from this?

No. I doubt it's significant, but every little helps :)

 Do you have any more improvements in the pipeline?

I'd like to add $arrayref = decode_array_literal('{2,3}') and
maybe $hashref = decode_hstore_literal('x=>1, y=>2').
I don't know how much work would be involved in those though.

Another possible improvement would be rewriting encode_array_literal()
in C, so returning arrays would be much faster.

I'd also like to #define PERL_NO_GET_CONTEXT, which would give a useful
performance boost by avoiding all the many hidden calls to lookup
thread-local storage. (PERL_SET_CONTEXT() would go and instead the
'current interpreter' would be passed around as an extra argument.)
That's a fairly mechanical change but the diff may be quite large.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperlu problem with utf8

2010-12-08 Thread David E. Wheeler
On Dec 8, 2010, at 8:13 AM, Oleg Bartunov wrote:

 adding utf8::decode($_[0]) solves the problem:
 
 knn=# CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar AS $$
   use strict;
   use URI::Escape;
   utf8::decode($_[0]);
   return uri_unescape($_[0]);
 $$ LANGUAGE plperlu;

Hrm. Ideally all strings passed to PL/Perl functions would be decoded.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimize PL/Perl function argument passing [PATCH]

2010-12-08 Thread David E. Wheeler
On Dec 8, 2010, at 9:14 AM, Tim Bunce wrote:

 Do you have any more improvements in the pipeline?
 
 I'd like to add $arrayref = decode_array_literal('{2,3}') and
 maybe $hashref = decode_hstore_literal('x=>1, y=>2').
 I don't know how much work would be involved in those though.

Those would be handy, but for arrays, at least, I'd rather have a GUC to turn 
on so that arrays are passed to PL/perl functions as array references.

 Another possible improvement would be rewriting encode_array_literal()
 in C, so returning arrays would be much faster.

+1

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 It's also worth noting that ALTER EXTENSION .. SET SCHEMA does NOT
 guarantee a correct relocation, because someone might have done ALTER
 FUNCTION .. SET search_path = @extschema@, and that's not going to get
 properly fixed up.  I'm coming to the conclusion more and more that
 ALTER EXTENSION .. SET SCHEMA just can't work reliably.

Dimitri's last reply to me
http://archives.postgresql.org/message-id/87r5ds1v4q@hi-media-techno.com
suggests that what he has in mind is to define a relocatable extension
as "one that can be relocated" ;-), i.e. it does not contain any such
gotchas.  Maybe this is too ugly in itself, or not useful enough to be
worth doing.  But it doesn't seem technically unworkable to me, so long
as relocatability is made an explicitly declared property of extensions.
It's certainly true that a large fraction of contrib modules should be
relocatable in that sense, because they just contain C functions that
aren't going to care.

Or are you complaining that somebody could break relocatability after
the fact by altering the contained objects?  Sure, but he could break
the extension in any number of other ways as well by making such
alterations.  The answer to that is privilege checks, and superusers
being presumed to know what they're doing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 8, 2010 at 8:57 AM, Tatsuo Ishii is...@postgresql.org wrote:
  On that page, the first query takes effectively infinite time on
  PostgreSQL 9.0.1.  The next one can be executed very quickly because the
  join order is explicitly specified by CROSS JOIN syntax. This seems to be
  a limitation of the PostgreSQL optimizer and I would like it removed. Comments?

 It's not easy to make the optimizer degrade gracefully when confronted
 with a very large number of tables, but I agree it would be worthwhile
 if we could figure out how to do it.

There is something funny going on there; it's not just that the planner
is slower with a large flat search space.  It is slower, but only maybe
5x or so.  What I'm seeing is that it actually finds a much worse plan
(very much larger estimated cost as well as actual runtime) when given
the flat problem.  That seems like a bug: a constrained search ought
never find a better solution than an unconstrained search.

It may be that the search heuristics in joinrels.c are failing.  If
so, it may be impractical to fix, ie making this better would slow down
more-typical planning problems enormously.  But it'd be nice to
understand exactly where it's going off the rails.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Tom Lane
I wrote:
 There is something funny going on there; it's not just that the planner
 is slower with a large flat search space.  It is slower, but only maybe
 5x or so.  What I'm seeing is that it actually finds a much worse plan
 (very much larger estimated cost as well as actual runtime) when given
 the flat problem.  That seems like a bug: a constrained search ought
 never find a better solution than an unconstrained search.

Oh, wait: the problem of course is that it's switching into GEQO mode
and hence *not* doing a complete search.  Doh.  If you turn GEQO off
then planning takes ~ forever with the flat version of the query.

We could fix that by forcibly breaking up the search problem in the
same fashion that join_collapse_limit does, but I'm sure we'd get
complaints about that approach.
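For anyone reproducing this, the knobs under discussion can be exercised by hand; this is only a sketch of the tradeoff, not a recommendation:

```sql
-- Exhaustive search: planning the flat 81-way join takes essentially forever.
SET geqo = off;

-- With GEQO on, the genetic (incomplete) search kicks in above this
-- number of FROM items; 12 is the shipped default.
SHOW geqo_threshold;

-- Constrain the search so the planner honors the join order as written,
-- mimicking what the explicit CROSS JOIN formulation achieves.
SET join_collapse_limit = 1;
SET from_collapse_limit = 1;
```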

The real fix in my mind is to replace GEQO search with something
smarter.  I wonder what happened to the SA patch that was reported
on at PGCon.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Jan Urbański
On 08/12/10 18:45, Tom Lane wrote:
 The real fix in my mind is to replace GEQO search with something
 smarter.  I wonder what happened to the SA patch that was reported
 on at PGCon.

I got distracted with other things :( I'll try to plan the two queries
with SA and see what the results are. If they're good it'll certainly
raise my motivation on finishing up the module and proposing it.

Cheers,
Jan

PS: https://github.com/wulczer/saio, although it's still ugly as hell :(

J

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Attached.
 
 Accomplished more through mimicry (based on setting transaction
 isolation level) than profound understanding of the code involved;
 but it passes all regression tests on both `make check` and `make
 installcheck-world`.  This includes a new regression test that an
 attempt to change it after a query fails.  I've poked at it with
 various ad hoc tests, and it is behaving as expected in those.

Hmm.  This patch disallows the case of creating a read-only
subtransaction of a read-write parent.  That's a step backwards.
I'm not sure how we could enforce that the property not change
after the first query of a subxact, but maybe we don't care that much?
Do your optimizations pay attention to local read-only in a subxact?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 12:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 It's also worth noting that ALTER EXTENSION .. SET SCHEMA does NOT
 guarantee a correct relocation, because someone might have done ALTER
 FUNCTION .. SET search_path = @extschema@, and that's not going to get
 properly fixed up.  I'm coming to the conclusion more and more that
 ALTER EXTENSION .. SET SCHEMA just can't work reliably.

 Dimitri's last reply to me
 http://archives.postgresql.org/message-id/87r5ds1v4q@hi-media-techno.com
 suggests that what he has in mind is to define a relocatable extension
 as one that can be relocated ;-), ie it does not contain any such
 gotchas.  Maybe this is too ugly in itself, or not useful enough to be
 worth doing.  But it doesn't seem technically unworkable to me, so long
 as relocatability is made an explicitly declared property of extensions.
 It's certainly true that a large fraction of contrib modules should be
 relocatable in that sense, because they just contain C functions that
 aren't going to care.

I don't find that a very satisfying solution, but I guess we could do
it that way.

 Or are you complaining that somebody could break relocatability after
 the fact by altering the contained objects?  Sure, but he could break
 the extension in any number of other ways as well by making such
 alterations.  The answer to that is privilege checks, and superusers
 being presumed to know what they're doing.

I wasn't complaining about this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-08 Thread Chris Browne
t...@sss.pgh.pa.us (Tom Lane) writes:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 Note that DB2 uses the table modifier VOLATILE to indicate a
 table that has a widely fluctuating table size, for example a
 queue table.
  
 the fact that DB2 uses that word to mean something entirely
 different is certainly a bit awkward
  
 It would be especially awkward should someone port their DB2
 database to PostgreSQL without noticing the semantic difference, and
 then find their data missing.

 Not to mention that DB2 syntax tends to appear in the standard a few
 years later.

And the term "volatile" has well-understood connotations in the C language
and various descendants that are analogous to those in DB2.
http://en.wikipedia.org/wiki/Volatile_variable

I'm not sure UNLOGGED is perfect... If TEMPORARY weren't already
taken, it would be pretty good.

Other possibilities include TRANSIENT, EPHEMERAL, TENUOUS.

FLASH would be an amusing choice.  PostgreSQL 9.1, now with support for
FLASH!
-- 
output = (cbbrowne @ acm.org)
http://linuxdatabases.info/info/internet.html
I've told you for the fifty-thousandth time, stop exaggerating. 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Optimize commit_siblings in two ways to improve group commit.
 First, avoid scanning the whole ProcArray once we know there
 are at least commit_siblings active; second, skip the check
 altogether if commit_siblings = 0.

 Greg Smith

I wonder whether we shouldn't change commit_siblings' default value to
zero while we're at it.
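If the default were changed as suggested, the relevant postgresql.conf lines would read as follows (illustrative; the shipped default for commit_siblings has been 5):

```
commit_delay = 0        # microseconds to wait before flushing WAL at commit
commit_siblings = 0     # with this commit, 0 skips the ProcArray scan entirely
```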

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane   t...@sss.pgh.pa.us   wrote:
 Hmm.  This patch disallows the case of creating a read-only
 subtransaction of a read-write parent.  That's a step backwards.
 I'm not sure how we could enforce that the property not change
 after the first query of a subxact, but maybe we don't care that
 much?  Do your optimizations pay attention to local read-only in a
 subxact?
 
No, it's all about the top level transaction, as long as the
subtransaction doesn't do anything which violates the requirements
of the top level.  (That is, if the top level is not READ ONLY, I
can't do the optimizations, but it would cause no problem if a
subtransaction was READ ONLY -- it just wouldn't allow any special
optimizations.)
 
I noticed that the standard seems (if I'm reading it correctly) to
allow subtransactions to switch to more restrictive settings for
both transaction isolation and read only status than the enclosing
transaction, but not looser.  I don't think it makes sense in
PostgreSQL to say (for example) that the top level transaction is
READ COMMITTED but the subtransaction is SERIALIZABLE, but it might
make sense to say that the top level transaction is READ WRITE but
the subtransaction is READ ONLY.  And I see where I broke support
for that in the patch.
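In session form, the stricter-only rule described above would behave roughly like this (hypothetical until the patch is fixed up):

```sql
BEGIN;                                  -- top-level transaction is READ WRITE
SAVEPOINT s1;
SET LOCAL transaction_read_only = on;   -- tightening inside a subxact: allowed
ROLLBACK TO s1;                         -- the read-write state is restored
-- The reverse (SET LOCAL transaction_read_only = off inside a READ ONLY
-- parent) is the case that must keep being rejected.
COMMIT;
```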
 
I can fix up the patch to support it again if you like.  (I think
it's just a matter of restoring a few lines that I replaced in the
original patch.)  If you'd rather do it, I'll stay out of your way.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I noticed that the standard seems (if I'm reading it correctly) to
 allow subtransactions to switch to more restrictive settings for
 both transaction isolation and read only status than the enclosing
 transaction, but not looser.

Yeah.  My recollection is that we've discussed exactly this point with
respect to isolation level, and decided that we couldn't (or it wasn't
worthwhile to) allow serializable subxacts inside repeatable read.
I don't know whether your patch will change that tradeoff.  But I don't
think it's really been discussed much with respect to read-only, perhaps
because nobody's paid all that much attention to read-only at all.
In any case, the behavior you state seems obviously correct, so let's
see what we can do about getting closer to that.

My guess is that a reasonable fix is to remember the read-only setting
as of snapshot lockdown, and thereafter to allow changing from
read-write to read-only but not vice versa.  One thing to watch for
is allowing subxact exit to restore the previous read-write state.
(BTW it looks like assign_XactIsoLevel emits a rather useless debug
message in that case, so that code could stand some cleanup too.  Also,
that code is set so that it will throw an error even if you're assigning
the currently active setting, which maybe is overly strict?  Not sure.)

  I can fix up the patch to support it again if you like.  (I think
  it's just a matter of restoring a few lines that I replaced in the
 original patch.)  If you'd rather do it, I'll stay out of your way.

Feel free to have a go at it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 1:37 PM, Chris Browne cbbro...@acm.org wrote:
 t...@sss.pgh.pa.us (Tom Lane) writes:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 Note that DB2 uses the table modifier VOLATILE to indicate a
 table that has a widely fluctuating table size, for example a
 queue table.

 the fact that DB2 uses that word to mean something entirely
 different is certainly a bit awkward

 It would be especially awkward should someone port their DB2
 database to PostgreSQL without noticing the semantic difference, and
 then find their data missing.

 Not to mention that DB2 syntax tends to appear in the standard a few
 years later.

  And the term "volatile" has well-understood connotations in the C language
  and various descendants that are analogous to those in DB2.
 http://en.wikipedia.org/wiki/Volatile_variable

 I'm not sure UNLOGGED is perfect... If TEMPORARY weren't already
 taken, it would be pretty good.

  Other possibilities include TRANSIENT, EPHEMERAL, TENUOUS.

 FLASH would be an amusing choice.  PostgreSQL 9.1, now with support for
 FLASH!

The value of VOLATILE, I felt, was that it's sort of like a volatile
variable in C: it might suddenly change under you.  I think that
TRANSIENT and EPHEMERAL and TENUOUS all imply that the table itself is
either temporary or, in the last case, not very dense, which isn't
really what we want to convey.  I did consider EPHEMERAL myself, but
the more I think about it, the more wrong it sounds.  Even the table's
contents are not really short-lived - they may easily last for months
or years.  You just shouldn't rely on it.  I cracked up this morning
imagining calling this CREATE UNRELIABLE TABLE, but I'm starting to
think UNLOGGED is as good as we're going to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 1:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Optimize commit_siblings in two ways to improve group commit.
 First, avoid scanning the whole ProcArray once we know there
 are at least commit_siblings active; second, skip the check
 altogether if commit_siblings = 0.

 Greg Smith

 I wonder whether we shouldn't change commit_siblings' default value to
 zero while we're at it.

Not that I see anything to disagree with in this patch, but what
happened to posting patches in advance of committing them?  Or did I
just miss that part?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Not that I see anything to disagree with in this patch, but what
 happened to posting patches in advance of committing them?  Or did I
 just miss that part?

http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

Possibly it should have been posted to -hackers instead, but surely you
read -performance?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Robert Haas
On Wed, Dec 8, 2010 at 2:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Not that I see anything to disagree with in this patch, but what
 happened to posting patches in advance of committing them?  Or did I
 just miss that part?

 http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

 Possibly it should have been posted to -hackers instead, but surely you
 read -performance?

Oh, yeah I see it now.  I do read -performance, but with two orders of
magnitude more latency than -hackers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 One thing to watch for is allowing subxact exit to restore the
 previous read-write state.
 
OK.
 
 BTW it looks like assign_XactIsoLevel emits a rather useless debug
 message in that case, so that code could stand some cleanup too.
 
I'll take a look.
 
 Also, that code is set so that it will throw an error even if
 you're assigning the currently active setting, which maybe is
 overly strict?  Not sure.
 
The standard is tricky to read, but my reading of it is that only
LOCAL changes are allowed after the transaction is underway (which
I *think* effectively means a subtransaction), and those can't make
the setting less strict -- you're allowed to specify the same level
or more strict.  There would be no harm from the perspective of
anything I'm working on to allow an in-progress transaction to be
set to what it already has, but that seems to invite confusion and
error more than provide a helpful feature, as far as I can tell. 
I'm inclined not to allow it except at the start of a
subtransaction, but don't feel strongly about it.
 
 I can fix up the patch if to support it again if you like.
 
 Feel free to have a go at it.
 
Will do.  I notice that I also removed the check for
RecoveryInProgress(), which I will also restore.  And maybe a
regression test or two for the subtransaction usages.
 
-Kevin



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Also, that code is set so that it will throw an error even if
 you're assigning the currently active setting, which maybe is
 overly strict?  Not sure.
 
 The standard is tricky to read, but my reading of it is that only
 LOCAL changes are allowed after the transaction is underway (which
 I *think* effectively means a subtransaction), and those can't make
 the setting less strict -- you're allowed to specify the same level
 or more strict.  There would be no harm from the perspective of
 anything I'm working on to allow an in-progress transaction to be
 set to what it already has, but that seems to invite confusion and
 error more than provide a helpful feature, as far as I can tell. 
 I'm inclined not to allow it except at the start of a
 subtransaction, but don't feel strongly about it.

Hmm ...

(1) If the standard says that you're allowed to apply a redundant setting,
I think we'd better accept that.

(2) I'm not thrilled about the idea of tracking the equivalent of
FirstSnapshotSet for each subtransaction, which I think would be
necessary infrastructure to error-check this as tightly as you seem to
have in mind.  I'd prefer to be a bit laxer in order to have less
overhead for what is in the end just nanny-ism.  So the implementation
I had in mind would allow SET TRANSACTION operations to occur later in a
subxact, as long as they were redundant and weren't actually trying to
change the active value.

regards, tom lane



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 If the standard says that you're allowed to apply a redundant
 setting, I think we'd better accept that.
 
OK
 
 So the implementation I had in mind would allow SET TRANSACTION
 operations to occur later in a subxact, as long as they were
 redundant and weren't actually trying to change the active value.
 
It's easy to see how I can allow changes in the subtransaction as
long as they don't specify READ WRITE when the top level is READ
ONLY, but it isn't obvious to me how to only allow it at the start
of the subtransaction.  I'm OK with taking the easy route on this
aspect of things, but if someone needs READ ONLY to stick for the
duration of a subtransaction, I'm not sure how to do that.  (And I'm
not sure you were actually suggesting that, either.)
 
To restate, since I'm not sure how clear that is, what I have at the
moment is:
 
(1) A top level transaction can only set READ ONLY or READ WRITE
until it has acquired its first snapshot.
 
(2) A subtransaction can set it at will, as many times as desired,
to match the top level or specify READ ONLY.
 
(3) During recovery the setting cannot be changed from READ ONLY to
READ WRITE.
 
I'm not clear whether #2 is OK or how to do it only at the start.
I haven't yet looked at the other issues you mentioned.
 
-Kevin



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Florian Pflug
On Dec8, 2010, at 20:39 , Kevin Grittner wrote:
 The standard is tricky to read, but my reading of it is that only
 LOCAL changes are allowed after the transaction is underway (which
 I *think* effectively means a subtransaction), and those can't make
 the setting less strict -- you're allowed to specify the same level
 or more strict.  There would be no harm from the perspective of
 anything I'm working on to allow an in-progress transaction to be
 set to what it already has, but that seems to invite confusion and
 error more than provide a helpful feature, as far as I can tell. 
 I'm inclined not to allow it except at the start of a
 subtransaction, but don't feel strongly about it.

Hm, I think being able to assert that the isolation level really is 
SERIALIZABLE by simply doing SET TRANSACTION ISOLATION LEVEL SERIALIZABLE 
would be a great feature for SSI.

Say you've written a trigger which enforces some complex constraint, but is 
correct only for SERIALIZABLE transactions. By simply sticking a SET 
TRANSACTION ISOLATION LEVEL SERIALIZABLE at the top of the trigger you'd both 
document the fact that it is correct only for SERIALIZABLE transactions *and* 
prevent corruption should the isolation level be something else due to a pilot 
error. Nice, simple and quite effective.

BTW, I hope to find some time this evening to review your more detailed 
proposal for serializable read only deferrable.

best regards,
Florian Pflug




Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Exposing it to the user is what I think is ugly.

Ok, and the current idea fixes that! :)

 It's also worth noting that ALTER EXTENSION .. SET SCHEMA does NOT
 guarantee a correct relocation, because someone might have done ALTER
 FUNCTION .. SET search_path = @extschema@, and that's not going to get
 properly fixed up.  I'm coming to the conclusion more and more that
 ALTER EXTENSION .. SET SCHEMA just can't work reliably.

For starters, an extension's script that requires an @extschema@
property is to be marked non-relocatable, so the command here would just
error out. Then again, should the extension's author forget and mark it
relocatable anyway, the @extschema@ placeholder is not replaced in the
current proposal, which means the objects' schemas are all hard-coded in
the script.

Then, if the superuser thinks it's a good idea to break the extension
after it's been installed, I'm not sure we can do anything about it.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 So the implementation I had in mind would allow SET TRANSACTION
 operations to occur later in a subxact, as long as they were
 redundant and weren't actually trying to change the active value.
 
 It's easy to see how I can allow changes in the subtransaction as
 long as they don't specify READ WRITE when the top level is READ
 ONLY, but it isn't obvious to me how to only allow it at the start
 of the subtransaction.  I'm OK with taking the easy route on this
 aspect of things, but if someone needs READ ONLY to stick for the
 duration of a subtransaction, I'm not sure how to do that.  (And I'm
 not sure you were actually suggesting that, either.)

What I suggested was to not allow read-only - read-write state
transitions except (1) before first snapshot in the main xact
and (2) at subxact exit (the OVERRIDE case).  That seems to accomplish
the goal.  Now it will also allow dropping down to read-only
mid-subtransaction, but I don't think forbidding that case is worth
extra complexity.
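Those transition rules are easy to model with a small sketch (a toy for the reader's benefit, not the patch itself; all names here are invented):

```python
class XactState:
    """Toy model of the rules above: a switch to read-write is allowed
    only before the first snapshot of the main transaction; a subxact
    may drop to read-only at any time, and its caller's state is
    restored on exit (the OVERRIDE case)."""
    def __init__(self):
        self.read_only = False
        self.first_snapshot_taken = False
        self.stack = []          # saved read_only states of open subxacts

    def set_read_only(self, flag: bool):
        if not flag and self.read_only:
            # read-only -> read-write: only in the main transaction,
            # and only before the first snapshot is taken
            if self.first_snapshot_taken or self.stack:
                raise ValueError("cannot escape READ ONLY here")
        self.read_only = flag

    def begin_subxact(self):
        self.stack.append(self.read_only)

    def end_subxact(self):
        # restore the caller's state on subxact exit
        self.read_only = self.stack.pop()

x = XactState()
x.set_read_only(True)       # main xact, before first snapshot: allowed
x.begin_subxact()
x.set_read_only(True)       # redundant setting: allowed
try:
    x.set_read_only(False)  # escaping READ ONLY mid-subxact: rejected
except ValueError:
    pass
x.end_subxact()             # caller's READ ONLY state restored
```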

regards, tom lane



Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Jan Urbański
On 08/12/10 19:02, Jan Urbański wrote:
 On 08/12/10 18:45, Tom Lane wrote:
 The real fix in my mind is to replace GEQO search with something
 smarter.  I wonder what happened to the SA patch that was reported
 on at PGCon.
 
 I got distracted with other things :( I'll try to plan the two queries
 with SA and see what the results are. If they're good it'll certainly
 raise my motivation on finishing up the module and proposing it.

I'm pleasantly surprised that with the SA code as it stands today,
setting the equilibrium factor to 8 and the temperature reduction factor
to 0.4, the query takes 1799.662 ms in total. With the default values it
runs forever, but I long ago discovered that defaults taken from the
original paper are not well suited for my PG implementation (I could
plug my MSc thesis here, but I'm way too shy for that). 8/0.4 are values
where I got better results than GEQO for Andres' monster-query.

Maybe it actually has some value after all... Let's see if I can
untangle myself from plpython in time to clean up that code before January.

Cheers,
Jan



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 Hm, I think being able to assert that the isolation level really
 is SERIALIZABLE by simply doing SET TRANSACTION ISOLATION LEVEL
 SERIALIZABLE would be a great feature for SSI.
 
 Say you've written a trigger which enforces some complex
 constraint, but is correct only for SERIALIZABLE transactions. By
 simply sticking a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 at the top of the trigger you'd both document the fact that it is
  correct only for SERIALIZABLE transactions *and* prevent
  corruption should the isolation level be something else due to
  a pilot error. Nice, simple and quite effective.
 
It would be great to have a way within a trigger, or possibly other
functions, to assert that the transaction isolation level is
serializable.  What gives me pause here is that the standard allows
you to specify a more strict transaction isolation level within a
subtransaction without error, so this way of spelling the feature is
flirting with rather nonstandard behavior.
 
Is there maybe a better way to check this?
 
-Kevin



Re: [HACKERS] unlogged tables

2010-12-08 Thread Kineticode Billing
On Dec 8, 2010, at 10:37 AM, Chris Browne wrote:

 Other possibilities include TRANSIENT, EPHEMERAL, TENUOUS.

EVANESCENT.

David




Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Dimitri's last reply to me
 http://archives.postgresql.org/message-id/87r5ds1v4q@hi-media-techno.com
 suggests that what he has in mind is to define a relocatable extension
 as one that can be relocated ;-), ie it does not contain any such
 gotchas.  Maybe this is too ugly in itself, or not useful enough to be
 worth doing.  But it doesn't seem technically unworkable to me, so long
 as relocatability is made an explicitly declared property of extensions.

Well, it does not seem complex to code. It's about having a new boolean
property in the control file, relocatable. This property is
required and controls the behavior of the CREATE EXTENSION ... WITH
SCHEMA command. When true we use the ALTER EXTENSION SET SCHEMA code
path and when false, the placeholder replacement code path. The ALTER
command has already been developed so I need to merge it into the main
patch.
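For illustration, a control file carrying such a property might look like this (a sketch only; the exact key names and syntax were still under discussion at the time):

```
# hstore.control -- hypothetical example
comment      = 'data type for storing sets of (key, value) pairs'
version      = '1.0'
relocatable  = true
```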

The ALTER EXTENSION SET SCHEMA command needs to be adapted so that it
checks that all the extension's objects are currently in the same schema
and error out if that's not the case.

I'm not going to be able to deliver a new patch including that and the
other changes required by David Wheeler's review by tonight, but Friday
evening seems like a reasonable target.

 It's certainly true that a large fraction of contrib modules should be
 relocatable in that sense, because they just contain C functions that
 aren't going to care.

As they are all currently using the SET search_path TO public; trick, I
think they are all relocatable as-is, and all I need to do is remove that
line (and add the property to the control file).

 Or are you complaining that somebody could break relocatability after
 the fact by altering the contained objects?  Sure, but he could break
 the extension in any number of other ways as well by making such
 alterations.  The answer to that is privilege checks, and superusers
 being presumed to know what they're doing.

+1

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Tom Lane
Jan Urbański wulc...@wulczer.org writes:
 I'm pleasantly surprised that the SA code as it stands today, setting
 the equilibrium factor to 8 and temperature reduction factor to 0.4, the
 query takes 1799.662 ms in total.

Cool.

 With the default values it runs
 forever, but I long discovered that defaults taken from the original
 paper are not well suited for my PG implementation (I could plug my MSc
 thesis here, but I'm way too shy for that). 8/0.4 are values where I got
 better results than GEQO for Andres' monster-query.

Hmmm ... runs forever is a bit scary.  One of the few good things I
can say about GEQO is that it will terminate in a reasonable amount of
time for even quite large problems.  I would like to think that SA will
also have that property.  I thought that the annealing approach was sure
to terminate in a fixed number of steps?  Or did you mean that the
planner terminated, but produced a horrid plan?

regards, tom lane



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Kineticode Billing
On Dec 8, 2010, at 1:39 AM, Dimitri Fontaine wrote:

 David E. Wheeler da...@kineticode.com writes:
 What about unaccent? Or lo (1 domain, 2 functions)?
 
 Sure. Doesn't have to actually do anything.
 
 Ok, so that's already in the patch :)

No, it's not. There are no unit tests at all. You can call the contrib modules 
and their tests acceptance tests, but that's not the same thing.

 I see that you're not too much into packaging, but here, we don't ever
 use `make install` on a production machine. This step happens on the
 packaging server, then we install and QA the stuff, then the package
 gets installed on the servers where we need it.
 
 Also, I don't see how make install is going to know which cluster it
 should talk to — it's quite easy and typical to run this command on a
 server where you have several major versions installed, and several
 clusters per major version.

Yeah, one installs extensions into a PostgreSQL install, not a cluster. I get 
that.

 So, again, the data that you so want to remove from the control files I
 have no idea at all where to put it.

Okay, keep the installed control files. But don't make me distribute them 
unless absolutely necessary.

 Possibly. I'm not going to do it this week; seems like there are some
 issues that still need shaking out in the implementation, to judge
 from the pg_execute_from_file review thread.
 
 Yeah, dust ain't settled completely yet… working on that.

Right.

 Each would get a separate control file. The mapping of one version
 number to multiple extensions is potentially confusing.
 
 Funny, each already gets a separate control file now.
 
 $ ls contrib/spi/*control.in
 autoinc.control.in  auto_username.control.in  moddatetime.control.in
 refint.control.in  timetravel.control.in
 
 Then the idea behind the version number in the Makefile is that you
 generally are maintaining it there and don't want to have to maintain it
 in more than one place.

Sure. But you're mandating one version even if you have multiple extensions. 
That's the potentially confusing part.

 Why is that? We currently manage multiple script files, test files,
 etc. in a single Makefile. Wildcard operators are very useful for this
 sort of thing.
 
 Well, that was you saying just above that using the same EXTVERSION Make
 variable for more than one control file is potentially confusing. What
 about using all the other variables in the same way?

What? I don't follow what you're saying.

 Yes, that would be preferable, but a one-step operation would of
 course be ideal.
 
 Thinking about it, as proposed in the other thread, I now think that the
 2-steps operation could be internal and not user exposed.

Maybe. I'm still not convinced that you need the replace() stuff at all, though 
I can see the utility of it.

 Some do require shared_preload_libraries, no?
 
 One of them only, pg_stat_statements.

In contrib. You seem to forget that there are a lot of third-party extensions 
out there already.

SET client_min_messages TO warning;
SET log_min_messages TO warning;
 
 Though I think I'd rather that the warning still went to the log.
 
 (that's about hstore verbosity) ok will see about changing
 client_min_messages around the CREATE OPERATOR =.

I would much rather retain that warning -- everyone should know about it -- and 
somehow convince SPI to be much less verbose in reporting issues. It should 
specify where the error came from (which query) and what the error actually is.

Best,

David





Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Florian Pflug f...@phlo.org wrote:
 Say you've written a trigger which enforces some complex
 constraint, but is correct only for SERIALIZABLE transactions. By
 simply sticking a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 at the top of the trigger you'd both document the fact that it is
 correct only for SERIALIZABLE transactions *and* prevent
 corruption should the isolation level be something else due to
 a pilot error. Nice, simple and quite effective.
 
 It would be great to have a way within a trigger, or possibly other
 functions, to assert that the transaction isolation level is
 serializable.  What gives me pause here is that the standard allows
 you to specify a more strict transaction isolation level within a
 subtransaction without error, so this way of spelling the feature is
 flirting with rather nonstandard behavior.

Yes.  This is not the way to provide a feature like that.
 
 Is there maybe a better way to check this?

You can always read the current setting and throw an error if you
don't like it.
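For the archives, that check can be spelled directly in PL/pgSQL (a sketch only; the function name and error text are made up, but current_setting('transaction_isolation') is the real way to read the active level):

```sql
CREATE OR REPLACE FUNCTION assert_serializable() RETURNS trigger AS $$
BEGIN
    -- Read the current setting and refuse to run otherwise,
    -- instead of trying to SET it mid-transaction.
    IF current_setting('transaction_isolation') <> 'serializable' THEN
        RAISE EXCEPTION 'this trigger is only correct at SERIALIZABLE';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```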

regards, tom lane



XLog vs SSD [Was: Re: [HACKERS] random write in xlog?]

2010-12-08 Thread James Cloos
 JJ == Jeff Janes jeff.ja...@gmail.com writes:

JJ Anyway, the writes are logically sequentially, but not physically.
JJ If I remember correctly, it always writes out full blocks, even if
JJ the last part of the block has not yet been filled with new data.
JJ When the remainder gets filled, it then writes out the full block
JJ again, both the already written and the new part.

What does that mean for use of a flash SSD for the xlog dir?

Does the block writing mesh well with the usage pattern a flash
SSD needs to maximize lifespan?

I'd love a dram ssd for pg_xlog and the journals for the other
filesystems, but they cost too much.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Kineticode Billing
On Dec 8, 2010, at 12:18 PM, Dimitri Fontaine wrote:

 It's certainly true that a large fraction of contrib modules should be
 relocatable in that sense, because they just contain C functions that
 aren't going to care.
 
 As they all currently are using the SET search_path TO public; trick, I
 think they are all relocatable as is and all I need is to remove that
 line (and add the property to the control file).

+1 This alone would be a huge improvement, solving my complaint about 
@extschema@ for the vast majority of cases. It's reasonable to make the author 
of a multi-schema extension or an extension that cannot be moved do more work.

Best,

David


Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Jan Urbański
On 08/12/10 21:18, Tom Lane wrote:
 Jan Urbański wulc...@wulczer.org writes:
 I'm pleasantly surprised that the SA code as it stands today, setting
 the equilibrium factor to 8 and temperature reduction factor to 0.4, the
 query takes 1799.662 ms in total.
 
 Cool.
 
 With the default values it runs
 forever, but I long discovered that defaults taken from the original
 paper are not well suited for my PG implementation (I could plug my MSc
 thesis here, but I'm way too shy for that). 8/0.4 are values where I got
 better results than GEQO for Andres' monster-query.
 
 Hmmm ... runs forever is a bit scary.  One of the few good things I
 can say about GEQO is that it will terminate in a reasonable amount of
 time for even quite large problems.  I would like to think that SA will
 also have that property.  I thought that the annealing approach was sure
 to terminate in a fixed number of steps?  Or did you mean that the
 planner terminated, but produced a horrid plan?

It finishes after a bounded number of steps, but with high values of the
temperature reduction factor it takes a lot of time for the temperature
to fall low enough to consider the system frozen, so that number of
steps is big.

With SA you start with a temperature that's linearly dependent on the
size of the query, and back off exponentially. Each step means work that
also depends on the size of the query, so big queries can mean expensive
steps. With q=0.9 and initial temperature=very-big it takes too much
time to plan.

The good thing is that it's trivial to implement a hard cut-off, which
would stop annealing after a fixed number of steps (regardless of the
current temperature) and serve as a safety valve.
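A toy version of that loop (not the patch's actual code; the parameter names and the cost function here are made up for illustration) might look like:

```python
import math
import random

def anneal(cost, neighbor, start, temp0, q=0.4, equilibrium=8,
           max_steps=10000):
    """Toy simulated annealing: run `equilibrium` moves per temperature
    level, then multiply the temperature by the reduction factor q.
    `max_steps` is the hard cut-off safety valve."""
    random.seed(0)                      # deterministic for the demo
    state = best = start
    temp, steps = temp0, 0
    while temp > 1e-3 and steps < max_steps:
        for _ in range(equilibrium):
            steps += 1
            cand = neighbor(state)
            delta = cost(cand) - cost(state)
            # Always accept improvements; accept regressions with
            # probability exp(-delta/temp), which shrinks as temp falls.
            if delta <= 0 or random.random() < math.exp(-delta / temp):
                state = cand
                if cost(state) < cost(best):
                    best = state
        temp *= q                       # exponential temperature back-off
    return best, steps

# Minimize x^2 over the integers, starting far from the optimum.
best, steps = anneal(cost=lambda x: x * x,
                     neighbor=lambda x: x + random.choice((-1, 1)),
                     start=50, temp0=100.0)
```

With a high q the `while` condition keeps the loop alive for many more temperature levels, which is where the long planning times come from; the `max_steps` bound caps that regardless.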

Jan



Re: [HACKERS] Solving sudoku using SQL

2010-12-08 Thread Tom Lane
Jan Urbański wulc...@wulczer.org writes:
 On 08/12/10 21:18, Tom Lane wrote:
 Hmmm ... runs forever is a bit scary.

 With SA you start with a temperature that's linearly dependent on the
 size of the query, and back off exponentially. Each step means work that
 also depends on the size of the query, so big queries can mean expensive
 steps. With q=0.9 and initial temperature=very-big it takes too much
 time to plan.

 The good thing is that it's trivial to implement a hard cut-off value,
 which will stop annealing after a fixed number of steps (regardless of
 the current temperature) that would serve as a safety valve.

Well, let's wait and see whether experience says we need that.  A
hard-wired cutoff risks returning a pretty bad plan, and we have no
experience yet with how fail-soft SA is.

Something that might be more useful is an escape that quits as soon as
the best plan's estimated cost is less than something-or-other.  There's
no point in expending more planner time to improve the plan than you can
hope to recoup at execution.

regards, tom lane



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
Kineticode Billing da...@kineticode.com writes:
 No, it's not. There are no unit tests at all. You can call the contrib
 modules and their tests acceptance tests, but that's not the same
 thing.

Ok, I need some more guidance here. All contrib extensions (there are 38
of them) use the CREATE EXTENSION command and check the result
with the pg_regress framework. What are we missing?

I can see about adding DROP EXTENSION for all the tests, but that's
about it.

 Okay, keep the installed control files. But don't make me distribute
 them unless absolutely necessary.

Yes you have to distribute them, that's necessary. Sorry about that.

 Then the idea behind the version number in the Makefile is that you
 generally are maintaining it there and don't want to have to maintain it
 in more than one place.

 Sure. But you're mandating one version even if you have multiple
 extensions. That's the potentially confusing part.

I see why it's confusing, because what you say isn't actually true. You
can always put different version numbers in the control files, and even
skip the rule producing the .control from the .control.in by providing
the .control directly. That's just a convenience.

 Why is that? We currently manage multiple script files, test files,
 etc. in a single Makefile. Wildcard operators are very useful for this
 sort of thing.
 
 Well, that was you saying just above that using the same EXTVERSION Make
 variable for more than one control file is potentially confusing. What
 about using all the other variables in the same way?

 What? I don't follow what you're saying.

You're complaining that a single EXTVERSION applied to more than one
extension's control file is confusing. What if we had EXTCOMMENT and
EXTRELOCATABLE in there too? What exactly are you expecting the Makefile
to look like?

 In contrib. You seem to forget that there are a lot of third-party
 extensions out there already.

True. That's still not the common case, and it's still covered the same
way as before: you need to restart to attach to shared memory.

 I would much rather retain that warning -- everyone should know about
 it -- and somehow convince SPI to be much less verbose in reporting
 issues. It should specify where the error came from (which query) and
 what the error actually is.

The problem is much more complex than that and could well kill the patch
if we insist on fixing it as part of the extension work for v1. The
problem is exposing more internals of the SQL parser through SPI so that
you can send a bunch of queries in an explicit way.  Mind you, the
first version of the patch had something like that in there, but it
wouldn't have supported this use case. I've been told to simply use SPI
there.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Greg Smith

Tom Lane wrote:

http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

Possibly it should have been posted to -hackers instead, but surely you
read -performance?
  


Trying to figure out what exactly commit_delay and commit_siblings did 
under the hood was actually the motivation behind my first foray into 
reading the PostgreSQL source code.  Ever since, I've been annoyed that 
the behavior didn't really help the way it's intended, but was not sure 
what would be better.  The additional input from Jignesh this week on 
the performance list suddenly made it crystal clear what would preserve 
the good behavior he had seen, even improving things for his case, while 
also helping, a little, the majority who won't benefit from the 
commit_delay behavior at all.  I immediately wrote the patch and 
breathed a sigh of relief that it was finally going to get better.


I then posted the patch and added it to the January CF.  Unbeknownst to 
me until today, Simon had the same multi-year "this itches and I can't 
make it stop" feeling toward these parameters, and that's how it jumped 
the standard process.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread David E. Wheeler
On Dec 8, 2010, at 12:42 PM, Dimitri Fontaine wrote:

 Kineticode Billing da...@kineticode.com writes:
 No, it's not. There are no unit tests at all. You can call the contrib
 modules and their tests acceptance tests, but that's not the same
 thing.
 
 Ok, I need some more guidance here. All contrib extension (there are 38
 of them) are using the CREATE EXTENSION command and checking the result
 with the pg_regress framework. What are we missing?

Unit tests. You add a bunch of functions. You need to test those functions.

 I can see about adding DROP EXTENSION for all the tests, but that's
 about it.

If you add that, you'll also need something to CREATE EXTENSION with, eh? And 
also, tests to make sure that WITH SCHEMA works properly (however that shakes 
out).

 Okay, keep the installed control files. But don't make me distribute
 them unless absolutely necessary.
 
 Yes you have to distribute them, that's necessary. Sorry about that.

I don't see why. Most of them are dead simple and could easily be Makefile 
variables.

 Sure. But you're mandating one version even if you have multiple
 extensions. That's the potentially confusing part.
 
 I see how confusing it is, because what you say ain't true. You can
 always put different version numbers in the control file and even skip
 the rule to produce the .control from the .control.in by providing the
 .control directly. That's just a facility here.

I see, okay.

 What? I don't follow what you're saying.
 
 You're complaining that a single EXTVERSION applied to more than one
 extension's control file is confusing. What if we had EXTCOMMENT and
 EXTRELOCATABLE in there too? What exactly are you expecting the Makefile
 to look like?

Mostly these will all have only one setting. In more complex cases perhaps one 
*would* be required to distribute a control file.

 In contrib. You seem to forget that there are a lot of third-party
 extensions out there already.
 
 True. That's still not the common case, and it's still covered the same
 way as before, you need to restart to attach to shared memory.

Okay.

 I would much rather retain that warning -- everyone should know about
 it -- and somehow convince SPI to be much less verbose in reporting
 issues. It should specify where the error came from (which query) and
 what the error actually is.
 
 The problem is much more complex than that and could well kill the patch
 if we insist on fixing it as part of the extension's work, v1. The
 problem is exposing more internals of the SQL parser into SPI so that
 you can send a bunch of queries in an explicit way.  Mind you, the
 first version of the patch had something like that in there, but that
 wouldn't have supported this use case. I've been told to simply use SPI
 there.

I agree that SPI should be fixed in a different project/patch. Go with what 
you've got, it will just highlight the problem with SPI more.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperlu problem with utf8

2010-12-08 Thread Oleg Bartunov

On Wed, 8 Dec 2010, David E. Wheeler wrote:


On Dec 8, 2010, at 8:13 AM, Oleg Bartunov wrote:


adding utf8::decode($_[0]) solves the problem:

knn=# CREATE OR REPLACE FUNCTION url_decode(Vkw varchar) RETURNS varchar  AS $$
   use strict;
   use URI::Escape;
   utf8::decode($_[0]);
   return uri_unescape($_[0]); $$ LANGUAGE plperlu;


Hrm. Ideally all strings passed to PL/Perl functions would be decoded.


yes, this is what I expected.


Best,

David





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Final(?) proposal for wal_sync_method changes

2010-12-08 Thread Tom Lane
Given my concerns around exactly what is going on in the Windows code,
I'm now afraid to mess with an all-platforms change to fdatasync as the
preferred default; if we do that it should probably just be in HEAD not
the back branches.  So I've come around to the idea that Marti's
proposal of a PLATFORM_DEFAULT_SYNC_METHOD symbol is the best way.
(One reason for adopting that rather than some other way is that it
seems quite likely we'll end up needing it for Windows.)

I haven't touched the documentation yet, but attached is a proposed
code patch against HEAD.  This forces the default to fdatasync on Linux,
and makes some cosmetic cleanups around the HAVE_FSYNC_WRITETHROUGH_ONLY
confusion.

regards, tom lane

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fd5ec78..4f7dc39 100644
*** a/src/backend/storage/file/fd.c
--- b/src/backend/storage/file/fd.c
*** static bool looks_like_temp_rel_name(con
*** 260,271 
  int
  pg_fsync(int fd)
  {
! #ifndef HAVE_FSYNC_WRITETHROUGH_ONLY
! 	if (sync_method != SYNC_METHOD_FSYNC_WRITETHROUGH)
! 		return pg_fsync_no_writethrough(fd);
  	else
  #endif
! 		return pg_fsync_writethrough(fd);
  }
  
  
--- 260,272 
  int
  pg_fsync(int fd)
  {
! 	/* #if is to skip the sync_method test if there's no need for it */
! #if defined(HAVE_FSYNC_WRITETHROUGH) && !defined(FSYNC_WRITETHROUGH_IS_FSYNC)
! 	if (sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH)
! 		return pg_fsync_writethrough(fd);
  	else
  #endif
! 		return pg_fsync_no_writethrough(fd);
  }
  
  
diff --git a/src/include/access/xlogdefs.h b/src/include/access/xlogdefs.h
index 18b214e..072096d 100644
*** a/src/include/access/xlogdefs.h
--- b/src/include/access/xlogdefs.h
*** typedef uint32 TimeLineID;
*** 123,134 
  #endif
  #endif
  
! #if defined(OPEN_DATASYNC_FLAG)
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_OPEN_DSYNC
  #elif defined(HAVE_FDATASYNC)
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FDATASYNC
- #elif defined(HAVE_FSYNC_WRITETHROUGH_ONLY)
- #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FSYNC_WRITETHROUGH
  #else
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FSYNC
  #endif
--- 123,134 
  #endif
  #endif
  
! #if defined(PLATFORM_DEFAULT_SYNC_METHOD)
! #define DEFAULT_SYNC_METHOD		PLATFORM_DEFAULT_SYNC_METHOD
! #elif defined(OPEN_DATASYNC_FLAG)
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_OPEN_DSYNC
  #elif defined(HAVE_FDATASYNC)
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FDATASYNC
  #else
  #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FSYNC
  #endif
diff --git a/src/include/port/linux.h b/src/include/port/linux.h
index b9498b2..bcaa42d 100644
*** a/src/include/port/linux.h
--- b/src/include/port/linux.h
***
*** 12,14 
--- 12,22 
   * to have a kernel version test here.
   */
  #define HAVE_LINUX_EIDRM_BUG
+ 
+ /*
+  * Set the default wal_sync_method to fdatasync.  With recent Linux versions,
+  * xlogdefs.h's normal rules will prefer open_datasync, which (a) doesn't
+  * perform better and (b) causes outright failures on ext4 data=journal
+  * filesystems, because those don't support O_DIRECT.
+  */
+ #define PLATFORM_DEFAULT_SYNC_METHOD	SYNC_METHOD_FDATASYNC
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index 3417ab5..9c2ae4d 100644
*** a/src/include/port/win32.h
--- b/src/include/port/win32.h
***
*** 34,47 
  /* Must be here to avoid conflicting with prototype in windows.h */
  #define mkdir(a,b)	mkdir(a)
  
- #define HAVE_FSYNC_WRITETHROUGH
- #define HAVE_FSYNC_WRITETHROUGH_ONLY
  #define ftruncate(a,b)	chsize(a,b)
  /*
!  *	Even though we don't support 'fsync' as a wal_sync_method,
!  *	we do fsync() a few other places where _commit() is just fine.
   */
! #define fsync(fd) _commit(fd)
  
  #define USES_WINSOCK
  
--- 34,51 
  /* Must be here to avoid conflicting with prototype in windows.h */
  #define mkdir(a,b)	mkdir(a)
  
  #define ftruncate(a,b)	chsize(a,b)
+ 
+ /* Windows doesn't have fsync() as such, use _commit() */
+ #define fsync(fd) _commit(fd)
+ 
  /*
!  * For historical reasons, we allow setting wal_sync_method to
!  * fsync_writethrough on Windows, even though it's really identical to fsync
!  * (both code paths wind up at _commit()).
   */
! #define HAVE_FSYNC_WRITETHROUGH
! #define FSYNC_WRITETHROUGH_IS_FSYNC
  
  #define USES_WINSOCK
  



Re: [HACKERS] wCTE behaviour

2010-12-08 Thread David Fetter
On Wed, Dec 08, 2010 at 01:23:59PM +0200, Marko Tiikkaja wrote:
 On 2010-12-08 10:19 AM +0200, David Fetter wrote:
 On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:
 So this patch was marked Ready for Committer, but a) no committer
 has picked it up yet and b) Marko has made changes here that nobody
 else has tested out yet that I've seen on the list.  Accordingly,
 that classification may have been optimistic.  It seems to me that
 another testing run-through from someone like David might be
 appropriate to build some confidence this latest patch should be a
 commit candidate.  If there is a committer intending to work on this
 as-is, they haven't identified themselves.
 
 I've tested this one and not managed to break it.  One thing it could
 use is support for EXPLAIN ANALYZE.
 
 What's wrong with EXPLAIN ANALYZE?  Here's what I see:

Oops!

I am terribly sorry.  It was an earlier patch I didn't manage to
break.  I've tried all the same things on this one, and no breakage so
far.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] pl/python improvements

2010-12-08 Thread Peter Eisentraut
On tis, 2010-12-07 at 23:56 +0100, Jan Urbański wrote:
 Peter suggested having a mail/patch per feature and the way I intend
 to do that is instead of having a dozen branches, have one and after
 I'm done rebase it interactively to produce incremental patches that
 apply to master, each one implementing one feature.

Fair enough if you want to do it that way, but I'd encourage you to just
send in any self-contained features/changes that you have finished.




Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 What I suggested was to not allow read-only - read-write state
 transitions except (1) before first snapshot in the main xact
 and (2) at subxact exit (the OVERRIDE case).  That seems to
 accomplish the goal.  Now it will also allow dropping down to
 read-only mid-subtransaction, but I don't think forbidding that
 case is worth extra complexity.
 
Attached is version 2.  I think it does everything we discussed,
except that I'm not sure whether I addressed the assign_XactIsoLevel
issue you mentioned, since you mentioned a debug message and I only
see things that look like errors to me.  If I did miss something,
I'll be happy to take another look if you can point me to the right
place.
 
-Kevin
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 544,551  show_log_timezone(void)
--- 544,595 
  
  
  /*
+  * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+  *
+  * These should be transaction properties which can be set in exactly the
+  * same points in time that transaction isolation may be set.
+  */
+ bool
+ assign_transaction_read_only(bool newval, bool doit, GucSource source)
+ {
+   /* Can't go to r/w mode inside a r/o transaction */
+   if (newval == false && XactReadOnly && IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction read-write mode inside a read-only transaction")));
+   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+   if (source != PGC_S_OVERRIDE)
+   return false;
+   }
+   /* Top level transaction can't change this after first snapshot. */
+   else if (FirstSnapshotSet && !IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+errmsg("read-only property must be set before any query")));
+   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+   if (source != PGC_S_OVERRIDE)
+   return false;
+   }
+   /* Can't go to r/w mode while recovery is still active */
+   else if (newval == false && XactReadOnly && RecoveryInProgress())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction read-write mode during recovery")));
+   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+   if (source != PGC_S_OVERRIDE)
+   return false;
+   }
+ 
+   return true;
+ }
+ 
+ /*
   * SET TRANSACTION ISOLATION LEVEL
   */
+ extern char *XactIsoLevel_string; /* in guc.c */
  
  const char *
  assign_XactIsoLevel(const char *value, bool doit, GucSource source)
***
*** 559,565  assign_XactIsoLevel(const char *value, bool doit, GucSource 
source)
if (source != PGC_S_OVERRIDE)
return NULL;
}
!   else if (IsSubTransaction())
{
ereport(GUC_complaint_elevel(source),
(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
--- 603,610 
if (source != PGC_S_OVERRIDE)
return NULL;
}
!   else if (IsSubTransaction() &&
! strcmp(value, XactIsoLevel_string) != 0)
{
ereport(GUC_complaint_elevel(source),
(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 169,175  static bool assign_bonjour(bool newval, bool doit, GucSource 
source);
  static bool assign_ssl(bool newval, bool doit, GucSource source);
  static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
  static bool assign_log_stats(bool newval, bool doit, GucSource source);
- static bool assign_transaction_read_only(bool newval, bool doit, GucSource 
source);
  static const char *assign_canonical_path(const char *newval, bool doit, 
GucSource source);
  static const char *assign_timezone_abbreviations(const char *newval, bool 
doit, GucSource source);
  static const char *show_archive_command(void);
--- 169,174 
***
*** 426,432  static int server_version_num;
  static char *timezone_string;
  static char *log_timezone_string;
  static char *timezone_abbreviations_string;
- static char *XactIsoLevel_string;
  static char *custom_variable_classes;
static int max_function_args;
static int max_index_keys;
--- 425,430 
***
*** 441,446  static int effective_io_concurrency;
--- 439,445 
  /* should be 

Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:
 I don't see why. Most of them are dead simple and could easily be
 Makefile variables.

And how does the information flows from the Makefile to the production
server, already?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread David E. Wheeler
On Dec 8, 2010, at 1:53 PM, Dimitri Fontaine wrote:

 I don't see why. Most of them are dead simple and could easily be
 Makefile variables.
 
 And how does the information flows from the Makefile to the production
 server, already?

`make` generates the file if it doesn't already exist.

David



Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 I then posted the patch and added it to the January CF.  Unbeknownst to 
me until today, Simon had the same multi-year "this itches and I can't 
make it stop" feel toward these parameters, and that's how it jumped the 
standard process.

I think pretty much everybody who's looked at that code had the same
feeling.  If Simon hadn't taken it, I might have.

Jignesh's explanation of what the actual usefulness of the code is
finally made sense to me: the sleep calls effectively synchronize
multiple nearby commits to happen at the next scheduler clock tick,
and then whichever one grabs the WALWriteLock first does the work.
If you've got a high enough commit volume that this is likely to
be a win, then it's unclear that taking ProcArrayLock (even shared)
to check for guys who might commit shortly is a net win.  Moreover,
it's likely that that heuristic will exclude the last-to-arrive
process who otherwise could have participated in a group flush.
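
Tom's description above can be sketched as a toy group-flush loop (a hypothetical C/pthreads simplification; the LSN variables, counter, and lock names are invented for illustration, not the actual xlog.c code):

```c
#include <assert.h>
#include <pthread.h>

/* Highest LSN whose WAL has reached "disk", and the highest LSN any
 * backend has asked to have flushed. */
static long flushedUpTo = 0;
static long requestedUpTo = 0;
static int  physicalFlushes = 0;        /* counts actual writes */
static pthread_mutex_t walWriteLock = PTHREAD_MUTEX_INITIALIZER;

/* Each committing backend calls this with its commit record's LSN.
 * Whoever grabs walWriteLock first flushes everything requested so far;
 * latecomers find their LSN already flushed and return immediately. */
static void
commit_flush(long myLsn)
{
    if (myLsn > requestedUpTo)
        requestedUpTo = myLsn;          /* racy in real life; fine for a sketch */

    if (myLsn <= flushedUpTo)
        return;                         /* somebody already did our work */

    pthread_mutex_lock(&walWriteLock);
    if (myLsn > flushedUpTo)
    {
        flushedUpTo = requestedUpTo;    /* one write covers the whole group */
        physicalFlushes++;
    }
    pthread_mutex_unlock(&walWriteLock);
}
```

In this sketch the commit_delay sleep is what lets several backends register their LSNs in requestedUpTo before one of them grabs the lock, so a single physical write can satisfy the whole group.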

I'm not entirely convinced that zero commit_siblings is a better
default than small positive values, but it's certainly plausible.

regards, tom lane



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:
 And how does the information flows from the Makefile to the production
 server, already?

 `make` generates the file if it doesn't already exist.

Again, will retry when possible, but it has been a time sink once already.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 except that I'm not sure whether I addressed the assign_XactIsoLevel
 issue you mentioned, since you mentioned a debug message and I only
 see things that look like errors to me.  If I did miss something,
 I'll be happy to take another look if you can point me to the right
 place.

GUC_complaint_elevel() can return DEBUGn, and in fact will do so in
the PGC_S_OVERRIDE case.  I'm thinking that the code ought to look
more like

/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
if (source != PGC_S_OVERRIDE)
{
check and report all the complaint-worthy cases;
}

The original coding was probably sane for initial development, but
now it just results in useless debug-log traffic for predictable
perfectly valid cases.

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

2010-12-08 Thread Greg Smith

Tom Lane wrote:

I'm not entirely convinced that zero commit_siblings is a better
default than small positive values, but it's certainly plausible.
  


Not being allowed to set it to zero was certainly a limitation worth 
abolishing though; that has been the case before now, for those who 
didn't see the thread on the performance list.  I think that on the sort 
of high throughput system likely to benefit from this behavior, whether 
commit_siblings is zero or five doesn't matter very much--those people 
should cross the siblings threshold very quickly regardless.  The main 
arguments in favor of making the default lower aren't as exciting now 
that it jumps out of the loop early once finding the requisite number.


I like keeping the default at 5 though.  It keeps the person who 
experiments with increasing commit_delay from suffering when there are 
in reality not a lot of active connections.  There are essentially two 
foot-guns you have to aim before you run into the worst case here, which 
is making every single commit wait for the delay when there's really 
only one active process committing.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Supportwww.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread Dimitri Fontaine
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Well it does not seem to be complex to code. It's about having a new
 property in the control file, relocatable, boolean. This property is
 required and controls the behavior of the CREATE EXTENSION ... WITH
 SCHEMA command. When true we use the ALTER EXTENSION SET SCHEMA code
 path and when false, the placeholder replacement code path. The ALTER
 command has already been developed so I need to merge it into the main
 patch.

Ok I've done that on the git branch, for people interested into having a
look or playing with it before the week-end, when I think I'll post the
new patch revision. Well I've left alone the behavior change at CREATE
EXTENSION time, and also, well, the necessary documentation.

  http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=summary
  
http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=c31d8a7728706d539f50d764fe8f45db92869664

 The ALTER EXTENSION SET SCHEMA command needs to be adapted so that it
 checks that all the extension's objects are currently in the same schema
 and error out if that's not the case.

Done in the commit above. WIP of course, but just so that commit fest
manager notice things are moving here.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 GUC_complaint_elevel() can return DEBUGn, and in fact will do so in
 the PGC_S_OVERRIDE case.  I'm thinking that the code ought to look
 more like
 
   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort
*/
   if (source != PGC_S_OVERRIDE)
   {
   check and report all the complaint-worthy cases;
   }
 
Done.  Version 3 attached.  I think I've covered all bases now.
 
-Kevin

*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 544,572  show_log_timezone(void)
  
  
  /*
   * SET TRANSACTION ISOLATION LEVEL
   */
  
  const char *
  assign_XactIsoLevel(const char *value, bool doit, GucSource source)
  {
!   if (FirstSnapshotSet)
{
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
!   if (source != PGC_S_OVERRIDE)
return NULL;
!   }
!   else if (IsSubTransaction())
!   {
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction")));
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
!   if (source != PGC_S_OVERRIDE)
return NULL;
}
  
if (strcmp(value, serializable) == 0)
--- 544,615 
  
  
  /*
+  * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+  *
+  * These should be transaction properties which can be set in exactly the
+  * same points in time that transaction isolation may be set.
+  */
+ bool
+ assign_transaction_read_only(bool newval, bool doit, GucSource source)
+ {
+   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+   if (source != PGC_S_OVERRIDE)
+   {
+   /* Can't go to r/w mode inside a r/o transaction */
+   if (newval == false && XactReadOnly && IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction read-write mode inside a read-only transaction")));
+   return false;
+   }
+   /* Top level transaction can't change this after first snapshot. */
+   else if (FirstSnapshotSet && !IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+errmsg("read-only property must be set before any query")));
+   return false;
+   }
+   /* Can't go to r/w mode while recovery is still active */
+   else if (newval == false && XactReadOnly && RecoveryInProgress())
+   {
+   ereport(GUC_complaint_elevel(source),
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction read-write mode during recovery")));
+   return false;
+   }
+   }
+ 
+   return true;
+ }
+ 
+ /*
   * SET TRANSACTION ISOLATION LEVEL
   */
+ extern char *XactIsoLevel_string; /* in guc.c */
  
  const char *
  assign_XactIsoLevel(const char *value, bool doit, GucSource source)
  {
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
!   if (source != PGC_S_OVERRIDE)
{
!   if (FirstSnapshotSet)
!   {
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
return NULL;
!   }
!   else if (IsSubTransaction() &&
! strcmp(value, XactIsoLevel_string) != 0)
!   {
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction")));
return NULL;
+   }
}
  
if (strcmp(value, serializable) == 0)
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 169,175  static bool assign_bonjour(bool newval, bool 

Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Heikki Linnakangas

On 08.12.2010 16:00, Simon Riggs wrote:

Heikki pointed out to me that the btree delete record processing does
not respect vacuum_defer_cleanup_age. It should.

Attached patch to implement that.

Looking to commit in next few hours barring objections/suggestions, to
both HEAD and 9_0_STABLE, in time for next minor release.


Please note that it was Noah Misch that raised this a while ago:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01919.php

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Heikki Linnakangas

On 09.12.2010 00:10, Heikki Linnakangas wrote:

On 08.12.2010 16:00, Simon Riggs wrote:

Heikki pointed out to me that the btree delete record processing does
not respect vacuum_defer_cleanup_age. It should.

Attached patch to implement that.

Looking to commit in next few hours barring objections/suggestions, to
both HEAD and 9_0_STABLE, in time for next minor release.


Please note that it was Noah Misch that raised this a while ago:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01919.php


On closer look, that's not actually the same issue, sorry for the noise..

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] serializable read only deferrable

2010-12-08 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Done.  Version 3 attached.
 
My final tweaks didn't make it into that diff.  Attached is 3a as a
patch over the version 3 patch.
 
-Kevin

*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 564,570  assign_transaction_read_only(bool newval, bool doit, GucSource 
source)
return false;
}
/* Top level transaction can't change this after first 
snapshot. */
!   else if (FirstSnapshotSet && !IsSubTransaction())
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
--- 564,570 
return false;
}
/* Top level transaction can't change this after first 
snapshot. */
!   if (FirstSnapshotSet && !IsSubTransaction())
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
***
*** 572,578  assign_transaction_read_only(bool newval, bool doit, GucSource 
source)
return false;
}
/* Can't go to r/w mode while recovery is still active */
!   else if (newval == false && XactReadOnly && RecoveryInProgress())
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
--- 572,578 
return false;
}
/* Can't go to r/w mode while recovery is still active */
!   if (newval == false && XactReadOnly && RecoveryInProgress())
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
***
*** 602,609  assign_XactIsoLevel(const char *value, bool doit, GucSource 
source)
 errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
return NULL;
}
!   else if (IsSubTransaction() &&
! strcmp(value, XactIsoLevel_string) != 0)
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
--- 602,609 
 errmsg("SET TRANSACTION ISOLATION LEVEL must be called before any query")));
return NULL;
}
!   /* We ignore a subtransaction setting it to the existing value. */
!   if (IsSubTransaction() && strcmp(value, XactIsoLevel_string) != 0)
{
ereport(GUC_complaint_elevel(source),

(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Heikki Linnakangas

On 08.12.2010 16:00, Simon Riggs wrote:


Heikki pointed out to me that the btree delete record processing does
not respect vacuum_defer_cleanup_age. It should.

Attached patch to implement that.


This doesn't look right to me. btree_xlog_delete_get_latestRemovedXid() 
function calculates the latest XID present on the tuples that we're 
removing b-tree pointers for. btree_xlog_delete_get_latestRemovedXid() 
is used during recovery. vacuum_defer_cleanup_age should take effect in 
the master, not during recovery.


With the patch, btree_xlog_delete_get_latestRemovedXid() returns a value 
that's much smaller than it should. That's just wrong, it means that 
recovery in the standby will incorrectly think that all the removed 
tuples are old and not visible to any running read-only queries anymore, 
and will go ahead and remove the index tuples for them.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: XLog vs SSD [Was: Re: [HACKERS] random write in xlog?]

2010-12-08 Thread Jeff Janes
On Wed, Dec 8, 2010 at 12:15 PM, James Cloos cl...@jhcloos.com wrote:
 JJ == Jeff Janes jeff.ja...@gmail.com writes:

 JJ Anyway, the writes are logically sequentially, but not physically.
 JJ If I remember correctly, it always writes out full blocks, even if
 JJ the last part of the block has not yet been filled with new data.
 JJ When the remainder gets filled, it then writes out the full block
 JJ again, both the already written and the new part.

 What does that mean for use of a flash SSD for the xlog dir?

 Does the block writing mesh up well with the usage pattern a flash
 SSD needs to maximize lifespan?

I think SSDs have a block size below which writing only part of the
block has the same effect as writing the whole thing, and those block
sizes are larger than 8K.
So PG always writing 8K at a time is unlikely to make a difference
compared to writing a smaller amount.

Cheers,

Jeff



[HACKERS] Upcoming back-branch update releases

2010-12-08 Thread Tom Lane
We've recently fixed the Linux O_DIRECT mess, as well as several nasty
crash and potential-data-loss issues.  The core committee has agreed
that it would be a good idea to get these fixes into the field before
people disappear for the holiday season.  Since time for that grows
short, we will follow the somewhat unusual plan of wrapping tarballs
this Monday, 12/13, for public announcement on Thursday 12/16.

As a reminder, this set of releases will include the final release in
the 8.1.x series, since 8.1 is now past its announced EOL date of
November 2010.  Community maintenance of 8.1 will stop after this
release.

regards, tom lane



Re: [HACKERS] Review: Extensions Patch

2010-12-08 Thread David E. Wheeler
On Dec 8, 2010, at 2:07 PM, Dimitri Fontaine wrote:

 David E. Wheeler da...@kineticode.com writes:
 And how does the information flows from the Makefile to the production
 server, already?
 
 `make` generates the file if it doesn't already exist.
 
 Again, will retry when possible, but it has been a time sink once already.

Fair enough.

David





[HACKERS] BufFreelistLock

2010-12-08 Thread Jeff Janes
I think that the BufFreelistLock can be a contention bottleneck on a
system with a lot of CPUs that do a lot of shared-buffer allocations
which can be fulfilled by the OS buffer cache.  That is, read-mostly
queries where the working data set fits in RAM, but not in
shared_buffers.  (You can always increase shared_buffers, but that
leads to other problems, and who wants to spend their time
micromanaging the size of shared_buffers as work loads slowly change?)

I can't prove it is a contention bottleneck without first solving the
putative problem and timing the difference, but it is the dominant
blocking lock showing up under LWLOCK_STATS for one benchmark I've
done using 8 CPUs.

So I had two questions.

1) Would it be useful for BufFreelistLock to be partitioned, like
BufMappingLock, or via some kind of clever virtual partitioning that
could get the same benefit via another means?  I don't know if both
the linked list and the clock sweep would have to be partitioned, or
if some other arrangement could be made.

2) Could BufFreelistLock simply go away, by reducing it from an lwlock
to a spinlock?  Or at least in most common paths?

For doing away with it, I think that any manipulation of the freelist
is short enough (just a few instructions) that it could be done under
a spinlock.  If you somehow obtained a pinned or usage_count buffer,
you would have to retake the spinlock to look at the new head of the
chain, but the comments StrategyGetBuffer suggest that that should be
rare or impossible.

For the clock sweep algorithm, I think you could access
nextVictimBuffer without any type of locking.  If a non-atomic
increment causes an occasional buffer to be skipped or examined twice,
that doesn't seem like a correctness problem.  When nextVictimBuffer
gets reset to zero and completePasses gets incremented, that would
probably need to be protected to prevent a double-increment of
completePasses from throwing off the background writer's usage
estimations.  But again, a spinlock should be enough for that.  And it
shouldn't occur all that often.

If potentially inaccurate non-atomic increments of numBufferAllocs are
a problem, it could be incremented under the same spinlock used to
protect the test firstFreeBuffer >= 0 to determine if the freelist is
empty.

Doing away with the lock without some form of partitioning might just
move the contention to the BufHdr spinlocks.  But if most of the
processes entering the code at about the same time perceive each
other's increments to nextVictimBuffer, they would all start out offset
from each other and shouldn't collide too badly.
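
A rough Python sketch of the scheme described above (illustrative only —
all names are invented, and a threading.Lock stands in for a real
spinlock): the clock-sweep advance and the wraparound that bumps
completePasses sit in one short critical section, while the usage-count
scan itself runs outside it.

```python
import threading

NBUFFERS = 8  # toy pool size; PostgreSQL's NBuffers is configurable

class ClockSweep:
    """Toy model: only the tiny advance/wraparound step is protected,
    standing in for a spinlock rather than the BufFreelistLock LWLock."""

    def __init__(self):
        self.next_victim = 0
        self.complete_passes = 0
        self.spinlock = threading.Lock()  # stand-in for a real spinlock

    def advance(self):
        # Short critical section: grab a victim slot and handle wraparound,
        # keeping complete_passes accurate for the background writer.
        with self.spinlock:
            victim = self.next_victim
            self.next_victim += 1
            if self.next_victim >= NBUFFERS:
                self.next_victim = 0
                self.complete_passes += 1
        return victim

    def get_buffer(self, usage_count, refcount):
        # Classic clock sweep: skip pinned buffers, decrement usage counts,
        # return the first unpinned buffer whose count has reached zero.
        while True:
            b = self.advance()
            if refcount[b] == 0:
                if usage_count[b] == 0:
                    return b
                usage_count[b] -= 1

sweep = ClockSweep()
usage = [2, 1, 0, 1, 0, 0, 0, 0]
pins = [0] * NBUFFERS
print(sweep.get_buffer(usage, pins))  # 2: first unpinned buffer at zero
```

Note the usage-count decrements here still happen without any lock, which
is exactly the hazard Tom points out downthread.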

Does any of this sound like it might be fruitful to look into?

Cheers,

Jeff



Re: [HACKERS] BufFreelistLock

2010-12-08 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 I think that the BufFreelistLock can be a contention bottleneck on a
 system with a lot of CPUs that do a lot of shared-buffer allocations
 which can be fulfilled by the OS buffer cache.

Really?  buffer/README says

  The buffer
  management policy is designed so that BufFreelistLock need not be taken
  except in paths that will require I/O, and thus will be slow anyway.

It's hard to see how it's going to be much of a problem if you're going
to be doing kernel calls as well.  Is the test case you're looking at
really representative of any common situation?

 1) Would it be useful for BufFreelistLock to be partitioned, like
 BufMappingLock, or via some kind of clever virtual partitioning that
 could get the same benefit via another means?

Maybe, but you could easily end up with a net loss if the partitioning
makes buffer allocation significantly stupider (ie, higher probability
of picking a less-than-optimal buffer to recycle).

 For the clock sweep algorithm, I think you could access
 nextVictimBuffer without any type of locking.

This is wrong, mainly because you wouldn't have any security against two
processes decrementing the usage count of the same buffer because they'd
fetched the same value of nextVictimBuffer.  That would probably happen
often enough to severely compromise the accuracy of the usage counts and
thus the accuracy of the LRU eviction behavior.  See above.

It might be worth looking into actual partitioning, so that more than
one processor can usefully be working on the usage count management.
But simply dropping the locking primitives isn't going to lead to
anything except severe screw-ups.

regards, tom lane



Re: [HACKERS] [PERFORM] Slow BLOBs restoring

2010-12-08 Thread Tom Lane
Vlad Arkhipov arhi...@dc.baikal.ru writes:
 08.12.2010 22:46, Tom Lane writes:
 Are you by any chance restoring from an 8.3 or older pg_dump file made
 on Windows?  If so, it's a known issue.

 No, I tried Linux only.

OK, then it's not the missing-data-offsets issue.

 I think you can reproduce it. First I created a database full of many 
 BLOBs on Postgres 8.4.5. Then I created a dump:

Oh, you should have said how many was many.  I had tried with several
thousand large blobs yesterday and didn't see any problem.  However,
with several hundred thousand small blobs, indeed it gets pretty slow
as soon as you use -j.

oprofile shows all the time is going into reduce_dependencies during the
first loop in restore_toc_entries_parallel (ie, before we've actually
started doing anything in parallel).  The reason is that for each blob,
we're iterating through all of the several hundred thousand TOC entries,
uselessly looking for anything that depends on the blob.  And to add
insult to injury, because the blobs are all marked as SECTION_PRE_DATA,
we don't get to parallelize at all.  I think we won't get to parallelize
the blob data restoration either, since all the blob data is hidden in a
single TOC entry :-(

So the short answer is don't bother to use -j in a mostly-blobs restore,
because it isn't going to help you in 9.0.

One fairly simple, if ugly, thing we could do about this is skip calling
reduce_dependencies during the first loop if the TOC object is a blob;
effectively assuming that nothing could depend on a blob.  But that does
nothing about the point that we're failing to parallelize blob
restoration.  Right offhand it seems hard to do much about that without
some changes to the archive representation of blobs.  Some things that
might be worth looking at for 9.1:

* Add a flag to TOC objects saying this object has no dependencies,
to provide a generalized and principled way to skip the
reduce_dependencies loop.  This is only a good idea if pg_dump knows
that or can cheaply determine it at dump time, but I think it can.

* Mark BLOB TOC entries as SECTION_DATA, or somehow otherwise make them
parallelizable.  Also break the BLOBS data item apart into an item per
BLOB, so that that part's parallelizable.  Maybe we should combine the
metadata and data for each blob into one TOC item --- if we don't, it
seems like we need a dependency, which will put us back behind the
eight-ball.  I think the reason it's like this is we didn't originally
have a separate TOC item per blob; but now that we added that to support
per-blob ACL data, the monolithic BLOBS item seems pretty pointless.
(Another thing that would have to be looked at here is the dependency
between a BLOB and any BLOB COMMENT for it.)
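
A hedged Python sketch of the first idea (invented names, not the real
pg_restore code): a dump-time per-entry flag saying "nothing depends on
this object" lets reduce_dependencies return immediately for blobs,
instead of scanning the whole TOC once per blob.

```python
def reduce_dependencies_naive(toc, finished):
    # Current behavior: scan every TOC entry looking for anything that
    # depends on the finished object -- O(blobs * entries) over a restore.
    scanned = 0
    for entry in toc:
        entry["deps"].discard(finished)
        scanned += 1
    return scanned

def reduce_dependencies_flagged(toc, has_dependents, finished):
    # Proposed: skip the scan entirely when the flag says nothing can
    # depend on the finished object (e.g. a blob).
    if not has_dependents.get(finished, True):
        return 0
    return reduce_dependencies_naive(toc, finished)

# Toy TOC: one table depends on a schema; the blobs depend on nothing,
# and nothing depends on them.
toc = [{"id": "table1", "deps": {"schema1"}}] + \
      [{"id": "blob%d" % i, "deps": set()} for i in range(100000)]
flags = {"schema1": True}
flags.update(("blob%d" % i, False) for i in range(100000))

print(reduce_dependencies_flagged(toc, flags, "blob0"))  # 0: scan skipped
```

The flag only helps if pg_dump can set it cheaply at dump time, which is
the caveat raised above.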

Thoughts?

regards, tom lane



Re: [HACKERS] BufFreelistLock

2010-12-08 Thread Jeff Janes
On Wed, Dec 8, 2010 at 8:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 I think that the BufFreelistLock can be a contention bottleneck on a
 system with a lot of CPUs that do a lot of shared-buffer allocations
 which can be fulfilled by the OS buffer cache.

 Really?  buffer/README says

  The buffer
  management policy is designed so that BufFreelistLock need not be taken
  except in paths that will require I/O, and thus will be slow anyway.

True, but very large memory means they often don't require true disk I/O anyway.

 It's hard to see how it's going to be much of a problem if you're going
 to be doing kernel calls as well.

Are kernel calls really all that slow?  I thought they had been
greatly optimized on recent hardware and kernels.
I'm not sure how to create a test case to distinguish that.

 Is the test case you're looking at
 really representative of any common situation?

That's always the question.  I took the "pick a random number and use
it to look up a pgbench_accounts row by primary key" logic from
pgbench -S, and put it into a stored procedure where it loops 10,000
times, to remove the overhead of ping-ponging messages back and forth
for every query.
(But doing so also removes the overhead of taking AccessShareLock for
every select, so those two changes are entangled.)

This type of workload could be representative of a nested loop join.

I started looking into it because someone
(http://archives.postgresql.org/pgsql-performance/2010-11/msg00350.php)
thought that pgbench -S might more or less match their real-world
work load.  But by the time I moved most of selecting into a stored
procedure, maybe it no longer does (it's not even clear if they were
using prepared statements).  But separating things into their
component potential bottlenecks, which do you tackle first?  The more
fundamental.  The easiest to analyze.  The one that can't be gotten
around by fine-tuning.  The more interesting :).



 1) Would it be useful for BufFreelistLock to be partitioned, like
 BufMappingLock, or via some kind of clever virtual partitioning that
 could get the same benefit via another means?

 Maybe, but you could easily end up with a net loss if the partitioning
 makes buffer allocation significantly stupider (ie, higher probability
 of picking a less-than-optimal buffer to recycle).

 For the clock sweep algorithm, I think you could access
 nextVictimBuffer without any type of locking.

 This is wrong, mainly because you wouldn't have any security against two
 processes decrementing the usage count of the same buffer because they'd
 fetched the same value of nextVictimBuffer.  That would probably happen
 often enough to severely compromise the accuracy of the usage counts and
 thus the accuracy of the LRU eviction behavior.  See above.

Ah, I hadn't considered that.


Cheers,

Jeff



Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Simon Riggs
On Thu, 2010-12-09 at 00:39 +0100, Heikki Linnakangas wrote:

 vacuum_defer_cleanup_age should take effect in 
 the master, not during recovery. 

Hmmm, more to the point, it does take effect on the master and so there
is no need for this at all. What were you thinking? What was I? Doh.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Hot Standby btree delete records and vacuum_defer_cleanup_age

2010-12-08 Thread Simon Riggs
On Thu, 2010-12-09 at 00:16 +0100, Heikki Linnakangas wrote:
 On 09.12.2010 00:10, Heikki Linnakangas wrote:
  On 08.12.2010 16:00, Simon Riggs wrote:
  Heikki pointed out to me that the btree delete record processing does
  not respect vacuum_defer_cleanup_age. It should.
 
  Attached patch to implement that.
 
  Looking to commit in next few hours barring objections/suggestions, to
  both HEAD and 9_0_STABLE, in time for next minor release.
 
  Please note that it was Noah Misch that raised this a while ago:
 
  http://archives.postgresql.org/pgsql-hackers/2010-11/msg01919.php
 
 On closer look, that's not actually the same issue, sorry for the noise..

Heikki, this one *is* important. Will fix. Thanks for the analysis Noah.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] wCTE behaviour

2010-12-08 Thread David Fetter
On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:
 Marko Tiikkaja wrote:
 This is almost exactly the patch from 2010-02 without
 CommandCounterIncrement()s.  It's still a bit rough around the
 edges and needs some more comments, but I'm posting it here
 anyway.
 
 This patch passes all regression tests, but feel free to try to
 break it, there are probably ways to do that.  This one also has
 the "always run DMLs to completion, and exactly once" behaviour.
 
 So this patch was marked Ready for Committer, but a) no committer
 has picked it up yet and b) Marko has made changes here that nobody
 else has tested out yet that I've seen on the list.  Accordingly,
 that classification may have been optimistic.  It seems to me that
 another testing run-through from someone like David might be
 appropriate to build some confidence this latest patch should be a
 commit candidate.  If there is a committer intending to work on this
 as-is, they haven't identified themselves.

I've tested this one and not managed to break it.  One thing it could
use is support for EXPLAIN ANALYZE.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Feature request - CREATE TYPE ... WITH OID = oid_number.

2010-12-08 Thread Dmitriy Igrishin
Hey Merlin,

Do you mean that, due to binary transfer, it is possible
to determine the type of data transferred to the backend,
and therefore there is no need to pass OIDs?

2010/12/7 Merlin Moncure mmonc...@gmail.com

 On Tue, Dec 7, 2010 at 10:45 AM, Andrew Dunstan and...@dunslane.net
 wrote:
  On 12/07/2010 10:02 AM, Merlin Moncure wrote:
  On Tue, Dec 7, 2010 at 9:10 AM, Dmitriy Igrishin dmit...@gmail.com
   wrote:
 
  Hey hackers@,
 
  The libpq execution functions work with OIDs. In some cases it is
  highly recommended to specify the OIDs of parameters, according to
  the libpq documentation.
  When developing a database application with libpq that works with
  custom data types, IMO it is reasonable to offer the developer an
  extended type-creation syntax, e.g.
    CREATE TYPE my_type ... WITH OID = 12345;
  Yes, it is possible to make a dump of the database with OIDs,
  but if the developer prefers to hard-code OIDs in the application it
  would be more convenient to use the syntax above.
  Btw, there is already an Oid lo_import_with_oid function in the
  large objects libpq API, which is very useful.
 
  It's possible to do this in 9.0 in a slightly indirect way.   See the
  contrib/pg_upgrade folder.  In particular, grep for
  set_next_pg_type_oid.
 
 
  This doesn't strike me as very good advice. Those things are not exposed
  generally for good reason. The right way to do this surely is to have the
  app look up and cache the OIDs it needs rather than hardcode the values
 in
  the application.

 Note he didn't provide reasons why he is asking for this power.  Your
 assertion is a coded variant of "don't use the binary protocol", which
 I happen to think is not very good advice IF you know what you're
 doing.  We plan on using this feature to support binary transfer of
 data between databases through the variadic dblink library we maintain
 that uses binary format  (but pre 9.0 it reverts to text in many
 cases).  This can be 2x or more faster than stock dblink in real world
 cases.

 merlin

 (your advice is generally correct however) :-)




-- 
// Dmitriy.


Re: [HACKERS] To Signal The postmaster

2010-12-08 Thread Fujii Masao
On Wed, Dec 8, 2010 at 4:59 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 For 9.1, we should think of a better way to do this, perhaps using SIGUSR1
 to wake up. Maybe we won't even need the trigger file anymore.

If we use SIGUSR1, a mechanism to let users specify the event type
seems to be required. For example, should we make the SIGUSR1 handler
check not only the shmem (i.e., PMSignalStat) but also the file?
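
As a hedged illustration of that idea (plain Python, nothing to do with
the actual postmaster code — the trigger-file name and handler are
invented): a single SIGUSR1 only says "something happened", so the
handler must consult auxiliary state, here a trigger file standing in
for shared-memory flags, to learn which event was meant.

```python
import os
import signal
import tempfile

events = []
trigger_path = os.path.join(tempfile.mkdtemp(), "promote.trigger")

def on_sigusr1(signum, frame):
    # SIGUSR1 carries no payload; inspect auxiliary state to decide
    # what the sender meant (a file here, shared memory in reality).
    if os.path.exists(trigger_path):
        events.append("promote")
    else:
        events.append("unknown")

signal.signal(signal.SIGUSR1, on_sigusr1)

# Sender side: create the trigger file, then wake the process.
open(trigger_path, "w").close()
os.kill(os.getpid(), signal.SIGUSR1)
print(events)  # ['promote']
```

Without the file (or flag), the same signal would have to be treated as
an unknown event, which is why the extra channel seems necessary.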

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



[HACKERS] pg_type.typname of array types.

2010-12-08 Thread Dmitriy Igrishin
Hey hackers@,

Is it guaranteed that the names of array types in the pg_type system
catalog will always be prefixed with an underscore, or can this
convention change in the future?

Thanks.

-- 
// Dmitriy.


Re: [HACKERS] wCTE behaviour

2010-12-08 Thread Marko Tiikkaja

On 2010-12-08 10:19 AM +0200, David Fetter wrote:

On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:

So this patch was marked Ready for Committer, but a) no committer
has picked it up yet and b) Marko has made changes here that nobody
else has tested out yet that I've seen on the list.  Accordingly,
that classification may have been optimistic.  It seems to me that
another testing run-through from someone like David might be
appropriate to build some confidence this latest patch should be a
commit candidate.  If there is a committer intending to work on this
as-is, they haven't identified themselves.


I've tested this one and not managed to break it.  One thing it could
use is support for EXPLAIN ANALYZE.


What's wrong with EXPLAIN ANALYZE?  Here's what I see:

=# explain analyze with t as (insert into foo values(0) returning *)
select * from t;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 CTE Scan on t  (cost=0.01..0.03 rows=1 width=4) (actual time=0.017..0.017 rows=1 loops=2)
   CTE t
     ->  Insert  (cost=0.00..0.01 rows=1 width=0) (actual time=0.029..0.030 rows=1 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
 Total runtime: 0.104 ms
(5 rows)


Regards,
Marko Tiikkaja


