Re: [HACKERS] Segfault while using an array domain

2015-11-29 Thread Tom Lane
Emre Hasegeli  writes:
> [ SQL function in a domain constraint doesn't work ]

Hm, looks like I broke this in 8abb3cda0.  Should have learned by now
that long-lived caching of ExprState trees is dangerous.  The proximate
cause of the problem is that execQual.c is executing an expression state
tree that's held by the typcache, but it is using an ecxt_per_query_memory
context that's only of query lifespan.  We end up with pointers into that
context from the typcache's state tree, which of course soon become
dangling pointers.

It's possible that we could temporarily change ecxt_per_query_memory
during ExecEvalCoerceToDomain to point to the context holding the state
trees, but that sounds pretty risky; at the very least there's a risk of
meant-to-be-query-lifespan allocations turning into session-lifespan
memory leakage.

Probably the best idea is to give up on caching the ExprState trees for
domain constraints this way.  We can still cache the Expr trees and
thereby avoid pg_constraint catalog reads, but we'll have to pay an
ExecInitExpr pass per query.
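
For illustration, a rough sketch of that approach (the cached field and
the helper name are hypothetical, not existing code):

    /*
     * Cache only the Expr tree at typcache (session) lifespan, and build a
     * fresh ExprState in per-query memory each time, so nothing long-lived
     * can end up pointing into query-lifespan memory.
     */
    static ExprState *
    get_domain_check_exprstate(TypeCacheEntry *typentry, EState *estate)
    {
        Expr       *check_expr = typentry->domain_check_expr;   /* hypothetical field */
        MemoryContext oldcxt;
        ExprState  *exprstate;

        /* ExecInitExpr allocates in the current context; use query lifespan */
        oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
        exprstate = ExecInitExpr(check_expr, NULL);
        MemoryContextSwitchTo(oldcxt);

        return exprstate;
    }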

At some point I'd really like to find a way to keep ExprState trees
longer; this is a problem for plpgsql performance too.  But it's too
late in the 9.5 cycle to tackle that problem.

Or we could revert 8abb3cda0 altogether for the time being, but I hate to
do that because it was a correctness improvement not just a performance
tweak.

regards, tom lane




[HACKERS] Making the C collation less inclined to abort abbreviation

2015-11-29 Thread Peter Geoghegan
The C collation is treated exactly the same as other collations when
considering whether the generation of abbreviated keys for text should
continue. This doesn't make much sense. With text, the big cost that
we are concerned about going to waste should abbreviated keys not
capture sufficient entropy is the cost of n strxfrm() calls. However,
the C collation doesn't use strxfrm() -- it uses memcmp(), which is
far cheaper.

With other types, like numeric and now UUID, the cost of generating an
abbreviated key is significantly lower than it is for text under a
collation other than C. Their cost models reflect this, and abort
abbreviation far less aggressively than text's does, even though the
trade-off is very similar when text uses the C collation.

Attached patch fixes this inconsistency by making it significantly
less likely that abbreviation will be aborted when the C collation is
in use. The behavior with other collations is unchanged. This should
be backpatched to 9.5 as a bugfix, IMV.
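
For reference, a quick way to observe these abort decisions (this assumes
a build with TRACE_SORT enabled, which is the default; the table and
column names are illustrative):

    SET client_min_messages = log;
    SET trace_sort = on;
    CREATE INDEX ON my_table (my_text_col COLLATE "C");

LOG lines such as "bttext_abbrev: aborted abbreviation ..." then show
whether the cost model gave up on abbreviated keys.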

-- 
Peter Geoghegan
From c87da5330c636e939983b1ba8eaee581b4c953dd Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sun, 29 Nov 2015 12:51:36 -0800
Subject: [PATCH] Abort C collation text abbreviation less frequently

Discriminate in favor of the C collation by creating a much lower bar for
the amount of entropy that abbreviated keys must capture.  This is
consistent with existing cases that have cheaper conversion processes,
like UUID.

Backpatch to 9.5, where abbreviated keys for text were added.
---
 src/backend/utils/adt/varlena.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a89f586..0bcdd96 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1869,7 +1869,7 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 		 */
 		if (abbreviate)
 		{
-			tss->prop_card = 0.20;
+			tss->prop_card = collate_c ? 0.01 : 0.20;
 			initHyperLogLog(&tss->abbr_card, 10);
 			initHyperLogLog(&tss->full_card, 10);
 			ssup->abbrev_full_comparator = ssup->comparator;
@@ -2261,7 +2261,11 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	 * cardinality against the overall size of the set in order to more
 	 * accurately model costs.  Assume that an abbreviated comparison, and an
 	 * abbreviated comparison with a cheap memcmp()-based authoritative
-	 * resolution are equivalent.
+	 * resolution are equivalent.  (With the C collation, authoritative
+	 * cardinality is used in the same way, even though the cost of an
+	 * authoritative tie-breaker is no cheaper when values are equal.  The
+	 * theory is that the early appearance of low entropy abbreviated keys
+	 * predicts the same prefix for all or most values.)
 	 */
 	if (abbrev_distinct > key_distinct * tss->prop_card)
 	{
-- 
1.9.1




Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra


On 11/29/2015 03:33 PM, Tomas Vondra wrote:

Hi,

On 11/29/2015 02:38 PM, Craig Ringer wrote:


I've had a few tries at implementing a qemu-based crashtester where it
hard kills the qemu instance at a random point then starts it back up.


I've tried to reproduce the issue by killing a qemu VM, and so far I've
been unsuccessful. On bare HW it was easily reproducible (I'd hit the
issue 9 out of 10 attempts), so either I'm doing something wrong or qemu
somehow interacts with the I/O.


Update: I've managed to reproduce the issue in the qemu setup - I think 
it needs slightly different timing due to the VM being slightly slower. 
I also tweaked vm.dirty_bytes and vm.dirty_background_bytes to values 
used on the bare hardware (I suspect it widens the window).
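
For anyone trying to reproduce this, the knobs in question are ordinary
sysctls; the values below are illustrative, not the exact ones I used:

    sysctl -w vm.dirty_bytes=67108864
    sysctl -w vm.dirty_background_bytes=16777216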


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: Make timestamptz_out less slow.

2015-11-29 Thread David Rowley
On 29 November 2015 at 14:00, Peter Geoghegan  wrote:

> On Sun, Nov 22, 2015 at 2:20 AM, David Rowley
>  wrote:
> > Just to confirm, you mean this comment?
> >
> > int tm_year; /* relative to 1900 */
> >
> > Please let me know if you disagree, but I'm not sure it's the business of
> > this patch to fix that. If it's wrong now, then it was wrong before my
> > patch, so it should be a separate patch which fixes it.
> >
> > At this stage I don't quite know what the fix should be, whether it's
> doing
> > tm->tm_year -= 1900; in timestamp2tm() after the j2date() call, or if
> it's
> > just removing the misleading comment.
> >
> > I also don't quite understand why we bother having it relative to 1900
> and
> > not just base it on 0.
>
> That's fair. I defer to the judgement of the committer here.
>
> > Is there anything else you see that's pending before it can be marked as
> > ready for committer?
>
> Can't think of any reason not to. It's been marked "ready for committer".
>
>
Many thanks for reviewing this, Peter.

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Andrew Dunstan



On 11/29/2015 04:28 PM, Noah Misch wrote:

+BEGIN
+{
+   $windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys';
+
+   # Determine output directories, and create them.  The base path is the
+   # TESTDIR environment variable, which is normally set by the invoking
+   # Makefile.
+   $tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
+   $log_path = "$tmp_check/log";
+
+   mkdir $tmp_check;
+   mkdir $log_path;
Never mutate the filesystem in a BEGIN block, because "perl -c" runs BEGIN
blocks.  (Likewise for the BEGIN block this patch adds to TestLib.)



Yeah, those two lines might belong in an INIT block. "perldoc perlmod" 
for details.
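
A rough sketch of that rearrangement, using the same variables as the
quoted patch:

    BEGIN
    {
        # Pure computation is fine in a BEGIN block.
        $windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys';
        $tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
        $log_path = "$tmp_check/log";
    }

    INIT
    {
        # "perl -c" runs BEGIN blocks but not INIT blocks, so mutating
        # the filesystem is safe here.
        mkdir $tmp_check;
        mkdir $log_path;
    }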



cheers

andrew





Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Noah Misch
On Fri, Nov 27, 2015 at 07:53:10PM -0300, Alvaro Herrera wrote:
> Michael Paquier wrote:
> > The result of a couple of hours of hacking is attached:
> > - 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
> > also found that it is quite advantageous to move some of the routines that
> > are synonyms of system() and the stuff used for logging into another
> > low-level library that PostgresNode depends on, that I called TestBase in
> > this patch.

> Here's another version of this.  I changed the packages a bit more.  For
> starters, I moved the routines around a bit; some of your choices seemed
> more about keeping stuff where it was originally rather than moving it
> to where it made sense.  These are the routines in each module:
> 
> TestBase:  system_or_bail system_log run_log slurp_dir slurp_file
> append_to_file
> 
> TestLib:  get_new_node teardown_node psql poll_query_until command_ok
> command_fails command_exit_is program_help_ok program_version_ok
> program_options_handling_ok command_like issues_sql_like

The proposed code is short on guidance about when to put a function in TestLib
versus TestBase.  TestLib has no header comment.  The TestBase header comment
would permit, for example, command_ok() in that module.  I would try instead
keeping TestLib as the base module and moving into PostgresNode the functions
that deal with PostgreSQL clusters (get_new_node teardown_node psql
poll_query_until issues_sql_like).

> I tried to get rid of teardown_node by having a DESTROY method for
> PostgresNode; that method would call "pg_ctl stop -m immediate".  That
> would have been much cleaner.  However, when a test fails this doesn't
> work sanely because the END block for File::Temp runs earlier than that
> DESTROY block, which means the datadir is already gone by the time
> pg_ctl stop runs, so the node stop doesn't work at all.  (Perhaps we
> could fix this by noting postmaster's PID at start time, and then
> sending a signal directly instead of relying on pg_ctl).

You could disable File::Temp cleanup and handle cleanup yourself at the
desired time.  (I haven't reviewed whether the goal of removing teardown_node
is otherwise good.)
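
A sketch of the signal-based variant mentioned above (the _pid field is
hypothetical; it would be recorded when the postmaster is started):

    sub DESTROY
    {
        my $self = shift;
        return unless defined $self->{_pid};

        # SIGQUIT requests an immediate shutdown, like "pg_ctl stop -m
        # immediate", and does not need the data directory to still exist.
        kill 'QUIT', $self->{_pid};
    }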

> +my $node = get_new_node();
> +# Initialize node without replication settings
> +$node->initNode(0);
> +$node->startNode();
> +my $pgdata = $node->getDataDir();
> +
> +$ENV{PGPORT} = $node->getPort();

Starting a value retrieval method name with "get" is not Perlish.  The TAP
suites currently follow "man perlstyle" in using underscored_lower_case method
names.  No PostgreSQL Perl code uses lowerFirstCamelCase, though some uses
CamelCase.  The word "Node" is redundant.  Use this style:

  $node->init(0);
  $node->start;
  my $pgdata = $node->data_dir;
  $ENV{PGPORT} = $node->port;

As a matter of opinion, I recommend giving "init" key/value arguments instead
of the single Boolean argument.  The method could easily need more options in
the future, and this makes the call site self-documenting:

  $node->init(hba_permit_replication => 0);

> - 'pg_controldata with nonexistent directory fails');
> +   'pg_controldata with nonexistent directory fails');

perltidy will undo this whitespace-only change.

> --- a/src/bin/pg_rewind/t/001_basic.pl
> +++ b/src/bin/pg_rewind/t/001_basic.pl
> @@ -1,9 +1,11 @@
> +# Basic pg_rewind test.
> +
>  use strict;
>  use warnings;
> -use TestLib;
> -use Test::More tests => 8;
>  
>  use RewindTest;
> +use TestLib;
> +use Test::More tests => 8;

Revert all changes to this file.  Audit the rest of the patch for whitespace
change unrelated to the subject.

> - 'fails with nonexistent table');
> +   'fails with nonexistent table');

> -'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER test1 
> USING test1x';
> + 'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER 
> test1 USING test1x';

perltidy will undo these whitespace-only changes.

> +# cluster -a is not compatible with -d, hence enforce environment variables

s/cluster -a/clusterdb -a/

> -issues_sql_like(
> +$ENV{PGPORT} = $node->getPort();
> +
> +issues_sql_like($node,

perltidy will move $node to its own line.

> -command_fails([ 'createuser', 'user1' ], 'fails if role already exists');
> +command_fails([ 'createuser', 'user1' ],
> +   'fails if role already exists');

perltidy will undo this whitespace-only change.

> @@ -0,0 +1,252 @@
> +# PostgresNode, simple node representation for regression tests.
> +#
> +# Regression tests should use this basic class infrastructure to define nodes
> +# that need used in the test modules/scripts.
> +package PostgresNode;

Consider just saying, "Class representing a data directory and postmaster."

> + my $self   = {
> + _port => undef,
> + _host => undef,
> + _basedir  => undef,
> + _applname => undef,
> + _logfile  => undef };
> +
> + # Set 

[HACKERS] Segfault while using an array domain

2015-11-29 Thread Emre Hasegeli
I was getting segfaults while working on the current master for a while.
This is the shortest way I could find to reproduce the problem:

create or replace function is_distinct_from(anyelement, anyelement)
returns boolean language sql
as 'select $1 is distinct from $2';

create operator !== (
procedure = is_distinct_from,
leftarg = anyelement,
rightarg = anyelement
);

create domain my_list int[] check (null !== all (value));

create table my_table (my_column my_list);

insert into my_table values ('{1}');
insert into my_table values ('{1}');

Here is the backtrace:

> * thread #1: tid = 0x108710, 0x0001040ebf82 
> postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 18 at mcxt.c:205, 
> queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
> (code=EXC_I386_GPFLT)
>   * frame #0: 0x0001040ebf82 
> postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 18 at mcxt.c:205
> frame #1: 0x000103ea60ac postgres`fmgr_sql(fcinfo=0x7fa5e50150a8) 
> + 252 at functions.c:1047
> frame #2: 0x000103e9f6f5 
> postgres`ExecEvalScalarArrayOp(sstate=0x7fa5e5015038, 
> econtext=, isNull="", isDone=) + 885 at 
> execQual.c:2660
> frame #3: 0x000103ea1bb4 
> postgres`ExecEvalCoerceToDomain(cstate=, 
> econtext=0x7fa5e6065110, isNull="", isDone=) + 180 at 
> execQual.c:4009
> frame #4: 0x000103ea208a postgres`ExecProject + 39 at execQual.c:5345
> frame #5: 0x000103ea2063 postgres`ExecProject(projInfo=, 
> isDone=0x7fff5bef58bc) + 387 at execQual.c:5560
> frame #6: 0x000103eb96a3 postgres`ExecResult(node=0x7fa5e6064ff8) 
> + 179 at nodeResult.c:155
> frame #7: 0x000103e9b57c 
> postgres`ExecProcNode(node=0x7fa5e6064ff8) + 92 at execProcnode.c:392
> frame #8: 0x000103eb5f12 
> postgres`ExecModifyTable(node=0x7fa5e6064ea0) + 434 at 
> nodeModifyTable.c:1331
> frame #9: 0x000103e9b5bb 
> postgres`ExecProcNode(node=0x7fa5e6064ea0) + 155 at execProcnode.c:396
> frame #10: 0x000103e97a90 postgres`standard_ExecutorRun [inlined] 
> ExecutePlan(estate=, planstate=0x7fa5e6064ea0, 
> use_parallel_mode='\0', operation=, numberTuples=0, 
> direction=, dest=) + 87 at execMain.c:1566
> frame #11: 0x000103e97a39 
> postgres`standard_ExecutorRun(queryDesc=0x7fa5e6061038, 
> direction=, count=0) + 201 at execMain.c:338
> frame #12: 0x000103fc18da 
> postgres`ProcessQuery(plan=0x7fa5e604fbd8, sourceText="insert into 
> my_table values ('{1}');", params=0x, 
> dest=0x7fa5e604fcd0, completionTag="") + 218 at pquery.c:185
> frame #13: 0x000103fc0ddb 
> postgres`PortalRunMulti(portal=0x7fa5e480a238, isTopLevel='\x01', 
> dest=0x7fa5e604fcd0, altdest=0x7fa5e604fcd0, completionTag="") + 331 
> at pquery.c:1283
> frame #14: 0x000103fc06f8 
> postgres`PortalRun(portal=0x7fa5e480a238, count=9223372036854775807, 
> isTopLevel='\x01', dest=0x7fa5e604fcd0, altdest=0x7fa5e604fcd0, 
> completionTag="") + 552 at pquery.c:812
> frame #15: 0x000103fbe8d6 postgres`PostgresMain + 48 at 
> postgres.c:1105
> frame #16: 0x000103fbe8a6 postgres`PostgresMain(argc=, 
> argv=, dbname=, username=) + 9414 at 
> postgres.c:4032
> frame #17: 0x000103f503c8 postgres`PostmasterMain [inlined] 
> BackendRun + 8328 at postmaster.c:4237
> frame #18: 0x000103f503a2 postgres`PostmasterMain [inlined] 
> BackendStartup at postmaster.c:3913
> frame #19: 0x000103f503a2 postgres`PostmasterMain at postmaster.c:1684
> frame #20: 0x000103f503a2 postgres`PostmasterMain(argc=, 
> argv=) + 8290 at postmaster.c:1292
> frame #21: 0x000103ed759f postgres`main(argc=, 
> argv=) + 1567 at main.c:223
> frame #22: 0x7fff8f1245c9 libdyld.dylib`start + 1

I can reproduce it on 9.5 branch too, but not on 9.4 branch.




Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra

Hi,

On 11/29/2015 02:38 PM, Craig Ringer wrote:

On 27 November 2015 at 21:28, Greg Stark wrote:

On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote:
> I plan to do more power failure testing soon, with more complex test
> scenarios. I suspect there might be other similar issues (e.g. when we
> rename a file before a checkpoint and don't fsync the directory - then the
> rename won't be replayed and will be lost).

I'm curious how you're doing this testing. The easiest way I can think
of would be to run a database on an LVM volume and take a large number
of LVM snapshots very rapidly and then see if the database can start
up from each snapshot. Bonus points for keeping track of the committed
transactions before each snapshot and ensuring they're still there I
guess.


I've had a few tries at implementing a qemu-based crashtester where it
hard kills the qemu instance at a random point then starts it back up.


I've tried to reproduce the issue by killing a qemu VM, and so far I've 
been unsuccessful. On bare HW it was easily reproducible (I'd hit the 
issue 9 out of 10 attempts), so either I'm doing something wrong or qemu 
somehow interacts with the I/O.



I always got stuck on the validation part - actually ensuring that the
DB state is how we expect. I think I could probably get that right now,
it's been a while.


Well, I guess we can't really check all the details, but the checksums 
make checking the general consistency somewhat simpler. And then you 
have to design the workload in a way that makes the check easier - for 
example remembering the committed values etc.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra



On 11/29/2015 02:41 PM, Craig Ringer wrote:

On 27 November 2015 at 19:17, Tomas Vondra wrote:

It's also possible to mitigate this by setting wal_sync_method=fsync


Are you sure?

https://lwn.net/Articles/322823/ tends to suggest that fsync() on the
file is insufficient to ensure rename() is persistent, though it's
somewhat old.


Good point. I don't know, and I'm not any smarter after reading the LWN 
article. What I meant by "mitigate" is that I've been unable to 
reproduce the issue after setting wal_sync_method=fsync, so my 
conclusion is that it either fixes the issue or at least significantly 
reduces the probability of hitting it.


It's pretty clear that the right fix is the additional fsync on pg_xlog.
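
In code form, the pattern under discussion is roughly this (fsync_fname()
is the existing backend helper; the paths and fd are illustrative):

    /* flush the file contents themselves */
    fsync(fd);

    /* atomically put the file in place */
    rename(tmppath, path);

    /* make the rename itself durable by flushing the parent directory */
    fsync_fname("pg_xlog", true);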

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Using quicksort for every external sort run

2015-11-29 Thread Peter Geoghegan
On Sat, Nov 28, 2015 at 4:05 PM, Peter Geoghegan  wrote:
> So there was 27.76 seconds spent copying tuples into local memory
> ahead of the quicksort, 2 minutes 56.68 seconds spent actually
> quicksorting, and a trifling 10.32 seconds actually writing the run! I
> bet that the quicksort really didn't use up too much memory bandwidth
> on the system as a whole, since abbreviated keys are used with a cache
> oblivious internal sorting algorithm.

Uh, actually, that isn't so:

LOG:  begin index sort: unique = f, workMem = 1048576, randomAccess = f
LOG:  bttext_abbrev: abbrev_distinct after 160: 1.000489 (key_distinct: 40.802210, norm_abbrev_card: 0.006253, prop_card: 0.20)
LOG:  bttext_abbrev: aborted abbreviation at 160 (abbrev_distinct: 1.000489, key_distinct: 40.802210, prop_card: 0.20)

Abbreviation is aborted in all cases that you tested. Arguably this
should happen significantly less frequently with the "C" locale,
possibly almost never, but it makes this case less than representative
of most people's workloads. I think that at least the first several
hundred leading attribute tuples are duplicates.

BTW, roughly what does this CREATE INDEX look like? Is it a composite
index, for example?

It would also be nice to see pg_stats entries for each column being
indexed. Data distributions are certainly of interest here.

Thanks
-- 
Peter Geoghegan




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-29 Thread Pavel Stehule
Hi

2015-11-16 23:40 GMT+01:00 Peter Eisentraut :

> On 11/15/15 11:29 PM, Pavel Stehule wrote:
> >
> >
> > 2015-11-16 5:20 GMT+01:00 Peter Eisentraut:
> >
> > I don't think it's right to reuse SPIError for this.  SPIError is
> > clearly meant to signal an error in the SPI calls.  Of course, we
> can't
> > stop users from raising whatever exception they want, but if we're
> going
> > to advertise that users can raise exceptions, then we should create
> > separate exception classes.
> >
> > I suppose the proper way to set this up would be to create a base
> class
> > like plpy.Error and derive SPIError from that.
> >
> >
> > Do you have some ideas about the name of this class?
>
> I think plpy.Error is fine.
>
>
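
For clarity, the class relationship being discussed would presumably look
like this (a sketch in plpy's terms, not the patch itself):

    class Error(Exception):
        """Base class for errors raised from PL/Python code."""
        pass

    class SPIError(Error):
        """An error in an SPI call."""
        pass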
here is the updated patch - it works with Python 2.x.

Python 3.x is broken on my Fedora, so I could not run the tests on 3.x.

Regards

Pavel
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
new file mode 100644
index 015bbad..51bc48e
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 1205,1210 
--- 1205,1228 
  approximately the same functionality
 

+ 
+   
+Raising Errors
+ 
+
+ A plpy.SPIError can be raised from PL/Python; the constructor accepts
+ keyword parameters:
+ plpy.SPIError([ message [, detail [, hint [, sqlstate [, schema [, table [, column [, datatype [, constraint ]]]]]]]]]).
+
+
+ An example of raising custom exception could be written as:
+ 
+ DO $$
+   raise plpy.SPIError('custom message', hint = 'It is test only');
+ $$ LANGUAGE plpythonu;
+ 
+
+   
   
  
   
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
new file mode 100644
index 1f52af7..cb792eb
*** a/src/pl/plpython/expected/plpython_error.out
--- b/src/pl/plpython/expected/plpython_error.out
*** EXCEPTION WHEN SQLSTATE 'SILLY' THEN
*** 422,424 
--- 422,486 
  	-- NOOP
  END
  $$ LANGUAGE plpgsql;
+ /* the possibility to set fields of custom exception
+  */
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.')
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: This is message text.
+ DETAIL:  This is detail text
+ HINT:  This is hint text.
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 4, in 
+ hint = 'This is hint text.')
+ PL/Python anonymous code block
+ \set VERBOSITY verbose
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.',
+ sqlstate = 'SILLY',
+ schema = 'any info about schema',
+ table = 'any info about table',
+ column = 'any info about column',
+ datatype = 'any info about datatype',
+ constraint = 'any info about constraint')
+ $$ LANGUAGE plpythonu;
+ ERROR:  SILLY: plpy.Error: This is message text.
+ DETAIL:  This is detail text
+ HINT:  This is hint text.
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 10, in 
+ constraint = 'any info about constraint')
+ PL/Python anonymous code block
+ SCHEMA NAME:  any info about schema
+ TABLE NAME:  any info about table
+ COLUMN NAME:  any info about column
+ DATATYPE NAME:  any info about datatype
+ CONSTRAINT NAME:  any info about constraint
+ LOCATION:  PLy_elog, plpy_elog.c:122
+ \set VERBOSITY default
+ DO $$
+ raise plpy.Error(detail = 'This is detail text')
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: 
+ DETAIL:  This is detail text
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 2, in 
+ raise plpy.Error(detail = 'This is detail text')
+ PL/Python anonymous code block
+ DO $$
+ raise plpy.Error();
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: 
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 2, in 
+ raise plpy.Error();
+ PL/Python anonymous code block
+ DO $$
+ raise plpy.Error(sqlstate = 'wrong sql state');
+ $$ LANGUAGE plpythonu;
+ ERROR:  could not create Error object (invalid SQLSTATE code)
+ CONTEXT:  PL/Python anonymous code block
diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out
new file mode 100644
index 5323906..19b1a9b
*** a/src/pl/plpython/expected/plpython_error_0.out
--- b/src/pl/plpython/expected/plpython_error_0.out
*** EXCEPTION WHEN SQLSTATE 'SILLY' THEN
*** 422,424 
--- 422,486 
  	-- NOOP
  END
  $$ LANGUAGE plpgsql;
+ /* the possibility to set fields of custom exception
+  */
+ DO $$
+ raise plpy.SPIError('This is message text.',
+ detail = 'This is detail 

Re: [HACKERS] Using quicksort for every external sort run

2015-11-29 Thread Peter Geoghegan
On Sun, Nov 29, 2015 at 2:01 AM, Peter Geoghegan  wrote:
> I think that at least the first several
> hundred leading attribute tuples are duplicates.

I mean duplicate abbreviated keys. There are 40 distinct keys overall
in the first 160 tuples, which is why abbreviation is aborted -- this
can be seen from the trace_sort output, of course.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-29 Thread Michael Paquier
On Sun, Nov 29, 2015 at 4:00 PM, Pavel Stehule 
wrote:

> 2015-11-28 13:11 GMT+01:00 Michael Paquier :
>
>> On Sat, Nov 28, 2015 at 1:57 AM, Pavel Stehule 
>> wrote:
>> >
>> >
>> > 2015-11-27 17:54 GMT+01:00 Teodor Sigaev :
>> >>
>> >> Is this patch in 'Waiting on Author' state actually?
>>
>> I am marking it as returned with feedback for this CF then.
>>
>
> please, can revert it?
>

I have moved it to the next CF instead as you are still working on it. It
is going to be time to finish the current CF soon.
-- 
Michael


Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Michael Paquier
On Sat, Nov 28, 2015 at 3:01 AM, Tomas Vondra
 wrote:
>
>
> On 11/27/2015 02:18 PM, Michael Paquier wrote:
>>
>> On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra
>>  wrote:
>>>
>>> So, what's going on? The problem is that while the rename() is atomic,
>>> it's
>>> not guaranteed to be durable without an explicit fsync on the parent
>>> directory. And by default we only do fdatasync on the recycled segments,
>>> which may not force fsync on the directory (and ext4 does not do that,
>>> apparently).
>>
>>
>> Yeah, that seems to be the way the POSIX spec clears things.
>> "If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function shall
>> force all currently queued I/O operations associated with the file
>> indicated by file descriptor fildes to the synchronized I/O completion
>> state. All I/O operations shall be completed as defined for
>> synchronized I/O file integrity completion."
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html
>> If I understand that right, it is guaranteed that the rename() will be
>> atomic, meaning that there will be only one file even if there is a
>> crash, but that we need to fsync() the parent directory as mentioned.
>>
>>> FWIW this has nothing to do with storage reliability - you may have good
>>> drives, RAID controller with BBU, reliable SSDs or whatever, and you're
>>> still not safe. This issue is at the filesystem level, not storage.
>>
>>
>> The POSIX spec authorizes this behavior, so the FS is not to blame,
>> clearly. At least that's what I get from it.
>
>
> The spec seems a bit vague to me (but maybe it's not, I'm not a POSIX
> expert),

As I understand it, FS implementations are free to decide whether the
rename persists on disk or not.

> but we should be prepared for the less favorable interpretation I
> think.

Yep, I agree. And in case my previous words were not clear, that's the
same line of thought here: we had better cover our backs and study
carefully each code path that could be impacted.

>>> Attached is a proposed fix for this (xlog-fsync.patch), and I'm pretty
>>> sure
>>> this needs to be backpatched to all backbranches. I've also attached a
>>> patch
>>> that adds pg_current_xlog_flush_location() function, which proved to be
>>> quite useful when debugging this issue.
>>
>>
>> Agreed. We should be sure as well that the calls to fsync_fname get
>> issued in a critical section with START/END_CRIT_SECTION(). It does
>> not seem to be the case with your patch.
>
>
> Don't know. I've based that on code from replication/logical/ which does
> fsync_fname() on all the interesting places, without the critical section.

For slot information in slot.c, there will be a PANIC when fsyncing
pg_replslot at some points. It does not seem that weird to do the same,
for example, after renaming the backup label file.
-- 
Michael




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-29 Thread Pavel Stehule
2015-11-29 13:42 GMT+01:00 Michael Paquier :

>
>
> On Sun, Nov 29, 2015 at 4:00 PM, Pavel Stehule 
> wrote:
>
>> 2015-11-28 13:11 GMT+01:00 Michael Paquier :
>>
>>> On Sat, Nov 28, 2015 at 1:57 AM, Pavel Stehule 
>>> wrote:
>>> >
>>> >
>>> > 2015-11-27 17:54 GMT+01:00 Teodor Sigaev :
>>> >>
>>> >> Is this patch in 'Waiting on Author' state actually?
>>>
>>> I am marking it as returned with feedback for this CF then.
>>>
>>
>> please, can revert it?
>>
>
> I have moved it to the next CF instead as you are still working on it. It
> is going to be time to finish the current CF soon.
>

OK.

Regards

Pavel


> --
> Michael
>


Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer
On 27 November 2015 at 21:28, Greg Stark  wrote:

> On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra
>  wrote:
> > I plan to do more power failure testing soon, with more complex test
> > scenarios. I suspect there might be other similar issues (e.g. when we
> > rename a file before a checkpoint and don't fsync the directory - then
> the
> > rename won't be replayed and will be lost).
>
> I'm curious how you're doing this testing. The easiest way I can think
> of would be to run a database on an LVM volume and take a large number
> of LVM snapshots very rapidly and then see if the database can start
> up from each snapshot. Bonus points for keeping track of the committed
> transactions before each snapshot and ensuring they're still there I
> guess.
>

I've had a few tries at implementing a qemu-based crashtester where it hard
kills the qemu instance at a random point then starts it back up.

I always got stuck on the validation part - actually ensuring that the DB
state is how we expect. I think I could probably get that right now, it's
been a while.

The VM can be started back up and killed again over and over quite quickly.

It's not as good as physical plug-pull, but it's a lot more practical.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer
On 27 November 2015 at 19:17, Tomas Vondra 
wrote:


> It's also possible to mitigate this by setting wal_sync_method=fsync


Are you sure?

https://lwn.net/Articles/322823/ tends to suggest that fsync() on the file
is insufficient to ensure rename() is persistent, though it's somewhat old.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] How to add and use a static library within Postgres backend

2015-11-29 Thread Craig Ringer
On 27 November 2015 at 10:35, XiaoChuan Yu  wrote:

> I'm doing some experimenting with the storage engine and I'd like to use a
> C++ static library (that has C headers).  I only need this to build on
> Ubuntu 64-bit for now.
> How do I make postgres build with this library?
>

Build it separately. Link to it like any other library. Take a look at
configure.in for how libraries are discovered and at the makefiles for
how the header path and linker flags are defined.

> What changes do I need to make to the build system files?

As above. Look at how PostgreSQL uses existing libraries. If the library
presents a pure C interface it should be just like everything else.


> I assume headers go in src/include but where do I put the library.a file?

Don't copy anything into PostgreSQL's source tree. Add new linker path and
include path entries so the build system knows how to find the files where
you originally compiled them.
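
For example, something along these lines in the relevant makefile (the
paths and library name are illustrative):

    override CPPFLAGS += -I/opt/mylib/include
    LDFLAGS += -L/opt/mylib/lib
    LIBS += -lmylib -lstdc++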

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2015-11-29 Thread Amit Kapila
On Sun, Nov 29, 2015 at 1:47 AM, Jeff Janes  wrote:
>
> On Thu, Nov 26, 2015 at 11:32 PM, Amit Kapila 
wrote:
> > On Tue, Nov 17, 2015 at 6:30 PM, Simon Riggs 
wrote:
> >>
> >> On 17 November 2015 at 11:48, Amit Kapila 
wrote:
> >>>
> >>>
> >>> I think in that case what we can do is if the total number of
> >>> sub-transactions is less than or equal to 64 (we can find that by
> >>> overflowed flag in PGXact) , then apply this optimisation, else use
> >>> the existing flow to update the transaction status.  I think for that
we
> >>> don't even need to reserve any additional memory. Does that sound
> >>> sensible to you?
> >>
> >>
> >> I understand you to mean that the leader should look backwards through
the
> >> queue collecting xids while !(PGXACT->overflowed)
> >>
> >> No additional shmem is required
> >>
> >
> > Okay, as discussed I have handled the case of sub-transactions without
> > additional shmem in the attached patch.  Apart from that, I have tried
> > to apply this optimization for Prepared transactions as well, but as
> > the dummy proc used for such transactions doesn't have semaphore like
> > backend procs, so it is not possible to use such a proc in a group status
> > update, as each group member needs to wait on a semaphore.  It would not be
> > too difficult to add support for that case if we are okay with creating an
> > additional semaphore for each such dummy proc, which I was not sure about,
> > so I have left it for now.
>
> Is this proposal instead of, or in addition to, the original thread
> topic of increasing clog buffers to 64?
>

This is in addition to increasing the clog buffers to 64, but with this
patch (group clog update), the effect of increasing the clog buffers will
be smaller.
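
In rough pseudo-C, the leader-side rule described above (the queue-walking
and collection helpers are hypothetical):

    for (proc = queue_head; proc != NULL; proc = next_queue_member(proc))
    {
        PGXACT *pgxact = &ProcGlobal->allPgXact[proc->pgprocno];

        if (pgxact->overflowed)
            continue;       /* too many subxacts: use the regular path */

        /* xid plus at most 64 subxids, all available in shared memory */
        collect_xid_and_subxids(proc);
    }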




With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Michael Paquier
On Sun, Nov 29, 2015 at 12:13 AM, Alvaro Herrera wrote:
> Here's your recovery test patch rebased, for your (and others'!)
> perusal.  It passes for me.  (Test 003 is unchanged.)

Are you planning to push that as well? It does not have much coverage,
but I guess that's quite good for a first shot, and it can serve as an
example for future tests.

Still, the first patch adds enough infrastructure to allow any other
module to have more complex regression test scenarios, the first two
targets coming immediately to my mind being the quorum syncrep patch
and pg_rewind and its timeline switch manipulation. So that's more
than welcome!
-- 
Michael




Re: [HACKERS] Issue on C function that reads int2[] (using "int2vector")

2015-11-29 Thread Tom Lane
Rodrigo Hjort  writes:
> I created a custom C function with this signature:

> CREATE FUNCTION calculate_hash(numbers int2[])
> RETURNS int8
> AS 'MODULE_PATHNAME', 'pg_calculate_hash'
> LANGUAGE C
> IMMUTABLE STRICT;

> And here is the function source code (inspired by code I found in
> src/backend/utils/adt/int.c):

> PG_FUNCTION_INFO_V1(pg_calculate_hash);
> Datum
> pg_calculate_hash(PG_FUNCTION_ARGS)
> {
>   int2vector *int2Array = (int2vector *) PG_GETARG_POINTER(0);

Nope.  int2vector is not the same as int2[].  It might occasionally seem
to work, but in general it's not the same type.  And this particular
coding won't work at all on on-disk int2[] data, because it doesn't
account for toasting.
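
For reference, the conventional way to read an int2[] argument is via
ArrayType and the array accessor macros. A minimal sketch (it assumes the
array contains no NULL elements, which should really be checked with
ARR_HASNULL, and uses a placeholder hash):

    #include "utils/array.h"

    PG_FUNCTION_INFO_V1(pg_calculate_hash);
    Datum
    pg_calculate_hash(PG_FUNCTION_ARGS)
    {
        /* PG_GETARG_ARRAYTYPE_P detoasts the input for us */
        ArrayType  *arr = PG_GETARG_ARRAYTYPE_P(0);
        int         nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
        int16      *values = (int16 *) ARR_DATA_PTR(arr);
        int64       hash = 0;
        int         i;

        elog(DEBUG1, "pg_calculate_hash(qtd=%d)", nitems);

        for (i = 0; i < nitems; i++)
            hash = hash * 31 + values[i];   /* placeholder hash */

        PG_RETURN_INT64(hash);
    }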

regards, tom lane




Re: [HACKERS] Using quicksort for every external sort run

2015-11-29 Thread David Fetter
On Sat, Nov 28, 2015 at 02:04:16PM -0800, Jeff Janes wrote:
> On Wed, Nov 18, 2015 at 3:29 PM, Peter Geoghegan  wrote:
> > On Wed, Nov 18, 2015 at 10:31 AM, Jeff Janes  wrote:
> >
> >> I agree we don't want to optimize for low memory, but I don't think we
> >> should throw it under the bus, either.  Right now we are effectively
> >> saying the CPU-cache problems with the heap start exceeding the larger
> >> run size benefits at 64kb (the smallest allowed setting for work_mem).
> >> While any number we pick is going to be a guess that won't apply to
> >> all hardware, surely we can come up with a guess better than 64kb.
> >> Like, 8 MB, say.  If available memory for the sort is 8MB or smaller
> >> and the predicted size anticipates a multipass merge, then we can use
> >> the heap method rather than the quicksort method.  Would a rule like
> >> that complicate things much?
> >
> > I'm already using replacement selection for the first run when it is
> > predicted by my new ad-hoc cost model that we can get away with a
> > "quicksort with spillover", avoiding almost all I/O. We only
> > incrementally spill as many tuples as needed right now, but it would
> > be pretty easy to not quicksort the remaining tuples, but continue to
> > incrementally spill everything. So no, it wouldn't be too hard to hang
> > on to the old behavior sometimes, if it looked worthwhile.
> >
> > In principle, I have no problem with doing that. Through testing, I
> > cannot see any actual upside, though. Perhaps I just missed something.
> > Even 8MB is enough to avoid the multipass merge in the event of a
> > surprisingly high volume of data (my work laptop is elsewhere, so I
> > don't have my notes on this in front of me, but I figured out the
> > crossover point for a couple of cases).
> 
> For me very large sorts (100,000,000 ints) with work_mem below 4MB do
> better with unpatched than with your patch series, by about 5%.  Not a
> big deal, but also if it is easy to keep the old behavior then I think
> we should.  Yes, it is dumb to do large sorts with work_mem below 4MB,
> but if you have canned apps which do a mixture of workloads it is not
> so easy to micromanage their work_mem.  Especially as there are no
> easy tools that let me as the DBA say "if you connect from this IP
> address, you get this work_mem".

That's certainly doable with pgbouncer, for example.  What would you
have in mind for the more general capability?  It seems to me that
bloating up pg_hba.conf would be undesirable, but maybe I'm picturing
this as bigger than it actually needs to be.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-29 Thread Peter Geoghegan
On Sun, Nov 29, 2015 at 8:52 PM, David Rowley
 wrote:
> You're right, gcc did not include the prefetch instructions.
> I've tested again on the same machine but with clang 3.7 instead of gcc
> 4.8.3

Thanks for going to the trouble of investigating this.

These results are mediocre -- in general, it seems like the prefetch
instructions are almost as likely to hurt as to help. Naturally, I
don't want to go down the road of considering every microarchitecture,
and that seems to be what it would take to get this to work well, if
that's possible at all.

I'm currently running some benchmarks on my external sorting patch on
the POWER7 machine that Robert Haas and a few other people have been
using for some time now [1]. So far, the benchmarks look very good
across a variety of scales.

I'll run a round of tests without the prefetching enabled (which the
patch series makes further use of -- they're also used when writing
tuples out). If there is no significant impact, I'll completely
abandon this patch, and we can move on.

[1] http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html
-- 
Peter Geoghegan




Re: [HACKERS] Erroneous cost estimation for nested loop join

2015-11-29 Thread KAWAMICHI Ryoji


Robert Haas  wrote:
> 
> - If we're sequential scanning a small table, let's say less than 1/4
> of shared_buffers, which is the point where synchronized scans kick
> in, then assume the data is coming from shared_buffers.
> - If we're scanning a medium-sized table, let's say less than
> effective_cache_size, then assume the data is coming from the OS
> cache.  Maybe this is the same cost as the previous case, or maybe
> it's slightly more.
> - Otherwise, assume that the first effective_cache_size pages are
> coming from cache and the rest has to be read from disk.  This is
> perhaps unrealistic, but we don't want the cost curve to be
> discontinuous.

I think this improvement is quite reasonable, and I expect it will be merged 
into the current optimizer code.
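
As a sketch, the heuristic described above amounts to something like this
(names are illustrative; shared_buffers and effective_cache_size are taken
to be in pages):

    static double
    cached_fraction(BlockNumber rel_pages)
    {
        if (rel_pages < shared_buffers / 4)
            return 1.0;     /* assume it all fits in shared_buffers */
        else if (rel_pages < effective_cache_size)
            return 1.0;     /* assume OS cache; perhaps slightly costlier */
        else                /* first effective_cache_size pages cached */
            return (double) effective_cache_size / rel_pages;
    }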


> A problem with this sort of thing, of course, is that it's really hard
> to test a proposed change broadly enough to be certain how it will
> play out in the real world.

That’s the problem we’re really interested in and trying to tackle.

For example, after extensive experiments, I’m quite sure my modification of 
the cost model is effective for our environment, but I can’t tell whether it 
is also beneficial, or unfortunately harmful, in general environments.

And I think that, in the Postgres community, there must be (maybe buried) 
knowledge of how to judge the effectiveness of cost model modifications, 
because someone must have considered something like that at each commit.
I’m interested in that, and would like to contribute to finding 
a better way to improve the optimizer through cost model refinement.

Thanks.
Ryoji.




Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Michael Paquier
On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:
>
> On Fri, Nov 27, 2015 at 07:53:10PM -0300, Alvaro Herrera wrote:
> > Michael Paquier wrote:
> > > The result of a couple of hours of hacking is attached:
> > > - 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
> > > also found that it is quite advantageous to move some of the routines that
> > > are synonyms of system() and the stuff used for logging into another
> > > low-level library that PostgresNode depends on, that I called TestBase in
> > > this patch.
>
> > Here's another version of this.  I changed the packages a bit more.  For
> > starters, I moved the routines around a bit; some of your choices seemed
> > more about keeping stuff where it was originally rather than moving it
> > to where it made sense.  These are the routines in each module:
> >
> > TestBase:  system_or_bail system_log run_log slurp_dir slurp_file
> > append_to_file
> >
> > TestLib:  get_new_node teardown_node psql poll_query_until command_ok
> > command_fails command_exit_is program_help_ok program_version_ok
> > program_options_handling_ok command_like issues_sql_like
>
> The proposed code is short on guidance about when to put a function in TestLib
> versus TestBase.  TestLib has no header comment.  The TestBase header comment
> would permit, for example, command_ok() in that module.  I would try instead
> keeping TestLib as the base module and moving into PostgresNode the functions
> that deal with PostgreSQL clusters (get_new_node teardown_node psql
> poll_query_until issues_sql_like).

PostgresNode is meant to be a base representation of what a node is,
not of how to operate on it. The routines that perform the tests, which
work on a node, belong at a higher level.

Logging and base configuration of a test set sit at a lower level than
PostgresNode, because cluster nodes actually need to perform system
calls, and some of those helpers, like run_log, write to the centralized
log file. I have tried to make the headers of those modules more
verbose; please see attached.

>
> > +my $node = get_new_node();
> > +# Initialize node without replication settings
> > +$node->initNode(0);
> > +$node->startNode();
> > +my $pgdata = $node->getDataDir();
> > +
> > +$ENV{PGPORT} = $node->getPort();
>
> Starting a value retrieval method name with "get" is not Perlish.  The TAP
> suites currently follow "man perlstyle" in using underscored_lower_case method
> names.  No PostgreSQL Perl code uses lowerFirstCamelCase, though some uses
> CamelCase.  The word "Node" is redundant.  Use this style:
>
>   $node->init(0);
>   $node->start;
>   my $pgdata = $node->data_dir;
>   $ENV{PGPORT} = $node->port;

I have switched to this style.

> As a matter of opinion, I recommend giving "init" key/value arguments instead
> of the single Boolean argument.  The method could easily need more options in
> the future, and this makes the call site self-documenting:
>
>   $node->init(hba_permit_replication => 0);

Done.

>
> > - 'pg_controldata with nonexistent directory fails');
> > +   'pg_controldata with nonexistent directory fails');
>
> perltidy will undo this whitespace-only change.

Cleaned up.

>
> > --- a/src/bin/pg_rewind/t/001_basic.pl
> > +++ b/src/bin/pg_rewind/t/001_basic.pl
> > @@ -1,9 +1,11 @@
> > +# Basic pg_rewind test.
> > +
> >  use strict;
> >  use warnings;
> > -use TestLib;
> > -use Test::More tests => 8;
> >
> >  use RewindTest;
> > +use TestLib;
> > +use Test::More tests => 8;
>
> Revert all changes to this file.  Audit the rest of the patch for whitespace
> change unrelated to the subject.

Done.

>
>
> > - 'fails with nonexistent table');
> > +   'fails with nonexistent table');
>
> > -'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER 
> > test1 USING test1x';
> > + 'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); 
> > CLUSTER test1 USING test1x';
>
> perltidy will undo these whitespace-only changes.

Cleaned up.

>
> > +# cluster -a is not compatible with -d, hence enforce environment variables
>
> s/cluster -a/clusterdb -a/

Fixed.

>
> > -command_fails([ 'createuser', 'user1' ], 'fails if role already exists');
> > +command_fails([ 'createuser', 'user1' ],
> > +   'fails if role already exists');
>
> perltidy will undo this whitespace-only change.
>
> > @@ -0,0 +1,252 @@
> > +# PostgresNode, simple node representation for regression tests.
> > +#
> > +# Regression tests should use this basic class infrastructure to define 
> > nodes
> > +# that need used in the test modules/scripts.
> > +package PostgresNode;
>
> Consider just saying, "Class representing a data directory and postmaster."

OK, I have changed this description:
+# PostgresNode, class representing a data directory and postmaster.
+#
+# This contains a basic set of routines able to work on a PostgreSQL node,
+# allowing to start, 

[HACKERS] Issue on C function that reads int2[] (using "int2vector")

2015-11-29 Thread Rodrigo Hjort
Hello PG Hackers,


I created a custom C function with this signature:

CREATE FUNCTION calculate_hash(numbers int2[])
RETURNS int8
AS 'MODULE_PATHNAME', 'pg_calculate_hash'
LANGUAGE C
IMMUTABLE STRICT;


And here is the function source code (inspired by code I found in
src/backend/utils/adt/int.c):

PG_FUNCTION_INFO_V1(pg_calculate_hash);
Datum
pg_calculate_hash(PG_FUNCTION_ARGS)
{
  int2vector *int2Array = (int2vector *) PG_GETARG_POINTER(0);
  const int qtd = int2Array->dim1;

  elog(DEBUG1, "pg_calculate_hash(qtd=%d)", qtd);

  elog(DEBUG2, "  [ndim=%d, dataoffset=%d, elemtype=%d, dim1=%d, lbound1=%d]",
    int2Array->ndim, int2Array->dataoffset, int2Array->elemtype,
    int2Array->dim1, int2Array->lbound1);

  [...]
}


In order to test it against a table structure, I executed these
instructions in psql:

db=# create table ss (s int2[]);
CREATE TABLE

db=# \d+ ss
                        Table "public.ss"
 Column |    Type    | Modifiers | Storage  | Stats target | Description
--------+------------+-----------+----------+--------------+-------------
 s      | smallint[] |           | extended |              |
Has OIDs: no

db=# insert into ss values ('[0:5]={58,17,15,36,59,54}');
INSERT 0 1

db=# select * from ss;
 s
---
 [0:5]={58,17,15,36,59,54}
(1 row)


Then, when calling the function with the int2[] column passed directly,
strange values are read into the "int2vector" object:

db=# set client_min_messages to debug2;
SET

db=# select s, calculate_hash(s) from ss;
DEBUG:  pg_calculate_hash(qtd=0)
DEBUG:    [ndim=0, dataoffset=5376, elemtype=1536, dim1=0, lbound1=285227520]
             s             | calculate_hash
---------------------------+----------------
 [0:5]={58,17,15,36,59,54} |              0
(1 row)


On the other hand, when I double-cast the int2[] column value, it works as
expected (reading the proper "int2vector" structure):

db=# select s, calculate_hash(s::varchar::int2[]) from ss;
DEBUG:  pg_calculate_hash(qtd=6)
DEBUG:    [ndim=1, dataoffset=0, elemtype=21, dim1=6, lbound1=0]
             s             |   calculate_hash
---------------------------+--------------------
 [0:5]={58,17,15,36,59,54} | 441352797842128896
(1 row)


Please, what is wrong with that function code?

Thanks in advance.

The whole project is on GitHub:
https://github.com/hjort/mega-sena/tree/master/src


Best regards,

-- 
Rodrigo Hjort
www.hjort.co