Re: [HACKERS] WRITE_UINT_FIELD used where WRITE_OID_FIELD likely intended

2014-12-11 Thread Heikki Linnakangas

On 12/11/2014 05:57 AM, Michael Paquier wrote:

On Thu, Dec 11, 2014 at 7:44 AM, Mark Dilger m...@port25.com wrote:


At line 1787 of outfuncs.c, the line:

 WRITE_UINT_FIELD(reltablespace)

should probably say

 WRITE_OID_FIELD(reltablespace)

since that variable is of type Oid, not uint32.
Granted, these two macros are interchangeable,
but they won't be if we ever move to 64-bit Oids.
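For context, the two macros are currently defined roughly like this in outfuncs.c (both
print the field with %u, which is why they behave identically for 32-bit OIDs):

/* Write an unsigned integer field (anything written as ":fldname %u") */
#define WRITE_UINT_FIELD(fldname) \
	appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)

/* Write an OID field (don't hard-wire assumption that OID is same size as uint) */
#define WRITE_OID_FIELD(fldname) \
	appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)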


For correctness you are right. Looks like you spent quite some time looking
at that..


Fixed, thanks.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Casting issues with domains

2014-12-11 Thread Thomas Reiss
On 11/12/2014 00:46, Tom Lane wrote:
 Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 As far as that goes, I think the OP was unhappy about the performance
 of the information_schema views, which in our implementation do exactly
 that so that the exposed types of the view columns conform to the SQL
 standard, even though the underlying catalogs use PG-centric types.

 I don't believe that that's the only reason why the performance of the
 information_schema views tends to be sucky, but it's certainly a reason.
 
 Is that schema too edge case to justify some functional indexes
 on the cast values on the underlying catalogs? (I'm inclined to
 think so, but it seemed like a question worth putting out
 there)
 
 We don't support functional indexes on system catalogs, so whether it'd
 be justified is sorta moot.  On the whole though I'm inclined to agree
 that the information_schema views aren't used enough to justify adding
 overhead to system-catalog updates, even if the pieces for that all
 existed.
 
 Or, since these particular domains are known, is there any sane way
 to special-case these to allow the underlying types to work?
 
 I don't particularly care for a kluge solution here.
 
 I notice that recent versions of the SQL spec contain the notion of a
 distinct type, which is a user-defined type that is representationally
 identical to some base type but has its own name, and comes equipped with
 assignment-grade casts to and from the base type (which in PG terms would
 be binary-compatible casts, though the spec doesn't require that).
 It seems like this might be intended to be the sort of zero cost type
 alias we were talking about, except that the SQL committee seems to have
 got it wrong by not specifying the cast-to-base-type as being implicit.
 Which ISTM you would want so that operators/functions on the base type
 would apply automatically to the distinct type.  But perhaps we could
 extend the spec with some option to CREATE TYPE to allow the cast to come
 out that way.
 
 Or in short, maybe we should try to replace the domains used in the
 current information_schema with distinct types.

That's interesting and could easily solve the problem.

To give some context: for some reason, Drupal queries the
information_schema views before displaying some pages.
Our customer has many tables (approx 6 tables, organised à la
Oracle with one schema per database user), so the seq scan against
pg_class takes ~50ms, while the very same scan without the cast takes less
than 1ms.

Here is an example of a query used:
SELECT column_name, data_type, column_default
  FROM information_schema.columns
 WHERE table_schema = 'one_schema'
   AND table_name = 'one_table'
   AND ( data_type = 'bytea'
 OR ( numeric_precision IS NOT NULL
   AND column_default::text LIKE '%nextval%' )
 );
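To illustrate the effect (a hedged example, not from the original report; whether the
index is actually chosen also depends on catalog size):

-- comparing the raw catalog column can use the index on pg_class.relname ...
EXPLAIN SELECT relname FROM pg_class WHERE relname = 'one_table';
-- ... while comparing through the information_schema domain cast cannot,
-- so the planner falls back to a sequential scan:
EXPLAIN SELECT relname FROM pg_class
 WHERE relname::information_schema.sql_identifier::text = 'one_table';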

Regards



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Peter Geoghegan
On Wed, Dec 10, 2014 at 10:18 PM, Josh Berkus j...@agliodbs.com wrote:

 So far, I haven't seen any features for 9.5 which would delay a more
 timely release the way we did for 9.4.  Anybody know of a bombshell
 someone's going to drop on us for CF5?


I had wondered about that myself. What about jsquery? Is that something
that is planned for submission some time in the current cycle?

FWIW, I strongly doubt that I'll find time to work on anything like that
for 9.5.

-- 
Peter Geoghegan


Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby

2014-12-11 Thread Heikki Linnakangas

On 12/11/2014 05:45 AM, Andres Freund wrote:

A customer recently reported getting backup_label contains data
inconsistent with control file after taking a basebackup from a standby
and starting it with a typo in primary_conninfo.

When starting postgres from a basebackup, StartupXLOG() has the following
code to deal with backup labels:
 	if (haveBackupLabel)
 	{
 		ControlFile->backupStartPoint = checkPoint.redo;
 		ControlFile->backupEndRequired = backupEndRequired;

 		if (backupFromStandby)
 		{
 			if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
 				ereport(FATAL,
 						(errmsg("backup_label contains data inconsistent with control file"),
 						 errhint("This means that the backup is corrupted and you will "
 								 "have to use another backup for recovery.")));
 			ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
 		}
 	}

while I'm not enthusiastic about the error message, that bit of code
looks sane at first glance. We certainly expect the control file to
indicate we're in recovery. Since we're unlinking the backup label
shortly afterwards we'd normally not expect to hit that case after a
shutdown in recovery.


Check.


The problem is that after reading the backup label we also have to read
the corresponding checkpoint from pg_xlog. If primary_conninfo and/or
restore_command are misconfigured and can't restore the files, that can only
be fixed by shutting down the cluster and fixing up recovery.conf -
which sets DB_SHUTDOWNED_IN_RECOVERY in the control file.


No it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in 
CreateRestartPoint(). If you shut down the server before it has even 
read the initial checkpoint record, it will not attempt to create a 
restartpoint nor update the control file.



The easiest solution seems to be to simply also allow that as a state in
the above check. It might be nicer to not allow a ShutdownXLOG to modify
the control file et al at that stage, but I think that'd end up being
more invasive.
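Concretely, the relaxation described above might look roughly like this (a sketch of the
idea, not a tested patch):

			if (backupFromStandby)
			{
				if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY &&
					dbstate_at_startup != DB_SHUTDOWNED_IN_RECOVERY)
					ereport(FATAL,
							(errmsg("backup_label contains data inconsistent with control file"),
							 errhint("This means that the backup is corrupted and you will "
									 "have to use another backup for recovery.")));
				ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
			}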

A short search shows that that also looks like a credible explanation
for #12128...


Yeah. I was not able to reproduce this, but I'm clearly missing 
something, since both you and Sergey have seen this happening. Can you 
write a script to reproduce?


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.

2014-12-11 Thread Kyotaro HORIGUCHI
Hello, we found (and maybe fixed) two wrong behaviors related to
inheritance and FOR UPDATE. This report is about one of them.

This behavior can be observed on master and back to 8.4, though 8.4
behaves a bit differently. I haven't checked on 8.3.

The issue is that some queries return duplicate rows after FOR
UPDATE was blocked, in other words, after getting
HeapTupleUpdated in ExecLockRows.

- Reproducing the symptom

This behavior is easy to reproduce as follows.

A=# DROP TABLE IF EXISTS p CASCADE;
A=# CREATE TABLE p (a int, b int, c int);
A=# CREATE TABLE c1 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c2 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c3 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# INSERT INTO c1 (SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# INSERT INTO c2 (SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# INSERT INTO c3 (SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# ANALYZE;
A=# BEGIN;
A=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
A=# UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0;
B=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
-- the above is blocked by session A until it commits.
A=# COMMIT;

Where A and B are individual sessions.

You will get the following result on session B just after the
COMMIT on session A.

|  tableoid | ctid  | a | b | c 
| --+---+---+---+---
* 33834 | (0,1) | 0 | 0 | 0
| 33834 | (0,4) | 0 | 1 | 0
| 33838 | (0,1) | 1 | 0 | 0
* 33834 | (0,1) | 0 | 0 | 0
| 33842 | (0,1) | 2 | 0 | 0
| 33842 | (0,4) | 2 | 1 | 0
| (6 rows)

The lines prefixed with '*' appear twice in the result.

The plan for the query is as follows:

 LockRows
   ->  Result
         ->  Append
               ->  Seq Scan on p
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c1
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c2
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c3
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))



- Analysis and solution

This seems to be caused in ExecScanFetch under EvalPlanQualNext,
when es_epqTupleSet[scanrelid - 1] is false (p and c1 in the above
example).  In that case, execution falls through to the non-EPQ
route. As a result, the accessMtd (= SeqScan; IndexScan behaves
the same) returns a null slot at the first iteration, on
p. Then at the second iteration, on c1, it returns c1's first
tuple (33834:(0,1)) and EvalPlanQualNext is satisfied with the
wrong tuple.

In the EPQ block in ExecScanFetch, it seems it should return NULL if
the relation does not have a test tuple. The attached patch does so
on the current master and the problem disappears.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 79c201539abdc59400af8339795c6c63c2980278 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Thu, 11 Dec 2014 09:06:59 +0900
Subject: [PATCH] Fix duplicate tuples for inheritance and FOR UPDATE

---
 src/backend/executor/execScan.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 1319519..d5c491c 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -74,6 +74,11 @@ ExecScanFetch(ScanState *node,
 
 			return slot;
 		}
+		else
+		{
+			/* This relation has no tuple to recheck.*/
+			return NULL;
+		}
 	}
 
 	/*
-- 
2.1.0.GIT


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-11 Thread Kyotaro HORIGUCHI
Hello, this is about the second issue.

SELECT ... FROM <inheritance parent> WHERE <cond> FOR UPDATE may
return results which do not match the condition. The following
steps will reproduce the problematic behavior (A and B are
individual sessions) on master and back to 9.0, but 8.4 gives the
correct result. I haven't checked on 8.3.

- Reproducing the symptom

A=# SET enable_seqscan TO false;
A=# SET enable_bitmapscan TO false;
A=# DROP TABLE IF EXISTS p CASCADE;
A=# CREATE TABLE p (id text, a text, b text, c text);
A=# CREATE INDEX p_i1 ON p (a, b, c) WHERE b IN ('0', '1') AND c = '0';
A=# CREATE TABLE c1 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c2 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c3 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# INSERT INTO c1 (SELECT  1 + a, 0, a % 4, 0 FROM generate_series(0, 7) a);
A=# INSERT INTO c2 (SELECT 11 + a, 1, a % 4, 0 FROM generate_series(0, 7) a);
A=# INSERT INTO c3 (SELECT 21 + a, 2, a % 4, 0 FROM generate_series(0, 7) a);
A=# ANALYZE;
A=# BEGIN;
A=# CREATE TEMP TABLE tt1 AS
A=# SELECT id FROM p WHERE b IN ('0', '1') AND c = '0' ORDER BY id LIMIT 1 FOR UPDATE;
A=# UPDATE p SET b = -1 WHERE id IN (SELECT id FROM tt1) RETURNING id;
A=# DROP TABLE tt1;
A=# SET enable_seqscan TO false;
A=# SET enable_bitmapscan TO false;
B=# SELECT tableoid, ctid, * FROM p WHERE b IN ('0', '1') AND c = '0' ORDER BY id LIMIT 1 FOR UPDATE;
A=# COMMIT;

On session B.

|  tableoid | ctid  | id | a | b  | c 
| --+---++---++---
| 34316 | (0,9) | 1  | 0 | -1 | 0


b = -1 apparently contradicts the WHERE clause.

The plan for the query is as follows. The part b IN ('0', '1') of
the WHERE clause is omitted even though it is required for the EPQ
recheck.

 Limit
   ->  LockRows
         ->  Sort
               Sort Key: p.id
               ->  Result
                     ->  Append
                           ->  Index Scan using p_i1 on p
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c1_a_b_c_idx on c1
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c2_a_b_c_idx on c2
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c3_a_b_c_idx on c3
                                 Index Cond: (c = '0'::text)


- Analysis and solution

This is caused by IndexRecheck examining the test tuple with
the qual c = '0' but without b IN ('0', '1'). That part has been
removed in create_indexscan_plan, which decides whether to remove a
qual using get_parse_rowmark(root->parse, baserelid) and
predicate_implied_by(). But the former always says no (NULL) for
child relations even if the parent has rowMarks.

On the other hand, rowmarks for the children have already been
distributed into root->rowMarks by expand_inherited_rtentry() at that point.

So I replaced get_parse_rowmark() with get_plan_rowmark(), as in
the attached patch, and the problem disappeared.


By the way, get_plan_rowmark() has a comment like this:

 * This probably ought to be elsewhere, but there's no very good place

I haven't moved it anywhere, but createplan.c might be a good place.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From a0e26063bd412ce06df7dc252b8881e51af10f35 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Thu, 11 Dec 2014 18:20:37 +0900
Subject: [PATCH] Fix bogus tuples for inheritance and FOR UPDATE

---
 src/backend/optimizer/plan/createplan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bf8dbe0..30df3ce 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -29,6 +29,7 @@
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/paths.h"
+#include "optimizer/prep.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
@@ -1231,7 +1232,7 @@ create_indexscan_plan(PlannerInfo *root,
 			if (best_path->indexinfo->indpred)
 			{
 				if (baserelid != root->parse->resultRelation &&
-					get_parse_rowmark(root->parse, baserelid) == NULL)
+					get_plan_rowmark(root->rowMarks, baserelid) == NULL)
 					if (predicate_implied_by(clausel,
 											 best_path->indexinfo->indpred))
 						continue;		/* implied by index predicate */
-- 
2.1.0.GIT


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-12-11 Thread Etsuro Fujita

(2014/12/11 14:54), Ashutosh Bapat wrote:

I marked this as ready for committer.


Many thanks!

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Memory-bounded HashAgg

2014-12-11 Thread Jeff Davis
On Sun, 2014-08-10 at 14:26 -0700, Jeff Davis wrote:
 This patch requires the Memory Accounting patch, or something similar,
 to track memory usage.
 
 The attached patch enables hashagg to spill to disk, which means that
 hashagg will confine itself to work_mem even if the planner makes a
 bad misestimate of the cardinality.

New patch attached. All open items are complete, though the patch may
have a few rough edges.

Summary of changes:

 * rebased on top of latest memory accounting patch
http://www.postgresql.org/message-id/1417497257.5584.5.camel@jeff-desktop
 * added a flag to hash_create to prevent it from creating an extra
level of memory context
   - without this, the memory accounting would have a measurable impact
on performance
 * cost model for the disk usage
 * intelligently choose the number of partitions for each pass of the
data
 * explain support
 * in build_hash_table(), be more intelligent about the value of
nbuckets to pass to BuildTupleHashTable()
   - BuildTupleHashTable tries to choose a value to keep the table in
work_mem, but it isn't very accurate.
 * some very rudimentary testing (sanity checks, really) shows good
results
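A rough way to exercise that behavior, using the GUC this patch adds (enable_hashagg_disk
appears in the doc hunk below; it is not in released PostgreSQL, and the numbers here are
just an illustration):

SET work_mem = '4MB';
SET enable_hashagg_disk = on;
EXPLAIN ANALYZE
SELECT g % 1000000 AS k, count(*)
  FROM generate_series(1, 5000000) g
 GROUP BY k;   -- roughly a million groups, comfortably more than 4MB of hash table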

Summary of previous discussion (my summary; I may have missed some
points):

Tom Lane requested that the patch also handle the case where transition
values grow (e.g. array_agg) beyond work_mem. I feel this patch provides
a lot of benefit as it is, and trying to handle that case would be a lot
more work (we need a way to write the transition values out to disk at a
minimum, and perhaps combine them with other transition values). I also
don't think my patch would interfere with a fix there in the future.

Tomas Vondra suggested an alternative design that more closely resembles
HashJoin: instead of filling up the hash table and then spilling any new
groups, the idea would be to split the current data into two partitions,
keep one in the hash table, and spill the other (see
ExecHashIncreaseNumBatches()). This has the advantage that it's very
fast to identify whether the tuple is part of the in-memory batch or
not; and we can avoid even looking in the memory hashtable if not.

The batch-splitting approach has a major downside, however: you are
likely to evict a skew value from the in-memory batch, which will result
in all subsequent tuples with that skew value going to disk. My approach
never evicts from the in-memory table until we actually finalize the
groups, so the skew values are likely to be completely processed in the
first pass.

So, the attached patch implements my original approach, which I still
feel is the best solution.

Regards,
Jeff Davis

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 3017,3022  include_dir 'conf.d'
--- 3017,3037 
      </listitem>
     </varlistentry>
  
+  <varlistentry id="guc-enable-hashagg-disk" xreflabel="enable_hashagg_disk">
+   <term><varname>enable_hashagg_disk</varname> (<type>boolean</type>)
+   <indexterm>
+    <primary><varname>enable_hashagg_disk</> configuration parameter</primary>
+   </indexterm>
+   </term>
+   <listitem>
+    <para>
+     Enables or disables the query planner's use of hashed aggregation plan
+     types when the planner expects the hash table size to exceed
+     <varname>work_mem</varname>. The default is <literal>on</>.
+    </para>
+   </listitem>
+  </varlistentry>
+ 
   <varlistentry id="guc-enable-hashjoin" xreflabel="enable_hashjoin">
    <term><varname>enable_hashjoin</varname> (<type>boolean</type>)
    <indexterm>
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 86,91  static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
--- 86,92 
  	 List *ancestors, ExplainState *es);
  static void show_sort_info(SortState *sortstate, ExplainState *es);
  static void show_hash_info(HashState *hashstate, ExplainState *es);
+ static void show_hashagg_info(AggState *hashstate, ExplainState *es);
  static void show_tidbitmap_info(BitmapHeapScanState *planstate,
  	ExplainState *es);
  static void show_instrumentation_count(const char *qlabel, int which,
***
*** 1423,1428  ExplainNode(PlanState *planstate, List *ancestors,
--- 1424,1430 
  		case T_Agg:
  			show_agg_keys((AggState *) planstate, ancestors, es);
  			show_upper_qual(plan->qual, "Filter", planstate, ancestors, es);
+ 			show_hashagg_info((AggState *) planstate, es);
  			if (plan->qual)
  				show_instrumentation_count("Rows Removed by Filter", 1,
  		   planstate, es);
***
*** 1913,1918  show_sort_info(SortState *sortstate, ExplainState *es)
--- 1915,1956 
  }
  
  /*
+  * Show information on hash aggregate buckets and batches
+  */
+ static void
+ show_hashagg_info(AggState *aggstate, ExplainState *es)
+ {
+ 	Agg *agg = (Agg *) aggstate->ss.ps.plan;
+ 
+ 	Assert(IsA(aggstate, AggState));
+ 
+ 	if (agg->aggstrategy != AGG_HASHED)
+ 		return;
+ 
+ 	if 

Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions

2014-12-11 Thread Heikki Linnakangas
I'm marking this as Rejected in the commitfest. It's quite clear that 
this isn't going to fly in its current form.


For the COPY FROM use case, I'd suggest just doing COPY FROM STDIN. Yes, 
it's slower, but not much. And you probably could optimize it further - 
there's some gratuitous memcpy()ing happening from buffer to buffer in 
that codepath.
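For anyone looking for the suggested workaround in practice, a hedged sketch (table and
file names are illustrative): from psql, \copy reads the file on the client side and sends
it as COPY ... FROM STDIN, so no server-side file access is needed:

\copy mytable FROM '/path/on/client/data.csv' WITH (FORMAT csv)

-- equivalently, an application can issue this and then stream the rows itself:
COPY mytable FROM STDIN WITH (FORMAT csv);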


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby

2014-12-11 Thread Andres Freund
On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:
On 12/11/2014 05:45 AM, Andres Freund wrote:
 A customer recently reported getting backup_label contains data
 inconsistent with control file after taking a basebackup from a
standby
 and starting it with a typo in primary_conninfo.

When starting postgres from a basebackup, StartupXLOG() has the following
code to deal with backup labels:
 	if (haveBackupLabel)
 	{
 		ControlFile->backupStartPoint = checkPoint.redo;
 		ControlFile->backupEndRequired = backupEndRequired;

 		if (backupFromStandby)
 		{
 			if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
 				ereport(FATAL,
 						(errmsg("backup_label contains data inconsistent with control file"),
 						 errhint("This means that the backup is corrupted and you will "
 								 "have to use another backup for recovery.")));
 			ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
 		}
 	}

 while I'm not enthusiastic about the error message, that bit of code
 looks sane at first glance. We certainly expect the control file to
 indicate we're in recovery. Since we're unlinking the backup label
 shortly afterwards we'd normally not expect to hit that case after a
 shutdown in recovery.

Check.

The problem is that after reading the backup label we also have to read
the corresponding checkpoint from pg_xlog. If primary_conninfo and/or
restore_command are misconfigured and can't restore the files, that can only
be fixed by shutting down the cluster and fixing up recovery.conf -
which sets DB_SHUTDOWNED_IN_RECOVERY in the control file.

No it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in 
CreateRestartPoint(). If you shut down the server before it has even 
read the initial checkpoint record, it will not attempt to create a 
restartpoint nor update the control file.

Yes, it does. There's a shortcut that just sets the state in the control file 
and then exits.

 The easiest solution seems to be to simply also allow that as a state
in
 the above check. It might be nicer to not allow a ShutdownXLOG to
modify
 the control file et al at that stage, but I think that'd end up being
 more invasive.

 A short search shows that that also looks like a credible explanation
 for #12128...

Yeah. I was not able to reproduce this, but I'm clearly missing 
something, since both you and Sergey have seen this happening. Can you 
write a script to reproduce?

Not right now, I only have my mobile... It's quite easy though. Take a 
pg_basebackup from a standby. Create a recovery.conf with a broken 
primary_conninfo. Start. Shutdown. Fix the conninfo. Start.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review of Refactoring code for sync node detection

2014-12-11 Thread Heikki Linnakangas

On 11/18/2014 11:23 PM, Michael Paquier wrote:

On Tue, Nov 18, 2014 at 6:33 PM, Simon Riggs si...@2ndquadrant.com wrote:


Can we just wait on this patch until we have the whole feature?


Well, this may take some time to even define, and even if goals are clearly
defined this may take even more time to have a prototype to discuss about.



We quite often break larger patches down into smaller ones, but if
they arrive in lots of small pieces its more difficult to understand
and correct issues that are happening on the larger scale. Churning
the same piece of code multiple times is rather mind numbing.


Hm. I think that we are losing ourselves in this thread. The primary goal
is to remove code duplication between syncrep.c and walsender.c that
has existed since 9.1. Would it be possible to focus only on that for now?
That's still the topic of this CF.


Some comments on this patch:

* I don't like filling the priorities-array in 
SyncRepGetSynchronousStandby(). It might be convenient for the one 
caller that needs it, but it seems pretty ad hoc.


In fact, I don't really see the point of having the array at all. You 
could just print the priority in the main loop in 
pg_stat_get_wal_senders(). Yeah, nominally you need to hold SyncRepLock 
while reading the priority, but it seems over-zealous to be so strict 
about that in pg_stat_wal_senders(), since it's just an informational 
view, and priority changes so rarely that it's highly unlikely to hit a 
race condition there. Also note that when you change the list of 
synchronous standbys in the config file, and SIGHUP, the walsender 
processes will update their priorities in random order. So the idea that 
you get a consistent snapshot of the priorities is a bit bogus anyway. 
Also note that between filling the priorities array and the main loop in 
pg_stat_get_wal_senders(), a new WAL sender might connect and set a 
slot's pid. With the current coding, you'll print an uninitialized value 
from the array if that happens. So the only thing that holding the 
SyncRepLock really protects from is a torn read of the value, if 
reading an int is not atomic. We could use the spinlock to protect from 
that if we care.


* Would be nicer to return a pointer than an index into the wal senders 
array. That's what the caller really wants.


I propose the attached (I admit I haven't tested it).

- Heikki

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..da89a3b 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -5,7 +5,7 @@
  * Synchronous replication is new as of PostgreSQL 9.1.
  *
  * If requested, transaction commits wait until their commit LSN is
- * acknowledged by the sync standby.
+ * acknowledged by the synchronous standby.
  *
  * This module contains the code for waiting and release of backends.
  * All code in this module executes on the primary. The core streaming
@@ -358,6 +358,54 @@ SyncRepInitConfig(void)
 }
 
 /*
+ * Find the WAL sender servicing the synchronous standby with the lowest
+ * priority value, or NULL if no synchronous standby is connected. If there
+ * are multiple nodes with the same lowest priority value, the first node
+ * found is selected. The caller must hold SyncRepLock.
+ */
+WalSnd *
+SyncRepGetSynchronousStandby(void)
+{
+	WalSnd	   *result = NULL;
+	int			result_priority = 0;
+	int			i;
+
+	/* Scan WAL senders and find synchronous node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+		int			this_priority;
+
+		/* Must be active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Must be streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Must be synchronous */
+		this_priority = walsnd->sync_standby_priority;
+		if (this_priority == 0)
+			continue;
+
+		/* Must have a lower priority value than any previous ones */
+		if (result != NULL && result_priority <= this_priority)
+			continue;
+
+		/* Must have a valid flush position */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		result = (WalSnd *) walsnd;
+		result_priority = this_priority;
+	}
+
+	return result;
+}
+
+/*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
  *
@@ -368,11 +416,9 @@ void
 SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
-	volatile WalSnd *syncWalSnd = NULL;
+	WalSnd	   *syncWalSnd;
 	int			numwrite = 0;
 	int			numflush = 0;
-	int			priority = 0;
-	int			i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +434,13 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
-	 * then we use the first mentioned standby. If you change this, 

Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby

2014-12-11 Thread Marco Nenciarini
On 11/12/14 12:38, Andres Freund wrote:
 On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:
 On 12/11/2014 05:45 AM, Andres Freund wrote:

 Yeah. I was not able to reproduce this, but I'm clearly missing 
 something, since both you and Sergey have seen this happening. Can you 
 write a script to reproduce?
 
 Not right now, I only have my mobile... Its quite easy though. Create a 
 pg-basebackup from a standby. Create a recovery.conf with a broken primary 
 conninfo. Start. Shutdown. Fix conninfo. Start.
 

Just tested it. Those steps are not sufficient to reproduce the issue on
a test installation, I suppose because, on a small test datadir, the
checkpoint location and the redo location in pg_control are the same as
the ones present in the backup_label.

To trigger this bug you need at least one restartpoint to have happened on
the standby between the start and the end of the backup.

You can simulate it by issuing a checkpoint on the master, then a checkpoint on
the standby (to force a restartpoint), and then copying the pg_control from the
standby.

This way I've been able to reproduce it.
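Putting the two recipes together, a rough script of the reproduction (host names, ports
and paths are assumptions, not from the thread):

pg_basebackup -h standby_host -p 5433 -D backup_dir -X stream   # base backup from the standby
psql -h master_host  -c "CHECKPOINT"                            # checkpoint on the master ...
psql -h standby_host -c "CHECKPOINT"                            # ... then on the standby, forcing a restartpoint
cp standby_datadir/global/pg_control backup_dir/global/         # take the newer pg_control from the standby
cat > backup_dir/recovery.conf <<EOF
standby_mode = 'on'
primary_conninfo = 'host=no.such.host'   # deliberately broken
EOF
pg_ctl -D backup_dir start     # cannot connect; shut it down before it reaches consistency
pg_ctl -D backup_dir stop
# fix primary_conninfo, then:
pg_ctl -D backup_dir start     # FATAL: backup_label contains data inconsistent with control file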

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it





Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-11 Thread Peter Eisentraut
On 11/5/14 9:50 AM, Ali Akbar wrote:
 I noticed a somewhat big performance regression if the xml is large (I use
 the PRODML Product Volume sample from energistics.org):
 * Without patch (tested 3 times):
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 84,012 ms
 Time: 85,683 ms
 Time: 88,766 ms

 * With latest v6 patch (notice the correct result with namespace
 definition):
 
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 108,912 ms
 Time: 108,267 ms
 Time: 114,848 ms
 
 
 It's 23% performance regression.
 
 * Just curious, i'm also testing v5 patch performance (notice the
 namespace in the result):
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 92,552 ms
 Time: 97,440 ms
 Time: 99,309 ms
 
 The regression is only 13%. I know the xmlCopyNode() version (v6 patch)
 is much cleaner than the v5 patch; should we consider the performance
 benefit?

I ran a test using postgres-US.fo built in the PostgreSQL source tree,
which is 38 MB, and ran

select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
'http://www.w3.org/1999/XSL/Format']])) from data;

(Table contains one row only.)

The timings were basically indistinguishable between the three code
versions.

I'll try to reproduce your test.  How big is your file?  Do you have a
link to the actual file?  Could you share your load script?
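For reference, a minimal load script of the kind being asked about might look like this
(the file name and location are assumptions; pg_read_file() requires superuser and a path
under the data directory):

CREATE TABLE u (x xml);
INSERT INTO u SELECT pg_read_file('prodml_product_volume.xml')::xml;
\timing on
SELECT unnest(xpath('//a:flow', x,
       ARRAY[['a','http://www.prodml.org/schemas/1series']]))
  FROM u;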



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Bruce Momjian
On Wed, Dec 10, 2014 at 11:00:21PM -0800, Josh Berkus wrote:
 On 12/10/2014 10:35 PM, Bruce Momjian wrote:
  I think we are reaching the limits on how much we can do with our
  existing commitfest structure, and it might be time to consider changes.
  While the commitfest process hasn't changed much and was very successful
  in the first few years, a few things have changed externally:
  
  1. more new developers involved in contributing small patches
  2. more full-time developers creating big patches
  3. increased time demands on experienced Postgres developers
 
 I will add:
 
 4. commitfest managers have burned out and refuse to do it again

Agreed.  The fun, if it was ever there, has left the commitfest
process.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On partitioning

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 12:00 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Yeah either this way or what Josh has suggested upthread, the main
 point was that if at all we want to support multi-column list partitioning
 then we need to have slightly different syntax, however I feel that we
 can leave multi-column list partitioning for first version.

Yeah, possibly.

I think we could stand to have a lot more discussion about the syntax
here.  So far the idea seems to be to copy what Oracle has, but it's
not clear if we're going to have exactly what Oracle has or something
subtly different.  I personally don't find the Oracle syntax very
PostgreSQL-ish.  Stuff like VALUES LESS THAN 500 doesn't sit
especially well with me - less than according to which opclass?  Are
we going to insist that partitioning must use the default btree
opclass so that we can use that syntax?  That seems kind of lame.

There are lots of interesting things we could do here, e.g.:

CREATE TABLE parent_name PARTITION ON (column [ USING opclass ] [, ... ]);
CREATE TABLE child_name PARTITION OF parent_name
   FOR { (value, ...) [ TO (value, ...) ] } [, ...];

So instead of making a hard distinction between range and list
partitioning, you can say:

CREATE TABLE child_name PARTITION OF parent_name FOR (3), (5), (7);
CREATE TABLE child2_name PARTITION OF parent_name FOR (8) TO (12);
CREATE TABLE child2_name PARTITION OF parent_name FOR (20) TO (30),
(120) TO (130);

Now that might be a crappy idea for various reasons, but the point is
there are a lot of details to be hammered out with the syntax, and
there are several ways we can go wrong.  If we choose an
overly-limiting syntax, we're needlessly restricting what can be done.
If we choose an overly-permissive syntax, we'll restrict the
optimization opportunities.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSSAPI, SSPI - include_realm default

2014-12-11 Thread Robert Haas
On Wed, Dec 10, 2014 at 4:53 AM, Bruce Momjian br...@momjian.us wrote:
 On Tue, Dec  9, 2014 at 05:40:35PM -0500, Stephen Frost wrote:
  I thought the idea was to backpatch documentation saying it's a good idea
  to change this value to x because of y. Not actually referring to the
  upcoming change directly. And I still think that part is a good idea, as it
  helps people avoid potential security pitfalls.

 I agree with this but I don't really see why we wouldn't say "hey, this
 is going to change in 9.5".  Peter's argument sounds like he'd rather we
 not make any changes to the existing documentation, and I don't agree
 with that, and if we're making changes then, imv, we might as well
 comment that the default is changed in 9.5.

 I agree with Peter --- it is unwise to reference a future released
 feature in a backbranch doc patch.  Updating the backbranch docs to add
 a recommendation is fine.

I am strongly in agreement with that principle as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL information view

2014-12-11 Thread Alex Shulgin

Magnus Hagander mag...@hagander.net writes:

 You should add it to the next CF for proper tracking, there are already many
 patches in the queue waiting for reviews :)

 Absolutely - I just wanted those that were already involved in the
 thread to get a chance to look at it early :) didn't want to submit it
 until it was complete.

 Which it is now - attached is a new version that includes docs.

Here's my review of the patch:

* Applies to the current HEAD, no failed hunks.
* Compiles and works as advertised.
* Docs included.


The following catches my eye:

psql (9.5devel)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 
256, compression: off)
Type "help" for help.

postgres=# select * from pg_stat_ssl;
 pid  | ssl | bits | compression | version |           cipher            | clientdn
------+-----+------+-------------+---------+-----------------------------+----------
 1343 | t   |  256 | f           | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 | 
(1 row)

I think the order of details in the psql prompt makes more sense,
because it puts more important details first.

I suggest we reorder the view columns, while also renaming 'version' to
'protocol' (especially since we have another patch in the works that
adds a GUC named 'ssl_protocols'):

  pid, ssl, protocol, cipher, bits, compression, clientdn.


Next, this one:

+ be_tls_get_cipher(Port *port, char *ptr, size_t len)
+ {
+   if (port->ssl)
+   strlcpy(ptr, SSL_get_cipher(port->ssl), NAMEDATALEN);

should be this:

+   strlcpy(ptr, SSL_get_cipher(port->ssl), len);

The same goes for this one:

+ be_tls_get_peerdn_name(Port *port, char *ptr, size_t len)
+ {
+   if (port->peer)
+   strlcpy(ptr, X509_NAME_to_cstring(X509_get_subject_name(port->peer)), NAMEDATALEN);

The NAMEDATALEN constant is passed in the calling code in
pgstat_bestart().
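In other words, the corrected helper would look roughly like this (a sketch with the
reviewer's fix applied; the void return type and the empty-string fallback are
assumptions, not taken from the patch):

void
be_tls_get_cipher(Port *port, char *ptr, size_t len)
{
	if (port->ssl)
		strlcpy(ptr, SSL_get_cipher(port->ssl), len);	/* honor the caller-supplied length */
	else
		ptr[0] = '\0';
}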


Other than that, looks good to me.

--
Alex


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 12:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Quite.  So, here's a new thread.

 MHO is that, although 9.4 has slipped more than any of us would like,
 9.5 development launched right on time in August.  So I don't see a
 good reason to postpone 9.5 release just because 9.4 has slipped.
 I think we should stick to the schedule agreed to in Ottawa last spring.

 Comments?

I'm fine with that, but in the spirit of playing the devil's advocate:

1. At the development meeting, Simon argued for the 5CF schedule for
this release, with CF5 not starting until February, as a way of making
sure that there was time after the release of 9.4 to get feedback from
actual users in time to do something about it for 9.5.  If anything,
we're going to end up being significantly worse off in that regard
than we would have been, because we're releasing in late December
instead of early September; an extra month of time to get patches in
does not make up for a release that was delayed nearly three months.

2. It's not clear that we're going to have a particularly-impressive
list of major features for 9.5.  So far we've got RLS and BRIN. I
expect that GROUPING SETS is far enough along that it should be
possible to get it in before development ends, and there are a few
performance patches pending (Andres's lwlock scalability patches,
Rahila's work on compressing full-page writes) that I think will
probably make the grade.  But after that it seems to me that it gets
pretty thin on the ground.  Are we going to bill commit timestamp
tracking - with replication node ID tracking as the real goal, despite
the name - as a major feature, or DDL deparsing if that goes in, as
major features?  As useful as they may be for BDR, they don't strike
me as things we can publicize as major features independent of BDR.
And it's getting awfully late for any other major work that people are
thinking of to start showing up.

Now, against all that, if we don't get back on our usual release
schedule then (a) it will look like we're losing momentum, which I'm
actually afraid may be true rather than merely a perception, and (b)
people whose stuff did get in will have to wait longer to see it
released.  So, I'm not sure waiting is any better.  But there are
certainly some things not to like about where we are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus j...@agliodbs.com wrote:
 While there were technical
 issues, 9.4 dragged a considerable amount because most people were
 ignoring it in favor of 9.5 development.

I think 9.4 dragged almost entirely because of one issue: the
compressibility of JSONB.  And it became pretty clear early on in that
discussion that it was not going to be resolved until Tom signed off
on some proposed fix.  Which I think points to the issue Bruce
mentioned about how busy many of our key contributors are.  I could
have easily spent four times as much time doing reviewing and
committing over the last few months, but I'm not willing to work an 80
or 100 hour week to make that happen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus j...@agliodbs.com wrote:
 While there were technical
 issues, 9.4 dragged a considerable amount because most people were
 ignoring it in favor of 9.5 development.

 I think 9.4 dragged almost entirely because of one issue: the
 compressibility of JSONB.

Meh.  While we certainly weren't very speedy about resolving that,
I don't think that issue deserves all or even most of the blame.
I agree with Josh: the problem really was that people were not focusing
on getting 9.4 tested and releasable.  One way in which that lack of
focus manifested was not having any urgency about resolving JSONB ...
but it struck me as a systemic problem and not that specific issue's
fault.

I'd finger two underlying issues here:

1. As Bruce points out in a nearby thread, we've been in commitfest mode
more or less continuously since August.  That inherently sucks away
developer mindshare from testing already-committed stuff.

2. The amount of pre-release testing we get from people outside the
hard-core development crowd seems to be continuing to decrease.
We were fortunate that somebody found the JSONB issue before it was
too late to do anything about it.  Personally, I'm very worried that
there are other such bugs in 9.4.  But I've given up hoping that any
more testing will happen until we put out something that calls itself
9.4.0, which is why I voted to release in the core discussion about it.

I don't know what to do about either of those things, but I predict
future release cycles will be just as bad unless we can fix them.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 01:26:38PM +0530, Rahila Syed wrote:
 I am sorry but I can't understand the above results due to wrapping.
 Are you saying compression was twice as slow?
 
 CPU usage at user level (in seconds) for compression set 'on' is 562 secs,
 while that for compression set 'off' is 354 secs. As per the readings, it
 takes a little less than double the CPU time to compress.
 However, the total time taken to run 25 transactions for each of the
 scenarios is as follows:
 
 compression = 'on'  : 1838 secs
             = 'off' : 1701 secs
 
 
 The difference is around 140 secs.

OK, so the compression took 2x the cpu and was 8% slower.  The only
benefit is WAL files are 35% smaller?
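For reference, the arithmetic behind that reading of the figures quoted above:

\frac{562}{354} \approx 1.6 \;\;\text{(CPU-time ratio, "a little less than double")},
\qquad
\frac{1838 - 1701}{1701} \approx 8\% \;\;\text{(elapsed-time slowdown)}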

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] double vacuum in initdb

2014-12-11 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 I think we could go to
 PG_CMD_PUTS("ANALYZE;\nVACUUM FULL FREEZE;\n");

 without any degradation of the intended results.

 Another idea would be to drop the FULL part and make this

 PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n");

We want to finish with VACUUM FREEZE without the FULL, unless we
don't care about missing visibility maps and free space maps.

[ initdb and start the cluster ]

server started
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls1

kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum freeze;" postgres
VACUUM
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls2

kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum full freeze;" postgres
VACUUM
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls3
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls1 | wc -l
116
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls2 | wc -l
119
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls3 | wc -l
80
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls1 | wc -l
116
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls2 | wc -l
117
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls3 | wc -l
77

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 10:37:32AM -0500, Robert Haas wrote:
 2. It's not clear that we're going to have a particularly-impressive
 list of major features for 9.5.  So far we've got RLS and BRIN. I
 expect that GROUPING SETS is far enough along that it should be
 possible to get it in before development ends, and there are a few
 performance patches pending (Andres's lwlock scalability patches,
 Rahila's work on compressing full-page writes) that I think will
 probably make the grade.  But after that it seems to me that it gets
 pretty thin on the ground.  Are we going to bill commit timestamp
 tracking - with replication node ID tracking as the real goal, despite
 the name - as a major feature, or DDL deparsing if that goes in, as
 major features?  As useful as they may be for BDR, they don't strike
 me as things we can publicize as major features independent of BDR.
 And it's getting awfully late for any other major work that people are
 thinking of to start showing up.

How bad is the 9.5 feature list going to be compared to the 9.4 one, which
had JSONB but also a lot of infrastructure additions?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Peter Eisentraut
On 12/11/14 1:35 AM, Bruce Momjian wrote:
 While the commitfest process hasn't changed much and was very successful
 in the first few years, a few things have changed externally:
 
   1. more new developers involved in contributing small patches
   2. more full-time developers creating big patches
   3. increased time demands on experienced Postgres developers

The number of patches registered in the commit fests hasn't actually
changed over the years.  It has always fluctuated between 50 and 100,
depending on the point of the release cycle.  So I don't think (1) is
necessarily the problem.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: multivariate statistics / proof of concept

2014-12-11 Thread Heikki Linnakangas

On 10/13/2014 01:00 AM, Tomas Vondra wrote:

Hi,

attached is a WIP patch implementing multivariate statistics.


Great! Really glad to see you working on this.


+* FIXME This sample sizing is mostly OK when computing stats for
+*   individual columns, but when computing multi-variate stats
+*   for multivariate stats (histograms, mcv, ...) it's rather
+*   insufficient. For small number of dimensions it works, but
+*   for complex stats it'd be nice to use a sample proportional to
+*   the table (say, 0.5% - 1%) instead of a fixed size.


I don't think a fraction of the table is appropriate. As long as the 
sample is random, the accuracy of a sample doesn't depend much on the 
size of the population. For example, if you sample 1,000 rows from a 
table with 100,000 rows, or 1000 rows from a table with 100,000,000 
rows, the accuracy is pretty much the same. That doesn't change when you 
go from a single variable to multiple variables.
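(For intuition - this is the standard sampling result, not something from the patch or
the thread: the error of a proportion estimated from a simple random sample of size n out
of a population of size N is roughly

\mathrm{SE}(\hat{p}) \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \, \sqrt{1 - \frac{n}{N}}

so once N is much larger than n, the population size barely matters.)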


You do need a bigger sample with multiple variables, however. My gut 
feeling is that if you sample N rows for a single variable, with two 
variables you need to sample N^2 rows to get the same accuracy. But it's 
not proportional to the table size. (I have no proof for that, but I'm 
sure there is literature on this.)



+ * Multivariate histograms
+ *
+ * Histograms are a collection of buckets, represented by n-dimensional
+ * rectangles. Each rectangle is delimited by an array of lower and
+ * upper boundaries, so that for the i-th attribute
+ *
+ * min[i] <= value[i] <= max[i]
+ *
+ * Each bucket tracks frequency (fraction of tuples it contains),
+ * information about the inequalities, number of distinct values in
+ * each dimension (which is used when building the histogram) etc.
+ *
+ * The boundaries may be either inclusive or exclusive, or the whole
+ * dimension may be NULL.
+ *
+ * The buckets may overlap (assuming the build algorithm keeps the
+ * frequencies additive) or may not cover the whole space (i.e. allow
+ * gaps). This entirely depends on the algorithm used to build the
+ * histogram.


That sounds pretty exotic. These buckets are quite different from the 
single-dimension buckets we currently have.


The paper you reference in partition_bucket() function, M. 
Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating 
Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 
1988: 28-36, actually doesn't mention overlapping buckets at all. I 
haven't read the code in detail, but if it implements the algorithm from 
that paper, there will be no overlap.
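To make the quoted description a bit more concrete, the per-bucket state it talks about
could be pictured like this (a schematic sketch only - field names, types and the
fixed-size arrays are assumptions, not the patch's actual structures):

#include <stdbool.h>
#include <stdint.h>

#define MV_MAX_DIMENSIONS 8          /* assumed limit, purely for illustration */

typedef struct MVBucketSketch
{
    /* lower and upper boundary for each dimension (datum values of the attributes) */
    uint64_t    min[MV_MAX_DIMENSIONS];
    uint64_t    max[MV_MAX_DIMENSIONS];

    /* whether each boundary is inclusive, and whether the whole dimension is NULL */
    bool        min_inclusive[MV_MAX_DIMENSIONS];
    bool        max_inclusive[MV_MAX_DIMENSIONS];
    bool        null_only[MV_MAX_DIMENSIONS];

    /* fraction of sampled tuples falling into this bucket */
    double      frequency;

    /* distinct values per dimension, used while building the histogram */
    uint32_t    ndistinct[MV_MAX_DIMENSIONS];
} MVBucketSketch;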


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, Dec 11, 2014 at 10:37:32AM -0500, Robert Haas wrote:
 2. It's not clear that we're going to have a particularly-impressive
 list of major features for 9.5.

 How bad is the 9.5 feature list going to be compared to the 9.4 one that
 had JSONB, but also a lot of infrastructure additions.

Well, whatever the list ends up being, "let's wait until we have some
more features" isn't a tenable scheduling policy.  We learned years ago
the folly of delaying a release until not-quite-ready feature X was ready.
Are we going to delay 9.5 until not-even-proposed-yet features are ready?

More abstractly, there's a lot of value in having a predictable release
schedule.  That's going to mean that some release cycles are thin on
user-visible features, even if just as much work went into them.  It's
the nature of the game.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think 9.4 dragged almost entirely because of one issue: the
 compressibility of JSONB.

 Meh.  While we certainly weren't very speedy about resolving that,
 I don't think that issue deserves all or even most of the blame.
 I agree with Josh: the problem really was that people were not focusing
 on getting 9.4 tested and releasable.  One way in which that lack of
 focus manifested was not having any urgency about resolving JSONB ...
 but it struck me as a systemic problem and not that specific issue's
 fault.

 I'd finger two underlying issues here:

 1. As Bruce points out in a nearby thread, we've been in commitfest mode
 more or less continuously since August.  That inherently sucks away
 developer mindshare from testing already-committed stuff.

The problem is that, on the one hand, we have a number of serious
problems with things that got committed and turned out to have
problems - the multixact stuff, and JSONB, in particular - and on the
other hand, we are lacking in adequate committer bandwidth to properly
handle all of the new patches that come in.  We can fix the first
problem by tightening up on the requirements for committing things,
but that exacerbates the second problem.  Or we can fix the second
problem by loosening up on the requirements for commit, but that
exacerbates the first problem.  Promoting more or fewer committers is
really the same trade-off: if you're very careful about who you
promote, you'll get better people but not as many of them, so less
will get done but with fewer mistakes; if you're more generous in
handing out commit bits, you reduce the bottleneck to stuff getting
done but, inevitably, you'll be trusting people in whom you have at
least slightly less confidence.  There's an inherent tension between
quality and rate of progress that we can't get rid of, and the fact
that some of our best people are busier than ever with things other
than PostgreSQL hacking is not helping - not only because less actual
review/commit happens, but because newcomers to the community don't
have as much contact with the more senior people who could help mentor
them if they only had the time.

 2. The amount of pre-release testing we get from people outside the
 hard-core development crowd seems to be continuing to decrease.
 We were fortunate that somebody found the JSONB issue before it was
 too late to do anything about it.  Personally, I'm very worried that
 there are other such bugs in 9.4.  But I've given up hoping that any
 more testing will happen until we put out something that calls itself
 9.4.0, which is why I voted to release in the core discussion about it.

 I don't know what to do about either of those things, but I predict
 future release cycles will be just as bad unless we can fix them.

I agree.  We have had a few people, Jeff Janes perhaps foremost among
them, who have done a lot of really useful testing, but overall it
does feel pretty thin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] double vacuum in initdb

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 11:44 AM, Kevin Grittner kgri...@ymail.com wrote:
 We want to finish with VACUUM FREEZE without the FULL, unless we
 don't care about missing visibility maps and free space maps.

Oh, good point.  I had forgotten about that issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: [HACKERS] Commitfest problems

2014-12-11 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
 While the commitfest process hasn't changed much and was very successful
 in the first few years, a few things have changed externally:
 
 1  more new developers involved in contributing small patches
 2  more full-time developers creating big patches
 3  increased time demands on experienced Postgres developers

 The number of patches registered in the commit fests hasn't actually
 changed over the years.  It has always fluctuated between 50 and 100,
 depending on the point of the release cycle.  So I don't think (1) is
 necessarily the problem.

Interesting point, but of course not all patches are equal; perhaps
the problem is that we're seeing fewer of (1) and more of (2) in the
commitfest queue.  Is there any easy way to quantify the size/complexity
of the patches in past fests?

regards, tom lane




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread David G Johnston
Tom Lane-2 wrote
 Robert Haas robertmhaas@ writes:
 On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus josh@ wrote:
 While there were technical
 issues, 9.4 dragged a considerable amount because most people were
 ignoring it in favor of 9.5 development.
 
 I think 9.4 dragged almost entirely because of one issue: the
 compressibility of JSONB.
 
 2. The amount of pre-release testing we get from people outside the
 hard-core development crowd seems to be continuing to decrease.
 We were fortunate that somebody found the JSONB issue before it was
 too late to do anything about it.  Personally, I'm very worried that
 there are other such bugs in 9.4.  But I've given up hoping that any
 more testing will happen until we put out something that calls itself
 9.4.0, which is why I voted to release in the core discussion about it.

The compressibility properties of a new type seem like something that should
be mandated before it is committed - it shouldn't require good fortune that







Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread David G Johnston
David G Johnston wrote
 Tom Lane-2 wrote
 Robert Haas robertmhaas@ writes:
 On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus josh@ wrote:
 While there were technical
 issues, 9.4 dragged a considerable amount because most people were
 ignoring it in favor of 9.5 development.
 
 I think 9.4 dragged almost entirely because of one issue: the
 compressibility of JSONB.
 
 2. The amount of pre-release testing we get from people outside the
 hard-core development crowd seems to be continuing to decrease.
 We were fortunate that somebody found the JSONB issue before it was
 too late to do anything about it.  Personally, I'm very worried that
 there are other such bugs in 9.4.  But I've given up hoping that any
 more testing will happen until we put out something that calls itself
 9.4.0, which is why I voted to release in the core discussion about it.
 The compressibility properties of a new type seem like something that
 should be mandated before it is committed - it shouldn't require good
 fortune that

... someone thought to test it out.  Is there anywhere to effectively add
that to maximize the chance someone remembers for next time?

3. Effort and brain power were also diverted to fixing 9.3 this time around.

I don't have any answers but if a release is thin on features but thick on
review and testing I wouldn't complain.  That testing, especially applied to
existing releases, would have considerable benefit to those still using
older supported releases instead of only benefitting those who can upgrade
to the latest and greatest.  An existing user on an older release is just as
important, if not more so, than the potential user who is looking for new
features before deciding to migrate or begin using PostgreSQL.

David J.








Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Heikki Linnakangas

On 12/11/2014 06:59 PM, Robert Haas wrote:

On Thu, Dec 11, 2014 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:

I think 9.4 dragged almost entirely because of one issue: the
compressibility of JSONB.


Meh.  While we certainly weren't very speedy about resolving that,
I don't think that issue deserves all or even most of the blame.
I agree with Josh: the problem really was that people were not focusing
on getting 9.4 tested and releasable.  One way in which that lack of
focus manifested was not having any urgency about resolving JSONB ...
but it struck me as a systemic problem and not that specific issue's
fault.

I'd finger two underlying issues here:

1. As Bruce points out in a nearby thread, we've been in commitfest mode
more or less continuously since August.  That inherently sucks away
developer mindshare from testing already-committed stuff.


The problem is that, on the one hand, we have a number of serious
problems with things that got committed and turned out to have
problems - the multixact stuff, and JSONB, in particular - and on the
other hand, we are lacking in adequate committer bandwidth to properly
handle all of the new patches that come in.  We can fix the first
problem by tightening up on the requirements for committing things,
but that exacerbates the second problem.  Or we can fix the second
problem by loosening up on the requirements for commit, but that
exacerbates the first problem.


There is a third option: reject more patches, more quickly, with less 
discussion. IOW, handle new patches less properly.


The commitfest is good at making sure that no patch completely falls off 
the radar. That's also a problem. Before the commitfest process, many 
patches were not actively rejected, but they just fell to the side if no 
committer was interested enough in it to pick it up, review, and commit.


There are a lot of patches in the October commitfest that I personally 
don't care about, and if I were a dictator I would just drop as "not
worth the trouble to review". Often a patch just doesn't feel quite
right, or would require some work to clean up, but you can't immediately 
point to anything outright wrong with it. It takes some effort to review 
such a patch enough to give feedback on it, if you want more meaningful
feedback than "this smells bad".


I imagine that it's the same for everyone else. Many of the patches that 
sit in the commitfest for weeks are patches that no-one really cares 
much about. I'm not sure what to do about that. It would be harsh to 
reject a patch just because no-one's gotten around to reviewing it, and 
if we start doing that, it's not clear what the point of a commitfest is 
any more.


Perhaps we should change the process so that it is the patch author's 
responsibility to find a reviewer, and a committer, for the patch. If 
you can't cajole anyone to review your patch, it's a sign that no-one 
cares about it, and the patch is rejected by default. Or put a quick 
+1/-1 voting option to each patch in the commitfest, to get a quick 
gauge of how the committers feel about it.


- Heikki





Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

2014-12-11 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Sat, Dec 6, 2014 at 10:08 PM, Tomas Vondra t...@fuzzy.cz wrote:
select a.i, b.i from a join b on (a.i = b.i);

 I think the concern is that the inner side might be something more
 elaborate than a plain table scan, like an aggregate or join.  I might
 be all wet, but my impression is that you can make rescanning
 arbitrarily expensive if you work at it.

I'm not sure I'm following.  Let's use a function to select from b:

create or replace function fb()
  returns setof b
  language plpgsql
  rows 1
as $$
begin
  return query select i from b;
end;
$$;

explain (analyze, buffers, verbose)
  select a.i, b.i from a join fb() b on (a.i = b.i);

I used the low row estimate to cause the planner to put this on the inner side.

16 batches
Execution time: 1638.582 ms

Now let's make it slow.

create or replace function fb()
  returns setof b
  language plpgsql
  rows 1
as $$
begin
  perform pg_sleep(2.0);
  return query select i from b;
end;
$$;
explain (analyze, buffers, verbose)
  select a.i, b.i from a join fb() b on (a.i = b.i);

16 batches
Execution time: 3633.859 ms

Under what conditions do you see the inner side get loaded into the 
hash table multiple times?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Mark Dilger
I found a few places in the code where a variable of
type Oid is printed using %d rather than %u and
changed them in the attached patch.

In src/backend/replication/logical/reorderbuffer.c,
circa line 2494, chunk_seq is of type Oid, but
ent->last_chunk_seq is of type int32, leading me
to question if perhaps the use of %d for chunk_seq
is correct, but the use of Oid for the type of chunk_seq
is in error.  If neither is in error, perhaps someone
can provide a short code comment explaining the
logic of the signed/unsigned discrepancy.
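
For readers skimming the patch, here is a minimal standalone illustration
(not part of the patch; the typedef merely stands in for PostgreSQL's
unsigned 32-bit Oid) of why %u is the right conversion once a value
exceeds 2^31-1:

#include <stdio.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's 32-bit Oid */

int
main(void)
{
    Oid     reltablespace = 3000000000U;    /* a valid OID above 2^31-1 */

    printf("%d\n", reltablespace);  /* misleading: typically prints -1294967296 */
    printf("%u\n", reltablespace);  /* correct: prints 3000000000 */
    return 0;
}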

Thanks,

Mark Dilger
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index ce44bbd..d230387 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -2155,7 +2155,7 @@ toast_open_indexes(Relation toastrel,
 	 * wrong if there is nothing.
 	 */
 	if (!found)
-		elog(ERROR, "no valid index found for toast relation with Oid %d",
+		elog(ERROR, "no valid index found for toast relation with Oid %u",
 			 RelationGetRelid(toastrel));
 
 	return res;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c7f41a5..c25ac31 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4540,7 +4540,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	DBWriteRequest *newreq;
 	PgStat_StatDBEntry *dbentry;
 
-	elog(DEBUG2, "received inquiry for %d", msg->databaseid);
+	elog(DEBUG2, "received inquiry for %u", msg->databaseid);
 
 	/*
 	 * Find the last write request for this DB.  If it's older than the
@@ -4598,7 +4598,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 		writetime = pstrdup(timestamptz_to_str(dbentry->stats_timestamp));
 		mytime = pstrdup(timestamptz_to_str(cur_ts));
 		elog(LOG,
-			 "stats_timestamp %s is later than collector's time %s for db %d",
+			 "stats_timestamp %s is later than collector's time %s for db %u",
 			 writetime, mytime, dbentry->databaseid);
 		pfree(writetime);
 		pfree(mytime);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 6e75398..41a4896 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		dlist_init(&ent->chunks);
 
 		if (chunk_seq != 0)
-			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
+			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
 				 chunk_seq, chunk_id);
 	}
 	else if (found && chunk_seq != ent->last_chunk_seq + 1)
-		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
+		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
 			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
 
 	chunk = DatumGetPointer(fastgetattr(&newtup->tuple, 3, desc, &isnull));
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 0a9ac02..a59508f 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -821,7 +821,7 @@ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
 					  "        pg_class.relname "
 					  "  FROM pg_class "
 					  "  JOIN pg_namespace on pg_namespace.oid = relnamespace "
-					  " WHERE pg_class.oid = %d", te->catalogId.oid);
+					  " WHERE pg_class.oid = %u", te->catalogId.oid);
 
 	res = PQexec(AH->connection, query->data);



Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Alvaro Herrera
Mark Dilger wrote:
 I found a few places in the code where a variable of
 type Oid is printed using %d rather than %u and
 changed them in the attached patch.
 
 In src/backend/replication/logical/reorderbuffer.c,
 circa line 2494, chunk_seq is of type Oid, but
 ent->last_chunk_seq is of type int32, leading me
 to question if perhaps the use of %d for chunk_seq
 is correct, but the use of Oid for the type of chunk_seq
 is in error.  If neither is in error, perhaps someone
 can provide a short code comment explaining the
 logic of the signed/unsigned discrepancy.

tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk
is wrong in declaring it Oid, and so this part of your patch is bogus.
The others seem correct.

 diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
 index 6e75398..41a4896 100644
 --- a/src/backend/replication/logical/reorderbuffer.c
 +++ b/src/backend/replication/logical/reorderbuffer.c
 @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		dlist_init(&ent->chunks);
 
 		if (chunk_seq != 0)
 -			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
 +			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
 				 chunk_seq, chunk_id);
 	}
 	else if (found && chunk_seq != ent->last_chunk_seq + 1)
 -		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
 +		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
 			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
 
 	chunk = DatumGetPointer(fastgetattr(&newtup->tuple, 3, desc, &isnull));

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Peter Geoghegan
On Thu, Dec 11, 2014 at 7:37 AM, Robert Haas robertmh...@gmail.com wrote:
 2. It's not clear that we're going to have a particularly-impressive
 list of major features for 9.5.  So far we've got RLS and BRIN. I
 expect that GROUPING SETS is far enough along that it should be
 possible to get it in before development ends, and there are a few
 performance patches pending (Andres's lwlock scalability patches,
 Rahila's work on compressing full-page writes) that I think will
 probably make the grade.  But after that it seems to me that it gets
 pretty thin on the ground.

I'm slightly surprised that you didn't mention abbreviated keys in
that list of performance features. You're reviewing that patch; how do
you feel about it now?

 Are we going to bill commit timestamp
 tracking - with replication node ID tracking as the real goal, despite
 the name - as a major feature, or DDL deparsing if that goes in, as
 major features?  As useful as they may be for BDR, they don't strike
 me as things we can publicize as major features independent of BDR.
 And it's getting awfully late for any other major work that people are
 thinking of to start showing up.

Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August -
when development launched. It still doesn't have a reviewer, and it
isn't actually in evidence that someone else has so much as downloaded
and applied the patch (I'm sure someone has done that much, but the
fact is that all the feedback that I've received this year concerns
the semantics/syntax, which you can form an opinion on by just looking
at the extensive documentation and other supplementary material I've
written). It's consistently one of the most requested features, and
yet major aspects of the design that permeate every major subsystem
have gone unremarked on for months now. This feature is
*definitely* major feature list material, since people have been
loudly requesting it for over a decade, and yet no one mentions it in
this thread (only Bruce mentioned it in the other thread about the
effectiveness of the CF process). It's definitely in the top 2 or 3
most requested features, alongside much harder problems like parallel
query and comprehensive partitioning support.

If there is a lesson here for people that are working on major
features, or me personally, I don't know what it is -- if anyone else
knows, please tell me. I've bent over backwards to make the patch as
accessible as possible, and as easy to review as possible. I also
think its potential to destabilize the system (as major features go)
is only about average. What am I doing wrong here?

There is an enormous amount of supplementary documentation associated
with the patch:

https://wiki.postgresql.org/wiki/UPSERT
https://wiki.postgresql.org/wiki/Value_locking

Both of these pages are far larger than the Wiki page for RLS, for
example. The UPSERT wiki page is kept right up to date.

-- 
Peter Geoghegan




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Jeff Janes
On Thu, Dec 11, 2014 at 8:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:


 2. The amount of pre-release testing we get from people outside the
 hard-core development crowd seems to be continuing to decrease.
 We were fortunate that somebody found the JSONB issue before it was
 too late to do anything about it.  Personally, I'm very worried that
 there are other such bugs in 9.4.  But I've given up hoping that any
 more testing will happen until we put out something that calls itself
 9.4.0, which is why I voted to release in the core discussion about it.



We are not particularly inviting of feedback for whatever testing has been
done.

The definitive guide seems to be
https://wiki.postgresql.org/wiki/HowToBetaTest, and says:

You can report tests by email. You can subscribe to any PostgreSQL mailing
list from the subscription form http://www.postgresql.org/community/lists/.

   - pgsql-bugs: this is the preferred mailing list if you think you have
   found a bug in the beta. You can also use the Bug Reporting Form
   http://www.postgresql.org/support/submitbug/.
   - pgsql-hackers: bugs, questions, and successful test reports are
   welcome here if you are already subscribed to pgsql-hackers. Note that
   pgsql-hackers is a high-traffic mailing list with a lot of development
   discussion.


=

So if you find a bug, you can report it on the bug reporting form (which
doesn't have a drop-down entry for 9.4RC1).

If you have positive results rather than negative ones (or even complaints
that are not actually bugs), you can subscribe to a mailing list which
generates a lot of traffic which is probably over your head and not
interesting to you.

Does the core team keep a mental list of items they want to see tested by
the public, and will they spend their own time testing those things
themselves if they don't hear back with positive tests for them?

If we find reports of public testing that yield good results (or at least
no bugs) to be useful, we should be clearer on how to go about doing
it.  But are positive reports useful?  If I report a bug, I can write down
the steps to reproduce it, and then follow my own instructions to make sure
they do actually reproduce it.  If I find no bugs, it is just "I did a
bunch of random stuff and nothing bad happened, that I noticed."

Cheers,

Jeff


Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Mark Dilger
Thank you Alvaro,

The attached patch changes the type of chunk_seq to int32,
rather than changing the %d formatting.  The other changes
are the same as in the previous patch.

Mark Dilger



On Thu, Dec 11, 2014 at 9:39 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Mark Dilger wrote:
  I found a few places in the code where a variable of
  type Oid is printed using %d rather than %u and
  changed them in the attached patch.
 
  In src/backend/replication/logical/reorderbuffer.c,
  circa line 2494, chunk_seq is of type Oid, but
  ent->last_chunk_seq is of type int32, leading me
  to question if perhaps the use of %d for chunk_seq
  is correct, but the use of Oid for the type of chunk_seq
  is in error.  If neither is in error, perhaps someone
  can provide a short code comment explaining the
  logic of the signed/unsigned discrepancy.

 tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk
 is wrong in declaring it Oid, and so this part of your patch is bogus.
 The others seem correct.

  diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
  index 6e75398..41a4896 100644
  --- a/src/backend/replication/logical/reorderbuffer.c
  +++ b/src/backend/replication/logical/reorderbuffer.c
  @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
  		dlist_init(&ent->chunks);
  
  		if (chunk_seq != 0)
  -			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
  +			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
  				 chunk_seq, chunk_id);
  	}
  	else if (found && chunk_seq != ent->last_chunk_seq + 1)
  -		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
  +		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
  			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
  
  	chunk = DatumGetPointer(fastgetattr(&newtup->tuple, 3, desc, &isnull));

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index ce44bbd..d230387 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -2155,7 +2155,7 @@ toast_open_indexes(Relation toastrel,
 	 * wrong if there is nothing.
 	 */
 	if (!found)
-		elog(ERROR, "no valid index found for toast relation with Oid %d",
+		elog(ERROR, "no valid index found for toast relation with Oid %u",
 			 RelationGetRelid(toastrel));
 
 	return res;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c7f41a5..c25ac31 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4540,7 +4540,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	DBWriteRequest *newreq;
 	PgStat_StatDBEntry *dbentry;
 
-	elog(DEBUG2, "received inquiry for %d", msg->databaseid);
+	elog(DEBUG2, "received inquiry for %u", msg->databaseid);
 
 	/*
 	 * Find the last write request for this DB.  If it's older than the
@@ -4598,7 +4598,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 		writetime = pstrdup(timestamptz_to_str(dbentry->stats_timestamp));
 		mytime = pstrdup(timestamptz_to_str(cur_ts));
 		elog(LOG,
-			 "stats_timestamp %s is later than collector's time %s for db %d",
+			 "stats_timestamp %s is later than collector's time %s for db %u",
 			 writetime, mytime, dbentry->databaseid);
 		pfree(writetime);
 		pfree(mytime);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 6e75398..cd132c1 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2458,7 +2458,7 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
 	Pointer		chunk;
 	TupleDesc	desc = RelationGetDescr(relation);
 	Oid			chunk_id;
-	Oid			chunk_seq;
+	int32		chunk_seq;
 
 	if (txn->toast_hash == NULL)
 		ReorderBufferToastInitHash(rb, txn);
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 0a9ac02..a59508f 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -821,7 +821,7 @@ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
 					  "        pg_class.relname "
 					  "  FROM pg_class "
  

Re: [HACKERS] Commitfest problems

2014-12-11 Thread Josh Berkus
On 12/11/2014 09:02 AM, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
 While the commitfest process hasn't changed much and was very successful
 in the first few years, a few things have changed externally:

 1  more new developers involved in contributing small patches
 2  more full-time developers creating big patches
 3  increased time demands on experienced Postgres developers
 
 The number of patches registered in the commit fests hasn't actually
 changed over the years.  It has always fluctuated between 50 and 100,
 depending on the point of the release cycle.  So I don't think (1) is
 necessarily the problem.

I don't think that's accurate.  The number of patches per CF *has* edged
upwards by 10-30% per CF over the years:

http://www.databasesoup.com/2013/08/94-commitfest-1-wrap-up.html

(I haven't done the rest of 9.4 yet or 9.5)

No, it's not a jump up by 2X, but it is an upwards trend.  And I think
that Tom has it right that the additional patches we're seeing are
additional large, complex patches.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 1:06 PM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, Dec 11, 2014 at 7:37 AM, Robert Haas robertmh...@gmail.com wrote:
 2. It's not clear that we're going to have a particularly-impressive
 list of major features for 9.5.  So far we've got RLS and BRIN. I
 expect that GROUPING SETS is far enough along that it should be
 possible to get it in before development ends, and there are a few
 performance patches pending (Andres's lwlock scalability patches,
 Rahila's work on compressing full-page writes) that I think will
 probably make the grade.  But after that it seems to me that it gets
 pretty thin on the ground.

 I'm slightly surprised that you didn't mention abbreviated keys in
 that list of performance features. You're reviewing that patch; how do
 you feel about it now?

I'm not sure it's where I think it needs to be yet, but yeah, I think
that will get in.  I thought of it after I hit send.

 Are we going to bill commit timestamp
 tracking - with replication node ID tracking as the real goal, despite
 the name - as a major feature, or DDL deparsing if that goes in, as
 major features?  As useful as they may be for BDR, they don't strike
 me as things we can publicize as major features independent of BDR.
 And it's getting awfully late for any other major work that people are
 thinking of to start showing up.

 Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August -
 when development launched. It still doesn't have a reviewer, and it
 isn't actually in evidence that someone else has so much as downloaded
 and applied the patch (I'm sure someone has done that much, but the
 fact is that all the feedback that I've received this year concerns
 the semantics/syntax, which you can form an opinion on by just looking
 at the extensive documentation and other supplementary material I've
 written). It's consistently one of the most requested features, and
 yet major aspects of the design, that permeate through every major
 subsystem go unremarked on for months now. This feature is
 *definitely* major feature list material, since people have been
 loudly requesting it for over a decade, and yet no one mentions it in
 this thread (only Bruce mentioned it in the other thread about the
 effectiveness of the CF process). It's definitely in the top 2 or 3
 most requested features, alongside much harder problems like parallel
 query and comprehensive partitioning support.

I'm not sure whether that patch is going to go anywhere.  There has
been so much arguing about different aspects of the design that felt
rather unproductive that I think most of the people who would be
qualified to commit it have grown a bit gun-shy.  That would be a good
problem to get fixed, but I don't have a proposal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Josh Berkus
On 12/11/2014 09:22 AM, Heikki Linnakangas wrote:

 I imagine that it's the same for everyone else. Many of the patches that
 sit in the commitfest for weeks are patches that no-one really cares
 much about. I'm not sure what to do about that. It would be harsh to
 reject a patch just because no-one's gotten around to reviewing it, and
 if we start doing that, it's not clear what the point of a commitfest is
 any more.

So the "nobody cares" argument is manifestly based on false assumptions.
 Are you contending that nobody cares about UPSERT or GROUPING SETS?  In
my experience, the patches that sit for weeks on the CF fall into 5 groups:

1. Large complicated patches that only a handful of committers are even
capable of reviewing.

2. Obscure patches which require specialized knowledge or outside input
to review.

3. Inconsequential patches where the submitter doesn't work the social
process.

4. Patches with contentious spec or syntax.

5. Patches which everyone wants but have persistent hard-to-resolve issues.

Of these only (3) would fit into "nobody cares", and that's a pretty
small group.

There's also a chicken-and-egg problem there.  Say that we started not
reviewing DW features because nobody cares.  Then the people who
contributed those features don't go on to become major contributors,
which means they won't review further DW patches.  Which means that
we've just closed off an entire use-case for PostgreSQL.  I'd think that
PostGIS would have taught us the value of the "nobody cares" fallacy.

Also, if we go back on the promise that every patch gets a review,
then we're definitely headed towards no more new contributors.  As Noah
said at one developer meeting (to paraphrase), one of the few things
which keeps contributors persisting through our baroque,
poorly-documented, excessively political contribution process is the
promise that they'll get a fair shake for their invested time.  If we
drop that promise, we'll solve our workflow problem by cutting off the
flow of new patches entirely.

 Perhaps we should change the process so that it is the patch author's
 responsibility to find a reviewer, and a committer, for the patch. If
 you can't cajole anyone to review your patch, it's a sign that no-one
 cares about it, and the patch is rejected by default. Or put a quick
 +1/-1 voting option to each patch in the commitfest, to get a quick
 gauge of how the committers feel about it.

Again, that process would favor existing contributors and other folks
who know how to work the Postgres community.  It would be effectively
the same as hanging up a sign which says "no new contributors wanted".
It would also be dramatically increasing the amount of politics around
submitted patches, which take up far more time than the technical work.

Overall, we're experiencing this issue because of a few predictable reasons:

1. a gradual increase in the number of submitted patches, especially
large patches

2. Tom Lane cutting back on the number of other people's patches he
reviews and revises.

3. Other major committers getting busier with their day jobs.

4. Failure to recruit, mentor and promote new committers at a rate
proportional to the number of new contributors or the size of our community.

5. Failure to adopt or develop automated systems to remove some of the
grunt work from patch submission and review.

6. Failure to adhere to any uniform standards of patch handling for the
Commitfests.

(2) out of these is actually the biggest thing we're seeing right now, I
think.   Tom was historically a one-man-patch-fixing machine, at one
time responsible for 70% of the patches committed to PostgreSQL.  This
was never a good thing for the community, even if it was a good thing
for the code base, becuase it discouraged others from stepping into a
senior committer role.  Now Tom has cut back, and others have to take up
the slack.  In the long run this will be good for our development
community; in the short run it's kind of painful.

I will also point out that some of the existing senior committers, who
are the heaviest burdened under the existing system, have also been the
most resistant to any change in the system.  You (Heikki) yourself
expressed a strong opposition to any attempt to recruit more reviewers.
So, given that you are inside the heart of the problem, do you have a
solution other than your proposal above that we simply stop accepting
new contributors?  Or is that your solution?  It would work, for some
definition of work.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Josh Berkus
On 12/11/2014 08:59 AM, Tom Lane wrote:
 More abstractly, there's a lot of value in having a predictable release
 schedule.  That's going to mean that some release cycles are thin on
 user-visible features, even if just as much work went into them.  It's
 the nature of the game.

+ 1,000,000 from me.  ;-)

Frankly, BRIN, UPSERT and a couple other things are plenty for a
Postgres release.  Other SQL databases would be thrilled to have that
much ... can you name 3 major advances in the last MySQL release?

And given that I've seen nothing about jsquery/VODKA since pgCon, I'm
expecting them for 9.6/whatever, not 9.5.  There's a whole longish
syntax discussion we haven't even started yet, let alone actual
technical review.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Commitfest problems

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 1:26 PM, Josh Berkus j...@agliodbs.com wrote:
 No, it's not a jump up by 2X, but it is an upwards trend.  And I think
 that Tom has it right that the additional patches we're seeing are
 additional large, complex patches.

I feel like there's increasingly not that much easy stuff left to do.
Many of the problems we haven't solved yet are unsolved because they
are really hard, or because they're not that important, or both.  For example,
if you look at the current CommitFest, there are things like:

Add ssl_protocols configuration option
alter user/role CURRENT_USER
Fix local_preload_libraries to work as expected.
Refactoring code for synchronous node detection
Refactor of functions related to quoting from builtins.h to utils/quote.h

Those things are all, obviously, important to somebody, or there
wouldn't be patches for them in need of review.  But in the broader
scheme of things, they are very minor issues.  And then you have
things like:

deparse DDL in event triggers
Compression of Full Page Writes
WIP: dynahash replacement for buffer table
Foreign table inheritance
Grouping Sets
INSERT ... ON CONFLICT {UPDATE | IGNORE}

Pretty much everything on that list is something we've been wrestling
with as a project for at least half a decade.  I remember people
complaining about DDL triggers when I first got involved in the
PostgreSQL community, and the initial patch for foreign tables had
support for inheritance and constraints which I ripped out before
committing as too half-baked.  Pavel had a GROUPING SETS patch by
2009.  Brainstorming solutions to the full_page_writes problem has
been a perennial PGCon after-hours activity for as long as I've been
going.  So it's natural that, to the extent these patches are making
progress, they're doing so slowly.  It's hard stuff to get right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 12:29 PM, Kevin Grittner kgri...@ymail.com wrote:
 Robert Haas robertmh...@gmail.com wrote:
 On Sat, Dec 6, 2014 at 10:08 PM, Tomas Vondra t...@fuzzy.cz wrote:
select a.i, b.i from a join b on (a.i = b.i);

 I think the concern is that the inner side might be something more
 elaborate than a plain table scan, like an aggregate or join.  I might
 be all wet, but my impression is that you can make rescanning
 arbitrarily expensive if you work at it.

 I'm not sure I'm following.  Let's use a function to select from b:

 create or replace function fb()
   returns setof b
   language plpgsql
   rows 1
 as $$
 begin
   return query select i from b;
 end;
 $$;

 explain (analyze, buffers, verbose)
   select a.i, b.i from a join fb() b on (a.i = b.i);

 I used the low row estimate to cause the planner to put this on the inner 
 side.

 16 batches
 Execution time: 1638.582 ms

 Now let's make it slow.

 create or replace function fb()
   returns setof b
   language plpgsql
   rows 1
 as $$
 begin
   perform pg_sleep(2.0);
   return query select i from b;
 end;
 $$;
 explain (analyze, buffers, verbose)
   select a.i, b.i from a join fb() b on (a.i = b.i);

 16 batches
 Execution time: 3633.859 ms

 Under what conditions do you see the inner side get loaded into the
 hash table multiple times?

Huh, interesting.  I guess I was thinking that the inner side got
rescanned for each new batch, but I guess that's not what happens.

Maybe there's no real problem here, and we just win.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Commitfest problems

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 11:49:43AM -0500, Peter Eisentraut wrote:
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
  While the commitfest process hasn't changed much and was very successful
  in the first few years, a few things have changed externally:
  
  1  more new developers involved in contributing small patches
  2  more full-time developers creating big patches
  3  increased time demands on experienced Postgres developers
 
 The number of patches registered in the commit fests hasn't actually
 changed over the years.  It has always fluctuated between 50 and 100,
 depending on the point of the release cycle.  So I don't think (1) is
 necessarily the problem.

Yes, I was hoping someone would produce some numbers --- thanks.  I
think the big question is whether the level of complexity of the patches
has increased, or whether it is just the amount of experienced developer
time (3) that has decreased, or something else.  Or maybe things have
not materially changed at all over the years.

I am following this thought process:

1.  Do we have a problem?
2.  What is the cause of the problem?
3.  How do we fix the problem?

I think to get a useful outcome we have to process things in that order.
So, is #1 true, and if so, what is the answer to #2?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-11 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom I've not spent any real effort looking at gsp2.patch yet, but it
  Tom seems even worse off comment-wise: if there's any explanation in
  Tom there at all of what a chained aggregate is, I didn't find it.

 (Maybe stacked would have been a better term.)

 What that code does is produce plans that look like this:

   GroupAggregate
   - Sort
  - ChainAggregate
 - Sort
- ChainAggregate

 in much the same way that WindowAgg nodes are generated.

That seems pretty messy, especially given your further comments that these
plan nodes are interconnected and know about each other (though you failed
to say exactly how).  The claimed analogy to WindowAgg therefore seems
bogus since stacked WindowAggs are independent, AFAIR anyway.  I'm also
wondering about performance: doesn't this imply more rows passing through
some of the plan steps than really necessary?

Also, how would this extend to preferring hashed aggregation in some of
the grouping steps?

ISTM that maybe what we should do is take a cue from the SQL spec, which
defines these things in terms of UNION ALL of plain-GROUP-BY operations
reading from a common CTE.  Abstractly, that is, we'd have

Append
  - GroupAggregate
- Sort
  - source data
  - GroupAggregate
- Sort
  - source data
  - GroupAggregate
- Sort
  - source data
  ...

(or some of the arms could be HashAgg without a sort).  Then the question
is what exactly the aggregates are reading from.  We could do worse than
make it a straight CTE, I suppose.
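
As a concrete sketch of that spec-style rewrite (table and column names
here are hypothetical), GROUP BY GROUPING SETS ((region), (product)) over
a table sales(region, product, amount) would behave like:

WITH src AS (
    SELECT region, product, amount FROM sales
)
SELECT region, NULL AS product, sum(amount) FROM src GROUP BY region
UNION ALL
SELECT NULL AS region, product, sum(amount) FROM src GROUP BY product;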

  Tom I'd also counsel you to find some other way to do it than
  Tom putting bool chain_head fields in Aggref nodes;

 There are no chain_head fields in Aggref nodes.

Oh, I mistook struct Agg for struct Aggref.  (That's another pretty
poorly chosen struct name, though I suppose it's far too late to change
that choice.)  Still, interconnecting plan nodes that aren't adjacent in
the plan tree doesn't sound like a great idea to me.

  Tom I took a quick look at gsp-u.patch.  It seems like that approach
  Tom should work, with of course the caveat that using CUBE/ROLLUP as
  Tom function names in a GROUP BY list would be problematic.  I'm not
  Tom convinced by the commentary in ruleutils.c suggesting that extra
  Tom parentheses would help disambiguate: aren't extra parentheses
  Tom still going to contain grouping specs according to the standard?

 The extra parens do actually disambiguate because CUBE(x) and
 (CUBE(x)) are not equivalent anywhere; while CUBE(x) can appear inside
 GROUPING SETS (...), it cannot appear inside a (...) list nested inside
 a GROUPING SETS list (or anywhere else).

Maybe, but this seems very fragile and non-future-proof.  I think
double-quoting or schema-qualifying such function names would be safer
when you think about the use-case of dumping views that may get loaded
into future Postgres versions.

 The question that needs deciding here is less whether the approach
 _could_ work but whether we _want_ it. The objection has been made
 that we are in effect introducing a new category of unreserved almost
 everywhere keyword, which I think has a point;

True, but I think that ship has already sailed.  We already have similar
behavior for PARTITION, RANGE, and ROWS (see the opt_existing_window_name
production), and I think PRECEDING, FOLLOWING, and UNBOUNDED are
effectively reserved-in-certain-very-specific-contexts as well.  And there
are similar behaviors in plpgsql's parser.

 on the other hand,
 reserving CUBE is a seriously painful prospect.

Precisely.  I think renaming or getting rid of contrib/cube would have
to be something done in a staged fashion over multiple release cycles.
Waiting several years to get GROUPING SETS doesn't seem appealing at all
compared to this alternative.

regards, tom lane




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Heikki Linnakangas

On 12/11/2014 08:51 PM, Josh Berkus wrote:

On 12/11/2014 09:22 AM, Heikki Linnakangas wrote:


I imagine that it's the same for everyone else. Many of the patches that
sit in the commitfest for weeks are patches that no-one really cares
much about. I'm not sure what to do about that. It would be harsh to
reject a patch just because no-one's gotten around to reviewing it, and
if we start doing that, it's not clear what the point of a commitfest is
any more.


So the "nobody cares" argument is manifestly based on false assumptions.
  Are you contending that nobody cares about UPSERT or GROUPING SETS?


No. I was thinking of patches like "Add IF NOT EXISTS to CREATE TABLE AS 
and CREATE MATERIALIZED VIEW", "event triggers: more DROP info", "Point 
to polygon distance operator" and "pgcrypto: support PGP signatures". 
And "nobody" was an exaggeration; clearly *someone* cares about those 
things, or they wouldn't have written a patch in the first place. But 
none of the committers care enough to pick them up. Even in the case 
that someone reviews them, it's often not because the reviewer is 
interested in the patch, it's just to help out with the process.


(Apologies to the authors of those patches; they were just the first few 
that caught my eye)



There's also a chicken-and-egg problem there.  Say that we started not
reviewing DW features because nobody cares.  Then the people who
contributed those features don't go on to become major contributors,
which means they won't review further DW patches.  Which means that
we've just closed off an entire use-case for PostgreSQL.  I'd think that
PostGIS would have taught us the value of the "nobody cares" fallacy.

Also, if we go back on the promise that every patch gets a review,
then we're definitely headed towards no more new contributors.  As Noah
said at one developer meeting (to paraphrase), one of the few things
which keeps contributors persisting through our baroque,
poorly-documented, excessively political contribution process is the
promise that they'll get a fair shake for their invested time.  If we
drop that promise, we'll solve our workflow problem by cutting off the
flow of new patches entirely.


Yeah, there is that.


Perhaps we should change the process so that it is the patch author's
responsibility to find a reviewer, and a committer, for the patch. If
you can't cajole anyone to review your patch, it's a sign that no-one
cares about it, and the patch is rejected by default. Or put a quick
+1/-1 voting option to each patch in the commitfest, to get a quick
gauge of how the committers feel about it.


Again, that process would favor existing contributors and other folks
who know how to work the Postgres community.  It would be effectively
the same as hanging up a sign which says no new contributors wanted.
It would also be dramatically increasing the amount of politics around
submitted patches, which take up far more time than the technical work.


I was thinking that by getting candid feedback that none of the existing 
contributors are interested in looking at your patch, the author could 
revise the patch to garner more interest, or perhaps promise to review 
someone else's patch in return. Right now, the patches just linger, and 
the author doesn't know why, what's going to happen or what to do about it.



I will also point out that some of the existing senior committers, who
are the heaviest burdened under the existing system, have also been the
most resistant to any change in the system.  You (Heikki) yourself
expressed a strong opposition to any attempt to recruit more reviewers.


Huh, I did? Can you elaborate?


So, given that you are inside the heart of the problem, do you have a
solution other than your proposal above that we simply stop accepting
new contributors?  Or is that your solution?  It would work, for some
definition of work.


I don't have any good solutions, I'm afraid. It might help to ping the 
reviewers who have signed up more often, to make the reviews happen 
more quickly. I did nudge some people in the August commitfest, but I 
felt that it didn't actually achieve much. The most efficient way to 
move the commitfest forward was to just review more patches myself.


That's one thought. Robert said the same thing about when he was the 
commitfest manager; he just reviewed most of the patches himself in the 
end. And you mentioned that Tom used to review 70% of all incoming 
patches. How about we make that official? It's the commitfest manager's 
duty to review all the patches. He can recruit others if he can, but at 
the end of the day he's expected to take a look at every patch, and 
commit or return with some feedback.


The problem with that is that we'll have a hard time finding volunteers 
for that. But we only need to find one sucker for each commitfest. I can 
volunteer to do that once a year; if the other active committers do the 
same, we're covered.


- Heikki




Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

2014-12-11 Thread Tomas Vondra
On 11.12.2014 20:00, Robert Haas wrote:
 On Thu, Dec 11, 2014 at 12:29 PM, Kevin Grittner kgri...@ymail.com wrote:

 Under what conditions do you see the inner side get loaded into the
 hash table multiple times?
 
 Huh, interesting.  I guess I was thinking that the inner side got
 rescanned for each new batch, but I guess that's not what happens.

No, it's not rescanned. It's scanned only once (for batch #0), and
tuples belonging to the other batches are stored in files. If the number
of batches needs to be increased (e.g. because of incorrect estimate of
the inner table), the tuples are moved later.

 
 Maybe there's no real problem here, and we just win.

I'm a bit confused by this discussion, because the inner relation has
nothing to do with this patch. It gets scanned exactly once, no matter
what the load factor is. If batching is necessary, only the already-written
batch files are read (without re-executing the inner part). However in that
case this patch makes no difference, because it explicitly reverts to
load factor = NTUP_PER_BUCKET (which is 1).

The only point of this patch was to prevent batching because of the
outer table. Usually, the outer table is much larger than the inner one
(e.g. in a star schema, outer = fact table, inner = dimension). Batching
the outer table means you have to write >= 50% into a temporary file.

The idea was that if we could increase the load a bit (e.g. using 2
tuples per bucket instead of 1), we will still use a single batch in
some cases (when we miss the work_mem threshold by just a bit). The
lookups will be slower, but we'll save the I/O.
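
As a rough sketch of how that shows up in practice (hypothetical star
schema, with dim as the inner/hashed side and fact as the much larger
outer side):

SET work_mem = '4MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT f.amount, d.name
  FROM fact f
  JOIN dim d ON d.id = f.dim_id;

-- the Hash node reports something like
--   Buckets: 131072  Batches: 2  Memory Usage: 4104kB
-- the idea above is that a plan missing work_mem by just a bit could stay
-- at Batches: 1 with a load factor of 2, trading slightly slower lookups
-- for skipping the temporary files on the outer side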

regards
Tomas





Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Alvaro Herrera
Heikki Linnakangas wrote:

 That's one thought. Robert said the same thing about when he was the
 commitfest manager; he just reviewed most the patches himself in the end.
 And you mentioned that Tom used to review 70% of all incoming patches. How
 about we make that official? It's the commitfest manager's duty to review
 all the patches. He can recruit others if he can, but at the end of the day
 he's expected to take a look at every patch, and commit or return with some
 feedback.
 
 The problem with that is that we'll have a hard time to find volunteers for
 that. But we only need to find one sucker for each commitfest. I can
 volunteer to do that once a year; if the other active committers do the
 same, we're covered.

Hm, I was commitfest manager once, too, and this exact same thing
happened.  Maybe we need to make that official, and give the sucker some
perk.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Peter Geoghegan
On Thu, Dec 11, 2014 at 11:52 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 The problem with that is that we'll have a hard time to find volunteers for
 that. But we only need to find one sucker for each commitfest. I can
 volunteer to do that once a year; if the other active committers do the
 same, we're covered.

 Hm, I was commitfest manager once, too, and this exact same thing
 happened.  Maybe we need to make that official, and give the sucker some
 perk.

I should do it at least once, if only because I haven't experienced it.
-- 
Peter Geoghegan




Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Peter Geoghegan
On Thu, Dec 11, 2014 at 10:43 AM, Robert Haas robertmh...@gmail.com wrote:
 Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August -
 when development launched. It still doesn't have a reviewer, and it
 isn't actually in evidence that someone else has so much as downloaded
 and applied the patch

 I'm not sure whether that patch is going to go anywhere.  There has
 been so much arguing about different aspects of the design that felt
 rather unproductive that I think most of the people who would be
 qualified to commit it have grown a bit gun-shy.  That would be a good
 problem to get fixed, but I don't have a proposal.

Really?

I have acted comprehensively on 100% of feedback to date on the
semantics/syntax of the ON CONFLICT UPDATE patch. The record reflects
that. I don't believe that the problem is that people are gun shy. We
haven't even had a real disagreement since last year, and last year's
discussion of value locking was genuinely very useful (I hate to say
it, but the foreign key locking patch might have considered the
possibility of unprincipled deadlocks more closely, as we saw
recently).

Lots of other systems have a comparable feature. Most recently, VoltDB
added a custom UPSERT, even though they don't have half the SQL
features we do. It's a complicated feature, but it's not a
particularly complicated feature as big, impactful features go (like
LATERAL, or logical decoding, or the foreign key locking stuff). It's
entirely possible to get the feature in in the next few months, if
someone will work with me on it.

Even Heikki, who worked on this with me far more than anyone else,
found the value locking page a useful summary. He didn't even know
that there was a third design advocated by Simon and Andres until he
saw it! The discussion was difficult, but had a useful outcome,
because accounting for unprincipled deadlocks is a major source of
complexity. I have seen a lot of benefit from comprehensively
documenting stuff in one place (the wiki pages), but that has tapered
off.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: multivariate statistics / proof of concept

2014-12-11 Thread Tomas Vondra
On 11.12.2014 17:53, Heikki Linnakangas wrote:
 On 10/13/2014 01:00 AM, Tomas Vondra wrote:
 Hi,

 attached is a WIP patch implementing multivariate statistics.
 
 Great! Really glad to see you working on this.
 
 + * FIXME This sample sizing is mostly OK when computing stats for
 + *   individual columns, but when computing multivariate stats
 + *   (histograms, mcv, ...) it's rather
 + *   insufficient. For small number of dimensions it works, but
 + *   for complex stats it'd be nice to use a sample proportional to
 + *   the table (say, 0.5% - 1%) instead of a fixed size.
 
 I don't think a fraction of the table is appropriate. As long as the 
 sample is random, the accuracy of a sample doesn't depend much on
 the size of the population. For example, if you sample 1,000 rows
 from a table with 100,000 rows, or 1000 rows from a table with
 100,000,000 rows, the accuracy is pretty much the same. That doesn't
 change when you go from a single variable to multiple variables.

I might be wrong, but I doubt that. First, I read a number of papers
while working on this patch, and all of them used samples proportional
to the data set. That's indirect evidence, though.

 You do need a bigger sample with multiple variables, however. My gut 
 feeling is that if you sample N rows for a single variable, with two 
 variables you need to sample N^2 rows to get the same accuracy. But
 it's not proportional to the table size. (I have no proof for that,
 but I'm sure there is literature on this.)

Maybe. I think it's somehow related to the number of buckets (which
somehow determines the precision of the histogram). If you want 1000
buckets, the number of rows scanned needs to be e.g. 10x that. With
multi-variate histograms, we may shoot for more buckets (say, 100 in
each dimension).

 
 + * Multivariate histograms
 + *
 + * Histograms are a collection of buckets, represented by n-dimensional
 + * rectangles. Each rectangle is delimited by an array of lower and
 + * upper boundaries, so that for the i-th attribute
 + *
 + * min[i] <= value[i] <= max[i]
 + *
 + * Each bucket tracks frequency (fraction of tuples it contains),
 + * information about the inequalities, number of distinct values in
 + * each dimension (which is used when building the histogram) etc.
 + *
 + * The boundaries may be either inclusive or exclusive, or the whole
 + * dimension may be NULL.
 + *
 + * The buckets may overlap (assuming the build algorithm keeps the
 + * frequencies additive) or may not cover the whole space (i.e. allow
 + * gaps). This entirely depends on the algorithm used to build the
 + * histogram.
 
 That sounds pretty exotic. These buckets are quite different from
 the single-dimension buckets we currently have.
 
 The paper you reference in partition_bucket() function, M. 
 Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating 
 Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 
 1988: 28-36, actually doesn't mention overlapping buckets at all. I 
 haven't read the code in detail, but if it implements the algorithm
 from that paper, there will be no overlap.

The algorithm implemented in partition_bucket() is very simple and
naive, and it mostly resembles the algorithm described in the paper. I'm
sure there are differences, it's not a 1:1 implementation, but you're
right it produces non-overlapping buckets.

The point is that I envision more complex algorithms or different
histogram types, and some of them may produce overlapping buckets. Maybe
that's a premature comment, and it will turn out it's not really necessary.
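
To make the bucket layout concrete, here is a minimal self-contained sketch
(not code from the patch - the struct, the field names and the use of plain
doubles instead of Datums and per-type comparators are all invented for
illustration) of one n-dimensional bucket and a containment test matching
the comment quoted above:

    #include <stdbool.h>
    #include <stdio.h>

    #define NDIMS 2     /* two attributes, just for the example */

    /* one n-dimensional bucket: per-dimension boundaries,
     * inclusive/exclusive flags and a frequency */
    typedef struct BucketSketch
    {
        double  min[NDIMS];
        double  max[NDIMS];
        bool    min_inclusive[NDIMS];
        bool    max_inclusive[NDIMS];
        double  frequency;      /* fraction of tuples in this bucket */
    } BucketSketch;

    static bool
    bucket_contains(const BucketSketch *b, const double point[NDIMS])
    {
        for (int i = 0; i < NDIMS; i++)
        {
            if (point[i] < b->min[i] ||
                (point[i] == b->min[i] && !b->min_inclusive[i]))
                return false;
            if (point[i] > b->max[i] ||
                (point[i] == b->max[i] && !b->max_inclusive[i]))
                return false;
        }
        return true;
    }

    int
    main(void)
    {
        BucketSketch b = {
            .min = {0.0, 10.0}, .max = {5.0, 20.0},
            .min_inclusive = {true, true}, .max_inclusive = {false, true},
            .frequency = 0.25
        };
        double point[NDIMS] = {2.5, 20.0};

        printf("contained: %s, frequency: %.2f\n",
               bucket_contains(&b, point) ? "yes" : "no", b.frequency);
        return 0;
    }

Whether buckets overlap or leave gaps only changes how they are built and
how their frequencies are combined; the per-bucket shape stays the same.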

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Alvaro Herrera
Peter Geoghegan wrote:
 On Thu, Dec 11, 2014 at 11:52 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  The problem with that is that we'll have a hard time finding volunteers for
  that. But we only need to find one sucker for each commitfest. I can
  volunteer to do that once a year; if the other active committers do the
  same, we're covered.
 
  Hm, I was commitfest manager once, too, and this exact same thing
  happened.  Maybe we need to make that official, and give the sucker some
  perk.
 
 I should do it at least once, if only because I haven't experienced it.

We could make an slogan out of it: you've never experienced the
PostgreSQL development process if you haven't been a CFM at least once
in your life.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Tom Lane
Mark Dilger hornschnor...@gmail.com writes:
 The attached patch changes the type of chunk_seq to int32,
 rather than changing the %d formatting.  The other changes
 are the same as in the previous patch.

BTW, how are you finding these?  If it's some automated tool, what?
(If you're finding them by hand, I'm in awe of your scan rate.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Memory-bounded HashAgg

2014-12-11 Thread Tomas Vondra
On 11.12.2014 11:46, Jeff Davis wrote:

 New patch attached. All open items are complete, though the patch may
 have a few rough edges.
 
 Summary of changes:
 
  * rebased on top of latest memory accounting patch
 http://www.postgresql.org/message-id/1417497257.5584.5.camel@jeff-desktop
  * added a flag to hash_create to prevent it from creating an extra
 level of memory context
- without this, the memory accounting would have a measurable impact
 on performance
  * cost model for the disk usage
  * intelligently choose the number of partitions for each pass of the
 data
  * explain support
  * in build_hash_table(), be more intelligent about the value of
 nbuckets to pass to BuildTupleHashTable()
- BuildTupleHashTable tries to choose a value to keep the table in
 work_mem, but it isn't very accurate.
  * some very rudimentary testing (sanity checks, really) shows good
 results

I plan to look into this over the holidays, hopefully.

 Summary of previous discussion (my summary; I may have missed some
 points):
 
 Tom Lane requested that the patch also handle the case where transition
 values grow (e.g. array_agg) beyond work_mem. I feel this patch provides
 a lot of benefit as it is, and trying to handle that case would be a lot
 more work (we need a way to write the transition values out to disk at a
 minimum, and perhaps combine them with other transition values). I also
 don't think my patch would interfere with a fix there in the future.
 
 Tomas Vondra suggested an alternative design that more closely resembles
 HashJoin: instead of filling up the hash table and then spilling any new
 groups, the idea would be to split the current data into two partitions,
 keep one in the hash table, and spill the other (see
 ExecHashIncreaseNumBatches()). This has the advantage that it's very
 fast to identify whether the tuple is part of the in-memory batch or
 not; and we can avoid even looking in the memory hashtable if not.
 
 The batch-splitting approach has a major downside, however: you are
 likely to evict a skew value from the in-memory batch, which will result
 in all subsequent tuples with that skew value going to disk. My approach
 never evicts from the in-memory table until we actually finalize the
 groups, so the skew values are likely to be completely processed in the
 first pass.

I don't think that's the main issue - there are probably ways to work
around that (e.g. by keeping a skew hash table for those frequent
values, similarly to what hash join does).

The main problem IMHO is that it requires writing the transition values
to disk, which we don't know how to do in many cases (esp. in the interesting
ones, where the transition values grow).

 So, the attached patch implements my original approach, which I still
 feel is the best solution.

I think this is a reasonable approach - it's true it does not handle the
case with growing aggregate state (e.g. array_agg), so it really fixes
just the case when we underestimate the number of groups.

But I believe we need this approach anyway, because we'll never know how
to write all the various transition values (e.g. think of custom
aggregates), and this is an improvement.

We can build on this and add the more elaborate hashjoin-like approach
in the future.
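
To make the control flow concrete, here is a deliberately tiny, self-contained
toy of that spill-new-groups strategy (every name is invented; this is not the
patch's code): groups already in memory keep being advanced, and once the
memory budget is full, rows for unseen groups are set aside for a later pass
instead of evicting anything.

    #include <stdio.h>

    #define MAX_GROUPS 2        /* stand-in for the work_mem budget */
    #define MAX_SPILL  16

    typedef struct { int key; long sum; } Group;
    typedef struct { int key; int value; } Row;

    static Group groups[MAX_GROUPS];
    static int   ngroups;
    static Row   spill[MAX_SPILL];  /* stands in for a partition file */
    static int   nspilled;

    static void
    consume(Row r)
    {
        for (int i = 0; i < ngroups; i++)
            if (groups[i].key == r.key)
            {
                groups[i].sum += r.value;   /* advance existing group */
                return;
            }

        if (ngroups < MAX_GROUPS)           /* room for a new group */
        {
            groups[ngroups].key = r.key;
            groups[ngroups].sum = r.value;
            ngroups++;
        }
        else                                /* full: spill, never evict */
            spill[nspilled++] = r;
    }

    int
    main(void)
    {
        Row input[] = {{1, 10}, {2, 20}, {1, 5}, {3, 7}, {2, 1}, {3, 2}};

        for (size_t i = 0; i < sizeof(input) / sizeof(input[0]); i++)
            consume(input[i]);

        for (int i = 0; i < ngroups; i++)   /* finalize the in-memory pass */
            printf("key %d -> sum %ld\n", groups[i].key, groups[i].sum);
        printf("%d row(s) spilled for the next pass\n", nspilled);
        return 0;
    }

The skew-friendliness mentioned in the summary falls out directly: a frequent
key that made it into memory early keeps being aggregated there.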

regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Peter Eisentraut
On 12/11/14 1:56 PM, Robert Haas wrote:
 I feel like there's increasingly not that much easy stuff left to do.
 Many of the problems we haven't solved yet are unsolved because they
 are really hard, or because they're not that important, or both.

You know, I was telling people the very same thing in 2004.

Yet here we are.

Maybe we just need to relax and accept that we're awesome.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 11:59:58AM -0500, Robert Haas wrote:
 The problem is that, on the one hand, we have a number of serious
 problems with things that got committed and turned out to have
 problems - the multixact stuff, and JSONB, in particular - and on the
 other hand, we are lacking in adequate committer bandwidth to properly
 handle all of the new patches that come in.  We can fix the first
 problem by tightening up on the requirements for committing things,
 but that exacerbates the second problem.  Or we can fix the second
 problem by loosening up on the requirements for commit, but that
 exacerbates the first problem.  Promoting more or fewer committers is
 really the same trade-off: if you're very careful about who you
 promote, you'll get better people but not as many of them, so less
 will get done but with fewer mistakes; if you're more generous in
 handing out commit bits, you reduce the bottleneck to stuff getting
 done but, inevitably, you'll be trusting people in whom you have at
 least slightly less confidence.  There's an inherent tension between
 quality and rate of progress that we can't get rid of, and the fact
 that some of our best people are busier than ever with things other
 than PostgreSQL hacking is not helping - not only because less actual
 review/commit happens, but because newcomers to the community don't
 have as much contact with the more senior people who could help mentor
 them if they only had the time.

Great outline of the tradeoffs involved in being more aggressive about
committing. I do think the multixact and JSONB problems have spooked
some of us into being slower/more careful about committing.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Alvaro Herrera
Tom Lane wrote:
 Mark Dilger hornschnor...@gmail.com writes:
  The attached patch changes the type of chunk_seq to int32,
  rather than changing the %d formatting.  The other changes
  are the same as in the previous patch.
 
 BTW, how are you finding these?  If it's some automated tool, what?
 (If you're finding them by hand, I'm in awe of your scan rate.)

BTW if it's an automated tool, it might also improve by examining the
DatumGetFoo() macros.  In the reorderbuffer.c code just fixed you can
see that chunk_seq is obtained by DatumGetInt32(), which was wrong to
assign to a value of type Oid.
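
To illustrate the pattern such a tool would look for, here is a tiny
self-contained sketch (the typedefs and macros below are simplified
stand-ins, kept here only so the snippet compiles on its own; the real
definitions live in the PostgreSQL headers):

    #include <stdint.h>
    #include <stdio.h>

    typedef uintptr_t Datum;            /* stand-in */
    typedef uint32_t Oid;               /* stand-in */
    #define DatumGetInt32(d)    ((int32_t) (d))
    #define DatumGetObjectId(d) ((Oid) (d))

    int
    main(void)
    {
        Datum d = (Datum) 3000000000u;  /* an OID above INT32_MAX */

        /* the mismatch worth flagging: an int32 extraction where an Oid
         * extraction was meant; typically prints a negative number */
        printf("as int32: %d\n", DatumGetInt32(d));
        printf("as Oid:   %u\n", DatumGetObjectId(d));
        return 0;
    }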

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Peter Eisentraut
On 12/11/14 1:35 AM, Bruce Momjian wrote:
 I have heard repeated concerns about the commitfest process in the past
 few months.  The fact we have been in a continual commitfest since
 August also is concerning.

I realized the other day, I'm embracing the idea of a continual commitfest.

I'm still working on patches from the last commitfest.  Why not?  They
didn't expire.  Sure, it would have been nicer to get them done sooner,
but what are you gonna do?  The fact that Nov 15 < now < Dec 15 isn't
going to change the fact that I have a few hours to spare right now and
the patches are still relevant.

As far as I'm concerned, we might as well just have one commitfest per
major release.  Call it a patch list.  Make the list sortable by created
date and last-updated date, and let the system police itself.  At least
that's honest.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Jeff Janes
On Thu, Dec 11, 2014 at 11:40 AM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 On 12/11/2014 08:51 PM, Josh Berkus wrote:

 On 12/11/2014 09:22 AM, Heikki Linnakangas wrote:


  Perhaps we should change the process so that it is the patch author's
 responsibility to find a reviewer, and a committer, for the patch. If
 you can't cajole anyone to review your patch, it's a sign that no-one
 cares about it, and the patch is rejected by default. Or put a quick
 +1/-1 voting option to each patch in the commitfest, to get a quick
 gauge of how the committers feel about it.


 Again, that process would favor existing contributors and other folks
 who know how to work the Postgres community.  It would be effectively
 the same as hanging up a sign which says no new contributors wanted.
 It would also be dramatically increasing the amount of politics around
 submitted patches, which take up far more time than the technical work.


 I was thinking that by getting candid feedback that none of the existing
 contributors are interested to look at your patch, the author could revise
 the patch to garner more interest, or perhaps promise to review someone
 else's patch in return. Right now, the patches just linger, and the author
 doesn't know why, what's going to happen or what to do about it.


I agree.  Having your patch disappear into the void is not friendly at
all.  But I don't think a commentless -1 is the answer, either.  That
might be one of the few things worse than silence.  Even if the comment is
just This seems awfully complicated for a minimally useful feature or
This will probably have unintended side effects in the executor that I
don't have time to trace down, can someone make an argument for
correctness, or even this format is unreadable, I won't review this until
it is fixed then at least the author (and perhaps more important, junior
contributors who are not the author) will know either what argument to make
to lobby for it, what avenue to take when reviewing it, or whether to just
forget it and work on something more important.


Cheers,

Jeff


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 3:47 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 I agree.  Having your patch disappear into the void is not friendly at all.
 But I don't think a commentless -1 is the answer, either.  That might be one
 of the few things worse than silence.  Even if the comment is just This
 seems awfully complicated for a minimally useful feature or This will
 probably have unintended side effects in the executor that I don't have time
 to trace down, can someone make an argument for correctness, or even this
 format is unreadable, I won't review this until it is fixed then at least
 the author (and perhaps more important, junior contributors who are not the
 author) will know either what argument to make to lobby for it, what avenue
 to take when reviewing it, or whether to just forget it and work on
 something more important.

Agreed!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote:
 Tom Lane-2 wrote
  Robert Haas <robertmhaas@...> writes:
  On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus <josh@...> wrote:
  While there were technical
  issues, 9.4 dragged a considerable amount because most people were
  ignoring it in favor of 9.5 development.
  
  I think 9.4 dragged almost entirely because of one issue: the
  compressibility of JSONB.
  
  2. The amount of pre-release testing we get from people outside the
  hard-core development crowd seems to be continuing to decrease.
  We were fortunate that somebody found the JSONB issue before it was
  too late to do anything about it.  Personally, I'm very worried that
  there are other such bugs in 9.4.  But I've given up hoping that any
  more testing will happen until we put out something that calls itself
  9.4.0, which is why I voted to release in the core discussion about it.
 
 The compressibility properties of a new type seem like something that should
 be mandated before it is committed - it shouldn't require good fortune that

Odds are the next problem will have nothing to do with compressibility
--- we can't assume old failures will repeat themselves.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote:
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
  I have heard repeated concerns about the commitfest process in the past
  few months.  The fact we have been in a continual commitfest since
  August also is concerning.
 
 I realized the other day, I'm embracing the idea of a continual commitfest.
 
 I'm still working on patches from the last commitfest.  Why not?  They
 didn't expire.  Sure, it would have been nicer to get them done sooner,
 but what are you gonna do?  The fact that Nov 15 < now < Dec 15 isn't
 going to change the fact that I have a few hours to spare right now and
 the patches are still relevant.
 
 As far as I'm concerned, we might as well just have one commitfest per
 major release.  Call it a patch list.  Make the list sortable by created
 date and last-updated date, and let the system police itself.  At least
 that's honest.

Wow, that's radical, and interesting.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Joe Conway

On 12/11/2014 01:07 PM, Bruce Momjian wrote:
 On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote:
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
 I have heard repeated concerns about the commitfest process in
 the past few months.  The fact we have been in a continual
 commitfest since August also is concerning.
 
 I realized the other day, I'm embracing the idea of a continual
 commitfest.
 
 I'm still working on patches from the last commitfest.  Why not?
 They didn't expire.  Sure, it would have been nicer to get them
  done sooner, but what are you gonna do?  The fact that Nov 15 <
  now < Dec 15 isn't going to change the fact that I have a few
 hours to spare right now and the patches are still relevant.
 
 As far as I'm concerned, we might as well just have one
 commitfest per major release.  Call it a patch list.  Make the
 list sortable by created date and last-updated date, and let the
 system police itself.  At least that's honest.
 
 Wow, that's radical, and interesting.

Actually, to me it sounds a lot like what we did 10 years ago before
the commitfests except with a way to track the patches (other than the
mail archives) and more people participating in patch reviews.

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra t...@fuzzy.cz wrote:
 No, it's not rescanned. It's scanned only once (for the batch #0), and
 tuples belonging to the other batches are stored in files. If the number
 of batches needs to be increased (e.g. because of incorrect estimate of
 the inner table), the tuples are moved later.

Yeah, I think I sort of knew that, but I got confused.  Thanks for clarifying.

 The idea was that if we could increase the load a bit (e.g. using 2
 tuples per bucket instead of 1), we will still use a single batch in
 some cases (when we miss the work_mem threshold by just a bit). The
 lookups will be slower, but we'll save the I/O.

Yeah.  That seems like a valid theory, but your test results so far
seem to indicate that it's not working out like that - which I find
quite surprising, but, I mean, it is what it is, right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Mark Dilger
On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Mark Dilger hornschnor...@gmail.com writes:
  The attached patch changes the type of chunk_seq to int32,
  rather than changing the %d formatting.  The other changes
  are the same as in the previous patch.

 BTW, how are you finding these?  If it's some automated tool, what?
 (If you're finding them by hand, I'm in awe of your scan rate.)

 regards, tom lane


I've been advised to hoodwink you and say that I have found them all
by hand.  Actually, I am changing the typedef in src/include/c.h and then
recompiling, and watching for compiler warnings telling me that I am
passing a uint64 to a %d.  Of course, I get lots of warnings about
passing a uint64 to a %u, but I can filter through those easily enough.

Also, the macros in outfuncs.c tend to complain in a similar way, and
I can check that output, too.

mark


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread David Johnston
On Thu, Dec 11, 2014 at 2:05 PM, Bruce Momjian br...@momjian.us wrote:

 On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote:
  Tom Lane-2 wrote
   Robert Haas <robertmhaas@...> writes:
   On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus <josh@...> wrote:
   While there were technical
   issues, 9.4 dragged a considerable amount because most people were
   ignoring it in favor of 9.5 development.
  
   I think 9.4 dragged almost entirely because of one issue: the
   compressibility of JSONB.
  
   2. The amount of pre-release testing we get from people outside the
   hard-core development crowd seems to be continuing to decrease.
   We were fortunate that somebody found the JSONB issue before it was
   too late to do anything about it.  Personally, I'm very worried that
   there are other such bugs in 9.4.  But I've given up hoping that any
   more testing will happen until we put out something that calls itself
   9.4.0, which is why I voted to release in the core discussion about it.
 
  The compressibility properties of a new type seem like something that
 should
  be mandated before it is committed - it shouldn't require good fortune
 that

 Odds are the next problem will have nothing to do with compressibility
 --- we can't assume old failures will repeat themselves.




tl;dr: assign two people, a manager/curator and a lead reviewer.  Give
the curator better tools and the responsibility to engage the community.
If the primary reviewer cannot review a patch in the current commitfest it
can be marked awaiting reviewer and moved to the next CF for evaluation
by the next pair of individuals.  At minimum, though, if it does get moved
the manager AND reviewer need to comment on why it was not handled during
their CF.  Also, formalize documentation targeting developers and reviewers
just like the documentation for users has been formalized and committed to
the repository.  That knowledge and history is probably more important that
source code commit log and definitely should be more accessible to
newcomers.


While true, a checklist of things to look for and evaluate when adding a
new type to the system still has value.  How new types interact, if at all,
with TOAST seems like something that warrants explicit attention before
commit, and there are probably others (e.g., OIDs, input/output function
volatility and configuration, etc...).  Maybe this exists somewhere, but if
you are considering improvements to the commitfest application, having a
top-of-page, always-visible checklist that can be bootstrapped based upon the
patch/feature type and modified for specific nuances for the item in
question seems like it would be valuable.

If this was in place before 9.3, then whatever category the multi-xact
patches fall into would want to have their template modified to incorporate
the various research and testing - along with links to the discussions -
that resulted from the various bug reports that were submitted.  It could
even be structured to both be an interactive checklist as well as acting as
curated documentation for developers and reviewers.  The wiki has some of
this, but if the goal is to encourage people to learn how to contribute to
PostgreSQL, it should receive a similar level of attention and quality
control that our documentation for people wanting to learn how to use
PostgreSQL receives.  But that is above and beyond the simple goal of having
meaningful checklists attached to each of the major commit-fest items whose
TODO items can be commented upon and serve as a reference for how close to
commit a feature may be.  Comments can be as simple as a place for people
to upload a psql script and say this is what I did and everything seemed
to work/fail in the way I expected - on this platform.

Curation is hard, though, so I get why easier actions - mainly just providing
links to the mailing list - are what is currently being used.  Instead of
the CF manager being a reviewer (which is a valid approach), having them be
a curator and providing better tools geared toward that role (both to do
the work and to share the results) seems like a better ideal.  Alongside that
role, a CF reviewer should maybe also be assigned.  The two people -
reviewer and manager - would then be responsible for ensuring that reviews
are happening and that the communication to and recruitment from the
community is being handled - respectively.

Just some off-the-cuff thoughts...

David J.


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-11 Thread Robert Haas
On Thu, Dec 11, 2014 at 1:11 AM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan p...@heroku.com wrote:
 Attached revision, v1.6, slightly tweaks the ordering of per-statement
 trigger execution.

 Right now, there is no way for a before row insert/update trigger to
 determine whether it was called as part of an INSERT ... ON CONFLICT
 UPDATE or not. It's also not possible for a DO INSTEAD trigger on a
 view (a before row insert trigger) to determine that it was called
 specifically due to an INSERT...IGNORE (which I think ought to imply
 that any corresponding, redirected insertion into a table should
 also use IGNORE -- that's at least going to be something that a
 certain number of apps will need to be made robust against).

 The question is: Do we want to expose this distinction to triggers?
 The natural way to do so would probably be to add TG_SPECULATIVE
 special variable to plpgsql (and equivalent variables in other PLs).
 This text variable would be either UPSERT or IGNORE; it would be
 NULL when it was not applicable (e.g. with traditional INSERTs).

 How do people feel about this? Is it important to include this in our
 initial cut of the feature? I thought that I'd avoid that kind of
 thing until prompted to address it by others, since it probably won't
 end up being a common concern, but I'd like to hear a few opinions.

It's probably something we should add, but there's enough to do
getting the basic feature working first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote:
 
 On 12/11/2014 01:07 PM, Bruce Momjian wrote:
  On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote:
  On 12/11/14 1:35 AM, Bruce Momjian wrote:
  I have heard repeated concerns about the commitfest process in
  the past few months.  The fact we have been in a continual
  commitfest since August also is concerning.
  
  I realized the other day, I'm embracing the idea of a continual
  commitfest.
  
  I'm still working on patches from the last commitfest.  Why not?
  They didn't expire.  Sure, it would have been nicer to get them
  done sooner, but what are you gonna do?  The fact that Nov 15 
  now  Dec 15 isn't going to change the fact that I have a few
  hours to spare right now and the patches are still relevant.
  
  As far as I'm concerned, we might as well just have one
  commitfest per major release.  Call it a patch list.  Make the
  list sortable by created date and last-updated date, and let the
  system police itself.  At least that's honest.
  
  Wow, that's radical, and interesting.
 
 Actually, to me it sounds a lot like what we did 10 years ago before
 the commitfests except with a way to track the patches (other than the
 mail archives) and more people participating in patch reviews.

Yes, it does remind me of the mbox files I put on the web.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Mark Dilger
I spoke too soon.  Actually, the Oid definition is in
src/include/postgres_ext.h,
which I'm sure you all know.  But I'm changing other typedefs too, not
just for Oid.

I've been going through a copy of the code and replacing int32 and uint32
with the appropriate type such as Oid, TransactionId, and such, and have
created my own typedef for TypeModType, in order to help chase through
the code and figure out what functions handle what kind of data.  By
defining TypeModType to int64, for example, I get lots of compiler errors
when passing a TypeModType * into a function that takes int32 *, and that
lets me know that the callers of that function think of it as a typemod
argument,
even if it does not have any clear documentation to that effect.

The exercise is a bit tedious, as I have to change lots of strings like
"the value %d is out of bounds" to something like
"the value " TYPEMODTYPE_FORMAT " is out of bounds", but the clarity
I get from it helps enough that it is useful to me.
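
Concretely, the pattern looks something like this self-contained sketch (not
PostgreSQL code; only the macro name matches what I've been using) - flipping
the typedef makes -Wformat point at every caller still using a hard-coded %d:

    #include <inttypes.h>
    #include <stdio.h>

    /* build with -DWIDE_TYPEMOD to switch widths */
    #ifdef WIDE_TYPEMOD
    typedef int64_t TypeModType;
    #define TYPEMODTYPE_FORMAT "%" PRId64
    #else
    typedef int32_t TypeModType;
    #define TYPEMODTYPE_FORMAT "%" PRId32
    #endif

    int
    main(void)
    {
        TypeModType typmod = 42;

        /* string concatenation splices in the right conversion */
        printf("the value " TYPEMODTYPE_FORMAT " is out of bounds\n", typmod);
        return 0;
    }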

I hope to eventually be able to simply pass switches to configure like
--64bit-oids and have the whole system work with Oid defined as 64-bits.
Likewise for varlena headers.

I ported the whole of postgres to 64-bit Oids and 64-bit varlena headers
once before, and it worked and passed all the regression tests and such,
but I did it by replacing lots of instances of int32 with instances of
int64,
so it didn't help clarify the meaning of anything.  This time, I'm hoping that
I can keep in sync with all the commits so that I can build a 32-bit or a
64-bit postgres at a moment's notice.


Mark Dilger

On Thu, Dec 11, 2014 at 1:25 PM, Mark Dilger hornschnor...@gmail.com
wrote:



 On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Mark Dilger hornschnor...@gmail.com writes:
  The attached patch changes the type of chunk_seq to int32,
  rather than changing the %d formatting.  The other changes
  are the same as in the previous patch.

 BTW, how are you finding these?  If it's some automated tool, what?
 (If you're finding them by hand, I'm in awe of your scan rate.)

 regards, tom lane


 I've been advised to hoodwink you and say that I have found them all
 by hand.  Actually, I am changing the typedef in src/include/c.h and then
 recompiling, and watching for compiler warnings telling me that I am
 passing a uint64 to a %d.  Of course, I get lots of warnings about
 passing a uint64 to a %u, but I can filter through those easily enough.

 Also, the macros in outfuncs.c tend to complain in a similar way, and
 I can check that output, too.

 mark



Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings

2014-12-11 Thread Bruce Momjian
On Thu, Dec 11, 2014 at 01:25:00PM -0800, Mark Dilger wrote:
 
 
 On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
 Mark Dilger hornschnor...@gmail.com writes:
  The attached patch changes the type of chunk_seq to int32,
  rather than changing the %d formatting.  The other changes
  are the same as in the previous patch.
 
 BTW, how are you finding these?  If it's some automated tool, what?
 (If you're finding them by hand, I'm in awe of your scan rate.)
 
                         regards, tom lane
 
 
 I've been advised to hoodwink you and say that I have found them all
 by hand.  Actually, I am changing the typedef in src/include/c.h and then
 recompiling, and watching for compiler warnings telling me that I am
 passing a uint64 to a %d.  Of course, I get lots of warnings about
 passing a uint64 to a %u, but I can filter through those easily enough.

Cool idea!


-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Alvaro Herrera
FWIW I don't think any amount of process would have gotten multixact to
not have the copious bugs it had.  It was just too complex a patch,
doing ugly things to parts too deeply linked to the inner guts of the
server.  We might have spared a few with some extra testing (such as the
idiotic wraparound bugs, and perhaps the pg_upgrade problems), but most
of the rest of it would have taken a real production load to notice --
which is exactly what happened.

(Of course, review from Tom might have unearthed many of these bugs
before they hit final release, but we're saying we don't want to be
dependent on Tom's review for every patch so that doesn't count.)

Review is good, but (as history shows) some bugs can slip through even
extensive review such as the one multixacts got from Noah and Andres.
Had anyone put some real stress on the beta, we could have noticed some
of these bugs much earlier.  One of the problems we have is that our
serious users don't seem to take the beta period seriously.  All our
users are too busy for that, it seems, and they expect that somebody
else will test the point-zero release, and that by the time it hits
point-one or point-two it will be bug-free, which is quite clearly not
the case.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote:

  Actually, to me it sounds a lot like what we did 10 years ago before
  the commitfests except with a way to track the patches (other than the
  mail archives) and more people participating in patch reviews.
 
 Yes, it does remind me of the mbox files I put on the web.

The whole commitfest process was put in place in part precisely so we
could get rid of your mboxen.  Please let's not get back to that!

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread Peter Geoghegan
On Thu, Dec 11, 2014 at 1:49 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Review is good, but (as history shows) some bugs can slip through even
 extensive review such as the one multixacts got from Noah and Andres.
 Had anyone put some real stress on the beta, we could have noticed some
 of these bugs much earlier.  One of the problems we have is that our
 serious users don't seem to take the beta period seriously.  All our
 users are too busy for that, it seems, and they expect that somebody
 else will test the point-zero release, and that by the time it hits
 point-one or point-two it will be bug-free, which is quite clearly not
 the case.

Good, strategic stress testing has a big role to play here IMV. I have
had some difficulty generating interest in that, but I think it's
essential.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread David G Johnston
Peter Eisentraut-2 wrote
 On 12/11/14 1:35 AM, Bruce Momjian wrote:
 I have heard repeated concerns about the commitfest process in the past
 few months.  The fact we have been in a continual commitfest since
 August also is concerning.
 
 I realized the other day, I'm embracing the idea of a continual
 commitfest.
 
 As far as I'm concerned, we might as well just have one commitfest per
 major release.  Call it a patch list.  Make the list sortable by created
 date and last-updated date, and let the system police itself.  At least
 that's honest.

Having just written about the idea of a CF manager and a CF reviewer...
while having a continual CF works in theory I think breaking it down into,
say, quarters and having each pair commit to 3 months of being in charge has
some logic behind it.

The patch list concept should be formalized, and should include a
targeted release concept.  The idea of a CF seems like it should apply to
the people involved and include an explicit choosing of items from the
patch list that are going to get attention by those people during the CF
period.  A policy of every new addition being added to the next CF, along
with an item being involved in at least every other CF, would make sense. 
Moving along the path of continuous delivery maybe just create CF teams
instead of a monolithic CF process.

David J.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Josh Berkus
On 12/11/2014 01:52 PM, Alvaro Herrera wrote:
 Bruce Momjian wrote:
 On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote:
 
 Actually, to me it sounds a lot like what we did 10 years ago before
 the commitfests except with a way to track the patches (other than the
 mail archives) and more people participating in patch reviews.

 Yes, it does remind me of the mbox files I put on the web.
 
 The whole commitfest process was put in place in part precisely so we
 could get rid of your mboxen.  Please let's not get back to that!

Well, the CF process was added for 2 reasons:

1) We were losing track of some patches until integration time, when
Bruce suddenly found them in the -hackers archives.

2) In order to give the senior committers some time *off* from reviewing
other people's patches.  The idea was to spend 3 weeks or so intensively
reviewing others' patches, and then to have a month (or more) *off* from
working on anything but your own stuff.

While the CFs are still doing (1), support for (2) ended sometime in the
9.3 development cycle.  Partly this is because current CFMs are not
permitted to take authoritative steps to ensure that the CF ends on
time, and partly it's because of the increase in big complicated patches
which just don't fit into the CF cycle.

Speaking as the originator of commitfests, they were *always* intended
to be a temporary measure, a step on the way to something else like
continuous integration.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Peter Geoghegan
On Thu, Dec 11, 2014 at 1:07 PM, Bruce Momjian br...@momjian.us wrote:
 As far as I'm concerned, we might as well just have one commitfest per
 major release.  Call it a patch list.  Make the list sortable by created
 date and last-updated date, and let the system police itself.  At least
 that's honest.

 Wow, that's radical, and interesting.

Agreed. I don't think it's radical, though - it's just acknowledging
the elephant in the room, which is that the commitfests in a given
cycle are not really distinct at all. I suspect that better tooling
had a lot to do with the success of commitfests, which is more or less
incidental to how they were originally conceived. There is still room
for improvement there.

I'd have to think about it some more, but I'd almost be ready to vote
for formalizing how things actually work in practice given the chance
(i.e. Voting to follow Peter's suggestion).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Well, the CF process was added for 2 reasons:

 1) We were losing track of some patches until integration time, when
  Bruce suddenly found them in the -hackers archives.

 2) In order to give the senior committers some time *off* from reviewing
 other people's patches.  The idea was to spend 3 weeks or so intensively
 reviewing others' patches, and then to have a month (or more) *off* from
 working on anything but your own stuff.

Right; I was in the middle of composing words to the same effect when
this arrived.

 While the CFs are still doing (1), support for (2) ended sometime in the
 9.3 development cycle.  Partly this is because current CFMs are not
 permitted to take authoritative steps to ensure that the CF ends on
 time, and partly it's because of the increase in big complicated patches
 which just don't fit into the CF cycle.

I don't see why you think CFMs are not permitted to close out a CF
when they want to.  At least some of the fests have been closed out per
calendar schedule, punting unprocessed patches to the next fest.  We've
certainly utterly failed to do that since August, but I think that's
mismanagement.

 Speaking as the originator of commitfests, they were *always* intended
 to be a temporary measure, a step on the way to something else like
 continuous integration.

Really?  How would that address (2)?  And please note that part of the
problem we're having right now is that senior people are rebelling
against no longer having any agreed-on time off.  IMV, goal (2) is
not optional; if you try to force people into continuous review work,
you'll just lose them completely.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Josh Berkus
On 12/11/2014 02:12 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 While the CFs are still doing (1), support for (2) ended sometime in the
 9.3 development cycle.  Partly this is because current CFMs are not
 permitted to take authoritative steps to ensure that the CF ends on
 time, and partly it's because of the increase in big complicated patches
 which just don't fit into the CF cycle.
 
 I don't see why you think CFMs are not permitted to close out a CF
 when they want to.  At least some of the fests have been closed out per
 calendar schedule, punting unprocessed patches to the next fest.  We've
 certainly utterly failed to do that since August, but I think that's
 mismanagement.

I love how you're willing to accuse Michael of mismanagement when you
yourself have *never* run a CF.  If I was Michael, I'd quit right now.

Every CFM who has taken forceful measures to close out the CF on time
has gotten a large amount of flack for it, and absolutely zero back-up
from the committers or the core team.  This is why David, Robert and I
will no longer manage CFs (and probably others as well).  The CFM is
expected to miraculously end the CF on time, without bouncing or
forwarding unreviewed patches, cutting off patches which already had one
round of review, bouncing out or holding patches from contributors who
don't participate in the review process, nagging anyone too much, or
taking any other steps which anyone might take exception to.

In fact, the technique you cite (just punting the patches to the next
CF) resulted in Fetter getting a great deal of hassling on this list
when he did it.  The only people who backed him up on it were other
former CFMs.  And I closed out my CF on my expected schedule (based on
number of patches), but only after taking so much shit for it, I'm never
doing it again.  And I have a pretty high tolerance for taking shit.

So now you're left with CFMs who won't do anything to wind up the CF
because they know that the committers and hackers will *not* support
them if they do.  So, eternal CF.

How about *you* run the next one, Tom?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

2014-12-11 Thread Tomas Vondra
On 11.12.2014 22:16, Robert Haas wrote:
 On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra t...@fuzzy.cz wrote:
 No, it's not rescanned. It's scanned only once (for the batch #0), and
 tuples belonging to the other batches are stored in files. If the number
 of batches needs to be increased (e.g. because of incorrect estimate of
 the inner table), the tuples are moved later.
 
 Yeah, I think I sort of knew that, but I got confused.  Thanks for clarifying.
 
 The idea was that if we could increase the load a bit (e.g. using 2
 tuples per bucket instead of 1), we will still use a single batch in
 some cases (when we miss the work_mem threshold by just a bit). The
 lookups will be slower, but we'll save the I/O.
 
 Yeah.  That seems like a valid theory, but your test results so far
 seem to indicate that it's not working out like that - which I find
 quite surprising, but, I mean, it is what it is, right?

Not exactly. My tests show that as long as the outer table batches fit
into page cache, increasing the load factor results in worse performance
than batching.

When the outer table is sufficiently small, the batching is faster.

Regarding the sufficiently small - considering today's hardware, we're
probably talking about gigabytes. On machines with significant memory
pressure (forcing the temporary files to disk), it might be much lower,
of course. Of course, it also depends on kernel settings (e.g.
dirty_bytes/dirty_background_bytes).

If we could identify those cases (at least the temp files < RAM) then
maybe we could do this. Otherwise we're going to penalize all the other
queries ...

Maybe the best solution for now is the "increase the work_mem a bit"
recommendation.
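
Just to write the rule we have been measuring down explicitly, it amounts to
something like this sketch (the names and the growth limit are invented, and
per the numbers above it clearly does not pay off universally):

    #include <stdbool.h>
    #include <stdio.h>

    /* tolerate a denser table (e.g. 2 tuples per bucket instead of 1) only
     * when the build side misses work_mem by a small factor */
    #define MAX_LOAD_GROWTH 2.0

    static bool
    prefer_denser_table(double inner_bytes, double work_mem_bytes)
    {
        return inner_bytes > work_mem_bytes &&
               inner_bytes <= work_mem_bytes * MAX_LOAD_GROWTH;
    }

    int
    main(void)
    {
        double work_mem = 1024.0 * 1024.0;

        printf("miss by 20%% -> %s\n",
               prefer_denser_table(1.2 * work_mem, work_mem)
               ? "grow load factor" : "add batches");
        printf("miss by 8x  -> %s\n",
               prefer_denser_table(8.0 * work_mem, work_mem)
               ? "grow load factor" : "add batches");
        return 0;
    }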

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 How about *you* run the next one, Tom?

I think the limited amount of time I can put into a commitfest is better
spent on reviewing patches than on managing the process.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-12-11 Thread Simon Riggs
On 9 December 2014 at 12:21, Michael Paquier michael.paqu...@gmail.com wrote:
 On Tue, Dec 9, 2014 at 6:43 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 OK. Perhaps that's not worth mentioning in the release notes, but some
 users may be used to the old behavior. What about the other issues
 (regression test for permissions incorrect and matviews)?
 Here is in any case an updated patch to fix all those things rebased
 on latest HEAD (ae4e688).
 Thanks,

I rewrote and applied this patch to ensure we didn't introduce further bugs.

Tests for the reindex part were rewritten from scratch also.

Rather unluckily that seems to have coincided with Tom's changes, so
I've only added the bits that have nothing to do with Tom's changes -
all of which stand.

(I just got going again after my flight back from Tokyo).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-11 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  What that code does is produce plans that look like this:

  GroupAggregate
  -> Sort
     -> ChainAggregate
        -> Sort
           -> ChainAggregate

  in much the same way that WindowAgg nodes are generated.

 Tom That seems pretty messy, especially given your further comments
 Tom that these plan nodes are interconnected and know about each
 Tom other (though you failed to say exactly how).

I'd already explained in more detail way back when we posted the
patch. But to reiterate: the ChainAggregate nodes pass through their
input data unchanged, but on group boundaries they write aggregated
result rows to a tuplestore shared by the whole chain. The top node
returns the data from the tuplestore after its own output is
completed.

The chain_head pointer in the ChainAggregate nodes is used for:

  - obtaining the head node's targetlist and qual, to use to project
rows into the tuplestore (the ChainAggregate nodes don't do
ordinary projection so they have dummy targetlists like the Sort
nodes do)

  - obtaining the pointer to the tuplestore itself

  - on rescan without parameter change, to inform the parent node
whether or not the child nodes are also being rescanned (since
the Sort nodes may or may not block this)

 Tom The claimed analogy to WindowAgg therefore seems bogus since
 Tom stacked WindowAggs are independent, AFAIR anyway.

The analogy is only in that they need to see the same input rows but
in different sort orders.

 Tom I'm also wondering about performance: doesn't this imply more
 Tom rows passing through some of the plan steps than really
 Tom necessary?

There's no way to cut down the number of rows seen by intermediate
nodes unless you implement (and require) associative aggregates, which
we do not do in this patch (that's left for possible future
optimization efforts). Our approach makes no new demands on the
implementation of aggregate functions.

 Tom Also, how would this extend to preferring hashed aggregation in
 Tom some of the grouping steps?

My suggestion for extending it to hashed aggs is to have a (single)
HashAggregate node keep multiple hash tables, one per grouping set; then
any arbitrary collection of grouping sets can be handled in one node
provided that memory permits and no non-hashable features are used.
So the normal plan for CUBE(a,b) under this scheme would be just:

  HashAggregate
  Grouping Sets: (), (a), (b), (a,b)
  -> (input path in unsorted order)

If a mixture of hashable and non-hashable data types are used, for
example CUBE(hashable,unhashable), then a plan of this form could be
constructed:

  HashAggregate
  Grouping Sets: (), (hashable)
  -> ChainAggregate
     Grouping Sets: (unhashable), (unhashable,hashable)
     -> (input path sorted by (unhashable,hashable))

Likewise, plans of this form could be considered for cases like
CUBE(low_card, high_card) where hashed grouping on high_card would
require excessive memory:

  HashAggregate
  Grouping Sets: (), (low_card)
  -> ChainAggregate
     Grouping Sets: (high_card), (high_card, low_card)
     -> (input path sorted by (high_card, low_card))
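
As an aside, the grouping sets enumerated above for a CUBE are simply the
subsets of its columns, which is what the single hashed node would keep one
hash table apiece for. A throwaway, self-contained illustration (not patch
code):

    #include <stdio.h>

    int
    main(void)
    {
        const char *cols[] = {"a", "b"};
        const unsigned ncols = 2;

        /* CUBE(a,b) expands to every subset of its columns */
        for (unsigned mask = 0; mask < (1u << ncols); mask++)
        {
            int printed = 0;

            printf("(");
            for (unsigned i = 0; i < ncols; i++)
                if (mask & (1u << i))
                    printf("%s%s", printed++ ? "," : "", cols[i]);
            printf(")\n");
        }
        return 0;
    }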

 Tom ISTM that maybe what we should do is take a cue from the SQL
 Tom spec, which defines these things in terms of UNION ALL of
 Tom plain-GROUP-BY operations reading from a common CTE.

I looked at that, in fact that was our original plan, but it became
clear quite quickly that it was not going to be easy.
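
For reference, a rough sketch of that spec-style rewrite for CUBE(a,b) over a
hypothetical table t (ignoring the GROUPING()-based distinction for rows where
a or b is NULL in the data) would be:

  WITH input AS (
      SELECT a, b FROM t
  )
  SELECT a, b, count(*)       FROM input GROUP BY a, b
  UNION ALL
  SELECT a, NULL, count(*)    FROM input GROUP BY a
  UNION ALL
  SELECT NULL, b, count(*)    FROM input GROUP BY b
  UNION ALL
  SELECT NULL, NULL, count(*) FROM input;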

I tried two different approaches. First was to actually re-plan the
input (i.e. running query_planner more than once) for different sort
orders; that crashed and burned quickly thanks to the extent to which
the planner assumes that it'll be run once only on any given input.

Second was to generate a CTE for the input data. This didn't get very
far because everything that already exists to handle CTE nodes assumes
that they are explicit in the planner's input (that they have their
own Query node, etc.) and I was not able to determine a good solution.
If you have any suggestions for how to approach the problem, I'm happy
to have another go at it.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 7:27 AM, Josh Berkus j...@agliodbs.com wrote:

 On 12/11/2014 02:12 PM, Tom Lane wrote:
  Josh Berkus j...@agliodbs.com writes:
  While the CFs are still doing (1), support for (2) ended sometime in the
  9.3 development cycle.  Partly this is because current CFMs are not
  permitted to take authoritative steps to ensure that the CF ends on
  time, and partly it's because of the increase in big complicated patches
  which just don't fit into the CF cycle.
 
  I don't see why you think CFMs are not permitted to close out a CF
  when they want to.  At least some of the fests have been closed out per
  calendar schedule, punting unprocessed patches to the next fest.  We've
  certainly utterly failed to do that since August, but I think that's
  mismanagement.

 I love how you're willing to accuse Michael of mismanagement when you
 yourself have *never* run a CF.  If I was Michael, I'd quit right now.


Er, just to be clear, I was basically not the CF manager for the one that
began in October. I just showed up because I had some hours left and things
were not moving along for many patches. Now, as I think nobody else is
volunteering, I am fine with being the next CF manager for the one beginning
on Monday. That would help keep things moving.
Regards,
-- 
Michael


Re: [HACKERS] Commitfest problems

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 7:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Josh Berkus j...@agliodbs.com writes:
  How about *you* run the next one, Tom?

 I think the limited amount of time I can put into a commitfest is better
 spent on reviewing patches than on managing the process.

That's not only your case; I have the same feeling that time is more
productive when focused on a single task. CFM work takes your energy
away from other things, and a CFM who also reviews patches tends to miss
things in the patches he looks at.
-- 
Michael


Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 8:00 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 9 December 2014 at 12:21, Michael Paquier michael.paqu...@gmail.com
 wrote:
  On Tue, Dec 9, 2014 at 6:43 PM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  OK. Perhaps that's not worth mentioning in the release notes, but some
  users may be used to the old behavior. What about the other issues
  (regression test for permissions incorrect and matviews)?
  Here is in any case an updated patch to fix all those things rebased
  on latest HEAD (ae4e688).
  Thanks,

 I rewrote and applied this patch to ensure we didn't introduce further
 bugs.

 Tests for the reindex part were rewritten from scratch also.

 Rather unluckily that seems to have coincided with Tom's changes, so
 I've only added the bits that have nothing to do with Tom's changes -
 all of which stand.


Glad that things are in order now. I have nothing else left to complain
about with this feature. Thanks.

(I just got going again after my flight back from Tokyo).

I understand going east makes one day longer :)
-- 
Michael


Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.

2014-12-11 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 This issue is that some query returns duplicate rows after FOR
 UPDATE was blocked, in other words, after getting
 HeapTupleUpdated in ExecLockRows.

Good catch!

  In the EPQ block in ExecScanFetch, it seems it should return NULL if
  the relation does not have a test tuple. The attached patch does so
  on the current master and the problem has disappeared.

... but bad fix.  This would break join cases, wherein it's important
that other scan nodes still return all the required tuples.  (It's
unfortunate that the eval-plan-qual isolation test fails to cover
joins :-(.)

After digging around a bit, I realized that the problem is actually in
nodeLockRows.c.  It is supposed to do EvalPlanQualSetTuple(..., NULL)
for each child table that's not supposed to return any row during the
current EPQ test cycle.  Unfortunately, it only does that reliably once
the EPQ environment is already set up.  If we discover we need an EPQ
test while looking at a non-first child table, tables already passed
over in the loop over node->lr_arowMarks don't get the word.

So the correct fix (or a correct fix, anyway; this could also have been
done with more-invasive loop logic changes) is as attached.  I'm working
on back-patching this.

regards, tom lane

diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 990240b..5dcb9b0 100644
*** a/src/backend/executor/nodeLockRows.c
--- b/src/backend/executor/nodeLockRows.c
*************** lnext:
*** 194,200 ****
--- 194,222 ----
   */
  if (!epq_started)
  {
+ 	ListCell   *lc2;
+ 
  	EvalPlanQualBegin(node->lr_epqstate, estate);
+ 
+ 	/*
+ 	 * Ensure that rels with already-visited rowmarks are told
+ 	 * not to return tuples during the first EPQ test.  We can
+ 	 * exit this loop once it reaches the current rowmark;
+ 	 * rels appearing later in the list will be set up
+ 	 * correctly by the EvalPlanQualSetTuple call at the top
+ 	 * of the loop.
+ 	 */
+ 	foreach(lc2, node->lr_arowMarks)
+ 	{
+ 		ExecAuxRowMark *aerm2 = (ExecAuxRowMark *) lfirst(lc2);
+ 
+ 		if (lc2 == lc)
+ 			break;
+ 		EvalPlanQualSetTuple(node->lr_epqstate,
+ 			 aerm2->rowmark->rti,
+ 			 NULL);
+ 	}
+ 
  	epq_started = true;
  }
  
diff --git a/src/test/isolation/expected/eval-plan-qual.out b/src/test/isolation/expected/eval-plan-qual.out
index ab778cb..0f6595f 100644
*** a/src/test/isolation/expected/eval-plan-qual.out
--- b/src/test/isolation/expected/eval-plan-qual.out
*************** accountid      balance
*** 49,51 ****
--- 49,74 ----
  
  checking   600
  savings    2334   
+ 
+ starting permutation: readp1 writep1 readp2 c1 c2
+ step readp1: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
+ tableoid   ctid   a  b  c  
+ 
+ c1 (0,1)  0  0  0  
+ c1 (0,4)  0  1  0  
+ c2 (0,1)  1  0  0  
+ c2 (0,4)  1  1  0  
+ c3 (0,1)  2  0  0  
+ c3 (0,4)  2  1  0  
+ step writep1: UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0;
+ step readp2: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; waiting ...
+ step c1: COMMIT;
+ step readp2: ... completed
+ tableoid   ctid   a  b  c  
+ 
+ c1 (0,1)  0  0  0  
+ c1 (0,4)  0  1  0  
+ c2 (0,1)  1  0  0  
+ c3 (0,1)  2  0  0  
+ c3 (0,4)  2  1  0  
+ step c2: COMMIT;
diff --git a/src/test/isolation/specs/eval-plan-qual.spec b/src/test/isolation/specs/eval-plan-qual.spec
index 786b791..876e547 100644
*** a/src/test/isolation/specs/eval-plan-qual.spec
--- b/src/test/isolation/specs/eval-plan-qual.spec
*************** setup
*** 8,18 ****
--- 8,27 ----
  {
   CREATE TABLE accounts (accountid text PRIMARY KEY, balance numeric not null);
   INSERT INTO accounts VALUES ('checking', 600), ('savings', 600);
+ 
+  CREATE TABLE p (a int, b int, c int);
+  CREATE TABLE c1 () INHERITS (p);
+  CREATE TABLE c2 () INHERITS (p);
+  CREATE TABLE c3 () INHERITS (p);
+  INSERT INTO c1 SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a;
+  INSERT INTO c2 SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a;
+  INSERT INTO c3 SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a;
  }
  
  

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 1:34 AM, Bruce Momjian br...@momjian.us wrote:

 On Thu, Dec 11, 2014 at 01:26:38PM +0530, Rahila Syed wrote:
  I am sorry but I can't understand the above results due to wrapping.
  Are you saying compression was twice as slow?
 
  CPU usage at user level (in seconds) for compression set 'on' is 562 secs,
  while that for compression set 'off' is 354 secs. As per the readings, it
  takes a little less than double the CPU time to compress.
  However, the total time taken to run 25 transactions for each of the
  scenarios is as follows:
 
  compression = 'on'  : 1838 secs
  compression = 'off' : 1701 secs
 
  The difference is around 140 secs.

 OK, so the compression took 2x the cpu and was 8% slower.  The only
 benefit is WAL files are 35% smaller?


That depends as well on the compression algorithm used. I am far from being
a specialist in this area, but I guess that there are things consuming less
CPU for a lower rate of compression and that there are no magic solutions.
A correct answer would be to either change the compression algorithm
present in core to something that is more compliant to the FPW compression,
or to add hooks to allow people to plug in the compression algorithm they
want for the compression and decompression calls. In any case and for any
type of compression (be it different algo, record-level compression or FPW
compression), what we have here is a tradeoff, and a switch for people who
care more about I/O than CPU usage. And we would still face in any case CPU
bursts at checkpoints because I can't imagine FPWs not being compressed
even if we do something at record level (thinking so what we have here is
the light-compression version).
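
As a quick sanity check of the figures quoted above (just arithmetic on the
numbers from the earlier mail, runnable as plain SQL):

  SELECT round(562.0 / 354, 2) AS cpu_ratio,                   -- ~1.59x user CPU
         round((1838 - 1701) * 100.0 / 1701, 1) AS pct_slower; -- ~8.1% longer run
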
Regards,
-- 
Michael


Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 8:41 AM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Fri, Dec 12, 2014 at 8:00 AM, Simon Riggs si...@2ndquadrant.com
 wrote:

 Rather unluckily that seems to have coincided with Tom's changes, so
 I've only added the bits that have nothing to do with Tom's changes -
 all of which stand.

 Glad that things are in order now. I have nothing else left to complain
 about with this feature. Thanks.

Actually there is one thing left: this patch from Sawada-san to finish the
cleanup of ReindexStmt for the boolean flags.
http://www.postgresql.org/message-id/cad21aocxfe1j+hskbej80rqqnhr8m_yuxdgkwz4dl8zpqt8...@mail.gmail.com

Regards,
-- 
Michael


Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-12-11 Thread Simon Riggs
On 11 December 2014 at 23:41, Michael Paquier michael.paqu...@gmail.com wrote:

 Glad that things are in order now. I have nothing else left to complain
 about with this feature. Thanks.

I think you've discovered a new meaning of Ready for Committer

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.

2014-12-11 Thread Kyotaro HORIGUCHI
Hello,

  This issue is that some query returns duplicate rows after FOR
  UPDATE was blocked, in other words, after getting
  HeapTupleUpdated in ExecLockRows.
 
 Good catch!

Thank you.

  In the EPQ block in ExecScanFetch, it seems it should return NULL if
  the relation does not have a test tuple. The attached patch does so
  on the current master and the problem has disappeared.
 
 ... but bad fix.  This would break join cases, wherein it's important
 that other scan nodes still return all the required tuples.  (It's
 unfortunate that the eval-plan-qual isolation test fails to cover
 joins :-(.)

Hmm. I regret that I forgot to run the isolation check. That behavior
didn't seem like a bug to me, but I couldn't see its intention.

 After digging around a bit, I realized that the problem is actually in
 nodeLockRows.c.  It is supposed to do EvalPlanQualSetTuple(..., NULL)
 for each child table that's not supposed to return any row during the
 current EPQ test cycle.  Unfortunately, it only does that reliably once
 the EPQ environment is already set up.  If we discover we need an EPQ
 test while looking at a non-first child table, tables already passed
 over in the loop over node->lr_arowMarks don't get the word.

Thank you for the detailed explanation. During my investigation I was
wondering why EvalPlanQualSetTuple(..., NULL) was called only for children
after the first one that had an EPQ test tuple. Now I understand that this
was the core of the bug.

 So the correct fix (or a correct fix, anyway; this could also have been
 done with more-invasive loop logic changes) is as attached.  I'm working
 on back-patching this.

That really helps. Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-12-11 Thread Michael Paquier
On Fri, Dec 12, 2014 at 10:19 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 December 2014 at 23:41, Michael Paquier michael.paqu...@gmail.com
 wrote:

  Glad that things are in order now. I have nothing else left to complain
  about with this feature. Thanks.

 I think you've discovered a new meaning of Ready for Committer

Yep. Sorry for missing so many things.
-- 
Michael


Re: [HACKERS] double vacuum in initdb

2014-12-11 Thread Peter Eisentraut
On 12/11/14 11:44 AM, Kevin Grittner wrote:
 We want to finish with VACUUM FREEZE without the FULL, unless we
 don't care about missing visibility maps and free space maps.

Why would we care, and if we do, why does VACUUM FULL remove them?

You can also run plain VACUUM after FULL to put the maps back.
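
A minimal sketch of the sequence being discussed, assuming (per the above)
that the VACUUM FULL rewrite leaves the visibility and free space maps
missing:

  VACUUM FULL;    -- rewrites the tables; maps are not rebuilt by the rewrite
  VACUUM FREEZE;  -- a plain vacuum freezes tuples and recreates the maps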

But the documentation is apparently missing details about this.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-11 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 This is caused by the fact that IndexRecheck examines the test tuple
 with the qual c = '0' but without b IN ('0', '1'). That part has been
 removed in create_indexscan_plan. It decides whether to remove a qual
 or not using get_parse_rowmark(root->parse(->rowMarks)) and
 predicate_implied_by(). But the former always says no (NULL) for
 child relations even if the parent has rowMarks.

 On the other hand, rowmarks on children are already distributed at
 that time by expand_inherited_rtentry() into root->rowMarks.

 So I replaced get_parse_rowmark() with get_plan_rowmark() as in the
 attached patch, and the problem disappeared.

Yeah, this is clearly a thinko: really, nothing in the planner should
be using get_parse_rowmark().  I looked around for other errors of the
same type and found that postgresGetForeignPlan() is also using
get_parse_rowmark().  While that's harmless at the moment because we
don't support foreign tables as children, it's still wrong.  Will
fix that too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-11 Thread Etsuro Fujita
(2014/12/12 10:37), Tom Lane wrote:
 Yeah, this is clearly a thinko: really, nothing in the planner should
 be using get_parse_rowmark().  I looked around for other errors of the
 same type and found that postgresGetForeignPlan() is also using
 get_parse_rowmark().  While that's harmless at the moment because we
 don't support foreign tables as children, it's still wrong.  Will
 fix that too.

I don't think we need to fix that too.  In order to support that, I'm
proposing to modify postgresGetForeignPlan() in the following way [1]
(see fdw-inh-5.patch).

*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*************** postgresGetForeignPlan(PlannerInfo *root,
*** 824,834 ****
  	{
  		RowMarkClause *rc = get_parse_rowmark(root->parse, baserel->relid);
  
  		if (rc)
  		{
  			/*
! 			 * Relation is specified as a FOR UPDATE/SHARE target, so handle
! 			 * that.
  			 *
  			 * For now, just ignore any [NO] KEY specification, since (a) it's
  			 * not clear what that means for a remote table that we don't have
--- 824,847 ----
  	{
  		RowMarkClause *rc = get_parse_rowmark(root->parse, baserel->relid);
  
+ 		/*
+ 		 * It's possible that relation is contained in an inheritance set and
+ 		 * that parent relation is selected FOR UPDATE/SHARE.  If so, get the
+ 		 * RowMarkClause for parent relation.
+ 		 */
+ 		if (rc == NULL)
+ 		{
+ 			PlanRowMark *prm = get_plan_rowmark(root->rowMarks, baserel->relid);
+ 
+ 			if (prm && prm->rti != prm->prti)
+ 				rc = get_parse_rowmark(root->parse, prm->prti);
+ 		}
+ 
  		if (rc)
  		{
  			/*
! 			 * Relation or parent relation is specified as a FOR UPDATE/SHARE
! 			 * target, so handle that.
  			 *
  			 * For now, just ignore any [NO] KEY specification, since (a) it's
  			 * not clear what that means for a remote table that we don't have

[1] http://www.postgresql.org/message-id/5487bb9d.7070...@lab.ntt.co.jp

Thanks,

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-11 Thread Tom Lane
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 (2014/12/12 10:37), Tom Lane wrote:
 Yeah, this is clearly a thinko: really, nothing in the planner should
 be using get_parse_rowmark().  I looked around for other errors of the
 same type and found that postgresGetForeignPlan() is also using
 get_parse_rowmark().  While that's harmless at the moment because we
 don't support foreign tables as children, it's still wrong.  Will
 fix that too.

 I don't think we need to fix that too.  In order to support that, I'm
 proposing to modify postgresGetForeignPlan() in the following way [1]
 (see fdw-inh-5.patch).

My goodness, that's ugly.  And it's still wrong, because this is planner
code so it shouldn't be using get_parse_rowmark at all.  The whole point
here is that the rowmark info has been transformed into something
appropriate for the planner to use.  While that transformation is
relatively trivial today, it might not always be so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review of Refactoring code for sync node detection

2014-12-11 Thread Michael Paquier
On Thu, Dec 11, 2014 at 10:07 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 * I don't like filling the priorities-array in
 SyncRepGetSynchronousStandby(). It might be convenient for the one caller
 that needs it, but it seems pretty ad hoc.

 In fact, I don't really see the point of having the array at all. You
 could just print the priority in the main loop in
 pg_stat_get_wal_senders(). Yeah, nominally you need to hold SyncRepLock
 while reading the priority, but it seems over-zealous to be so strict about
 that in pg_stat_wal_senders(), since it's just an informational view, and
 priority changes so rarely that it's highly unlikely to hit a race
 condition there. Also note that when you change the list of synchronous
 standbys in the config file, and SIGHUP, the walsender processes will
 update their priorities in random order. So the idea that you get a
 consistent snapshot of the priorities is a bit bogus anyway. Also note that
 between filling the priorities array and the main loop in
 pg_stat_get_wal_senders(), a new WAL sender might connect and set a slot's
 pid. With the current coding, you'll print an uninitialized value from the
 array if that happens. So the only thing that holding the SyncRepLock
 really protects from is a torn read of the value, if reading an int is
 not atomic. We could use the spinlock to protect from that if we care.


That's fair, and it simplifies the whole process as well as the API used to
fetch the synchronous WAL sender.


 * Would be nicer to return a pointer, than index into the wal senders
 array. That's what the caller really wants.

 I propose the attached (I admit I haven't tested it).


Actually, if you do it this way, I think it would be worth adding the small
optimization Fujii-san mentioned upthread: if the priority is equal to 1, we
can leave the loop early and return the pointer immediately. All of that
together gives the attached patch, which I actually tested, FWIW, with
multiple standbys and multiple entries in s_s_names.
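
For what it's worth, the resulting choice can be checked from SQL like this
(the standby names are assumed here, purely for illustration):

  -- postgresql.conf on the primary (assumed):
  --   synchronous_standby_names = 'standby1, standby2'
  SELECT application_name, state, sync_priority, sync_state
    FROM pg_stat_replication
   ORDER BY sync_priority;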

Regards,
-- 
Michael
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..34530a0 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -5,7 +5,7 @@
  * Synchronous replication is new as of PostgreSQL 9.1.
  *
  * If requested, transaction commits wait until their commit LSN is
- * acknowledged by the sync standby.
+ * acknowledged by the synchronous standby.
  *
  * This module contains the code for waiting and release of backends.
  * All code in this module executes on the primary. The core streaming
@@ -358,6 +358,62 @@ SyncRepInitConfig(void)
 }
 
 /*
+ * Find the WAL sender servicing the synchronous standby with the lowest
+ * priority value, or NULL if no synchronous standby is connected. If there
+ * are multiple nodes with the same lowest priority value, the first node
+ * found is selected. The caller must hold SyncRepLock.
+ */
+WalSnd *
+SyncRepGetSynchronousStandby(void)
+{
+	WalSnd	   *result = NULL;
+	int			result_priority = 0;
+	int			i;
+
+	/* Scan WAL senders and find synchronous node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+		int			this_priority;
+
+		/* Must be active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Must be streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Must be synchronous */
+		this_priority = walsnd->sync_standby_priority;
+		if (this_priority == 0)
+			continue;
+
+		/* Must have a lower priority value than any previous ones */
+		if (result != NULL && result_priority <= this_priority)
+			continue;
+
+		/* Must have a valid flush position */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		result = (WalSnd *) walsnd;
+		result_priority = this_priority;
+
+		/*
+		 * If priority is equal to 1, we are sure that there are no other
+		 * WAL senders that could be synchronous with a lower priority,
+		 * hence leave immediately with this one.
+		 */
+		if (this_priority == 1)
+			return result;
+	}
+
+	return result;
+}
+
+/*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
  *
@@ -368,11 +424,9 @@ void
 SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
-	volatile WalSnd *syncWalSnd = NULL;
+	WalSnd	   *syncWalSnd;
 	int			numwrite = 0;
 	int			numflush = 0;
-	int			priority = 0;
-	int			i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +442,13 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
-	 * then we use the first mentioned standby. If you change this, also
-	 * change pg_stat_get_wal_senders().
+	 * then we use the first mentioned standby.
 	 */

Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-11 Thread Etsuro Fujita
(2014/12/12 11:19), Tom Lane wrote:
 Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 (2014/12/12 10:37), Tom Lane wrote:
 Yeah, this is clearly a thinko: really, nothing in the planner should
 be using get_parse_rowmark().  I looked around for other errors of the
 same type and found that postgresGetForeignPlan() is also using
 get_parse_rowmark().  While that's harmless at the moment because we
 don't support foreign tables as children, it's still wrong.  Will
 fix that too.
 
 I don't think we need to fix that too.  In order to support that, I'm
 proposing to modify postgresGetForeignPlan() in the following way [1]
 (see fdw-inh-5.patch).
 
 My goodness, that's ugly.  And it's still wrong, because this is planner
 code so it shouldn't be using get_parse_rowmark at all.  The whole point
 here is that the rowmark info has been transformed into something
 appropriate for the planner to use.  While that transformation is
 relatively trivial today, it might not always be so.

OK, I'll update the inheritance patch on top of the revision you'll make.

Thanks,

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_regress writes into source tree

2014-12-11 Thread Peter Eisentraut
When using a vpath build, pg_regress writes the processed input/*.source
files into the *source* tree, which isn't supposed to happen.

This appears to be a thinko introduced in this patch:
e3fc4a97bc8ee82a78605b5ffe79bd4cf3c6213b

The attached patch fixes it.
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 27c46ab..8683035 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -611,7 +611,7 @@ convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c
 static void
 convert_sourcefiles(void)
 {
-	convert_sourcefiles_in("input", inputdir, "sql", "sql");
+	convert_sourcefiles_in("input", outputdir, "sql", "sql");
 	convert_sourcefiles_in("output", outputdir, "expected", "out");
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-12-11 Thread Simon Riggs
On 17 November 2014 at 22:08, Simon Riggs si...@2ndquadrant.com wrote:
 On 17 November 2014 21:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 What happened to this patch?  I'm going over something that could use
 the concept of "clean some stuff up when reading this page, but only if
 we're already writing" or similar.

 I see some cases were presented that had a performance decrease.  Did we
 get any numbers for the increase in performance in some other
 interesting cases?

 It's not dead; it just needs more work. Maybe for next CF, or you can now.

Latest version attached for next CF

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


hot_disable.v7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

