Re: [HACKERS] WRITE_UINT_FIELD used where WRITE_OID_FIELD likely intended
On 12/11/2014 05:57 AM, Michael Paquier wrote:
> On Thu, Dec 11, 2014 at 7:44 AM, Mark Dilger <m...@port25.com> wrote:
>> At line 1787 of outfuncs.c, the line:
>>
>>     WRITE_UINT_FIELD(reltablespace)
>>
>> should probably say
>>
>>     WRITE_OID_FIELD(reltablespace)
>>
>> since that variable is of type Oid, not uint32. Granted, these two
>> macros are interchangeable, but they won't be if we ever move to
>> 64-bit Oids.
> For correctness you are right. Looks like you spent quite some time
> looking at that..

Fixed, thanks.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Casting issues with domains
On 11/12/2014 00:46, Tom Lane wrote:
> Kevin Grittner <kgri...@ymail.com> writes:
>> Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> As far as that goes, I think the OP was unhappy about the
>>> performance of the information_schema views, which in our
>>> implementation do exactly that so that the exposed types of the
>>> view columns conform to the SQL standard, even though the
>>> underlying catalogs use PG-centric types. I don't believe that
>>> that's the only reason why the performance of the
>>> information_schema views tends to be sucky, but it's certainly a
>>> reason.
>> Is that schema too edge-case to justify some functional indexes on
>> the cast values on the underlying catalogs? (I'm inclined to think
>> so, but it seemed like a question worth putting out there.)
> We don't support functional indexes on system catalogs, so whether
> it'd be justified is sorta moot. On the whole, though, I'm inclined to
> agree that the information_schema views aren't used enough to justify
> adding overhead to system-catalog updates, even if the pieces for that
> all existed.
>> Or, since these particular domains are known, is there any sane way
>> to special-case these to allow the underlying types to work?
> I don't particularly care for a kluge solution here.
>
> I notice that recent versions of the SQL spec contain the notion of a
> "distinct type", which is a user-defined type that is
> representationally identical to some base type but has its own name,
> and comes equipped with assignment-grade casts to and from the base
> type (which in PG terms would be binary-compatible casts, though the
> spec doesn't require that). It seems like this might be intended to be
> the sort of zero-cost type alias we were talking about, except that
> the SQL committee seems to have got it wrong by not specifying the
> cast-to-base-type as being implicit. Which ISTM you would want so that
> operators/functions on the base type would apply automatically to the
> distinct type.
> But perhaps we could extend the spec with some option to CREATE TYPE
> to allow the cast to come out that way. Or in short, maybe we should
> try to replace the domains used in the current information_schema with
> distinct types.

That's interesting and could easily solve the problem.

To give some context: for some reason, Drupal queries the
information_schema views before displaying some pages, and our customer
has many tables (approx 6 tables, organised à la Oracle with one schema
per database user). Thus the seq scan against pg_class takes ~50 ms,
while the very same scan without the cast takes less than 1 ms. Here is
an example of the query used:

SELECT column_name, data_type, column_default
FROM information_schema.columns
WHERE table_schema = 'one_schema'
  AND table_name = 'one_table'
  AND ( data_type = 'bytea'
        OR ( numeric_precision IS NOT NULL
             AND column_default::text LIKE '%nextval%' ) );

Regards
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Wed, Dec 10, 2014 at 10:18 PM, Josh Berkus <j...@agliodbs.com> wrote:
> So far, I haven't seen any features for 9.5 which would delay a more
> timely release the way we did for 9.4. Anybody know of a bombshell
> someone's going to drop on us for CF5?

I had wondered about that myself. What about jsquery? Is that something
that is planned for submission some time in the current cycle?

FWIW, I strongly doubt that I'll find time to work on anything like
that for 9.5.

--
Peter Geoghegan
Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby
On 12/11/2014 05:45 AM, Andres Freund wrote:
> A customer recently reported getting "backup_label contains data
> inconsistent with control file" after taking a basebackup from a
> standby and starting it with a typo in primary_conninfo.
>
> When starting postgres from a basebackup, StartupXLOG() has the
> following code to deal with backup labels:
>
>     if (haveBackupLabel)
>     {
>         ControlFile->backupStartPoint = checkPoint.redo;
>         ControlFile->backupEndRequired = backupEndRequired;
>         if (backupFromStandby)
>         {
>             if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
>                 ereport(FATAL,
>                         (errmsg("backup_label contains data inconsistent with control file"),
>                          errhint("This means that the backup is corrupted and you will have to use another backup for recovery.")));
>             ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
>         }
>     }
>
> While I'm not enthusiastic about the error message, that bit of code
> looks sane at first glance. We certainly expect the control file to
> indicate we're in recovery. Since we're unlinking the backup label
> shortly afterwards, we'd normally not expect to hit that case after a
> shutdown in recovery.

Check.

> The problem is that after reading the backup label we also have to
> read the corresponding checkpoint from pg_xlog. If primary_conninfo
> and/or restore_command are misconfigured and can't restore files, that
> can only be fixed by shutting down the cluster and fixing up
> recovery.conf - which sets DB_SHUTDOWNED_IN_RECOVERY in the control
> file.

No it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in
CreateRestartPoint(). If you shut down the server before it has even
read the initial checkpoint record, it will not attempt to create a
restartpoint nor update the control file.

> The easiest solution seems to be to simply also allow that as a state
> in the above check. It might be nicer to not allow a ShutdownXLOG to
> modify the control file et al at that stage, but I think that'd end up
> being more invasive.
>
> A short search shows that that also looks like a credible explanation
> for #12128...

Yeah.
I was not able to reproduce this, but I'm clearly missing something,
since both you and Sergey have seen this happening. Can you write a
script to reproduce?

- Heikki
[HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Hello, we found (and maybe fixed) two wrong behaviors related to
inheritance and FOR UPDATE. This report is about one of them.

This behavior can be observed on master and back to 8.4, though 8.4
behaves a bit more funny. I haven't checked 8.3. The issue is that some
queries return duplicate rows after FOR UPDATE was blocked - in other
words, after getting HeapTupleUpdated in ExecLockRows.

- Reproducing the symptom

This behavior is easy to reproduce as follows, where A and B are
separate sessions:

A=# DROP TABLE IF EXISTS p CASCADE;
A=# CREATE TABLE p (a int, b int, c int);
A=# CREATE TABLE c1 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c2 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c3 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# INSERT INTO c1 (SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# INSERT INTO c2 (SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# INSERT INTO c3 (SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a);
A=# ANALYZE;
A=# BEGIN;
A=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
A=# UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0;
B=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
-- the above is blocked by session A until it commits.
A=# COMMIT;

You will get the following result on session B just after the COMMIT on
session A:

  tableoid | ctid  | a | b | c
 ----------+-------+---+---+---
*    33834 | (0,1) | 0 | 0 | 0
     33834 | (0,4) | 0 | 1 | 0
     33838 | (0,1) | 1 | 0 | 0
*    33834 | (0,1) | 0 | 0 | 0
     33842 | (0,1) | 2 | 0 | 0
     33842 | (0,4) | 2 | 1 | 0
 (6 rows)

The lines prefixed with '*' appear twice in the result.
The plan for the query is as follows:

 LockRows
   ->  Result
         ->  Append
               ->  Seq Scan on p
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c1
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c2
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))
               ->  Seq Scan on c3
                     Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0))

- Analysis and solution

This seems to be caused in ExecScanFetch under EvalPlanQualNext, when
es_epqTupleSet[scanrelid - 1] is false (p and c1 in the above example).
In that case, execution falls through onto the non-EPQ route. As a
result, the accessMtd (= SeqScan; IndexScan behaves the same) returns a
null slot at the first iteration on p. Then at the second iteration, on
c1, it returns c1's first tuple (33834:(0,1)), and EvalPlanQualNext is
satisfied with the wrong tuple.

In the EPQ block in ExecScanFetch, it seems we should return NULL if
the relation does not have a test tuple. The attached patch does so on
the current master, and the problem disappears.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From 79c201539abdc59400af8339795c6c63c2980278 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Thu, 11 Dec 2014 09:06:59 +0900
Subject: [PATCH] Fix duplicate tuples for inheritance and FOR UPDATE

---
 src/backend/executor/execScan.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 1319519..d5c491c 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -74,6 +74,11 @@ ExecScanFetch(ScanState *node,
 
 			return slot;
 		}
+		else
+		{
+			/* This relation has no tuple to recheck. */
+			return NULL;
+		}
 	}
 
 	/*
-- 
2.1.0.GIT
[HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
Hello, this is about the second issue: SELECT from an inheritance
parent WHERE cond FOR UPDATE may return results that do not match the
cond.

The following steps reproduce the problematic behavior (A and B are
separate sessions) on master and back to 9.0; 8.4 gives the correct
result. I haven't checked 8.3.

- Reproducing the symptom

A=# SET enable_seqscan TO false;
A=# SET enable_bitmapscan TO false;
A=# DROP TABLE IF EXISTS p CASCADE;
A=# CREATE TABLE p (id text, a text, b text, c text);
A=# CREATE INDEX p_i1 ON p (a, b, c) WHERE b IN ('0', '1') AND c = '0';
A=# CREATE TABLE c1 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c2 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# CREATE TABLE c3 (LIKE p INCLUDING INDEXES) INHERITS (p);
A=# INSERT INTO c1 (SELECT 1 + a, 0, a % 4, 0 FROM generate_series(0, 7) a);
A=# INSERT INTO c2 (SELECT 11 + a, 1, a % 4, 0 FROM generate_series(0, 7) a);
A=# INSERT INTO c3 (SELECT 21 + a, 2, a % 4, 0 FROM generate_series(0, 7) a);
A=# ANALYZE;
A=# BEGIN;
A=# CREATE TEMP TABLE tt1 AS
A=#   SELECT id FROM p WHERE b IN ('0', '1') AND c = '0' ORDER BY id LIMIT 1 FOR UPDATE;
A=# UPDATE p SET b = -1 WHERE id IN (SELECT id FROM tt1) RETURNING id;
A=# DROP TABLE tt1;
A=# SET enable_seqscan TO false;
A=# SET enable_bitmapscan TO false;
B=# SELECT tableoid, ctid, * FROM p WHERE b IN ('0', '1') AND c = '0' ORDER BY id LIMIT 1 FOR UPDATE;
A=# COMMIT;

On session B:

  tableoid | ctid  | id | a | b  | c
 ----------+-------+----+---+----+---
     34316 | (0,9) | 1  | 0 | -1 | 0

b = -1 apparently contradicts the WHERE clause. The plan for the query
is as follows; the part "b IN ('0', '1')" of the WHERE clause is
omitted even though it is required by the EPQ recheck.
 Limit
   ->  LockRows
         ->  Sort
               Sort Key: p.id
               ->  Result
                     ->  Append
                           ->  Index Scan using p_i1 on p
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c1_a_b_c_idx on c1
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c2_a_b_c_idx on c2
                                 Index Cond: (c = '0'::text)
                           ->  Index Scan using c3_a_b_c_idx on c3
                                 Index Cond: (c = '0'::text)

- Analysis and solution

This is caused by IndexRecheck examining the test tuple with the qual
c = '0' but without b IN ('0', '1'); that part has been removed in
create_indexscan_plan. It decides whether to remove a qual using
get_parse_rowmark(root->parse) - which consults parse->rowMarks - and
predicate_implied_by(). But the former always says no (NULL) for child
relations even if the parent has rowmarks. On the other hand, rowmarks
on children have already been distributed into root->rowMarks by that
time by expand_inherited_rtentry(). So I replaced get_parse_rowmark()
with get_plan_rowmark() as in the attached patch, and the problem
disappeared.

By the way, get_plan_rowmark() has a comment like this:

 * This probably ought to be elsewhere, but there's no very good place

I haven't moved it anywhere, but createplan.c might be a good place.
regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From a0e26063bd412ce06df7dc252b8881e51af10f35 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Thu, 11 Dec 2014 18:20:37 +0900
Subject: [PATCH] Fix bogus tuples for inheritance and FOR UPDATE

---
 src/backend/optimizer/plan/createplan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bf8dbe0..30df3ce 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -29,6 +29,7 @@
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/paths.h"
+#include "optimizer/prep.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
@@ -1231,7 +1232,7 @@ create_indexscan_plan(PlannerInfo *root,
 		if (best_path->indexinfo->indpred)
 		{
 			if (baserelid != root->parse->resultRelation &&
-				get_parse_rowmark(root->parse, baserelid) == NULL)
+				get_plan_rowmark(root->rowMarks, baserelid) == NULL)
 				if (predicate_implied_by(clausel, best_path->indexinfo->indpred))
 					continue;	/* implied by index predicate */
-- 
2.1.0.GIT
Re: [HACKERS] inherit support for foreign tables
(2014/12/11 14:54), Ashutosh Bapat wrote:
> I marked this as ready for committer.

Many thanks!

Best regards,
Etsuro Fujita
Re: [HACKERS] 9.5: Memory-bounded HashAgg
On Sun, 2014-08-10 at 14:26 -0700, Jeff Davis wrote:
> This patch requires the Memory Accounting patch, or something similar,
> to track memory usage. The attached patch enables hashagg to spill to
> disk, which means that hashagg will contain itself to work_mem even if
> the planner makes a bad misestimate of the cardinality.

New patch attached. All open items are complete, though the patch may
have a few rough edges.

Summary of changes:

* rebased on top of latest memory accounting patch
  http://www.postgresql.org/message-id/1417497257.5584.5.camel@jeff-desktop
* added a flag to hash_create to prevent it from creating an extra
  level of memory context - without this, the memory accounting would
  have a measurable impact on performance
* cost model for the disk usage
* intelligently choose the number of partitions for each pass of the
  data
* explain support
* in build_hash_table(), be more intelligent about the value of
  nbuckets to pass to BuildTupleHashTable() - BuildTupleHashTable tries
  to choose a value to keep the table in work_mem, but it isn't very
  accurate
* some very rudimentary testing (sanity checks, really) shows good
  results

Summary of previous discussion (my summary; I may have missed some
points):

Tom Lane requested that the patch also handle the case where transition
values grow (e.g. array_agg) beyond work_mem. I feel this patch
provides a lot of benefit as it is, and trying to handle that case
would be a lot more work (we need a way to write the transition values
out to disk at a minimum, and perhaps combine them with other
transition values). I also don't think my patch would interfere with a
fix there in the future.

Tomas Vondra suggested an alternative design that more closely
resembles HashJoin: instead of filling up the hash table and then
spilling any new groups, the idea would be to split the current data
into two partitions, keep one in the hash table, and spill the other
(see ExecHashIncreaseNumBatches()). This has the advantage that it's
very fast to identify whether the tuple is part of the in-memory batch
or not, and we can avoid even looking in the memory hashtable if not.

The batch-splitting approach has a major downside, however: you are
likely to evict a skew value from the in-memory batch, which will
result in all subsequent tuples with that skew value going to disk. My
approach never evicts from the in-memory table until we actually
finalize the groups, so the skew values are likely to be completely
processed in the first pass.

So, the attached patch implements my original approach, which I still
feel is the best solution.

Regards,
	Jeff Davis

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 3017,3022 **** include_dir 'conf.d'
--- 3017,3037 ----
       </listitem>
      </varlistentry>
  
+     <varlistentry id="guc-enable-hashagg-disk" xreflabel="enable_hashagg_disk">
+      <term><varname>enable_hashagg_disk</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_hashagg_disk</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's use of hashed aggregation
+        plan types when the planner expects the hash table size to exceed
+        <varname>work_mem</varname>. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+ 
      <varlistentry id="guc-enable-hashjoin" xreflabel="enable_hashjoin">
       <term><varname>enable_hashjoin</varname> (<type>boolean</type>)
       <indexterm>
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 86,91 **** static void show_sort_group_keys(PlanState *planstate, const char *qlabel,
--- 86,92 ----
  				 List *ancestors, ExplainState *es);
  static void show_sort_info(SortState *sortstate, ExplainState *es);
  static void show_hash_info(HashState *hashstate, ExplainState *es);
+ static void show_hashagg_info(AggState *hashstate, ExplainState *es);
  static void show_tidbitmap_info(BitmapHeapScanState *planstate,
  				ExplainState *es);
  static void show_instrumentation_count(const char *qlabel, int which,
***************
*** 1423,1428 **** ExplainNode(PlanState *planstate, List *ancestors,
--- 1424,1430 ----
  		case T_Agg:
  			show_agg_keys((AggState *) planstate, ancestors, es);
  			show_upper_qual(plan->qual, "Filter", planstate, ancestors, es);
+ 			show_hashagg_info((AggState *) planstate, es);
  			if (plan->qual)
  				show_instrumentation_count("Rows Removed by Filter", 1,
  										   planstate, es);
***************
*** 1913,1918 **** show_sort_info(SortState *sortstate, ExplainState *es)
--- 1915,1956 ----
  }
  
  /*
+  * Show information on hash aggregate buckets and batches
+  */
+ static void
+ show_hashagg_info(AggState *aggstate, ExplainState *es)
+ {
+ 	Agg		   *agg = (Agg *) aggstate->ss.ps.plan;
+ 
+ 	Assert(IsA(aggstate, AggState));
+ 
+ 	if (agg->aggstrategy != AGG_HASHED)
+ 		return;
+ 
+ 	if
Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions
I'm marking this as Rejected in the commitfest. It's quite clear that
this isn't going to fly in its current form.

For the COPY FROM use case, I'd suggest just doing COPY FROM STDIN.
Yes, it's slower, but not much. And you probably could optimize it
further - there's some gratuitous memcpy()ing happening from buffer to
buffer in that codepath.

- Heikki
Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby
On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> On 12/11/2014 05:45 AM, Andres Freund wrote:
>> A customer recently reported getting "backup_label contains data
>> inconsistent with control file" after taking a basebackup from a
>> standby and starting it with a typo in primary_conninfo.
>>
>> When starting postgres from a basebackup, StartupXLOG() has the
>> following code to deal with backup labels:
>>
>>     if (haveBackupLabel)
>>     {
>>         ControlFile->backupStartPoint = checkPoint.redo;
>>         ControlFile->backupEndRequired = backupEndRequired;
>>         if (backupFromStandby)
>>         {
>>             if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
>>                 ereport(FATAL,
>>                         (errmsg("backup_label contains data inconsistent with control file"),
>>                          errhint("This means that the backup is corrupted and you will have to use another backup for recovery.")));
>>             ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
>>         }
>>     }
>>
>> While I'm not enthusiastic about the error message, that bit of code
>> looks sane at first glance. We certainly expect the control file to
>> indicate we're in recovery. Since we're unlinking the backup label
>> shortly afterwards, we'd normally not expect to hit that case after
>> a shutdown in recovery.
>
> Check.
>
>> The problem is that after reading the backup label we also have to
>> read the corresponding checkpoint from pg_xlog. If primary_conninfo
>> and/or restore_command are misconfigured and can't restore files,
>> that can only be fixed by shutting down the cluster and fixing up
>> recovery.conf - which sets DB_SHUTDOWNED_IN_RECOVERY in the control
>> file.
>
> No it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in
> CreateRestartPoint(). If you shut down the server before it has even
> read the initial checkpoint record, it will not attempt to create a
> restartpoint nor update the control file.

Yes, it does. There's a shortcut that just sets the state in the
control file and then exits.

>> The easiest solution seems to be to simply also allow that as a
>> state in the above check.
>> It might be nicer to not allow a ShutdownXLOG to modify the control
>> file et al at that stage, but I think that'd end up being more
>> invasive.
>>
>> A short search shows that that also looks like a credible
>> explanation for #12128...
>
> Yeah.
>
> I was not able to reproduce this, but I'm clearly missing something,
> since both you and Sergey have seen this happening. Can you write a
> script to reproduce?

Not right now, I only have my mobile... It's quite easy though. Create
a pg_basebackup from a standby. Create a recovery.conf with a broken
primary_conninfo. Start. Shutdown. Fix the conninfo. Start.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile
phone.

Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review of Refactoring code for sync node detection
On 11/18/2014 11:23 PM, Michael Paquier wrote:
> On Tue, Nov 18, 2014 at 6:33 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> Can we just wait on this patch until we have the whole feature?
> Well, this may take some time to even define, and even if goals are
> clearly defined this may take even more time to have a prototype to
> discuss about.
>> We quite often break larger patches down into smaller ones, but if
>> they arrive in lots of small pieces its more difficult to understand
>> and correct issues that are happening on the larger scale. Churning
>> the same piece of code multiple times is rather mind numbing.
> Hm. I think that we are losing ourselves in this thread. The primary
> goal is to remove a code duplication between syncrep.c and walsender.c
> that exists since 9.1. Would it be possible to focus only on that for
> now? That's still the topic of this CF.

Some comments on this patch:

* I don't like filling the priorities array in
SyncRepGetSynchronousStandby(). It might be convenient for the one
caller that needs it, but it seems pretty ad hoc. In fact, I don't
really see the point of having the array at all. You could just print
the priority in the main loop in pg_stat_get_wal_senders(). Yeah,
nominally you need to hold SyncRepLock while reading the priority, but
it seems over-zealous to be so strict about that in
pg_stat_wal_senders(), since it's just an informational view, and
priority changes so rarely that it's highly unlikely to hit a race
condition there. Also note that when you change the list of synchronous
standbys in the config file, and SIGHUP, the walsender processes will
update their priorities in random order. So the idea that you get a
consistent snapshot of the priorities is a bit bogus anyway.

Also note that between filling the priorities array and the main loop
in pg_stat_get_wal_senders(), a new WAL sender might connect and set a
slot's pid. With the current coding, you'll print an uninitialized
value from the array if that happens.
So the only thing that holding the SyncRepLock really protects from is
a torn read of the value, if reading an int is not atomic. We could use
the spinlock to protect from that if we care.

* Would be nicer to return a pointer than an index into the wal senders
array. That's what the caller really wants.

I propose the attached (I admit I haven't tested it).

- Heikki

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..da89a3b 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -5,7 +5,7 @@
  * Synchronous replication is new as of PostgreSQL 9.1.
  *
  * If requested, transaction commits wait until their commit LSN is
- * acknowledged by the sync standby.
+ * acknowledged by the synchronous standby.
  *
  * This module contains the code for waiting and release of backends.
  * All code in this module executes on the primary. The core streaming
@@ -358,6 +358,54 @@ SyncRepInitConfig(void)
 }
 
 /*
+ * Find the WAL sender servicing the synchronous standby with the lowest
+ * priority value, or NULL if no synchronous standby is connected. If there
+ * are multiple nodes with the same lowest priority value, the first node
+ * found is selected. The caller must hold SyncRepLock.
+ */
+WalSnd *
+SyncRepGetSynchronousStandby(void)
+{
+	WalSnd	   *result = NULL;
+	int			result_priority = 0;
+	int			i;
+
+	/* Scan WAL senders and find synchronous node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+		int			this_priority;
+
+		/* Must be active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Must be streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Must be synchronous */
+		this_priority = walsnd->sync_standby_priority;
+		if (this_priority == 0)
+			continue;
+
+		/* Must have a lower priority value than any previous ones */
+		if (result != NULL && result_priority <= this_priority)
+			continue;
+
+		/* Must have a valid flush position */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		result = (WalSnd *) walsnd;
+		result_priority = this_priority;
+	}
+
+	return result;
+}
+
+/*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
  *
@@ -368,11 +416,9 @@ void
 SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
-	volatile WalSnd *syncWalSnd = NULL;
+	WalSnd	   *syncWalSnd;
 	int			numwrite = 0;
 	int			numflush = 0;
-	int			priority = 0;
-	int			i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +434,13 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
	 * then we use the first mentioned standby. If you change this,
Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby
On 11/12/14 12:38, Andres Freund wrote:
> On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas
> <hlinnakan...@vmware.com> wrote:
>> On 12/11/2014 05:45 AM, Andres Freund wrote:
>> Yeah. I was not able to reproduce this, but I'm clearly missing
>> something, since both you and Sergey have seen this happening. Can
>> you write a script to reproduce?
> Not right now, I only have my mobile... It's quite easy though.
> Create a pg_basebackup from a standby. Create a recovery.conf with a
> broken primary_conninfo. Start. Shutdown. Fix the conninfo. Start.

Just tested it. These steps are not sufficient to reproduce the issue
on a test installation, I suppose because on a small test datadir the
checkpoint location and the redo location in pg_control are the same as
those in the backup_label.

To trigger this bug you need at least one restartpoint to have happened
on the standby between the start and the end of the backup. You can
simulate it by issuing a checkpoint on the master, then a checkpoint on
the standby (to force a restartpoint), then copying the pg_control from
the standby. This way I've been able to reproduce it.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
On 11/5/14 9:50 AM, Ali Akbar wrote:
> I noticed a somewhat big performance regression if the xml is large (i
> use the PRODML Product Volume sample from energistics.org):
>
> * Without patch (tested 3 times):
> select unnest(xpath('//a:flow', x, ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;
> Time: 84,012 ms
> Time: 85,683 ms
> Time: 88,766 ms
>
> * With latest v6 patch (notice the correct result with namespace
> definition):
> select unnest(xpath('//a:flow', x, ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;
> Time: 108,912 ms
> Time: 108,267 ms
> Time: 114,848 ms
>
> It's a 23% performance regression.
>
> * Just curious, i'm also testing v5 patch performance (notice the
> namespace in the result):
> select unnest(xpath('//a:flow', x, ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;
> Time: 92,552 ms
> Time: 97,440 ms
> Time: 99,309 ms
>
> The regression is only 13%.
>
> I know the xmlCopyNode() version (v6 patch) is much cleaner than the
> v5 patch; should we consider the performance benefit?

I ran a test using postgres-US.fo built in the PostgreSQL source tree,
which is 38 MB, and ran

select unnest(xpath('//fo:bookmark-title', b, array[array['fo', 'http://www.w3.org/1999/XSL/Format']])) from data;

(Table contains one row only.) The timings were basically
indistinguishable between the three code versions.

I'll try to reproduce your test. How big is your file? Do you have a
link to the actual file? Could you share your load script?
Re: [HACKERS] Commitfest problems
On Wed, Dec 10, 2014 at 11:00:21PM -0800, Josh Berkus wrote:
> On 12/10/2014 10:35 PM, Bruce Momjian wrote:
>> I think we are reaching the limits on how much we can do with our
>> existing commitfest structure, and it might be time to consider
>> changes. While the commitfest process hasn't changed much and was
>> very successful in the first few years, a few things have changed
>> externally:
>>
>> 1. more new developers involved in contributing small patches
>> 2. more full-time developers creating big patches
>> 3. increased time demands on experienced Postgres developers
>
> I will add:
>
> 4. commitfest managers have burned out and refuse to do it again

Agreed. The fun, if it was ever there, has left the commitfest process.

-- 
Bruce Momjian  <br...@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] On partitioning
On Thu, Dec 11, 2014 at 12:00 AM, Amit Kapila amit.kapil...@gmail.com wrote: Yeah either this way or what Josh has suggested upthread, the main point was that if at all we want to support multi-column list partitioning then we need to have slightly different syntax, however I feel that we can leave multi-column list partitioning for first version. Yeah, possibly. I think we could stand to have a lot more discussion about the syntax here. So far the idea seems to be to copy what Oracle has, but it's not clear if we're going to have exactly what Oracle has or something subtly different. I personally don't find the Oracle syntax very PostgreSQL-ish. Stuff like VALUES LESS THAN 500 doesn't sit especially well with me - less than according to which opclass? Are we going to insist that partitioning must use the default btree opclass so that we can use that syntax? That seems kind of lame. There are lots of interesting things we could do here, e.g.: CREATE TABLE parent_name PARTITION ON (column [ USING opclass ] [, ... ]); CREATE TABLE child_name PARTITION OF parent_name FOR { (value, ...) [ TO (value, ...) ] } [, ...]; So instead of making a hard distinction between range and list partitioning, you can say: CREATE TABLE child_name PARTITION OF parent_name FOR (3), (5), (7); CREATE TABLE child2_name PARTITION OF parent_name FOR (8) TO (12); CREATE TABLE child2_name PARTITION OF parent_name FOR (20) TO (30), (120) TO (130); Now that might be a crappy idea for various reasons, but the point is there are a lot of details to be hammered out with the syntax, and there are several ways we can go wrong. If we choose an overly-limiting syntax, we're needlessly restricting what can be done. If we choose an overly-permissive syntax, we'll restrict the optimization opportunities. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSSAPI, SSPI - include_realm default
On Wed, Dec 10, 2014 at 4:53 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Dec 9, 2014 at 05:40:35PM -0500, Stephen Frost wrote: I thought the idea was to backpatch documentation saying it's a good idea to change this value to x because of y. Not actually referring to the upcoming change directly. And I still think that part is a good idea, as it helps people avoid potential security pitfalls. I agree with this but I don't really see why we wouldn't say hey, this is going to change in 9.5. Peter's argument sounds like he'd rather we not make any changes to the existing documentation, and I don't agree with that, and if we're making changes then, imv, we might as well comment that the default is changed in 9.5. I agree with Peter --- it is unwise to reference a future released feature in a backbranch doc patch. Updating the backbranch docs to add a recommendation is fine. I am strongly in agreement with that principle as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSL information view
Magnus Hagander mag...@hagander.net writes: You should add it to the next CF for proper tracking, there are already many patches in the queue waiting for reviews :) Absolutely - I just wanted those that were already involved in the thread to get a chance to look at it early :) didn't want to submit it until it was complete. Which it is now - attached is a new version that includes docs. Here's my review of the patch: * Applies to the current HEAD, no failed hunks. * Compiles and works as advertised. * Docs included. The following catches my eye: psql (9.5devel) SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off) Type "help" for help. postgres=# select * from pg_stat_ssl; pid | ssl | bits | compression | version | cipher | clientdn ------+-----+------+-------------+---------+-----------------------------+---------- 1343 | t | 256 | f | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 | (1 row) I think the order of details in the psql prompt makes more sense, because it puts more important details first. I suggest we reorder the view columns, while also renaming 'version' to 'protocol' (especially since we have another patch in the works that adds a GUC named 'ssl_protocols'): pid, ssl, protocol, cipher, bits, compression, clientdn. Next, this one: + be_tls_get_cipher(Port *port, char *ptr, size_t len) + { + if (port->ssl) + strlcpy(ptr, SSL_get_cipher(port->ssl), NAMEDATALEN); should be this: + strlcpy(ptr, SSL_get_cipher(port->ssl), len); The same goes for this one: + be_tls_get_peerdn_name(Port *port, char *ptr, size_t len) + { + if (port->peer) + strlcpy(ptr, X509_NAME_to_cstring(X509_get_subject_name(port->peer)), NAMEDATALEN); The NAMEDATALEN constant is passed in the calling code in pgstat_bestart(). Other than that, looks good to me. -- Alex
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 12:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: Quite. So, here's a new thread. MHO is that, although 9.4 has slipped more than any of us would like, 9.5 development launched right on time in August. So I don't see a good reason to postpone 9.5 release just because 9.4 has slipped. I think we should stick to the schedule agreed to in Ottawa last spring. Comments? I'm fine with that, but in the spirit of playing the devil's advocate: 1. At the development meeting, Simon argued for the 5CF schedule for this release, with CF5 not starting until February, as a way of making sure that there was time after the release of 9.4 to get feedback from actual users in time to do something about it for 9.5. If anything, we're going to end up being significantly worse off in that regard than we would have been, because we're releasing in late December instead of early September; an extra month of time to get patches in does not make up for a release that was delayed nearly three months. 2. It's not clear that we're going to have a particularly-impressive list of major features for 9.5. So far we've got RLS and BRIN. I expect that GROUPING SETS is far enough along that it should be possible to get it in before development ends, and there are a few performance patches pending (Andres's lwlock scalability patches, Rahila's work on compressing full-page writes) that I think will probably make the grade. But after that it seems to me that it gets pretty thin on the ground. Are we going to bill commit timestamp tracking - with replication node ID tracking as the real goal, despite the name - as a major feature, or DDL deparsing if that goes in, as major features? As useful as they may be for BDR, they don't strike me as things we can publicize as major features independent of BDR. And it's getting awfully late for any other major work that people are thinking of to start showing up. 
Now, against all that, if we don't get back on our usual release schedule then (a) it will look like we're losing momentum, which I'm actually afraid may be true rather than merely a perception, and (b) people whose stuff did get in will have to wait longer to see it released. So, I'm not sure waiting is any better. But there are certainly some things not to like about where we are. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus j...@agliodbs.com wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. And it became pretty clear early on in that discussion that it was not going to be resolved until Tom signed off on some proposed fix. Which I think points to the issue Bruce mentioned about how busy many of our key contributors are. I could have easily spent four times as much time doing reviewing and committing over the last few months, but I'm not willing to work an 80 or 100 hour week to make that happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus j...@agliodbs.com wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. Meh. While we certainly weren't very speedy about resolving that, I don't think that issue deserves all or even most of the blame. I agree with Josh: the problem really was that people were not focusing on getting 9.4 tested and releasable. One way in which that lack of focus manifested was not having any urgency about resolving JSONB ... but it struck me as a systemic problem and not that specific issue's fault. I'd finger two underlying issues here: 1. As Bruce points out in a nearby thread, we've been in commitfest mode more or less continuously since August. That inherently sucks away developer mindshare from testing already-committed stuff. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. I don't know what to do about either of those things, but I predict future release cycles will be just as bad unless we can fix them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 11, 2014 at 01:26:38PM +0530, Rahila Syed wrote: I am sorry but I can't understand the above results due to wrapping. Are you saying compression was twice as slow? CPU usage at user level (in seconds) for compression set 'on' is 562 secs while that for compression set 'off' is 354 secs. As per the readings, it takes a little less than double the CPU time to compress. However, the total time taken to run 25 transactions for each of the scenarios is as follows: compression = 'on' : 1838 secs, = 'off' : 1701 secs. The difference is around 140 secs. OK, so the compression took 2x the CPU and was 8% slower. The only benefit is WAL files are 35% smaller? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] double vacuum in initdb
Tom Lane t...@sss.pgh.pa.us wrote: I think we could go to PG_CMD_PUTS("ANALYZE;\nVACUUM FULL FREEZE;\n"); without any degradation of the intended results. Another idea would be to drop the FULL part and make this PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n"); We want to finish with VACUUM FREEZE without the FULL, unless we don't care about missing visibility maps and free space maps. [ initdb and start the cluster ] server started kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls1 kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum freeze;" postgres VACUUM kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls2 kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum full freeze;" postgres VACUUM kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l > ~/ls3 kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls1 | wc -l 116 kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls2 | wc -l 119 kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm ~/ls3 | wc -l 80 kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls1 | wc -l 116 kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls2 | wc -l 117 kgrittn@Kevin-Desktop:~/pg/master$ grep _vm ~/ls3 | wc -l 77 -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 10:37:32AM -0500, Robert Haas wrote: 2. It's not clear that we're going to have a particularly-impressive list of major features for 9.5. So far we've got RLS and BRIN. I expect that GROUPING SETS is far enough along that it should be possible to get it in before development ends, and there are a few performance patches pending (Andres's lwlock scalability patches, Rahila's work on compressing full-page writes) that I think will probably make the grade. But after that it seems to me that it gets pretty thin on the ground. Are we going to bill commit timestamp tracking - with replication node ID tracking as the real goal, despite the name - as a major feature, or DDL deparsing if that goes in, as major features? As useful as they may be for BDR, they don't strike me as things we can publicize as major features independent of BDR. And it's getting awfully late for any other major work that people are thinking of to start showing up. How bad is the 9.5 feature list going to be compared to the 9.4 one that had JSONB, but also a lot of infrastructure additions? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Commitfest problems
On 12/11/14 1:35 AM, Bruce Momjian wrote: While the commitfest process hasn't changed much and was very successful in the first few years, a few things have changed externally: 1. more new developers involved in contributing small patches 2. more full-time developers creating big patches 3. increased time demands on experienced Postgres developers The number of patches registered in the commit fests hasn't actually changed over the years. It has always fluctuated between 50 and 100, depending on the point of the release cycle. So I don't think (1) is necessarily the problem.
Re: [HACKERS] WIP: multivariate statistics / proof of concept
On 10/13/2014 01:00 AM, Tomas Vondra wrote: Hi, attached is a WIP patch implementing multivariate statistics. Great! Really glad to see you working on this. +* FIXME This sample sizing is mostly OK when computing stats for +* individual columns, but when computing multivariate stats +* (histograms, mcv, ...) it's rather insufficient. For a small +* number of dimensions it works, but for complex stats it'd be +* nice to use a sample proportional to the table (say, 0.5% - 1%) +* instead of a fixed size. I don't think a fraction of the table is appropriate. As long as the sample is random, the accuracy of a sample doesn't depend much on the size of the population. For example, if you sample 1,000 rows from a table with 100,000 rows, or 1,000 rows from a table with 100,000,000 rows, the accuracy is pretty much the same. That doesn't change when you go from a single variable to multiple variables. You do need a bigger sample with multiple variables, however. My gut feeling is that if you sample N rows for a single variable, with two variables you need to sample N^2 rows to get the same accuracy. But it's not proportional to the table size. (I have no proof for that, but I'm sure there is literature on this.) + * Multivariate histograms + * + * Histograms are a collection of buckets, represented by n-dimensional + * rectangles. Each rectangle is delimited by an array of lower and + * upper boundaries, so that for the i-th attribute + * + * min[i] <= value[i] <= max[i] + * + * Each bucket tracks frequency (fraction of tuples it contains), + * information about the inequalities, number of distinct values in + * each dimension (which is used when building the histogram) etc. + * + * The boundaries may be either inclusive or exclusive, or the whole + * dimension may be NULL. + * + * The buckets may overlap (assuming the build algorithm keeps the + * frequencies additive) or may not cover the whole space (i.e. allow + * gaps). 
This entirely depends on the algorithm used to build the + * histogram. That sounds pretty exotic. These buckets are quite different from the single-dimension buckets we currently have. The paper you reference in partition_bucket() function, M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36, actually doesn't mention overlapping buckets at all. I haven't read the code in detail, but if it implements the algorithm from that paper, there will be no overlap. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
Bruce Momjian br...@momjian.us writes: On Thu, Dec 11, 2014 at 10:37:32AM -0500, Robert Haas wrote: 2. It's not clear that we're going to have a particularly-impressive list of major features for 9.5. How bad is the 9.5 feature list going to be compared to the 9.4 one that had JSONB, but also a lot of infrastructure additions. Well, whatever the list ends up being, let's wait until we have some more features isn't a tenable scheduling policy. We learned years ago the folly of delaying a release until not-quite-ready feature X was ready. Are we going to delay 9.5 until not-even-proposed-yet features are ready? More abstractly, there's a lot of value in having a predictable release schedule. That's going to mean that some release cycles are thin on user-visible features, even if just as much work went into them. It's the nature of the game. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. Meh. While we certainly weren't very speedy about resolving that, I don't think that issue deserves all or even most of the blame. I agree with Josh: the problem really was that people were not focusing on getting 9.4 tested and releasable. One way in which that lack of focus manifested was not having any urgency about resolving JSONB ... but it struck me as a systemic problem and not that specific issue's fault. I'd finger two underlying issues here: 1. As Bruce points out in a nearby thread, we've been in commitfest mode more or less continuously since August. That inherently sucks away developer mindshare from testing already-committed stuff. The problem is that, on the one hand, we have a number of serious problems with things that got committed and turned out to have problems - the multixact stuff, and JSONB, in particular - and on the other hand, we are lacking in adequate committer bandwidth to properly handle all of the new patches that come in. We can fix the first problem by tightening up on the requirements for committing things, but that exacerbates the second problem. Or we can fix the second problem by loosening up on the requirements for commit, but that exacerbates the first problem. Promoting more or fewer committers is really the same trade-off: if you're very careful about who you promote, you'll get better people but not as many of them, so less will get done but with fewer mistakes; if you're more generous in handing out commit bits, you reduce the bottleneck to stuff getting done but, inevitably, you'll be trusting people in whom you have at least slightly less confidence. 
There's an inherent tension between quality and rate of progress that we can't get rid of, and the fact that some of our best people are busier than ever with things other than PostgreSQL hacking is not helping - not only because less actual review/commit happens, but because newcomers to the community don't have as much contact with the more senior people who could help mentor them if they only had the time. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. I don't know what to do about either of those things, but I predict future release cycles will be just as bad unless we can fix them. I agree. We have had a few people, Jeff Janes perhaps foremost among them, who have done a lot of really useful testing, but overall it does feel pretty thin. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] double vacuum in initdb
On Thu, Dec 11, 2014 at 11:44 AM, Kevin Grittner kgri...@ymail.com wrote: We want to finish with VACUUM FREEZE without the FULL, unless we don't care about missing visibility maps and free space maps. Oh, good point. I had forgotten about that issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
Peter Eisentraut pete...@gmx.net writes: On 12/11/14 1:35 AM, Bruce Momjian wrote: While the commitfest process hasn't changed much and was very successful in the first few years, a few things have changed externally: 1 more new developers involved in contributing small patches 2 more full-time developers creating big patches 3 increased time demands on experienced Postgres developers The number of patches registered in the commit fests hasn't actually changed over the years. It has always fluctuated between 50 and 100, depending on the point of the release cycle. So I don't think (1) is necessarily the problem. Interesting point, but of course not all patches are equal; perhaps the problem is that we're seeing fewer of (1) and more of (2) in the commitfest queue. Is there any easy way to quantify the size/complexity of the patches in past fests? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
Tom Lane-2 wrote: Robert Haas writes: On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. The compressibility properties of a new type seem like something that should be mandated before it is committed - it shouldn't require good fortune that someone thought to test it out.
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
David G Johnston wrote: Tom Lane-2 wrote: Robert Haas writes: On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. The compressibility properties of a new type seem like something that should be mandated before it is committed - it shouldn't require good fortune that ... someone thought to test it out. Is there anywhere to effectively add that to maximize the chance someone remembers for next time? 3. Effort and brain power were also diverted to fixing 9.3 this time around. I don't have any answers, but if a release is thin on features but thick on review and testing I wouldn't complain. That testing, especially applied to existing releases, would have considerable benefit to those still using older supported releases instead of only benefitting those who can upgrade to the latest and greatest. An existing user on an older release is just as, if not more, important than the potential user who is looking for new features before deciding to migrate or begin using PostgreSQL. David J. 
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On 12/11/2014 06:59 PM, Robert Haas wrote: On Thu, Dec 11, 2014 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. Meh. While we certainly weren't very speedy about resolving that, I don't think that issue deserves all or even most of the blame. I agree with Josh: the problem really was that people were not focusing on getting 9.4 tested and releasable. One way in which that lack of focus manifested was not having any urgency about resolving JSONB ... but it struck me as a systemic problem and not that specific issue's fault. I'd finger two underlying issues here: 1. As Bruce points out in a nearby thread, we've been in commitfest mode more or less continuously since August. That inherently sucks away developer mindshare from testing already-committed stuff. The problem is that, on the one hand, we have a number of serious problems with things that got committed and turned out to have problems - the multixact stuff, and JSONB, in particular - and on the other hand, we are lacking in adequate committer bandwidth to properly handle all of the new patches that come in. We can fix the first problem by tightening up on the requirements for committing things, but that exacerbates the second problem. Or we can fix the second problem by loosening up on the requirements for commit, but that exacerbates the first problem. There is a third option: reject more patches, more quickly, with less discussion. IOW, handle new patches less properly. The commitfest is good at making sure that no patch completely falls off the radar. That's also a problem. Before the commitfest process, many patches were not actively rejected, but they just fell to the side if no committer was interested enough in it to pick it up, review, and commit. There are a lot of patches in the October commitfest that I personally don't care about, and if I was a dictator I would just drop as not worth the trouble to review. 
Often a patch just doesn't feel quite right, or would require some work to clean up, but you can't immediately point to anything outright wrong with it. It takes some effort to review such a patch enough to give feedback on it, if you want more meaningful feedback than "This smells bad." I imagine that it's the same for everyone else. Many of the patches that sit in the commitfest for weeks are patches that no-one really cares much about. I'm not sure what to do about that. It would be harsh to reject a patch just because no-one's gotten around to reviewing it, and if we start doing that, it's not clear what the point of a commitfest is any more. Perhaps we should change the process so that it is the patch author's responsibility to find a reviewer, and a committer, for the patch. If you can't cajole anyone to review your patch, it's a sign that no-one cares about it, and the patch is rejected by default. Or put a quick +1/-1 voting option on each patch in the commitfest, to get a quick gauge of how the committers feel about it. - Heikki
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
Robert Haas robertmh...@gmail.com wrote: On Sat, Dec 6, 2014 at 10:08 PM, Tomas Vondra t...@fuzzy.cz wrote: select a.i, b.i from a join b on (a.i = b.i); I think the concern is that the inner side might be something more elaborate than a plain table scan, like an aggregate or join. I might be all wet, but my impression is that you can make rescanning arbitrarily expensive if you work at it. I'm not sure I'm following. Let's use a function to select from b: create or replace function fb() returns setof b language plpgsql rows 1 as $$ begin return query select i from b; end; $$; explain (analyze, buffers, verbose) select a.i, b.i from a join fb() b on (a.i = b.i); I used the low row estimate to cause the planner to put this on the inner side. 16 batches Execution time: 1638.582 ms Now let's make it slow. create or replace function fb() returns setof b language plpgsql rows 1 as $$ begin perform pg_sleep(2.0); return query select i from b; end; $$; explain (analyze, buffers, verbose) select a.i, b.i from a join fb() b on (a.i = b.i); 16 batches Execution time: 3633.859 ms Under what conditions do you see the inner side get loaded into the hash table multiple times? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP patch for Oid formatting in printf/elog strings
I found a few places in the code where a variable of type Oid is printed using %d rather than %u and changed them in the attached patch. In src/backend/replication/logical/reorderbuffer.c, circa line 2494, chunk_seq is of type Oid, but ent->last_chunk_seq is of type int32, leading me to question if perhaps the use of %d for chunk_seq is correct, but the use of Oid for the type of chunk_seq is in error. If neither is in error, perhaps someone can provide a short code comment explaining the logic of the signed/unsigned discrepancy.

Thanks,

Mark Dilger

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index ce44bbd..d230387 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -2155,7 +2155,7 @@ toast_open_indexes(Relation toastrel,
 	 * wrong if there is nothing.
 	 */
 	if (!found)
-		elog(ERROR, "no valid index found for toast relation with Oid %d",
+		elog(ERROR, "no valid index found for toast relation with Oid %u",
 			 RelationGetRelid(toastrel));
 
 	return res;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c7f41a5..c25ac31 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4540,7 +4540,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	DBWriteRequest *newreq;
 	PgStat_StatDBEntry *dbentry;
 
-	elog(DEBUG2, "received inquiry for %d", msg->databaseid);
+	elog(DEBUG2, "received inquiry for %u", msg->databaseid);
 
 	/*
 	 * Find the last write request for this DB.  If it's older than the
@@ -4598,7 +4598,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 		writetime = pstrdup(timestamptz_to_str(dbentry->stats_timestamp));
 		mytime = pstrdup(timestamptz_to_str(cur_ts));
 		elog(LOG,
-			 "stats_timestamp %s is later than collector's time %s for db %d",
+			 "stats_timestamp %s is later than collector's time %s for db %u",
 			 writetime, mytime, dbentry->databaseid);
 		pfree(writetime);
 		pfree(mytime);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 6e75398..41a4896 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		dlist_init(&ent->chunks);
 
 		if (chunk_seq != 0)
-			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
+			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
 				 chunk_seq, chunk_id);
 	}
 	else if (found && chunk_seq != ent->last_chunk_seq + 1)
-		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
+		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
 			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
 
 	chunk = DatumGetPointer(fastgetattr(newtup->tuple, 3, desc, &isnull));
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 0a9ac02..a59508f 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -821,7 +821,7 @@ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
 					  "pg_class.relname "
 					  "FROM pg_class JOIN pg_namespace on pg_namespace.oid = relnamespace "
-					  "WHERE pg_class.oid = %d", te->catalogId.oid);
+					  "WHERE pg_class.oid = %u", te->catalogId.oid);
 
 	res = PQexec(AH->connection, query->data);

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
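The %d/%u distinction only bites once an Oid value exceeds INT32_MAX, since Oid is an unsigned 32-bit type: printed with %d, such a value comes out negative. A quick sketch of that reinterpretation, written in Python for illustration rather than C:

```python
import struct

def as_signed_32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as signed -- the same
    misreading that C's %d applies to an Oid passed where %u belongs."""
    return struct.unpack("<i", struct.pack("<I", value))[0]

oid = 3_000_000_000  # a legal Oid, but larger than INT32_MAX
print("%u-style:", oid)                # 3000000000
print("%d-style:", as_signed_32(oid))  # -1294967296
```

This is exactly why the macros are interchangeable today only by accident: the bit pattern is identical, and the formatting directive alone decides whether the reader sees a sensible number or a negative one.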
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
Mark Dilger wrote:
> I found a few places in the code where a variable of type Oid is printed using %d rather than %u and changed them in the attached patch. In src/backend/replication/logical/reorderbuffer.c, circa line 2494, chunk_seq is of type Oid, but ent->last_chunk_seq is of type int32, leading me to question if perhaps the use of %d for chunk_seq is correct, but the use of Oid for the type of chunk_seq is in error. If neither is in error, perhaps someone can provide a short code comment explaining the logic of the signed/unsigned discrepancy.

tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk is wrong in declaring it Oid, and so this part of your patch is bogus. The others seem correct.

> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 6e75398..41a4896 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
>  		dlist_init(&ent->chunks);
> 
>  		if (chunk_seq != 0)
> -			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
> +			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
>  				 chunk_seq, chunk_id);
>  	}
>  	else if (found && chunk_seq != ent->last_chunk_seq + 1)
> -		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
> +		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
>  			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
> 
>  	chunk = DatumGetPointer(fastgetattr(newtup->tuple, 3, desc, &isnull));

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 7:37 AM, Robert Haas robertmh...@gmail.com wrote: 2. It's not clear that we're going to have a particularly-impressive list of major features for 9.5. So far we've got RLS and BRIN. I expect that GROUPING SETS is far enough along that it should be possible to get it in before development ends, and there are a few performance patches pending (Andres's lwlock scalability patches, Rahila's work on compressing full-page writes) that I think will probably make the grade. But after that it seems to me that it gets pretty thin on the ground. I'm slightly surprised that you didn't mention abbreviated keys in that list of performance features. You're reviewing that patch; how do you feel about it now? Are we going to bill commit timestamp tracking - with replication node ID tracking as the real goal, despite the name - as a major feature, or DDL deparsing if that goes in, as major features? As useful as they may be for BDR, they don't strike me as things we can publicize as major features independent of BDR. And it's getting awfully late for any other major work that people are thinking of to start showing up. Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August - when development launched. It still doesn't have a reviewer, and it isn't actually in evidence that someone else has so much as downloaded and applied the patch (I'm sure someone has done that much, but the fact is that all the feedback that I've received this year concerns the semantics/syntax, which you can form an opinion on by just looking at the extensive documentation and other supplementary material I've written). It's consistently one of the most requested features, and yet major aspects of the design, that permeate through every major subsystem go unremarked on for months now. 
This feature is *definitely* major feature list material, since people have been loudly requesting it for over a decade, and yet no one mentions it in this thread (only Bruce mentioned it in the other thread about the effectiveness of the CF process). It's definitely in the top 2 or 3 most requested features, alongside much harder problems like parallel query and comprehensive partitioning support. If there is a lesson here for people that are working on major features, or me personally, I don't know what it is -- if anyone else knows, please tell me. I've bent over backwards to make the patch as accessible as possible, and as easy to review as possible. I also think its potential to destabilize the system (as major features go) is only about average. What am I doing wrong here? There is an enormous amount of supplementary documentation associated with the patch: https://wiki.postgresql.org/wiki/UPSERT https://wiki.postgresql.org/wiki/Value_locking Both of these pages are far larger than the Wiki page for RLS, for example. The UPSERT wiki page is kept right up to date. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 8:03 AM, Tom Lane t...@sss.pgh.pa.us wrote: 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. We are not particularly inviting of feedback for whatever testing has been done. The definitive guide seems to be https://wiki.postgresql.org/wiki/HowToBetaTest, and says: You can report tests by email. You can subscribe to any PostgreSQL mailing list from the subscription form http://www.postgresql.org/community/lists/ . - pgsql-bugs: this is the preferred mailing list if you think you have found a bug in the beta. You can also use the Bug Reporting Form http://www.postgresql.org/support/submitbug/. - pgsql-hackers: bugs, questions, and successful test reports are welcome here if you are already subscribed to pgsql-hackers. Note that pgsql-hackers is a high-traffic mailing list with a lot of development discussion. = So if you find a bug, you can report it on the bug reporting form (which doesn't have a drop-down entry for 9.4RC1). If you have positive results rather than negative ones (or even complaints that are not actually bugs), you can subscribe to a mailing list which generates a lot of traffic which is probably over your head and not interesting to you. Does the core team keep a mental list of items they want to see tested by the public, and they will spend their own time testing those things themselves if they don't hear back on some positive tests for them? If we find reports of public testing that yields good results (or at least no bugs) to be useful, we should be more clear on how to go about doing it. 
But are positive reports useful? If I report a bug, I can write down the steps to reproduce it, and then follow my own instructions to make sure it does actually reproduce it. If I find no bugs, it is just "I did a bunch of random stuff and nothing bad happened, that I noticed." Cheers, Jeff
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
Thank you Alvaro,

The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch.

Mark Dilger

On Thu, Dec 11, 2014 at 9:39 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Mark Dilger wrote:
>> I found a few places in the code where a variable of type Oid is printed using %d rather than %u and changed them in the attached patch. In src/backend/replication/logical/reorderbuffer.c, circa line 2494, chunk_seq is of type Oid, but ent->last_chunk_seq is of type int32, leading me to question if perhaps the use of %d for chunk_seq is correct, but the use of Oid for the type of chunk_seq is in error. If neither is in error, perhaps someone can provide a short code comment explaining the logic of the signed/unsigned discrepancy.
>
> tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk is wrong in declaring it Oid, and so this part of your patch is bogus. The others seem correct.
>
>> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
>> index 6e75398..41a4896 100644
>> --- a/src/backend/replication/logical/reorderbuffer.c
>> +++ b/src/backend/replication/logical/reorderbuffer.c
>> @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
>>  		dlist_init(&ent->chunks);
>> 
>>  		if (chunk_seq != 0)
>> -			elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
>> +			elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
>>  				 chunk_seq, chunk_id);
>>  	}
>>  	else if (found && chunk_seq != ent->last_chunk_seq + 1)
>> -		elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
>> +		elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
>>  			 chunk_seq, chunk_id, ent->last_chunk_seq + 1);
>> 
>>  	chunk = DatumGetPointer(fastgetattr(newtup->tuple, 3, desc, &isnull));
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index ce44bbd..d230387 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -2155,7 +2155,7 @@ toast_open_indexes(Relation toastrel,
 	 * wrong if there is nothing.
 	 */
 	if (!found)
-		elog(ERROR, "no valid index found for toast relation with Oid %d",
+		elog(ERROR, "no valid index found for toast relation with Oid %u",
 			 RelationGetRelid(toastrel));
 
 	return res;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c7f41a5..c25ac31 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4540,7 +4540,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	DBWriteRequest *newreq;
 	PgStat_StatDBEntry *dbentry;
 
-	elog(DEBUG2, "received inquiry for %d", msg->databaseid);
+	elog(DEBUG2, "received inquiry for %u", msg->databaseid);
 
 	/*
 	 * Find the last write request for this DB.  If it's older than the
@@ -4598,7 +4598,7 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 		writetime = pstrdup(timestamptz_to_str(dbentry->stats_timestamp));
 		mytime = pstrdup(timestamptz_to_str(cur_ts));
 		elog(LOG,
-			 "stats_timestamp %s is later than collector's time %s for db %d",
+			 "stats_timestamp %s is later than collector's time %s for db %u",
 			 writetime, mytime, dbentry->databaseid);
 		pfree(writetime);
 		pfree(mytime);
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 6e75398..cd132c1 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2458,7 +2458,7 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
 	Pointer		chunk;
 	TupleDesc	desc = RelationGetDescr(relation);
 	Oid			chunk_id;
-	Oid			chunk_seq;
+	int32		chunk_seq;
 
 	if (txn->toast_hash == NULL)
 		ReorderBufferToastInitHash(rb, txn);
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 0a9ac02..a59508f 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -821,7 +821,7 @@ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
 					  "pg_class.relname "
 					  "FROM pg_class
Re: [HACKERS] Commitfest problems
On 12/11/2014 09:02 AM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On 12/11/14 1:35 AM, Bruce Momjian wrote: While the commitfest process hasn't changed much and was very successful in the first few years, a few things have changed externally: 1 more new developers involved in contributing small patches 2 more full-time developers creating big patches 3 increased time demands on experienced Postgres developers The number of patches registered in the commit fests hasn't actually changed over the years. It has always fluctuated between 50 and 100, depending on the point of the release cycle. So I don't think (1) is necessarily the problem. I don't think that's accurate. The number of patches per CF *has* edged upwards by 10-30% per CF over the years: http://www.databasesoup.com/2013/08/94-commitfest-1-wrap-up.html (I haven't done the rest of 9.4 yet or 9.5) No, it's not a jump up by 2X, but it is an upwards trend. And I think that Tom has it right that the additional patches we're seeing are additional large, complex patches. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 1:06 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Dec 11, 2014 at 7:37 AM, Robert Haas robertmh...@gmail.com wrote: 2. It's not clear that we're going to have a particularly-impressive list of major features for 9.5. So far we've got RLS and BRIN. I expect that GROUPING SETS is far enough along that it should be possible to get it in before development ends, and there are a few performance patches pending (Andres's lwlock scalability patches, Rahila's work on compressing full-page writes) that I think will probably make the grade. But after that it seems to me that it gets pretty thin on the ground. I'm slightly surprised that you didn't mention abbreviated keys in that list of performance features. You're reviewing that patch; how do you feel about it now? I'm not sure it's where I think it needs to be yet, but yeah, I think that will get in. I thought of it after I hit send. Are we going to bill commit timestamp tracking - with replication node ID tracking as the real goal, despite the name - as a major feature, or DDL deparsing if that goes in, as major features? As useful as they may be for BDR, they don't strike me as things we can publicize as major features independent of BDR. And it's getting awfully late for any other major work that people are thinking of to start showing up. Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August - when development launched. It still doesn't have a reviewer, and it isn't actually in evidence that someone else has so much as downloaded and applied the patch (I'm sure someone has done that much, but the fact is that all the feedback that I've received this year concerns the semantics/syntax, which you can form an opinion on by just looking at the extensive documentation and other supplementary material I've written). It's consistently one of the most requested features, and yet major aspects of the design, that permeate through every major subsystem go unremarked on for months now. 
This feature is *definitely* major feature list material, since people have been loudly requesting it for over a decade, and yet no one mentions it in this thread (only Bruce mentioned it in the other thread about the effectiveness of the CF process). It's definitely in the top 2 or 3 most requested features, alongside much harder problems like parallel query and comprehensive partitioning support. I'm not sure whether that patch is going to go anywhere. There has been so much arguing about different aspects of the design that felt rather unproductive that I think most of the people who would be qualified to commit it have grown a bit gun-shy. That would be a good problem to get fixed, but I don't have a proposal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On 12/11/2014 09:22 AM, Heikki Linnakangas wrote:
> I imagine that it's the same for everyone else. Many of the patches that sit in the commitfest for weeks are patches that no-one really cares much about. I'm not sure what to do about that. It would be harsh to reject a patch just because no-one's gotten around to reviewing it, and if we start doing that, it's not clear what the point of a commitfest is any more.

So the "nobody cares" argument is manifestly based on false assumptions. Are you contending that nobody cares about UPSERT or GROUPING SETS?

In my experience, the patches that sit for weeks on the CF fall into 5 groups:

1. Large complicated patches that only a handful of committers are even capable of reviewing.
2. Obscure patches which require specialized knowledge or outside input to review.
3. Inconsequential patches where the submitter doesn't work the social process.
4. Patches with contentious spec or syntax.
5. Patches which everyone wants but have persistent hard-to-resolve issues.

Of these only (3) would fit into "nobody cares", and that's a pretty small group.

There's also a chicken-and-egg problem there. Say that we started not reviewing DW features because "nobody cares". Then the people who contributed those features don't go on to become major contributors, which means they won't review further DW patches. Which means that we've just closed off an entire use-case for PostgreSQL. I'd think that PostGIS would have taught us the value of the "nobody cares" fallacy.

Also, if we go back on the promise that every patch gets a review, then we're definitely headed towards no more new contributors. As Noah said at one developer meeting (to paraphrase), one of the few things which keeps contributors persisting through our baroque, poorly-documented, excessively political contribution process is the promise that they'll get a fair shake for their invested time.
If we drop that promise, we'll solve our workflow problem by cutting off the flow of new patches entirely.

> Perhaps we should change the process so that it is the patch author's responsibility to find a reviewer, and a committer, for the patch. If you can't cajole anyone to review your patch, it's a sign that no-one cares about it, and the patch is rejected by default. Or put a quick +1/-1 voting option to each patch in the commitfest, to get a quick gauge of how the committers feel about it.

Again, that process would favor existing contributors and other folks who know how to work the Postgres community. It would be effectively the same as hanging up a sign which says "no new contributors wanted". It would also be dramatically increasing the amount of politics around submitted patches, which take up far more time than the technical work.

Overall, we're experiencing this issue because of a few predictable reasons:

1. a gradual increase in the number of submitted patches, especially large patches
2. Tom Lane cutting back on the number of other people's patches he reviews and revises.
3. Other major committers getting busier with their day jobs.
4. Failure to recruit, mentor and promote new committers at a rate proportional to the number of new contributors or the size of our community.
5. Failure to adopt or develop automated systems to remove some of the grunt work from patch submission and review.
6. Failure to adhere to any uniform standards of patch handling for the Commitfests.

(2) out of these is actually the biggest thing we're seeing right now, I think. Tom was historically a one-man-patch-fixing machine, at one time responsible for 70% of the patches committed to PostgreSQL. This was never a good thing for the community, even if it was a good thing for the code base, because it discouraged others from stepping into a senior committer role. Now Tom has cut back, and others have to take up the slack.
In the long run this will be good for our development community; in the short run it's kind of painful. I will also point out that some of the existing senior committers, who are the heaviest burdened under the existing system, have also been the most resistant to any change in the system. You (Heikki) yourself expressed a strong opposition to any attempt to recruit more reviewers. So, given that you are inside the heart of the problem, do you have a solution other than your proposal above that we simply stop accepting new contributors? Or is that your solution? It would work, for some definition of work. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On 12/11/2014 08:59 AM, Tom Lane wrote:
> More abstractly, there's a lot of value in having a predictable release schedule. That's going to mean that some release cycles are thin on user-visible features, even if just as much work went into them. It's the nature of the game.

+ 1,000,000 from me. ;-)

Frankly, BRIN, UPSERT and a couple other things are plenty for a Postgres release. Other SQL databases would be thrilled to have that much ... can you name 3 major advances in the last MySQL release?

And given that I've seen nothing about jsquery/VODKA since pgCon, I'm expecting them for 9.6/whatever, not 9.5. There's a whole longish syntax discussion we haven't even started yet, let alone actual technical review.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On Thu, Dec 11, 2014 at 1:26 PM, Josh Berkus j...@agliodbs.com wrote:
> No, it's not a jump up by 2X, but it is an upwards trend. And I think that Tom has it right that the additional patches we're seeing are additional large, complex patches.

I feel like there's increasingly not that much easy stuff left to do. Many of the problems we haven't solved yet are unsolved because they are really hard, or because they're not that important, or both. For example, if you look at the current CommitFest, there are things like:

Add ssl_protocols configuration option
alter user/role CURRENT_USER
Fix local_preload_libraries to work as expected.
Refactoring code for synchronous node detection
Refactor of functions related to quoting from builtins.h to utils/quote.h

Those things are all, obviously, important to somebody, or there wouldn't be patches for them in need of review. But in the broader scheme of things, they are very minor issues. And then you have things like:

deparse DDL in event triggers
Compression of Full Page Writes
WIP: dynahash replacement for buffer table
Foreign table inheritance
Grouping Sets
INSERT ... ON CONFLICT {UPDATE | IGNORE}

Pretty much everything on that list is something we've been wrestling with as a project for at least half a decade. I remember people complaining about DDL triggers when I first got involved in the PostgreSQL community, and the initial patch for foreign tables had support for inheritance and constraints which I ripped out before committing as too half-baked. Pavel had a GROUPING SETS patch by 2009. Brainstorming solutions to the full_page_writes problem has been a perennial PGCon after-hours activity for as long as I've been going. So it's natural that, to the extent these patches are making progress, they're doing so slowly. It's hard stuff to get right.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
On Thu, Dec 11, 2014 at 12:29 PM, Kevin Grittner kgri...@ymail.com wrote: Robert Haas robertmh...@gmail.com wrote: On Sat, Dec 6, 2014 at 10:08 PM, Tomas Vondra t...@fuzzy.cz wrote: select a.i, b.i from a join b on (a.i = b.i); I think the concern is that the inner side might be something more elaborate than a plain table scan, like an aggregate or join. I might be all wet, but my impression is that you can make rescanning arbitrarily expensive if you work at it. I'm not sure I'm following. Let's use a function to select from b: create or replace function fb() returns setof b language plpgsql rows 1 as $$ begin return query select i from b; end; $$; explain (analyze, buffers, verbose) select a.i, b.i from a join fb() b on (a.i = b.i); I used the low row estimate to cause the planner to put this on the inner side. 16 batches Execution time: 1638.582 ms Now let's make it slow. create or replace function fb() returns setof b language plpgsql rows 1 as $$ begin perform pg_sleep(2.0); return query select i from b; end; $$; explain (analyze, buffers, verbose) select a.i, b.i from a join fb() b on (a.i = b.i); 16 batches Execution time: 3633.859 ms Under what conditions do you see the inner side get loaded into the hash table multiple times? Huh, interesting. I guess I was thinking that the inner side got rescanned for each new batch, but I guess that's not what happens. Maybe there's no real problem here, and we just win. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
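Kevin's observation matches how batched (Grace-style) hash joins behave in general: each input is scanned exactly once and partitioned by hash value, so even a 16-batch join never re-executes the inner plan. A toy Python sketch of that partitioning scheme — purely illustrative, not PostgreSQL's executor, which spills batches to temp files instead of keeping them in memory:

```python
from collections import defaultdict

def hash_join_batched(inner, outer, num_batches=4):
    """Sketch of a batched hash join over (key, row) pairs. Each input
    is scanned once and split into batches by hash; matching batch
    pairs are then joined one at a time with a per-batch hash table."""
    inner_batches = defaultdict(list)
    for key, row in inner:                      # single scan of inner
        inner_batches[hash(key) % num_batches].append((key, row))
    outer_batches = defaultdict(list)
    for key, row in outer:                      # single scan of outer
        outer_batches[hash(key) % num_batches].append((key, row))

    results = []
    for b in range(num_batches):
        table = defaultdict(list)               # per-batch hash table
        for key, row in inner_batches[b]:
            table[key].append(row)
        for key, row in outer_batches[b]:       # probe matching batch
            for inner_row in table[key]:
                results.append((inner_row, row))
    return results
```

Because the expensive inner plan feeds the partitioning pass only once, adding a pg_sleep to it adds its cost once, not once per batch — consistent with the ~2 s delta Kevin measured with 16 batches.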
Re: [HACKERS] Commitfest problems
On Thu, Dec 11, 2014 at 11:49:43AM -0500, Peter Eisentraut wrote: On 12/11/14 1:35 AM, Bruce Momjian wrote: While the commitfest process hasn't changed much and was very successful in the first few years, a few things have changed externally: 1 more new developers involved in contributing small patches 2 more full-time developers creating big patches 3 increased time demands on experienced Postgres developers The number of patches registered in the commit fests hasn't actually changed over the years. It has always fluctuated between 50 and 100, depending on the point of the release cycle. So I don't think (1) is necessarily the problem. Yes, I was hoping someone would produce some numbers --- thanks. I think the big question is whether the level of complexity of the patches has increased, or whether it is just the amount of experienced developer time (3) that has decreased, or something else. Or maybe things have not materially changed at all over the years. I am following this thought process: 1. Do we have a problem? 2. What is the cause of the problem? 3. How do we fix the problem? I think to get a useful outcome we have to process things in that order. So, is #1 true, and if so, what is the answer to #2? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
Andrew Gierth and...@tao11.riddles.org.uk writes: Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom I've not spent any real effort looking at gsp2.patch yet, but it Tom seems even worse off comment-wise: if there's any explanation in Tom there at all of what a chained aggregate is, I didn't find it. (Maybe stacked would have been a better term.) What that code does is produce plans that look like this: GroupAggregate - Sort - ChainAggregate - Sort - ChainAggregate in much the same way that WindowAgg nodes are generated. That seems pretty messy, especially given your further comments that these plan nodes are interconnected and know about each other (though you failed to say exactly how). The claimed analogy to WindowAgg therefore seems bogus since stacked WindowAggs are independent, AFAIR anyway. I'm also wondering about performance: doesn't this imply more rows passing through some of the plan steps than really necessary? Also, how would this extend to preferring hashed aggregation in some of the grouping steps? ISTM that maybe what we should do is take a cue from the SQL spec, which defines these things in terms of UNION ALL of plain-GROUP-BY operations reading from a common CTE. Abstractly, that is, we'd have Append - GroupAggregate - Sort - source data - GroupAggregate - Sort - source data - GroupAggregate - Sort - source data ... (or some of the arms could be HashAgg without a sort). Then the question is what exactly the aggregates are reading from. We could do worse than make it a straight CTE, I suppose. Tom I'd also counsel you to find some other way to do it than Tom putting bool chain_head fields in Aggref nodes; There are no chain_head fields in Aggref nodes. Oh, I mistook struct Agg for struct Aggref. (That's another pretty poorly chosen struct name, though I suppose it's far too late to change that choice.) Still, interconnecting plan nodes that aren't adjacent in the plan tree doesn't sound like a great idea to me. Tom I took a quick look at gsp-u.patch. 
It seems like that approach Tom should work, with of course the caveat that using CUBE/ROLLUP as Tom function names in a GROUP BY list would be problematic. I'm not Tom convinced by the commentary in ruleutils.c suggesting that extra Tom parentheses would help disambiguate: aren't extra parentheses Tom still going to contain grouping specs according to the standard? The extra parens do actually disambiguate because CUBE(x) and (CUBE(x)) are not equivalent anywhere; while CUBE(x) can appear inside GROUPING SETS (...), it cannot appear inside a (...) list nested inside a GROUPING SETS list (or anywhere else). Maybe, but this seems very fragile and non-future-proof. I think double-quoting or schema-qualifying such function names would be safer when you think about the use-case of dumping views that may get loaded into future Postgres versions. The question that needs deciding here is less whether the approach _could_ work but whether we _want_ it. The objection has been made that we are in effect introducing a new category of unreserved almost everywhere keyword, which I think has a point; True, but I think that ship has already sailed. We already have similar behavior for PARTITION, RANGE, and ROWS (see the opt_existing_window_name production), and I think PRECEDING, FOLLOWING, and UNBOUNDED are effectively reserved-in-certain-very-specific-contexts as well. And there are similar behaviors in plpgsql's parser. on the other hand, reserving CUBE is a seriously painful prospect. Precisely. I think renaming or getting rid of contrib/cube would have to be something done in a staged fashion over multiple release cycles. Waiting several years to get GROUPING SETS doesn't seem appealing at all compared to this alternative. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
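To make the spec's definition concrete: GROUPING SETS is defined as a UNION ALL of plain GROUP BY operations over the same input, which is exactly the Append-of-GroupAggregate shape sketched above. A minimal Python sketch of those semantics (the data and helper names are hypothetical; this is not Postgres code):

```python
from collections import defaultdict

def group_by(rows, keys):
    """Plain GROUP BY: sum the 'v' column per distinct key tuple."""
    groups = defaultdict(int)
    for row in rows:
        groups[tuple(row[k] for k in keys)] += row["v"]
    return groups

def grouping_sets(rows, keysets, all_keys):
    """GROUPING SETS as the spec defines it: a UNION ALL of plain
    GROUP BY results, with NULL (None) for keys absent from each set."""
    out = []
    for keys in keysets:
        for keyvals, total in group_by(rows, keys).items():
            got = dict(zip(keys, keyvals))
            out.append(tuple(got.get(k) for k in all_keys) + (total,))
    return out

rows = [{"a": 1, "b": "x", "v": 10},
        {"a": 1, "b": "y", "v": 20},
        {"a": 2, "b": "x", "v": 30}]

# ROLLUP(a, b) expands to GROUPING SETS ((a, b), (a), ())
result = grouping_sets(rows, [("a", "b"), ("a",), ()], ("a", "b"))
```

Each arm of the expansion corresponds to one GroupAggregate (or HashAgg) under the Append node; the common input is what a straight CTE would provide.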
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On 12/11/2014 08:51 PM, Josh Berkus wrote: On 12/11/2014 09:22 AM, Heikki Linnakangas wrote: I imagine that it's the same for everyone else. Many of the patches that sit in the commitfest for weeks are patches that no-one really cares much about. I'm not sure what to do about that. It would be harsh to reject a patch just because no-one's gotten around to reviewing it, and if we start doing that, it's not clear what the point of a commitfest is any more. So the nobody cares argument is manifestly based on false assumptions. Are you contending that nobody cares about UPSERT or GROUPING SETS? No. I was thinking of patches like Add IF NOT EXISTS to CREATE TABLE AS and CREATE MATERIALIZED VIEW, event triggers: more DROP info, Point to polygon distance operator and pgcrypto: support PGP signatures. And nobody was an exaggeration; clearly *someone* cares about those things, or they wouldn't have written a patch in the first place. But none of the committers care enough to pick them up. Even in the case that someone reviews them, it's often not because the reviewer is interested in the patch, it's just to help out with the process. (Apologies to the authors of those patches; they were just the first few that caught my eye) There's also a chicken-and-egg problem there. Say that we started not reviewing DW features because nobody cares. Then the people who contributed those features don't go on to become major contributors, which means they won't review further DW patches. Which means that we've just closed off an entire use-case for PostgreSQL. I'd think that PostGIS would have taught us the value of the nobody cares fallacy. Also, if we go back on the promise that every patch gets a review, then we're definitely headed towards no more new contributors. 
As Noah said at one developer meeting (to paraphrase), one of the few things which keeps contributors persisting through our baroque, poorly-documented, excessively political contribution process is the promise that they'll get a fair shake for their invested time. If we drop that promise, we'll solve our workflow problem by cutting off the flow of new patches entirely. Yeah, there is that. Perhaps we should change the process so that it is the patch author's responsibility to find a reviewer, and a committer, for the patch. If you can't cajole anyone to review your patch, it's a sign that no-one cares about it, and the patch is rejected by default. Or put a quick +1/-1 voting option to each patch in the commitfest, to get a quick gauge of how the committers feel about it. Again, that process would favor existing contributors and other folks who know how to work the Postgres community. It would be effectively the same as hanging up a sign which says no new contributors wanted. It would also be dramatically increasing the amount of politics around submitted patches, which take up far more time than the technical work. I was thinking that by getting candid feedback that none of the existing contributors are interested to look at your patch, the author could revise the patch to garner more interest, or perhaps promise to review someone else's patch in return. Right now, the patches just linger, and the author doesn't know why, what's going to happen or what to do about it. I will also point out that some of the existing senior committers, who are the heaviest burdened under the existing system, have also been the most resistant to any change in the system. You (Heikki) yourself expressed a strong opposition to any attempt to recruit more reviewers. Huh, I did? Can you elaborate? So, given that you are inside the heart of the problem, do you have a solution other than your proposal above that we simply stop accepting new contributors? Or is that your solution? 
It would work, for some definition of work. I don't have any good solutions, I'm afraid. It might help to ping the reviewers who have signed up more often, to make the reviews happen more quickly. I did nudge some people in the August commitfest, but I felt that it didn't actually achieve much. The most efficient way to move the commitfest forward was to just review more patches myself. That's one thought. Robert said the same thing about when he was the commitfest manager; he just reviewed most of the patches himself in the end. And you mentioned that Tom used to review 70% of all incoming patches. How about we make that official? It's the commitfest manager's duty to review all the patches. He can recruit others if he can, but at the end of the day he's expected to take a look at every patch, and commit or return with some feedback. The problem with that is that we'll have a hard time to find volunteers for that. But we only need to find one sucker for each commitfest. I can volunteer to do that once a year; if the other active committers do the same, we're covered. - Heikki -- Sent via pgsql-hackers mailing list
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
On 11.12.2014 20:00, Robert Haas wrote: On Thu, Dec 11, 2014 at 12:29 PM, Kevin Grittner kgri...@ymail.com wrote: Under what conditions do you see the inner side get loaded into the hash table multiple times? Huh, interesting. I guess I was thinking that the inner side got rescanned for each new batch, but I guess that's not what happens. No, it's not rescanned. It's scanned only once (for batch #0), and tuples belonging to the other batches are stored in files. If the number of batches needs to be increased (e.g. because of an incorrect estimate of the inner table), the tuples are moved later. Maybe there's no real problem here, and we just win. I'm a bit confused by this discussion, because the inner relation has nothing to do with this patch. It gets scanned exactly once, no matter what the load factor is. If batching is necessary, only the already-written batch files are read (without re-executing the inner part). However in that case this patch makes no difference, because it explicitly reverts to load factor = NTUP_PER_BUCKET (which is 1). The only point of this patch was to prevent batching because of the outer table. Usually, the outer table is much larger than the inner one (e.g. in a star schema, outer = fact table, inner = dimension). Batching the outer table means you have to write >= 50% of it into a temporary file. The idea was that if we could increase the load a bit (e.g. using 2 tuples per bucket instead of 1), we will still use a single batch in some cases (when we miss the work_mem threshold by just a bit). The lookups will be slower, but we'll save the I/O. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
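A toy model of the trade-off Tomas describes (the numbers, overhead constant, and helper are invented for illustration; the real decision lives in the hash join sizing code): allowing a higher load factor shrinks the bucket array, which can keep a build that barely misses work_mem in a single batch.

```python
def choose_strategy(inner_bytes, work_mem_bytes, bucket_overhead, ntuples,
                    max_load=2):
    """Sketch only: tuple storage plus bucket-array overhead must fit in
    work_mem.  Raising the load factor (tuples per bucket) shrinks the
    bucket array, so a build that just misses the limit at load 1 may
    still fit at load 2, avoiding batching of the (large) outer table."""
    for load in range(1, max_load + 1):
        nbuckets = max(1, ntuples // load)
        total = inner_bytes + nbuckets * bucket_overhead
        if total <= work_mem_bytes:
            return ("single batch", load)
    return ("multi batch", 1)  # batching anyway: revert to NTUP_PER_BUCKET = 1
```

With 10,000 tuples of 900 kB total and a 1 MB work_mem, a 16-byte bucket overhead misses the limit at load factor 1 (1,060,000 bytes) but fits at load factor 2 (980,000 bytes): slightly slower lookups, but no outer-side temp files.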
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
Heikki Linnakangas wrote: That's one thought. Robert said the same thing about when he was the commitfest manager; he just reviewed most of the patches himself in the end. And you mentioned that Tom used to review 70% of all incoming patches. How about we make that official? It's the commitfest manager's duty to review all the patches. He can recruit others if he can, but at the end of the day he's expected to take a look at every patch, and commit or return with some feedback. The problem with that is that we'll have a hard time to find volunteers for that. But we only need to find one sucker for each commitfest. I can volunteer to do that once a year; if the other active committers do the same, we're covered. Hm, I was commitfest manager once, too, and this exact same thing happened. Maybe we need to make that official, and give the sucker some perk. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 11:52 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: The problem with that is that we'll have a hard time to find volunteers for that. But we only need to find one sucker for each commitfest. I can volunteer to do that once a year; if the other active committers do the same, we're covered. Hm, I was commitfest manager once, too, and this exact same thing happened. Maybe we need to make that official, and give the sucker some perk. I should do it at least once, if only because I haven't experienced it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 10:43 AM, Robert Haas robertmh...@gmail.com wrote: Version 1.0 of INSERT ... ON CONFLICT UPDATE was posted in August - when development launched. It still doesn't have a reviewer, and it isn't actually in evidence that someone else has so much as downloaded and applied the patch I'm not sure whether that patch is going to go anywhere. There has been so much arguing about different aspects of the design that felt rather unproductive that I think most of the people who would be qualified to commit it have grown a bit gun-shy. That would be a good problem to get fixed, but I don't have a proposal. Really? I have acted comprehensively on 100% of feedback to date on the semantics/syntax of the ON CONFLICT UPDATE patch. The record reflects that. I don't believe that the problem is that people are gun shy. We haven't even had a real disagreement since last year, and last year's discussion of value locking was genuinely very useful (I hate to say it, but the foreign key locking patch might have considered the possibility of unprincipled deadlocks more closely, as we saw recently). Lots of other systems have a comparable feature. Most recently, VoltDB added a custom UPSERT, even though they don't have half the SQL features we do. It's a complicated feature, but it's not a particularly complicated feature as big, impactful features go (like LATERAL, or logical decoding, or the foreign key locking stuff). It's entirely possible to get the feature in in the next few months, if someone will work with me on it. Even Heikki, who worked on this with me far more than everyone else, found the value locking page a useful summary. He didn't even know that there was a third design advocated by Simon and Andres until he saw it! The discussion was difficult, but had a useful outcome, because accounting for unprincipled deadlocks is a major source of complexity. 
I have seen a lot of benefit from comprehensively documenting stuff in one place (the wiki pages), but that has tapered off. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: multivariate statistics / proof of concept
On 11.12.2014 17:53, Heikki Linnakangas wrote: On 10/13/2014 01:00 AM, Tomas Vondra wrote: Hi, attached is a WIP patch implementing multivariate statistics. Great! Really glad to see you working on this. + * FIXME This sample sizing is mostly OK when computing stats for + * individual columns, but when computing multi-variate stats + * for multivariate stats (histograms, mcv, ...) it's rather + * insufficient. For small number of dimensions it works, but + * for complex stats it'd be nice use sample proportional to + * the table (say, 0.5% - 1%) instead of a fixed size. I don't think a fraction of the table is appropriate. As long as the sample is random, the accuracy of a sample doesn't depend much on the size of the population. For example, if you sample 1,000 rows from a table with 100,000 rows, or 1000 rows from a table with 100,000,000 rows, the accuracy is pretty much the same. That doesn't change when you go from a single variable to multiple variables. I might be wrong, but I doubt that. First, I read a number of papers while working on this patch, and all of them used samples proportional to the data set. That's an indirect evidence, though. You do need a bigger sample with multiple variables, however. My gut feeling is that if you sample N rows for a single variable, with two variables you need to sample N^2 rows to get the same accuracy. But it's not proportional to the table size. (I have no proof for that, but I'm sure there is literature on this.) Maybe. I think it's somehow related to the number of buckets (which somehow determines the precision of the histogram). If you want 1000 buckets, the number of rows scanned needs to be e.g. 10x that. With multi-variate histograms, we may shoot for more buckets (say, 100 in each dimension). + * Multivariate histograms + * + * Histograms are a collection of buckets, represented by n-dimensional + * rectangles. 
Each rectangle is delimited by an array of lower and + * upper boundaries, so that for the i-th attribute + * + * min[i] <= value[i] <= max[i] + * + * Each bucket tracks frequency (fraction of tuples it contains), + * information about the inequalities, number of distinct values in + * each dimension (which is used when building the histogram) etc. + * + * The boundaries may be either inclusive or exclusive, or the whole + * dimension may be NULL. + * + * The buckets may overlap (assuming the build algorithm keeps the + * frequencies additive) or may not cover the whole space (i.e. allow + * gaps). This entirely depends on the algorithm used to build the + * histogram. That sounds pretty exotic. These buckets are quite different from the single-dimension buckets we currently have. The paper you reference in partition_bucket() function, M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36, actually doesn't mention overlapping buckets at all. I haven't read the code in detail, but if it implements the algorithm from that paper, there will be no overlap. The algorithm implemented in partition_bucket() is very simple and naive, and it mostly resembles the algorithm described in the paper. I'm sure there are differences, it's not a 1:1 implementation, but you're right it produces non-overlapping buckets. The point is that I envision more complex algorithms or different histogram types, and some of them may produce overlapping buckets. Maybe that's a premature comment, and it will turn out it's not really necessary. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
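For reference, the per-dimension containment test implied by the quoted comment (min[i] <= value[i] <= max[i], with inclusive or exclusive boundaries) is straightforward. A Python sketch, where the boundary-flag representation is invented for illustration and not taken from the patch:

```python
def in_bucket(value, lower, upper, lower_incl, upper_incl):
    """Does an n-dimensional point fall inside a histogram bucket?
    Each dimension i must satisfy min[i] <= value[i] <= max[i],
    honoring per-dimension inclusive/exclusive boundary flags."""
    for v, lo, hi, li, ui in zip(value, lower, upper, lower_incl, upper_incl):
        # lower boundary: strictly above, or equal when inclusive
        if not (lo < v or (li and lo == v)):
            return False
        # upper boundary: strictly below, or equal when inclusive
        if not (v < hi or (ui and v == hi)):
            return False
    return True
```

Selectivity estimation then sums the frequencies of all buckets a query's restriction clauses intersect; whether buckets may overlap only changes whether those frequencies are guaranteed to be additive.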
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
Peter Geoghegan wrote: On Thu, Dec 11, 2014 at 11:52 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: The problem with that is that we'll have a hard time to find volunteers for that. But we only need to find one sucker for each commitfest. I can volunteer to do that once a year; if the other active committers do the same, we're covered. Hm, I was commitfest manager once, too, and this exact same thing happened. Maybe we need to make that official, and give the sucker some perk. I should do it at least once, if only because I haven't experienced it. We could make a slogan out of it: you've never experienced the PostgreSQL development process if you haven't been a CFM at least once in your life. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
Mark Dilger hornschnor...@gmail.com writes: The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch. BTW, how are you finding these? If it's some automated tool, what? (If you're finding them by hand, I'm in awe of your scan rate.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Memory-bounded HashAgg
On 11.12.2014 11:46, Jeff Davis wrote: New patch attached. All open items are complete, though the patch may have a few rough edges. Summary of changes: * rebased on top of latest memory accounting patch http://www.postgresql.org/message-id/1417497257.5584.5.camel@jeff-desktop * added a flag to hash_create to prevent it from creating an extra level of memory context - without this, the memory accounting would have a measurable impact on performance * cost model for the disk usage * intelligently choose the number of partitions for each pass of the data * explain support * in build_hash_table(), be more intelligent about the value of nbuckets to pass to BuildTupleHashTable() - BuildTupleHashTable tries to choose a value to keep the table in work_mem, but it isn't very accurate. * some very rudimentary testing (sanity checks, really) shows good results I plan to look into this over the holidays, hopefully. Summary of previous discussion (my summary; I may have missed some points): Tom Lane requested that the patch also handle the case where transition values grow (e.g. array_agg) beyond work_mem. I feel this patch provides a lot of benefit as it is, and trying to handle that case would be a lot more work (we need a way to write the transition values out to disk at a minimum, and perhaps combine them with other transition values). I also don't think my patch would interfere with a fix there in the future. Tomas Vondra suggested an alternative design that more closely resembles HashJoin: instead of filling up the hash table and then spilling any new groups, the idea would be to split the current data into two partitions, keep one in the hash table, and spill the other (see ExecHashIncreaseNumBatches()). This has the advantage that it's very fast to identify whether the tuple is part of the in-memory batch or not; and we can avoid even looking in the memory hashtable if not. 
The batch-splitting approach has a major downside, however: you are likely to evict a skew value from the in-memory batch, which will result in all subsequent tuples with that skew value going to disk. My approach never evicts from the in-memory table until we actually finalize the groups, so the skew values are likely to be completely processed in the first pass. I don't think that's the main issue - there are probably ways to work around that (e.g. by keeping a skew hash table for those frequent values, similarly to what hash join does). The main problem IMHO is that it requires writing the transition values to disk, which we don't know how to do in many cases (esp. in the interesting ones, where the transition values grow). So, the attached patch implements my original approach, which I still feel is the best solution. I think this is a reasonable approach - it's true it does not handle the case with growing aggregate state (e.g. array_agg), so it really fixes just the case when we underestimate the number of groups. But I believe we need this approach anyway, because we'll never know how to write all the various transition values (e.g. think of custom aggregates), and this is an improvement. We can build on this and add the more elaborate hashjoin-like approach in the future. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
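The difference between the two designs is easiest to see in miniature. A greatly simplified sketch of the spill-new-groups approach being discussed (hypothetical helper, not the patch's code): groups already in the table keep aggregating in place, and only tuples belonging to new groups spill once the table is full, so frequent (skew) values seen early never leave memory.

```python
def hashagg_spill(tuples, max_groups):
    """Sketch only: aggregate (key, value) pairs by summing per key.
    Existing in-memory groups always advance; once the table holds
    max_groups groups, tuples of *new* groups are deferred to a later
    pass instead of evicting anything (unlike batch splitting)."""
    table, spilled = {}, []
    for key, val in tuples:
        if key in table:
            table[key] += val           # in-memory group: always advance it
        elif len(table) < max_groups:
            table[key] = val            # room left: start a new group
        else:
            spilled.append((key, val))  # table full: defer this group
    return table, spilled

table, spilled = hashagg_spill(
    [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5)], max_groups=2)
```

Here the skew key "a" is fully aggregated in the first pass even though the table filled up, and only the late-arriving group "c" goes to disk; with batch splitting, "a" itself could have been evicted.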
Re: [HACKERS] Commitfest problems
On 12/11/14 1:56 PM, Robert Haas wrote: I feel like there's increasingly not that much easy stuff left to do. Many of the problems we haven't solved yet are unsolved because they are really hard, or because they're not that important, or both. You know, I was telling people the very same thing in 2004. Yet here we are. Maybe we just need to relax and accept that we're awesome. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 11:59:58AM -0500, Robert Haas wrote: The problem is that, on the one hand, we have a number of serious problems with things that got committed and turned out to have problems - the multixact stuff, and JSONB, in particular - and on the other hand, we are lacking in adequate committer bandwidth to properly handle all of the new patches that come in. We can fix the first problem by tightening up on the requirements for committing things, but that exacerbates the second problem. Or we can fix the second problem by loosening up on the requirements for commit, but that exacerbates the first problem. Promoting more or fewer committers is really the same trade-off: if you're very careful about who you promote, you'll get better people but not as many of them, so less will get done but with fewer mistakes; if you're more generous in handing out commit bits, you reduce the bottleneck to stuff getting done but, inevitably, you'll be trusting people in whom you have at least slightly less confidence. There's an inherent tension between quality and rate of progress that we can't get rid of, and the fact that some of our best people are busier than ever with things other than PostgreSQL hacking is not helping - not only because less actual review/commit happens, but because newcomers to the community don't have as much contact with the more senior people who could help mentor them if they only had the time. Great outline of the tradeoffs involved in being more aggressive about committing. I do think the multixact and JSONB problems have spooked some of us into being slower/more careful about committing. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
Tom Lane wrote: Mark Dilger hornschnor...@gmail.com writes: The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch. BTW, how are you finding these? If it's some automated tool, what? (If you're finding them by hand, I'm in awe of your scan rate.) BTW if it's an automated tool, it might also improve by examining the DatumGetFoo() macros. In the reorderbuffer.c code just fixed you can see that chunk_seq is obtained by DatumGetInt32(), which it was wrong to assign to a variable of type Oid. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/11/14 1:35 AM, Bruce Momjian wrote: I have heard repeated concerns about the commitfest process in the past few months. The fact we have been in a continual commitfest since August also is concerning. I realized the other day, I'm embracing the idea of a continual commitfest. I'm still working on patches from the last commitfest. Why not? They didn't expire. Sure, it would have been nicer to get them done sooner, but what are you gonna do? The fact that Nov 15 is now Dec 15 isn't going to change the fact that I have a few hours to spare right now and the patches are still relevant. As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 11:40 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 12/11/2014 08:51 PM, Josh Berkus wrote: On 12/11/2014 09:22 AM, Heikki Linnakangas wrote: Perhaps we should change the process so that it is the patch author's responsibility to find a reviewer, and a committer, for the patch. If you can't cajole anyone to review your patch, it's a sign that no-one cares about it, and the patch is rejected by default. Or put a quick +1/-1 voting option to each patch in the commitfest, to get a quick gauge of how the committers feel about it. Again, that process would favor existing contributors and other folks who know how to work the Postgres community. It would be effectively the same as hanging up a sign which says no new contributors wanted. It would also be dramatically increasing the amount of politics around submitted patches, which take up far more time than the technical work. I was thinking that by getting candid feedback that none of the existing contributors are interested to look at your patch, the author could revise the patch to garner more interest, or perhaps promise to review someone else's patch in return. Right now, the patches just linger, and the author doesn't know why, what's going to happen or what to do about it. I agree. Having your patch disappear into the void is not friendly at all. But I don't think a commentless -1 is the answer, either. That might be one of the few things worse than silence.
Even if the comment is just This seems awfully complicated for a minimally useful feature or This will probably have unintended side effects in the executor that I don't have time to trace down, can someone make an argument for correctness, or even this format is unreadable, I won't review this until it is fixed then at least the author (and perhaps more important, junior contributors who are not the author) will know either what argument to make to lobby for it, what avenue to take when reviewing it, or whether to just forget it and work on something more important. Cheers, Jeff
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 3:47 PM, Jeff Janes jeff.ja...@gmail.com wrote: I agree. Having your patch disappear into the void is not friendly at all. But I don't think a commentless -1 is the answer, either. That might one of the few things worse than silence. Even if the comment is just This seems awfully complicated for a minimally useful feature or This will probably have unintended side effects in the executor that I don't have time to trace down, can someone make an argument for correctness, or even this format is unreadable, I won't review this until it is fixed then at least the author (and perhaps more important, junior contributors who are not the author) will know either what argument to make to lobby for it, what avenue to take when reviewing it, or whether to just forget it and work on something more important. Agreed! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote: Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus j...@agliodbs.com wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. The compressibility properties of a new type seem like something that should be mandated before it is committed - it shouldn't require good fortune that Odds are the next problem will have nothing to do with compressibility --- we can't assume old failures will repeat themselves. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote: On 12/11/14 1:35 AM, Bruce Momjian wrote: I have heard repeated concerns about the commitfest process in the past few months. The fact we have been in a continual commitfest since August also is concerning. I realized the other day, I'm embracing the idea of a continual commitfest. I'm still working on patches from the last commitfest. Why not? They didn't expire. Sure, it would have been nicer to get them done sooner, but what are you gonna do? The fact that Nov 15 is now Dec 15 isn't going to change the fact that I have a few hours to spare right now and the patches are still relevant. As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. Wow, that's radical, and interesting. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/11/2014 01:07 PM, Bruce Momjian wrote: On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote: On 12/11/14 1:35 AM, Bruce Momjian wrote: I have heard repeated concerns about the commitfest process in the past few months. The fact we have been in a continual commitfest since August also is concerning. I realized the other day, I'm embracing the idea of a continual commitfest. I'm still working on patches from the last commitfest. Why not? They didn't expire. Sure, it would have been nicer to get them done sooner, but what are you gonna do? The fact that Nov 15 is now Dec 15 isn't going to change the fact that I have a few hours to spare right now and the patches are still relevant. As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. Wow, that's radical, and interesting. Actually, to me it sounds a lot like what we did 10 years ago before the commitfests except with a way to track the patches (other than the mail archives) and more people participating in patch reviews.
Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra t...@fuzzy.cz wrote: No, it's not rescanned. It's scanned only once (for the batch #0), and tuples belonging to the other batches are stored in files. If the number of batches needs to be increased (e.g. because of incorrect estimate of the inner table), the tuples are moved later. Yeah, I think I sort of knew that, but I got confused. Thanks for clarifying. The idea was that if we could increase the load a bit (e.g. using 2 tuples per bucket instead of 1), we will still use a single batch in some cases (when we miss the work_mem threshold by just a bit). The lookups will be slower, but we'll save the I/O. Yeah. That seems like a valid theory, but your test results so far seem to indicate that it's not working out like that - which I find quite surprising, but, I mean, it is what it is, right? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Mark Dilger hornschnor...@gmail.com writes: The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch. BTW, how are you finding these? If it's some automated tool, what? (If you're finding them by hand, I'm in awe of your scan rate.) regards, tom lane I've been advised to hoodwink you and say that I have found them all by hand. Actually, I am changing the typedef in src/include/c.h and then recompiling, and watching for compiler warnings telling me that I am passing a uint64 to a %d. Of course, I get lots of warnings about passing a uint64 to a %u, but I can filter through those easily enough. Also, the macros in outfuncs.c tend to complain in a similar way, and I can check that output, too. mark
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 2:05 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote: Tom Lane-2 wrote: Robert Haas robertmhaas@ writes: On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus josh@ wrote: While there were technical issues, 9.4 dragged a considerable amount because most people were ignoring it in favor of 9.5 development. I think 9.4 dragged almost entirely because of one issue: the compressibility of JSONB. 2. The amount of pre-release testing we get from people outside the hard-core development crowd seems to be continuing to decrease. We were fortunate that somebody found the JSONB issue before it was too late to do anything about it. Personally, I'm very worried that there are other such bugs in 9.4. But I've given up hoping that any more testing will happen until we put out something that calls itself 9.4.0, which is why I voted to release in the core discussion about it. The compressibility properties of a new type seem like something that should be mandated before it is committed - it shouldn't require good fortune that Odds are the next problem will have nothing to do with compressibility --- we can't assume old failures will repeat themselves. tl;dr: assign two people, a manager/curator and a lead reviewer. Give the curator better tools and the responsibility to engage the community. If the primary reviewer cannot review a patch in the current commitfest it can be marked "awaiting reviewer" and moved to the next CF for evaluation by the next pair of individuals. At minimum, though, if it does get moved the manager AND reviewer need to comment on why it was not handled during their CF. Also, formalize documentation targeting developers and reviewers just like the documentation for users has been formalized and committed to the repository. That knowledge and history is probably more important than the source code commit log and definitely should be more accessible to newcomers.
While true, a checklist of things to look for and evaluate when adding a new type to the system still has value. How new types interact, if at all, with TOAST seems like something that warrants explicit attention before commit, and there are probably others (e.g., OIDs, input/output function volatility and configuration, etc...). Maybe this exists somewhere, but if you are considering improvements to the commitfest application, a top-of-page always-visible checklist that can be bootstrapped based upon the patch/feature type and modified for specific nuances of the item in question seems like it would be valuable. If this had been in place before 9.3, then whatever category the multi-xact patches fall into would want to have its template modified to incorporate the various research and testing - along with links to the discussions - that resulted from the various bug reports that were submitted. It could even be structured to both be an interactive checklist and act as curated documentation for developers and reviewers. The wiki has some of this, but if the goal is to encourage people to learn how to contribute to PostgreSQL, it should receive a similar level of attention and quality control to that which our documentation for people wanting to learn how to use PostgreSQL receives. But that is above and beyond the simple goal of having meaningful checklists attached to each of the major commitfest items, whose TODO items can be commented upon and serve as a reference for how close to commit a feature may be. Comments can be as simple as a place for people to upload a psql script and say "this is what I did and everything seemed to work/fail in the way I expected - on this platform". Curation is hard, though, so I get why easier actions - mainly just providing links to the mailing list - are what is currently being used.
Instead of the CF manager being a reviewer (which is a valid approach), having them be a curator and providing better tools geared toward that role (both to do the work and to share the results) seems like a better ideal. Alongside that role, a CF reviewer should maybe also be assigned. The two people - reviewer and manager - would then be responsible for ensuring that reviews are happening and that the communication to and recruitment from the community is being handled, respectively. Just some off-the-cuff thoughts... David J.
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Dec 11, 2014 at 1:11 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan p...@heroku.com wrote: Attached revision, v1.6, slightly tweaks the ordering of per-statement trigger execution. Right now, there is no way for a before row insert/update trigger to determine whether it was called as part of an INSERT ... ON CONFLICT UPDATE or not. It's also not possible for a DO INSTEAD trigger on a view (a before row insert trigger) to determine that it was called specifically due to an INSERT ... IGNORE (which I think ought to imply that any corresponding, redirected insertion into a table should also use IGNORE - that's at least going to be something that a certain number of apps will need to be made robust against). The question is: Do we want to expose this distinction to triggers? The natural way to do so would probably be to add a TG_SPECULATIVE special variable to plpgsql (and equivalent variables in other PLs). This text variable would be either "UPSERT" or "IGNORE"; it would be NULL when it was not applicable (e.g. with traditional INSERTs). How do people feel about this? Is it important to include this in our initial cut of the feature? I thought that I'd avoid that kind of thing until prompted to address it by others, since it probably won't end up being a common concern, but I'd like to hear a few opinions. It's probably something we should add, but there's enough to do getting the basic feature working first. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Commitfest problems
On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/11/2014 01:07 PM, Bruce Momjian wrote: On Thu, Dec 11, 2014 at 03:47:29PM -0500, Peter Eisentraut wrote: On 12/11/14 1:35 AM, Bruce Momjian wrote: I have heard repeated concerns about the commitfest process in the past few months. The fact we have been in a continual commitfest since August also is concerning. I realized the other day, I'm embracing the idea of a continual commitfest. I'm still working on patches from the last commitfest. Why not? They didn't expire. Sure, it would have been nicer to get them done sooner, but what are you gonna do? The fact that Nov 15 now Dec 15 isn't going to change the fact that I have a few hours to spare right now and the patches are still relevant. As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. Wow, that's radical, and interesting. Actually, to me it sounds a lot like what we did 10 years ago before the commitfests except with a way to track the patches (other than the mail archives) and more people participating in patch reviews. Yes, it does remind me of the mbox files I put on the web. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
I spoke too soon. Actually, the Oid definition is in src/include/postgres_ext.h, which I'm sure you all know. But I'm changing other typedefs too, not just for Oid. I've been going through a copy of the code and replacing int32 and uint32 with the appropriate type such as Oid, TransactionId, and such, and have created my own typedef for TypeModType, in order to help chase through the code and figure out what functions handle what kind of data. By defining TypeModType to int64, for example, I get lots of compiler errors when passing a TypeModType * into a function that takes int32 *, and that lets me know that the callers of that function think of it as a typemod argument, even if it does not have any clear documentation to that effect. The exercise is a bit tedious, as I have to change lots of strings like "the value %d is out of bounds" to something like "the value TYPEMODTYPE_FORMAT is out of bounds", but the clarity I get from it helps enough that it is useful to me. I hope to eventually be able to simply pass switches to configure like --64bit-oids and have the whole system work with Oid defined as 64 bits. Likewise for varlena headers. I ported the whole of postgres to 64-bit Oids and 64-bit varlena headers once before, and it worked and passed all the regression tests and such, but I did it by replacing lots of instances of int32 with instances of int64, so it didn't help clarify the meaning of anything. This time, I'm hoping that I can keep in sync with all the commits so that I can build a 32-bit or a 64-bit postgres at a moment's notice. Mark Dilger On Thu, Dec 11, 2014 at 1:25 PM, Mark Dilger hornschnor...@gmail.com wrote: On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Mark Dilger hornschnor...@gmail.com writes: The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch. BTW, how are you finding these? If it's some automated tool, what?
(If you're finding them by hand, I'm in awe of your scan rate.) regards, tom lane I've been advised to hoodwink you and say that I have found them all by hand. Actually, I am changing the typedef in src/include/c.h and then recompiling, and watching for compiler warnings telling me that I am passing a uint64 to a %d. Of course, I get lots of warnings about passing a uint64 to a %u, but I can filter through those easily enough. Also, the macros in outfuncs.c tend to complain in a similar way, and I can check that output, too. mark
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
On Thu, Dec 11, 2014 at 01:25:00PM -0800, Mark Dilger wrote: On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Mark Dilger hornschnor...@gmail.com writes: The attached patch changes the type of chunk_seq to int32, rather than changing the %d formatting. The other changes are the same as in the previous patch. BTW, how are you finding these? If it's some automated tool, what? (If you're finding them by hand, I'm in awe of your scan rate.) regards, tom lane I've been advised to hoodwink you and say that I have found them all by hand. Actually, I am changing the typedef in src/include/c.h and then recompiling, and watching for compiler warnings telling me that I am passing a uint64 to a %d. Of course, I get lots of warnings about passing a uint64 to a %u, but I can filter through those easily enough. Cool idea! -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
FWIW I don't think any amount of process would have gotten multixact to not have the copious bugs it had. It was just too complex a patch, doing ugly things to parts too deeply linked to the inner guts of the server. We might have spared a few with some extra testing (such as the idiotic wraparound bugs, and perhaps the pg_upgrade problems), but most of the rest of it would have taken a real production load to notice -- which is exactly what happened. (Of course, review from Tom might have unearthed many of these bugs before they hit final release, but we're saying we don't want to be dependent on Tom's review for every patch so that doesn't count.) Review is good, but (as history shows) some bugs can slip through even extensive review such as the one multixacts got from Noah and Andres. Had anyone put some real stress on the beta, we could have noticed some of these bugs much earlier. One of the problems we have is that our serious users don't seem to take the beta period seriously. All our users are too busy for that, it seems, and they expect that somebody else will test the point-zero release, and that by the time it hits point-one or point-two it will be bug-free, which is quite clearly not the case. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
Bruce Momjian wrote: On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote: Actually, to me it sounds a lot like what we did 10 years ago before the commitfests except with a way to track the patches (other than the mail archives) and more people participating in patch reviews. Yes, it does remind me of the mbox files I put on the web. The whole commitfest process was put in place in part precisely so we could get rid of your mboxen. Please let's not get back to that! -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 1:49 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Review is good, but (as history shows) some bugs can slip through even extensive review such as the one multixacts got from Noah and Andres. Had anyone put some real stress on the beta, we could have noticed some of these bugs much earlier. One of the problems we have is that our serious users don't seem to take the beta period seriously. All our users are too busy for that, it seems, and they expect that somebody else will test the point-zero release, and that by the time it hits point-one or point-two it will be bug-free, which is quite clearly not the case. Good, strategic stress testing has a big role to play here IMV. I have had some difficulty generating interest in that, but I think it's essential. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
Peter Eisentraut-2 wrote: On 12/11/14 1:35 AM, Bruce Momjian wrote: I have heard repeated concerns about the commitfest process in the past few months. The fact we have been in a continual commitfest since August also is concerning. I realized the other day, I'm embracing the idea of a continual commitfest. As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. Having just written about the idea of a CF manager and a CF reviewer... while having a continual CF works in theory, I think breaking it down into, say, quarters and having each pair commit to 3 months of being in charge has some logic behind it. The patch list concept should be formalized, and should include a targeted release concept. The idea of a CF seems like it should apply to the people involved and include an explicit choosing of items from the patch list that are going to get attention by those people during the CF period. A policy of every new addition being added to the next CF, along with an item being involved in at least every other CF, would make sense. Moving along the path of continuous delivery, maybe just create CF teams instead of a monolithic CF process. David J.
Re: [HACKERS] Commitfest problems
On 12/11/2014 01:52 PM, Alvaro Herrera wrote: Bruce Momjian wrote: On Thu, Dec 11, 2014 at 01:12:30PM -0800, Joe Conway wrote: Actually, to me it sounds a lot like what we did 10 years ago before the commitfests except with a way to track the patches (other than the mail archives) and more people participating in patch reviews. Yes, it does remind me of the mbox files I put on the web. The whole commitfest process was put in place in part precisely so we could get rid of your mboxen. Please let's not get back to that! Well, the CF process was added for 2 reasons: 1) We were losing track of some patches until integration time, when Bruce suddenly found them in the -hackers archives. 2) In order to give the senior committers some time *off* from reviewing other people's patches. The idea was to spend 3 weeks or so intensively reviewing others' patches, and then to have a month (or more) *off* from working on anything but your own stuff. While the CFs are still doing (1), support for (2) ended sometime in the 9.3 development cycle. Partly this is because current CFMs are not permitted to take authoritative steps to ensure that the CF ends on time, and partly it's because of the increase in big complicated patches which just don't fit into the CF cycle. Speaking as the originator of commitfests, they were *always* intended to be a temporary measure, a step on the way to something else like continuous integration. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Commitfest problems
On Thu, Dec 11, 2014 at 1:07 PM, Bruce Momjian br...@momjian.us wrote: As far as I'm concerned, we might as well just have one commitfest per major release. Call it a patch list. Make the list sortable by created date and last-updated date, and let the system police itself. At least that's honest. Wow, that's radical, and interesting. Agreed. I don't think it's radical, though - it's just acknowledging the elephant in the room, which is that the commitfests in a given cycle are not really distinct at all. I suspect that better tooling had a lot to do with the success of commitfests, which is more or less incidental to how they were originally conceived. There is still room for improvement there. I'd have to think about it some more, but I'd almost be ready to vote for formalizing how things actually work in practice given the chance (i.e. Voting to follow Peter's suggestion). -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
Josh Berkus j...@agliodbs.com writes: Well, the CF process was added for 2 reasons: 1) We were losing track of some patches until integration time, when Bruce suddently found them in the -hackers archives. 2) In order to give the senior committers some time *off* from reviewing other people's patches. The idea was to spend 3 weeks or so intensively reviewing others' patches, and then to have a month (or more) *off* from working on anything but your own stuff. Right; I was in the middle of composing words to the same effect when this arrived. While the CFs are still doing (1), support for (2) ended sometime in the 9.3 development cycle. Partly this is because current CFMs are not permitted to take authoritative steps to ensure that the CF ends on time, and partly it's because of the increase in big complicated patches which just don't fit into the CF cycle. I don't see why you think CFMs are not permitted to close out a CF when they want to. At least some of the fests have been closed out per calendar schedule, punting unprocessed patches to the next fest. We've certainly utterly failed to do that since August, but I think that's mismanagement. Speaking as the originator of commitfests, they were *always* intended to be a temporary measure, a step on the way to something else like continuous integration. Really? How would that address (2)? And please note that part of the problem we're having right now is that senior people are rebelling against no longer having any agreed-on time off. IMV, goal (2) is not optional; if you try to force people into continuous review work, you'll just lose them completely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/11/2014 02:12 PM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: While the CFs are still doing (1), support for (2) ended sometime in the 9.3 development cycle. Partly this is because current CFMs are not permitted to take authoritative steps to ensure that the CF ends on time, and partly it's because of the increase in big complicated patches which just don't fit into the CF cycle. I don't see why you think CFMs are not permitted to close out a CF when they want to. At least some of the fests have been closed out per calendar schedule, punting unprocessed patches to the next fest. We've certainly utterly failed to do that since August, but I think that's mismanagement. I love how you're willing to accuse Michael of mismanagement when you yourself have *never* run a CF. If I was Michael, I'd quit right now. Every CFM who has taken forceful measures to close out the CF on time has gotten a large amount of flack for it, and absolutely zero back-up from the committers or the core team. This is why David, Robert and I will no longer manage CFs (and probably others as well). The CFM is expected to miraculously end the CF on time, without bouncing or forwarding unreviewed patches, cutting off patches which already had one round of review, bouncing out or holding patches from contributors who don't participate in the review process, nagging anyone too much, or taking any other steps which anyone might take exception to. In fact, the technique you cite (just punting the patches to the next CF) resulted in Fetter getting a great deal of hassling on this list when he did it. The only people who backed him up on it were other former CFMs. And I closed out my CF on my expected schedule (based on number of patches), but only after taking so much shit for it, I'm never doing it again. And I have a pretty high tolerance for taking shit.
So now you're left with CFMs who won't do anything to wind up the CF because they know that the committers and hackers will *not* support them if they do. So, eternal CF. How about *you* run the next one, Tom? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
On 11.12.2014 22:16, Robert Haas wrote: On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra t...@fuzzy.cz wrote: No, it's not rescanned. It's scanned only once (for the batch #0), and tuples belonging to the other batches are stored in files. If the number of batches needs to be increased (e.g. because of incorrect estimate of the inner table), the tuples are moved later. Yeah, I think I sort of knew that, but I got confused. Thanks for clarifying. The idea was that if we could increase the load a bit (e.g. using 2 tuples per bucket instead of 1), we will still use a single batch in some cases (when we miss the work_mem threshold by just a bit). The lookups will be slower, but we'll save the I/O. Yeah. That seems like a valid theory, but your test results so far seem to indicate that it's not working out like that - which I find quite surprising, but, I mean, it is what it is, right? Not exactly. My tests show that as long as the outer table batches fit into page cache, increasing the load factor results in worse performance than batching. When the outer table is sufficiently small, the batching is faster. Regarding "sufficiently small" - considering today's hardware, we're probably talking about gigabytes. On machines with significant memory pressure (forcing the temporary files to disk), it might be much lower, of course. Of course, it also depends on kernel settings (e.g. dirty_bytes/dirty_background_bytes). If we could identify those cases (at least the "temp files fit in RAM" case) then maybe we could do this. Otherwise we're going to penalize all the other queries ... Maybe the best solution for now is the "increase work_mem a bit" recommendation. regards Tomas
Re: [HACKERS] Commitfest problems
Josh Berkus j...@agliodbs.com writes: How about *you* run the next one, Tom? I think the limited amount of time I can put into a commitfest is better spent on reviewing patches than on managing the process. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX SCHEMA
On 9 December 2014 at 12:21, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Dec 9, 2014 at 6:43 PM, Michael Paquier michael.paqu...@gmail.com wrote: OK. Perhaps that's not worth mentioning in the release notes, but some users may be used to the old behavior. What about the other issues (regression test for permissions incorrect and matviews)? Here is in any case an updated patch to fix all those things rebased on latest HEAD (ae4e688). Thanks, I rewrote and applied this patch to ensure we didn't introduce further bugs. Tests for the reindex part were rewritten from scratch also. Rather unluckily that seems to have coincided with Tom's changes, so I've only added the bits that have nothing to do with Tom's changes - all of which stand. (I just got going again after my flight back from Tokyo). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
Tom == Tom Lane t...@sss.pgh.pa.us writes:

What that code does is produce plans that look like this:

GroupAggregate
  -> Sort
     -> ChainAggregate
        -> Sort
           -> ChainAggregate

in much the same way that WindowAgg nodes are generated.

Tom That seems pretty messy, especially given your further comments Tom that these plan nodes are interconnected and know about each Tom other (though you failed to say exactly how).

I'd already explained in more detail way back when we posted the patch. But to reiterate: the ChainAggregate nodes pass through their input data unchanged, but on group boundaries they write aggregated result rows to a tuplestore shared by the whole chain. The top node returns the data from the tuplestore after its own output is completed. The chain_head pointer in the ChainAggregate nodes is used for:

- obtaining the head node's targetlist and qual, to use to project rows into the tuplestore (the ChainAggregate nodes don't do ordinary projection so they have dummy targetlists like the Sort nodes do)
- obtaining the pointer to the tuplestore itself
- on rescan without parameter change, to inform the parent node whether or not the child nodes are also being rescanned (since the Sort nodes may or may not block this)

Tom The claimed analogy to WindowAgg therefore seems bogus since Tom stacked WindowAggs are independent, AFAIR anyway.

The analogy is only in that they need to see the same input rows but in different sort orders.

Tom I'm also wondering about performance: doesn't this imply more Tom rows passing through some of the plan steps than really Tom necessary?

There's no way to cut down the number of rows seen by intermediate nodes unless you implement (and require) associative aggregates, which we do not do in this patch (that's left for possible future optimization efforts). Our approach makes no new demands on the implementation of aggregate functions.

Tom Also, how would this extend to preferring hashed aggregation in Tom some of the grouping steps?
My suggestion for extending it to hashed aggs is: by having a (single) HashAggregate node keep multiple hash tables, one per grouping set, any arbitrary collection of grouping sets can be handled in one node, provided that memory permits and no non-hashable features are used. So the normal plan for CUBE(a,b) under this scheme would be just:

HashAggregate
  Grouping Sets: (), (a), (b), (a,b)
  -> (input path in unsorted order)

If a mixture of hashable and non-hashable data types is used, for example CUBE(hashable,unhashable), then a plan of this form could be constructed:

HashAggregate
  Grouping Sets: (), (hashable)
  -> ChainAggregate
       Grouping Sets: (unhashable), (unhashable,hashable)
       -> (input path sorted by (unhashable,hashable))

Likewise, plans of this form could be considered for cases like CUBE(low_card, high_card) where hashed grouping on high_card would require excessive memory:

HashAggregate
  Grouping Sets: (), (low_card)
  -> ChainAggregate
       Grouping Sets: (high_card), (high_card, low_card)
       -> (input path sorted by (high_card, low_card))

Tom> ISTM that maybe what we should do is take a cue from the SQL spec, which defines these things in terms of UNION ALL of plain-GROUP-BY operations reading from a common CTE.

I looked at that; in fact that was our original plan, but it became clear quite quickly that it was not going to be easy. I tried two different approaches. The first was to actually re-plan the input (i.e. running query_planner more than once) for different sort orders; that crashed and burned quickly thanks to the extent to which the planner assumes that it'll be run once only on any given input. The second was to generate a CTE for the input data. This didn't get very far because everything that already exists to handle CTE nodes assumes that they are explicit in the planner's input (that they have their own Query node, etc.) and I was not able to determine a good solution.
If you have any suggestions for how to approach the problem, I'm happy to have another go at it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
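Andrew's multiple-hash-tables idea above can be sketched in miniature (Python, purely illustrative — this is not executor code, and all names are hypothetical): a single aggregation step keeps one hash table per grouping set and feeds every input row to each of them, here with count(*) as the aggregate.

```python
from collections import defaultdict

def hash_grouping_sets(rows, grouping_sets):
    # One hash table per grouping set; every input row updates all tables.
    tables = [defaultdict(int) for _ in grouping_sets]
    for row in rows:
        for table, cols in zip(tables, grouping_sets):
            key = tuple(row[c] for c in cols)
            table[key] += 1  # count(*) transition function
    return tables

# CUBE(a,b) expands to the four grouping sets (), (a), (b), (a,b)
rows = [{'a': 1, 'b': 'x'}, {'a': 1, 'b': 'y'}, {'a': 2, 'b': 'x'}]
tables = hash_grouping_sets(rows, [(), ('a',), ('b',), ('a', 'b')])
```

Memory use grows with the number of sets times their group cardinalities, which is why the proposal falls back to ChainAggregate for high-cardinality or non-hashable columns.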
Re: [HACKERS] Commitfest problems
On Fri, Dec 12, 2014 at 7:27 AM, Josh Berkus j...@agliodbs.com wrote:
> On 12/11/2014 02:12 PM, Tom Lane wrote:
>> Josh Berkus j...@agliodbs.com writes:
>>> While the CFs are still doing (1), support for (2) ended sometime in the 9.3 development cycle. Partly this is because current CFMs are not permitted to take authoritative steps to ensure that the CF ends on time, and partly it's because of the increase in big complicated patches which just don't fit into the CF cycle.
>> I don't see why you think CFMs are not permitted to close out a CF when they want to. At least some of the fests have been closed out per calendar schedule, punting unprocessed patches to the next fest. We've certainly utterly failed to do that since August, but I think that's mismanagement.
> I love how you're willing to accuse Michael of mismanagement when you yourself have *never* run a CF. If I was Michael, I'd quit right now.

Er, just to be clear, I was basically not the CF manager for the one that began in October. I just showed up because I had some hours left and things were not moving on for many patches. Now, as I think that nobody is volunteering, I am fine with being the next CF manager for the one beginning on Monday. That would help make things move on.

Regards,
-- Michael
Re: [HACKERS] Commitfest problems
On Fri, Dec 12, 2014 at 7:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Josh Berkus j...@agliodbs.com writes:
>> How about *you* run the next one, Tom?
> I think the limited amount of time I can put into a commitfest is better spent on reviewing patches than on managing the process.

That's not only your case; I have the same feeling that time is more productive when focused on a single task. CFM work takes your energy away from other things, and a CFM who reviews patches would tend to miss things in the patches he looked at.

-- Michael
Re: [HACKERS] Proposal : REINDEX SCHEMA
On Fri, Dec 12, 2014 at 8:00 AM, Simon Riggs si...@2ndquadrant.com wrote: On 9 December 2014 at 12:21, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Dec 9, 2014 at 6:43 PM, Michael Paquier michael.paqu...@gmail.com wrote: OK. Perhaps that's not worth mentioning in the release notes, but some users may be used to the old behavior. What about the other issues (regression test for permissions incorrect and matviews)? Here is in any case an updated patch to fix all those things rebased on latest HEAD (ae4e688). Thanks, I rewrote and applied this patch to ensure we didn't introduce further bugs. Tests for the reindex part were rewritten from scratch also. Rather unluckily that seems to have coincided with Tom's changes, so I've only added the bits that have nothing to do with Tom's changes - all of which stand. Glad that things are in order now. I have nothing else left to complain about with this feature. Thanks. (I just got going again after my flight back from Tokyo). I understand going east makes one day longer :) -- Michael
Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
> This issue is that some query returns duplicate rows after FOR UPDATE was blocked, in other words, after getting HeapTupleUpdated in ExecLockRows.

Good catch!

> In the EPQ block in ExecScanFetch, it seems it should return NULL if the relation does not have a test tuple. The attached patch does so on the current master and the problem has disappeared.

... but bad fix. This would break join cases, wherein it's important that other scan nodes still return all the required tuples. (It's unfortunate that the eval-plan-qual isolation test fails to cover joins :-(.)

After digging around a bit, I realized that the problem is actually in nodeLockRows.c. It is supposed to do EvalPlanQualSetTuple(..., NULL) for each child table that's not supposed to return any row during the current EPQ test cycle. Unfortunately, it only does that reliably once the EPQ environment is already set up. If we discover we need an EPQ test while looking at a non-first child table, tables already passed over in the loop over node->lr_arowMarks don't get the word.

So the correct fix (or a correct fix, anyway; this could also have been done with more-invasive loop logic changes) is as attached. I'm working on back-patching this.

regards, tom lane

diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 990240b..5dcb9b0 100644
*** a/src/backend/executor/nodeLockRows.c
--- b/src/backend/executor/nodeLockRows.c
*************** lnext:
*** 194,200 ****
--- 194,222 ----
  			 */
  			if (!epq_started)
  			{
+ 				ListCell   *lc2;
+ 
  				EvalPlanQualBegin(&node->lr_epqstate, estate);
+ 
+ 				/*
+ 				 * Ensure that rels with already-visited rowmarks are told
+ 				 * not to return tuples during the first EPQ test.  We can
+ 				 * exit this loop once it reaches the current rowmark;
+ 				 * rels appearing later in the list will be set up
+ 				 * correctly by the EvalPlanQualSetTuple call at the top
+ 				 * of the loop.
+ 				 */
+ 				foreach(lc2, node->lr_arowMarks)
+ 				{
+ 					ExecAuxRowMark *aerm2 = (ExecAuxRowMark *) lfirst(lc2);
+ 
+ 					if (lc2 == lc)
+ 						break;
+ 					EvalPlanQualSetTuple(&node->lr_epqstate,
+ 										 aerm2->rowmark->rti,
+ 										 NULL);
+ 				}
+ 
  				epq_started = true;
  			}

diff --git a/src/test/isolation/expected/eval-plan-qual.out b/src/test/isolation/expected/eval-plan-qual.out
index ab778cb..0f6595f 100644
*** a/src/test/isolation/expected/eval-plan-qual.out
--- b/src/test/isolation/expected/eval-plan-qual.out
*** 49,51 ****
--- 49,74 ----
  checking 600
  savings 2334
+ 
+ starting permutation: readp1 writep1 readp2 c1 c2
+ step readp1: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE;
+ tableoid ctid a b c
+ 
+ c1 (0,1) 0 0 0
+ c1 (0,4) 0 1 0
+ c2 (0,1) 1 0 0
+ c2 (0,4) 1 1 0
+ c3 (0,1) 2 0 0
+ c3 (0,4) 2 1 0
+ step writep1: UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0;
+ step readp2: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; <waiting ...>
+ step c1: COMMIT;
+ step readp2: <... completed>
+ tableoid ctid a b c
+ 
+ c1 (0,1) 0 0 0
+ c1 (0,4) 0 1 0
+ c2 (0,1) 1 0 0
+ c3 (0,1) 2 0 0
+ c3 (0,4) 2 1 0
+ step c2: COMMIT;

diff --git a/src/test/isolation/specs/eval-plan-qual.spec b/src/test/isolation/specs/eval-plan-qual.spec
index 786b791..876e547 100644
*** a/src/test/isolation/specs/eval-plan-qual.spec
--- b/src/test/isolation/specs/eval-plan-qual.spec
*** 8,18 ****
--- 8,27 ----
  setup
  {
   CREATE TABLE accounts (accountid text PRIMARY KEY, balance numeric not null);
   INSERT INTO accounts VALUES ('checking', 600), ('savings', 600);
+ 
+  CREATE TABLE p (a int, b int, c int);
+  CREATE TABLE c1 () INHERITS (p);
+  CREATE TABLE c2 () INHERITS (p);
+  CREATE TABLE c3 () INHERITS (p);
+  INSERT INTO c1 SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a;
+  INSERT INTO c2 SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a;
+  INSERT INTO c3 SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a;
  }
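The shape of Tom's fix can be illustrated with a toy model (Python; the function and data names are hypothetical, not the real executor API): when the EPQ test is first triggered while locking rowmark i, every rowmark visited before i must have its test tuple explicitly cleared to NULL, otherwise the corresponding child scan keeps returning a stale row during the recheck.

```python
def start_epq(rowmarks, trigger_index, epq_tuples):
    # Mimics the foreach(lc2, node->lr_arowMarks) loop added by the fix:
    # rels already passed over get a NULL test tuple; rels at or after the
    # trigger point are handled by the main loop as before.
    for rm in rowmarks[:trigger_index]:
        epq_tuples[rm] = None
    return epq_tuples

# EPQ triggered while locking the second child: c1 was visited earlier and
# must be told to return no tuple; c3 is set up later by the main loop.
tuples = start_epq(['c1', 'c2', 'c3'], 1, {'c1': ('(0,1)', 0, 0, 0)})
```

Without the clearing step, c1's leftover test tuple is exactly what produced the duplicate rows in the reported query.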
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Dec 12, 2014 at 1:34 AM, Bruce Momjian br...@momjian.us wrote:
> On Thu, Dec 11, 2014 at 01:26:38PM +0530, Rahila Syed wrote:
>>> I am sorry but I can't understand the above results due to wrapping. Are you saying compression was twice as slow?
>> CPU usage at user level (in seconds) for compression set 'on' is 562 secs while that for compression set 'off' is 354 secs. As per the readings, it takes a little less than double the CPU time to compress. However, the total time taken to run 25 transactions for each of the scenarios is as follows: compression = 'on': 1838 secs; = 'off': 1701 secs. The difference is around 140 secs.
> OK, so the compression took 2x the CPU and was 8% slower. The only benefit is WAL files are 35% smaller?

That depends as well on the compression algorithm used. I am far from being a specialist in this area, but I guess that there are algorithms consuming less CPU for a lower compression ratio, and that there are no magic solutions. A correct answer would be to either change the compression algorithm present in core to something better suited to FPW compression, or to add hooks to allow people to plug in the compression algorithm they want for the compression and decompression calls. In any case, and for any type of compression (be it a different algorithm, record-level compression or FPW compression), what we have here is a tradeoff, and a switch for people who care more about I/O than CPU usage. And we would still face CPU bursts at checkpoints in any case, because I can't imagine FPWs not being compressed even if we do something at record level (so what we have here is the light-compression version).

Regards,
-- Michael
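The CPU-versus-size tradeoff discussed above can be felt with a toy measurement. This sketch uses Python's zlib as a stand-in compressor — an assumption for illustration only, not what the patch actually uses — applied to a synthetic 8kB "page image".

```python
import os
import time
import zlib

def compress_page(page: bytes, level: int = 6):
    # Returns (compression ratio, seconds spent) for one page image.
    start = time.perf_counter()
    compressed = zlib.compress(page, level)
    return len(compressed) / len(page), time.perf_counter() - start

# A page that is half zeroed (free space) and half random bytes: the
# zeroed half compresses almost entirely away, the random half does not.
page = bytes(4096) + os.urandom(4096)
ratio, elapsed = compress_page(page)
```

Raising the compression level spends noticeably more CPU for little extra gain on data like this — the same kind of tradeoff as the 2x-CPU / 35%-smaller-WAL numbers in the thread.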
Re: [HACKERS] Proposal : REINDEX SCHEMA
On Fri, Dec 12, 2014 at 8:41 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Dec 12, 2014 at 8:00 AM, Simon Riggs si...@2ndquadrant.com wrote: Rather unluckily that seems to have coincided with Tom's changes, so I've only added the bits that have nothing to do with Tom's changes - all of which stand. Glad that things are in order now. I have nothing else left to complain about with this feature. Thanks. Actually there is one thing left: this patch from Sawada-san to finish the cleanup of ReindexStmt for the boolean flags. http://www.postgresql.org/message-id/cad21aocxfe1j+hskbej80rqqnhr8m_yuxdgkwz4dl8zpqt8...@mail.gmail.com Regards, -- Michael
Re: [HACKERS] Proposal : REINDEX SCHEMA
On 11 December 2014 at 23:41, Michael Paquier michael.paqu...@gmail.com wrote:
> Glad that things are in order now. I have nothing else left to complain about with this feature. Thanks.

I think you've discovered a new meaning of "Ready for Committer"

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Hello,

>>> This issue is that some query returns duplicate rows after FOR UPDATE was blocked, in other words, after getting HeapTupleUpdated in ExecLockRows.
>> Good catch!

Thank you.

>>> In the EPQ block in ExecScanFetch, it seems it should return NULL if the relation does not have a test tuple. The attached patch does so on the current master and the problem has disappeared.
>> ... but bad fix. This would break join cases, wherein it's important that other scan nodes still return all the required tuples. (It's unfortunate that the eval-plan-qual isolation test fails to cover joins :-(.)

Hmm. I regretfully forgot to run the isolation check. The code in question didn't seem like a bug to me, but I couldn't see its intention.

>> After digging around a bit, I realized that the problem is actually in nodeLockRows.c. It is supposed to do EvalPlanQualSetTuple(..., NULL) for each child table that's not supposed to return any row during the current EPQ test cycle. Unfortunately, it only does that reliably once the EPQ environment is already set up. If we discover we need an EPQ test while looking at a non-first child table, tables already passed over in the loop over node->lr_arowMarks don't get the word.

Thank you for the detailed explanation. I was wondering during the investigation why EvalPlanQualSetTuple(..., NULL) was called only for children after the first one that had an EPQ test tuple. Now I understand that it was the core of this bug.

>> So the correct fix (or a correct fix, anyway; this could also have been done with more-invasive loop logic changes) is as attached. I'm working on back-patching this.

That really helps. Thank you.

regards,
-- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Proposal : REINDEX SCHEMA
On Fri, Dec 12, 2014 at 10:19 AM, Simon Riggs si...@2ndquadrant.com wrote: On 11 December 2014 at 23:41, Michael Paquier michael.paqu...@gmail.com wrote: Glad that things are in order now. I have nothing else left to complain about with this feature. Thanks. I think you've discovered a new meaning of Ready for Committer Yep. Sorry for missing so many things. -- Michael
Re: [HACKERS] double vacuum in initdb
On 12/11/14 11:44 AM, Kevin Grittner wrote: We want to finish with VACUUM FREEZE without the FULL, unless we don't care about missing visibility maps and free space maps. Why would we care, and if we do, why does VACUUM FULL remove them? You can also run plain VACUUM after FULL to put the maps back. But the documentation is apparently missing details about this.
Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
> This is caused by IndexRecheck examining the test tuple with a qual c = '0' without b IN ('0', '1'). That part has been removed in create_indexscan_plan. It decides whether to remove a qual or not using get_parse_rowmark(root->parse) (i.e. root->parse->rowMarks) and predicate_implied_by(). But the former always says no (NULL) for child relations even if the parent has rowMarks. On the other hand, rowmarks on children are already distributed at that time by expand_inherited_rtentry() into root->rowMarks. So I replaced the get_parse_rowmark() with get_plan_rowmark() as in the attached patch and the problem disappeared.

Yeah, this is clearly a thinko: really, nothing in the planner should be using get_parse_rowmark(). I looked around for other errors of the same type and found that postgresGetForeignPlan() is also using get_parse_rowmark(). While that's harmless at the moment because we don't support foreign tables as children, it's still wrong. Will fix that too.

regards, tom lane
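The thinko is easy to model in miniature (Python; the dict-based lookups are hypothetical stand-ins for get_parse_rowmark() and get_plan_rowmark(), not the real planner API): parse-time rowmarks name only the relations written in the query, while plan-time rowmarks, filled in by expand_inherited_rtentry(), also cover inheritance children.

```python
def get_parse_rowmark(parse_rowmarks, rti):
    # Parse-time lookup: knows only rels mentioned in the query text.
    return parse_rowmarks.get(rti)

def get_plan_rowmark(plan_rowmarks, rti):
    # Plan-time lookup: children expanded from the parent are included.
    return plan_rowmarks.get(rti)

# Query locks only the parent (rti = 1); rtis 2 and 3 are its children.
parse_marks = {1: 'FOR UPDATE'}
plan_marks = {1: 'FOR UPDATE', 2: 'FOR UPDATE', 3: 'FOR UPDATE'}
```

For a child rel (rti = 2) the parse-time lookup reports no rowmark, so the qual-dropping logic wrongly concluded no recheck was needed; the plan-time lookup correctly sees the inherited mark.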
Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
(2014/12/12 10:37), Tom Lane wrote:
> Yeah, this is clearly a thinko: really, nothing in the planner should be using get_parse_rowmark(). I looked around for other errors of the same type and found that postgresGetForeignPlan() is also using get_parse_rowmark(). While that's harmless at the moment because we don't support foreign tables as children, it's still wrong. Will fix that too.

I don't think we need to fix that too. In order to support that, I'm proposing to modify postgresGetForeignPlan() in the following way [1] (see fdw-inh-5.patch).

*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*** 824,834 **** postgresGetForeignPlan(PlannerInfo *root,
  {
  	RowMarkClause *rc = get_parse_rowmark(root->parse, baserel->relid);
  
  	if (rc)
  	{
  		/*
! 		 * Relation is specified as a FOR UPDATE/SHARE target, so handle
! 		 * that.
  		 *
  		 * For now, just ignore any [NO] KEY specification, since (a) it's
  		 * not clear what that means for a remote table that we don't have
--- 824,847 ----
  {
  	RowMarkClause *rc = get_parse_rowmark(root->parse, baserel->relid);
  
+ 	/*
+ 	 * It's possible that relation is contained in an inheritance set and
+ 	 * that parent relation is selected FOR UPDATE/SHARE. If so, get the
+ 	 * RowMarkClause for parent relation.
+ 	 */
+ 	if (rc == NULL)
+ 	{
+ 		PlanRowMark *prm = get_plan_rowmark(root->rowMarks, baserel->relid);
+ 
+ 		if (prm && prm->rti != prm->prti)
+ 			rc = get_parse_rowmark(root->parse, prm->prti);
+ 	}
+ 
  	if (rc)
  	{
  		/*
! 		 * Relation or parent relation is specified as a FOR UPDATE/SHARE
! 		 * target, so handle that.
  		 *
  		 * For now, just ignore any [NO] KEY specification, since (a) it's
  		 * not clear what that means for a remote table that we don't have

[1] http://www.postgresql.org/message-id/5487bb9d.7070...@lab.ntt.co.jp

Thanks,

Best regards, Etsuro Fujita
Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: (2014/12/12 10:37), Tom Lane wrote: Yeah, this is clearly a thinko: really, nothing in the planner should be using get_parse_rowmark(). I looked around for other errors of the same type and found that postgresGetForeignPlan() is also using get_parse_rowmark(). While that's harmless at the moment because we don't support foreign tables as children, it's still wrong. Will fix that too. I don't think we need to fix that too. In order to support that, I'm proposing to modify postgresGetForeignPlan() in the following way [1] (see fdw-inh-5.patch). My goodness, that's ugly. And it's still wrong, because this is planner code so it shouldn't be using get_parse_rowmark at all. The whole point here is that the rowmark info has been transformed into something appropriate for the planner to use. While that transformation is relatively trivial today, it might not always be so. regards, tom lane
Re: [HACKERS] Review of Refactoring code for sync node detection
On Thu, Dec 11, 2014 at 10:07 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> * I don't like filling the priorities-array in SyncRepGetSynchronousStandby(). It might be convenient for the one caller that needs it, but it seems pretty ad hoc. In fact, I don't really see the point of having the array at all. You could just print the priority in the main loop in pg_stat_get_wal_senders(). Yeah, nominally you need to hold SyncRepLock while reading the priority, but it seems over-zealous to be so strict about that in pg_stat_wal_senders(), since it's just an informational view, and priority changes so rarely that it's highly unlikely to hit a race condition there. Also note that when you change the list of synchronous standbys in the config file, and SIGHUP, the walsender processes will update their priorities in random order. So the idea that you get a consistent snapshot of the priorities is a bit bogus anyway. Also note that between filling the priorities array and the main loop in pg_stat_get_wal_senders(), a new WAL sender might connect and set a slot's pid. With the current coding, you'll print an uninitialized value from the array if that happens. So the only thing that holding the SyncRepLock really protects from is a torn read of the value, if reading an int is not atomic. We could use the spinlock to protect from that if we care.

That's fair, and it simplifies the whole process as well as the API to fetch the synchronous WAL sender.

> * Would be nicer to return a pointer than an index into the wal senders array. That's what the caller really wants. I propose the attached (I admit I haven't tested it).

Actually, if you do it this way I think that it would be worth adding the small optimization Fujii-san mentioned upthread: if the priority is equal to 1, we leave the loop earlier and return the pointer immediately. All those things gathered give the patch attached, which I actually tested FWIW with multiple standbys and multiple entries in s_s_names.
Regards,
-- Michael

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..34530a0 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -5,7 +5,7 @@
  * Synchronous replication is new as of PostgreSQL 9.1.
  *
  * If requested, transaction commits wait until their commit LSN is
- * acknowledged by the sync standby.
+ * acknowledged by the synchronous standby.
  *
  * This module contains the code for waiting and release of backends.
  * All code in this module executes on the primary. The core streaming
@@ -358,6 +358,62 @@ SyncRepInitConfig(void)
 }
 
 /*
+ * Find the WAL sender servicing the synchronous standby with the lowest
+ * priority value, or NULL if no synchronous standby is connected. If there
+ * are multiple nodes with the same lowest priority value, the first node
+ * found is selected. The caller must hold SyncRepLock.
+ */
+WalSnd *
+SyncRepGetSynchronousStandby(void)
+{
+	WalSnd	   *result = NULL;
+	int			result_priority = 0;
+	int			i;
+
+	/* Scan WAL senders and find synchronous node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+		int			this_priority;
+
+		/* Must be active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Must be streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Must be synchronous */
+		this_priority = walsnd->sync_standby_priority;
+		if (this_priority == 0)
+			continue;
+
+		/* Must have a lower priority value than any previous ones */
+		if (result != NULL && result_priority <= this_priority)
+			continue;
+
+		/* Must have a valid flush position */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		result = (WalSnd *) walsnd;
+		result_priority = this_priority;
+
+		/*
+		 * If priority is equal to 1, we are sure that there are no other
+		 * WAL senders that could be synchronous with a lower priority,
+		 * hence leave immediately with this one.
+		 */
+		if (this_priority == 1)
+			return result;
+	}
+
+	return result;
+}
+
+/*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
  *
@@ -368,11 +424,9 @@
 void
 SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
-	volatile WalSnd *syncWalSnd = NULL;
+	WalSnd	   *syncWalSnd;
 	int			numwrite = 0;
 	int			numflush = 0;
-	int			priority = 0;
-	int			i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +442,13 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
-	 * then we use the first mentioned standby. If you change this, also
-	 * change pg_stat_get_wal_senders().
+	 * then we use the first mentioned standbys.
 	 */
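The selection logic in the refactored function, including Fujii-san's early-exit optimization, can be sketched like this (Python; dicts stand in for WalSnd slots, so all field names here are illustrative):

```python
def get_synchronous_standby(walsenders):
    # Pick the active, streaming sender with the lowest nonzero priority;
    # ties go to the first one scanned, and priority 1 ends the scan early.
    result, result_priority = None, 0
    for ws in walsenders:
        priority = ws['priority']
        if ws['pid'] == 0 or ws['state'] != 'streaming' or priority == 0:
            continue
        if ws['flush'] is None:          # must have a valid flush position
            continue
        if result is not None and result_priority <= priority:
            continue
        result, result_priority = ws, priority
        if priority == 1:                # no lower value is possible
            break
    return result

senders = [
    {'pid': 101, 'state': 'streaming', 'priority': 2, 'flush': '0/3000'},
    {'pid': 0,   'state': 'streaming', 'priority': 1, 'flush': '0/3000'},
    {'pid': 103, 'state': 'streaming', 'priority': 2, 'flush': '0/3000'},
]
chosen = get_synchronous_standby(senders)
```

The second slot is skipped because its pid is 0 (inactive), and the third loses the tie to the first, mirroring the "first mentioned standby wins" rule in the patch.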
Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
(2014/12/12 11:19), Tom Lane wrote: Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: (2014/12/12 10:37), Tom Lane wrote: Yeah, this is clearly a thinko: really, nothing in the planner should be using get_parse_rowmark(). I looked around for other errors of the same type and found that postgresGetForeignPlan() is also using get_parse_rowmark(). While that's harmless at the moment because we don't support foreign tables as children, it's still wrong. Will fix that too. I don't think we need to fix that too. In order to support that, I'm proposing to modify postgresGetForeignPlan() in the following way [1] (see fdw-inh-5.patch). My goodness, that's ugly. And it's still wrong, because this is planner code so it shouldn't be using get_parse_rowmark at all. The whole point here is that the rowmark info has been transformed into something appropriate for the planner to use. While that transformation is relatively trivial today, it might not always be so. OK, I'll update the inheritance patch on top of the revison you'll make. Thanks, Best regards, Etsuro Fujita
[HACKERS] pg_regress writes into source tree
When using a vpath build, pg_regress writes the processed input/*.source files into the *source* tree, which isn't supposed to happen. This appears to be a thinko introduced in this patch: e3fc4a97bc8ee82a78605b5ffe79bd4cf3c6213b

The attached patch fixes it.

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 27c46ab..8683035 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -611,7 +611,7 @@ convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c
 static void
 convert_sourcefiles(void)
 {
-	convert_sourcefiles_in("input", inputdir, "sql", "sql");
+	convert_sourcefiles_in("input", outputdir, "sql", "sql");
 	convert_sourcefiles_in("output", outputdir, "expected", "out");
 }
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On 17 November 2014 at 22:08, Simon Riggs si...@2ndquadrant.com wrote: On 17 November 2014 21:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote: What happened to this patch? I'm going over something that could use the concept of clean some stuff up when reading this page, but only if we're already writing or similar. I see some cases were presented that had a performance decrease. Did we get any numbers for the increase in performance in some other interesting cases? It's not dead; it just needs more work. Maybe for next CF, or you can now. Latest version attached for next CF -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services hot_disable.v7.patch Description: Binary data