Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane  wrote:
>
> I wrote:
> > Amit Kapila  writes:
> >> It is slightly tricky to write a reproducible parallel-query test, but
> >> point taken and I think we should try to have a test unless such a
> >> test is really time consuming.
>
> > BTW, decent regression tests could be written without the need to create
> > enormous tables if the minimum rel size in create_plain_partial_paths()
> > could be configured to something less than 1000 blocks.  I think it's
> > fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> > we make it a GUC?
>
> Just as an experiment to see what would happen, I did
>
> -   int parallel_threshold = 1000;
> +   int parallel_threshold = 1;
>
> and ran the regression tests.  I got a core dump in the window.sql test:
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> 4307            Index       sgref = final_target->sortgrouprefs[i];
> (gdb) bt
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> #1  create_grouping_paths (root=0x1795018, input_rel=0x17957a8,
> target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
> at planner.c:3420
> #2  0x00667405 in grouping_planner (root=0x1795018,
> inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
> #3  0x00668c80 in subquery_planner (glob=,
> parse=0x1703580, parent_root=,
> hasRecursion=, tuple_fraction=0) at planner.c:769
> #4  0x00668ea5 in standard_planner (parse=0x1703580,
> cursorOptions=256, boundParams=0x0) at planner.c:308
> #5  0x006691b6 in planner (parse=,
> cursorOptions=, boundParams=)
> at planner.c:178
> #6  0x006fb069 in pg_plan_query (querytree=0x1703580,
> cursorOptions=256, boundParams=0x0) at postgres.c:798
> (gdb) p debug_query_string
> $1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"
>
> which I think may be another manifestation of the failure-to-apply-proper-
> pathtarget issue we're looking at in this thread.  Or maybe it's just
> an unjustified assumption in make_partialgroup_input_target that the
> input path must always have some sortgrouprefs assigned.
>

I don't see this problem after your recent commit
- 89d53515e53ea080029894939118365b647489b3.  Are you still getting it?

>
> Before getting to that point, there was also an unexplainable plan change:
>

Yeah, I am also seeing such a plan change, but I think this is a separate
thing to investigate.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?

2016-06-13 Thread Noah Misch
On Fri, Jun 10, 2016 at 03:10:40AM -0400, Noah Misch wrote:
> On Tue, Jun 07, 2016 at 06:05:10PM -0400, Tom Lane wrote:
> > Jean-Pierre Pelletier  writes:
> > > I wanted to test if phraseto_tsquery(), new with 9.6 could be used for
> > > matching consecutive words but it won't work for us if it cannot handle
> > > consecutive *duplicate* words.
> > 
> > > For example, the following returns true:select
> > > phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue');
> > 
> > > Is this expected ?
> > 
> > I concur that that seems like a rather useless behavior.  If we have
> > "x <-> y" it is not possible to match at distance zero, while if we
> > have "x <-> x" it seems unlikely that the user is expecting us to
> > treat that identically to "x".  So phrase search simply should not
> > consider distance-zero matches.
> 
> [Action required within 72 hours.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 9.6 open item.  Teodor,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> 9.6 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within 72 hours of this
> message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping 9.6rc1.  Consequently, I will appreciate your
> efforts toward speedy resolution.  Thanks.
> 
> [1] 
> http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com

This PostgreSQL 9.6 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Peter Eisentraut

On 6/9/16 7:16 PM, Tatsuo Ishii wrote:

Something like SetClientEncoding(GetDatabaseEncoding()) is a little
bit ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6 I'm fine
with your proposed patch.


One way to make this nicer would be to put the encoding of a string into 
the StringInfoData structure.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Peter Eisentraut

On 6/10/16 2:08 PM, Peter Eisentraut wrote:

On 6/6/16 9:45 PM, Peter Eisentraut wrote:

Attached is a patch to illustrates how this could be fixed.  There might
be similar issues elsewhere.  The notification propagation in particular
could be affected.


Tracing the code, NotificationResponse messages are converted to the
client encoding during transmission from the worker to the leader and
then forwarded to the client verbatim, so this should appear to work at
the moment.  But it will break if we make a change like I suggested,
namely changing the client encoding in the worker process to be the
server encoding, because then nothing will transcode it before it
reaches the client anymore.  So this will need a well-considered
solution in concert with the error/notice issue.

Then again, it's not clear to me under what circumstances a parallel
worker could or should be sending a NotificationResponse.


Modulo that last point, here is a patch that shows how I think this 
could work, in combination with the patch I posted previously that sets 
the "client encoding" in the parallel worker to the server encoding.


This patch disassembles the NotificationResponse message with a 
temporary client encoding, and then sends it off to the real frontend 
using the real client encoding.


Doing it this way also takes care of a few special cases that 
NotifyMyFrontEnd() handles, such as a client with protocol version 2 
that doesn't expect a payload in the message.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ab5ef25..0bb93b4 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -810,7 +810,22 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 		case 'A':		/* NotifyResponse */
 			{
 				/* Propagate NotifyResponse. */
-				pq_putmessage(msg->data[0], &msg->data[1], msg->len - 1);
+   int save_client_encoding;
+   int32   pid;
+   const char *channel;
+   const char *payload;
+
+   save_client_encoding = pg_get_client_encoding();
+   SetClientEncoding(GetDatabaseEncoding());
+
+   pid = pq_getmsgint(msg, 4);
+   channel = pq_getmsgstring(msg);
+   payload = pq_getmsgstring(msg);
+   pq_endmessage(msg);
+
+   SetClientEncoding(save_client_encoding);
+
+   NotifyMyFrontEnd(channel, payload, pid);
break;
}
 
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index c39ac3a..716f1c3 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -390,9 +390,6 @@ static bool asyncQueueProcessPageEntries(volatile QueuePosition *current,
 										 char *page_buffer);
 static void asyncQueueAdvanceTail(void);
 static void ProcessIncomingNotify(void);
-static void NotifyMyFrontEnd(const char *channel,
-const char *payload,
-int32 srcPid);
 static bool AsyncExistsPendingNotify(const char *channel, const char *payload);
 static void ClearPendingActionsAndNotifies(void);
 
@@ -2076,7 +2073,7 @@ ProcessIncomingNotify(void)
 /*
  * Send NOTIFY message to my front end.
  */
-static void
+void
 NotifyMyFrontEnd(const char *channel, const char *payload, int32 srcPid)
 {
if (whereToSendOutput == DestRemote)
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index b4c13fa..95559df 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -28,6 +28,10 @@ extern volatile sig_atomic_t notifyInterruptPending;
 extern Size AsyncShmemSize(void);
 extern void AsyncShmemInit(void);
 
+extern void NotifyMyFrontEnd(const char *channel,
+const char *payload,
+int32 srcPid);
+
 /* notify-related SQL statements */
 extern void Async_Notify(const char *channel, const char *payload);
 extern void Async_Listen(const char *channel);



Re: [HACKERS] WIP: Data at rest encryption

2016-06-13 Thread Haribabu Kommi
On Sun, Jun 12, 2016 at 5:13 PM, Ants Aasma  wrote:
> On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi
>  wrote:
>
>> 2. Instead of depending on a contrib module for the encryption, how
>> about integrating the pgcrypto contrib into the core and adding that as a
>> default encryption method? And also provide an option for the user
>> to use different encryption methods if needed.
>
> Technically that would be simple enough, this is more of a policy
> decision. I think having builtin encryption provided by pgcrypto is
> completely fine. If a consensus emerges that it needs to be
> integrated, it would need to be a separate patch anyway.

In our proprietary database, we are using the encryption methods
provided by OpenSSL [1]. Maybe we can have a look at the methods
OpenSSL provides for use in encryption for builds under USE_SSL.
Ignore this if you have already validated them.


>> 5. Instead of providing the passphrase through an environment variable,
>> it would be better to provide some options to pg_ctl etc.
>
> That looks like it would be worse from a security perspective.
> Integrating a passphrase prompt would be an option, but a way for
> scripts to provide passphrases would still be needed.

What I felt was, if we store the passphrase in an environment variable,
a person who has access to the system can read it, and using that it
may be possible to decrypt the data files.
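For illustration, on Linux a process's environment (as captured at exec time) can be read back out of /proc by anyone running as the same user, or by root; the variable name below is made up for the example:

```shell
# Hypothetical passphrase variable, exported before starting the server.
export PG_DATA_ENCRYPTION_PASSPHRASE=hunter2

# A child process inherits the environment, and its /proc entry exposes it
# to any process of the same user (or root):
sh -c 'tr "\0" "\n" < /proc/self/environ' | grep '^PG_DATA_ENCRYPTION_PASSPHRASE='
```

This is why a prompt (or a dedicated key-retrieval hook) tends to leak less than an environment variable, though as noted above scripts still need some non-interactive path.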


[1] - https://www.openssl.org/docs/manmaster/crypto/EVP_EncryptInit.html


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Rename max_parallel_degree?

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:42 AM, Julien Rouhaud
 wrote:
> Agreed, and fixed in attached v3.

I don't entirely like the new logic in
RegisterDynamicBackgroundWorker.  I wonder if we can't drive this off
of a couple of counters, instead of having the process registering the
background worker iterate over every slot.  Suppose we add two
counters to BackgroundWorkerArray, parallel_register_count and
parallel_terminate_count.  Whenever a backend successfully registers a
parallel worker, it increments parallel_register_count.  Whenever the
postmaster marks a parallel worker slot as no longer in use, it
increments parallel_terminate_count.  Then, the number of active
parallel workers is just parallel_register_count -
parallel_terminate_count.  (We can't have the postmaster and the
backends share the same counter, because then it would need locking,
and the postmaster can't try to take spinlocks - can't even use
atomics, because those might be emulated using spinlocks.)

Of course, it would be nice if we could make these counters 64-bit
integers, but we can't, because we don't rely on 64-bit reads and
writes to be atomic on all platforms.  So instead they'll have to be
uint32.  That means they could wrap (if you really work at it) but
subtraction will still return the right answer, so it's OK.  If we
want to allow the number of parallel workers started to be available
for statistical purposes, we can keep to uint32 values for that
(parallel_register_count_lo and parallel_register_count_hi, for
example), and increment the second one whenever the first one rolls
over to zero.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] IsUnderPostmaster with shared_preload_libraries on Windows

2016-06-13 Thread Joe Conway
On 06/13/2016 01:57 PM, Tom Lane wrote:
> Joe Conway  writes:
>> Is it expected that IsUnderPostmaster is true during postmaster startup
>> in an extension's _PG_init() when preloading under Windows? On Linux it
>> is false at this point AFAICT.
> 
> AFAIK it will be false in the *postmaster's* execution of _PG_init().
> But keep in mind that on Windows each exec'd child process will have to
> load the shared_preload_libraries again for itself, and those executions
> should see IsUnderPostmaster = true.  I think what you are seeing is
> additional executions in the startup process or other background
> processes, which don't happen in a forked-children environment.
> 
> You can probably duplicate this behavior for testing purposes on Unix
> by compiling with EXEC_BACKEND defined, if that helps. 

Thanks, definitely easier to debug on Linux. It appears the best answer
is to use process_shared_preload_libraries_in_progress rather than
trying to derive that information using IsUnderPostmaster.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 6:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Again, please have a look at the patch and see if it seems unsuitable
>> to you for some reason.  I'm not confident of my ability to get all of
>> this path stuff right without a bit of help from the guy who designed
>> it.
>
> I'm kind of in a bind right now because Tomas has produced an
> FK-selectivity patch rewrite on schedule, and I already committed to
> review that before beta2, so I better go do that first.  If you can wait
> awhile I will try to help out more with parallel query.

I'm happy to have you look at his patch first.

> Having said that, the main thing I noticed in your draft patch is that
> you'd removed set_grouped_rel_consider_parallel() in favor of hard-wiring
> this in create_grouping_paths():
>
> +   if (input_rel->consider_parallel &&
> +   !has_parallel_hazard((Node *) target->exprs, false) &&
> +   !has_parallel_hazard((Node *) parse->havingQual, false))
> +   grouped_rel->consider_parallel = true;
>
> I think this is bad because it forces any external creators of
> UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody
> if anyone is out of sync on whether to set the flag.  So I'd rather keep
> set_grouped_rel_consider_parallel(), even if all it does is the above.
> And make it global not static.  Ditto for the other upperrels.

I'm slightly mystified by this because we really shouldn't be setting
that flag more than once.  We don't want to do that work repeatedly,
just once, and prior to adding any paths to the rel.  Are you
imagining that somebody would try to create grouped paths before we
finish scan/join planning?

> Also, I wonder whether we could reduce that test to just the
> has_parallel_hazard tests, so as to avoid the ordering dependency of
> needing to create the top scan/join rel (and set its consider_parallel
> flag) before we can create the UPPERREL_GROUP_AGG rel.  This would mean
> putting more dependency on per-path parallel_safe flags to detect whether
> you can't parallelize a given step for lack of parallel-safe input, but
> I'm not sure that that's a bad thing.

It doesn't sound like an especially good thing to me.  Skipping all
parallel path generation is quite a bit less work than trying each
path in turn and realizing that none of them will work, and there are
various places where we optimize on that basis.  I don't understand,
in any event, why it makes any sense to create the UPPERREL_GROUP_AGG
rel before we finish scan/join planning.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:51 PM, Robert Haas  wrote:
> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas  wrote:
>> Although I have done a bit of review of this patch, it needs more
>> thought than I have so far had time to give it.  I will update again
>> by Tuesday.
>
> I've reviewed this a bit further and have discovered an infelicity.

Also, independent of the fix for this particular issue, I think it
would be smart to apply the attached patch to promote the assertion
that failed here to an elog().  If we have more bugs of this sort, now
or in the future, I'd like to catch them even in non-assert-enabled
builds by getting a sensible error rather than just by failing an
assertion.  I think it's our general practice to check node types with
elog() rather than Assert() when the nodes are coming from some
far-distant part of the code, which is certainly the case here.

I plan to commit this without delay unless there are vigorous,
well-reasoned objections.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 7d2512c..f38da5d 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1112,8 +1112,10 @@ deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
 		/* Extract expression if TargetEntry node */
 		Assert(IsA(tle, TargetEntry));
 		var = (Var *) tle->expr;
+
 		/* We expect only Var nodes here */
-		Assert(IsA(var, Var));
+		if (!IsA(var, Var))
+			elog(ERROR, "non-Var not expected in target list");
 
 		if (i > 0)
 			appendStringInfoString(buf, ", ");



Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> Again, please have a look at the patch and see if it seems unsuitable
> to you for some reason.  I'm not confident of my ability to get all of
> this path stuff right without a bit of help from the guy who designed
> it.

I'm kind of in a bind right now because Tomas has produced an
FK-selectivity patch rewrite on schedule, and I already committed to
review that before beta2, so I better go do that first.  If you can wait
awhile I will try to help out more with parallel query.

Having said that, the main thing I noticed in your draft patch is that
you'd removed set_grouped_rel_consider_parallel() in favor of hard-wiring
this in create_grouping_paths():

+   if (input_rel->consider_parallel &&
+   !has_parallel_hazard((Node *) target->exprs, false) &&
+   !has_parallel_hazard((Node *) parse->havingQual, false))
+   grouped_rel->consider_parallel = true;

I think this is bad because it forces any external creators of
UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody
if anyone is out of sync on whether to set the flag.  So I'd rather keep
set_grouped_rel_consider_parallel(), even if all it does is the above.
And make it global not static.  Ditto for the other upperrels.

Also, I wonder whether we could reduce that test to just the
has_parallel_hazard tests, so as to avoid the ordering dependency of
needing to create the top scan/join rel (and set its consider_parallel
flag) before we can create the UPPERREL_GROUP_AGG rel.  This would mean
putting more dependency on per-path parallel_safe flags to detect whether
you can't parallelize a given step for lack of parallel-safe input, but
I'm not sure that that's a bad thing.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> One problem I've realized, related to this, is that the way
>> the consider_parallel flag is being set for upper rels is almost
>> totally bogus, which I believe accounts for your complaint at PGCon
>> that force_parallel_mode was not doing as much as it ought to do.
>
> Yeah, I just ran into a different reason to think that.  I tried marking
> MinMaxAggPaths in the same way as other relation-scan paths, ie
>
> pathnode->path.parallel_safe = rel->consider_parallel;
>
> but that did nothing because the just-created UPPERREL_GROUP_AGG rel
> didn't have its consider_parallel flag set yet.  Moreover, if I'd tried to
> fix that by invoking set_grouped_rel_consider_parallel() from planagg.c,
> it would still not work right because set_grouped_rel_consider_parallel()
> is hard-wired to consider only partial aggregation, which is not what's
> going on in a MinMaxAggPath.  I'm not sure about the best rewrite here,
> but I would urge you to make sure that consider_parallel on an upper rel
> reflects only properties inherent to the rel (eg, safeness of quals that
> must be applied there) and not properties of specific implementation
> methods.

Yeah, I think that David and I goofed this up when adding parallel
aggregation.  I just wasn't thinking clearly about the way upper rels
needed to work with consider_parallel.  If you would be willing to do
any sort of review of the WIP patch I posted just upthread, that would
help.

> Also, please make sure that whatever logic is involved remains
> factored out where it can be called by an extension that might want to
> create an upperrel sooner than the core code would do.

Again, please have a look at the patch and see if it seems unsuitable
to you for some reason.  I'm not confident of my ability to get all of
this path stuff right without a bit of help from the guy who designed
it.

> BTW, do we need wholePlanParallelSafe at all, and if so why?  Can't
> it be replaced by examining the parallel_safe flag on the selected
> topmost Path?

Oh, man.  I think you're right.  This is yet another piece of fallout
from upper planner pathification.  That was absolutely necessary
before, but now if we do the other bits right we don't need it any
more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116

2016-06-13 Thread Robert Haas
On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas  wrote:
> Although I have done a bit of review of this patch, it needs more
> thought than I have so far had time to give it.  I will update again
> by Tuesday.

I've reviewed this a bit further and have discovered an infelicity.
The following is all with the patch applied.

By itself, this join can be pushed down:

contrib_regression=# EXPLAIN SELECT 13 FROM ft1 RIGHT JOIN ft2 ON
ft1.c1 = ft2.c1;
  QUERY PLAN
--
 Foreign Scan  (cost=100.00..137.66 rows=822 width=4)
   Relations: (public.ft2) LEFT JOIN (public.ft1)
(2 rows)

That's great.  However, when that query is used as the outer rel of a
left join, it can't:

contrib_regression=# explain verbose select * from ft4 LEFT JOIN
(SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1) q on true;
 QUERY PLAN
-
 Nested Loop Left Join  (cost=353.50..920.77 rows=41100 width=19)
   Output: ft4.c1, ft4.c2, ft4.c3, (13)
   ->  Foreign Scan on public.ft4  (cost=100.00..102.50 rows=50 width=15)
 Output: ft4.c1, ft4.c2, ft4.c3
 Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
   ->  Materialize  (cost=253.50..306.57 rows=822 width=4)
 Output: (13)
 ->  Hash Left Join  (cost=253.50..302.46 rows=822 width=4)
   Output: 13
   Hash Cond: (ft2.c1 = ft1.c1)
   ->  Foreign Scan on public.ft2  (cost=100.00..137.66
rows=822 width=4)
 Output: ft2.c1
 Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
   ->  Hash  (cost=141.00..141.00 rows=1000 width=4)
 Output: ft1.c1
 ->  Foreign Scan on public.ft1
(cost=100.00..141.00 rows=1000 width=4)
   Output: ft1.c1
   Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
(18 rows)

Of course, because of the PlaceHolderVar, there's no way to push down
the ft1-ft2-ft4 join to the remote side.  But we could still push down
the ft1-ft2 join and then locally perform the join between the result
and ft4.  However, the proposed fix doesn't allow that, because
ph_eval_at is (b 4 5) and relids for the ft1-ft2 join is also (b 4 5),
and so the bms_is_subset(phinfo->ph_eval_at, relids) test returns
true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> Oh, one other thing about this: I'm not going to try to defend
> whatever confusion between subplans and subqueries appears in that
> comment, but prior to upper planner pathification I think we were dead
> in the water here anyway, because the subquery was going to output a
> finished plan, not a list of paths.  Since finished plans aren't
> labeled as to parallel-safety, we'd have to conservatively assume that
> the finished plan we got back might not be parallel-safe.  Now that
> we're using the path representation throughout, we can do better.

Well, if that were still the state of affairs, we could certainly consider
adding a parallel_safe flag to Plans as well as Paths.  But as you say,
it's moot now.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> One problem I've realized, related to this, is that the way
> the consider_parallel flag is being set for upper rels is almost
> totally bogus, which I believe accounts for your complaint at PGCon
> that force_parallel_mode was not doing as much as it ought to do.

Yeah, I just ran into a different reason to think that.  I tried marking
MinMaxAggPaths in the same way as other relation-scan paths, ie

pathnode->path.parallel_safe = rel->consider_parallel;

but that did nothing because the just-created UPPERREL_GROUP_AGG rel
didn't have its consider_parallel flag set yet.  Moreover, if I'd tried to
fix that by invoking set_grouped_rel_consider_parallel() from planagg.c,
it would still not work right because set_grouped_rel_consider_parallel()
is hard-wired to consider only partial aggregation, which is not what's
going on in a MinMaxAggPath.  I'm not sure about the best rewrite here,
but I would urge you to make sure that consider_parallel on an upper rel
reflects only properties inherent to the rel (eg, safeness of quals that
must be applied there) and not properties of specific implementation
methods.  Also, please make sure that whatever logic is involved remains
factored out where it can be called by an extension that might want to
create an upperrel sooner than the core code would do.

BTW, do we need wholePlanParallelSafe at all, and if so why?  Can't
it be replaced by examining the parallel_safe flag on the selected
topmost Path?

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:27 PM, Robert Haas  wrote:
> On Mon, Jun 13, 2016 at 4:46 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> In practice, we don't yet have the ability for
>>> parallel-safe paths from subqueries to affect planning at higher query
>>> levels, but that's because the pathification stuff landed too late in
>>> the cycle for me to fully react to it.
>>
>> I wonder if that's not just from confusion between subplans and
>> subqueries.  I don't believe any of the claims made in the comment
>> adjusted in the patch below (other than "Subplans currently aren't passed
>> to workers", which is true but irrelevant).  Nested Gather nodes is
>> something that you would need, and already have, a defense for anyway.
>
> I think you may be correct.

Oh, one other thing about this: I'm not going to try to defend
whatever confusion between subplans and subqueries appears in that
comment, but prior to upper planner pathification I think we were dead
in the water here anyway, because the subquery was going to output a
finished plan, not a list of paths.  Since finished plans aren't
labeled as to parallel-safety, we'd have to conservatively assume that
the finished plan we got back might not be parallel-safe.  Now that
we're using the path representation throughout, we can do better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 4:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> In practice, we don't yet have the ability for
>> parallel-safe paths from subqueries to affect planning at higher query
>> levels, but that's because the pathification stuff landed too late in
>> the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.  I don't believe any of the claims made in the comment
> adjusted in the patch below (other than "Subplans currently aren't passed
> to workers", which is true but irrelevant).  Nested Gather nodes is
> something that you would need, and already have, a defense for anyway.

I think you may be correct.

One problem I've realized, related to this, is that the way
the consider_parallel flag is being set for upper rels is almost
totally bogus, which I believe accounts for your complaint at PGCon
that force_parallel_mode was not doing as much as it ought to do.
When I originally wrote a lot of this logic, there were no upper rels,
which led to this code:

if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
{
glob->parallelModeNeeded = false;
glob->wholePlanParallelSafe = false;/* either
false or don't care */
}
else
{
glob->parallelModeNeeded = true;
glob->wholePlanParallelSafe =
!has_parallel_hazard((Node *) parse, false);
}

The main way that wholePlanParallelSafe is supposed to be cleared is
in create_plan:

/* Update parallel safety information if needed. */
if (!best_path->parallel_safe)
root->glob->wholePlanParallelSafe = false;

However, at the time that code was written, it was impossible to rely
entirely on the latter mechanism.  Since there were no upper rels and
no paths for upper plan nodes, you could have the case where every
path was parallel-safe but the whole plan was not.  Therefore the
code shown above was needed to scan the whole darned plan for
parallel-unsafe things.  Post-pathification, this whole thing is
pretty bogus: upper rels mostly don't get consider_parallel set at
all, which means plans involving upper rels get marked parallel-unsafe
even if they are not, which means the wholePlanParallelSafe flag gets
cleared in a bunch of cases where it shouldn't.  And on the flip side
I think that the first chunk of code quoted above would be totally
unnecessary if we were actually setting consider_parallel correctly on
the upper rels.

I started working on a patch to fix all this, which I'm attaching here
so you can see what I'm thinking.  I am not sure it's correct, but it
does cause force_parallel_mode to do something interesting in many
more cases.

Anyway, the reason this is related to the issue at hand is that you
might have something like A LEFT JOIN (SELECT x, sum(q) FROM B GROUP
BY 1) sub ON A.x = sub.x.  Right now, I think that the paths for the
aggregated subquery will always be marked as not parallel-safe, but
that's only because the consider_parallel flag on the grouped rel
isn't being set properly.  If it were, you could theoretically do a
parallel seq scan on A and then have each worker evaluate the join for
the rows that pop out of the subquery.  Maybe that's a stupid example,
but the point is that I bet there are cases where failing to mark the
upper rels with consider_parallel where appropriate causes good
parallel plans not to get generated.
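Written out as runnable SQL, the shape Robert describes looks something
like this (table and column names are illustrative, not from the patch):

```sql
-- Hypothetical schema for the example above.
CREATE TABLE a (x int, payload text);
CREATE TABLE b (x int, q int);

-- Today the aggregated subquery's paths come out marked parallel-unsafe
-- only because consider_parallel isn't set on the grouped rel; with that
-- fixed, a parallel seq scan on "a" could feed workers that each evaluate
-- the join against the subquery's output.
SELECT a.x, sub.total
FROM a
LEFT JOIN (SELECT x, sum(q) AS total FROM b GROUP BY 1) sub
  ON a.x = sub.x;
```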

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b1554cb..8232a64 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -107,9 +107,6 @@ static double get_number_of_groups(PlannerInfo *root,
 	 double path_rows,
 	 List *rollup_lists,
 	 List *rollup_groupclauses);
-static void set_grouped_rel_consider_parallel(PlannerInfo *root,
-	 RelOptInfo *grouped_rel,
-	 PathTarget *target);
 static Size estimate_hashagg_tablesize(Path *path, AggClauseCosts *agg_costs,
 	 double dNumGroups);
 static RelOptInfo *create_grouping_paths(PlannerInfo *root,
@@ -272,8 +269,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 	else
 	{
 		glob->parallelModeNeeded = true;
-		glob->wholePlanParallelSafe =
-			!has_parallel_hazard((Node *) parse, false);
+		glob->wholePlanParallelSafe = true;
 	}
 
 	/* Determine what fraction of the plan is likely to be scanned */
@@ -1878,6 +1874,17 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	 */
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
 
+	/*
+	 * We can just inherit the current_rel's consider_parallel flag; nothing
+	 * that happens here can switch us from being a parallel-safe path to
+	 * being parallel-restricted.  (If 

Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread br...@momjian.us
On Mon, Jun 13, 2016 at 04:29:26PM -0400, David G. Johnston wrote:
> On Mon, Jun 13, 2016 at 3:40 PM, br...@momjian.us  wrote:
> I am not sure how we can improve things, but I wanted to clarify exactly
> what is happening.
> 
> 
> ​"""
> Comparisons on non-uniformly-distributed
> columns and specification of non-existent values affect the average
> plan cost, and hence if and when a generic plan is chosen
> ​"""
> 
> If we are going to be more precise, let's do so here as well.  I have, just
> reading this, no clue whether having non-uniformity and often searching for
> a non-existent value will increase or decrease the average plan cost.

Well, we can't be more precise here, as the average plan cost could go up
or down depending on the constants used, because the values are not
uniformly distributed.

> I'm still not certain how this is particularly useful.  If we are willing to
> draw a conclusion here, in what circumstances would I, as an end-user, want
> to forgo using a prepared statement and instead dynamically construct an SQL
> statement?  Because, at this point, while this seems like good detail, often
> my choice of parameters is influenced by what I consider data external to
> the query proper and not by any kind of inherent performance aspect.  I'd
> consider this advanced usage, which doesn't fit neatly into the SQL Command
> section of the docs.

True, but we have lumped all the "prepared" information into that
section, and I don't see a more logical location for this tidbit of
information.

The big point is that the constants don't affect the generic plan itself;
they only determine if and when the generic plan is chosen.
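That behavior can be observed directly; a sketch (hypothetical skewed
table; the switch to the generic plan is considered starting with the
sixth execution):

```sql
-- Half the rows share the value 1; the rest are nearly unique.
CREATE TABLE t (x int);
INSERT INTO t
SELECT CASE WHEN i % 2 = 0 THEN 1 ELSE i END
FROM generate_series(1, 100000) AS i;
CREATE INDEX ON t (x);
ANALYZE t;

PREPARE p(int) AS SELECT count(*) FROM t WHERE x = $1;

-- The first five executions use custom plans built for the supplied
-- constant; for the common value 1 that tends to be a seq scan.
EXPLAIN EXECUTE p(1);  -- repeat five times

-- From the sixth execution on, the generic plan is used if its average
-- estimated cost beats the average custom-plan cost, even though, for
-- this particular constant, it may actually be the slower plan.
EXPLAIN EXECUTE p(1);
```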

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-13 Thread Tomas Vondra

Hi,

Attached is a reworked patch, mostly following the new design proposal 
from this thread.


I'm not entirely happy with the code, but it's the best I've been able 
to come up with so far, and I won't be able to significantly improve it 
soon due to upcoming travel. Inevitably there will be issues in the code, 
and if there's a chance of fixing them I'll do my best to do so in the 
evenings at the hotel.


The filtering and matching to eclasses / join quals is triggered from 
planmain.c - I believe this is the right place and that those pieces are 
reasonably solid.


The estimation still happens in costsize.c, of course, but was modified 
to use the pre-matched info. I have my doubts about this part, and I'm 
sure Tom had something less complex / more efficient in mind (using the 
pre-matched info).



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


fk-estimates-reworked.patch
Description: binary/octet-stream



Re: [HACKERS] IsUnderPostmaster with shared_preload_libraries on Windows

2016-06-13 Thread Tom Lane
Joe Conway  writes:
> Is it expected that IsUnderPostmaster is true during postmaster startup
> in an extension's _PG_init() when preloading under Windows? On Linux it
> is false at this point AFAICT.

AFAIK it will be false in the *postmaster's* execution of _PG_init().
But keep in mind that on Windows each exec'd child process will have to
load the shared_preload_libraries again for itself, and those executions
should see IsUnderPostmaster = true.  I think what you are seeing is
additional executions in the startup process or other background
processes, which don't happen in a forked-children environment.

You can probably duplicate this behavior for testing purposes on Unix
by compiling with EXEC_BACKEND defined, if that helps. 

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> In practice, we don't yet have the ability for
> parallel-safe paths from subqueries to affect planning at higher query
> levels, but that's because the pathification stuff landed too late in
> the cycle for me to fully react to it.

I wonder if that's not just from confusion between subplans and
subqueries.  I don't believe any of the claims made in the comment
adjusted in the patch below (other than "Subplans currently aren't passed
to workers", which is true but irrelevant).  Nested Gather nodes is
something that you would need, and already have, a defense for anyway.

regards, tom lane


diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cc8ba61..8ab049c 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_rel_consider_parallel(PlannerInfo *r
*** 560,574 
  		case RTE_SUBQUERY:
  
  			/*
! 			 * Subplans currently aren't passed to workers.  Even if they
! 			 * were, the subplan might be using parallelism internally, and we
! 			 * can't support nested Gather nodes at present.  Finally, we
! 			 * don't have a good way of knowing whether the subplan involves
! 			 * any parallel-restricted operations.  It would be nice to relax
! 			 * this restriction some day, but it's going to take a fair amount
! 			 * of work.
  			 */
! 			return;
  
  		case RTE_JOIN:
  			/* Shouldn't happen; we're only considering baserels here. */
--- 560,574 
  		case RTE_SUBQUERY:
  
  			/*
! 			 * If the subquery doesn't have anything parallel-restricted, we
! 			 * can consider parallel scans.  Note that this does not mean that
! 			 * all (or even any) of the paths produced for the subquery will
! 			 * actually be parallel-safe; but that's true for paths produced
! 			 * for regular tables, too.
  			 */
! 			if (has_parallel_hazard((Node *) rte->subquery, false))
! return;
! 			break;
  
  		case RTE_JOIN:
  			/* Shouldn't happen; we're only considering baserels here. */



[HACKERS] IsUnderPostmaster with shared_preload_libraries on Windows

2016-06-13 Thread Joe Conway
Is it expected that IsUnderPostmaster is true during postmaster startup
in an extension's _PG_init() when preloading under Windows? On Linux it
is false at this point AFAICT.

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread David G. Johnston
On Mon, Jun 13, 2016 at 3:40 PM, br...@momjian.us  wrote:

>
> > > Looking at how the code behaves, it seems custom plans that are _more_
> > > expensive (plus planning cost) than the generic plan switch to the
> > > generic plan after five executions, as now documented.  Custom plans
> > > that are significantly _cheaper_ than the generic plan _never_ use the
> > > generic plan.
> >
> > Yes, that's what the suggested documentation improvement says as well,
> > right?
>
> Yes.  What is odd is that it isn't the plan of the actual supplied
> parameters that is cheaper, just the generic plan that assumes each
> distinct value in the query is equally likely to be used.  So, when we
> say the generic plan is cheaper, it is just comparing the custom plan
> with the supplied parameters vs. the generic plan --- it is not saying
> that running the supplied constants with the generic plan will execute
> faster, because in fact we might be using a sub-optimal generic plan.
>
> For example, given my test table that I posted earlier, if you ran the
> most common constant (50% common) the first five times, the custom plan
> would use a sequential scan.  On the sixth run of that same constant, a
> bitmap scan generic plan would be used.  Now, that does have a lower
> cost, but only for the _average_ distinct value, not for the 50%
> constant that is being used.  A bitmap scan on a constant that would
> normally use a sequential scan will take longer than the sequential
> scan would have, because if it didn't, the custom plan would have chosen
> the bitmap scan.
>
> I am not sure how we can improve things, but I wanted to clarify exactly
> what is happening.
>

​"""
Comparisons on non-uniformly-distributed
columns and specification of non-existent values affect the average
plan cost, and hence if and when a generic plan is chosen
​"""

If we are going to be more precise, let's do so here as well.  I have, just
reading this, no clue whether having non-uniformity and often searching for
a non-existent value will increase or decrease the average plan cost.

I'm still not certain how this is particularly useful.  If we are willing
to draw a conclusion here, in what circumstances would I, as an end-user,
want to forgo using a prepared statement and instead dynamically construct
an SQL statement?  Because, at this point, while this seems like good
detail, often my choice of parameters is influenced by what I consider data
external to the query proper and not by any kind of inherent performance
aspect.  I'd consider this advanced usage, which doesn't fit neatly into
the SQL Command section of the docs.

David J.


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 3:42 PM, Tom Lane  wrote:
>> I would not be surprised at a change to a parallel-query plan, but there's
>> no parallelism here, so what happened?  This looks like a bug to me.
>> (Also, doing this query without COSTS OFF shows that the newly selected
>> plan actually has a greater estimated cost than the expected plan, which
>> makes it definitely a bug.)
>
> I looked into this and found that the costs are considered fuzzily the
> same, and then add_path prefers the slightly-worse path on the grounds
> that it is marked parallel_safe while the MinMaxAgg path is not.  It seems
> to me that there is some fuzzy thinking going on there.  On exactly what
> grounds is a path to be preferred merely because it is parallel safe, and
> not actually parallelized?

A parallel-safe plan can be joined to a partial path at a higher level
of the plan tree to form a new partial path.  A non-parallel-safe path
cannot.  Therefore, if two paths are equally good, the one which is
parallel-safe is more useful (just as a sorted path which is slightly
more expensive than an unsorted path is worth keeping around because
it is ordered).  In practice, we don't yet have the ability for
parallel-safe paths from subqueries to affect planning at higher query
levels, but that's because the pathification stuff landed too late in
the cycle for me to fully react to it.

> Or perhaps the question to ask is whether a
> MinMaxAgg path can be marked parallel-safe.

That is a good question.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-13 Thread Tom Lane
Simon Riggs  writes:
> On 13 June 2016 at 19:16, Tom Lane  wrote:
>> Another point here is that I'm now unconvinced that restricting the logic
>> to consider only multi-column fkeys is really what we want.  It looks to
>> me like the code can also improve estimates in the case where there are
>> multiple single-column FKs linking to the same target relation.  That
>> might not be too common for two-table queries, but I bet it happens a
>> lot in three-or-more-table queries.

> Is it realistic that we consider that at this point? Certainly not for
> myself, at least.

It's pretty much built into the redesign I proposed, or so I thought.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
I wrote:
> ... there was also an unexplainable plan change:

> *** /home/postgres/pgsql/src/test/regress/expected/aggregates.out Thu Apr 
>  7 21:13:14 2016
> --- /home/postgres/pgsql/src/test/regress/results/aggregates.out  Mon Jun 
> 13 11:54:01 2016
> ***
> *** 577,590 
  
>   explain (costs off)
> select max(unique1) from tenk1 where unique1 > 42000;
> ! QUERY PLAN 
> ! ---
> !  Result
> !InitPlan 1 (returns $0)
> !  ->  Limit
> !->  Index Only Scan Backward using tenk1_unique1 on tenk1
> !  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
> ! (5 rows)
  
>   select max(unique1) from tenk1 where unique1 > 42000;
>max 
> --- 577,588 
  
>   explain (costs off)
> select max(unique1) from tenk1 where unique1 > 42000;
> !  QUERY PLAN 
> ! 
> !  Aggregate
> !->  Index Only Scan using tenk1_unique1 on tenk1
> !  Index Cond: (unique1 > 42000)
> ! (3 rows)
  
>   select max(unique1) from tenk1 where unique1 > 42000;
>max 

> I would not be surprised at a change to a parallel-query plan, but there's
> no parallelism here, so what happened?  This looks like a bug to me.
> (Also, doing this query without COSTS OFF shows that the newly selected
> plan actually has a greater estimated cost than the expected plan, which
> makes it definitely a bug.)

I looked into this and found that the costs are considered fuzzily the
same, and then add_path prefers the slightly-worse path on the grounds
that it is marked parallel_safe while the MinMaxAgg path is not.  It seems
to me that there is some fuzzy thinking going on there.  On exactly what
grounds is a path to be preferred merely because it is parallel safe, and
not actually parallelized?  Or perhaps the question to ask is whether a
MinMaxAgg path can be marked parallel-safe.

regards, tom lane




Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread 'br...@momjian.us'
On Mon, Jun 13, 2016 at 01:26:04PM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> > Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
> > protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE.
> > However, I don't see any way to inject EXPLAIN into the libpq/wire
> > prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
> > EXPLAIN SELECT throws a syntax error.)
> 
> I am not sure what you mean:
> EXPLAIN PREPARE to get EXPLAIN for PREPARE, or PREPARE ... FOR EXPLAIN
> to get an EXPLAIN statement with parameters.
> What should EXPLAIN PREPARE show that EXPLAIN SELECT wouldn't?
> Why the need for EXPLAIN statements with parameters?

Well, you can't use EXPLAIN with SQL PREPARE:

test=> EXPLAIN PREPARE SELECT * FROM pg_class;
ERROR:  syntax error at or near "PREPARE"
LINE 1: EXPLAIN PREPARE SELECT * FROM pg_class;
^
test=> PREPARE EXPLAIN SELECT * FROM pg_class;
ERROR:  syntax error at or near "SELECT"
LINE 1: PREPARE EXPLAIN SELECT * FROM pg_class;
^
You can only do EXPLAIN EXECUTE ..., which works fine, e.g.:

EXPLAIN EXECUTE prep_c1(0);

However, for the wire protocol prepare/execute, how do you do EXPLAIN?
The only way I can see doing it is to put the EXPLAIN in the prepare
query, but I wasn't sure that works.  So, I just wrote and tested the
attached C program and it properly output the explain information, e.g.

res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, 
NULL);
---
generated:

QUERY PLAN

Seq Scan on pg_language  (cost=0.00..1.04 rows=4 width=114)

so that works --- good.

> > Looking at how the code behaves, it seems custom plans that are _more_
> > expensive (plus planning cost) than the generic plan switch to the
> > generic plan after five executions, as now documented.  Custom plans
> > that are significantly _cheaper_ than the generic plan _never_ use the
> > generic plan.
> 
> Yes, that's what the suggested documentation improvement says as well,
> right?

Yes.  What is odd is that it isn't the plan of the actual supplied
parameters that is cheaper, just the generic plan that assumes each
distinct value in the query is equally likely to be used.  So, when we
say the generic plan is cheaper, it is just comparing the custom plan
with the supplied parameters vs. the generic plan --- it is not saying
that running the supplied constants with the generic plan will execute
faster, because in fact we might be using a sub-optimal generic plan.

For example, given my test table that I posted earlier, if you ran the
most common constant (50% common) the first five times, the custom plan
would use a sequential scan.  On the sixth run of that same constant, a
bitmap scan generic plan would be used.  Now, that does have a lower
cost, but only for the _average_ distinct value, not for the 50%
constant that is being used.  A bitmap scan on a constant that would
normally use a sequential scan will take longer than the sequential
scan would have, because if it didn't, the custom plan would have chosen
the bitmap scan.

I am not sure how we can improve things, but I wanted to clarify exactly
what is happening.

> > Updated patch attached.
> 
> Upon re-read, one tiny question:
> 
> !Prepared statements can optionally use generic plans rather than
> !re-planning with each set of supplied EXECUTE values.
> 
> Maybe the "optionally" should be omitted, since the user has no choice.
> 
> It is true that there is a cursor option CURSOR_OPT_CUSTOM_PLAN, but that
> cannot be used on the SQL level.

Right.  Updated patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
/*
 * Test program: prepare an EXPLAIN statement via libpq and print the
 * resulting plan.
 */

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

static void printResultSet(PGresult *res);


static void
exit_nicely(PGconn *conn)
{
PQfinish(conn);
exit(1);
}


int
main(int argc, char **argv)
{
	const char *conninfo;
	PGconn	   *conn;
	PGresult   *res;

	if (argc > 2)
	{
		fprintf(stderr, "Usage:  %s connection-string\n", argv[0]);
		exit(1);
	}
	else if (argc == 2)
		conninfo = argv[1];
	else
		conninfo = "dbname = postgres";

	/* Make a connection to the database */
	conn = PQconnectdb(conninfo);

	/* Check to see that the backend connection was successfully made */
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "Connection to database failed: %s",
				PQerrorMessage(conn));
		exit_nicely(conn);
	}

	res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, NULL);
	if 

Re: [HACKERS] 10.0

2016-06-13 Thread Merlin Moncure
On Tue, May 17, 2016 at 12:45 AM, Craig Ringer  wrote:
> On 14 May 2016 at 02:49, Tom Lane  wrote:
>>
>> * This year's major release will be 9.6.0, with minor updates 9.6.1,
>> 9.6.2, etc.  It's too late to do otherwise for this release cycle.
>>
>> * Next year's major release will be 10.0, with minor updates 10.1,
>> 10.2, etc.
>>
>> * The year after, 11.0.  Etc cetera.
>>
>
> Yes. Please!
>
> I get tired of explaining to people that PostgreSQL "9.x" isn't a thing,
> that yes, 9.3 and 9.4 really _do_ have incompatible data directories and
> replication protocols, and that when the docs say "major version" they don't
> mean "major version as you might actually expect" but "first two version
> number parts".
>
> Lets get rid of this user-baffling wart.

Agreed.  What's been nagging me is what the impact on users could be.
I just stumbled across some code that *could* have been broken, but by
sheer luck it's safe:

  /* only postgres 9.3+ supports row count from 'COPY' */
  IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3
  THEN
GET DIAGNOSTICS _RowCount = ROW_COUNT;
  ELSE
_RowCount := (LineCount(_ScratchFileName,
  (SELECT WorkFolder FROM CDSControl)))::BIGINT - 1;
  END IF;

LineCount() is a pl/sh wrapper around 'wc -l' that backs this wrapper
around COPY so that it always gives a count of rows loaded.  I guess this
is a pretty hacky way of doing version checks inside of SQL, but I
suppose changing the structure of the version number might break
similar approaches to parsing the string.  This regex would in fact
continue to work properly, but it did raise some alarm bells.
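For comparison, a version check that sidesteps parsing version() entirely
-- this leans on the server_version_num setting, which already exists and,
being a plain integer, would be unaffected by dropping the middle digit:

```sql
-- server_version_num is e.g. 90603 for 9.6.3; under the proposed scheme
-- 10.1 would presumably be 100001, so numeric comparisons keep working.
IF current_setting('server_version_num')::int >= 90300
THEN
  -- postgres 9.3+ supports row count from 'COPY'
  GET DIAGNOSTICS _RowCount = ROW_COUNT;
END IF;
```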

Looking ahead, IMO we could:

A) make a variant of version() that returns major/minor/bugfix as
separate fields with minor being set to 0 for all released versions
10.0 and beyond.  We could then add a NOTE to the version function and
other places suggesting to use that instead for 9.6.

B) Preserve x.y.z as returned by version() and show server_version for
those usages only, with 'y' being always 0 for 10.0 and beyond.  For
all other purposes (marketing/documentation/etc) that structure would
*not* be preserved, and we would put notes in the documentation
describing why the extra digit is there.

C) Do neither A nor B, and let our users discover such issues on their own.

merlin




Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-13 Thread Simon Riggs
On 13 June 2016 at 19:16, Tom Lane  wrote:

> Simon Riggs  writes:
> > So a simple change is to make RelationGetFKeyList() only retrieve FKs
> with
> > nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
> > the scope for increased planning time.
>
> FWIW, I don't particularly agree with that.  It makes the relcache's fkey
> storage extremely specific to this one use-case, a decision I expect we'd
> regret later.


Hmm, clearly I thought that earlier as well; that earlier thinking may be
influencing you. My commits had the concept of generic FK info and then a
specific optimization, so the main part of the planning problem in 9.6 was
caused by stored info that would never be used.

What changes my mind here is 1) point in dev cycle, 2) the point that the
list of FKs doesn't take into account whether the constraints are
deferrable, deferred or immediate and whether they are valid/invalid. ISTM
likely that we would care about those things in the future if we believe
that info is generic.

But then each new usage of the info will have the same planning time
problem to consider if they choose to extend the amount of info they hold.

Rejecting an optimization in 9.6 because it might be undone by later
changes is surely a problem for those later changes to resolve.


> And the planner needs to filter the fkey list anyway,
> because it only wants fkeys that link to tables that are also in the
> current query.  Thus, my recommendation was that we should allow
> RelationGetFKeyList to return a pointer directly to the cached info list
> and require the planner to immediately copy (only) the entries that it
> needs for the current query.
>
> Another point here is that I'm now unconvinced that restricting the logic
> to consider only multi-column fkeys is really what we want.  It looks to
> me like the code can also improve estimates in the case where there are
> multiple single-column FKs linking to the same target relation.  That
> might not be too common for two-table queries, but I bet it happens a
> lot in three-or-more-table queries.
>

Is it realistic that we consider that at this point? Certainly not for
myself, at least.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-13 Thread Tom Lane
Simon Riggs  writes:
> So a simple change is to make RelationGetFKeyList() only retrieve FKs with
> nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
> the scope for increased planning time.

FWIW, I don't particularly agree with that.  It makes the relcache's fkey
storage extremely specific to this one use-case, a decision I expect we'd
regret later.  And the planner needs to filter the fkey list anyway,
because it only wants fkeys that link to tables that are also in the
current query.  Thus, my recommendation was that we should allow
RelationGetFKeyList to return a pointer directly to the cached info list
and require the planner to immediately copy (only) the entries that it
needs for the current query.

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want.  It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation.  That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.
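A sketch of that multiple-single-column-FK shape (hypothetical schema):

```sql
-- Two independent single-column FKs pointing at the same target table.
CREATE TABLE target (
    id1 int PRIMARY KEY,
    id2 int UNIQUE
);
CREATE TABLE fact (
    ref1 int REFERENCES target (id1),
    ref2 int REFERENCES target (id2)
);

-- Multiplying the per-column selectivities of these two quals
-- underestimates the join size; FK-aware estimation could do better.
SELECT *
FROM fact f
JOIN target t ON f.ref1 = t.id1 AND f.ref2 = t.id2;
```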

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 1:06 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
>>> BTW, decent regression tests could be written without the need to create
>>> enormous tables if the minimum rel size in create_plain_partial_paths()
>>> could be configured to something less than 1000 blocks.  I think it's
>>> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
>>> we make it a GUC?
>
>> That was proposed before, and I didn't do it mostly because I couldn't
>> think of a name for it that didn't sound unbelievably corny.
>
> min_parallel_relation_size, or min_parallelizable_relation_size, or
> something like that?

Sure.

>> Also,
>> the whole way that algorithm works is kind of a hack and probably
>> needs to be overhauled entirely in some future release.  I'm worried
>> about having the words "backward compatibility" thrown in my face when
>> it's time to improve this logic.  But aside from those two issues I'm
>> OK with exposing a knob.
>
> I agree it's a hack, and I don't want to expose anything about the
> number-of-workers scaling behavior, for precisely that reason.  But a
> threshold on the size of a table to consider parallel scans for at all
> doesn't seem unreasonable.

OK.
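If such a GUC were added (the name below is the one floated above, and
hypothetical at this point), a regression test could force parallel paths
on a small table without building a 1000-block one:

```sql
-- Hypothetical: proposed GUC, not in any release at the time of writing.
SET min_parallel_relation_size = 0;
SET force_parallel_mode = on;
EXPLAIN (COSTS OFF) SELECT count(*) FROM small_table;
RESET min_parallel_relation_size;
RESET force_parallel_mode;
```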

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
> >> In create_grouping_paths(), we are building partial_grouping_path and
> >> same is used for gather path and other grouping paths (for partial paths).
> >> However, we don't use it for partial path list and sort path due to
> >> which path target for Sort path is different from what we have expected.
> >> Is there a problem in applying partial_grouping_path for partial pathlist?
> >> Attached patch just does that and I don't see error with patch.
>
> > It doesn't seem like a good idea to destructively modify
> > input_rel->partial_pathlist.  Applying the projection to each path
> > before using it would probably be better.
>
> I think the real question here is why the code removed by 04ae11f62
> was wrong.  It was unsafe to use apply_projection_to_path, certainly,
> but using create_projection_path directly would have avoided the
> stated problem.  And it's very unclear that this new patch doesn't
> bring back that bug in a different place.
>

This new patch still doesn't seem to be right, but it won't bring back the
original problem, because apply_projection_to_path will only be done if
grouped_rel is parallel_safe, which means it doesn't have any
parallel-unsafe or parallel-restricted clauses in its quals or target list.

> I am not very happy that neither 04ae11f62 nor this patch include
> any regression test case proving that a problem existed and has
> been fixed.
>

It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 7:17 PM, Robert Haas  wrote:
>
> On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
> > In create_grouping_paths(), we are building partial_grouping_path and
> > the same is used for the gather path and other grouping paths (for
> > partial paths).  However, we don't use it for the partial path list
> > and sort path, due to which the path target for the Sort path is
> > different from what we expected.  Is there a problem in applying
> > partial_grouping_path for the partial pathlist?  Attached patch just
> > does that and I don't see the error with the patch.
>
> It doesn't seem like a good idea to destructively modify
> input_rel->partial_pathlist.  Applying the projection to each path
> before using it would probably be better.
>

Do you mean to have it when we generate a complete GroupAgg path atop
the cheapest partial path?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-13 Thread Simon Riggs
On 4 June 2016 at 20:44, Tom Lane  wrote:

> This is a branch of the discussion in
>
> https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
> but I'm starting a new thread as the original title is getting
> increasingly off-topic.
>
> I complained in that thread that the FK join selectivity patch had a
> very brute-force approach to matching join qual clauses to FK
> constraints, requiring a total of seven nested levels of looping to
> get anything done, and expensively rediscovering the same facts over
> and over.  Here is a sketch of what I think is a better way:
>

Thanks for your review and design notes here, which look like good
improvements.

Tomas has been discussing that with me and others, but I just realised
that might not be apparent on-list, so just to mention: there is activity
on this, and new code will be published very soon.


On the above mentioned thread, Tomas' analysis was this...
https://www.postgresql.org/message-id/8344835e-18af-9d40-aed7-bd2261be9162%402ndquadrant.com
> There are probably a few reasonably simple things we could do - e.g.
> ignore foreign keys with just a single column, as the primary goal of
> the patch is improving estimates with multi-column foreign keys. I
> believe that covers a vast majority of foreign keys in the wild.

I agree with that comment. The relcache code retrieves all FKs, even ones
that have a single column. Yet the planner code never uses them unless
nKeys>1. That was masked somewhat by my two commits, treating the info as
generic and then using only a very specific subset of it.

So a simple change is to make RelationGetFKeyList() only retrieve FKs with
nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
the scope for increased planning time.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] proposal: integration bloat tables (indexes) to core

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 3:23 AM, Pavel Stehule  wrote:
>> There are a lot of useful queries (views) on our wiki. Some queries
>> are necessary for maintenance, and I am thinking these queries should
>> be an integrated part of Postgres.

> It's likely to be hard to get agreement on which things to include.
> But if we can, it might be worth doing.  It would be nice to do it as
> an extension.

The problem with an extension is: when we make a core change that breaks
one of these views, which we will, how can you pg_upgrade a database
with the extension installed?  There's no provision for upgrading an
extension concurrently with the core upgrade.  Maybe there should be,
but I'm unclear how we could make that work.

At the same time, I'm pretty suspicious of putting stuff like this in
core, because the expectations for cross-version compatibility go up
by orders of magnitude as soon as we do that.

regards, tom lane




Re: [HACKERS] Reviewing freeze map code

2016-06-13 Thread Andres Freund


On June 13, 2016 11:02:42 AM CDT, Robert Haas  wrote:

>(Official status update: I'm hoping that senior hackers will carefully
>review these patches for defects.  If they do not, I plan to commit
>the patches anyway neither less than 48 nor more than 60 hours from
>now after re-reviewing them myself.)

I'm traveling today and tomorrow, but will look after that.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
>> BTW, decent regression tests could be written without the need to create
>> enormous tables if the minimum rel size in create_plain_partial_paths()
>> could be configured to something less than 1000 blocks.  I think it's
>> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
>> we make it a GUC?

> That was proposed before, and I didn't do it mostly because I couldn't
> think of a name for it that didn't sound unbelievably corny.

min_parallel_relation_size, or min_parallelizable_relation_size, or
something like that?

> Also,
> the whole way that algorithm works is kind of a hack and probably
> needs to be overhauled entirely in some future release.  I'm worried
> about having the words "backward compatibility" thrown in my face when
> it's time to improve this logic.  But aside from those two issues I'm
> OK with exposing a knob.

I agree it's a hack, and I don't want to expose anything about the
number-of-workers scaling behavior, for precisely that reason.  But a
threshold on the size of a table to consider parallel scans for at all
doesn't seem unreasonable.

regards, tom lane




Re: [HACKERS] proposal: integration bloat tables (indexes) to core

2016-06-13 Thread Pavel Stehule
2016-06-13 18:52 GMT+02:00 Robert Haas :

> On Mon, Jun 13, 2016 at 3:23 AM, Pavel Stehule  wrote:
> > There are a lot of useful queries (views) on our wiki. Some queries
> > are necessary for maintenance, and I am thinking these queries should
> > be an integrated part of Postgres.
> >
> > Mainly queries for detecting table bloat and index bloat, but some
> > queries over pg_locks should be useful too.
> >
> > Notes, comments?
>
> It's likely to be hard to get agreement on which things to include.
> But if we can, it might be worth doing.  It would be nice to do it as
> an extension.
>

Maybe the estimated pgstattuple functions?

But bloat detection is too important, so I would prefer to see a solution
in core.
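As a rough, hedged sketch of the kind of maintenance query under
discussion, a per-table bloat overview can already be approximated with
the existing pgstattuple extension (this is only an illustration, not a
proposal for the exact core view):

```sql
-- Hedged sketch only: pgstattuple scans each table, so this is
-- expensive on large databases.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT c.relname,
       (pgstattuple(c.oid)).dead_tuple_percent AS dead_pct,
       (pgstattuple(c.oid)).free_percent       AS free_pct
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.relnamespace = 'public'::regnamespace
ORDER BY dead_pct DESC;
```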

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] proposal: integration bloat tables (indexes) to core

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 3:23 AM, Pavel Stehule  wrote:
> There are a lot of useful queries (views) on our wiki. Some queries
> are necessary for maintenance, and I am thinking these queries should
> be an integrated part of Postgres.
>
> Mainly queries for detecting table bloat and index bloat, but some
> queries over pg_locks should be useful too.
>
> Notes, comments?

It's likely to be hard to get agreement on which things to include.
But if we can, it might be worth doing.  It would be nice to do it as
an extension.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> It is slightly tricky to write a reproducible parallel-query test, but
>> point taken and I think we should try to have a test unless such a test is
>> really time consuming.
>
> BTW, decent regression tests could be written without the need to create
> enormous tables if the minimum rel size in create_plain_partial_paths()
> could be configured to something less than 1000 blocks.  I think it's
> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> we make it a GUC?

That was proposed before, and I didn't do it mostly because I couldn't
think of a name for it that didn't sound unbelievably corny.  Also,
the whole way that algorithm works is kind of a hack and probably
needs to be overhauled entirely in some future release.  I'm worried
about having the words "backward compatibility" thrown in my face when
it's time to improve this logic.  But aside from those two issues I'm
OK with exposing a knob.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Robert Haas
On Sun, Jun 12, 2016 at 3:05 AM, Noah Misch  wrote:
> On Thu, Jun 09, 2016 at 12:04:59PM -0400, Peter Eisentraut wrote:
>> On 6/6/16 9:45 PM, Peter Eisentraut wrote:
>> >There appears to be a problem with how client encoding is handled in the
>> >communication from parallel workers.
>>
>> I have added this to the open items for now.
>
> [Action required within 72 hours.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> 9.6 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within 72 hours of this
> message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping 9.6rc1.  Consequently, I will appreciate your
> efforts toward speedy resolution.  Thanks.

There is no realistic way that I am going to have this fixed before
beta2.  There are currently 10 open items listed on the open items
page, of which 8 relate to my commits and 5 to parallel query in
particular.  I am not going to get 8 issues fixed in the next 4 days,
or even the next 6 days if you assume I can work through the weekend
on this (and that it would be desirable that I be slinging fixes into
the tree just before the wrap, which seems doubtful).  Furthermore, of
those issues, I judge this to be the least important (except for the
documentation update, but that's pending further input from Peter
Geoghegan about exactly what he thinks needs to be changed).

Therefore, my plan is to revisit this in two weeks once beta2 is out
the door, unless someone else would like to volunteer to fix it
sooner.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Bug in to_timestamp().

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
> amul sul  writes:
>> It looks like a bug in the to_timestamp() function when the format
>> string has more whitespace than the input string; see below:
>
> No, I think this is a case of "input doesn't match the format string".
>
> As a rule of thumb, using to_timestamp() for input that could be parsed
> just fine by the standard timestamp input function is not a particularly
> good idea.  to_timestamp() is meant to deal with input that is in a
> well-defined format that happens to not be parsable by timestamp_in.
> This example doesn't meet either of those preconditions.

I think a space in the format string should skip a whitespace
character in the input string, but not a non-whitespace character.
It's my understanding that these functions exist in no small part for
compatibility with Oracle, and Oracle declines to skip the digit '1'
on the basis of an extra space in the format string, which IMHO is the
behavior any reasonable user would expect.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Bug in to_timestamp().

2016-06-13 Thread Tom Lane
amul sul  writes:
> It looks like a bug in the to_timestamp() function when the format
> string has more whitespace than the input string; see below:

No, I think this is a case of "input doesn't match the format string".

As a rule of thumb, using to_timestamp() for input that could be parsed
just fine by the standard timestamp input function is not a particularly
good idea.  to_timestamp() is meant to deal with input that is in a
well-defined format that happens to not be parsable by timestamp_in.
This example doesn't meet either of those preconditions.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
I wrote:
> Amit Kapila  writes:
>> It is slightly tricky to write a reproducible parallel-query test, but
>> point taken and I think we should try to have a test unless such a test is
>> really time consuming.

> BTW, decent regression tests could be written without the need to create
> enormous tables if the minimum rel size in create_plain_partial_paths()
> could be configured to something less than 1000 blocks.  I think it's
> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> we make it a GUC?

Just as an experiment to see what would happen, I did

-   int parallel_threshold = 1000;
+   int parallel_threshold = 1;

and ran the regression tests.  I got a core dump in the window.sql test:

Program terminated with signal 11, Segmentation fault.
#0  0x00664dbc in make_partialgroup_input_target (root=0x1795018, 
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, 
rollup_groupclauses=0x0) at planner.c:4307
4307                    Index           sgref = final_target->sortgrouprefs[i];
(gdb) bt
#0  0x00664dbc in make_partialgroup_input_target (root=0x1795018, 
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, 
rollup_groupclauses=0x0) at planner.c:4307
#1  create_grouping_paths (root=0x1795018, input_rel=0x17957a8, 
target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
at planner.c:3420
#2  0x00667405 in grouping_planner (root=0x1795018, 
inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
#3  0x00668c80 in subquery_planner (glob=, 
parse=0x1703580, parent_root=, 
hasRecursion=, tuple_fraction=0) at planner.c:769
#4  0x00668ea5 in standard_planner (parse=0x1703580, 
cursorOptions=256, boundParams=0x0) at planner.c:308
#5  0x006691b6 in planner (parse=, 
cursorOptions=, boundParams=)
at planner.c:178
#6  0x006fb069 in pg_plan_query (querytree=0x1703580, 
cursorOptions=256, boundParams=0x0) at postgres.c:798
(gdb) p debug_query_string
$1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"

which I think may be another manifestation of the failure-to-apply-proper-
pathtarget issue we're looking at in this thread.  Or maybe it's just
an unjustified assumption in make_partialgroup_input_target that the
input path must always have some sortgrouprefs assigned.

Before getting to that point, there was also an unexplainable plan change:

*** /home/postgres/pgsql/src/test/regress/expected/aggregates.out  Thu Apr  7 21:13:14 2016
--- /home/postgres/pgsql/src/test/regress/results/aggregates.out   Mon Jun 13 11:54:01 2016
***
*** 577,590 
  
  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN 
! ---
!  Result
!InitPlan 1 (returns $0)
!  ->  Limit
!->  Index Only Scan Backward using tenk1_unique1 on tenk1
!  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
  
  select max(unique1) from tenk1 where unique1 > 42000;
   max 
--- 577,588 
  
  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
!  QUERY PLAN 
! 
!  Aggregate
!->  Index Only Scan using tenk1_unique1 on tenk1
!  Index Cond: (unique1 > 42000)
! (3 rows)
  
  select max(unique1) from tenk1 where unique1 > 42000;
   max 

I would not be surprised at a change to a parallel-query plan, but there's
no parallelism here, so what happened?  This looks like a bug to me.
(Also, doing this query without COSTS OFF shows that the newly selected
plan actually has a greater estimated cost than the expected plan, which
makes it definitely a bug.)

At this point I'm pretty firmly convinced that we should have a way to
run the regression tests with parallel scans considered for even very
small tables.  If someone doesn't want that way to be a GUC, you'd better
propose another solution.
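If the threshold were exposed as a GUC, a regression test could force
parallel paths to be considered on tiny tables along these lines (a
sketch only: min_parallel_relation_size is merely the name floated in
this thread, not a committed setting):

```sql
-- Hypothetical GUC name from this thread, shown for illustration.
SET min_parallel_relation_size = 0;   -- consider even 1-block tables
SET max_parallel_degree = 2;

EXPLAIN (COSTS OFF) SELECT count(*) FROM int4_tbl;

RESET min_parallel_relation_size;
RESET max_parallel_degree;
```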

regards, tom lane




Re: [HACKERS] Reviewing freeze map code

2016-06-13 Thread Robert Haas
On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas  wrote:
>> How about changing the return tuple of heap_prepare_freeze_tuple to
>> a bitmap?  Two flags: "Freeze [not] done" and "[No] more freezing
>> needed"
>
> Yes, I think something like that sounds about right.

Here's a patch.  I took the approach of adding a separate bool out
parameter instead.  I am also attaching an update of the
check-visibility patch which responds to assorted review comments and
adjusting it for the problems found on Friday which could otherwise
lead to false positives.  I'm still getting occasional TIDs from the
pg_check_visible() function during pgbench runs, though, so evidently
not all is well with the world.

(Official status update: I'm hoping that senior hackers will carefully
review these patches for defects.  If they do not, I plan to commit
the patches anyway neither less than 48 nor more than 60 hours from
now after re-reviewing them myself.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


fix-freeze-map-v1.patch
Description: binary/octet-stream


check-visibility-v4.patch
Description: binary/octet-stream



[HACKERS] Bug in to_timestamp().

2016-06-13 Thread amul sul
Hi,

It looks like a bug in the to_timestamp() function when the format string
has more whitespace than the input string; see below:

Ex.1: Two white spaces before HH24, whereas one before the input time string

postgres=# SELECT TO_TIMESTAMP('2016-06-13 15:43:36', '/MM/DD  HH24:MI:SS');
      to_timestamp
------------------------
 2016-06-13 05:43:36-07   <- incorrect time
(1 row)


Ex.2: One whitespace before the  format string

postgres=# SELECT TO_TIMESTAMP('2016/06/13 15:43:36', ' /MM/DD HH24:MI:SS');
         to_timestamp
------------------------------
 0016-06-13 15:43:36-07:52:58  <- incorrect year
(1 row)



If there are one or more consecutive whitespace characters in the format,
we should skip them as long as we can still get the actual field.
Thoughts?

Thanks & Regards,
Amul Sul




Re: [HACKERS] WIP: Data at rest encryption

2016-06-13 Thread Peter Eisentraut

On 6/12/16 3:13 AM, Ants Aasma wrote:

>> 5. Instead of providing passphrase through environmental variable,
>> better to provide some options to pg_ctl etc.
>
> That looks like it would be worse from a security perspective.
> Integrating a passphrase prompt would be an option, but a way for
> scripts to provide passphrases would still be needed.


Environment variables and command-line options are visible to other 
processes on the machine, so neither of these approaches is really going 
to work.  We would need some kind of integration with secure 
password-entry mechanisms, such as pinentry.


Also note that all tools that work directly on the data directory would 
need password-entry and encryption/decryption support, including 
pg_basebackup, pg_controldata, pg_ctl, pg_receivexlog, pg_resetxlog, 
pg_rewind, pg_upgrade, pg_xlogdump.


It seems that your implementation doesn't encrypt pg_control, thus 
avoiding some of that.  But that doesn't seem right.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: Data at rest encryption

2016-06-13 Thread Peter Eisentraut

On 6/7/16 9:56 AM, Ants Aasma wrote:

> Similar things can be achieved with filesystem level encryption.
> However this is not always optimal for various reasons. One of the
> better reasons is the desire for HSM based encryption in a storage
> area network based setup.


Could you explain this in more detail?

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Amit Kapila  writes:
> It is slightly tricky to write a reproducible parallel-query test, but
> point taken and I think we should try to have a test unless such a test is
> really time consuming.

BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks.  I think it's
fairly crazy that that arbitrary constant is hard-wired anyway.  Should
we make it a GUC?

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane  wrote:
>> I think the real question here is why the code removed by 04ae11f62
>> was wrong.  It was unsafe to use apply_projection_to_path, certainly,
>> but using create_projection_path directly would have avoided the
>> stated problem.  And it's very unclear that this new patch doesn't
>> bring back that bug in a different place.

> This new patch still doesn't seem to be right, but it won't bring back the
> original problem because apply_projection_to_path will be only done if
> grouped_rel is parallel_safe which means it doesn't have any
> parallel-unsafe or parallel-restricted clause in quals or target list.

The problem cited in 04ae11f62's commit message is that
apply_projection_to_path would overwrite the paths' pathtargets in-place,
causing problems if they'd been used for other purposes elsewhere.  I do
not share your confidence that using apply_projection_to_path within
create_grouping_paths is free of such a hazard.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
>> In create_grouping_paths(), we are building partial_grouping_path and
>> the same is used for the gather path and other grouping paths (for
>> partial paths).  However, we don't use it for the partial path list
>> and sort path, due to which the path target for the Sort path is
>> different from what we expected.  Is there a problem in applying
>> partial_grouping_path for the partial pathlist?  Attached patch just
>> does that and I don't see the error with the patch.

> It doesn't seem like a good idea to destructively modify
> input_rel->partial_pathlist.  Applying the projection to each path
> before using it would probably be better.

I think the real question here is why the code removed by 04ae11f62
was wrong.  It was unsafe to use apply_projection_to_path, certainly,
but using create_projection_path directly would have avoided the
stated problem.  And it's very unclear that this new patch doesn't
bring back that bug in a different place.

I am not very happy that neither 04ae11f62 nor this patch include
any regression test case proving that a problem existed and has
been fixed.

regards, tom lane




Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
> In create_grouping_paths(), we are building partial_grouping_path and
> the same is used for the gather path and other grouping paths (for
> partial paths).  However, we don't use it for the partial path list
> and sort path, due to which the path target for the Sort path is
> different from what we expected.  Is there a problem in applying
> partial_grouping_path for the partial pathlist?  Attached patch just
> does that and I don't see the error with the patch.

It doesn't seem like a good idea to destructively modify
input_rel->partial_pathlist.  Applying the projection to each path
before using it would probably be better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread Albe Laurenz
Bruce Momjian wrote:
> Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
> protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE.
> However, I don't see any way to inject EXPLAIN into the libpq/wire
> prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
> EXPLAIN SELECT throws a syntax error.)

I am not sure what you mean:
EXPLAIN PREPARE to get EXPLAIN for PREPARE, or PREPARE ... FOR EXPLAIN
to get an EXPLAIN statement with parameters.
What should EXPLAIN PREPARE show that EXPLAIN SELECT wouldn't?
Why the need for EXPLAIN statements with parameters?

> Looking at how the code behaves, it seems custom plans that are _more_
> expensive (plus planning cost) than the generic plan switch to the
> generic plan after five executions, as now documented.  Custom plans
> that are significantly _cheaper_ than the generic plan _never_ use the
> generic plan.

Yes, that's what the suggested documentation improvement says as well,
right?
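As an illustration of that documented behavior (a sketch against a stock
database; the plan shapes you see may differ):

```sql
PREPARE q(int) AS SELECT count(*) FROM pg_class WHERE relpages > $1;

-- The first five executions are planned with the actual parameter value.
EXECUTE q(0);
EXECUTE q(0);
EXECUTE q(0);
EXECUTE q(0);
EXECUTE q(0);

-- From the sixth execution on, the planner may switch to a generic
-- plan if it is not more expensive than the average custom plan; a
-- generic plan shows "$1" instead of the literal in EXPLAIN output.
EXPLAIN EXECUTE q(0);
```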

> Updated patch attached.

Upon re-read, one tiny question:

!Prepared statements can optionally use generic plans rather than
!re-planning with each set of supplied EXECUTE values.

Maybe the "optionally" should be omitted, since the user has no choice.

It is true that there is a cursor option CURSOR_OPT_CUSTOM_PLAN, but that
cannot be used on the SQL level.

Yours,
Laurenz Albe



Re: [HACKERS] WIP: Data at rest encryption

2016-06-13 Thread Ants Aasma
On Mon, Jun 13, 2016 at 5:17 AM, Michael Paquier
 wrote:
> On Sun, Jun 12, 2016 at 4:13 PM, Ants Aasma  wrote:
>>> I feel separate file is better to include the key data instead of pg_control
>>> file.
>>
>> I guess that would be more flexible. However I think at least the fact
>> that the database is encrypted should remain in the control file to
>> provide useful error messages for faulty backup procedures.
>
> Another possibility could be always to do some encryption at data-type
> level for text data. For example I recalled the following thing while
> going through this thread:
> https://github.com/nec-postgres/tdeforpg
> Though I don't quite understand the use for encrypt.enable in this
> code... This has the advantage to not patch upstream.

While certainly possible, this does not cover the requirement I want
to satisfy: that user data never gets stored on disk unencrypted,
without making changes to the application or schema.
about separating administrator roles, specifically that centralised
storage and backup administrators should not have access to database
contents. I see this as orthogonal to per column encryption, which in
my opinion is better done in the application.

Regards,
Ants Aasma




Re: [HACKERS] Why we don't have checksums on clog files

2016-06-13 Thread Robert Haas
On Sun, Jun 12, 2016 at 10:54 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jun 7, 2016 at 10:41 AM, Amit Kapila  wrote:
>>> I think it will be better if users can have an option to checksum clog
>>> pages.  However, I think that will need a change in page (CLOG-page) format
>>> which might not be a trivial work to accomplish.
>
>> Doesn't the pd_checksum field exist on SLRU pages also?
>
> Uh, no.
>
> (I don't think that's an inherent property of slru.c, but clog.c
> definitely packs the pages completely full of xact status bits.  See
> CLOG_XACTS_PER_PAGE.)

Oh.  Well, that makes it harder, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-13 Thread Julien Rouhaud
On 13/06/2016 03:16, Robert Haas wrote:
> On Sat, Jun 11, 2016 at 6:24 PM, Julien Rouhaud
>  wrote:
>> On 11/06/2016 23:37, Julien Rouhaud wrote:
>>> On 09/06/2016 16:04, Robert Haas wrote:
 There remains only the task of adding max_parallel_degree
 as a system-wide limit

>>>
>>> PFA a patch to fix this open item.  I used the GUC name provided in the
>>> 9.6 open item page (max_parallel_workers), with a default value of 4.
>>> Value 0 is another way to disable parallel query.
>>>
>>
>> Sorry, I just realized I made a stupid mistake: I didn't check all the
>> slots to get the number of active parallel workers. Fixed in attached v2.
> 
> I think instead of adding a "bool parallel" argument to
> RegisterDynamicBackgroundWorker, we should just define a new constant
> for bgw_flags, say BGW_IS_PARALLEL_WORKER.
> 

Agreed, and fixed in attached v3.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0e5a1e..c661c7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2018,6 +2018,23 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for third-party dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index ab5ef25..b429474 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index cc8ba61..2bcd86b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo 
*rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, 
max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. 
*/
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index e7f63f4..fd62126 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = 
DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+intmax_parallel_workers = 4;
 intmax_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 54c0440..81d124c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, 
ParamListInfo boundParams)
IsUnderPostmaster && dynamic_shared_memory_type != 
DSM_IMPL_NONE &&
parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
parse->utilityStmt == NULL && max_parallel_workers_per_gather > 
0 &&
-   !IsParallelWorker() && !IsolationIsSerializable() &&
-   !has_parallel_hazard((Node *) parse, true);
+   max_parallel_workers > 0 && !IsParallelWorker() &&
+   !IsolationIsSerializable() && !has_parallel_hazard((Node *) 
parse,
+   true);
 
/*
 * glob->parallelModeNeeded should tell us whether it's necessary to
diff --git a/src/backend/postmaster/bgworker.c 
b/src/backend/postmaster/bgworker.c
index 382edad..3749d72 100644
--- 

Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tatsuro Yamada

Hi,

I applied your patch and ran TPC-H.
Then I got new errors on Q4, 12, 17, and 19:

ERROR:  Aggref found in non-Agg plan node
See below,

--
postgres=# \i queries/4.explain.sql
ERROR:  Aggref found in non-Agg plan node
STATEMENT:  explain analyze select
o_orderpriority,
count(*) as order_count
from
orders
where
o_orderdate >= date '1993-10-01'
and o_orderdate < date '1993-10-01' + interval '3' month
and exists (
select
*
from
lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority
LIMIT 1;
--

Regards,
Tatsuro Yamada
NTT OSS Center


On 2016/06/13 16:18, Amit Kapila wrote:

On Mon, Jun 13, 2016 at 11:05 AM, David Rowley wrote:
 >
 > On 13 June 2016 at 15:39, Thomas Munro wrote:
 > > What is going on here?
 >
 > ...
 >
 > >
 > > postgres=# set max_parallel_workers_per_gather = 2;
 > > SET
 > > postgres=# explain select length(data) from logs group by length(data);
 > > ERROR:  ORDER/GROUP BY expression not found in targetlist
 >
 > Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
 >

In create_grouping_paths(), we build partial_grouping_path, and the same is
used for the gather path and the other grouping paths (for partial paths).
However, we don't use it for the partial path list and the sort path, due to
which the path target for the Sort path is different from what we expected.
Is there a problem with applying partial_grouping_path to the partial
pathlist?  The attached patch just does that, and I don't see the error with
the patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tatsuro Yamada

On 2016/06/13 15:52, Michael Paquier wrote:

On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
 wrote:

I made a mistake in the e-mail I wrote to -hackers last week. :-<
I should have written this:

   The bug is not fixed by Tom Lane's patch (commit aeb9ae6),
   because I got the same error using TPC-H.


This looks like a different regression...


I understand it now, thanks. :-)



I checked the list, but the bug is not listed.
   https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items


And the winner is:

04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas 
Date:   Fri Jun 3 14:27:33 2016 -0400

I am adding that to the list of open items.


Oh...
I'll try running TPC-H again when I get a new patch that fixes the bug. :)


Thanks,
Tatsuro Yamada
NTT OSS Center







[HACKERS] proposal: integration bloat tables (indexes) to core

2016-06-13 Thread Pavel Stehule
Hi

There are a lot of useful queries (views) on our wiki. Some of these
queries are necessary for maintenance, and I am thinking they should be an
integrated part of Postgres.

Mainly queries for detecting table bloat and index bloat, but some queries
over pg_locks would be useful too.

Notes, comments?

Regards

Pavel


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 11:05 AM, David Rowley 
wrote:
>
> On 13 June 2016 at 15:39, Thomas Munro 
wrote:
> > What is going on here?
>
> ...
>
> >
> > postgres=# set max_parallel_workers_per_gather = 2;
> > SET
> > postgres=# explain select length(data) from logs group by length(data);
> > ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
>

In create_grouping_paths(), we build partial_grouping_path, and the same
is used for the gather path and the other grouping paths (for partial
paths).  However, we don't use it for the partial path list and the sort
path, due to which the path target for the Sort path is different from
what we expected.  Is there a problem with applying partial_grouping_path
to the partial pathlist?  The attached patch just does that, and I don't
see the error with the patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


apply_partial_pathtarget_partial_pathlist_v1.patch
Description: Binary data



Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Michael Paquier
On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
 wrote:
> I made a mistake in the e-mail I wrote to -hackers last week. :-<
> I should have written this:
>
>   The bug is not fixed by Tom Lane's patch (commit aeb9ae6),
>   because I got the same error using TPC-H.

This looks like a different regression...

>>> Today, I tried it again by changing the max_parallel_workers_per_gather
>>> parameter.
>>> The result of Q1 is below. Is this bug in the Open items on the wiki?
>>
>> I don't see it on the Open Issues list.
>
> I checked the list, but the bug is not listed.
>   https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items

And the winner is:

04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas 
Date:   Fri Jun 3 14:27:33 2016 -0400

I am adding that to the list of open items.
-- 
Michael

