Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> On Fri, Aug 25, 2017 at 5:12 PM, Tom Lane  wrote:
>> Here's a reviewed version of the second patch.  I fixed one bug
>> (wrong explain group nesting) and made some additional cosmetic
>> improvements beyond the struct relocation.

> The capitalization of TupleSortInstrumentation is inconsistent with
> TuplesortSpaceType, TuplesortMethod, and Tuplesortstate; I recommend
> we lower-case the S.

Sure, do it that way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 5:12 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane  wrote:
>>> Hmm, I'm not sure why SortInstrumentation belongs naturally to
>>> tuplesort.h but putting an array of them there would be a "gross
>>> abstraction violation".  Perhaps it would help to rename
>>> struct SharedSortInfo to SortInstrumentationArray, and change its
>>> field names to be less specific to the parallel-worker use case?
>
>> What other use case could there be?  I think an array of
>> SortInstrumentation objects intended to be stored in DSM is fairly
>> clearly a bit of executor-specific machinery and thus properly
>> declared along with the node that contains it.
>
> I'm not really convinced, but it's not worth arguing further.
>
> Here's a reviewed version of the second patch.  I fixed one bug
> (wrong explain group nesting) and made some additional cosmetic
> improvements beyond the struct relocation.

The capitalization of TupleSortInstrumentation is inconsistent with
TuplesortSpaceType, TuplesortMethod, and Tuplesortstate; I recommend
we lower-case the S.

- * Ordinary plan nodes won't do anything here, but parallel-aware plan
- * nodes may need to initialize shared state in the DSM before parallel
- * workers are available.  They can allocate the space they previously
+ * Most plan nodes won't do anything here, but plan nodes that allocated
+ * DSM may need to initialize shared state in the DSM before parallel
+ * workers are launched.  They can allocate the space they previously

On reading this again, it seems to me that the first instance of
"allocated" in this comment block needs to be changed to "estimated",
because otherwise saying they can allocated it later on is
unintelligible.

Other than that, LGTM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane  wrote:
>> Hmm, I'm not sure why SortInstrumentation belongs naturally to
>> tuplesort.h but putting an array of them there would be a "gross
>> abstraction violation".  Perhaps it would help to rename
>> struct SharedSortInfo to SortInstrumentationArray, and change its
>> field names to be less specific to the parallel-worker use case?

> What other use case could there be?  I think an array of
> SortInstrumentation objects intended to be stored in DSM is fairly
> clearly a bit of executor-specific machinery and thus properly
> declared along with the node that contains it.

I'm not really convinced, but it's not worth arguing further.

Here's a reviewed version of the second patch.  I fixed one bug
(wrong explain group nesting) and made some additional cosmetic
improvements beyond the struct relocation.

regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 953e74d..385b325 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2292,15 +2292,21 @@ show_tablesample(TableSampleClause *tsc, PlanState *planstate,
 static void
 show_sort_info(SortState *sortstate, ExplainState *es)
 {
-	if (es->analyze && sortstate->sort_Done &&
-		sortstate->tuplesortstate != NULL)
+	if (!es->analyze)
+		return;
+
+	if (sortstate->sort_Done && sortstate->tuplesortstate != NULL)
 	{
 		Tuplesortstate *state = (Tuplesortstate *) sortstate->tuplesortstate;
+		TupleSortInstrumentation stats;
 		const char *sortMethod;
 		const char *spaceType;
 		long		spaceUsed;
 
-		tuplesort_get_stats(state, &sortMethod, &spaceType, &spaceUsed);
+		tuplesort_get_stats(state, &stats);
+		sortMethod = tuplesort_method_name(stats.sortMethod);
+		spaceType = tuplesort_space_type_name(stats.spaceType);
+		spaceUsed = stats.spaceUsed;
 
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 		{
@@ -2315,6 +2321,51 @@ show_sort_info(SortState *sortstate, ExplainState *es)
 			ExplainPropertyText("Sort Space Type", spaceType, es);
 		}
 	}
+
+	if (sortstate->shared_info != NULL)
+	{
+		int			n;
+		bool		opened_group = false;
+
+		for (n = 0; n < sortstate->shared_info->num_workers; n++)
+		{
+			TupleSortInstrumentation *sinstrument;
+			const char *sortMethod;
+			const char *spaceType;
+			long		spaceUsed;
+
+			sinstrument = &sortstate->shared_info->sinstrument[n];
+			if (sinstrument->sortMethod == SORT_TYPE_STILL_IN_PROGRESS)
+continue;		/* ignore any unfilled slots */
+			sortMethod = tuplesort_method_name(sinstrument->sortMethod);
+			spaceType = tuplesort_space_type_name(sinstrument->spaceType);
+			spaceUsed = sinstrument->spaceUsed;
+
+			if (es->format == EXPLAIN_FORMAT_TEXT)
+			{
+appendStringInfoSpaces(es->str, es->indent * 2);
+appendStringInfo(es->str,
+ "Worker %d:  Sort Method: %s  %s: %ldkB\n",
+ n, sortMethod, spaceType, spaceUsed);
+			}
+			else
+			{
+if (!opened_group)
+{
+	ExplainOpenGroup("Workers", "Workers", false, es);
+	opened_group = true;
+}
+ExplainOpenGroup("Worker", NULL, true, es);
+ExplainPropertyInteger("Worker Number", n, es);
+ExplainPropertyText("Sort Method", sortMethod, es);
+ExplainPropertyLong("Sort Space Used", spaceUsed, es);
+ExplainPropertyText("Sort Space Type", spaceType, es);
+ExplainCloseGroup("Worker", NULL, true, es);
+			}
+		}
+		if (opened_group)
+			ExplainCloseGroup("Workers", "Workers", false, es);
+	}
 }
 
 /*
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ad9eba6..01316ff 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -28,9 +28,10 @@
 #include "executor/nodeBitmapHeapscan.h"
 #include "executor/nodeCustom.h"
 #include "executor/nodeForeignscan.h"
-#include "executor/nodeSeqscan.h"
 #include "executor/nodeIndexscan.h"
 #include "executor/nodeIndexonlyscan.h"
+#include "executor/nodeSeqscan.h"
+#include "executor/nodeSort.h"
 #include "executor/tqueue.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/planmain.h"
@@ -202,10 +203,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
 }
 
 /*
- * Ordinary plan nodes won't do anything here, but parallel-aware plan nodes
- * may need some state which is shared across all parallel workers.  Before
- * we size the DSM, give them a chance to call shm_toc_estimate_chunk or
- * shm_toc_estimate_keys on &pcxt->estimator.
+ * Parallel-aware plan nodes (and occasionally others) may need some state
+ * which is shared across all parallel workers.  Before we size the DSM, give
+ * them a chance to call shm_toc_estimate_chunk or shm_toc_estimate_keys on
+ * &pcxt->estimator.
  *
  * While we're at it, count the number of PlanState nodes in the tree, so
  * we know how many SharedPlanStateInstrumentation structures we need.
@@ -219,38 +220,43 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
 	/* Count this node. 

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane  wrote:
>> I think moving SharedSortInfo into tuplesort.h would be a gross
>> abstraction violation, but moving SortInstrumentation into tuplesort.h
>> seems like a modest improvement.
>
> Hmm, I'm not sure why SortInstrumentation belongs naturally to
> tuplesort.h but putting an array of them there would be a "gross
> abstraction violation".  Perhaps it would help to rename
> struct SharedSortInfo to SortInstrumentationArray, and change its
> field names to be less specific to the parallel-worker use case?

What other use case could there be?  I think an array of
SortInstrumentation objects intended to be stored in DSM is fairly
clearly a bit of executor-specific machinery and thus properly
declared along with the node that contains it.  Only nodeSort.c and
explain.c need to know anything about it; tuplesort.c is totally
ignorant of it and I see no reason why we'd ever want to change that.
Indeed, I don't quite see how we could do so without some kind of
abstraction violation.

SortInstrumentation is a diffferent kettle of fish.  I chose to make
that an executor-specific bit as well, but an advantage of your
proposed approach is that future additions to (or changes in) what
gets returned by tuplesort_get_stats() wouldn't require as much
tweaking elsewhere.  With your proposed change, nodeSort.c wouldn't
really care about the contents of that structure (as long as there are
no embedded pointers), so we could add new members or rename things
and only tuplesort.c and explain.c would need updating.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> On Fri, Aug 25, 2017 at 2:01 PM, Tom Lane  wrote:
>> I looked through this a little, and feel uncomfortable with the division
>> of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
>> struct SortInstrumentation, and maybe also SharedSortInfo, into
>> tuplesort.h.

> I think moving SharedSortInfo into tuplesort.h would be a gross
> abstraction violation, but moving SortInstrumentation into tuplesort.h
> seems like a modest improvement.

Hmm, I'm not sure why SortInstrumentation belongs naturally to
tuplesort.h but putting an array of them there would be a "gross
abstraction violation".  Perhaps it would help to rename
struct SharedSortInfo to SortInstrumentationArray, and change its
field names to be less specific to the parallel-worker use case?

>> (BTW, would it make sense to number the workers from 1 not 0 in the
>> EXPLAIN printout?)

> ... So I'm in favor of leaving it alone; I don't think that 0-based
> indexing is such an obscure convention that it will flummox users.

OK, I'm not particularly set on that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 2:01 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On another note, here's a second patch applying on top of my earlier
>> patch to push down Limit through Gather and Gather Merge.
>
> I looked through this a little, and feel uncomfortable with the division
> of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
> struct SortInstrumentation, and maybe also SharedSortInfo, into
> tuplesort.h.  Then tuplesort_get_stats could be given the signature
>
> void tuplesort_get_stats(Tuplesortstate *state,
>  SortInstrumentation *stats);
>
> which seems less messy.  Thoughts?

I think moving SharedSortInfo into tuplesort.h would be a gross
abstraction violation, but moving SortInstrumentation into tuplesort.h
seems like a modest improvement.

> I'm also pretty suspicious of the filter code you added to subselect.sql.
> What happens when there's more than one worker?

Uh, then somebody's changed the definition of force_parallel_mode in a
really strange way.

> (BTW, would it make sense to number the workers from 1 not 0 in the
> EXPLAIN printout?)

The main problem I see with that is that it might confuse somebody who
is debugging.  If you've got a backend and print out
ParallelWorkerNumber, you might expect that to match what you see in
the EXPLAIN output.  It also seems likely to me that other parts of
EXPLAIN or other parts of the system generally will eventually expose
parallel worker numbers in mildly user-visible ways (e.g. maybe it'll
show up in a DEBUG message someplace eventually), and if we insist on
sticking a + 1 into EXPLAIN's output in the existing places we'll have
to do it everywhere else, and somebody's eventually going to forget.
So I'm in favor of leaving it alone; I don't think that 0-based
indexing is such an obscure convention that it will flummox users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> On another note, here's a second patch applying on top of my earlier
> patch to push down Limit through Gather and Gather Merge.

I looked through this a little, and feel uncomfortable with the division
of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
struct SortInstrumentation, and maybe also SharedSortInfo, into
tuplesort.h.  Then tuplesort_get_stats could be given the signature

void tuplesort_get_stats(Tuplesortstate *state,
 SortInstrumentation *stats);

which seems less messy.  Thoughts?

I'm also pretty suspicious of the filter code you added to subselect.sql.
What happens when there's more than one worker?

(BTW, would it make sense to number the workers from 1 not 0 in the
EXPLAIN printout?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 1:31 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Awesome.  What's the business with enable_hashagg in the regression test 
>> stuff?
>
> Just relocating that "reset" so that it only affects the test case it's
> needed for.  (I think the 0-worker test was inserted in the wrong place.)

Oh, hmm.  Well, no objection to that, I guess.  Thanks for working on this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> Awesome.  What's the business with enable_hashagg in the regression test 
> stuff?

Just relocating that "reset" so that it only affects the test case it's
needed for.  (I think the 0-worker test was inserted in the wrong place.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 1:18 PM, Tom Lane  wrote:
> v4, with a test case and some more comment-polishing.  I think this
> is committable.

Awesome.  What's the business with enable_hashagg in the regression test stuff?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
v4, with a test case and some more comment-polishing.  I think this
is committable.

regards, tom lane

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ce47f1d..ad9eba6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -47,17 +47,26 @@
  * greater than any 32-bit integer here so that values < 2^32 can be used
  * by individual parallel nodes to store their own state.
  */
-#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE001)
-#define PARALLEL_KEY_PARAMSUINT64CONST(0xE002)
-#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE003)
-#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE004)
-#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE005)
-#define PARALLEL_KEY_DSAUINT64CONST(0xE006)
-#define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE007)
+#define PARALLEL_KEY_EXECUTOR_FIXED		UINT64CONST(0xE001)
+#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE002)
+#define PARALLEL_KEY_PARAMSUINT64CONST(0xE003)
+#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE004)
+#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE005)
+#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE006)
+#define PARALLEL_KEY_DSAUINT64CONST(0xE007)
+#define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE008)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
 /*
+ * Fixed-size random stuff that we need to pass to parallel workers.
+ */
+typedef struct FixedParallelExecutorState
+{
+	int64		tuples_needed;	/* tuple bound, see ExecSetTupleBound */
+} FixedParallelExecutorState;
+
+/*
  * DSM structure for accumulating per-PlanState instrumentation.
  *
  * instrument_options: Same meaning here as in instrument.c.
@@ -381,12 +390,14 @@ ExecParallelReinitialize(ParallelExecutorInfo *pei)
  * execution and return results to the main backend.
  */
 ParallelExecutorInfo *
-ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
+ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers,
+	 int64 tuples_needed)
 {
 	ParallelExecutorInfo *pei;
 	ParallelContext *pcxt;
 	ExecParallelEstimateContext e;
 	ExecParallelInitializeDSMContext d;
+	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
 	char	   *param_space;
@@ -418,6 +429,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * for the various things we need to store.
 	 */
 
+	/* Estimate space for fixed-size state. */
+	shm_toc_estimate_chunk(&pcxt->estimator,
+		   sizeof(FixedParallelExecutorState));
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for query text. */
 	query_len = strlen(estate->es_sourceText);
 	shm_toc_estimate_chunk(&pcxt->estimator, query_len);
@@ -487,6 +503,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * asked for has been allocated or initialized yet, though, so do that.
 	 */
 
+	/* Store fixed-size state. */
+	fpes = shm_toc_allocate(pcxt->toc, sizeof(FixedParallelExecutorState));
+	fpes->tuples_needed = tuples_needed;
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECUTOR_FIXED, fpes);
+
 	/* Store query string */
 	query_string = shm_toc_allocate(pcxt->toc, query_len);
 	memcpy(query_string, estate->es_sourceText, query_len);
@@ -833,6 +854,7 @@ ExecParallelInitializeWorker(PlanState *planstate, shm_toc *toc)
 void
 ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 {
+	FixedParallelExecutorState *fpes;
 	BufferUsage *buffer_usage;
 	DestReceiver *receiver;
 	QueryDesc  *queryDesc;
@@ -841,6 +863,9 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	void	   *area_space;
 	dsa_area   *area;
 
+	/* Get fixed-size state. */
+	fpes = shm_toc_lookup(toc, PARALLEL_KEY_EXECUTOR_FIXED, false);
+
 	/* Set up DestReceiver, SharedExecutorInstrumentation, and QueryDesc. */
 	receiver = ExecParallelGetReceiver(seg, toc);
 	instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_INSTRUMENTATION, true);
@@ -868,8 +893,17 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	queryDesc->planstate->state->es_query_dsa = area;
 	ExecParallelInitializeWorker(queryDesc->planstate, toc);
 
-	/* Run the plan */
-	ExecutorRun(queryDesc, ForwardScanDirection, 0L, true);
+	/* Pass down any tuple bound */
+	ExecSetTupleBound(fpes->tuples_needed, queryDesc->planstate);
+
+	/*
+	 * Run the plan.  If we specified a tuple bound, be careful not to demand
+	 * more tuples than that.
+	 */
+	ExecutorRun(queryDesc,
+ForwardScanDirection,
+fpes->tuples_needed < 0 ? (int64) 0 : fpes->tuples_needed,
+true);
 
 	/* Shut down the executor */
 	ExecutorFinish(queryDesc);
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 36d2914..c1aa506 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 12:05 PM, Tom Lane  wrote:
> Hm.  It seemed pretty reliable for me, but we already know that the
> parallel machinery is quite timing-sensitive.  I think we'd better
> incorporate a regression test case into the committed patch so we
> can see what the buildfarm thinks.  I'll see about adding one.

That would be great.  I didn't have a great idea how to write a test
for this -- maybe one of my systematic failures as a developer.  :-(

>> I think that the comment at the bottom of ExecSetTupleBound should
>> probably be rethought a bit in light of these changes.
>
> Agreed; I'll revisit all the comments, actually.

That also sounds great.

> No, I think that's fundamentally the wrong idea.  The tuple bound
> inherently is a piece of information that has to apply across a whole
> scan.  If you pass it to ExecProcNode then you'll have to get into
> API contracts about whether it's allowed to change across calls,
> which is a mess, even aside from efficiency concerns (and for
> ExecProcNode, I *am* concerned about efficiency).

Fair.

> You could imagine adding it as a parameter to ExecReScan, instead, but
> then there are a bunch of issues with how that would interact with the
> existing optimizations for skipped/delayed/merged rescan calls (cf.
> the existing open issue with parallel rescans).  That is a can of worms
> I'd just as soon not open.
>
> I'm content to leave it as a separate call.  I still think that worrying
> about extra C function calls for this purpose is, at best, premature
> optimization.

OK, you may well be right.  Outside of Limit, I suppose we're not
likely to set any limit other than 1, and if we set a limit of 1 it'll
probably be a limit that survives rescans, because it'll be based on
the parent node not caring about anything more than the first tuple.

BTW, I think another reason why we're likely to eventually want to get
very aggressive about passing down bounds is batch execution.  That's
somewhere on Andres's list of things that could speed up the executor,
although I believe at present there are quite a few other bottlenecks
that are more urgent.  But certainly if we start doing that, then even
things like SeqScan are going to want to know about applicable limit
clauses.

> I'll do another pass over the patch and get back to you.  I've not
> looked at the instrumentation patch yet --- is there a reason to
> worry about it before we finish this one?

I think they're independent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> I had the same thought.  Done in the attached revision of your
> version.  I couldn't consistently reproduce the failure you saw -- it
> failed for me only about 1 time in 10 -- but I don't think it's
> failing any more with this version.

Hm.  It seemed pretty reliable for me, but we already know that the
parallel machinery is quite timing-sensitive.  I think we'd better
incorporate a regression test case into the committed patch so we
can see what the buildfarm thinks.  I'll see about adding one.

> I think that the comment at the bottom of ExecSetTupleBound should
> probably be rethought a bit in light of these changes.

Agreed; I'll revisit all the comments, actually.

> The details aside, Konstantin Knizhnik's proposal to teach this code
> about ForeignScans is yet another independent way for this to win, and
> I think that will also be quite a large win.

+1

> It's possible that we should work out some of these things at plan
> time rather than execution time, and it's also possible that a
> separate call to ExecSetTupleBound is too much work.  I've mulled over
> the idea of whether we should get rid of this mechanism altogether in
> favor of passing the tuple bound as an additional argument to
> ExecProcNode, because it seems silly to call node-specific code once
> to set a (normally small, possibly 1) bound and then call it again to
> get a (or the) tuple.

No, I think that's fundamentally the wrong idea.  The tuple bound
inherently is a piece of information that has to apply across a whole
scan.  If you pass it to ExecProcNode then you'll have to get into
API contracts about whether it's allowed to change across calls,
which is a mess, even aside from efficiency concerns (and for
ExecProcNode, I *am* concerned about efficiency).

You could imagine adding it as a parameter to ExecReScan, instead, but
then there are a bunch of issues with how that would interact with the
existing optimizations for skipped/delayed/merged rescan calls (cf.
the existing open issue with parallel rescans).  That is a can of worms
I'd just as soon not open.

I'm content to leave it as a separate call.  I still think that worrying
about extra C function calls for this purpose is, at best, premature
optimization.

I'll do another pass over the patch and get back to you.  I've not
looked at the instrumentation patch yet --- is there a reason to
worry about it before we finish this one?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 11:15 AM, Tom Lane  wrote:
> The problem I exhibited with the updated patch could probably be resolved
> if the top level logic in a parallel worker knows about tuples_needed
> and doesn't try to pull more than that from its subplan.  So what you're
> describing isn't just an additional optimization, it's necessary to make
> this work at all.

I had the same thought.  Done in the attached revision of your
version.  I couldn't consistently reproduce the failure you saw -- it
failed for me only about 1 time in 10 -- but I don't think it's
failing any more with this version.

I think that the comment at the bottom of ExecSetTupleBound should
probably be rethought a bit in light of these changes.  Teaching the
parallel worker about tuples_needed may not be JUST an additional
optimization, but it IS an additional optimization; IOW, what the
planner can put between Sort and Limit will no longer be the only
relevant consideration.

The details aside, Konstantin Knizhnik's proposal to teach this code
about ForeignScans is yet another independent way for this to win, and
I think that will also be quite a large win.  In the future, I believe
that we might want to use a mechanism like this in even more cases.
For example, a Semi Join or Anti Join only needs 1 row from the inner
side per execution.  There's probably no reason to put a Sort under
the inner side of a Semi Join or Anti Join, but it's certainly not
impossible to have a Foreign Scan there.  It's likely to be an
expensive plan, but it's a heck of a lot more expensive if we fetch
100 rows when we only need 1.

I'm not sure the existing mechanism will stretch to handle such cases.
It's possible that we should work out some of these things at plan
time rather than execution time, and it's also possible that a
separate call to ExecSetTupleBound is too much work.  I've mulled over
the idea of whether we should get rid of this mechanism altogether in
favor of passing the tuple bound as an additional argument to
ExecProcNode, because it seems silly to call node-specific code once
to set a (normally small, possibly 1) bound and then call it again to
get a (or the) tuple.  But I'm not sure passing it to ExecProcNode is
really the right idea; it's adding some overhead for something that is
a bit of a special case, and I'm not sure it will work out cleanly
otherwise, either.  I'd be interested in your thoughts if you have
any.  My intuition is that there's substantially more to be gained in
this area, but exactly how best to gain it is not very clear to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


push-down-bound-to-workers-3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> However, there's another opportunity for optimization here, which is
> the Limit -> Gather -> Anything case.  A worker which has itself
> generated enough tuples to fill the limit can certainly stop working,
> but right now it doesn't know that.  The patch I posted before didn't
> think about that case, but it seems to require only trivial
> modifications.

Ah, good point.

> I was going to post an updated patch trying to
> address that, but I see that you've already posted something, so I'll
> go have a look at that instead.

The problem I exhibited with the updated patch could probably be resolved
if the top level logic in a parallel worker knows about tuples_needed
and doesn't try to pull more than that from its subplan.  So what you're
describing isn't just an additional optimization, it's necessary to make
this work at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 9:24 AM, Tom Lane  wrote:
> Basically, this argument is that we should contort the code in order
> to avoid tail recursion, rather than assuming the compiler will turn
> that recursion into iteration, even in cases where there is really
> zero evidence that we should be worrying about performance at all
> rather than correctness and readability.  I do not agree (and just
> finished pushing a patch to change it).

I'm not prepared to argue further about it.  This strikes me as one of
those things that is really a matter of judgement; I don't really care
much about how the code ends up but I think your portrayal of what was
there is unduly negative.

>> On another note, here's a second patch applying on top of my earlier
>> patch to push down Limit through Gather and Gather Merge.
>
> Can you demonstrate any case where the planner would put Gather between
> Sort and Limit?  The GatherMerge case is probably interesting, but I'm
> having doubts about Gather.

That's an interesting point.  I was thinking that we needed both of
these patches in order to make the proposed regression test pass,
because in the relevant case we'll have Gather -> Limit -> Sort, but
actually we only need the second one, because as you point out here
what matters is what's between the Limit and the Sort.  I was thinking
that we'd need to be able to see through the Gather at the top of the
plan, but we don't.

However, there's another opportunity for optimization here, which is
the Limit -> Gather -> Anything case.  A worker which has itself
generated enough tuples to fill the limit can certainly stop working,
but right now it doesn't know that.  The patch I posted before didn't
think about that case, but it seems to require only trivial
modifications.   I was going to post an updated patch trying to
address that, but I see that you've already posted something, so I'll
go have a look at that instead.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Amit Kapila  writes:
> On Fri, Aug 25, 2017 at 7:35 PM, Tom Lane  wrote:
>> it would be stupid to put a filter on that node rather than its
>> children, but I see this in both nodeGather.c and nodeGatherMerge.c:
>>
>> gatherstate->ps.qual =
>> ExecInitQual(node->plan.qual, (PlanState *) gatherstate);
>> 
>> It doesn't look like the qual is actually used anywhere in either node
>> type.  Am I right in thinking this is dead code?

> I also think so.  I think this was required in some initial versions
> of gather node patch where we were thinking of having a single node
> (instead of what we have now that Gather node and beneath there will
> be partial scan node) to perform parallel scans.

Thanks for confirming.  I'll replace that with something like
Assert(!node->plan.qual).  I have some other code-review-ish
fixes to do in nodeGatherMerge, too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> I'm inclined to commit both of these after a little more testing and
> self-review, but let me know if anyone else wants to review first.

Attached is an updated version of the first patch, which I rebased
over 3f4c7917b, improved a bunch of the comments for, and fixed a
couple of obvious typos in.  However, when I went to test it,
it blew up real good:

regression=# set parallel_setup_cost=0;
SET
regression=# set parallel_tuple_cost=0;
SET
regression=# set min_parallel_table_scan_size=0;
SET
regression=# set max_parallel_workers_per_gather=4;
SET
regression=# explain analyze select * from tenk1 order by fivethous limit 4;
ERROR:  retrieved too many tuples in a bounded sort
CONTEXT:  parallel worker

The cause is obvious: GatherMerge doesn't know about the contract that
it's not supposed to pull more than tuples_needed rows from an input after
promising not to do so.  I am not convinced that it's worth adding the
logic that would be needed to make that happen, so my inclination is to
abandon this patch.  But here it is if you want to push further.

regards, tom lane

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ce47f1d..1287025 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -47,17 +47,26 @@
  * greater than any 32-bit integer here so that values < 2^32 can be used
  * by individual parallel nodes to store their own state.
  */
-#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE001)
-#define PARALLEL_KEY_PARAMSUINT64CONST(0xE002)
-#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE003)
-#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE004)
-#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE005)
-#define PARALLEL_KEY_DSAUINT64CONST(0xE006)
-#define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE007)
+#define PARALLEL_KEY_EXECUTOR_FIXED		UINT64CONST(0xE001)
+#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE002)
+#define PARALLEL_KEY_PARAMSUINT64CONST(0xE003)
+#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE004)
+#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE005)
+#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE006)
+#define PARALLEL_KEY_DSAUINT64CONST(0xE007)
+#define PARALLEL_KEY_QUERY_TEXT		UINT64CONST(0xE008)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
 /*
+ * Fixed-size random stuff that we need to pass to parallel workers.
+ */
+typedef struct FixedParallelExecutorState
+{
+	int64		tuples_needed;	/* tuple bound, see ExecSetTupleBound */
+} FixedParallelExecutorState;
+
+/*
  * DSM structure for accumulating per-PlanState instrumentation.
  *
  * instrument_options: Same meaning here as in instrument.c.
@@ -381,12 +390,14 @@ ExecParallelReinitialize(ParallelExecutorInfo *pei)
  * execution and return results to the main backend.
  */
 ParallelExecutorInfo *
-ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
+ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers,
+	 int64 tuples_needed)
 {
 	ParallelExecutorInfo *pei;
 	ParallelContext *pcxt;
 	ExecParallelEstimateContext e;
 	ExecParallelInitializeDSMContext d;
+	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
 	char	   *param_space;
@@ -418,6 +429,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * for the various things we need to store.
 	 */
 
+	/* Estimate space for fixed-size state. */
+	shm_toc_estimate_chunk(&pcxt->estimator,
+		   sizeof(FixedParallelExecutorState));
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for query text. */
 	query_len = strlen(estate->es_sourceText);
 	shm_toc_estimate_chunk(&pcxt->estimator, query_len);
@@ -487,6 +503,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * asked for has been allocated or initialized yet, though, so do that.
 	 */
 
+	/* Store fixed-size state. */
+	fpes = shm_toc_allocate(pcxt->toc, sizeof(FixedParallelExecutorState));
+	fpes->tuples_needed = tuples_needed;
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECUTOR_FIXED, fpes);
+
 	/* Store query string */
 	query_string = shm_toc_allocate(pcxt->toc, query_len);
 	memcpy(query_string, estate->es_sourceText, query_len);
@@ -833,6 +854,7 @@ ExecParallelInitializeWorker(PlanState *planstate, shm_toc *toc)
 void
 ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 {
+	FixedParallelExecutorState *fpes;
 	BufferUsage *buffer_usage;
 	DestReceiver *receiver;
 	QueryDesc  *queryDesc;
@@ -841,6 +863,9 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	void	   *area_space;
 	dsa_area   *area;
 
+	/* Get fixed-size state. */
+	fpes = shm_toc_lookup(toc, PARALLEL_KEY_EXECUTOR_FIXED, false);
+
 	/* Set up DestReceiver, SharedExecutorInstrumentation, and Query

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Amit Kapila
On Fri, Aug 25, 2017 at 7:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I'm inclined to commit both of these after a little more testing and
>> self-review, but let me know if anyone else wants to review first.
>
> Looking through the first one, I wondered whether we needed to
> check for a qual expression on Gather or GatherMerge.  It seems like
> it would be stupid to put a filter on that node rather than its
> children, but I see this in both nodeGather.c and nodeGatherMerge.c:
>
> /*
>  * initialize child expressions
>  */
> gatherstate->ps.qual =
> ExecInitQual(node->plan.qual, (PlanState *) gatherstate);
>
> It doesn't look like the qual is actually used anywhere in either node
> type.  Am I right in thinking this is dead code?
>

I also think so.  I think this was required in some initial versions
of gather node patch where we were thinking of having a single node
(instead of what we have now that Gather node and beneath there will
be partial scan node) to perform parallel scans.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> I'm inclined to commit both of these after a little more testing and
> self-review, but let me know if anyone else wants to review first.

Looking through the first one, I wondered whether we needed to
check for a qual expression on Gather or GatherMerge.  It seems like
it would be stupid to put a filter on that node rather than its
children, but I see this in both nodeGather.c and nodeGatherMerge.c:

/*
 * initialize child expressions
 */
gatherstate->ps.qual =
ExecInitQual(node->plan.qual, (PlanState *) gatherstate);

It doesn't look like the qual is actually used anywhere in either node
type.  Am I right in thinking this is dead code?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 24, 2017 at 2:24 PM, Tom Lane  wrote:
>> I'd have been okay with changing the entire function if it could still be
>> doing all cases the same way.  But the exception for merge-append (and
>> probably soon other cases) means you still end up with a readability
>> problem.

> I don't really agree with that.  I think it's reasonable to handle the
> cases where we are just looking through nodes differently than the
> others.  Just because we can't conveniently avoid recursing in every
> case doesn't mean, to me, that we should recurse even when it's easily
> avoidable.

Basically, this argument is that we should contort the code in order
to avoid tail recursion, rather than assuming the compiler will turn
that recursion into iteration, even in cases where there is really
zero evidence that we should be worrying about performance at all
rather than correctness and readability.  I do not agree (and just
finished pushing a patch to change it).

> On another note, here's a second patch applying on top of my earlier
> patch to push down Limit through Gather and Gather Merge.

Can you demonstrate any case where the planner would put Gather between
Sort and Limit?  The GatherMerge case is probably interesting, but I'm
having doubts about Gather.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-25 Thread Robert Haas
On Thu, Aug 24, 2017 at 2:24 PM, Tom Lane  wrote:
> I'd have been okay with changing the entire function if it could still be
> doing all cases the same way.  But the exception for merge-append (and
> probably soon other cases) means you still end up with a readability
> problem.

I don't really agree with that.  I think it's reasonable to handle the
cases where we are just looking through nodes differently than the
others.  Just because we can't conveniently avoid recursing in every
case doesn't mean, to me, that we should recurse even when it's easily
avoidable.  I think what we might think about doing is have a node
that loops over Subquery Scan and Result and drills down, and then
handle the rest of the cases either as at present or, perhaps, with a
switch.

On another note, here's a second patch applying on top of my earlier
patch to push down Limit through Gather and Gather Merge.  This one
propagates information about sorts performed in parallel workers back
to the leader and adjusts the test case to use a non-temp table again.
This second patch has the nice property of being a fairly good test
that the first patch is actually doing something.

I'm inclined to commit both of these after a little more testing and
self-review, but let me know if anyone else wants to review first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


propagate-sort-instrumentation.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
Douglas Doole  writes:
>> TBH I dislike the fact that
>> you did the subquery case randomly differently from the existing cases;
>> it should just have been added as an additional recursive case.  Since
>> this is done only once at query startup, worrying about hypothetical
>> micro-performance issues seems rather misguided.

> Habit mostly - why write code with potential performance problems when the
> alternative is just as easy to read?

I disagree that having adjacent code doing the same thing in two different
ways is "easy to read"; what it is is confusing.  More globally, since
we're dealing with community code that will be read and hacked on by many
people, I think we need to prioritize simplicity, readability, and
extensibility over micro-efficiency.  I'm willing to compromise those
goals for efficiency where a clear need has been demonstrated, but no such
need has been shown here, nor do I think it's likely that one can be
shown.

I'd have been okay with changing the entire function if it could still be
doing all cases the same way.  But the exception for merge-append (and
probably soon other cases) means you still end up with a readability
problem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
Douglas Doole  writes:
> My first thought was to do a regex over the explain output to mask the
> troublesome value, but I couldn't figure out how to make it work and didn't
> find an example (admittedly didn't spent a huge amount of time hunting
> though). I'd love to see how to put it together.

I did it like this:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1177ab1dabf72bafee8f19d904cee3a299f25892

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Douglas Doole
>
> No.  You can't easily avoid recursion for the merge-append case, since
> that has to descend to multiple children.


Agreed. That's why it's not in the while loop in my sketch of the suggested
rework.


>   TBH I dislike the fact that
> you did the subquery case randomly differently from the existing cases;
> it should just have been added as an additional recursive case.  Since
> this is done only once at query startup, worrying about hypothetical
> micro-performance issues seems rather misguided.
>

Habit mostly - why write code with potential performance problems when the
alternative is just as easy to read?

I disagree with saying "it's just done at query startup so don't worry
about it too much". Back in my days with DB2 we were working on the TPCC
benchmark (when it was still relevant) and we found that the startup cost
was a huge factor when doing rapid fire, cached, simple queries. In many
cases the startup cost was >50% of the overall query execution time. Our
testing here in Salesforce is showing a similar pattern in some of our
workloads and PostgreSQL.

I'm not trying to argue that this particular issue of recurse/don't recurse
is going to make or break anything. It's just where my head's at when I
look at the code.

- Doug
Salesforce


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
I wrote:
> To get the buildfarm back to green, I agree with the suggestion of
> setting force_parallel_mode=off for this test, at least for now.

Actually, there's an easier way: we can just make the test table be
TEMP.  That is a good idea anyway to prevent possible interference
from auto-analyze, which conceivably could come along at just the
wrong time and cause a plan change.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Douglas Doole
>
> I don't greatly like the way that the regression test case filters
> the output; it hides too much IMO.  I'd be inclined to try to return
> the EXPLAIN output with as much detail preserved as possible.  Maybe
> we could use regex substitution on lines of the output to get rid of
> stuff that won't be invariant.
>

My first thought was to do a regex over the explain output to mask the
troublesome value, but I couldn't figure out how to make it work and didn't
find an example (admittedly didn't spent a huge amount of time hunting
though). I'd love to see how to put it together.

Doug
- Salesforce


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
Robert Haas  writes:
> Buildfarm members with force_parallel_mode=regress are failing now.  I
> haven't had a chance to investigate exactly what's going on here, but
> I think there are probably several issues:

> 1. It's definitely the case that the details about a sort operation
> aren't propagated from a worker back to the leader.  This has elicited
> complaint previously.

Check; this must be the explanation for the buildfarm failures.  That'd
be worth fixing but it seems like material for an independent patch.

> 2. pass_down_bound() probably gets the Gather node at the top of the
> tree and stops, since it doesn't know how to pass down a bound through
> a Gather.  We could probably teach it to pass down the bound through
> Gather or Gather Merge on the same theory as Append/MergeAppend.

While that might be worth doing, it's not the issue here, because
the forced Gather is on top of the Limit node.

> In the short run, I'm not sure we have a better alternative than
> removing this test - unless somebody has a better idea? - but it would
> be good to work on all of the above problems.

To get the buildfarm back to green, I agree with the suggestion of
setting force_parallel_mode=off for this test, at least for now.

I don't greatly like the way that the regression test case filters
the output; it hides too much IMO.  I'd be inclined to try to return
the EXPLAIN output with as much detail preserved as possible.  Maybe
we could use regex substitution on lines of the output to get rid of
stuff that won't be invariant.

Unless you're already on it, I can go do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
Douglas Doole  writes:
> Would we be better off moving those cases into the while loop I added to
> avoid the recursion?

No.  You can't easily avoid recursion for the merge-append case, since
that has to descend to multiple children.  TBH I dislike the fact that
you did the subquery case randomly differently from the existing cases;
it should just have been added as an additional recursive case.  Since
this is done only once at query startup, worrying about hypothetical
micro-performance issues seems rather misguided.

Perhaps, since this function is recursive, it ought to have a
check_stack_depth call, just to be safe.  I'm dubious though that it could
ever eat more stack than the preceding node-initialization calls, so maybe
we needn't bother.

In the long run I wonder whether we should convert "push down bound" into
a generic node operation like the others managed by execAmi.c.  It seems
like you could push a limit through any node type that's guaranteed not to
reduce the number of rows returned, and so we're missing cases.  Maybe
they're not ones that are important in practice, but I'm unsure.

> The two casts I added (to SubqueryScanState and Node) should probably be
> changed to castNode() calls.

The SubqueryScanState cast seems fine given that it immediately follows an
IsA() check.  You can't use castNode() for a cast to Node, because that's
a generic not a specific node type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-24 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 23, 2017 at 6:55 PM, Douglas Doole  wrote:
>> From previous experience, pushing the limit to the workers has the potential
>> to be a big win .

> Here's a not-really-tested draft patch for that.

Both this patch and the already-committed one contain useless (and not
very cheap) expression_returns_set tests.  Since commit 69f4b9c85,
SRFs could only appear in the tlist of ProjectSet nodes.

No opinion on whether this patch is right otherwise.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Douglas Doole
>
> Here's a not-really-tested draft patch for that.
>

I don't know the parallel code so I'm not going to comment on the overall
patch, but a thought on ExecSetTupleBound().

That function is getting a number of cases where we're doing recursion for
a single child (result, gather, gather-merge). Recursion for a single child
isn't particularly efficient. I know that if there's a single case of
recursion like this, compilers will frequently turn it into a loop, but I
don't know if compilers can optimize a branched case like this.

Would we be better off moving those cases into the while loop I added to
avoid the recursion? So we'd end up with something like:

while ()
{
if subquery
else if result
else if gather
else if gather merge
}

if sort
else if merge append

And a nit from my original fix now that I've looked at the pg10 code more.
The two casts I added (to SubqueryScanState and Node) should probably be
changed to castNode() calls.

- Doug
Salesforce


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Robert Haas
On Wed, Aug 23, 2017 at 6:55 PM, Douglas Doole  wrote:
> From previous experience, pushing the limit to the workers has the potential
> to be a big win .

Here's a not-really-tested draft patch for that.

> I haven't played with parallel mode at all yet. Is it possible to disable
> force_parallel_mode for the new test?

Yeah, I suppose that's another alternative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


push-down-bound-to-workers.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Douglas Doole
>
> Buildfarm members with force_parallel_mode=regress are failing now.  I
> haven't had a chance to investigate exactly what's going on here, but
> I think there are probably several issues:
>

Not an auspicious beginning for my first patch :-(


> 3. However, even if it did that, it would only affect the leader, not
> the workers, because each worker will of course have a separate copy
> of each executor state node.  We could fix that by having the Gather
> or Gather Merge node also store the bound and propagate that to each
> worker via DSM, and then have each worker call pass_down_bound itself.
> (This would require a bit of refactoring of the API for
> pass_down_bound, but it looks doable.)
>

>From previous experience, pushing the limit to the workers has the
potential to be a big win .

In the short run, I'm not sure we have a better alternative than
> removing this test - unless somebody has a better idea? - but it would
> be good to work on all of the above problems.
>

I haven't played with parallel mode at all yet. Is it possible to disable
force_parallel_mode for the new test? If not, then I'd agree that removing
the test is probably the only reasonable short term fix.

- Doug
Salesforce


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Robert Haas
On Mon, Aug 21, 2017 at 2:43 PM, Robert Haas  wrote:
> Works for me.  While I'm sure this won't eclipse previous achievements
> in this area, it still seems worthwhile.  So committed with one minor
> change to shorten a long-ish line.

Buildfarm members with force_parallel_mode=regress are failing now.  I
haven't had a chance to investigate exactly what's going on here, but
I think there are probably several issues:

1. It's definitely the case that the details about a sort operation
aren't propagated from a worker back to the leader.  This has elicited
complaint previously.

2. pass_down_bound() probably gets the Gather node at the top of the
tree and stops, since it doesn't know how to pass down a bound through
a Gather.  We could probably teach it to pass down the bound through
Gather or Gather Merge on the same theory as Append/MergeAppend.

3. However, even if it did that, it would only affect the leader, not
the workers, because each worker will of course have a separate copy
of each executor state node.  We could fix that by having the Gather
or Gather Merge node also store the bound and propagate that to each
worker via DSM, and then have each worker call pass_down_bound itself.
(This would require a bit of refactoring of the API for
pass_down_bound, but it looks doable.)

In the short run, I'm not sure we have a better alternative than
removing this test - unless somebody has a better idea? - but it would
be good to work on all of the above problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Ashutosh Bapat
On Wed, Aug 23, 2017 at 12:56 PM, Konstantin Knizhnik
 wrote:
>
>
> On 22.08.2017 17:27, Konstantin Knizhnik wrote:
>
>
>
> On 18.08.2017 04:33, Robert Haas wrote:
>
>
> It seems like a somewhat ad-hoc approach; it supposes that we can take any
> query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto
> the very end and all will be well.  Maybe that's not a problematic
> assumption, not sure.  The grammar happens to allow both FOR UPDATE LIMIT n
> and LIMIT n FOR UPDATE even though only the latter syntax is documented.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
> I am not absolutely sure that it is possible to append any query which can
> be constructed by postgres_fdw for foreign scan with "LIMIT n" clause.
> But I also do not know example when it is not possible. As you have
> mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.
>
>
> I have inspected deparseSelectStmtForRel function and now I am sure that
> appending LIMIT to the SQL statement generated by this function will not
> cause any problems.
> It can produce only the following subset of SELECT:
>
> select  FROM  [GROUP BY ... [ HAVING ... ]] [
> OREDER BY ... ] [ FOR UPDATE ... ];
>
>
> The only suspicious clause is FOR UPDATE, but I have checked that "FOR
> UPDATE ... LIMIT n" is  really accepted by Postgres parser.
>

There are two ways we can do this.
1. Implement limit handling in postgresGetForeignUpperPaths() when it
gets UPPERREL_FINAL (or we might want to break this rel into final and
limit rel). We then add paths for limit processing to the final rel
and add corresponding handling in deparseSelectStmtForRel(). If
foreign path happens to be cheaper, LIMIT gets pushed down to the
foreign server. But this approach does not take care of LIMITs passed
down by pass_down_bound(). But it will take care of deparsing the
query correctly and handle OFFSET clause as well.

2. Konstantin's approach takes care of LIMITs passed down by
pass_down_bound(), but "always" pushes down the LIMIT and assumes that
a LIMIT clause can be appended to the query already generated. Both of
these seem sane choices. But then it doesn't push down OFFSET clause
even when it's possible to push it down.

If we could defer deparsing the query to execution time we might be
able to achieve both the targets. It has one more advantage: we could
pass parameters embedded in the query, rather than binding them.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Konstantin Knizhnik



On 22.08.2017 17:27, Konstantin Knizhnik wrote:



On 18.08.2017 04:33, Robert Haas wrote:


It seems like a somewhat ad-hoc approach; it supposes that we can 
take any query produced by deparseSelectStmtForRel() and stick a 
LIMIT clause onto the very end and all will be well.  Maybe that's 
not a problematic assumption, not sure.  The grammar happens to allow 
both FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the 
latter syntax is documented.


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



I am not absolutely sure that it is possible to append any query which 
can be constructed by postgres_fdw for foreign scan with "LIMIT n" clause.
But I also do not know example when it is not possible. As you have 
mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.




I have inspected deparseSelectStmtForRel function and now I am sure that 
appending LIMIT to the SQL statement generated by this function will not 
cause any problems.

It can produce only the following subset of SELECT:

select  FROM  [GROUP BY ... [ HAVING ... ]] [ 
OREDER BY ... ] [ FOR UPDATE ... ];



The only suspicious clause is FOR UPDATE, but I have checked that "FOR 
UPDATE ... LIMIT n" is  really accepted by Postgres parser.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-22 Thread Konstantin Knizhnik



On 18.08.2017 04:33, Robert Haas wrote:


It seems like a somewhat ad-hoc approach; it supposes that we can take 
any query produced by deparseSelectStmtForRel() and stick a LIMIT 
clause onto the very end and all will be well.  Maybe that's not a 
problematic assumption, not sure.  The grammar happens to allow both 
FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the latter 
syntax is documented.


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



I am not absolutely sure that it is possible to append any query which 
can be constructed by postgres_fdw for foreign scan with "LIMIT n" clause.
But I also do not know example when it is not possible. As you have 
mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.


Can you suggest how to implement limit push down to FDW in better way?
Move deparseSelectStmtForRel() from postgresGetForeignPlan to 
postgresIterateForeignScan ?
It seems to be problematic because many information required by 
deparseSelectStmtForRel is not available in postgresIterateForeignScan.
In principle, it is possible to somehow propagate it here. But from my 
point of view it is not right approach...


IMHO there is some contradiction in Postgres optimizer that static 
information about limit is not taken in account at the planning stage 
and is actually used only during query execution,
when pass_down_bound() function is called to propagate knowledge about 
limit down through plan nodes. Certainly I understand that it gives more 
flexibility: we can use information from

previous steps of query execution which was not available at planning stage.

But pushing down limit at planning stage requires too much changes. And 
the proposed patch is very small and non-invasive. And in principle, it 
can be used not only postgres_fdw, but also in other FDW implementations 
to push down information about LIMIT.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-21 Thread Michael Paquier
On Tue, Aug 22, 2017 at 3:43 AM, Robert Haas  wrote:
> Works for me.  While I'm sure this won't eclipse previous achievements
> in this area, it still seems worthwhile.

This one is intentional per what happens in the US today? ;)
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-21 Thread Robert Haas
On Fri, Aug 18, 2017 at 4:08 PM, Douglas Doole  wrote:
>> 4. I am pretty doubtful that "Memory: 25kB" is going to be stable
>> enough for us to want that output memorialized in the regression ...
>
> Fair enough. I wanted to be a bit more sophisticated in my check than
> looking for a single value so I worked out something that distills the
> explain down to the key elements.

Works for me.  While I'm sure this won't eclipse previous achievements
in this area, it still seems worthwhile.  So committed with one minor
change to shorten a long-ish line.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Douglas Doole
>
> 1. The header comment for pass_down_bound() could mention "one or more
> levels of subqueries" rather than "a subquery".
>

Fixed

2. The first of the comments in the function body appears to have a
> whitespace issue that needs to be fixed manually or, better yet,
> addressed by pgindent.
>

Fixed


> 3. The formatting of the comment in the regression tests appears to be
> unlike any other comment in that same file.
>

A side effect of inheriting it from our branches ;-) Reworked.


> 4. I am pretty doubtful that "Memory: 25kB" is going to be stable
> enough for us to want that output memorialized in the regression ...
>

Fair enough. I wanted to be a bit more sophisticated in my check than
looking for a single value so I worked out something that distills the
explain down to the key elements.
*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***
*** 308,313  recompute_limits(LimitState *node)
--- 308,316 
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input.  We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept one or more levels of subqueries that have no quals or
+  * SRFs (that is, each subquery is just projecting columns) between the LIMIT
+  * and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***
*** 320,325  recompute_limits(LimitState *node)
--- 323,349 
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+ 	 * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState  *sortState = (SortState *) child_node;
*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***
*** 1041,1043  NOTICE:  x = 9, y = 13
--- 1041,1095 
  (3 rows)
  
  drop function tattle(x int, y int);
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ 	(1, 1, 1),
+ 	(2, 2, 2),
+ 	(3, 3, 3),
+ 	(4, 4, 4),
+ 	(5, 1, 1),
+ 	(6, 2, 2),
+ 	(7, 3, 3),
+ 	(8, 4, 4);
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits.  The goal here is that
+ -- we should see three notices, in order:
+ --   NOTICE: Limit
+ --   NOTICE: Subquery
+ --   NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+ 	declare x text;
+ 	begin
+ 		for x in
+ 			explain (analyze, summary off, timing off, costs off)
+ 			select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3
+ 		loop
+ 			if (left(ltrim(x), 5) = 'Limit') then
+ raise notice 'Limit';
+ 			end if;
+ 			if (left(ltrim(x), 12) = '->  Subquery') then
+ raise notice 'Subquery';
+ 			end if;
+ 			if (left(ltrim(x), 18) = 'Sort Method: top-N') then
+ raise notice 'Top-N Sort';
+ 			end if;
+ 		end loop;
+ 	end;
+ $$;
+ NOTICE:  Limit
+ NOTICE:  Subquery
+ NOTICE:  Top-N Sort
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ +
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;
*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***
*** 540,542  select * from
--- 540,588 
where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ 	(1, 1, 1),
+ 	(2, 2, 2),
+ 	(3, 3, 3),
+ 	(4, 4, 4),
+ 	(5, 1, 1),
+ 	(6, 2, 2),
+ 	(7, 3, 3),
+ 	(8, 4, 4);
+ 
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits.  The goal here is that
+ -- we should see three notices, in order:
+ --   NOTICE: Limit
+ --   NOTICE: Subquery
+ --   NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+ 	declare x text;
+ 	begin
+ 		for

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 11:42 AM, Douglas Doole  wrote:
> Thanks for the feedback on my original patch Robert. Here's an updated patch
> that will tunnel through multiple SubqueryScanStates.

Seems reasonable.  I have some assorted nitpicks.

1. The header comment for pass_down_bound() could mention "one or more
levels of subqueries" rather than "a subquery".

2. The first of the comments in the function body appears to have a
whitespace issue that needs to be fixed manually or, better yet,
addressed by pgindent.

3. The formatting of the comment in the regression tests appears to be
unlike any other comment in that same file.

4. I am pretty doubtful that "Memory: 25kB" is going to be stable
enough for us to want that output memorialized in the regression
tests. It seems like it might vary on different platforms - e.g.
32-bit vs. 64-bit - and it also seems like minor changes to how we do
sorting could perturb it and, perhaps, make it unstable even if today
it isn't.  So I think it would be good to give this a bit more thought
and see if you can come up with a way to test this without running
afoul of that problem.  Maybe adapt from this:

do $$declare x text; begin execute 'explain select 1' into x; if x !~
'^Result' then raise notice '%', x; else raise notice 'looks ok'; end
if; end;$$;

BTW, regarding the other patch on this thread, it struck me that maybe
it would be better to just reduce/limit the fetch count for the cursor
instead of trying to inject LIMIT n into the query itself.  That's not
as good for query optimization purposes but it's a lot more
future-proof.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Douglas Doole
Thanks for the feedback on my original patch Robert. Here's an updated
patch that will tunnel through multiple SubqueryScanStates.

- Doug
Salesforce

On Thu, Aug 17, 2017 at 6:33 PM Robert Haas  wrote:

> On Thu, Aug 17, 2017 at 11:36 AM, Douglas Doole 
> wrote:
>
>> I completely agree. The further a limit can be pushed down, the better.
>>
>> The patch looks good to me.
>>
>
> It seems like a somewhat ad-hoc approach; it supposes that we can take any
> query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto
> the very end and all will be well.  Maybe that's not a problematic
> assumption, not sure.  The grammar happens to allow both FOR UPDATE LIMIT n
> and LIMIT n FOR UPDATE even though only the latter syntax is documented.
>
> Regarding the other patch on this thread, you mentioned upthread that "If
> it is possible to get more than one SubqueryScanState and/or ResultState
> between the limit and sort, then the first block of code could be placed in
> a while loop."  I think that's not possible for a ResultState, but I think
> it *is* possible for a SubqueryScanState.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***
*** 308,313  recompute_limits(LimitState *node)
--- 308,315 
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input.  We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept a subquery that has no quals or SRFs (that is, the
+  * subquery is just projecting columns) between the LIMIT and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***
*** 320,325  recompute_limits(LimitState *node)
--- 322,348 
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+  * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState  *sortState = (SortState *) child_node;
*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***
*** 1041,1043  NOTICE:  x = 9, y = 13
--- 1041,1077 
  (3 rows)
  
  drop function tattle(x int, y int);
+ -
+ --TEST LIMIT pushdown through subquery scan node
+ -
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ (1, 1, 1),
+ (2, 2, 2),
+ (3, 3, 3),
+ (4, 4, 4),
+ (5, 1, 1),
+ (6, 2, 2),
+ (7, 3, 3),
+ (8, 4, 4);
+ explain (analyze, summary off, timing off, costs off)
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+QUERY PLAN   
+ 
+  Limit (actual rows=3 loops=1)
+->  Subquery Scan on x (actual rows=3 loops=1)
+  ->  Sort (actual rows=3 loops=1)
+Sort Key: sq_limit.c1, sq_limit.pk
+Sort Method: top-N heapsort  Memory: 25kB
+->  Seq Scan on sq_limit (actual rows=8 loops=1)
+ (6 rows)
+ 
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ +
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;
*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***
*** 540,542  select * from
--- 540,563 
where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ -
+ --TEST LIMIT pushdown through subquery scan node
+ -
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ (1, 1, 1),
+ (2, 2, 2),
+ (3, 3, 3),
+ (4, 4, 4),
+ (5, 1, 1),
+ (6, 2, 2),
+ (7, 3, 3),
+ (8, 4, 4);
+ 
+ explain (analyze, summary off, timing off, costs off)
+ select * from (select pk,c2 from sq_limit order by c1,pk) a

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-17 Thread Robert Haas
On Thu, Aug 17, 2017 at 11:36 AM, Douglas Doole  wrote:

> I completely agree. The further a limit can be pushed down, the better.
>
> The patch looks good to me.
>

It seems like a somewhat ad-hoc approach; it supposes that we can take any
query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto
the very end and all will be well.  Maybe that's not a problematic
assumption, not sure.  The grammar happens to allow both FOR UPDATE LIMIT n
and LIMIT n FOR UPDATE even though only the latter syntax is documented.

Regarding the other patch on this thread, you mentioned upthread that "If
it is possible to get more than one SubqueryScanState and/or ResultState
between the limit and sort, then the first block of code could be placed in
a while loop."  I think that's not possible for a ResultState, but I think
it *is* possible for a SubqueryScanState.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-17 Thread Douglas Doole
I completely agree. The further a limit can be pushed down, the better.

The patch looks good to me.

On Thu, Aug 17, 2017 at 8:27 AM Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 29.04.2017 00:13, Douglas Doole wrote:
>
> If you add this to the commitfest app, more people might look at it when
>> the next commitfest opens.
>
>
> I have added it. https://commitfest.postgresql.org/14/1119/
>
> Also, it might help if you can provide a query/ies with numbers where this
>> optimization shows improvement.
>>
>
> I can't provide the real queries where we encountered the problem because
> they are internal. However I showed a simplified version of the queries in
> my first post.
>
> On our queries, the change made quite a difference - execution time
> dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows that
> memory use dropped significantly and we didn't have to spill the sort to
> disk
>
> From:
>
> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
> (node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
> rows=2001 loops=1) Buffers: temp read=772 written=11201 lsm_bufmgr
> hits=3392 Sort Key: *** Sort Method: external merge Sort Space Used: 89592
> Sort Space Type: Disk
>
> To:
>
> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
> (node_startup_time/loop=7123.275, node_total_time/loop: 7123.504 rows=2001
> loops=1) Buffers: lsm_bufmgr hits=3387 Sort Key: *** Sort Method: top-N
> heapsort Sort Space Used: 3256 Sort Space Type: Memory
>
>
> Attached please find yet another small patch which pushes down LIMIT to
> ForeignScan.
> I should notice that currently Postgres optimizer is using "Merge Append"
> and fetches from remote nodes only required number of tuples.
> So even without LIMIT push down, postgres_fdw will not pull the whole
> table from remote host.
> postgres_fdw is using cursor for fetching data from remote. Default fetch
> size is 100, so even without limit remote query will fetch no more  than
> 100 rows at remote site.
>
> Assume the following example:
>
> postgres=# create extension postgres_fdw;
> postgres=# create server shard1  FOREIGN DATA WRAPPER postgres_fdw
> options(dbname 'postgres', host 'localhost', port '5432');
> postgres=# create server shard2  FOREIGN DATA WRAPPER postgres_fdw
> options(dbname 'postgres', host 'localhost', port '5432');
> postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user
> '$user');
> postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user
> '$user');
> postgres=# CREATE TABLE t(u integer primary key, v integer);
> postgres=# CREATE TABLE t1(u integer primary key, v integer);
> postgres=# CREATE TABLE t2(u integer primary key, v integer);
> postgres=# insert into t1 values (generate_series(1,10),
> random()*10);
> postgres=# insert into t2 values (generate_series(1,10),
> random()*10);
> postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1
> options(table_name 't1');
> postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2
> options(table_name 't2');
>
>
> postgres=# explain analyze select * from t order by u limit 1;
>   QUERY
> PLAN
>
> ---
>  Limit  (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010
> rows=1 loops=1)
>->  Merge Append  (cost=200.15..449.39 rows=5121 width=8) (actual
> time=2.009..2.009 rows=1 loops=1)
>  Sort Key: t.u
>  ->  Index Scan using t_pkey on t  (cost=0.12..8.14 rows=1
> width=8) (actual time=0.005..0.005 rows=0 loops=1)
>  ->  Foreign Scan on t_fdw2  (cost=100.00..193.92 rows=2560
> width=8) (actual time=1.074..1.074 rows=1 loops=1)
>  ->  Foreign Scan on t_fdw1  (cost=100.00..193.92 rows=2560
> width=8) (actual time=0.928..0.928 rows=1 loops=1)
>  Planning time: 0.769 ms
>  Execution time: 6.837 ms
> (8 rows)
>
> As you can see foreign scan fetches only one row from each remote node.
>
> But still pushing down limit can have positive effect on performance,
> especially if SORT can be replaced with TOP-N.
> I got the following results (time in seconds):
>
> Query
> original
> limit push down
> select * from t order by u limit 1
> 2.276
> 1.777
> select * from t order by v limit 1
> 100 42
>
> There is index for "u", so fetching records with smallest "u" values can
> be done without sorting, so times are similar.
> But in case of sorting by "v", pushing down limit allows to use TOP-1
> instead of global sort and it reduces query execution time more than 2
> times.
>
> --
>
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-17 Thread Konstantin Knizhnik



On 29.04.2017 00:13, Douglas Doole wrote:


If you add this to the commitfest app, more people might look at
it when the next commitfest opens.


I have added it. https://commitfest.postgresql.org/14/1119/

Also, it might help if you can provide a query/ies with numbers
where this optimization shows improvement.


I can't provide the real queries where we encountered the problem 
because they are internal. However I showed a simplified version of 
the queries in my first post.


On our queries, the change made quite a difference - execution time 
dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows 
that memory use dropped significantly and we didn't have to spill the 
sort to disk


From:

-> Sort (cost=989.95..1013.27 rows=9326 width=30) 
(node_startup_time/loop=31328.891, node_total_time/loop: 31329.756 
rows=2001 loops=1) Buffers: temp read=772 written=11201 lsm_bufmgr 
hits=3392 Sort Key: *** Sort Method: external merge Sort Space Used: 
89592 Sort Space Type: Disk


To:

-> Sort (cost=989.95..1013.27 rows=9326 width=30) 
(node_startup_time/loop=7123.275, node_total_time/loop: 7123.504 
rows=2001 loops=1) Buffers: lsm_bufmgr hits=3387 Sort Key: *** Sort 
Method: top-N heapsort Sort Space Used: 3256 Sort Space Type: Memory


Attached please find yet another small patch which pushes down LIMIT to 
ForeignScan.
I should notice that currently Postgres optimizer is using "Merge 
Append" and fetches from remote nodes only required number of tuples.
So even without LIMIT push down, postgres_fdw will not pull the whole 
table from remote host.
postgres_fdw is using cursor for fetching data from remote. Default 
fetch size is 100, so even without limit remote query will fetch no 
more  than 100 rows at remote site.


Assume the following example:

postgres=# create extension postgres_fdw;
postgres=# create server shard1  FOREIGN DATA WRAPPER postgres_fdw 
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# create server shard2  FOREIGN DATA WRAPPER postgres_fdw 
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user 
'$user');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user 
'$user');

postgres=# CREATE TABLE t(u integer primary key, v integer);
postgres=# CREATE TABLE t1(u integer primary key, v integer);
postgres=# CREATE TABLE t2(u integer primary key, v integer);
postgres=# insert into t1 values (generate_series(1,10), 
random()*10);
postgres=# insert into t2 values (generate_series(1,10), 
random()*10);
postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1 
options(table_name 't1');
postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2 
options(table_name 't2');



postgres=# explain analyze select * from t order by u limit 1;
  QUERY PLAN
---
 Limit  (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010 
rows=1 loops=1)
   ->  Merge Append  (cost=200.15..449.39 rows=5121 width=8) (actual 
time=2.009..2.009 rows=1 loops=1)

 Sort Key: t.u
 ->  Index Scan using t_pkey on t  (cost=0.12..8.14 rows=1 
width=8) (actual time=0.005..0.005 rows=0 loops=1)
 ->  Foreign Scan on t_fdw2  (cost=100.00..193.92 rows=2560 
width=8) (actual time=1.074..1.074rows=1 loops=1)
 ->  Foreign Scan on t_fdw1  (cost=100.00..193.92 rows=2560 
width=8) (actual time=0.928..0.928rows=1 loops=1)

 Planning time: 0.769 ms
 Execution time: 6.837 ms
(8 rows)

As you can see foreign scan fetches only one row from each remote node.

But still pushing down limit can have positive effect on performance, 
especially if SORT can be replaced with TOP-N.

I got the following results (time in seconds):

Query
original
limit push down
select * from t order by u limit 1
2.276
1.777
select * from t order by v limit 1
100 42


There is index for "u", so fetching records with smallest "u" values can 
be done without sorting, so times are similar.
But in case of sorting by "v", pushing down limit allows to use TOP-1 
instead of global sort and it reduces query execution time more than 2 
times.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 080cb0a..e3847ce 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2949,7 +2949,8 @@ create_cursor(ForeignScanState *node)
 	initStringInfo(&buf);
 	appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
 	 fsstate->cursor_number, fsstate->query);
-
+	if (node->limit > 0)
+		appendStringInfo(&buf, " LIMIT %lld", (long long)node->limit);
 	/*
 	 * Notice that we pass NULL for paramTypes,

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-04-28 Thread Douglas Doole
>
> If you add this to the commitfest app, more people might look at it when
> the next commitfest opens.


I have added it. https://commitfest.postgresql.org/14/1119/

Also, it might help if you can provide a query/ies with numbers where this
> optimization shows improvement.
>

I can't provide the real queries where we encountered the problem because
they are internal. However I showed a simplified version of the queries in
my first post.

On our queries, the change made quite a difference - execution time dropped
from 31.4 seconds to 7.2 seconds. Explain analyze also shows that memory
use dropped significantly and we didn't have to spill the sort to disk

From:

-> Sort (cost=989.95..1013.27 rows=9326 width=30)
(node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
rows=2001 loops=1) Buffers: temp read=772 written=11201 lsm_bufmgr
hits=3392 Sort Key: *** Sort Method: external merge Sort Space Used: 89592
Sort Space Type: Disk

To:

-> Sort (cost=989.95..1013.27 rows=9326 width=30)
(node_startup_time/loop=7123.275, node_total_time/loop: 7123.504 rows=2001
loops=1) Buffers: lsm_bufmgr hits=3387 Sort Key: *** Sort Method: top-N
heapsort Sort Space Used: 3256 Sort Space Type: Memory


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-04-19 Thread Ashutosh Bapat
On Wed, Apr 19, 2017 at 10:51 PM, Douglas Doole  wrote:
> Thanks for the feedback Ashutosh.
>
> I disagree that we need to call pass_down_bound() recursively. The whole
> point of the recursive call would be to tunnel through a plan node. Since
> SubqueryScanState only has one child (unlike MergeAppendState) this can be
> more efficiently done inline rather than via a recursive call.
>
> In the case of ResultState, this is classic tail-end recursion and the C
> compiler should optimize it out. As we add more recursive calls though, it
> becomes harder for the compiler to do the right thing. (I'll admit that I'm
> not up on what state-of-the-art in recursion removal though, so maybe my
> concern is moot. That said, I prefer not to depend on compiler optimizations
> if there's a clean alternative way to express the code.)
>
> I do agree that it would make the code cleaner to handle ResultState and
> SubqueryScanState in a similar fashion. My preference, however, would be to
> remove the recursive call from ResultState handling. That would give us code
> like:
>
> if child == SubqueryScanState
> child = SubqueryScanState->child
> else if child == ResultState
> child = ResultState->child
>
> if child == SortState
> limit pushdown
> else if child == MergeAppendState
> recursive call on all children
>
> If it is possible to get more than one SubqueryScanState and/or ResultState
> between the limit and sort, then the first block of code could be placed in
> a while loop.
>
> If you'd prefer that I rework the patch as I discussed, just let me know and
> I'll resubmit it.

Thinking more about it, pass_down_bound() optimization ultimately
targets sort nodes. It doesn't for example target ForeignScan nodes
which can benefit from such optimization. But I think any generic
LIMIT optimization will need to be handled in the planner as against
the executor.

Said that, I don't think community at large would accept serializing
pass_down_bound() as you are proposing. You may to wait for others'
opinions before working on it. If you add this to the commitfest app,
more people might look at it when the next commitfest opens. Also, it
might help if you can provide a query/ies with numbers where this
optimization shows improvement.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-04-19 Thread Douglas Doole
Thanks for the feedback Ashutosh.

I disagree that we need to call pass_down_bound() recursively. The whole
point of the recursive call would be to tunnel through a plan node. Since
SubqueryScanState only has one child (unlike MergeAppendState) this can be
more efficiently done inline rather than via a recursive call.

In the case of ResultState, this is classic tail-end recursion and the C
compiler should optimize it out. As we add more recursive calls though, it
becomes harder for the compiler to do the right thing. (I'll admit that I'm
not up on what state-of-the-art in recursion removal though, so maybe my
concern is moot. That said, I prefer not to depend on compiler
optimizations if there's a clean alternative way to express the code.)

I do agree that it would make the code cleaner to handle ResultState and
SubqueryScanState in a similar fashion. My preference, however, would be to
remove the recursive call from ResultState handling. That would give us
code like:

if child == SubqueryScanState
child = SubqueryScanState->child
else if child == ResultState
child = ResultState->child

if child == SortState
limit pushdown
else if child == MergeAppendState
recursive call on all children

If it is possible to get more than one SubqueryScanState and/or ResultState
between the limit and sort, then the first block of code could be placed in
a while loop.

If you'd prefer that I rework the patch as I discussed, just let me know
and I'll resubmit it.

- Doug
Salesforce

On Wed, Apr 19, 2017 at 1:57 AM Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> The function pass_down_bound() is a recursive function. For
> SubqueryScanState we have to do something similar to ResultState i.e.
> call pass_down_bound() recursively on subqueryScanState->subplan.
>
> Please add this to the next commitfest.
>
> On Wed, Apr 19, 2017 at 6:09 AM, Douglas Doole 
> wrote:
> > We've hit a case where pass_down_bound() isn't pushing the row count
> limit
> > from limit into sort. The issue is that we're getting a subquery scan
> node
> > between the limit and the sort. The subquery is only doing column
> projection
> > and has no quals or SRFs so it should be safe to push the limit into the
> > sort.
> >
> > The query that hit the problem can be simplified to:
> >
> >SELECT * FROM (SELECT A,B FROM T ORDER BY C) LIMIT 5
> >
> > (Yeah, the query's a little screwy in that the ORDER BY should really be
> > outside the subselect, but it came from a query generator, so that's a
> > different conversation.)
> >
> > Proposed patch is attached.
> >
> > - Doug
> > Salesforce
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
> >
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>


Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-04-19 Thread Ashutosh Bapat
The function pass_down_bound() is a recursive function. For
SubqueryScanState we have to do something similar to ResultState i.e.
call pass_down_bound() recursively on subqueryScanState->subplan.

Please add this to the next commitfest.

On Wed, Apr 19, 2017 at 6:09 AM, Douglas Doole  wrote:
> We've hit a case where pass_down_bound() isn't pushing the row count limit
> from limit into sort. The issue is that we're getting a subquery scan node
> between the limit and the sort. The subquery is only doing column projection
> and has no quals or SRFs so it should be safe to push the limit into the
> sort.
>
> The query that hit the problem can be simplified to:
>
>SELECT * FROM (SELECT A,B FROM T ORDER BY C) LIMIT 5
>
> (Yeah, the query's a little screwy in that the ORDER BY should really be
> outside the subselect, but it came from a query generator, so that's a
> different conversation.)
>
> Proposed patch is attached.
>
> - Doug
> Salesforce
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers