Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On Fri, Aug 25, 2017 at 5:12 PM, Tom Lane wrote:
>> Here's a reviewed version of the second patch.  I fixed one bug
>> (wrong explain group nesting) and made some additional cosmetic
>> improvements beyond the struct relocation.

> The capitalization of TupleSortInstrumentation is inconsistent with
> TuplesortSpaceType, TuplesortMethod, and Tuplesortstate; I recommend
> we lower-case the S.

Sure, do it that way.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 5:12 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane wrote:
>>> Hmm, I'm not sure why SortInstrumentation belongs naturally to
>>> tuplesort.h but putting an array of them there would be a "gross
>>> abstraction violation".  Perhaps it would help to rename
>>> struct SharedSortInfo to SortInstrumentationArray, and change its
>>> field names to be less specific to the parallel-worker use case?
>
>> What other use case could there be?  I think an array of
>> SortInstrumentation objects intended to be stored in DSM is fairly
>> clearly a bit of executor-specific machinery and thus properly
>> declared along with the node that contains it.
>
> I'm not really convinced, but it's not worth arguing further.
>
> Here's a reviewed version of the second patch.  I fixed one bug
> (wrong explain group nesting) and made some additional cosmetic
> improvements beyond the struct relocation.

The capitalization of TupleSortInstrumentation is inconsistent with
TuplesortSpaceType, TuplesortMethod, and Tuplesortstate; I recommend
we lower-case the S.

- * Ordinary plan nodes won't do anything here, but parallel-aware plan
- * nodes may need to initialize shared state in the DSM before parallel
- * workers are available.  They can allocate the space they previously
+ * Most plan nodes won't do anything here, but plan nodes that allocated
+ * DSM may need to initialize shared state in the DSM before parallel
+ * workers are launched.  They can allocate the space they previously

On reading this again, it seems to me that the first instance of
"allocated" in this comment block needs to be changed to "estimated",
because otherwise saying they can allocate it later on is
unintelligible.

Other than that, LGTM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane wrote:
>> Hmm, I'm not sure why SortInstrumentation belongs naturally to
>> tuplesort.h but putting an array of them there would be a "gross
>> abstraction violation".  Perhaps it would help to rename
>> struct SharedSortInfo to SortInstrumentationArray, and change its
>> field names to be less specific to the parallel-worker use case?

> What other use case could there be?  I think an array of
> SortInstrumentation objects intended to be stored in DSM is fairly
> clearly a bit of executor-specific machinery and thus properly
> declared along with the node that contains it.

I'm not really convinced, but it's not worth arguing further.

Here's a reviewed version of the second patch.  I fixed one bug
(wrong explain group nesting) and made some additional cosmetic
improvements beyond the struct relocation.

			regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 953e74d..385b325 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2292,15 +2292,21 @@ show_tablesample(TableSampleClause *tsc, PlanState *planstate,
 static void
 show_sort_info(SortState *sortstate, ExplainState *es)
 {
-	if (es->analyze && sortstate->sort_Done &&
-		sortstate->tuplesortstate != NULL)
+	if (!es->analyze)
+		return;
+
+	if (sortstate->sort_Done && sortstate->tuplesortstate != NULL)
 	{
 		Tuplesortstate *state = (Tuplesortstate *) sortstate->tuplesortstate;
+		TupleSortInstrumentation stats;
 		const char *sortMethod;
 		const char *spaceType;
 		long		spaceUsed;
 
-		tuplesort_get_stats(state, &sortMethod, &spaceType, &spaceUsed);
+		tuplesort_get_stats(state, &stats);
+		sortMethod = tuplesort_method_name(stats.sortMethod);
+		spaceType = tuplesort_space_type_name(stats.spaceType);
+		spaceUsed = stats.spaceUsed;
 
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 		{
@@ -2315,6 +2321,51 @@ show_sort_info(SortState *sortstate, ExplainState *es)
 			ExplainPropertyText("Sort Space Type", spaceType, es);
 		}
 	}
+
+	if (sortstate->shared_info != NULL)
+	{
+		int			n;
+		bool		opened_group = false;
+
+		for (n = 0; n < sortstate->shared_info->num_workers; n++)
+		{
+			TupleSortInstrumentation *sinstrument;
+			const char *sortMethod;
+			const char *spaceType;
+			long		spaceUsed;
+
+			sinstrument = &sortstate->shared_info->sinstrument[n];
+			if (sinstrument->sortMethod == SORT_TYPE_STILL_IN_PROGRESS)
+				continue;		/* ignore any unfilled slots */
+			sortMethod = tuplesort_method_name(sinstrument->sortMethod);
+			spaceType = tuplesort_space_type_name(sinstrument->spaceType);
+			spaceUsed = sinstrument->spaceUsed;
+
+			if (es->format == EXPLAIN_FORMAT_TEXT)
+			{
+				appendStringInfoSpaces(es->str, es->indent * 2);
+				appendStringInfo(es->str,
+								 "Worker %d:  Sort Method: %s  %s: %ldkB\n",
+								 n, sortMethod, spaceType, spaceUsed);
+			}
+			else
+			{
+				if (!opened_group)
+				{
+					ExplainOpenGroup("Workers", "Workers", false, es);
+					opened_group = true;
+				}
+				ExplainOpenGroup("Worker", NULL, true, es);
+				ExplainPropertyInteger("Worker Number", n, es);
+				ExplainPropertyText("Sort Method", sortMethod, es);
+				ExplainPropertyLong("Sort Space Used", spaceUsed, es);
+				ExplainPropertyText("Sort Space Type", spaceType, es);
+				ExplainCloseGroup("Worker", NULL, true, es);
+			}
+		}
+		if (opened_group)
+			ExplainCloseGroup("Workers", "Workers", false, es);
+	}
 }
 
 /*
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ad9eba6..01316ff 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -28,9 +28,10 @@
 #include "executor/nodeBitmapHeapscan.h"
 #include "executor/nodeCustom.h"
 #include "executor/nodeForeignscan.h"
-#include "executor/nodeSeqscan.h"
 #include "executor/nodeIndexscan.h"
 #include "executor/nodeIndexonlyscan.h"
+#include "executor/nodeSeqscan.h"
+#include "executor/nodeSort.h"
 #include "executor/tqueue.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/planmain.h"
@@ -202,10 +203,10 @@ ExecSerializePlan(Plan *plan, EState *estate)
 }
 
 /*
- * Ordinary plan nodes won't do anything here, but parallel-aware plan nodes
- * may need some state which is shared across all parallel workers.  Before
- * we size the DSM, give them a chance to call shm_toc_estimate_chunk or
- * shm_toc_estimate_keys on &pcxt->estimator.
+ * Parallel-aware plan nodes (and occasionally others) may need some state
+ * which is shared across all parallel workers.  Before we size the DSM, give
+ * them a chance to call shm_toc_estimate_chunk or shm_toc_estimate_keys on
+ * &pcxt->estimator.
  *
  * While we're at it, count the number of PlanState nodes in the tree, so
  * we know how many SharedPlanStateInstrumentation structures we need.
@@ -219,38 +220,43 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
 	/* Count this node.
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 2:54 PM, Tom Lane wrote:
>> I think moving SharedSortInfo into tuplesort.h would be a gross
>> abstraction violation, but moving SortInstrumentation into tuplesort.h
>> seems like a modest improvement.
>
> Hmm, I'm not sure why SortInstrumentation belongs naturally to
> tuplesort.h but putting an array of them there would be a "gross
> abstraction violation".  Perhaps it would help to rename
> struct SharedSortInfo to SortInstrumentationArray, and change its
> field names to be less specific to the parallel-worker use case?

What other use case could there be?  I think an array of
SortInstrumentation objects intended to be stored in DSM is fairly
clearly a bit of executor-specific machinery and thus properly
declared along with the node that contains it.  Only nodeSort.c and
explain.c need to know anything about it; tuplesort.c is totally
ignorant of it and I see no reason why we'd ever want to change that.
Indeed, I don't quite see how we could do so without some kind of
abstraction violation.

SortInstrumentation is a different kettle of fish.  I chose to make
that an executor-specific bit as well, but an advantage of your
proposed approach is that future additions to (or changes in) what
gets returned by tuplesort_get_stats() wouldn't require as much
tweaking elsewhere.  With your proposed change, nodeSort.c wouldn't
really care about the contents of that structure (as long as there
are no embedded pointers), so we could add new members or rename
things and only tuplesort.c and explain.c would need updating.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
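For reference, the division being settled here ends up looking roughly
like the sketch below.  This is inferred from the patch upthread plus
the agreed lower-casing of the S; it is illustrative, not necessarily
the verbatim committed headers.

In tuplesort.h, the per-sort statistics record that
tuplesort_get_stats() fills in:

    typedef enum
    {
        SORT_TYPE_STILL_IN_PROGRESS = 0,    /* worker slot not yet filled */
        SORT_TYPE_TOP_N_HEAPSORT,
        SORT_TYPE_QUICKSORT,
        SORT_TYPE_EXTERNAL_SORT,
        SORT_TYPE_EXTERNAL_MERGE
    } TuplesortMethod;

    typedef enum
    {
        SORT_SPACE_TYPE_DISK,
        SORT_SPACE_TYPE_MEMORY
    } TuplesortSpaceType;

    typedef struct TuplesortInstrumentation
    {
        TuplesortMethod sortMethod;     /* sort algorithm used */
        TuplesortSpaceType spaceType;   /* what spaceUsed measures */
        long        spaceUsed;          /* space consumption, in kB */
    } TuplesortInstrumentation;

And in execnodes.h, the executor-specific DSM array that only
nodeSort.c and explain.c ever touch:

    typedef struct SharedSortInfo
    {
        int         num_workers;
        TuplesortInstrumentation sinstrument[FLEXIBLE_ARRAY_MEMBER];
    } SharedSortInfo;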
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On Fri, Aug 25, 2017 at 2:01 PM, Tom Lane wrote:
>> I looked through this a little, and feel uncomfortable with the division
>> of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
>> struct SortInstrumentation, and maybe also SharedSortInfo, into
>> tuplesort.h.

> I think moving SharedSortInfo into tuplesort.h would be a gross
> abstraction violation, but moving SortInstrumentation into tuplesort.h
> seems like a modest improvement.

Hmm, I'm not sure why SortInstrumentation belongs naturally to
tuplesort.h but putting an array of them there would be a "gross
abstraction violation".  Perhaps it would help to rename
struct SharedSortInfo to SortInstrumentationArray, and change its
field names to be less specific to the parallel-worker use case?

>> (BTW, would it make sense to number the workers from 1 not 0 in the
>> EXPLAIN printout?)

> ... So I'm in favor of leaving it alone; I don't think that 0-based
> indexing is such an obscure convention that it will flummox users.

OK, I'm not particularly set on that.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 2:01 PM, Tom Lane wrote:
> Robert Haas writes:
>> On another note, here's a second patch applying on top of my earlier
>> patch to push down Limit through Gather and Gather Merge.
>
> I looked through this a little, and feel uncomfortable with the division
> of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
> struct SortInstrumentation, and maybe also SharedSortInfo, into
> tuplesort.h.  Then tuplesort_get_stats could be given the signature
>
>     void tuplesort_get_stats(Tuplesortstate *state,
>                              SortInstrumentation *stats);
>
> which seems less messy.  Thoughts?

I think moving SharedSortInfo into tuplesort.h would be a gross
abstraction violation, but moving SortInstrumentation into tuplesort.h
seems like a modest improvement.

> I'm also pretty suspicious of the filter code you added to subselect.sql.
> What happens when there's more than one worker?

Uh, then somebody's changed the definition of force_parallel_mode in a
really strange way.

> (BTW, would it make sense to number the workers from 1 not 0 in the
> EXPLAIN printout?)

The main problem I see with that is that it might confuse somebody who
is debugging.  If you've got a backend and print out
ParallelWorkerNumber, you might expect that to match what you see in
the EXPLAIN output.  It also seems likely to me that other parts of
EXPLAIN or other parts of the system generally will eventually expose
parallel worker numbers in mildly user-visible ways (e.g. maybe it'll
show up in a DEBUG message someplace eventually), and if we insist on
sticking a + 1 into EXPLAIN's output in the existing places we'll have
to do it everywhere else, and somebody's eventually going to forget.
So I'm in favor of leaving it alone; I don't think that 0-based
indexing is such an obscure convention that it will flummox users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On another note, here's a second patch applying on top of my earlier
> patch to push down Limit through Gather and Gather Merge.

I looked through this a little, and feel uncomfortable with the division
of typedefs between execnodes.h and tuplesort.h.  I'm inclined to push
struct SortInstrumentation, and maybe also SharedSortInfo, into
tuplesort.h.  Then tuplesort_get_stats could be given the signature

	void tuplesort_get_stats(Tuplesortstate *state,
							 SortInstrumentation *stats);

which seems less messy.  Thoughts?

I'm also pretty suspicious of the filter code you added to subselect.sql.
What happens when there's more than one worker?

(BTW, would it make sense to number the workers from 1 not 0 in the
EXPLAIN printout?)

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 1:31 PM, Tom Lane wrote:
> Robert Haas writes:
>> Awesome.  What's the business with enable_hashagg in the regression test
>> stuff?
>
> Just relocating that "reset" so that it only affects the test case it's
> needed for.  (I think the 0-worker test was inserted in the wrong place.)

Oh, hmm.  Well, no objection to that, I guess.

Thanks for working on this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> Awesome.  What's the business with enable_hashagg in the regression test
> stuff?

Just relocating that "reset" so that it only affects the test case it's
needed for.  (I think the 0-worker test was inserted in the wrong place.)

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 1:18 PM, Tom Lane wrote:
> v4, with a test case and some more comment-polishing.  I think this
> is committable.

Awesome.  What's the business with enable_hashagg in the regression
test stuff?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
v4, with a test case and some more comment-polishing.  I think this
is committable.

			regards, tom lane

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ce47f1d..ad9eba6 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -47,17 +47,26 @@
  * greater than any 32-bit integer here so that values < 2^32 can be used
  * by individual parallel nodes to store their own state.
  */
-#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE000000000000001)
-#define PARALLEL_KEY_PARAMS				UINT64CONST(0xE000000000000002)
-#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE000000000000003)
-#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE000000000000004)
-#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE000000000000005)
-#define PARALLEL_KEY_DSA				UINT64CONST(0xE000000000000006)
-#define PARALLEL_KEY_QUERY_TEXT			UINT64CONST(0xE000000000000007)
+#define PARALLEL_KEY_EXECUTOR_FIXED		UINT64CONST(0xE000000000000001)
+#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE000000000000002)
+#define PARALLEL_KEY_PARAMS				UINT64CONST(0xE000000000000003)
+#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE000000000000004)
+#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE000000000000005)
+#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE000000000000006)
+#define PARALLEL_KEY_DSA				UINT64CONST(0xE000000000000007)
+#define PARALLEL_KEY_QUERY_TEXT			UINT64CONST(0xE000000000000008)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
 /*
+ * Fixed-size random stuff that we need to pass to parallel workers.
+ */
+typedef struct FixedParallelExecutorState
+{
+	int64		tuples_needed;	/* tuple bound, see ExecSetTupleBound */
+} FixedParallelExecutorState;
+
+/*
  * DSM structure for accumulating per-PlanState instrumentation.
  *
  * instrument_options: Same meaning here as in instrument.c.
@@ -381,12 +390,14 @@ ExecParallelReinitialize(ParallelExecutorInfo *pei)
 * execution and return results to the main backend.
 */
 ParallelExecutorInfo *
-ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
+ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers,
+					 int64 tuples_needed)
 {
 	ParallelExecutorInfo *pei;
 	ParallelContext *pcxt;
 	ExecParallelEstimateContext e;
 	ExecParallelInitializeDSMContext d;
+	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
 	char	   *param_space;
@@ -418,6 +429,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * for the various things we need to store.
 	 */
 
+	/* Estimate space for fixed-size state. */
+	shm_toc_estimate_chunk(&pcxt->estimator,
+						   sizeof(FixedParallelExecutorState));
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for query text. */
 	query_len = strlen(estate->es_sourceText);
 	shm_toc_estimate_chunk(&pcxt->estimator, query_len);
@@ -487,6 +503,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * asked for has been allocated or initialized yet, though, so do that.
 	 */
 
+	/* Store fixed-size state. */
+	fpes = shm_toc_allocate(pcxt->toc, sizeof(FixedParallelExecutorState));
+	fpes->tuples_needed = tuples_needed;
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECUTOR_FIXED, fpes);
+
 	/* Store query string */
 	query_string = shm_toc_allocate(pcxt->toc, query_len);
 	memcpy(query_string, estate->es_sourceText, query_len);
@@ -833,6 +854,7 @@ ExecParallelInitializeWorker(PlanState *planstate, shm_toc *toc)
 void
 ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 {
+	FixedParallelExecutorState *fpes;
 	BufferUsage *buffer_usage;
 	DestReceiver *receiver;
 	QueryDesc  *queryDesc;
@@ -841,6 +863,9 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	void	   *area_space;
 	dsa_area   *area;
 
+	/* Get fixed-size state. */
+	fpes = shm_toc_lookup(toc, PARALLEL_KEY_EXECUTOR_FIXED, false);
+
 	/* Set up DestReceiver, SharedExecutorInstrumentation, and QueryDesc. */
 	receiver = ExecParallelGetReceiver(seg, toc);
 	instrumentation = shm_toc_lookup(toc, PARALLEL_KEY_INSTRUMENTATION, true);
@@ -868,8 +893,17 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	queryDesc->planstate->state->es_query_dsa = area;
 	ExecParallelInitializeWorker(queryDesc->planstate, toc);
 
-	/* Run the plan */
-	ExecutorRun(queryDesc, ForwardScanDirection, 0L, true);
+	/* Pass down any tuple bound */
+	ExecSetTupleBound(fpes->tuples_needed, queryDesc->planstate);
+
+	/*
+	 * Run the plan.  If we specified a tuple bound, be careful not to demand
+	 * more tuples than that.
+	 */
+	ExecutorRun(queryDesc,
+				ForwardScanDirection,
+				fpes->tuples_needed < 0 ? (int64) 0 : fpes->tuples_needed,
+				true);
 
 	/* Shut down the executor */
 	ExecutorFinish(queryDesc);
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 36d2914..c1aa506 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 12:05 PM, Tom Lane wrote:
> Hm.  It seemed pretty reliable for me, but we already know that the
> parallel machinery is quite timing-sensitive.  I think we'd better
> incorporate a regression test case into the committed patch so we
> can see what the buildfarm thinks.  I'll see about adding one.

That would be great.  I didn't have a great idea how to write a test
for this -- maybe one of my systematic failures as a developer.  :-(

>> I think that the comment at the bottom of ExecSetTupleBound should
>> probably be rethought a bit in light of these changes.
>
> Agreed; I'll revisit all the comments, actually.

That also sounds great.

> No, I think that's fundamentally the wrong idea.  The tuple bound
> inherently is a piece of information that has to apply across a whole
> scan.  If you pass it to ExecProcNode then you'll have to get into
> API contracts about whether it's allowed to change across calls,
> which is a mess, even aside from efficiency concerns (and for
> ExecProcNode, I *am* concerned about efficiency).

Fair.

> You could imagine adding it as a parameter to ExecReScan, instead, but
> then there are a bunch of issues with how that would interact with the
> existing optimizations for skipped/delayed/merged rescan calls (cf.
> the existing open issue with parallel rescans).  That is a can of worms
> I'd just as soon not open.
>
> I'm content to leave it as a separate call.  I still think that worrying
> about extra C function calls for this purpose is, at best, premature
> optimization.

OK, you may well be right.  Outside of Limit, I suppose we're not
likely to set any limit other than 1, and if we set a limit of 1 it'll
probably be a limit that survives rescans, because it'll be based on
the parent node not caring about anything more than the first tuple.

BTW, I think another reason why we're likely to eventually want to get
very aggressive about passing down bounds is batch execution.  That's
somewhere on Andres's list of things that could speed up the executor,
although I believe at present there are quite a few other bottlenecks
that are more urgent.  But certainly if we start doing that, then even
things like SeqScan are going to want to know about applicable limit
clauses.

> I'll do another pass over the patch and get back to you.  I've not
> looked at the instrumentation patch yet --- is there a reason to
> worry about it before we finish this one?

I think they're independent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> I had the same thought.  Done in the attached revision of your
> version.  I couldn't consistently reproduce the failure you saw -- it
> failed for me only about 1 time in 10 -- but I don't think it's
> failing any more with this version.

Hm.  It seemed pretty reliable for me, but we already know that the
parallel machinery is quite timing-sensitive.  I think we'd better
incorporate a regression test case into the committed patch so we
can see what the buildfarm thinks.  I'll see about adding one.

> I think that the comment at the bottom of ExecSetTupleBound should
> probably be rethought a bit in light of these changes.

Agreed; I'll revisit all the comments, actually.

> The details aside, Konstantin Knizhnik's proposal to teach this code
> about ForeignScans is yet another independent way for this to win, and
> I think that will also be quite a large win.

+1

> It's possible that we should work out some of these things at plan
> time rather than execution time, and it's also possible that a
> separate call to ExecSetTupleBound is too much work.  I've mulled over
> the idea of whether we should get rid of this mechanism altogether in
> favor of passing the tuple bound as an additional argument to
> ExecProcNode, because it seems silly to call node-specific code once
> to set a (normally small, possibly 1) bound and then call it again to
> get a (or the) tuple.

No, I think that's fundamentally the wrong idea.  The tuple bound
inherently is a piece of information that has to apply across a whole
scan.  If you pass it to ExecProcNode then you'll have to get into
API contracts about whether it's allowed to change across calls,
which is a mess, even aside from efficiency concerns (and for
ExecProcNode, I *am* concerned about efficiency).

You could imagine adding it as a parameter to ExecReScan, instead, but
then there are a bunch of issues with how that would interact with the
existing optimizations for skipped/delayed/merged rescan calls (cf.
the existing open issue with parallel rescans).  That is a can of worms
I'd just as soon not open.

I'm content to leave it as a separate call.  I still think that worrying
about extra C function calls for this purpose is, at best, premature
optimization.

I'll do another pass over the patch and get back to you.  I've not
looked at the instrumentation patch yet --- is there a reason to
worry about it before we finish this one?

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 11:15 AM, Tom Lane wrote:
> The problem I exhibited with the updated patch could probably be resolved
> if the top level logic in a parallel worker knows about tuples_needed
> and doesn't try to pull more than that from its subplan.  So what you're
> describing isn't just an additional optimization, it's necessary to make
> this work at all.

I had the same thought.  Done in the attached revision of your
version.  I couldn't consistently reproduce the failure you saw -- it
failed for me only about 1 time in 10 -- but I don't think it's
failing any more with this version.

I think that the comment at the bottom of ExecSetTupleBound should
probably be rethought a bit in light of these changes.  Teaching the
parallel worker about tuples_needed may not be JUST an additional
optimization, but it IS an additional optimization; IOW, what the
planner can put between Sort and Limit will no longer be the only
relevant consideration.

The details aside, Konstantin Knizhnik's proposal to teach this code
about ForeignScans is yet another independent way for this to win, and
I think that will also be quite a large win.  In the future, I believe
that we might want to use a mechanism like this in even more cases.
For example, a Semi Join or Anti Join only needs 1 row from the inner
side per execution.  There's probably no reason to put a Sort under
the inner side of a Semi Join or Anti Join, but it's certainly not
impossible to have a Foreign Scan there.  It's likely to be an
expensive plan, but it's a heck of a lot more expensive if we fetch
100 rows when we only need 1.  I'm not sure the existing mechanism
will stretch to handle such cases.

It's possible that we should work out some of these things at plan
time rather than execution time, and it's also possible that a
separate call to ExecSetTupleBound is too much work.  I've mulled over
the idea of whether we should get rid of this mechanism altogether in
favor of passing the tuple bound as an additional argument to
ExecProcNode, because it seems silly to call node-specific code once
to set a (normally small, possibly 1) bound and then call it again to
get a (or the) tuple.  But I'm not sure passing it to ExecProcNode is
really the right idea; it's adding some overhead for something that is
a bit of a special case, and I'm not sure it will work out cleanly
otherwise, either.  I'd be interested in your thoughts if you have
any.  My intuition is that there's substantially more to be gained in
this area, but exactly how best to gain it is not very clear to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: push-down-bound-to-workers-3.patch (binary data)
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> However, there's another opportunity for optimization here, which is
> the Limit -> Gather -> Anything case.  A worker which has itself
> generated enough tuples to fill the limit can certainly stop working,
> but right now it doesn't know that.  The patch I posted before didn't
> think about that case, but it seems to require only trivial
> modifications.

Ah, good point.

> I was going to post an updated patch trying to
> address that, but I see that you've already posted something, so I'll
> go have a look at that instead.

The problem I exhibited with the updated patch could probably be resolved
if the top level logic in a parallel worker knows about tuples_needed
and doesn't try to pull more than that from its subplan.  So what you're
describing isn't just an additional optimization, it's necessary to make
this work at all.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 9:24 AM, Tom Lane wrote:
> Basically, this argument is that we should contort the code in order
> to avoid tail recursion, rather than assuming the compiler will turn
> that recursion into iteration, even in cases where there is really
> zero evidence that we should be worrying about performance at all
> rather than correctness and readability.  I do not agree (and just
> finished pushing a patch to change it).

I'm not prepared to argue further about it.  This strikes me as one of
those things that is really a matter of judgement; I don't really care
much about how the code ends up but I think your portrayal of what was
there is unduly negative.

>> On another note, here's a second patch applying on top of my earlier
>> patch to push down Limit through Gather and Gather Merge.
>
> Can you demonstrate any case where the planner would put Gather between
> Sort and Limit?  The GatherMerge case is probably interesting, but I'm
> having doubts about Gather.

That's an interesting point.  I was thinking that we needed both of
these patches in order to make the proposed regression test pass,
because in the relevant case we'll have Gather -> Limit -> Sort, but
actually we only need the second one, because as you point out here
what matters is what's between the Limit and the Sort.  I was thinking
that we'd need to be able to see through the Gather at the top of the
plan, but we don't.

However, there's another opportunity for optimization here, which is
the Limit -> Gather -> Anything case.  A worker which has itself
generated enough tuples to fill the limit can certainly stop working,
but right now it doesn't know that.  The patch I posted before didn't
think about that case, but it seems to require only trivial
modifications.  I was going to post an updated patch trying to address
that, but I see that you've already posted something, so I'll go have
a look at that instead.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Amit Kapila writes:
> On Fri, Aug 25, 2017 at 7:35 PM, Tom Lane wrote:
>> it would be stupid to put a filter on that node rather than its
>> children, but I see this in both nodeGather.c and nodeGatherMerge.c:
>>
>>     gatherstate->ps.qual =
>>         ExecInitQual(node->plan.qual, (PlanState *) gatherstate);
>>
>> It doesn't look like the qual is actually used anywhere in either node
>> type.  Am I right in thinking this is dead code?

> I also think so.  I think this was required in some initial versions
> of the gather node patch, where we were thinking of having a single
> node (instead of what we have now: a Gather node with a partial scan
> node beneath it) to perform parallel scans.

Thanks for confirming.  I'll replace that with something like
Assert(!node->plan.qual).  I have some other code-review-ish fixes to
do in nodeGatherMerge, too.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> I'm inclined to commit both of these after a little more testing and
> self-review, but let me know if anyone else wants to review first.

Attached is an updated version of the first patch, which I rebased
over 3f4c7917b, improved a bunch of the comments for, and fixed a
couple of obvious typos in.

However, when I went to test it, it blew up real good:

regression=# set parallel_setup_cost=0;
SET
regression=# set parallel_tuple_cost=0;
SET
regression=# set min_parallel_table_scan_size=0;
SET
regression=# set max_parallel_workers_per_gather=4;
SET
regression=# explain analyze select * from tenk1 order by fivethous limit 4;
ERROR:  retrieved too many tuples in a bounded sort
CONTEXT:  parallel worker

The cause is obvious: GatherMerge doesn't know about the contract that
it's not supposed to pull more than tuples_needed rows from an input
after promising not to do so.  I am not convinced that it's worth
adding the logic that would be needed to make that happen, so my
inclination is to abandon this patch.  But here it is if you want to
push further.

			regards, tom lane

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index ce47f1d..1287025 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -47,17 +47,26 @@
  * greater than any 32-bit integer here so that values < 2^32 can be used
  * by individual parallel nodes to store their own state.
  */
-#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE000000000000001)
-#define PARALLEL_KEY_PARAMS				UINT64CONST(0xE000000000000002)
-#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE000000000000003)
-#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE000000000000004)
-#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE000000000000005)
-#define PARALLEL_KEY_DSA				UINT64CONST(0xE000000000000006)
-#define PARALLEL_KEY_QUERY_TEXT			UINT64CONST(0xE000000000000007)
+#define PARALLEL_KEY_EXECUTOR_FIXED		UINT64CONST(0xE000000000000001)
+#define PARALLEL_KEY_PLANNEDSTMT		UINT64CONST(0xE000000000000002)
+#define PARALLEL_KEY_PARAMS				UINT64CONST(0xE000000000000003)
+#define PARALLEL_KEY_BUFFER_USAGE		UINT64CONST(0xE000000000000004)
+#define PARALLEL_KEY_TUPLE_QUEUE		UINT64CONST(0xE000000000000005)
+#define PARALLEL_KEY_INSTRUMENTATION	UINT64CONST(0xE000000000000006)
+#define PARALLEL_KEY_DSA				UINT64CONST(0xE000000000000007)
+#define PARALLEL_KEY_QUERY_TEXT			UINT64CONST(0xE000000000000008)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE		65536
 
 /*
+ * Fixed-size random stuff that we need to pass to parallel workers.
+ */
+typedef struct FixedParallelExecutorState
+{
+	int64		tuples_needed;	/* tuple bound, see ExecSetTupleBound */
+} FixedParallelExecutorState;
+
+/*
  * DSM structure for accumulating per-PlanState instrumentation.
  *
  * instrument_options: Same meaning here as in instrument.c.
@@ -381,12 +390,14 @@ ExecParallelReinitialize(ParallelExecutorInfo *pei)
 * execution and return results to the main backend.
 */
 ParallelExecutorInfo *
-ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
+ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers,
+					 int64 tuples_needed)
 {
 	ParallelExecutorInfo *pei;
 	ParallelContext *pcxt;
 	ExecParallelEstimateContext e;
 	ExecParallelInitializeDSMContext d;
+	FixedParallelExecutorState *fpes;
 	char	   *pstmt_data;
 	char	   *pstmt_space;
 	char	   *param_space;
@@ -418,6 +429,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * for the various things we need to store.
 	 */
 
+	/* Estimate space for fixed-size state. */
+	shm_toc_estimate_chunk(&pcxt->estimator,
+						   sizeof(FixedParallelExecutorState));
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Estimate space for query text. */
 	query_len = strlen(estate->es_sourceText);
 	shm_toc_estimate_chunk(&pcxt->estimator, query_len);
@@ -487,6 +503,11 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate, int nworkers)
 	 * asked for has been allocated or initialized yet, though, so do that.
 	 */
 
+	/* Store fixed-size state. */
+	fpes = shm_toc_allocate(pcxt->toc, sizeof(FixedParallelExecutorState));
+	fpes->tuples_needed = tuples_needed;
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_EXECUTOR_FIXED, fpes);
+
 	/* Store query string */
 	query_string = shm_toc_allocate(pcxt->toc, query_len);
 	memcpy(query_string, estate->es_sourceText, query_len);
@@ -833,6 +854,7 @@ ExecParallelInitializeWorker(PlanState *planstate, shm_toc *toc)
 void
 ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 {
+	FixedParallelExecutorState *fpes;
 	BufferUsage *buffer_usage;
 	DestReceiver *receiver;
 	QueryDesc  *queryDesc;
@@ -841,6 +863,9 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 	void	   *area_space;
 	dsa_area   *area;
 
+	/* Get fixed-size state. */
+	fpes = shm_toc_lookup(toc, PARALLEL_KEY_EXECUTOR_FIXED, false);
+
 	/* Set up DestReceiver, SharedExecutorInstrumentation, and Query
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 25, 2017 at 7:35 PM, Tom Lane wrote:
> Robert Haas writes:
>> I'm inclined to commit both of these after a little more testing and
>> self-review, but let me know if anyone else wants to review first.
>
> Looking through the first one, I wondered whether we needed to
> check for a qual expression on Gather or GatherMerge.  It seems like
> it would be stupid to put a filter on that node rather than its
> children, but I see this in both nodeGather.c and nodeGatherMerge.c:
>
>     /*
>      * initialize child expressions
>      */
>     gatherstate->ps.qual =
>         ExecInitQual(node->plan.qual, (PlanState *) gatherstate);
>
> It doesn't look like the qual is actually used anywhere in either node
> type.  Am I right in thinking this is dead code?

I also think so.  I think this was required in some initial versions
of the gather node patch, where we were thinking of having a single
node (instead of what we have now: a Gather node with a partial scan
node beneath it) to perform parallel scans.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> I'm inclined to commit both of these after a little more testing and
> self-review, but let me know if anyone else wants to review first.

Looking through the first one, I wondered whether we needed to
check for a qual expression on Gather or GatherMerge.  It seems like
it would be stupid to put a filter on that node rather than its
children, but I see this in both nodeGather.c and nodeGatherMerge.c:

    /*
     * initialize child expressions
     */
    gatherstate->ps.qual =
        ExecInitQual(node->plan.qual, (PlanState *) gatherstate);

It doesn't look like the qual is actually used anywhere in either node
type.  Am I right in thinking this is dead code?

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On Thu, Aug 24, 2017 at 2:24 PM, Tom Lane wrote:
>> I'd have been okay with changing the entire function if it could still be
>> doing all cases the same way.  But the exception for merge-append (and
>> probably soon other cases) means you still end up with a readability
>> problem.

> I don't really agree with that.  I think it's reasonable to handle the
> cases where we are just looking through nodes differently than the
> others.  Just because we can't conveniently avoid recursing in every
> case doesn't mean, to me, that we should recurse even when it's easily
> avoidable.

Basically, this argument is that we should contort the code in order
to avoid tail recursion, rather than assuming the compiler will turn
that recursion into iteration, even in cases where there is really
zero evidence that we should be worrying about performance at all
rather than correctness and readability.  I do not agree (and just
finished pushing a patch to change it).

> On another note, here's a second patch applying on top of my earlier
> patch to push down Limit through Gather and Gather Merge.

Can you demonstrate any case where the planner would put Gather between
Sort and Limit?  The GatherMerge case is probably interesting, but I'm
having doubts about Gather.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Thu, Aug 24, 2017 at 2:24 PM, Tom Lane wrote:
> I'd have been okay with changing the entire function if it could still be
> doing all cases the same way.  But the exception for merge-append (and
> probably soon other cases) means you still end up with a readability
> problem.

I don't really agree with that.  I think it's reasonable to handle the
cases where we are just looking through nodes differently than the
others.  Just because we can't conveniently avoid recursing in every
case doesn't mean, to me, that we should recurse even when it's easily
avoidable.  I think what we might think about doing is have a node
that loops over Subquery Scan and Result and drills down, and then
handle the rest of the cases either as at present or, perhaps, with a
switch.

On another note, here's a second patch applying on top of my earlier
patch to push down Limit through Gather and Gather Merge.  This one
propagates information about sorts performed in parallel workers back
to the leader and adjusts the test case to use a non-temp table again.
This second patch has the nice property of being a fairly good test
that the first patch is actually doing something.

I'm inclined to commit both of these after a little more testing and
self-review, but let me know if anyone else wants to review first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: propagate-sort-instrumentation.patch (binary data)
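Since the attachment is binary in the archive, the gist of the
propagation mechanism: each worker, once its sort finishes, writes its
statistics into its own slot of the shared-memory array, and the leader
later reads the filled slots back for EXPLAIN.  A sketch of the
worker-side step (names follow the structs sketched earlier in this
thread view; the am_worker flag is an assumption about the patch's
bookkeeping, so this is illustrative rather than the verbatim patch):

    /* in ExecSort(), just after the sort completes, when in a worker */
    if (node->shared_info && node->am_worker)
    {
        TuplesortInstrumentation *si;

        Assert(IsParallelWorker());
        Assert(ParallelWorkerNumber <= node->shared_info->num_workers - 1);
        si = &node->shared_info->sinstrument[ParallelWorkerNumber];
        tuplesort_get_stats(tuplesortstate, si);
    }

Slots a worker never fills remain at SORT_TYPE_STILL_IN_PROGRESS, which
is exactly what the explain.c loop in the reviewed patch skips over.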
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Douglas Doole writes:
>> TBH I dislike the fact that
>> you did the subquery case randomly differently from the existing cases;
>> it should just have been added as an additional recursive case.  Since
>> this is done only once at query startup, worrying about hypothetical
>> micro-performance issues seems rather misguided.

> Habit mostly - why write code with potential performance problems when
> the alternative is just as easy to read?

I disagree that having adjacent code doing the same thing in two
different ways is "easy to read"; what it is is confusing.

More globally, since we're dealing with community code that will be
read and hacked on by many people, I think we need to prioritize
simplicity, readability, and extensibility over micro-efficiency.
I'm willing to compromise those goals for efficiency where a clear
need has been demonstrated, but no such need has been shown here, nor
do I think it's likely that one can be shown.

I'd have been okay with changing the entire function if it could still
be doing all cases the same way.  But the exception for merge-append
(and probably soon other cases) means you still end up with a
readability problem.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Douglas Doole writes:
> My first thought was to do a regex over the explain output to mask the
> troublesome value, but I couldn't figure out how to make it work and
> didn't find an example (admittedly didn't spend a huge amount of time
> hunting though).  I'd love to see how to put it together.

I did it like this:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1177ab1dabf72bafee8f19d904cee3a299f25892

			regards, tom lane
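The technique in that commit is a plpgsql wrapper that runs the EXPLAIN
and masks the run-to-run-variable memory figure before returning each
line.  A minimal sketch of the approach (not necessarily the verbatim
committed test; the table and query here are illustrative):

    create function explain_sq_limit() returns setof text
    language plpgsql as
    $$
    declare ln text;
    begin
        for ln in
            explain (analyze, summary off, timing off, costs off)
            select * from (select pk, c2 from sq_limit
                           order by c1, pk) as x limit 3
        loop
            -- hide "Memory: NNNkB", which varies across runs/platforms
            ln := regexp_replace(ln, 'Memory: \S*', 'Memory: xxx');
            return next ln;
        end loop;
    end;
    $$;

    select * from explain_sq_limit();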
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> No.  You can't easily avoid recursion for the merge-append case, since
> that has to descend to multiple children.

Agreed.  That's why it's not in the while loop in my sketch of the
suggested rework.

> TBH I dislike the fact that
> you did the subquery case randomly differently from the existing cases;
> it should just have been added as an additional recursive case.  Since
> this is done only once at query startup, worrying about hypothetical
> micro-performance issues seems rather misguided.

Habit mostly - why write code with potential performance problems when
the alternative is just as easy to read?

I disagree with saying "it's just done at query startup so don't worry
about it too much".  Back in my days with DB2 we were working on the
TPCC benchmark (when it was still relevant) and we found that the
startup cost was a huge factor when doing rapid fire, cached, simple
queries.  In many cases the startup cost was >50% of the overall query
execution time.  Our testing here in Salesforce is showing a similar
pattern in some of our workloads and PostgreSQL.

I'm not trying to argue that this particular issue of recurse/don't
recurse is going to make or break anything.  It's just where my head's
at when I look at the code.

- Doug
Salesforce
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
I wrote:
> To get the buildfarm back to green, I agree with the suggestion of
> setting force_parallel_mode=off for this test, at least for now.

Actually, there's an easier way: we can just make the test table be
TEMP.  That is a good idea anyway to prevent possible interference
from auto-analyze, which conceivably could come along at just the
wrong time and cause a plan change.

			regards, tom lane
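In SQL terms, something like this in the test setup (a sketch with
illustrative data, not necessarily the committed script):

    -- temp tables are skipped by autovacuum/auto-analyze and are not
    -- eligible for parallel scans, so the chosen plan stays stable
    create temp table sq_limit (pk int primary key, c1 int, c2 int);
    insert into sq_limit
        select i, i % 4 + 1, i % 4 + 1 from generate_series(1, 8) i;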
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> I don't greatly like the way that the regression test case filters
> the output; it hides too much IMO.  I'd be inclined to try to return
> the EXPLAIN output with as much detail preserved as possible.  Maybe
> we could use regex substitution on lines of the output to get rid of
> stuff that won't be invariant.

My first thought was to do a regex over the explain output to mask the
troublesome value, but I couldn't figure out how to make it work and
didn't find an example (admittedly didn't spend a huge amount of time
hunting though).  I'd love to see how to put it together.

Doug
- Salesforce
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> Buildfarm members with force_parallel_mode=regress are failing now.  I
> haven't had a chance to investigate exactly what's going on here, but
> I think there are probably several issues:

> 1. It's definitely the case that the details about a sort operation
> aren't propagated from a worker back to the leader.  This has elicited
> complaint previously.

Check; this must be the explanation for the buildfarm failures.  That'd
be worth fixing but it seems like material for an independent patch.

> 2. pass_down_bound() probably gets the Gather node at the top of the
> tree and stops, since it doesn't know how to pass down a bound through
> a Gather.  We could probably teach it to pass down the bound through
> Gather or Gather Merge on the same theory as Append/MergeAppend.

While that might be worth doing, it's not the issue here, because the
forced Gather is on top of the Limit node.

> In the short run, I'm not sure we have a better alternative than
> removing this test - unless somebody has a better idea? - but it would
> be good to work on all of the above problems.

To get the buildfarm back to green, I agree with the suggestion of
setting force_parallel_mode=off for this test, at least for now.

I don't greatly like the way that the regression test case filters
the output; it hides too much IMO.  I'd be inclined to try to return
the EXPLAIN output with as much detail preserved as possible.  Maybe
we could use regex substitution on lines of the output to get rid of
stuff that won't be invariant.

Unless you're already on it, I can go do that.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Douglas Doole writes:
> Would we be better off moving those cases into the while loop I added to
> avoid the recursion?

No.  You can't easily avoid recursion for the merge-append case, since
that has to descend to multiple children.  TBH I dislike the fact that
you did the subquery case randomly differently from the existing cases;
it should just have been added as an additional recursive case.  Since
this is done only once at query startup, worrying about hypothetical
micro-performance issues seems rather misguided.

Perhaps, since this function is recursive, it ought to have a
check_stack_depth call, just to be safe.  I'm dubious though that it
could ever eat more stack than the preceding node-initialization calls,
so maybe we needn't bother.

In the long run I wonder whether we should convert "push down bound"
into a generic node operation like the others managed by execAmi.c.
It seems like you could push a limit through any node type that's
guaranteed not to reduce the number of rows returned, and so we're
missing cases.  Maybe they're not ones that are important in practice,
but I'm unsure.

> The two casts I added (to SubqueryScanState and Node) should probably be
> changed to castNode() calls.

The SubqueryScanState cast seems fine given that it immediately follows
an IsA() check.  You can't use castNode() for a cast to Node, because
that's a generic not a specific node type.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Robert Haas writes:
> On Wed, Aug 23, 2017 at 6:55 PM, Douglas Doole wrote:
>> From previous experience, pushing the limit to the workers has the
>> potential to be a big win.

> Here's a not-really-tested draft patch for that.

Both this patch and the already-committed one contain useless (and not
very cheap) expression_returns_set tests.  Since commit 69f4b9c85,
SRFs could only appear in the tlist of ProjectSet nodes.

No opinion on whether this patch is right otherwise.

			regards, tom lane
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> Here's a not-really-tested draft patch for that.

I don't know the parallel code so I'm not going to comment on the
overall patch, but a thought on ExecSetTupleBound().

That function is getting a number of cases where we're doing recursion
for a single child (result, gather, gather-merge).  Recursion for a
single child isn't particularly efficient.  I know that if there's a
single case of recursion like this, compilers will frequently turn it
into a loop, but I don't know if compilers can optimize a branched case
like this.

Would we be better off moving those cases into the while loop I added
to avoid the recursion?  So we'd end up with something like:

    while ()
    {
        if subquery
        else if result
        else if gather
        else if gather merge
    }
    if sort
    else if merge append

And a nit from my original fix now that I've looked at the pg10 code
more.  The two casts I added (to SubqueryScanState and Node) should
probably be changed to castNode() calls.

- Doug
Salesforce
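For concreteness, that loop shape could be rendered in C roughly as
follows.  This is an illustrative skeleton against the executor
structures of that vintage, not the committed code (which, per the
discussion above, stayed recursive); it also ignores bound-clearing for
a negative tuples_needed and omits the Gather cases:

    static void
    ExecSetTupleBound(int64 tuples_needed, PlanState *child_node)
    {
        if (tuples_needed < 0)
            return;             /* sketch: no bound to pass down */

        /* Walk down through nodes that merely pass tuples through. */
        for (;;)
        {
            if (IsA(child_node, SubqueryScanState))
            {
                SubqueryScanState *sss = (SubqueryScanState *) child_node;

                if (sss->ss.ps.qual != NULL)
                    return;     /* a filter may discard rows; can't push */
                child_node = sss->subplan;
            }
            else if (IsA(child_node, ResultState))
            {
                /* planner may stick a projection Result atop the plan */
                if (outerPlanState(child_node) == NULL)
                    return;
                child_node = outerPlanState(child_node);
            }
            else
                break;
        }

        if (IsA(child_node, SortState))
        {
            SortState  *sortState = (SortState *) child_node;

            sortState->bounded = true;
            sortState->bound = tuples_needed;
        }
        else if (IsA(child_node, MergeAppendState))
        {
            MergeAppendState *maState = (MergeAppendState *) child_node;
            int         i;

            /* must recurse here: multiple children */
            for (i = 0; i < maState->ms_nplans; i++)
                ExecSetTupleBound(tuples_needed, maState->mergeplans[i]);
        }
    }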
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Wed, Aug 23, 2017 at 6:55 PM, Douglas Doole wrote:
> From previous experience, pushing the limit to the workers has the
> potential to be a big win.

Here's a not-really-tested draft patch for that.

> I haven't played with parallel mode at all yet.  Is it possible to
> disable force_parallel_mode for the new test?

Yeah, I suppose that's another alternative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: push-down-bound-to-workers.patch (binary data)
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> Buildfarm members with force_parallel_mode=regress are failing now.  I
> haven't had a chance to investigate exactly what's going on here, but
> I think there are probably several issues:

Not an auspicious beginning for my first patch :-(

> 3. However, even if it did that, it would only affect the leader, not
> the workers, because each worker will of course have a separate copy
> of each executor state node.  We could fix that by having the Gather
> or Gather Merge node also store the bound and propagate that to each
> worker via DSM, and then have each worker call pass_down_bound itself.
> (This would require a bit of refactoring of the API for
> pass_down_bound, but it looks doable.)

From previous experience, pushing the limit to the workers has the
potential to be a big win.

> In the short run, I'm not sure we have a better alternative than
> removing this test - unless somebody has a better idea? - but it would
> be good to work on all of the above problems.

I haven't played with parallel mode at all yet.  Is it possible to
disable force_parallel_mode for the new test?  If not, then I'd agree
that removing the test is probably the only reasonable short term fix.

- Doug
Salesforce
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Mon, Aug 21, 2017 at 2:43 PM, Robert Haas wrote:
> Works for me.  While I'm sure this won't eclipse previous achievements
> in this area, it still seems worthwhile.  So committed with one minor
> change to shorten a long-ish line.

Buildfarm members with force_parallel_mode=regress are failing now.  I
haven't had a chance to investigate exactly what's going on here, but
I think there are probably several issues:

1. It's definitely the case that the details about a sort operation
aren't propagated from a worker back to the leader.  This has elicited
complaint previously.

2. pass_down_bound() probably gets the Gather node at the top of the
tree and stops, since it doesn't know how to pass down a bound through
a Gather.  We could probably teach it to pass down the bound through
Gather or Gather Merge on the same theory as Append/MergeAppend.

3. However, even if it did that, it would only affect the leader, not
the workers, because each worker will of course have a separate copy
of each executor state node.  We could fix that by having the Gather
or Gather Merge node also store the bound and propagate that to each
worker via DSM, and then have each worker call pass_down_bound itself.
(This would require a bit of refactoring of the API for
pass_down_bound, but it looks doable.)

In the short run, I'm not sure we have a better alternative than
removing this test - unless somebody has a better idea? - but it would
be good to work on all of the above problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Wed, Aug 23, 2017 at 12:56 PM, Konstantin Knizhnik wrote:
> On 22.08.2017 17:27, Konstantin Knizhnik wrote:
>> On 18.08.2017 04:33, Robert Haas wrote:
>>> It seems like a somewhat ad-hoc approach; it supposes that we can take
>>> any query produced by deparseSelectStmtForRel() and stick a LIMIT
>>> clause onto the very end and all will be well.  Maybe that's not a
>>> problematic assumption, not sure.  The grammar happens to allow both
>>> FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the latter
>>> syntax is documented.
>>
>> I am not absolutely sure that it is possible to append any query which
>> can be constructed by postgres_fdw for foreign scan with "LIMIT n"
>> clause.  But I also do not know an example where it is not possible.
>> As you have mentioned, "FOR UPDATE LIMIT n" is currently recognized by
>> Postgres.
>
> I have inspected the deparseSelectStmtForRel function and now I am sure
> that appending LIMIT to the SQL statement generated by this function
> will not cause any problems.  It can produce only the following subset
> of SELECT:
>
>     SELECT <target list> FROM <table list>
>         [ GROUP BY ... [ HAVING ... ] ] [ ORDER BY ... ] [ FOR UPDATE ... ];
>
> The only suspicious clause is FOR UPDATE, but I have checked that
> "FOR UPDATE ... LIMIT n" is really accepted by the Postgres parser.

There are two ways we can do this.

1. Implement limit handling in postgresGetForeignUpperPaths() when it
gets UPPERREL_FINAL (or we might want to break this rel into a final
rel and a limit rel).  We then add paths for limit processing to the
final rel and add corresponding handling in deparseSelectStmtForRel().
If the foreign path happens to be cheaper, the LIMIT gets pushed down
to the foreign server.  This approach does not take care of LIMITs
passed down by pass_down_bound(), but it will take care of deparsing
the query correctly and can handle the OFFSET clause as well.

2. Konstantin's approach takes care of LIMITs passed down by
pass_down_bound(), but "always" pushes down the LIMIT and assumes that
a LIMIT clause can be appended to the query already generated.  It
also doesn't push down the OFFSET clause even when it's possible to
push it down.

Both of these seem sane choices.  If we could defer deparsing the
query to execution time we might be able to achieve both targets.
That would have one more advantage: we could pass parameters embedded
in the query, rather than binding them.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On 22.08.2017 17:27, Konstantin Knizhnik wrote:
> On 18.08.2017 04:33, Robert Haas wrote:
>> It seems like a somewhat ad-hoc approach; it supposes that we can take
>> any query produced by deparseSelectStmtForRel() and stick a LIMIT clause
>> onto the very end and all will be well. Maybe that's not a problematic
>> assumption, not sure. The grammar happens to allow both FOR UPDATE LIMIT
>> n and LIMIT n FOR UPDATE even though only the latter syntax is
>> documented.
>
> I am not absolutely sure that it is possible to append a "LIMIT n" clause
> to any query which can be constructed by postgres_fdw for a foreign scan.
> But I also do not know of an example where it is not possible. As you
> have mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.

I have inspected the deparseSelectStmtForRel function and now I am sure that appending LIMIT to the SQL statement generated by this function will not cause any problems. It can produce only the following subset of SELECT:

SELECT <targetlist> FROM <tables> [ GROUP BY ... [ HAVING ... ] ] [ ORDER BY ... ] [ FOR UPDATE ... ];

The only suspicious clause is FOR UPDATE, but I have checked that "FOR UPDATE ... LIMIT n" is really accepted by the Postgres parser.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On 18.08.2017 04:33, Robert Haas wrote:
> It seems like a somewhat ad-hoc approach; it supposes that we can take
> any query produced by deparseSelectStmtForRel() and stick a LIMIT clause
> onto the very end and all will be well. Maybe that's not a problematic
> assumption, not sure. The grammar happens to allow both FOR UPDATE LIMIT
> n and LIMIT n FOR UPDATE even though only the latter syntax is
> documented.

I am not absolutely sure that it is possible to append a "LIMIT n" clause to any query which can be constructed by postgres_fdw for a foreign scan. But I also do not know of an example where it is not possible. As you have mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.

Can you suggest how to implement limit push-down to FDWs in a better way? Move deparseSelectStmtForRel() from postgresGetForeignPlan to postgresIterateForeignScan? That seems to be problematic because much of the information required by deparseSelectStmtForRel is not available in postgresIterateForeignScan. In principle, it is possible to somehow propagate it there, but from my point of view that is not the right approach...

IMHO there is some contradiction in the Postgres optimizer: static information about the limit is not taken into account at the planning stage and is actually used only during query execution, when the pass_down_bound() function is called to propagate knowledge about the limit down through the plan nodes. Certainly I understand that this gives more flexibility: we can use information from previous steps of query execution which was not available at the planning stage. But pushing down the limit at the planning stage requires too many changes, while the proposed patch is very small and non-invasive. And in principle, it can be used not only by postgres_fdw, but also by other FDW implementations to push down information about the LIMIT.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
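To connect the two halves of this idea: the patch fragment quoted later in the thread shows only the create_cursor() side. The executor side would be a new case in pass_down_bound() filling the "limit" field that the patch evidently adds to ForeignScanState (only the create_cursor() hunk is visible here). The following is a reconstruction for illustration, not the exact patch:

    if (IsA(child_node, ForeignScanState))
    {
        ForeignScanState *fss = (ForeignScanState *) child_node;

        /* The remote side must produce OFFSET + LIMIT rows in total. */
        if (!node->noCount)
            fss->limit = node->count + node->offset; /* field added by the patch */
    }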
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Tue, Aug 22, 2017 at 3:43 AM, Robert Haas wrote: > Works for me. While I'm sure this won't eclipse previous achievements > in this area, it still seems worthwhile. This one is intentional per what happens in the US today? ;) -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 18, 2017 at 4:08 PM, Douglas Doole wrote: >> 4. I am pretty doubtful that "Memory: 25kB" is going to be stable >> enough for us to want that output memorialized in the regression ... > > Fair enough. I wanted to be a bit more sophisticated in my check than > looking for a single value so I worked out something that distills the > explain down to the key elements. Works for me. While I'm sure this won't eclipse previous achievements in this area, it still seems worthwhile. So committed with one minor change to shorten a long-ish line. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> 1. The header comment for pass_down_bound() could mention "one or more
> levels of subqueries" rather than "a subquery".

Fixed.

> 2. The first of the comments in the function body appears to have a
> whitespace issue that needs to be fixed manually or, better yet,
> addressed by pgindent.

Fixed.

> 3. The formatting of the comment in the regression tests appears to be
> unlike any other comment in that same file.

A side effect of inheriting it from our branches ;-) Reworked.

> 4. I am pretty doubtful that "Memory: 25kB" is going to be stable
> enough for us to want that output memorialized in the regression ...

Fair enough. I wanted to be a bit more sophisticated in my check than looking for a single value, so I worked out something that distills the explain down to the key elements.

*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***************
*** 308,313 **** recompute_limits(LimitState *node)
--- 308,316 ----
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input. We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept one or more levels of subqueries that have no quals or
+  * SRFs (that is, each subquery is just projecting columns) between the LIMIT
+  * and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***************
*** 320,325 **** recompute_limits(LimitState *node)
--- 323,349 ----
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+ 	 * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState *sortState = (SortState *) child_node;

*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***************
*** 1041,1043 **** NOTICE: x = 9, y = 13
--- 1041,1095 ----
  (3 rows)
  
  drop function tattle(x int, y int);
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+     (1, 1, 1),
+     (2, 2, 2),
+     (3, 3, 3),
+     (4, 4, 4),
+     (5, 1, 1),
+     (6, 2, 2),
+     (7, 3, 3),
+     (8, 4, 4);
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits. The goal here is that
+ -- we should see three notices, in order:
+ -- NOTICE: Limit
+ -- NOTICE: Subquery
+ -- NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+     declare x text;
+ begin
+     for x in
+         explain (analyze, summary off, timing off, costs off)
+         select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3
+     loop
+         if (left(ltrim(x), 5) = 'Limit') then
+             raise notice 'Limit';
+         end if;
+         if (left(ltrim(x), 12) = '-> Subquery') then
+             raise notice 'Subquery';
+         end if;
+         if (left(ltrim(x), 18) = 'Sort Method: top-N') then
+             raise notice 'Top-N Sort';
+         end if;
+     end loop;
+ end;
+ $$;
+ NOTICE: Limit
+ NOTICE: Subquery
+ NOTICE: Top-N Sort
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ ----+----
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;

*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***************
*** 540,542 **** select * from
--- 540,588 ----
  where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+     (1, 1, 1),
+     (2, 2, 2),
+     (3, 3, 3),
+     (4, 4, 4),
+     (5, 1, 1),
+     (6, 2, 2),
+     (7, 3, 3),
+     (8, 4, 4);
+ 
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits. The goal here is that
+ -- we should see three notices, in order:
+ -- NOTICE: Limit
+ -- NOTICE: Subquery
+ -- NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+     declare x text;
+ begin
+     for x in
+         explain (analyze, summary off, timing off, costs off)
+         select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3
+     loop
+         if (left(ltrim(x), 5) = 'Limit') then
+             raise notice 'Limit';
+         end if;
+         if (left(ltrim(x), 12) = '-> Subquery') then
+             raise notice 'Subquery';
+         end if;
+         if (left(ltrim(x), 18) = 'Sort Method: top-N') then
+             raise notice 'Top-N Sort';
+         end if;
+     end loop;
+ end;
+ $$;
+ 
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+ 
+ drop table sq_limit;
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Fri, Aug 18, 2017 at 11:42 AM, Douglas Doole wrote:
> Thanks for the feedback on my original patch Robert. Here's an updated
> patch that will tunnel through multiple SubqueryScanStates.

Seems reasonable. I have some assorted nitpicks.

1. The header comment for pass_down_bound() could mention "one or more levels of subqueries" rather than "a subquery".

2. The first of the comments in the function body appears to have a whitespace issue that needs to be fixed manually or, better yet, addressed by pgindent.

3. The formatting of the comment in the regression tests appears to be unlike any other comment in that same file.

4. I am pretty doubtful that "Memory: 25kB" is going to be stable enough for us to want that output memorialized in the regression tests. It seems like it might vary on different platforms - e.g. 32-bit vs. 64-bit - and it also seems like minor changes to how we do sorting could perturb it and, perhaps, make it unstable even if today it isn't. So I think it would be good to give this a bit more thought and see if you can come up with a way to test this without running afoul of that problem. Maybe adapt from this:

do $$
declare x text;
begin
    execute 'explain select 1' into x;
    if x !~ '^Result' then
        raise notice '%', x;
    else
        raise notice 'looks ok';
    end if;
end;$$;

BTW, regarding the other patch on this thread, it struck me that maybe it would be better to just reduce/limit the fetch count for the cursor instead of trying to inject LIMIT n into the query itself. That's not as good for query optimization purposes but it's a lot more future-proof.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
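Robert's closing suggestion can be sketched against fetch_more_data() in postgres_fdw, which really does build a "FETCH %d FROM c%u" command for the cursor. Clamping that count to the remaining bound avoids rewriting the remote query entirely; the tuple_bound/tuples_fetched bookkeeping below is invented for the sketch, not an existing API:

    /*
     * Hypothetical clamp in fetch_more_data(): fetch_size and
     * cursor_number are real PgFdwScanState fields; tuple_bound and
     * tuples_fetched are assumed to be plumbed in from the limit.
     */
    int fetch_count = fsstate->fetch_size;

    if (fsstate->tuple_bound > 0 &&
        fsstate->tuple_bound - fsstate->tuples_fetched < fetch_count)
        fetch_count = (int) (fsstate->tuple_bound - fsstate->tuples_fetched);

    snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
             fetch_count, fsstate->cursor_number);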
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Thanks for the feedback on my original patch, Robert. Here's an updated patch that will tunnel through multiple SubqueryScanStates.

- Doug
Salesforce

On Thu, Aug 17, 2017 at 6:33 PM Robert Haas wrote:
> On Thu, Aug 17, 2017 at 11:36 AM, Douglas Doole wrote:
>> I completely agree. The further a limit can be pushed down, the better.
>>
>> The patch looks good to me.
>
> It seems like a somewhat ad-hoc approach; it supposes that we can take any
> query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto
> the very end and all will be well. Maybe that's not a problematic
> assumption, not sure. The grammar happens to allow both FOR UPDATE LIMIT n
> and LIMIT n FOR UPDATE even though only the latter syntax is documented.
>
> Regarding the other patch on this thread, you mentioned upthread that "If
> it is possible to get more than one SubqueryScanState and/or ResultState
> between the limit and sort, then the first block of code could be placed
> in a while loop." I think that's not possible for a ResultState, but I
> think it *is* possible for a SubqueryScanState.
>
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***************
*** 308,313 **** recompute_limits(LimitState *node)
--- 308,315 ----
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input. We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept a subquery that has no quals or SRFs (that is, the
+  * subquery is just projecting columns) between the LIMIT and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***************
*** 320,325 **** recompute_limits(LimitState *node)
--- 322,348 ----
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+ 	 * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState *sortState = (SortState *) child_node;

*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***************
*** 1041,1043 **** NOTICE: x = 9, y = 13
--- 1041,1077 ----
  (3 rows)
  
  drop function tattle(x int, y int);
+ ---------------------------------------------------
+ --TEST LIMIT pushdown through subquery scan node
+ ---------------------------------------------------
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+     (1, 1, 1),
+     (2, 2, 2),
+     (3, 3, 3),
+     (4, 4, 4),
+     (5, 1, 1),
+     (6, 2, 2),
+     (7, 3, 3),
+     (8, 4, 4);
+ explain (analyze, summary off, timing off, costs off)
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+                        QUERY PLAN
+ --------------------------------------------------------
+  Limit (actual rows=3 loops=1)
+    ->  Subquery Scan on x (actual rows=3 loops=1)
+          ->  Sort (actual rows=3 loops=1)
+                Sort Key: sq_limit.c1, sq_limit.pk
+                Sort Method: top-N heapsort  Memory: 25kB
+                ->  Seq Scan on sq_limit (actual rows=8 loops=1)
+ (6 rows)
+ 
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ ----+----
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;

*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***************
*** 540,542 **** select * from
--- 540,563 ----
  where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ ---------------------------------------------------
+ --TEST LIMIT pushdown through subquery scan node
+ ---------------------------------------------------
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+     (1, 1, 1),
+     (2, 2, 2),
+     (3, 3, 3),
+     (4, 4, 4),
+     (5, 1, 1),
+     (6, 2, 2),
+     (7, 3, 3),
+     (8, 4, 4);
+ 
+ explain (analyze, summary off, timing off, costs off)
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+ 
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+ 
+ drop table sq_limit;
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Thu, Aug 17, 2017 at 11:36 AM, Douglas Doole wrote: > I completely agree. The further a limit can be pushed down, the better. > > The patch looks good to me. > It seems like a somewhat ad-hoc approach; it supposes that we can take any query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto the very end and all will be well. Maybe that's not a problematic assumption, not sure. The grammar happens to allow both FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the latter syntax is documented. Regarding the other patch on this thread, you mentioned upthread that "If it is possible to get more than one SubqueryScanState and/or ResultState between the limit and sort, then the first block of code could be placed in a while loop." I think that's not possible for a ResultState, but I think it *is* possible for a SubqueryScanState. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
I completely agree. The further a limit can be pushed down, the better.

The patch looks good to me.

On Thu, Aug 17, 2017 at 8:27 AM Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:
> On 29.04.2017 00:13, Douglas Doole wrote:
>>> If you add this to the commitfest app, more people might look at it
>>> when the next commitfest opens.
>>
>> I have added it. https://commitfest.postgresql.org/14/1119/
>>
>>> Also, it might help if you can provide a query/ies with numbers where
>>> this optimization shows improvement.
>>
>> I can't provide the real queries where we encountered the problem
>> because they are internal. However I showed a simplified version of the
>> queries in my first post.
>>
>> On our queries, the change made quite a difference - execution time
>> dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows
>> that memory use dropped significantly and we didn't have to spill the
>> sort to disk.
>>
>> From:
>>
>> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
>>    (node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
>>    rows=2001 loops=1)
>>    Buffers: temp read=772 written=11201 lsm_bufmgr hits=3392
>>    Sort Key: ***
>>    Sort Method: external merge  Sort Space Used: 89592  Sort Space Type: Disk
>>
>> To:
>>
>> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
>>    (node_startup_time/loop=7123.275, node_total_time/loop: 7123.504
>>    rows=2001 loops=1)
>>    Buffers: lsm_bufmgr hits=3387
>>    Sort Key: ***
>>    Sort Method: top-N heapsort  Sort Space Used: 3256  Sort Space Type: Memory
>
> Attached please find yet another small patch which pushes down LIMIT to
> ForeignScan.
>
> I should notice that currently the Postgres optimizer is using "Merge
> Append" and fetches from remote nodes only the required number of
> tuples. So even without LIMIT push-down, postgres_fdw will not pull the
> whole table from the remote host. postgres_fdw is using a cursor for
> fetching data from the remote. The default fetch size is 100, so even
> without a limit the remote query will fetch no more than 100 rows at the
> remote site.
>
> Assume the following example (the second USER MAPPING appeared as
> "shard1" in the original mail, surely a typo for "shard2"):
>
> postgres=# create extension postgres_fdw;
> postgres=# create server shard1 FOREIGN DATA WRAPPER postgres_fdw
>            options(dbname 'postgres', host 'localhost', port '5432');
> postgres=# create server shard2 FOREIGN DATA WRAPPER postgres_fdw
>            options(dbname 'postgres', host 'localhost', port '5432');
> postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user '$user');
> postgres=# CREATE USER MAPPING for $user SERVER shard2 options (user '$user');
> postgres=# CREATE TABLE t(u integer primary key, v integer);
> postgres=# CREATE TABLE t1(u integer primary key, v integer);
> postgres=# CREATE TABLE t2(u integer primary key, v integer);
> postgres=# insert into t1 values (generate_series(1,10), random()*10);
> postgres=# insert into t2 values (generate_series(1,10), random()*10);
> postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1
>            options(table_name 't1');
> postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2
>            options(table_name 't2');
>
> postgres=# explain analyze select * from t order by u limit 1;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Limit (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010 rows=1 loops=1)
>    -> Merge Append (cost=200.15..449.39 rows=5121 width=8) (actual time=2.009..2.009 rows=1 loops=1)
>         Sort Key: t.u
>         -> Index Scan using t_pkey on t (cost=0.12..8.14 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=1)
>         -> Foreign Scan on t_fdw2 (cost=100.00..193.92 rows=2560 width=8) (actual time=1.074..1.074 rows=1 loops=1)
>         -> Foreign Scan on t_fdw1 (cost=100.00..193.92 rows=2560 width=8) (actual time=0.928..0.928 rows=1 loops=1)
>  Planning time: 0.769 ms
>  Execution time: 6.837 ms
> (8 rows)
>
> As you can see, the foreign scan fetches only one row from each remote
> node.
>
> But still pushing down the limit can have a positive effect on
> performance, especially if a SORT can be replaced with a TOP-N. I got
> the following results (time in seconds):
>
>   Query                                 original   limit push down
>   select * from t order by u limit 1    2.276      1.777
>   select * from t order by v limit 1    100        42
>
> There is an index on "u", so fetching the records with the smallest "u"
> values can be done without sorting, and the times are similar. But in
> the case of sorting by "v", pushing down the limit allows use of TOP-1
> instead of a global sort, and it reduces query execution time by more
> than 2 times.
>
> -- 
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On 29.04.2017 00:13, Douglas Doole wrote:
>> If you add this to the commitfest app, more people might look at it
>> when the next commitfest opens.
>
> I have added it. https://commitfest.postgresql.org/14/1119/
>
>> Also, it might help if you can provide a query/ies with numbers where
>> this optimization shows improvement.
>
> I can't provide the real queries where we encountered the problem
> because they are internal. However I showed a simplified version of the
> queries in my first post.
>
> On our queries, the change made quite a difference - execution time
> dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows
> that memory use dropped significantly and we didn't have to spill the
> sort to disk.
>
> From:
>
> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
>    (node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
>    rows=2001 loops=1)
>    Buffers: temp read=772 written=11201 lsm_bufmgr hits=3392
>    Sort Key: ***
>    Sort Method: external merge  Sort Space Used: 89592  Sort Space Type: Disk
>
> To:
>
> -> Sort (cost=989.95..1013.27 rows=9326 width=30)
>    (node_startup_time/loop=7123.275, node_total_time/loop: 7123.504
>    rows=2001 loops=1)
>    Buffers: lsm_bufmgr hits=3387
>    Sort Key: ***
>    Sort Method: top-N heapsort  Sort Space Used: 3256  Sort Space Type: Memory

Attached please find yet another small patch which pushes down LIMIT to ForeignScan.

I should notice that currently the Postgres optimizer is using "Merge Append" and fetches from remote nodes only the required number of tuples. So even without LIMIT push-down, postgres_fdw will not pull the whole table from the remote host. postgres_fdw is using a cursor for fetching data from the remote. The default fetch size is 100, so even without a limit the remote query will fetch no more than 100 rows at the remote site.

Assume the following example (the second USER MAPPING read "shard1" in the original, surely a typo for "shard2"):

postgres=# create extension postgres_fdw;
postgres=# create server shard1 FOREIGN DATA WRAPPER postgres_fdw options(dbname 'postgres', host 'localhost', port '5432');
postgres=# create server shard2 FOREIGN DATA WRAPPER postgres_fdw options(dbname 'postgres', host 'localhost', port '5432');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user '$user');
postgres=# CREATE USER MAPPING for $user SERVER shard2 options (user '$user');
postgres=# CREATE TABLE t(u integer primary key, v integer);
postgres=# CREATE TABLE t1(u integer primary key, v integer);
postgres=# CREATE TABLE t2(u integer primary key, v integer);
postgres=# insert into t1 values (generate_series(1,10), random()*10);
postgres=# insert into t2 values (generate_series(1,10), random()*10);
postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1 options(table_name 't1');
postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2 options(table_name 't2');

postgres=# explain analyze select * from t order by u limit 1;
                               QUERY PLAN
------------------------------------------------------------------------
 Limit (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010 rows=1 loops=1)
   -> Merge Append (cost=200.15..449.39 rows=5121 width=8) (actual time=2.009..2.009 rows=1 loops=1)
        Sort Key: t.u
        -> Index Scan using t_pkey on t (cost=0.12..8.14 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=1)
        -> Foreign Scan on t_fdw2 (cost=100.00..193.92 rows=2560 width=8) (actual time=1.074..1.074 rows=1 loops=1)
        -> Foreign Scan on t_fdw1 (cost=100.00..193.92 rows=2560 width=8) (actual time=0.928..0.928 rows=1 loops=1)
 Planning time: 0.769 ms
 Execution time: 6.837 ms
(8 rows)

As you can see, the foreign scan fetches only one row from each remote node.

But still pushing down the limit can have a positive effect on performance, especially if a SORT can be replaced with a TOP-N. I got the following results (time in seconds):

  Query                                 original   limit push down
  select * from t order by u limit 1    2.276      1.777
  select * from t order by v limit 1    100        42

There is an index on "u", so fetching the records with the smallest "u" values can be done without sorting, and the times are similar. But in the case of sorting by "v", pushing down the limit allows use of TOP-1 instead of a global sort, and it reduces query execution time by more than 2 times.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 080cb0a..e3847ce 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2949,7 +2949,8 @@ create_cursor(ForeignScanState *node)
 	initStringInfo(&buf);
 	appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
 					 fsstate->cursor_number, fsstate->query);
-
+	if (node->limit > 0)
+		appendStringInfo(&buf, " LIMIT %lld", (long long) node->limit);
 	/*
 	 * Notice that we pass NULL for paramTypes,
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
> If you add this to the commitfest app, more people might look at it when
> the next commitfest opens.

I have added it. https://commitfest.postgresql.org/14/1119/

> Also, it might help if you can provide a query/ies with numbers where
> this optimization shows improvement.

I can't provide the real queries where we encountered the problem because they are internal. However I showed a simplified version of the queries in my first post.

On our queries, the change made quite a difference - execution time dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows that memory use dropped significantly and we didn't have to spill the sort to disk.

From:

-> Sort (cost=989.95..1013.27 rows=9326 width=30)
   (node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
   rows=2001 loops=1)
   Buffers: temp read=772 written=11201 lsm_bufmgr hits=3392
   Sort Key: ***
   Sort Method: external merge  Sort Space Used: 89592  Sort Space Type: Disk

To:

-> Sort (cost=989.95..1013.27 rows=9326 width=30)
   (node_startup_time/loop=7123.275, node_total_time/loop: 7123.504
   rows=2001 loops=1)
   Buffers: lsm_bufmgr hits=3387
   Sort Key: ***
   Sort Method: top-N heapsort  Sort Space Used: 3256  Sort Space Type: Memory
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
On Wed, Apr 19, 2017 at 10:51 PM, Douglas Doole wrote:
> Thanks for the feedback Ashutosh.
>
> I disagree that we need to call pass_down_bound() recursively. The whole
> point of the recursive call would be to tunnel through a plan node. Since
> SubqueryScanState only has one child (unlike MergeAppendState) this can
> be more efficiently done inline rather than via a recursive call.
>
> In the case of ResultState, this is classic tail-end recursion and the C
> compiler should optimize it out. As we add more recursive calls though,
> it becomes harder for the compiler to do the right thing. (I'll admit
> that I'm not up on the state of the art in recursion removal though, so
> maybe my concern is moot. That said, I prefer not to depend on compiler
> optimizations if there's a clean alternative way to express the code.)
>
> I do agree that it would make the code cleaner to handle ResultState and
> SubqueryScanState in a similar fashion. My preference, however, would be
> to remove the recursive call from ResultState handling.
>
> If it is possible to get more than one SubqueryScanState and/or
> ResultState between the limit and sort, then the first block of code
> could be placed in a while loop.
>
> If you'd prefer that I rework the patch as I discussed, just let me know
> and I'll resubmit it.

Thinking more about it, the pass_down_bound() optimization ultimately targets sort nodes. It doesn't, for example, target ForeignScan nodes, which could also benefit from such an optimization. But I think any generic LIMIT optimization will need to be handled in the planner rather than the executor.

That said, I don't think the community at large would accept serializing pass_down_bound() as you are proposing. You may want to wait for others' opinions before working on it. If you add this to the commitfest app, more people might look at it when the next commitfest opens. Also, it might help if you can provide a query/ies with numbers where this optimization shows improvement.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
Thanks for the feedback, Ashutosh.

I disagree that we need to call pass_down_bound() recursively. The whole point of the recursive call would be to tunnel through a plan node. Since SubqueryScanState only has one child (unlike MergeAppendState) this can be done more efficiently inline rather than via a recursive call.

In the case of ResultState, this is classic tail-end recursion and the C compiler should optimize it out. As we add more recursive calls though, it becomes harder for the compiler to do the right thing. (I'll admit that I'm not up on the state of the art in recursion removal, so maybe my concern is moot. That said, I prefer not to depend on compiler optimizations if there's a clean alternative way to express the code.)

I do agree that it would make the code cleaner to handle ResultState and SubqueryScanState in a similar fashion. My preference, however, would be to remove the recursive call from the ResultState handling. That would give us code like:

if child == SubqueryScanState
    child = SubqueryScanState->child
else if child == ResultState
    child = ResultState->child

if child == SortState
    limit pushdown
else if child == MergeAppendState
    recursive call on all children

If it is possible to get more than one SubqueryScanState and/or ResultState between the limit and sort, then the first block of code could be placed in a while loop.

If you'd prefer that I rework the patch as I discussed, just let me know and I'll resubmit it.

- Doug
Salesforce

On Wed, Apr 19, 2017 at 1:57 AM Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:
> The function pass_down_bound() is a recursive function. For
> SubqueryScanState we have to do something similar to ResultState i.e.
> call pass_down_bound() recursively on subqueryScanState->subplan.
>
> Please add this to the next commitfest.
>
> On Wed, Apr 19, 2017 at 6:09 AM, Douglas Doole wrote:
>> We've hit a case where pass_down_bound() isn't pushing the row count
>> limit from limit into sort. The issue is that we're getting a subquery
>> scan node between the limit and the sort. The subquery is only doing
>> column projection and has no quals or SRFs so it should be safe to push
>> the limit into the sort.
>>
>> The query that hit the problem can be simplified to:
>>
>>    SELECT * FROM (SELECT A,B FROM T ORDER BY C) LIMIT 5
>>
>> (Yeah, the query's a little screwy in that the ORDER BY should really
>> be outside the subselect, but it came from a query generator, so that's
>> a different conversation.)
>>
>> Proposed patch is attached.
>>
>> - Doug
>> Salesforce
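Rendered as C in the style of pass_down_bound(), Doug's outline might look like the sketch below. This is control flow only, not the actual patch; in particular, the real code must also verify that each subquery has no quals and no SRFs in its target list before tunneling through it:

    /* Peel off projection-only nodes iteratively, then dispatch once. */
    for (;;)
    {
        if (IsA(child_node, SubqueryScanState))
            child_node = ((SubqueryScanState *) child_node)->subplan;
        else if (IsA(child_node, ResultState))
            child_node = outerPlanState(child_node);
        else
            break;
    }

    if (IsA(child_node, SortState))
    {
        /* push the bound into the sort, as pass_down_bound() does today */
    }
    else if (IsA(child_node, MergeAppendState))
    {
        /* recurse via pass_down_bound() on each MergeAppend child */
    }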
Re: [HACKERS] [PATCH] Push limit to sort through a subquery
The function pass_down_bound() is a recursive function. For SubqueryScanState we have to do something similar to ResultState i.e. call pass_down_bound() recursively on subqueryScanState->subplan. Please add this to the next commitfest. On Wed, Apr 19, 2017 at 6:09 AM, Douglas Doole wrote: > We've hit a case where pass_down_bound() isn't pushing the row count limit > from limit into sort. The issue is that we're getting a subquery scan node > between the limit and the sort. The subquery is only doing column projection > and has no quals or SRFs so it should be safe to push the limit into the > sort. > > The query that hit the problem can be simplified to: > >SELECT * FROM (SELECT A,B FROM T ORDER BY C) LIMIT 5 > > (Yeah, the query's a little screwy in that the ORDER BY should really be > outside the subselect, but it came from a query generator, so that's a > different conversation.) > > Proposed patch is attached. > > - Doug > Salesforce > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers