On Mon, Jan 5, 2015 at 8:42 PM, Atri Sharma <atri.j...@gmail.com> wrote:
> > Hi All, > > Please forgive if this is a repost. > > Please find attached patch for supporting ORDER BY clause in CREATE > FUNCTION for SRFs. Specifically: > > CREATE OR REPLACE FUNCTION func1(OUT e int, OUT f int) returns setof > record as ' SELECT a,b FROM table1 ORDER BY a; ' language 'sql' ORDER BY e; > > This shall allow for capturing information about existing preorder that > might be present inherently in the SRF's input or algorithm (the above > example and think generate_series). > > This allows for eliminating sorts that can be based off the known existing > preorder. For eg: > > SELECT * FROM correct_order_singlecol() ORDER BY e; # Does not need to > sort by e since existing preorder is known. > > Eliminating such sorts can be a huge gain, especially if the expected > input to needed Sort node is large. > > The obvious question that comes is what happens if specified ORDER BY > clause is false. For checking the order, a new node is added which is top > node of the plan and is responsible for projecting result rows. It tracks > the previous row seen and given a sort order, ensures that the current > tuple to be projected is in the required sort order. > > So, for above example > > EXPLAIN (COSTS OFF) SELECT * FROM correct_order_multicol() ORDER BY e; > QUERY PLAN > ----------------------------------------------- > OrderCheck > -> Function Scan on correct_order_multicol > (2 rows) > > > If order of result rows is not the same as required, an error is raised: > > SELECT * FROM incorrect_order_nulls() ORDER BY e NULLS LAST; > ERROR: Order not same as specified > > > > Preorder columns are first transformed into SortGroupClauses first and > then stored directly in pg_proc. > > > This functionality is a user case seen functionality, and is especially > useful when SRF inputs are large and/or might be pipelined from another > function (SRFs are used in pipelines in analytical systems many times, with > large data). > > The overhead of this patch is small. A new path is added for the preorder > keys, and OrderCheck node's additional cost is pretty low, given that it > only compares two rows and stores only a single row (previous row seen), > hence the memory footprint is minuscule. > > We can eliminate the new node and put onus or having the right order on the user like we do with volatile setting of the function. > In the inner joins thread, Tom mentioned having a new node which has > multiple plans and executor can decide which plan to execute given runtime > conditions. I played around with the idea, and am open to experiment having > a new node which has a Sort based plan and is executed in case OrderCheck > node sees that the inherent order of result tuples is not correct. Feedback > here would be very welcome. > > > I will add the patch to current commitfest. > > Thoughts? > > Regards, > > Atri > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company