On Fri, Mar 8, 2019 at 3:15 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > I took a quick look through this and I'm not very happy with it. > It seems to me that the premise ought to be "just use an Append > if we can prove the output would be ordered anyway", but that's not > what we actually have here: instead you're adding more infrastructure > onto Append, which notably involves invasive changes to the API of > create_append_path, which is the main reason why the patch keeps breaking. > (It's broken again as of HEAD, though the cfbot doesn't seem to have > noticed yet.) Likewise there's a bunch of added complication in > cost_append, create_append_plan, etc. I think you should remove all that > and restrict this optimization to the case where all the subpaths are > natively ordered --- if we have to insert Sorts, it's hardly going to move > the needle to worry about simplifying the parent MergeAppend to Append.
Other people have already said that they don't think this is true; I agree with those people. Even if you have to sort *every* path, sorting a bunch of reasonably large data sets individually is possibly better than sorting all the data together, because (1) you can start emitting rows sooner, (2) it might make you fit in memory instead of having to spill to disk, and (3) O(n lg n) is supralinear. Still, if that were the only case this handled, I wouldn't be too excited, because it seems at least plausible that lumping a bunch of small partitions together and sorting it all at once could save some start-up and tear-down costs vs. sorting them individually. But it isn't; the ability to consider that sort of plan is just a fringe benefit. If a substantial fraction of the partitions have indexes -- half, three-quarters, all-but-one -- sorting only the remaining ones should win big. Admittedly, I think this case is less common than it was a few years ago, because with table inheritance one often ended up with a parent partition that was empty and had no indexes so it produced a dummy-seqscan in every plan, and that's gone with partitioning. Moreover, because of Alvaro's work on cascaded CREATE INDEX, people are probably now more likely to have matching indexes on all the partitions. Still, it's not that hard to imagine a case where older data that doesn't change much is more heavily indexed than tables that are still suffering DML of whatever kind on a regular basis. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company