Re: Ordered Partitioned Table Scans

Robert Haas Fri, 15 Mar 2019 08:50:28 -0700

On Fri, Mar 8, 2019 at 3:15 PM Tom Lane <[email protected]> wrote:
> I took a quick look through this and I'm not very happy with it.
> It seems to me that the premise ought to be "just use an Append
> if we can prove the output would be ordered anyway", but that's not
> what we actually have here: instead you're adding more infrastructure
> onto Append, which notably involves invasive changes to the API of
> create_append_path, which is the main reason why the patch keeps breaking.
> (It's broken again as of HEAD, though the cfbot doesn't seem to have
> noticed yet.)  Likewise there's a bunch of added complication in
> cost_append, create_append_plan, etc.  I think you should remove all that
> and restrict this optimization to the case where all the subpaths are
> natively ordered --- if we have to insert Sorts, it's hardly going to move
> the needle to worry about simplifying the parent MergeAppend to Append.


Other people have already said that they don't think this is true; I
agree with those people.  Even if you have to sort *every* path,
sorting a bunch of reasonably large data sets individually is possibly
better than sorting all the data together, because (1) you can start
emitting rows sooner, (2) it might make you fit in memory instead of
having to spill to disk, and (3) O(n lg n) is supralinear.  Still, if
that were the only case this handled, I wouldn't be too excited,
because it seems at least plausible that lumping a bunch of small
partitions together and sorting it all at once could save some
start-up and tear-down costs vs. sorting them individually.  But it
isn't; the ability to consider that sort of plan is just a fringe
benefit.  If a substantial fraction of the partitions have indexes --
half, three-quarters, all-but-one -- sorting only the remaining ones
should win big.

Admittedly, I think this case is less common than it was a few years
ago, because with table inheritance one often ended up with a parent
partition that was empty and had no indexes so it produced a
dummy-seqscan in every plan, and that's gone with partitioning.
Moreover, because of Alvaro's work on cascaded CREATE INDEX, people
are probably now more likely to have matching indexes on all the
partitions.  Still, it's not that hard to imagine a case where older
data that doesn't change much is more heavily indexed than tables that
are still suffering DML of whatever kind on a regular basis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Ordered Partitioned Table Scans

Reply via email to