RANGE partitioning of time-series data is quite a common range to use partitioning, and such tables tend to grow fairly large. I thought since we always store RANGE partitioned tables in the PartitionDesc in ascending range order that it might be useful to make use of this and when the required pathkeys match the order of the range, then we could make use of an Append node instead of uselessly using a MergeAppend, since the MergeAppend will just exhaust each subplan one at a time, in order.
It does not seem very hard to implement this and it does not add much in the way of additional processing to the planner. Performance wise it seems to give a good boost to getting sorted results from a partitioned table. I performed a quick test just on my laptop with: Setup: CREATE TABLE partbench (id BIGINT NOT NULL, i1 INT NOT NULL, i2 INT NOT NULL, i3 INT NOT NULL, i4 INT NOT NULL, i5 INT NOT NULL) PARTITION BY RANGE (id); select 'CREATE TABLE partbench' || x::text || ' PARTITION OF partbench FOR VALUES FROM (' || (x*100000)::text || ') TO (' || ((x+1)*100000)::text || ');' from generate_Series(0,299) x; \gexec \o INSERT INTO partbench SELECT x,1,2,3,4,5 from generate_Series(0,29999999) x; create index on partbench (id); vacuum analyze; Test: select * from partbench order by id limit 1 offset 29999999; Results Patched: Time: 4234.807 ms (00:04.235) Time: 4237.928 ms (00:04.238) Time: 4241.289 ms (00:04.241) Time: 4234.030 ms (00:04.234) Time: 4244.197 ms (00:04.244) Time: 4266.000 ms (00:04.266) Unpatched: Time: 5917.288 ms (00:05.917) Time: 5937.775 ms (00:05.938) Time: 5911.146 ms (00:05.911) Time: 5906.881 ms (00:05.907) Time: 5918.309 ms (00:05.918) (about 39% faster) The implementation is fairly simple. One thing I don't like about is I'd rather build_partition_pathkeys() performed all the checks to know if the partition should support a natural pathkey, but as of now, I have the calling code ensuring that there are no sub-partitioned tables. These could cause tuples to be output in the wrong order. Does this idea seem like something we'd want? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
v1-0001-Allow-Append-to-be-used-in-place-of-MergeAppend-f.patch
Description: Binary data