On 2019/04/23 14:45, David Rowley wrote:
> On Tue, 23 Apr 2019 at 13:49, Amit Langote
> <langote_amit...@lab.ntt.co.jp> wrote:
>>
>> On 2019/04/23 7:51, Alvaro Herrera wrote:
>>> To me, it sounds
>>> unintuitive to accept partitions that don't exactly match the order of
>>> the parent table; but it's been supported all along.
>>
>> You might know it already, but even though column sets of two tables may
>> appear identical, their TupleDescs still may not match due to dropped
>> columns being different in the two tables.
>
> I think that's the most likely reason that the TupleDescs would differ
> at all. For RANGE partitions on time series data, it's quite likely
> that new partitions are periodically created to store new data. If
> the partitioned table those belong to evolved over time, gaining new
> columns and dropping columns that are no longer needed then some
> translation work will end up being required. From my work on
> 42f70cd9c, I know tuple conversion is not free, so it's pretty good
> that pg_dump will remove the need for maps in this case even with the
> proposed change.

Maybe I'm missing something, but if you're talking about the pg_dump
changes proposed in the latest patch that Alvaro posted on April 18, which
is to emit partitions in two steps, then I don't see how that will always
improve things in terms of whether maps are needed or not (regardless of
whether that's something to optimize for or not).

If partitions needed a map in the old database, this patch means that they
will *continue* to need it in the new database. With HEAD, they won't,
because partitions created with CREATE TABLE PARTITION OF will have the
same descriptor as the parent, provided the parent is also created afresh
in the new database, which is true in the non-binary-upgrade mode.

The current arrangement, as I mentioned in my previous email, is partly
inspired by the fact that creating the parent and partition afresh in the
new database will lead them to have the same TupleDesc, and hence they
won't need maps.

Thanks,
Amit
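
The difference between the two restore strategies discussed above can be
sketched with a toy model. This is only an illustration in Python, not
PostgreSQL code: Attr, needs_map, restore_head and restore_two_step are
hypothetical names, a descriptor is modelled as a plain list of columns,
and "needs a map" is simplified to "the two lists differ".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Attr:
    """One column slot in a toy descriptor (hypothetical, not PostgreSQL's struct)."""
    name: str
    typ: str
    dropped: bool = False


Desc = List[Attr]


def live(desc: Desc) -> Desc:
    """Only the non-dropped columns, in their existing order."""
    return [Attr(a.name, a.typ) for a in desc if not a.dropped]


def drop_column(desc: Desc, name: str) -> Desc:
    """Assumption of this model: a dropped column keeps its slot, flagged as dropped."""
    return [Attr(a.name, a.typ, True) if a.name == name else a for a in desc]


def create_partition_of(parent: Desc) -> Desc:
    """Toy CREATE TABLE ... PARTITION OF: the new table copies the parent's live columns."""
    return live(parent)


def needs_map(parent: Desc, child: Desc) -> bool:
    """Simplification: a conversion map is needed whenever the descriptors differ."""
    return parent != child


def restore_head(parent: Desc, child: Desc) -> Tuple[Desc, Desc]:
    """Parent created afresh, partition recreated via PARTITION OF."""
    new_parent = live(parent)
    return new_parent, create_partition_of(new_parent)


def restore_two_step(parent: Desc, child: Desc) -> Tuple[Desc, Desc]:
    """Partition created as its own table (keeping its column order), then attached."""
    return live(parent), live(child)


# Case 1: a partition attached with a different column order.
parent = [Attr("a", "int"), Attr("b", "text")]
child = [Attr("b", "text"), Attr("a", "int")]

# Case 2: the parent dropped a column before a newer partition was created.
aged_parent = drop_column([Attr("a", "int"), Attr("b", "text"), Attr("c", "int")], "b")
late_child = create_partition_of(aged_parent)

print(needs_map(parent, child))                       # True
print(needs_map(*restore_head(parent, child)))        # False
print(needs_map(*restore_two_step(parent, child)))    # True
print(needs_map(aged_parent, late_child))             # True
print(needs_map(*restore_head(aged_parent, late_child)))  # False
```

In this model, a partition whose column order differs from the parent's
keeps needing a map after a two-step restore, while recreating it with
PARTITION OF against a freshly created parent makes the two descriptors
identical.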