All, Pardon me for jumping into this late. In general, I like Alvaro's approach. However, I wanted to list the major shortcomings of the existing replication system (based on complaints by PGX's users and on IRC) and compare them to Alvaro's proposed implementation to make sure that enough of them are addressed, and that the ones which aren't addressed are not being addressed as a clear decision. We can't address *all* of the limitations of the current system, but let's make sure that we're addressing enough of them to make implementing a 2nd partitioning system worthwhile.
Where I have ? is because I'm not clear from Alvaro's proposal whether they're addressed or not. 1.The Trigger Problem the need to write triggers for INSERT/UPDATE/DELETE. Addressed. 2. The Clutter Problem cluttering up system views and dumps with hundreds of partitioned tables Addressed. 3. Creation Problem the need two write triggers and/or cron jobs to create new partitions Addressed. 4. Creation Locking Problem high probability of lock pile-ups whenever a new partition is created on demand due to multiple backends trying to create the partition at the same time. Not Addressed? 5. Constant Problem Since current partitioned query planning happens before the rewrite phase, SELECTs do not use partition logic to evaluate even simple expressions, let alone IMMUTABLE or STABLE functions. Addressed?? 6. Unique Index Problem Cannot create a unique index across multiple partitions, which prevents the partitioned table from being FK'd. Not Addressed (but could be addressed in the future) 7. JOIN Problem Two partitioned tables being JOINed need to append and materialize before the join, causing a very slow join under some circumstances, even if both tables are partitioned on the same ranges. Not Addressed? (but could be addressed in the future) 8. COPY Problem Cannot bulk-load into the Master, just into individual partitions. Addressed. 9. Hibernate Problem When using the trigger method, inserts into the master partition return 0, which Hibernate and some other ORMs regard as an insert failure. Addressed. 10. Scaling Problem Inheritance partitioning becomes prohibitively slow for the planner at somewhere between 100 and 500 partitions depending on various factors. No idea? 11. Hash Partitioning Some users would prefer to partition into a fixed number of hash-allocated partitions. Not Addressed. 12. Extra Constraint Evaluation Inheritance partitioning evaluates *all* constraints on the partitions, whether they are part of the partitioning scheme or not. This is way expensive if those are, say, polygon comparisons. Addressed. Additionally, I believe that Alvaro's proposal will make the following activities which are supported by partition-by-inheritance more difficult or impossible. Again, these are probably acceptable because inheritance partitioning isn't going away. However, we should consciously decide that: A. COPY/ETL then attach In inheritance partitioning, you can easily build a partition outside the master and then "attach" it, allowing for minimal disturbance of concurrent users. Could be addressed in the future. B. Catchall Partition Many partitioning schemes currently contain a "catchall" partition which accepts rows outside of the range of the partitioning scheme, due to bad input data. Probably not handled on purpose; Alvaro is proposing that we reject these instead, or create the partitions on demand, which is a legitimate approach. C. Asymmetric Partitioning / NULLs in partition column This is the classic Active/Inactive By Month setup for partitions. Could be addressed via special handling for NULL/infinity in the partitioned column. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers