On Fri, May 12, 2017 at 1:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > I'd vote that it's not, which means that this whole approach to hash > partitioning is unworkable. I agree with Andres that demanding hash > functions produce architecture-independent values will not fly.
If we can't produce architecture-independent hash values, then what's the other option? One alternative would be to change the way that we dump and restore the data. Instead of dumping the data with the individual partitions, dump it all out for the parent and let tuple routing sort it out at restore time. Of course, this isn't very satisfying because now dump-and-restore hasn't really preserved the state of the database; indeed, you could make it into a hard failure by creating a foreign key referencing a partition hash partition. After dump-and-restore, the row ends up in some other partition and the foreign key can't be recreated because the relationship no longer holds. This isn't limited to foreign keys, either; similar problems could be created with CHECK constraints or other per-table properties that can vary between one child and another. I basically think it's pretty futile to suppose that we can get away with having a dump and restore move rows around between partitions without having that blow up in some cases. > Maintaining such a property for float8 (and the types that depend on it) > might be possible if you believe that nobody ever uses anything but IEEE > floats, but we've never allowed that as a hard assumption before. I don't know how standard that is. Is there any hardware that anyone's likely to be using that doesn't? TBH, I don't really care if support for obscure, nearly-dead platforms like VAX or whatever don't quite work with hash-partitioned tables. In practice, PostgreSQL only sorta works on that kind of platform anyway; there are far bigger problems than this. On the other hand, if there are servers being shipped in 2017 that don't use IEEE floats, that's another problem. What about integers? I think we're already assuming two's-complement arithmetic, which I think means that the only problem with making the hash values portable for integers is big-endian vs. little-endian. That's sounds solveable-ish. > Even architecture dependence isn't the whole scope of the problem. > Consider for example dumping a LATIN1-encoded database and trying > to reload it into a UTF8-encoded database. People will certainly > expect that to be possible, and do you want to guarantee that the > hash of a text value is encoding-independent? No, I think that's expecting too much. I'd be just fine telling people that if you hash-partition on a text column, it may not load into a database with another encoding. If you care about that, don't use hash-partitioning, or don't change the encoding, or dump out the partitions one by one and reload all the roads into the parent table for re-routing, solving whatever problems come up along the way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers