Re: [HACKERS] Hash Functions

2017-09-08 Thread amul sul
On Fri, Sep 1, 2017 at 8:01 AM, Robert Haas wrote: > On Thu, Aug 31, 2017 at 8:40 AM, amul sul wrote: > > Fixed in the attached version. > > I fixed these up a bit and committed them. Thanks. > > I think this takes care of adding not only the infrastructure but > support for all the core data t

Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes: > On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote: >> ALTER OPERATOR FAMILY ADD FUNCTION ... ? >> >> That would result in the functions being considered "loose" in the >> family rather than bound into an operator class. I think that's >> actually the right thing, because th

Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote: > Robert Haas writes: >> I think this takes care of adding not only the infrastructure but >> support for all the core data types, but I'm not quite sure how to >> handle upgrading types in contrib. It looks like citext, hstore, and >> several da

Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes: > I think this takes care of adding not only the infrastructure but > support for all the core data types, but I'm not quite sure how to > handle upgrading types in contrib. It looks like citext, hstore, and > several data types provided by isn have hash opclasses, and I think

Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 8:40 AM, amul sul wrote: > Fixed in the attached version. I fixed these up a bit and committed them. Thanks. I think this takes care of adding not only the infrastructure but support for all the core data types, but I'm not quite sure how to handle upgrading types in con

Re: [HACKERS] Hash Functions

2017-08-31 Thread amul sul
On Wed, Aug 30, 2017 at 9:05 PM, Robert Haas wrote: > On Wed, Aug 30, 2017 at 10:43 AM, amul sul wrote: > > Thanks for the suggestion, I have updated 0002-patch accordingly. > > Using this I found some strange behaviours as follows: > > > > 1) standard and extended0 output for the jsonb_hash case

Re: [HACKERS] Hash Functions

2017-08-30 Thread Robert Haas
On Wed, Aug 30, 2017 at 10:43 AM, amul sul wrote: > Thanks for the suggestion, I have updated 0002-patch accordingly. > Using this I found some strange behaviours as follows: > > 1) standard and extended0 output for the jsonb_hash case is not the same. > 2) standard and extended0 output for the hash_ra

Re: [HACKERS] Hash Functions

2017-08-30 Thread amul sul
On Tue, Aug 29, 2017 at 11:48 PM, Robert Haas wrote: > On Tue, Aug 22, 2017 at 8:14 AM, amul sul wrote: > > Attaching patch 0002 for the reviewer's testing. > > I think that this 0002 is not something we can think of committing > because there's no guarantee that hash functions will return the s

Re: [HACKERS] Hash Functions

2017-08-29 Thread Robert Haas
On Tue, Aug 22, 2017 at 8:14 AM, amul sul wrote: > Attaching patch 0002 for the reviewer's testing. I think that this 0002 is not something we can think of committing because there's no guarantee that hash functions will return the same results on all platforms. However, what we could and, I thi

Re: [HACKERS] Hash Functions

2017-08-29 Thread amul sul
On Tue, Aug 22, 2017 at 5:44 PM, amul sul wrote: > On Fri, Aug 18, 2017 at 11:01 PM, Robert Haas > wrote: > >> On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: >> > I have a small query, what if I want a cache entry with extended hash >> > function instead of the standard one, I might require that wh

Re: [HACKERS] Hash Functions

2017-08-22 Thread amul sul
On Fri, Aug 18, 2017 at 11:01 PM, Robert Haas wrote: > On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: > > I have a small query, what if I want a cache entry with extended hash > > function instead of the standard one, I might require that while adding > > hash_array_extended function? Do you think w

Re: [HACKERS] Hash Functions

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: > I have a small query, what if I want a cache entry with extended hash > function instead of the standard one, I might require that while adding > hash_array_extended function? Do you think we need to extend > lookup_type_cache() as well? Hmm, I thought

Re: [HACKERS] Hash Functions

2017-08-18 Thread amul sul
On Fri, Aug 18, 2017 at 8:49 AM, Robert Haas wrote: > On Wed, Aug 16, 2017 at 5:34 PM, Robert Haas > wrote: > > Attached is a quick sketch of how this could perhaps be done (ignoring > > for the moment the relatively-boring opclass pushups). > > Here it is with some relatively-boring opclass pus

Re: [HACKERS] Hash Functions

2017-08-17 Thread Robert Haas
On Wed, Aug 16, 2017 at 5:34 PM, Robert Haas wrote: > Attached is a quick sketch of how this could perhaps be done (ignoring > for the moment the relatively-boring opclass pushups). Here it is with some relatively-boring opclass pushups added. I just did the int4 bit; the same thing will need to

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Kenneth Marshall writes: > On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote: >> ... In fact, on perusing the linked-to page >> http://burtleburtle.net/bob/hash/doobs.html >> Bob says specifically that taking b and c from this hash does not >> produce a fully random 64-bit result. He has a

Re: [HACKERS] Hash Functions

2017-08-16 Thread Kenneth Marshall
On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote: > Robert Haas writes: > > Attached is a quick sketch of how this could perhaps be done (ignoring > > for the moment the relatively-boring opclass pushups). It introduces > > a new function hash_any_extended which differs from hash_any() in

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Robert Haas writes: > Attached is a quick sketch of how this could perhaps be done (ignoring > for the moment the relatively-boring opclass pushups). It introduces > a new function hash_any_extended which differs from hash_any() in that > (a) it combines both b and c into the result and (b) it ac

Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:38 PM, Tom Lane wrote: > Robert Haas writes: >> After some further thought, I propose the following approach to the >> issues raised on this thread: > >> 1. Allow hash functions to have a second, optional support function, >> similar to what we did for btree opclasses i

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Robert Haas writes: > After some further thought, I propose the following approach to the > issues raised on this thread: > 1. Allow hash functions to have a second, optional support function, > similar to what we did for btree opclasses in > c6e3ac11b60ac4a8942ab964252d51c1c0bd8845. The second

Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Thu, Aug 3, 2017 at 6:47 PM, Robert Haas wrote: > That seems pretty lame, although it's sufficient to solve the > immediate problem, and I have to admit to a certain predilection for > things that solve the immediate problem without creating lots of > additional work. After some further though

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 6:08 PM, Andres Freund wrote: >> That's another way to go, but it requires inventing a way to thread >> the IV through the hash opclass interface. > > Only if we really want to do it really well :P. Using a hash_combine() > like > > /* > * Combine two hash values, resulting
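
The preview above is cut off before the body of the hash_combine() being discussed. As a rough illustration of the idea only (folding a second value such as a seed into an existing hash), here is a minimal sketch. The function names, the constant, and the shift amounts are assumptions borrowed from the widely used Boost-style combiner; they are not a reconstruction of the truncated code.

```c
#include <stdint.h>

/*
 * Illustrative sketch: mix hash value b into a.
 * The constant 0x9e3779b9 and the shifts follow the common Boost-style
 * combiner; treating this as the function from the message above would be
 * an assumption.
 */
static inline uint32_t
hash_combine_sketch(uint32_t a, uint32_t b)
{
    a ^= b + UINT32_C(0x9e3779b9) + (a << 6) + (a >> 2);
    return a;
}

/*
 * Hypothetical helper: derive a partitioning hash from an ordinary hash
 * value and a seed without changing the underlying hash function.
 */
static inline uint32_t
seeded_hash_sketch(uint32_t plain_hash, uint32_t seed)
{
    return hash_combine_sketch(seed, plain_hash);
}
```

Because the combiner is not symmetric, hash_combine_sketch(1, 2) and hash_combine_sketch(2, 1) differ, so the order in which seed and hash are passed matters and has to be fixed by convention.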

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
On 2017-08-03 17:57:37 -0400, Robert Haas wrote: > On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund wrote: > > On 2017-08-03 17:43:44 -0400, Robert Haas wrote: > >> For me, the basic point here is that we need a set of hash functions > >> for hash partitioning that are different than what we use for

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund wrote: > On 2017-08-03 17:43:44 -0400, Robert Haas wrote: >> For me, the basic point here is that we need a set of hash functions >> for hash partitioning that are different than what we use for hash >> indexes and hash joins -- otherwise when we hash

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:32 PM, Andres Freund wrote: >> Do you have any feeling for which of those endianness-independent hash >> functions might be a reasonable choice for us? > > Not a strong / very informed one, TBH. > > I'm not convinced it's worth trying to achieve this in the first place, >

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi, On 2017-08-03 17:43:44 -0400, Robert Haas wrote: > For me, the basic point here is that we need a set of hash functions > for hash partitioning that are different than what we use for hash > indexes and hash joins -- otherwise when we hash partition a table and > create hash indexes on each pa

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi, On 2017-08-03 17:09:41 -0400, Robert Haas wrote: > On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund wrote: > > Just to clarify: I don't think it's a problem to do so for integers and > > most other simple scalar types. There are plenty of hash algorithms that are > > endianness independent, and the re

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund wrote: > Just to clarify: I don't think it's a problem to do so for integers and > most other simple scalar types. There are plenty of hash algorithms that are > endianness independent, and the rest is just a bit of care. Do you have any feeling for which o

Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 10:19 AM, Joe Conway wrote: >> Yeah, that's not crazy. I find it a bit surprising in terms of the >> semantics, though. SET >> when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it = >> true? > > Maybe > SET partition_tuple_retry = true; > -or- > SET par

Re: [HACKERS] Hash Functions

2017-06-02 Thread Joe Conway
On 06/02/2017 05:47 AM, Robert Haas wrote: > On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis wrote: >> 2. I basically see two approaches to solve the problem: >> (a) Tom suggested at PGCon that we could have a GUC that >> automatically causes inserts to the partition to be re-routed through >> the pa

Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis wrote: > 1. For range partitioning, I think it's "yes, a little". As you point > out, there are already some weird edge cases -- the main way range > partitioning would make the problem worse is simply by having more > users. I agree. > But for hash par

Re: [HACKERS] Hash Functions

2017-06-01 Thread Jeff Davis
On Thu, Jun 1, 2017 at 11:25 AM, Andres Freund wrote: > Secondly, I think that's to a significant degree caused by > the fact that in practice people way more often partition on types like > int4/int8/date/timestamp/uuid rather than text - there's rarely good > reasons to do the latter. Once we s

Re: [HACKERS] Hash Functions

2017-06-01 Thread Jeff Davis
On Thu, Jun 1, 2017 at 10:59 AM, Robert Haas wrote: > 1. Are the new problems worse than the old ones? > > 2. What could we do about it? Exactly the right questions. 1. For range partitioning, I think it's "yes, a little". As you point out, there are already some weird edge cases -- the main way

Re: [HACKERS] Hash Functions

2017-06-01 Thread Joe Conway
On 06/01/2017 11:25 AM, Andres Freund wrote: > On 2017-06-01 13:59:42 -0400, Robert Haas wrote: >> My personal guess is that most people will prefer the fast >> hash functions over the ones that solve their potential future >> migration problems, but, hey, options are good. > > I'm pretty sure tha

Re: [HACKERS] Hash Functions

2017-06-01 Thread Andres Freund
On 2017-06-01 13:59:42 -0400, Robert Haas wrote: > I'm not actually aware of an instance where this has bitten anyone, > even though it seems like it certainly could have and maybe should've > gotten somebody at some point. Has anyone else? Two comments: First, citus has been doing hash-partition

Re: [HACKERS] Hash Functions

2017-06-01 Thread Robert Haas
On Fri, May 12, 2017 at 1:35 PM, Joe Conway wrote: >> That's a good point, but the flip side is that, if we don't have >> such a rule, a pg_dump of a hash-partitioned table on one >> architecture might fail to restore on another architecture. Today, I >> believe that, while the actual database cl

Re: [HACKERS] Hash Functions

2017-05-19 Thread Robert Haas
On Fri, May 19, 2017 at 2:36 AM, Jeff Davis wrote: > I could agree to something like that. Let's explore some of the challenges > there and potential solutions: > > 1. Dump/reload of hash partitioned data. > > Falling back to restore-through-the-root seems like a reasonable answer > here. Moving t

[HACKERS] Hash Functions

2017-05-18 Thread Jeff Davis
On Thursday, May 18, 2017, Robert Haas wrote: > My experience with this area has led > me to give up on the idea of complete uniformity as impractical, and > instead look at it from the perspective of "what do we absolutely have > to ban in order for this to be sane?". I could agree to something

Re: [HACKERS] Hash Functions

2017-05-18 Thread Robert Haas
On Thu, May 18, 2017 at 1:53 AM, Jeff Davis wrote: > For instance, it makes little sense to have individual check > constraints, indexes, permissions, etc. on a hash-partitioned table. > It doesn't mean that we should necessarily forbid them, but it should > make us question whether combining rang

Re: [HACKERS] Hash Functions

2017-05-17 Thread Jeff Davis
On Wed, May 17, 2017 at 11:35 AM, Tom Lane wrote: > I think the question is whether we are going to make a distinction between > logical partitions (where the data division rule makes some sense to the > user) and physical partitions (where it needn't). I think it might be > perfectly reasonable

Re: [HACKERS] Hash Functions

2017-05-17 Thread Jeff Davis
On Wed, May 17, 2017 at 12:10 PM, Robert Haas wrote: > 1. To handle dump-and-reload the way partitioning does today, hash > functions would need to be portable across encodings. > 2. That's impractically difficult. > 3. So let's always load data through the top-parent. > 4. But that could fail

Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 2:35 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: >>> Why can't hash partitions be stored in tables the same way as we do TOAST? >>> That should take care of the naming problem. > >> Hmm, yeah, something like that could b

Re: [HACKERS] Hash Functions

2017-05-17 Thread Tom Lane
Robert Haas writes: > On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: >> Why can't hash partitions be stored in tables the same way as we do TOAST? >> That should take care of the naming problem. > Hmm, yeah, something like that could be done, but every place where > you are currently allowed

Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: > Why can't hash partitions be stored in tables the same way as we do TOAST? > That should take care of the naming problem. Hmm, yeah, something like that could be done, but every place where you are currently allowed to refer to a partition by n

Re: [HACKERS] Hash Functions

2017-05-16 Thread Ashutosh Bapat
On Tue, May 16, 2017 at 8:40 PM, Jeff Davis wrote: > On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: >> As the discussion has devolved here, it appears that there are, at >> least conceptually, two fundamentally different classes of partition: >> public, which is to say meaningful to DB clie

Re: [HACKERS] Hash Functions

2017-05-16 Thread Amit Langote
On 2017/05/17 5:25, Jeff Davis wrote: > On Tuesday, May 16, 2017, Robert Haas wrote: >> I don't really find this a very practical design. If the table >> partitions are spread across different relfilenodes, then those >> relfilenodes have to have separate pg_class entries and separate >> indexes,

Re: [HACKERS] Hash Functions

2017-05-16 Thread Peter Eisentraut
On 5/16/17 11:10, Jeff Davis wrote: > I concur at this point. I originally thought hash functions might be > made portable, but I think Tom and Andres showed that to be too > problematic -- the issue with different encodings is the real killer. I think it would be OK that if you want to move a has

Re: [HACKERS] Hash Functions

2017-05-16 Thread Jeff Davis
On Tuesday, May 16, 2017, Robert Haas wrote: > I don't really find this a very practical design. If the table > partitions are spread across different relfilenodes, then those > relfilenodes have to have separate pg_class entries and separate > indexes, and those indexes also need to have separat

Re: [HACKERS] Hash Functions

2017-05-16 Thread David Fetter
On Tue, May 16, 2017 at 08:10:39AM -0700, Jeff Davis wrote: > On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: > > As the discussion has devolved here, it appears that there are, at > > least conceptually, two fundamentally different classes of partition: > > public, which is to say meaningful

Re: [HACKERS] Hash Functions

2017-05-16 Thread Robert Haas
On Tue, May 16, 2017 at 11:10 AM, Jeff Davis wrote: > With hash partitioning: > * User only specifies number of partitions of the parent table; does > not specify individual partition properties (modulus, etc.) > * Dump/reload goes through the parent table (though we may provide > options so pg_du

Re: [HACKERS] Hash Functions

2017-05-16 Thread Jeff Davis
On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: > As the discussion has devolved here, it appears that there are, at > least conceptually, two fundamentally different classes of partition: > public, which is to say meaningful to DB clients, and "private", used > for optimizations, but otherwi

Re: [HACKERS] Hash Functions

2017-05-15 Thread David Fetter
On Mon, May 15, 2017 at 03:26:02PM -0400, Robert Haas wrote: > On Sun, May 14, 2017 at 9:35 PM, Andres Freund wrote: > > On 2017-05-14 21:22:58 -0400, Robert Haas wrote: > >> but wanting a CHECK constraint that applies to only one partition > >> seems pretty reasonable (e.g. CHECK that records for

Re: [HACKERS] Hash Functions

2017-05-15 Thread Mark Dilger
> On May 15, 2017, at 7:48 AM, Jeff Davis wrote: > > On Sun, May 14, 2017 at 6:22 PM, Robert Haas wrote: >> You'd have to prohibit a heck of a lot more than that in order for >> this to work 100% reliably. You'd have to prohibit CHECK constraints, >> triggers, rules, RLS policies, and UNIQUE i

Re: [HACKERS] Hash Functions

2017-05-15 Thread Robert Haas
On Sun, May 14, 2017 at 9:35 PM, Andres Freund wrote: > On 2017-05-14 21:22:58 -0400, Robert Haas wrote: >> but wanting a CHECK constraint that applies to only one partition >> seems pretty reasonable (e.g. CHECK that records for older years are >> all in the 'inactive' state, or whatever). > > On

Re: [HACKERS] Hash Functions

2017-05-15 Thread David Fetter
On Mon, May 15, 2017 at 07:48:14AM -0700, Jeff Davis wrote: > This would mean we need to reload through the root as Andres and > others suggested, One refinement of this would be to traverse the partition tree, stopping at the first place where the next branch has hash partitions, or at any rate t

Re: [HACKERS] Hash Functions

2017-05-15 Thread Bruce Momjian
On Mon, May 15, 2017 at 07:32:30AM -0700, Jeff Davis wrote: > On Sun, May 14, 2017 at 8:00 PM, Bruce Momjian wrote: > > Do we even know that floats are precise enough to determine the > > partition. For example, if you have 6.1, is it possible for > > that to be 5.999 on some systems?

Re: [HACKERS] Hash Functions

2017-05-15 Thread Jeff Davis
On Sun, May 14, 2017 at 6:22 PM, Robert Haas wrote: > You'd have to prohibit a heck of a lot more than that in order for > this to work 100% reliably. You'd have to prohibit CHECK constraints, > triggers, rules, RLS policies, and UNIQUE indexes, at the least. You > might be able to convince me t

Re: [HACKERS] Hash Functions

2017-05-15 Thread Jeff Davis
On Sun, May 14, 2017 at 8:00 PM, Bruce Momjian wrote: > Do we even know that floats are precise enough to determine the > partition. For example, if you have 6.1, is it possible for > that to be 5.999 on some systems? Are IEEE systems all the same for > these values? I would say we

Re: [HACKERS] Hash Functions

2017-05-14 Thread Bruce Momjian
On Sun, May 14, 2017 at 01:06:03PM -0700, Andres Freund wrote: > On 2017-05-14 15:59:09 -0400, Greg Stark wrote: > > Personally while I would like to avoid code that actively crashes or > > fails basic tests on Vax > > I personally vote for simply refusing to run/compile on non-IEEE > platforms, i

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
Hi, On 2017-05-14 21:22:58 -0400, Robert Haas wrote: > but wanting a CHECK constraint that applies to only one partition > seems pretty reasonable (e.g. CHECK that records for older years are > all in the 'inactive' state, or whatever). On a hash-partitioned table? > Now that's not to say that

Re: [HACKERS] Hash Functions

2017-05-14 Thread Robert Haas
On Sun, May 14, 2017 at 6:29 PM, Andres Freund wrote: > On 2017-05-14 18:25:08 -0400, Tom Lane wrote: >> It may well be that we can get away with saying "we're not going >> to make it simple to move hash-partitioned tables with float >> partition keys between architectures with different float >>

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
On Sun, May 14, 2017 at 3:30 PM, Tom Lane wrote: > I agree that the Far Eastern systems that can't easily be replaced > by Unicode are that way mostly because they're a mess. But I'm > still of the opinion that locking ourselves into Unicode is a choice > we might regret, far down the road. It's

Re: [HACKERS] Hash Functions

2017-05-14 Thread Thomas Munro
On Mon, May 15, 2017 at 10:08 AM, Thomas Munro wrote: > [2] > https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.cbcux01/flotcop.htm#flotcop Though looking more closely I see that the default is IEEE in 64 bit builds, which seems like a good way to kill the older format

Re: [HACKERS] Hash Functions

2017-05-14 Thread Tom Lane
Peter Geoghegan writes: > The express goal of the Unicode consortium is to replace all existing > encodings with Unicode. My personal opinion is that a Unicode > monoculture would be a good thing, provided reasonable differences can > be accommodated. Can't help remembering Randall Munroe's take

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
On 2017-05-14 18:25:08 -0400, Tom Lane wrote: > It may well be that we can get away with saying "we're not going > to make it simple to move hash-partitioned tables with float > partition keys between architectures with different float > representations". But there's a whole lot of daylight betwee

Re: [HACKERS] Hash Functions

2017-05-14 Thread Tom Lane
Andres Freund writes: > On 2017-05-14 15:59:09 -0400, Greg Stark wrote: >> Personally while I would like to avoid code that actively crashes or >> fails basic tests on Vax > I personally vote for simply refusing to run/compile on non-IEEE > platforms, including VAX. The point of wanting that is

Re: [HACKERS] Hash Functions

2017-05-14 Thread Thomas Munro
On Mon, May 15, 2017 at 7:59 AM, Greg Stark wrote: > On 13 May 2017 at 10:29, Robert Haas wrote: >> - Floats. There may be different representations in use on different >> hardware, which could be a problem. Tom didn't answer my question >> about whether any even-vaguely-modern hardware is stil

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
On Sat, May 13, 2017 at 9:11 PM, Robert Haas wrote: > The latter is > generally false already. Maybe LATIN1 -> UTF8 is no-fail, but what > about UTF8 -> LATIN1 or SJIS -> anything? Based on previous mailing > list discussions, I'm under the impression that it is sometimes > debatable how a chara

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
On 2017-05-14 15:59:09 -0400, Greg Stark wrote: > Personally while I would like to avoid code that actively crashes or > fails basic tests on Vax I personally vote for simply refusing to run/compile on non-IEEE platforms, including VAX. The benefit of even trying to get that right, not to speak o

Re: [HACKERS] Hash Functions

2017-05-14 Thread Greg Stark
On 13 May 2017 at 10:29, Robert Haas wrote: > - Floats. There may be different representations in use on different > hardware, which could be a problem. Tom didn't answer my question > about whether any even-vaguely-modern hardware is still using non-IEEE > floats, which I suspect means that the

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 11:47 PM, Andres Freund wrote: > It'll be differently sized on different platforms. So everyone will have to > write hash functions that look at each member individually, rather than > hashing the entire struct at once. And for each member you'll have to use a > type s

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 1:57 PM, Tom Lane wrote: > Basically, this is simply saying that you're willing to ignore the > hard cases, which reduces the problem to one of documenting the > portability limitations. You might as well not even bother with > worrying about the integer case, because port

Re: [HACKERS] Hash Functions

2017-05-13 Thread Andres Freund
On May 13, 2017 8:44:22 PM PDT, Robert Haas wrote: >On Sat, May 13, 2017 at 7:08 PM, Andres Freund >wrote: >> I seriously doubt that's true. A lot of more complex types have >> internal alignment padding and such. > >True, but I believe we require those padding bytes to be zero. If we >didn't

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 7:08 PM, Andres Freund wrote: > I seriously doubt that's true. A lot of more complex types have > internal alignment padding and such. True, but I believe we require those padding bytes to be zero. If we didn't, then hstore_hash would be broken already. > Consider e.g.

Re: [HACKERS] Hash Functions

2017-05-13 Thread Andres Freund
On 2017-05-13 10:29:09 -0400, Robert Haas wrote: > On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: > > Can we think of defining separate portable hash functions which can be > > used for the purpose of hash partitioning? > > I think that would be a good idea. I think it shouldn't even be th

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 12:38 PM, Robert Haas wrote: > That is a good question. I think it basically amounts to this > question: is hash partitioning useful, and if so, for what? Two words: parallel query. To get parallelism, one of the best approaches is dividing the data, then doing as much wo

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 11:45 AM, Tom Lane wrote: > Forget hash partitioning. There's no law saying that that's a good > idea and we have to have it. With a different set of constraints, > maybe we could do it, but I think the existing design decisions have > basically locked it out --- and I do

Re: [HACKERS] Hash Functions

2017-05-13 Thread Tom Lane
Robert Haas writes: > On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: >> Can we think of defining separate portable hash functions which can be >> used for the purpose of hash partitioning? > I think that would be a good idea. I think it shouldn't even be that > hard. By data type: > - I

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 10:34 AM, Tom Lane wrote: > Maintaining such a property for float8 (and the types that depend on it) > might be possible if you believe that nobody ever uses anything but IEEE > floats, but we've never allowed that as a hard assumption before. This is not such a big practi

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: > Can we think of defining separate portable hash functions which can be > used for the purpose of hash partitioning? I think that would be a good idea. I think it shouldn't even be that hard. By data type: - Integers. We'd need to make sur

Re: [HACKERS] Hash Functions

2017-05-12 Thread Amit Kapila
On Sat, May 13, 2017 at 1:08 AM, Robert Haas wrote: > On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote: > > Maybe a shorter argument for hash partitioning is that not one but two > different people proposed patches for it within months of the initial > partitioning patch going in. When multiple p

Re: [HACKERS] Hash Functions

2017-05-12 Thread Andres Freund
On 2017-05-12 21:56:30 -0400, Robert Haas wrote: > Cheap isn't free, though. It's got a double-digit percentage overhead > rather than a large-multiple-of-the-runtime overhead as triggers do, > but people still won't want to pay it unnecessarily, I think. That should be partially addressable with

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 7:36 PM, David Fetter wrote: > On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote: >> On 5/12/17 18:13, Alvaro Herrera wrote: >> > I think for logical replication the tuple should appear as being in the >> > parent table, not the partition. No? >> >> Logical

Re: [HACKERS] Hash Functions

2017-05-12 Thread David Fetter
On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote: > On 5/12/17 18:13, Alvaro Herrera wrote: > > I think for logical replication the tuple should appear as being in the > > parent table, not the partition. No? > > Logical replication replicates base table to base table. How those

Re: [HACKERS] Hash Functions

2017-05-12 Thread Peter Eisentraut
On 5/12/17 18:13, Alvaro Herrera wrote: > I think for logical replication the tuple should appear as being in the > parent table, not the partition. No? Logical replication replicates base table to base table. How those tables are tied together into a partitioned table or an inheritance tree is

Re: [HACKERS] Hash Functions

2017-05-12 Thread Alvaro Herrera
Peter Eisentraut wrote: > On 5/12/17 14:23, Robert Haas wrote: > > One alternative would be to change the way that we dump and restore > > the data. Instead of dumping the data with the individual partitions, > > dump it all out for the parent and let tuple routing sort it out at > > restore time.

Re: [HACKERS] Hash Functions

2017-05-12 Thread Peter Eisentraut
On 5/12/17 14:23, Robert Haas wrote: > One alternative would be to change the way that we dump and restore > the data. Instead of dumping the data with the individual partitions, > dump it all out for the parent and let tuple routing sort it out at > restore time. I think this could be a pg_dump

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote: > Yeah, that isn't really appetizing at all. If we were doing physical > partitioning below the user-visible level, we could make it fly. > But the existing design makes the partition boundaries user-visible > which means we have to insist that the

Re: [HACKERS] Hash Functions

2017-05-12 Thread Kenneth Marshall
On Fri, May 12, 2017 at 02:23:14PM -0400, Robert Haas wrote: > > What about integers? I think we're already assuming two's-complement > arithmetic, which I think means that the only problem with making the > hash values portable for integers is big-endian vs. little-endian. > That sounds solvea
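
One way to picture the integer case raised above: if the bytes fed to the hash are produced by shifts rather than by reading the integer's in-memory representation, big-endian and little-endian hosts see the same input. The sketch below is an illustration under stated assumptions, not code from the thread. The names put_le32, fnv1a32, and hash_int32_portable are hypothetical, and FNV-1a is used only as a simple stand-in for whatever byte hash would really be chosen.

```c
#include <stddef.h>
#include <stdint.h>

/* Store v into out[0..3], least-significant byte first, on any host. */
static void
put_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char) (v & 0xff);
    out[1] = (unsigned char) ((v >> 8) & 0xff);
    out[2] = (unsigned char) ((v >> 16) & 0xff);
    out[3] = (unsigned char) ((v >> 24) & 0xff);
}

/* 32-bit FNV-1a over a byte buffer; a stand-in hash for illustration. */
static uint32_t
fnv1a32(const unsigned char *p, size_t n)
{
    uint32_t h = UINT32_C(2166136261);

    for (size_t i = 0; i < n; i++)
    {
        h ^= p[i];
        h *= UINT32_C(16777619);
    }
    return h;
}

/*
 * Hash an int32 via a fixed byte order.  The cast to uint32_t is
 * well-defined modular conversion, so negative values map to the same
 * bit pattern a two's-complement machine would store.
 */
static uint32_t
hash_int32_portable(int32_t value)
{
    unsigned char buf[4];

    put_le32((uint32_t) value, buf);
    return fnv1a32(buf, sizeof buf);
}
```

The cost is a few shifts per value compared with hashing the raw bytes in place, which is the kind of overhead the thread weighs against portability.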

Re: [HACKERS] Hash Functions

2017-05-12 Thread Tom Lane
Robert Haas writes: > On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: >> I'd vote that it's not, which means that this whole approach to hash >> partitioning is unworkable. I agree with Andres that demanding hash >> functions produce architecture-independent values will not fly. > If we can't

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: > I'd vote that it's not, which means that this whole approach to hash > partitioning is unworkable. I agree with Andres that demanding hash > functions produce architecture-independent values will not fly. If we can't produce architecture-indepen

Re: [HACKERS] Hash Functions

2017-05-12 Thread Joe Conway
On 05/12/2017 10:17 AM, Robert Haas wrote: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have an architecture-dependent >> representation, it seems somewhat unrealistic and expensive to have >> a hard rule to keep them architecture agnostic. And if that'

Re: [HACKERS] Hash Functions

2017-05-12 Thread Tom Lane
Robert Haas writes: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have an architecture-dependent representation, >> it seems somewhat unrealistic and expensive to have a hard rule to keep them >> architecture agnostic. And if that's not guaranteed, t

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: > Given that a lot of data types have an architecture-dependent representation, > it seems somewhat unrealistic and expensive to have a hard rule to keep them > architecture agnostic. And if that's not guaranteed, then I'm doubtful it > mak

Re: [HACKERS] Hash Functions

2017-05-12 Thread Andres Freund
On May 12, 2017 10:05:56 AM PDT, Robert Haas wrote: >On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote: >> 1. The hash functions as they exist today aren't portable -- they can >> return different results on different machines. That means using >these >> functions for hash partitioning would y

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote: > 1. The hash functions as they exist today aren't portable -- they can > return different results on different machines. That means using these > functions for hash partitioning would yield different contents for the > same partition on differen

[HACKERS] Hash Functions

2017-05-11 Thread Jeff Davis
https://www.postgresql.org/message-id/camp0ubeo3fzzefie1vmc1ajkkrpxlnzqooaseu6o-c+...@mail.gmail.com In that thread, I pointed out some important considerations for the hash functions themselves. This is a follow-up, after I looked more carefully. 1. The hash functions as they exist today are