On Sat, Dec 31, 2005 at 12:58:19AM -0500, Greg Stark wrote:
> I think this is a mistake -- the same mistake that got us into trouble with
> Turkish.
>
> Hashing depends on the concept of equality which is integral to the type. Two
> things are either the same or they aren't, and that can't change based on
> context.

So someone who wants a case-insensitive search actually doesn't want
"Foo" to equal "foo"? If you're arguing that that should be a different
type, well, that's a possibility. But does that mean someone who wants
an accent-insensitive match also needs a new type? And a phonebook
match, where "Mc" and "Mac" are the same?

It was my understanding that the problem with Turkish/Hungarian was that
we only allow one collation for strings over the whole database. The
point is that in the future you will be able to select this on a per
column/index/query basis, so we don't need to stick to such a
restriction if the user explicitly asks to ignore it.

On a more practical level, a Hash Join needs to produce the same results
as a Merge Join, so if (a = b) then (hash(a) = hash(b)). So if the user
types (a = b COLLATE ignorecase) then the hash function needs to change
to match.

> Specifically in the case of strings, two strings should only be considered
> "equal" if they consist of the exact same series of characters. (That is, they
> could be encoded differently but they have to encode the same actual
> characters.) That they happen to sort equally compared to all other strings
> doesn't mean that they're equal.

Sure, for straight strings (COLLATE POSIX), that's absolutely a
requirement. But people also have other requirements, like treating
strings case-insensitively. I don't think we should restrict ourselves
to not being able to support their wishes.

You do bring up the possibility of secondary sort functions: functions
which are not involved in testing for equality, but provide additional
sorting so that even in a case-insensitive sort, the different
variations in case appear together. An "All variations are equal, but
some are more equal than others" type of setup.

Thanks for the feedback,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
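
P.S. To make the Hash Join point above concrete, here is a minimal
sketch in Python. It is not PostgreSQL code: the COLLATIONS table, the
"ignorecase" collation name and both join functions are hypothetical
stand-ins, and zlib.crc32 merely plays the role of a hash function. It
only illustrates that the value fed to the hash has to follow the
equality the user asked for.

```python
import zlib

# Hypothetical collations: each maps a string to the key used for equality.
COLLATIONS = {
    "posix": lambda s: s,                   # exact character match
    "ignorecase": lambda s: s.casefold(),   # "Foo" equals "foo"
}

def toy_hash(s):
    # Stand-in for a real hash function; deterministic across runs.
    return zlib.crc32(s.encode("utf-8"))

def nested_loop_join(left, right, collation):
    # Reference result: compare every pair under the requested collation.
    key = COLLATIONS[collation]
    return sorted((a, b) for a in left for b in right if key(a) == key(b))

def hash_join(left, right, collation, hash_input):
    # hash_input decides which form of the string is hashed; equality is
    # still rechecked under the requested collation inside each bucket.
    eq_key = COLLATIONS[collation]
    buckets = {}
    for b in right:
        buckets.setdefault(toy_hash(hash_input(b)), []).append(b)
    out = []
    for a in left:
        for b in buckets.get(toy_hash(hash_input(a)), []):
            if eq_key(a) == eq_key(b):
                out.append((a, b))
    return sorted(out)
```

With left = ["Foo", "bar"] and right = ["foo", "BAR", "baz"], hashing
the case-folded key gives the same pairs as the nested loop under
"ignorecase", while hashing the raw strings puts "Foo" and "foo" in
different buckets, so the join silently loses those rows.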