On Tue, Jul 23, 2013 at 9:42 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> (Replying on phone, please forgive bad quoting)
>
> Isn't this pretty much what adopting ICU is supposed to give us?
> OS-independent collations?

Yes.

> I'd be interested in seeing the test data for this performance report, partly
> as I'd like to see how ICU collations would compare when ICU is crudely
> hacked into place for testing.

I pretty much lost interest in ICU upon reading that they use UTF-16
as their internal format.

http://userguide.icu-project.org/strings#TOC-Strings-in-ICU

What that would mean for us is that instead of copying the input
strings into a temporary buffer and passing the buffer to strcoll(),
we'd need to convert them to ICU's representation (which means writing
twice as many bytes as the length of the input string in cases where
the input string is mostly single-byte characters) and then call ICU's
strcoll() equivalent. I agree that it might be worth testing, but I
can't work up much optimism. It seems to me that something that
operates directly on the server encoding could run a whole lot faster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
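
To make the size argument above concrete, here is a rough C sketch of the two costs being compared. It is only an illustration, not PostgreSQL or ICU code: the function names compare_with_strcoll() and utf16_bytes_for_utf8() are made up for this example, the input is assumed to be valid UTF-8, and ICU itself is not called. The first function shows the copy-then-strcoll() pattern; the second just counts how many bytes the same input would occupy once converted to UTF-16.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: compare two byte strings that are not
 * NUL-terminated by copying each into a temporary buffer and
 * handing the copies to strcoll().
 */
static int
compare_with_strcoll(const char *a, size_t alen,
                     const char *b, size_t blen)
{
    char   *abuf = malloc(alen + 1);
    char   *bbuf = malloc(blen + 1);
    int     result;

    if (abuf == NULL || bbuf == NULL)
    {
        free(abuf);
        free(bbuf);
        abort();            /* out of memory; real code would report an error */
    }

    memcpy(abuf, a, alen);
    abuf[alen] = '\0';
    memcpy(bbuf, b, blen);
    bbuf[blen] = '\0';

    result = strcoll(abuf, bbuf);

    free(abuf);
    free(bbuf);
    return result;
}

/*
 * Hypothetical helper: number of bytes needed to hold valid UTF-8
 * input as UTF-16, not counting any terminator.
 */
static size_t
utf16_bytes_for_utf8(const unsigned char *s, size_t len)
{
    size_t  units = 0;
    size_t  i;

    for (i = 0; i < len; i++)
    {
        if ((s[i] & 0xC0) == 0x80)
            continue;       /* continuation byte, already counted */

        /* 4-byte UTF-8 sequences become a surrogate pair in UTF-16 */
        units += (s[i] >= 0xF0) ? 2 : 1;
    }
    return units * 2;       /* each UTF-16 code unit is 2 bytes */
}
```

For a 5-byte ASCII string, the strcoll() path copies 5 bytes plus a terminator, while utf16_bytes_for_utf8() reports 10 bytes for the converted form. That is the "twice as many bytes" case for mostly single-byte input; the ratio shrinks for text dominated by multi-byte characters.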