Re: [HACKERS] Can ICU be used for a database's default sort order?

2017-06-23 Thread Tom Lane
Peter Geoghegan writes:
> On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut wrote:
>> 1) Associate by name only.  That is, you can create a database with any
>> COLLATION "foo" that you want, and it's only checked when you first
>> connect to or do anything in the database.
>> 
>> 2) Create shared collations.  Then we'd need a way to manage having a
>> mix of shared and non-shared collations around.
>> 
>> There are significant pros and cons to all of these ideas.  Some people
>> I talked to appeared to prefer the shared collations approach.

> I strongly prefer the second approach. The only downside that occurs
> to me is that that approach requires more code. Is there something
> that I've missed?

I'm not very clear on how you'd bootstrap template1 into anything
other than C locale in the second approach.  With our existing
libc-based stuff, it's possible to define what the database's locale
is before there are any catalogs.  It's not apparent how to do that with
a collation-based solution.

In my mind, collations are just a SQL-syntax wrapper for locales that
are really defined one level down.  I think we'd be well advised to
carry that same approach into the database properties, because otherwise
we have circularities to deal with. So I'm imagining something more like

create database encoding 'utf8' lc_collate 'icu-en_US' lc_ctype ...

where lc_collate is just a string that we know how to interpret, the
same as now.

We could optionally reduce the amount of notation involved by merging the
lc_collate and lc_ctype parameters into one, say

create database encoding 'utf8' locale 'icu-en_US' ...

I'm not too clear on how this would play with other libc locale
functionality (lc_monetary and so on), but we'd have to deal with
that question anyway.
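
To illustrate the "string that we know how to interpret" idea, here is a minimal sketch, not taken from any patch. It assumes that the "icu-" prefix from the examples above selects the ICU provider and that anything else is handed to libc unchanged. LocaleSpec and parse_locale_spec are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class LocaleSpec(NamedTuple):
    """Hypothetical result type: which provider, and the name it should see."""
    provider: str  # "icu" or "libc"
    name: str      # provider-specific locale name


def parse_locale_spec(spec: str) -> LocaleSpec:
    """Split a locale string into (provider, name) without any catalog lookup.

    Assumption: a leading "icu-" marks an ICU locale; every other string is
    treated as a libc locale name, as it is today.
    """
    prefix = "icu-"
    if spec.startswith(prefix) and len(spec) > len(prefix):
        return LocaleSpec("icu", spec[len(prefix):])
    return LocaleSpec("libc", spec)


if __name__ == "__main__":
    for s in ("icu-en_US", "en_US.UTF-8", "C"):
        print(s, "->", parse_locale_spec(s))
```

Because the provider is derived from the string alone, nothing here needs pg_collation to exist, which is the property that matters for bootstrapping template1.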

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can ICU be used for a database's default sort order?

2017-06-23 Thread Peter Geoghegan
On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut wrote:
> It's something I hope to address soon.

I hope you do. I think that we'd realize significant benefits by
having ICU become the de facto standard collation provider, one that most
users get without even realizing it. As things stand, you have to make
a point of specifying an ICU collation as your per-column collation
within every CREATE TABLE. That's a significant barrier to adoption.

> 1) Associate by name only.  That is, you can create a database with any
> COLLATION "foo" that you want, and it's only checked when you first
> connect to or do anything in the database.
>
> 2) Create shared collations.  Then we'd need a way to manage having a
> mix of shared and non-shared collations around.
>
> There are significant pros and cons to all of these ideas.  Some people
> I talked to appeared to prefer the shared collations approach.

I strongly prefer the second approach. The only downside that occurs
to me is that that approach requires more code. Is there something
that I've missed?

-- 
Peter Geoghegan


Re: [HACKERS] Can ICU be used for a database's default sort order?

2017-06-23 Thread Peter Eisentraut
On 6/22/17 23:10, Peter Geoghegan wrote:
> On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane wrote:
>> Is there some way I'm missing, or is this just a not-done-yet feature?
> 
> It's a not-done-yet feature.

It's something I hope to address soon.

The main definitional challenge is how to associate a pg_database entry
with a collation.

What we currently effectively do is duplicate the fields of pg_collation
in pg_database.  But I imagine over time we'll add more properties in
pg_collation, along with additional ALTER COLLATION commands etc., so
duplicating all of that would be a significant amount of code
complication and result in a puzzling user interface.

Ideally, I'd like to see CREATE DATABASE ... COLLATION "foo".  But the
problem is of course that collations are per-database objects.  Possible
solutions:

1) Associate by name only.  That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.

2) Create shared collations.  Then we'd need a way to manage having a
mix of shared and non-shared collations around.

There are significant pros and cons to all of these ideas.  Some people
I talked to appeared to prefer the shared collations approach.
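
As a rough illustration of option 1, the sketch below models "associate by name only": the database records just a collation name at creation time, and the name is resolved against that database's own collations on first connect. This is a toy model under stated assumptions, not proposed code. Database, CollationNotFound and the collations dict (standing in for a per-database pg_collation) are all hypothetical.

```python
class CollationNotFound(Exception):
    """Raised when a database's recorded collation name cannot be resolved."""


class Database:
    def __init__(self, name, default_collation, collations):
        # Creation stores the collation name without checking it.
        self.name = name
        self.default_collation = default_collation
        # Stand-in for this database's own pg_collation: name -> locale.
        self.collations = collations

    def connect(self):
        # The name is checked only here, on first use of the database.
        try:
            return self.collations[self.default_collation]
        except KeyError:
            raise CollationNotFound(
                f'collation "{self.default_collation}" does not exist '
                f'in database "{self.name}"'
            ) from None


if __name__ == "__main__":
    ok = Database("db1", "foo", {"foo": "en-US"})
    print(ok.connect())
    broken = Database("db2", "foo", {})
    try:
        broken.connect()
    except CollationNotFound as e:
        print("connect failed:", e)
```

The sketch makes the main cost of option 1 visible: a bad name is accepted at creation time and surfaces only as a failure at connect time.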

Other ideas?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Can ICU be used for a database's default sort order?

2017-06-22 Thread Peter Geoghegan
On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane wrote:
> Is there some way I'm missing, or is this just a not-done-yet feature?

It's a not-done-yet feature.


-- 
Peter Geoghegan


[HACKERS] Can ICU be used for a database's default sort order?

2017-06-22 Thread Tom Lane
I tried to arrange $subject via

create database icu encoding 'utf8' lc_ctype "en-US-x-icu"
    lc_collate "en-US-x-icu" template template0;

and got only

ERROR:  invalid locale name: "en-US-x-icu"

which is unsurprising after looking into the code, because createdb()
checks those parameters with check_locale() which only knows about
libc-defined locale names.
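
For readers unfamiliar with what a libc-only check amounts to, here is a small sketch of the general technique (try the name with setlocale, then restore the previous setting). It is written in Python against the standard locale module and is not the PostgreSQL source; the function name is reused only for readability. The outcome for any given name depends on the platform's libc and installed locales.

```python
import locale


def check_locale(category, name):
    """Return True if libc accepts `name` for `category`, else False.

    The previous setting is saved and restored, so the process locale is
    left unchanged whatever the outcome.
    """
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)


if __name__ == "__main__":
    for candidate in ("C", "en-US-x-icu"):
        print(candidate, check_locale(locale.LC_COLLATE, candidate))
```

A check of this shape can only ever accept names that libc itself recognizes, so an ICU collation name is expected to be rejected on typical systems no matter what pg_collation contains.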

Is there some way I'm missing, or is this just a not-done-yet feature?

regards, tom lane

