On Wed, Apr 23, 2008 at 04:35:04PM +0200, Tim Tassonis wrote:
> Ok, let me put it in another way. If UTF-8 is chosen at initdb, only 
> UTF-8 databases can be created, if C is chosen, you can specify 
> different encodings (UTF-8, LATIN1 etc) for each database.
> 
> As I understood now, sorting will then still be in C style and not in 
> the locale specific way. Which leads me to the following questions:
> 
> If specifying a characterset different from the default locale for a 
> database is such a bad idea, why is it possible at all?

It isn't possible, that's the point. What is possible is that clients
can use any encoding they like to talk to the server, but the server
will store and manage it all in one. What locale C means is "I'm an
encoding wizard and will ensure all my programs can handle all the
encodings I want to use, because I understand the database will treat
everything I send as ASCII bytes no matter what encoding the clients
say it is".
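
To make that concrete, here is a rough Python sketch (not anything the
server actually runs; the word list and the helper name are made up for
illustration). It assumes that sorting under locale C amounts to a plain
byte-wise comparison, so the order you get depends on which encoding the
bytes happen to be in rather than on any language rules:

```python
# Hypothetical sample data, chosen only for illustration.
words = ["Zulu", "apple", "été", "€uro"]

def c_locale_sort(items, encoding):
    """Sort strings by their encoded bytes, the way a byte-wise
    (locale C style) comparison would see them."""
    return sorted(items, key=lambda s: s.encode(encoding))

# The same four words, stored in two different encodings.
utf8_order = c_locale_sort(words, "utf-8")
latin9_order = c_locale_sort(words, "iso8859_15")  # Python's name for LATIN9

# Uppercase ASCII sorts before lowercase in both cases, and the two
# non-ASCII words swap places because their leading bytes differ
# between the encodings.
print(utf8_order)
print(latin9_order)
```

Neither order is what a French or German speaker would expect, which is
why locale C is only safe if your applications do their own sorting.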

> From how I understand you, if I wanted a postgres server machine 
> supporting databases with different charsets, I'm advised to initialise 
> one cluster per locale.

If you want to control the *storage* charset, yes. If you just want
clients to think it's a LATIN9 DB, it is enough to do:

ALTER DATABASE foo SET client_encoding=latin9;
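
What that setting buys you can be sketched in Python (an illustration
only, not the server's code; the function name and the sample value are
hypothetical, and the storage encoding is assumed to be UTF-8). The
stored bytes never change, they are just recoded on the way to the
client:

```python
# Hypothetical value; the server is assumed to store it as UTF-8.
stored = "5 €".encode("utf-8")

def send_to_client(stored_bytes, server_encoding, client_encoding):
    """Recode stored bytes from the server's encoding into the
    encoding the client asked for."""
    return stored_bytes.decode(server_encoding).encode(client_encoding)

# "iso8859_15" is Python's codec name for LATIN9.
wire = send_to_client(stored, "utf-8", "iso8859_15")

print(stored.hex())  # bytes as kept on disk
print(wire.hex())    # bytes as seen by a LATIN9 client
```

A character that has no equivalent in the client's encoding would make
the encode step fail, so the client encoding has to be able to represent
whatever is actually stored.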

> If specifying a characterset different from the default locale for a 
> database is not a bad idea, why does the default install forbid me to do 
> exactly this?

It is a bad idea, because normally the C library can only handle one
encoding at a time. Locale C is a backdoor because it has system
independent semantics and does not require libc. It's also not what
people usually want, and so not recommended.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.
