> Where can I read about basic tech details of Unicode / Charset
> Conversion / ...
>
> I'd like to find answers to the following (for database created using
> UNICODE)
>
> 1. Where exactly are conversions between national charsets done
There is no "national charset" in PostgreSQL. I assume you want to know
where frontend/backend encoding conversion happens. It is handled by
pg_server_to_client (converts BE to FE) and pg_client_to_server (FE to
BE). These functions are called by the communication subsystem
(backend/libpq) and by COPY. In summary, in most cases the encoding
conversion is done before the parser runs and after the executor
produces the final result.

> 2. What is converted (whole SQL statements or just data)

Whole statements.

> 3. What format is used for processing in memory (UCS-2, UCS-4, UTF-8,
> UTF-16, UTF-32, ...)

"Format"? I assume you are talking about the encoding. It is exactly the
same as the database encoding. For a UNICODE database we use UTF-8, not
UCS-2 or UCS-4.

> 4. What format is used when saving to disk (UCS-*, UTF-*, SCSU, ...) ?

Ditto.

> 5. Are LIKE/SIMILAR aware of locale stuff ?

I don't know about SIMILAR, but I believe LIKE is not locale aware, and
that this is correct from the standard's point of view...
--
Tatsuo Ishii
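
The conversion flow described in the answers to questions 1 to 3 can be
illustrated with a small toy model. This is only a sketch, not PostgreSQL
code: the helpers client_to_server and server_to_client are hypothetical
stand-ins for the role played by pg_client_to_server and
pg_server_to_client, the LATIN1 client encoding and the sample statement
are assumptions, and Python's codecs are used in place of the backend's
own conversion routines.

```python
# Toy model only: these helpers mimic the role of pg_client_to_server and
# pg_server_to_client; they are not PostgreSQL code.
SERVER_ENCODING = "utf-8"  # a UNICODE database stores and processes UTF-8


def client_to_server(raw: bytes, client_encoding: str) -> bytes:
    """Convert a whole statement from the client encoding to UTF-8 (FE to BE)."""
    return raw.decode(client_encoding).encode(SERVER_ENCODING)


def server_to_client(raw: bytes, client_encoding: str) -> bytes:
    """Convert result bytes from UTF-8 back to the client encoding (BE to FE)."""
    return raw.decode(SERVER_ENCODING).encode(client_encoding)


# Hypothetical statement sent by a client that uses LATIN1.
statement = "SELECT 'caf\u00e9'"
wire_bytes = statement.encode("latin-1")  # bytes as sent by the client

# The whole statement is converted, not just the literal inside it,
# and this happens before anything resembling a parser sees the text.
backend_bytes = client_to_server(wire_bytes, "latin-1")

# The accented character takes one byte in LATIN1 and two bytes in UTF-8,
# so the backend copy is one byte longer than the wire copy.
print(len(wire_bytes), len(backend_bytes))

# Converting back for the client restores the original bytes.
print(server_to_client(backend_bytes, "latin-1") == wire_bytes)
```

The point of the sketch is that the in-memory and on-disk representation
is simply the database encoding (UTF-8 here), and that the client
encoding only exists at the boundary between frontend and backend.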