Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Craig Ringer
On 07/19/2012 04:58 PM, Alban Hertroys wrote: On 19 July 2012 10:40, Alexander Law wrote: Ok, maybe the time of a real universal encoding has not yet come. Then maybe we should just add a new parameter "log_encoding" (UTF-8 by default) to postgresql.conf and use this encoding consistently wit

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Craig Ringer
On 07/19/2012 03:24 PM, Tatsuo Ishii wrote: BTW, I'm not stuck on mule-internal encoding. What we need here is a "super" encoding which could include any existing encoding without information loss. For this purpose, I think we can even invent a new encoding (maybe something like very first prpo

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alban Hertroys
On 19 July 2012 13:50, Alexander Law wrote: >> I like Craig's idea of adding the client encoding to the log lines. A >> possible problem with that (I'm not an encoding expert) is that a log >> line like that will contain data about the database server meta-data >> (log time, client encoding, etc)

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alexander Law
I like Craig's idea of adding the client encoding to the log lines. A possible problem with that (I'm not an encoding expert) is that a log line like that will contain data about the database server meta-data (log time, client encoding, etc) in the database default encoding and database data (the

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alexander Law
Sorry, that was an inaccurate phrase. I meant "if the conversion to this encoding is not available". For example, when we have a database in EUC_JP and log_encoding set to Latin1. I think that we can even fall back to UTF-8 as we can convert all encodings to it (with some exceptions that you noticed).

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alban Hertroys
Yikes, messed up my grammar a bit I see! On 19 July 2012 10:58, Alban Hertroys wrote: > I like Craig's idea of adding the client encoding to the log lines. A > possible problem with that (I'm not an encoding expert) is that a log > line like that will contain data about the database server meta-

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alban Hertroys
On 19 July 2012 10:40, Alexander Law wrote: >>> Ok, maybe the time of a real universal encoding has not yet come. Then >>> maybe we should just add a new parameter "log_encoding" (UTF-8 by >>> default) to postgresql.conf and use this encoding consistently >>> within logging_collector. >>> If thi

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Tatsuo Ishii
> Sorry, that was an inaccurate phrase. I meant "if the conversion to this > encoding is not available". For example, when we have a database in > EUC_JP and log_encoding set to Latin1. I think that we can even fall > back to UTF-8 as we can convert all encodings to it (with some > exceptions that you noti

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alexander Law
Ok, maybe the time of a real universal encoding has not yet come. Then maybe we should just add a new parameter "log_encoding" (UTF-8 by default) to postgresql.conf and use this encoding consistently within logging_collector. If this encoding is not available then fall back to 7-bit ASCII. What
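
The fallback described in this message can be illustrated outside PostgreSQL with a small Python sketch. It is only an illustration of the idea, not PostgreSQL code: the function name encode_log_line and its log_encoding argument are hypothetical, and writing unrepresentable characters as backslash escapes is an assumption about what "fall back to 7-bit ASCII" could mean in practice.

```python
import codecs


def encode_log_line(text: str, log_encoding: str = "utf-8") -> bytes:
    """Encode one log message for the log file (hypothetical helper).

    Tries the configured log encoding first. If that encoding is unknown,
    or the message contains characters it cannot represent, the message is
    written as 7-bit ASCII with backslash escapes instead of raising.
    """
    try:
        codecs.lookup(log_encoding)
    except LookupError:
        # The requested encoding is not available at all.
        return text.encode("ascii", errors="backslashreplace")
    try:
        return text.encode(log_encoding)
    except UnicodeEncodeError:
        # Some character has no equivalent in the log encoding.
        return text.encode("ascii", errors="backslashreplace")


if __name__ == "__main__":
    print(encode_log_line("connection authorized"))
    print(encode_log_line("日本語", "latin-1"))
```

The point of the design is that the logger never throws: a message that cannot be converted still reaches the file in a form any viewer can open, at the cost of readability for the escaped characters.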

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Tatsuo Ishii
>> You can google for "encoding "EUC_JP" has no equivalent in "UTF8"" or >> some such to find such an example. In this case PostgreSQL just throws >> an error. For frontend/backend encoding conversion this is fine. But >> what should we do for logs? Apparently we cannot throw an error here. >> >> "Un

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alexander Law
And regarding mule internal encoding - reading about Mule http://www.emacswiki.org/emacs/UnicodeEncoding I found: /In future (probably Emacs 22), Mule will use an internal encoding which is a UTF-8 encoding of a superset of Unicode. / So I still see UTF-8 as a common denominator for all the enco

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Alexander Law
The initial issue was that the log file contains messages in different encodings. So transcoding is performed already, but it's not This is not true. Transcoding happens only when PostgreSQL is built with the --enable-nls option (the default is no NLS). I'll restate the initial issue as I see it. I have Win

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Tatsuo Ishii
> Hello, >> >> Implementing any of these isn't trivial - especially making sure >> messages emitted to stderr from things like segfaults and dynamic >> linker messages are always correct. Ensuring that the logging >> collector knows when setlocale() has been called to change the >> encoding and tra

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-19 Thread Tatsuo Ishii
>> I am thinking about a variant of C. >> >> The problem with C is that converting from other encodings to UTF-8 is not >> cheap because it requires huge conversion tables. This may be a >> serious problem on a busy server. Also, it is possible that some information >> is lost in this conversion. This is be

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-18 Thread Alexander Law
Hello, C. We have one logfile with UTF-8. Pros: Log messages of all our clients can fit in it. We can use any generic editor/viewer to open it. Nothing changes for Linux (and other OSes with UTF-8 encoding). Cons: All the strings written to the log file should go through some conversion function.

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-18 Thread Tatsuo Ishii
> Tatsuo Ishii writes: >> My idea is using mule-internal encoding for the log file instead of >> UTF-8. There are several advantages: > >> 1) Conversion to mule-internal encoding is cheap because no conversion >> table is required. Also no information loss happens in this >> conversion. > >

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-18 Thread Tom Lane
Tatsuo Ishii writes: > My idea is using mule-internal encoding for the log file instead of > UTF-8. There are several advantages: > 1) Conversion to mule-internal encoding is cheap because no conversion > table is required. Also no information loss happens in this > conversion. > 2) Mule-in

Re: [GENERAL] [BUGS] main log encoding problem

2012-07-18 Thread Tatsuo Ishii
> C. We have one logfile with UTF-8. > Pros: Log messages of all our clients can fit in it. We can use any > generic editor/viewer to open it. > Nothing changes for Linux (and other OSes with UTF-8 encoding). > Cons: All the strings written to the log file should go through some > conversion function