[HACKERS] Unicode restriction

2004-08-03 Thread Oliver Elphick
In src/backend/utils/mb/wchar.c there is a check to exclude Unicode characters above 0x1. I can't see anything to explain this restriction, except possibly this in the release notes for 7.2: Reject invalid multibyte character sequences (Tatsuo) It does not explain why part of the

Re: [HACKERS] Unicode restriction

2004-08-03 Thread Tatsuo Ishii
In src/backend/utils/mb/wchar.c there is a check to exclude Unicode characters above 0x1. I can't see anything to explain this restriction, except possibly this in the release notes for 7.2: Reject invalid multibyte character sequences (Tatsuo) It does not explain why part of

Re: [HACKERS] Unicode restriction

2004-08-03 Thread Tom Lane
Tatsuo Ishii writes: Before 7.4, to be handled by the regex routines, UTF-8 was converted to ISO 10646. There was a limitation in the regex routines in that they could not handle multibyte characters wider than 2 bytes. In other words, only 16-bit UCS-2 was supported. That's why ISO 10646 0x1
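
To illustrate the 16-bit limitation described in the message above, here is a minimal sketch in C. It is not the code from wchar.c: the function names utf8_decode and fits_in_ucs2 are hypothetical, and the decoder is deliberately simplified (it does not reject overlong encodings or surrogate code points). It only shows why a UTF-8 sequence that decodes to a code point above 0xFFFF cannot be stored in a single UCS-2 unit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * decode one UTF-8 sequence of 1 to 4 bytes into a code point.
 * Returns the number of bytes consumed, or 0 if the sequence is
 * truncated or has a bad lead/continuation byte.
 * Simplification: overlong forms and surrogates are not rejected.
 */
size_t
utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t   need;
    size_t   i;
    uint32_t c;

    if (len == 0)
        return 0;

    if (s[0] < 0x80)
    {
        /* single-byte (ASCII) character */
        *cp = s[0];
        return 1;
    }
    else if ((s[0] & 0xE0) == 0xC0)
    {
        need = 2;
        c = s[0] & 0x1F;
    }
    else if ((s[0] & 0xF0) == 0xE0)
    {
        need = 3;
        c = s[0] & 0x0F;
    }
    else if ((s[0] & 0xF8) == 0xF0)
    {
        need = 4;
        c = s[0] & 0x07;
    }
    else
        return 0;               /* not a valid lead byte */

    if (len < need)
        return 0;               /* truncated sequence */

    for (i = 1; i < need; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    *cp = c;
    return need;
}

/*
 * Hypothetical check: does the code point fit in one 16-bit UCS-2 unit?
 * A regex engine limited to 16-bit characters could only accept
 * code points for which this returns nonzero.
 */
int
fits_in_ucs2(uint32_t cp)
{
    return cp <= 0xFFFF;
}
```

Under these assumptions, the three-byte sequence E2 82 AC (U+20AC) passes the check, while the four-byte sequence F0 9F 98 80 (U+1F600) decodes correctly but does not fit in 16 bits.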