Re: [BUGS] BUG #3766: tsearch2 index creation error

2007-11-24 Thread Thomas H.

Tom Lane wrote:

Operating system:   Windows 2003



CREATE INDEX posts_fts_idx ON forum.posts USING gin(to_tsvector('english',
p_msg_clean));
ERROR:  translation from wchar_t to server encoding failed: No error


Hmm.  That error message is close to some code that is specific to the
Windows-and-UTF8 case, which might explain why I don't see it.

Can any Windows hackers check into whether the WIN32 coding in
wchar2char() and char2wchar() in ts_locale.c is sane?


has anyone had the chance to look into that problem? i'd be more than 
willing to help testing an updated build if needed.


thanks,
thomas


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [BUGS] BUG #3766: tsearch2 index creation error

2007-11-24 Thread Tom Lane
Thomas H. [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Can any Windows hackers check into whether the WIN32 coding in
>> wchar2char() and char2wchar() in ts_locale.c is sane?

> has anyone had the chance to look into that problem? i'd be more than
> willing to help testing an updated build if needed.

After re-reading Microsoft's man pages I think I see the problem ---
attached patch is applied.

regards, tom lane

Index: src/backend/tsearch/ts_locale.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tsearch/ts_locale.c,v
retrieving revision 1.4
diff -c -r1.4 ts_locale.c
*** src/backend/tsearch/ts_locale.c 15 Nov 2007 21:14:38 -  1.4
--- src/backend/tsearch/ts_locale.c 24 Nov 2007 21:14:49 -
***************
*** 23,29 ****
   * wchar2char --- convert wide characters to multibyte format
   *
   * This has the same API as the standard wcstombs() function; in particular,
!  * tolen is the maximum number of bytes to store at *to, and *from should be
   * zero-terminated.  The output will be zero-terminated iff there is room.
   */
  size_t
--- 23,29 ----
   * wchar2char --- convert wide characters to multibyte format
   *
   * This has the same API as the standard wcstombs() function; in particular,
!  * tolen is the maximum number of bytes to store at *to, and *from must be
   * zero-terminated.  The output will be zero-terminated iff there is room.
   */
  size_t
***************
*** 73,93 ****
      {
          int         r;
  
!         r = MultiByteToWideChar(CP_UTF8, 0, from, fromlen, to, tolen);
! 
!         if (r <= 0)
          {
!             pg_verifymbstr(from, fromlen, false);
!             ereport(ERROR,
!                     (errcode(ERRCODE_CHARACTER_NOT_IN_REPERTOIRE),
!                      errmsg("invalid multibyte character for locale"),
!                      errhint("The server's LC_CTYPE locale is probably incompatible with the database encoding.")));
          }
  
!         Assert(r <= tolen);
  
!         /* Microsoft counts the zero terminator in the result */
!         return r - 1;
      }
  #endif   /* WIN32 */
  
--- 73,100 ----
      {
          int         r;
  
!         /* stupid Microsloth API does not work for zero-length input */
!         if (fromlen == 0)
!             r = 0;
!         else
          {
!             r = MultiByteToWideChar(CP_UTF8, 0, from, fromlen, to, tolen - 1);
! 
!             if (r <= 0)
!             {
!                 /* see notes in oracle_compat.c about error reporting */
!                 pg_verifymbstr(from, fromlen, false);
!                 ereport(ERROR,
!                         (errcode(ERRCODE_CHARACTER_NOT_IN_REPERTOIRE),
!                          errmsg("invalid multibyte character for locale"),
!                          errhint("The server's LC_CTYPE locale is probably incompatible with the database encoding.")));
!             }
          }
  
!         Assert(r < tolen);
!         to[r] = 0;
  
!         return r;
      }
  #endif   /* WIN32 */
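
For readers without a Windows build at hand, the control flow of the patched
hunk can be sketched portably as below. This is only an illustration under
stated assumptions, not the PostgreSQL code: fake_mb2wc and sketch_char2wchar
are hypothetical names, the stand-in converter handles ASCII only, and it
imitates just two documented behaviours of MultiByteToWideChar (a zero-length
input is reported as failure, and a counted input yields no terminator in the
output). A (size_t) -1 return stands in for the ereport() call.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical stand-in for MultiByteToWideChar(CP_UTF8, ...), ASCII only.
 * Returns the number of wide characters written, or 0 on failure.
 * Like the real call with an explicit length, it writes no terminator.
 */
int
fake_mb2wc(const char *from, int fromlen, wchar_t *to, int tolen)
{
    int         i;

    if (fromlen == 0 || fromlen > tolen)
        return 0;               /* empty input or too little room: failure */
    for (i = 0; i < fromlen; i++)
    {
        if ((unsigned char) from[i] >= 0x80)
            return 0;           /* non-ASCII is outside this stand-in */
        to[i] = (wchar_t) from[i];
    }
    return fromlen;
}

/*
 * Convert fromlen bytes into a buffer of tolen wide characters.
 * Returns the number of characters stored, not counting the terminator,
 * or (size_t) -1 on failure.
 */
size_t
sketch_char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
{
    int         r;

    if (tolen == 0)
        return (size_t) -1;     /* no room even for the terminator */

    if (fromlen == 0)
        r = 0;                  /* never hand an empty input to the API */
    else
    {
        /* pass tolen - 1 so one slot stays free for the terminator */
        r = fake_mb2wc(from, (int) fromlen, to, (int) tolen - 1);
        if (r <= 0)
            return (size_t) -1; /* the patched code raises an error here */
    }

    assert((size_t) r < tolen);
    to[r] = 0;                  /* terminate explicitly */
    return (size_t) r;
}
```

With an 8-element buffer, an empty input returns 0 and leaves an empty
terminated string; the pre-patch flow would have treated that same input as a
conversion failure.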
  