Re: [Firebird-net-provider] Short char field with longer name

Helen Borrie Tue, 05 Jun 2007 17:12:33 -0700

At 06:12 AM 6/06/2007, you wrote:
>YES! Changing the column's type to VARCHAR(1) works.
>
>Now can someone tell me why?


You said your data was UTF8 (up to 4 bytes per character) and your 
connection charset is NONE (no transliteration; the client assumes 
one-byte characters).  ASCII "D" is the single byte "44", and so is 
UTF8 "D", but a UTF8 CHAR(1) column reserves four bytes and a CHAR is 
always right-padded to its full width, so four bytes come back where 
the client expects one.  Varchar(1) is an absurdity, but the engine 
dutifully strips the right-padding from varchar byte streams and the 
client gets the single byte it is expecting.  This hack is only going 
to work for characters whose UTF8 encoding is a single byte, i.e. 
ASCII codes < hex 80, and only if the search string is a single 
character. Beyond that, all bets are off.
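
To make the byte-level argument concrete, here is a minimal sketch in 
Python (chosen only for illustration; the thread itself is about the 
.NET provider).  It relies on the fact that standard UTF-8 encodes 
every ASCII character as the same single byte, and it assumes that a 
UTF8 column reserves 4 bytes per declared character.  The helper 
names are hypothetical and are not part of Firebird or the provider.

```python
# Hypothetical helpers for illustration; not part of Firebird or its .NET provider.

MAX_UTF8_BYTES_PER_CHAR = 4  # assumption: bytes reserved per declared character


def reserved_width(declared_chars: int) -> int:
    """Bytes reserved for a UTF8 column of declared_chars characters (assumed 4 each)."""
    return declared_chars * MAX_UTF8_BYTES_PER_CHAR


def survives_single_byte_read(value: str) -> bool:
    """True if keeping only the first UTF-8 byte of value still yields value."""
    first = value.encode("utf-8")[:1]
    try:
        return first.decode("ascii") == value
    except UnicodeDecodeError:
        # The first byte is >= hex 80, so it is not a complete character by itself.
        return False


if __name__ == "__main__":
    for value in ("D", "\u00e9", "DD"):
        encoded = value.encode("utf-8").hex(" ")
        print(ascii(value), encoded, survives_single_byte_read(value))
    print("bytes reserved for one character:", reserved_width(1))
```

Only the single ASCII character passes the check.  The accented 
character fails because its first byte is not meaningful alone, and 
the two-character string fails because a one-byte read drops the 
second character.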

>Doesn't my experience still indicate that there
>is some sort of bug with columns of type CHAR(1)?

In your current setup, yes.  The hack only works if both the client 
charset and the database charset have encodings for all equivalent 
characters _and_ both remain meaningful if reduced to the leftmost 
byte.  (The same applies to the introducer syntax, see below.)

With client charset NONE, you are getting transliteration when 
*storing* strings (since the database engine takes care of it) but it 
can't happen for reads, since the client only knows about single-byte 
encodings and prepares its buffers accordingly.

You'll need to set the correct client charset to ensure that the 
client side of the interface knows the correct attributes for the 
data it is preparing for.  Also follow up on Carlos' advice to 
revisit your Windows language environment and do whatever is needed 
to ensure that the strings being processed by the application are 
valid for UTF8.
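
As a rough way to check whether the bytes an application hands over 
really are valid UTF8, a sketch along these lines may help.  It is 
written in Python purely for illustration, the function name is 
hypothetical, and the Windows-1252 sample is an assumption about what 
a Western European Windows environment might produce.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    sample = "D\u00e9"  # "D" followed by e-acute
    # Encoded as UTF-8, the sample is valid input for a UTF8 column.
    print(is_valid_utf8(sample.encode("utf-8")))
    # Encoded as Windows-1252 (assumed ANSI code page), the same text is not valid UTF-8.
    print(is_valid_utf8(sample.encode("cp1252")))
```

Pure ASCII text passes either way, which is why this kind of mismatch 
often goes unnoticed until the first accented character turns up.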

Side note:
When you have a literal search argument (as opposed to a 
parameterised one) you can use the introducer syntax to coerce 
literal input strings to a different character set than the 
connection charset, e.g.

SELECT Description FROM TableName
WHERE Type = _UTF8 'D'

While this syntax can have its uses in databases that store data in a 
variety of character sets, it's no kind of universal formula for 
hacking around your mismatched client/server encodings.  It's not 
possible to coerce output so, as you have already discovered, your 
client prepares a single-byte buffer for your unsearched SELECT 
statement because it is expecting ASCII-encoded characters.

>By the way, I renamed the column back to TYPE and it doesn't seem to cause
>any problems.

This was a red herring.  From Fb 2.0 on, TYPE is no longer a reserved 
word.  See p. 45 (Acrobat p. 55) of the release notes...

cheers,
Helen

