Hi Hans,

Thanks very much, I’ve created JIRA 
https://issues.apache.org/jira/browse/TRAFODION-2669 , I think it will be 
better if we change it a little bit, including the issue I mentioned in another 
email. It says source type is 21 bytes and target type is 24 bytes, but in 
reality the value can not be inserted.

Best regards,
Yuan
Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn>
Cellphone: (+86) 13671935540

From: Hans Zeller [mailto:hans.zel...@esgyn.com]
Sent: Wednesday, June 28, 2017 11:31 AM
To: user@trafodion.incubator.apache.org
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

Yuan, you made a very good point, the error message is misleading. It shows the 
target data type as CHAR(24 BYTES) when in reality it is CHAR(6). I think this 
could be made better (probably not perfect) by a change in method 
ExRaiseDetailSqlError() in file ExpError.cpp.

Do you want to file a JIRA for this? I can also do that if you like. You can 
assign the JIRA to me.

It might also be good to change the default terminal charset of sqlci to UTF-8. 
What do people think, should we make that incompatible change? It may be some 
work, not sure it's worth it.

Thanks,

Hans

From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn]
Sent: Tuesday, June 27, 2017 8:03 PM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

That is good to hear that, thanks, Hans~

Best regards,
Yuan
Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn>
Cellphone: (+86) 13671935540

From: Hans Zeller [mailto:hans.zel...@esgyn.com]
Sent: Wednesday, June 28, 2017 2:38 AM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

Hi,

The problem is how sqlci interprets its input data. In your case, sqlci 
interprets the data as ISO8859-1, that makes the input 3*3 bytes or 9 ISO 
8859-1 characters. It then translates those 9 (junk) characters into UTF-8, 
requiring 18 bytes, and those don't fit into the target table. Once you tell 
sqlci to interpret its input as UTF-8, it works. Also, you need the terminal 
that you are using to use UTF-8 (in my case that's a Linux xterm which is set 
to UTF-8):


Apache Trafodion Conversational Interface 2.2.0
Copyright (c) 2015-2017 Apache Software Foundation
>>create table test_utf8(a varchar(6) character set utf8);

--- SQL operation complete.
>>insert into test_utf8 values('我我我');

*** ERROR[8402] A string overflow occurred during the evaluation of a character 
expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,18 BYTES,UTF8) 
Source Value:我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8).

--- 0 row(s) inserted.
>>set terminal_charset utf8;
>>insert into test_utf8 values('我我我');

--- 1 row(s) inserted.


Note the "set terminal_charset utf8" statement in sqlci that made it work.

When I ran trafci from a Linux prompt it worked - it seems trafci on Linux 
assumes its input is UTF-8.

My recommendation: a) use a terminal (e.g. xterm) that is set to UTF-8, b) if 
using sqlci, set the terminal charset of sqlci to utf8, c) type UTF-8 
characters into the terminal.

Hans


From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn]
Sent: Tuesday, June 27, 2017 12:52 AM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

Hmm.. Especially the error message caused misunderstanding,

21 bytes can not be inserted into 24 bytes?

*** ERROR[8402] A string overflow occurred during the evaluation of a character 
expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,21 BYTES,UTF8) 
Source Value:我我我我我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8). 
[2017-06-27 15:44:45]


Best regards,
Yuan
Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn>
Cellphone: (+86) 13671935540

From: Xu, Kai-Hua (Kevin) [mailto:kaihua...@esgyn.cn]
Sent: Tuesday, June 27, 2017 3:49 PM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

I got the same error on sqlci. I would suggest you use trafci. Ture, it an 
issue.
[cid:image001.jpg@01D2F003.010C09E0]

Best Regards,
Kevin Xu

From: Xu, Kai-Hua (Kevin)
Sent: Tuesday, June 27, 2017 3:41 PM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: RE: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

Hi Yuan,

Curious, it works for me.
[cid:image002.jpg@01D2F003.010C09E0]

Best Regards,
Kevin Xu

From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn]
Sent: Tuesday, June 27, 2017 2:54 PM
To: 
user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org>
Subject: How many Chinese characters can be stored in varchar(n chars) 
character set utf8?

Hi Trafodioneers,

I created one table as below,
>>create table test_utf8(a varchar(6) character set utf8);

--- SQL operation complete.
>>showddl test_utf8;

CREATE TABLE TRAFODION.SEABASE.TEST_UTF8
  (
    A                                VARCHAR(6 CHARS) CHARACTER SET UTF8
      COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
  )
ATTRIBUTES ALIGNED FORMAT
;

Then I tried to insert Chinese characters into the table,
>>insert into test_utf8 values('我我我');

*** ERROR[8402] A string overflow occurred during the evaluation of a character 
expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,18 BYTES,UTF8) 
Source Value:我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8).
>>insert into test_utf8 values('我我');

--- 1 row(s) inserted.

It seems that I can only insert 2 Chinese characters into the column, which is 
not as expected as I thought.

So I‘d want to ask how should I define the data type if I want to store N 
Chinese characters?

Another thing is that, from the error message, why the Source Type of ‘我我我’ is 
18 bytes, utf8 while the target type is 24 bytes, utf8?


Best regards,
Yuan
Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn>
Cellphone: (+86) 13671935540

Reply via email to