Hi Hans, Thanks very much, I’ve created JIRA https://issues.apache.org/jira/browse/TRAFODION-2669 , I think it will be better if we change it a little bit, including the issue I mentioned in another email. It says source type is 21 bytes and target type is 24 bytes, but in reality the value can not be inserted.
Best regards, Yuan Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn> Cellphone: (+86) 13671935540 From: Hans Zeller [mailto:hans.zel...@esgyn.com] Sent: Wednesday, June 28, 2017 11:31 AM To: user@trafodion.incubator.apache.org Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? Yuan, you made a very good point, the error message is misleading. It shows the target data type as CHAR(24 BYTES) when in reality it is CHAR(6). I think this could be made better (probably not perfect) by a change in method ExRaiseDetailSqlError() in file ExpError.cpp. Do you want to file a JIRA for this? I can also do that if you like. You can assign the JIRA to me. It might also be good to change the default terminal charset of sqlci to UTF-8. What do people think, should we make that incompatible change? It may be some work, not sure it's worth it. Thanks, Hans From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn] Sent: Tuesday, June 27, 2017 8:03 PM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? That is good to hear that, thanks, Hans~ Best regards, Yuan Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn> Cellphone: (+86) 13671935540 From: Hans Zeller [mailto:hans.zel...@esgyn.com] Sent: Wednesday, June 28, 2017 2:38 AM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? Hi, The problem is how sqlci interprets its input data. In your case, sqlci interprets the data as ISO8859-1, that makes the input 3*3 bytes or 9 ISO 8859-1 characters. It then translates those 9 (junk) characters into UTF-8, requiring 18 bytes, and those don't fit into the target table. Once you tell sqlci to interpret its input as UTF-8, it works. Also, you need the terminal that you are using to use UTF-8 (in my case that's a Linux xterm which is set to UTF-8): Apache Trafodion Conversational Interface 2.2.0 Copyright (c) 2015-2017 Apache Software Foundation >>create table test_utf8(a varchar(6) character set utf8); --- SQL operation complete. >>insert into test_utf8 values('我我我'); *** ERROR[8402] A string overflow occurred during the evaluation of a character expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,18 BYTES,UTF8) Source Value:我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8). --- 0 row(s) inserted. >>set terminal_charset utf8; >>insert into test_utf8 values('我我我'); --- 1 row(s) inserted. Note the "set terminal_charset utf8" statement in sqlci that made it work. When I ran trafci from a Linux prompt it worked - it seems trafci on Linux assumes its input is UTF-8. My recommendation: a) use a terminal (e.g. xterm) that is set to UTF-8, b) if using sqlci, set the terminal charset of sqlci to utf8, c) type UTF-8 characters into the terminal. Hans From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn] Sent: Tuesday, June 27, 2017 12:52 AM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? Hmm.. Especially the error message caused misunderstanding, 21 bytes can not be inserted into 24 bytes? *** ERROR[8402] A string overflow occurred during the evaluation of a character expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,21 BYTES,UTF8) Source Value:我我我我我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8). [2017-06-27 15:44:45] Best regards, Yuan Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn> Cellphone: (+86) 13671935540 From: Xu, Kai-Hua (Kevin) [mailto:kaihua...@esgyn.cn] Sent: Tuesday, June 27, 2017 3:49 PM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? I got the same error on sqlci. I would suggest you use trafci. Ture, it an issue. [cid:image001.jpg@01D2F003.010C09E0] Best Regards, Kevin Xu From: Xu, Kai-Hua (Kevin) Sent: Tuesday, June 27, 2017 3:41 PM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: RE: How many Chinese characters can be stored in varchar(n chars) character set utf8? Hi Yuan, Curious, it works for me. [cid:image002.jpg@01D2F003.010C09E0] Best Regards, Kevin Xu From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn] Sent: Tuesday, June 27, 2017 2:54 PM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: How many Chinese characters can be stored in varchar(n chars) character set utf8? Hi Trafodioneers, I created one table as below, >>create table test_utf8(a varchar(6) character set utf8); --- SQL operation complete. >>showddl test_utf8; CREATE TABLE TRAFODION.SEABASE.TEST_UTF8 ( A VARCHAR(6 CHARS) CHARACTER SET UTF8 COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED ) ATTRIBUTES ALIGNED FORMAT ; Then I tried to insert Chinese characters into the table, >>insert into test_utf8 values('我我我'); *** ERROR[8402] A string overflow occurred during the evaluation of a character expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,18 BYTES,UTF8) Source Value:我我我 to Target Type:VARCHAR(REC_BYTE_V_ASCII,24 BYTES,UTF8). >>insert into test_utf8 values('我我'); --- 1 row(s) inserted. It seems that I can only insert 2 Chinese characters into the column, which is not as expected as I thought. So I‘d want to ask how should I define the data type if I want to store N Chinese characters? Another thing is that, from the error message, why the Source Type of ‘我我我’ is 18 bytes, utf8 while the target type is 24 bytes, utf8? Best regards, Yuan Email: yuan....@esgyn.cn<mailto:yuan....@esgyn.cn> Cellphone: (+86) 13671935540