[ https://issues.apache.org/jira/browse/TRAFODION-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893263#comment-15893263 ]

ASF GitHub Bot commented on TRAFODION-2477:
-------------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/986

> Invalid characters in UCS2 to UTF8 translation are not handled correctly
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-2477
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2477
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.0-incubating
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>             Fix For: 2.2-incubating
>
>
> When translating from UCS-2 to UTF-8, using CAST or TRANSLATE(...
> UCS2TOUTF8), all valid characters will map easily to a UTF-8 character.
> However, if we encounter invalid code points or invalid UTF-16 surrogate
> pairs, those could raise errors. Right now we just suppress those errors.
> Instead we should either translate them to the Unicode "replacement
> character" U+FFFD or we should raise an error. Ideally, we should have a CQD
> that decides which of these two actions to take.
> Test case:
> create table tbaducs2(a char(10) character set ucs2);
> -- DC00 is a low-order UTF-16 surrogate, on its own this is invalid
> insert into tbaducs2 values(_ucs2 X'DC000041');
> select translate(a using ucs2toutf8) from tbaducs2;
> -- this returns an empty string - no error, no replacement character

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
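
For readers following the issue, the two behaviours proposed in the description (substitute U+FFFD, or raise an error) can be illustrated with a small standalone sketch. This is not Trafodion code: the function name `ucs2_to_utf8` and the `on_invalid` switch are hypothetical stand-ins for the translation routine and the suggested CQD, and the sketch assumes the hex literal lists 16-bit code units in big-endian order. It only treats unpaired surrogates as invalid; other invalid code points are not covered.

```python
REPLACEMENT = "\uFFFD"  # Unicode replacement character


def ucs2_to_utf8(raw: bytes, on_invalid: str = "replace") -> bytes:
    """Convert big-endian 16-bit code units to UTF-8 (hypothetical helper).

    on_invalid="replace" substitutes U+FFFD for an unpaired surrogate;
    on_invalid="error" raises ValueError instead.
    """
    if on_invalid not in ("replace", "error"):
        raise ValueError("on_invalid must be 'replace' or 'error'")
    if len(raw) % 2:
        raise ValueError("input must contain whole 16-bit code units")

    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Valid surrogate pair: combine into one supplementary code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            out.append(chr(cp))
            i += 2
            continue
        if 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate (high without low, or low on its own).
            if on_invalid == "error":
                raise ValueError(f"unpaired surrogate U+{u:04X} at code unit {i}")
            out.append(REPLACEMENT)
        else:
            out.append(chr(u))
        i += 1
    return "".join(out).encode("utf-8")


# The value from the test case: lone low surrogate DC00 followed by 'A'.
sample = bytes.fromhex("DC000041")
print(ucs2_to_utf8(sample))  # replacement mode keeps the 'A'
```

In replacement mode the trailing 'A' survives and the bad code unit becomes U+FFFD, rather than the whole result collapsing to an empty string as in the reported behaviour. In error mode the same input raises instead.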