On 14-08-2020 13:09, Dimitry Sibiryakov wrote:
   Hello All.

  What does the SQL standard say about charset introducers (such as "_utf8 'abc'")?

   As far as I know, it can be interpreted in two ways:

1) The following byte sequence is in the given charset.
2) The following character sequence must be used as a string in the given charset.

  The first option results in a heavy limitation on query text transliteration and may end up with a single usable case: a literal that is a binary string in hexadecimal form. The second option makes the introducer a shortcut for CAST('abc' as CHAR CHARACTER SET utf8) but allows the query text to be freely transliterated between application and engine.
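To make the two readings concrete, here is a minimal Python sketch. It is an illustration only, not Firebird code: the query text charset (cp1252), the literal 'é', and both helper functions are assumptions chosen for the example.

```python
# Hypothetical illustration of the two readings of "_utf8 '...'".
# Assumption: the query text reaches the engine encoded in cp1252.
query_charset = "cp1252"
literal_text = "é"
literal_bytes = literal_text.encode(query_charset)  # the single byte 0xE9


def as_byte_sequence(raw: bytes, introducer_charset: str) -> str:
    """Reading 1: the raw bytes are already in the introducer's charset."""
    return raw.decode(introducer_charset)


def as_character_sequence(raw: bytes, text_charset: str,
                          introducer_charset: str) -> bytes:
    """Reading 2: decode with the query text charset, then convert (cast)."""
    return raw.decode(text_charset).encode(introducer_charset)


# Reading 1 fails here: a lone 0xE9 byte is not well-formed UTF-8.
try:
    interpretation_1 = as_byte_sequence(literal_bytes, "utf-8")
except UnicodeDecodeError:
    interpretation_1 = None

# Reading 2 succeeds: the character is transliterated to UTF-8.
interpretation_2 = as_character_sequence(literal_bytes, query_charset, "utf-8")
```

Under reading 1, any transliteration of the query text between application and engine changes the bytes that the introducer then reinterprets, which is the limitation described above.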

  I ask because the Unicode version of the ODBC interface needs the query text in fixed UTF-16, and that results in a transliteration problem.
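The following Python sketch shows why a hexadecimal literal is the one case that survives such a round trip under the byte-sequence reading. The query text, the connection charset (cp1252), and the parsing regex are assumptions made for illustration; this is not how any actual driver or the engine parses queries.

```python
import re

# Hypothetical query using a hexadecimal literal with an introducer.
query = "select _utf8 x'C3A9' from rdb$database"

# A Unicode ODBC application hands the text over as UTF-16.
wire_text = query.encode("utf-16-le")

# Assumption: the text is then transliterated to a cp1252 connection charset.
engine_text = wire_text.decode("utf-16-le").encode("cp1252")

# The hex digits are plain ASCII, so transliteration leaves them intact.
hex_digits = re.search(rb"x'([0-9A-Fa-f]+)'", engine_text).group(1)
literal_bytes = bytes.fromhex(hex_digits.decode("ascii"))

# The bytes named by the literal are still valid in the introducer's charset.
decoded = literal_bytes.decode("utf-8")
```

Because the literal spells out its bytes in ASCII digits, no transliteration step can change the byte sequence that the introducer applies to.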

The SQL standard is not very clear about it, to be honest, but as I read it, the introducer is NOT intended as a form of cast (at least, not with the 'normal' string literal).

Specifically it says:
"""
16) Case:
a) If a <character set specification> is not specified in a <character string literal>, then the set of characters contained in the <character string literal> shall be wholly contained in the character set of the <SQL-client module definition> that contains the <character string literal>.
b) Otherwise, there shall be no <separator> between the <introducer> and the <character set specification>, and the set of characters contained in the <character string literal> shall be wholly contained in the character set specified by the <character set specification>.

[..]

18) The character set of a <character string literal> is
Case:
a) If the <character string literal> specifies a <character set specification>, then the character set specified by that <character set specification>.
b) Otherwise, the character set of the SQL-client module that contains the <character string literal>.
"""

As I read it, the current behaviour of Firebird is correct; it is just damn awkward to achieve.

For the behaviour you want, Firebird would need to implement Unicode literals, because in Unicode literals the introducer serves as a form of cast.
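As a rough illustration of that cast-like behaviour, the Python sketch below decodes the body of a standard-style Unicode literal (U&'...' with \XXXX and \+XXXXXX escapes) into characters and only then converts it to the introducer's charset. The function names are hypothetical, and the sketch simplifies the syntax: it ignores quote doubling and the UESCAPE clause.

```python
import re

# Matches an escaped backslash, a 6-digit \+XXXXXX escape, or a 4-digit \XXXX escape.
_ESCAPE = re.compile(r"\\(\\|\+[0-9A-Fa-f]{6}|[0-9A-Fa-f]{4})")


def decode_unicode_literal_body(body: str) -> str:
    """Turn the escapes inside U&'...' into the characters they denote."""
    def replace(match: re.Match) -> str:
        token = match.group(1)
        if token == "\\":
            return "\\"
        return chr(int(token.lstrip("+"), 16))
    return _ESCAPE.sub(replace, body)


def literal_value(body: str, introducer_charset: str) -> bytes:
    """The literal denotes characters; the introducer picks the target charset."""
    return decode_unicode_literal_body(body).encode(introducer_charset)


# The same literal body yields different bytes depending on the introducer.
value_utf8 = literal_value(r"caf\00E9", "utf-8")
value_1252 = literal_value(r"caf\00E9", "cp1252")
```

Since the literal body is pure ASCII and names code points rather than bytes, the query text can be transliterated freely before the engine sees it.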

Mark
--
Mark Rotteveel


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
