It is easier to understand using the BINARY_STRING and STRING_BINARY
functions that Aditya so kindly added.  In general, CONVERT_TO and
CONVERT_FROM are converting to binary and from binary.  The encoding
defines the translation.

SELECT CONVERT_FROM(BINARY_STRING('\x00\x00\x00\xC8'), 'INT_BE') as cnvrt
from (VALUES (1));
+--------+
| cnvrt  |
+--------+
| 200    |
+--------+

You can read this expression as
1. Start with string literal \x00\x00\x00\xC8
2. Decode that string literal into a VARBINARY with those four octets.
3. Decode that VARBINARY understanding the data is encoded using big endian
four byte integer encoding.
4. Return that value.

So 000000C8 = 200 if we're dealing with a big endian integer.

Other examples:
SELECT CONVERT_FROM(BINARY_STRING('\x17\x04\x00\x00'), 'INT') as cnvrt from
(VALUES (1));
+--------+
| cnvrt  |
+--------+
| 1047   |
+--------+

SELECT
  STRING_BINARY(CONVERT_TO(1, 'INT')) as i,
  STRING_BINARY(CONVERT_TO(1, 'INT_BE')) as i_be,
  STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l,
  STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l_be,
  STRING_BINARY(CONVERT_TO(1, 'INT_HADOOPV')) as l_be
from (VALUES (1));
-------------------+-------------------+-----------------------------------+-----------------------------------+--------+
|         i         |       i_be        |                 l
|               l_be                | l_be0  |
+-------------------+-------------------+-----------------------------------+-----------------------------------+--------+
| \x01\x00\x00\x00  | \x00\x00\x00\x01  | \x01\x00\x00\x00\x00\x00\x00\x00
 | \x01\x00\x00\x00\x00\x00\x00\x00  | \x01   |
+-------------------+-------------------+-----------------------------------+-----------------------------------+--------+

SELECT
  STRING_BINARY(CONVERT_TO('hello', 'UTF8')) u8,
  STRING_BINARY(CONVERT_TO('hello', 'UTF16')) u16
from (VALUES (1));
+--------+------------------------------------+
|   u8   |                u16                 |
+--------+------------------------------------+
| hello  | \xFE\xFF\x00h\x00e\x00l\x00l\x00o  |
+--------+------------------------------------+

You can see a bunch of examples in the tests here [1].

[1]
https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestConvertFunctions.java

Now specifically to your three examples:

> SELECT CONVERT_TO('[ [1, 2], [3, 4], [5]]','UTF-8') AS MYCOL1 FROM
sys.version;

File a bug.  This should work.

> select cast(x as BINARY(10)) foo from (values 1000)tbl(x);

I'm pretty sure that SQL doesn't allow a cast from integer to varbinary,
thus a correct failure message.

>select convert_to(x, 'INT') from (values 1000) tbl(x);

The problem here is that Drill treats all number literals as BIGINT.
You're trying to convert to a four byte encoding.  This doesn't work since
your input type requires eight bytes of precision.  You either need to cast
to lower precision or change your encoding to BIGINT or some other large
precision encoding such as BIGINT_HADOOPV or BIGINT_BE.

Reply via email to