Thank you Jacques The INT_BE made the trick - now I'm getting status 200 instead of the negative number. The problem is that I haven't seen any mention of this type anywhere in the documentation - maybe the corresponding section of the conversions should be expanded, because it refers only to standard types that mention only INT, LONG, etc,, without endian-specific variants
Another point is ease to use - if you need to make relatively complex query against HBase, your code is cluttered with all these CONVERT_FROMs - maybe there could be a way to "pre-register" HBase schema with data types, etc., and after that - use provided information. Otherwise, it's completely unusable, especially if the project targets analysts, not developers - they all will be confused by conversions, big endian types, etc. On Thu, Jul 23, 2015 at 1:51 AM, Jacques Nadeau <jacq...@dremio.com> wrote: > It is easier to understand using the BINARY_STRING and STRING_BINARY > functions that Aditya so kindly added. In general, CONVERT_TO and > CONVERT_FROM are converting to binary and from binary. The encoding > defines the translation. > > SELECT CONVERT_FROM(BINARY_STRING('\x00\x00\x00\xC8'), 'INT_BE') as cnvrt > from (VALUES (1)); > +--------+ > | cnvrt | > +--------+ > | 200 | > +--------+ > > You can read this expression as > 1. Start with string literal \x00\x00\x00\xC8 > 2. Decode that string literal into a VARBINARY with those four octets. > 3. Decode that VARBINARY understanding the data is encoded using big endian > four byte integer encoding. > 4. Return that value. > > So 000000C8 = 200 if we're dealing with a big endian integer. > > Other examples: > SELECT CONVERT_FROM(BINARY_STRING('\x17\x04\x00\x00'), 'INT') as cnvrt from > (VALUES (1)); > +--------+ > | cnvrt | > +--------+ > | 1047 | > +--------+ > > SELECT > STRING_BINARY(CONVERT_TO(1, 'INT')) as i, > STRING_BINARY(CONVERT_TO(1, 'INT_BE')) as i_be, > STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l, > STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l_be, > STRING_BINARY(CONVERT_TO(1, 'INT_HADOOPV')) as l_be > from (VALUES (1)); > > -------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > | i | i_be | l > | l_be | l_be0 | > > +-------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > | \x01\x00\x00\x00 | \x00\x00\x00\x01 | \x01\x00\x00\x00\x00\x00\x00\x00 > | \x01\x00\x00\x00\x00\x00\x00\x00 | \x01 | > > +-------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > > SELECT > STRING_BINARY(CONVERT_TO('hello', 'UTF8')) u8, > STRING_BINARY(CONVERT_TO('hello', 'UTF16')) u16 > from (VALUES (1)); > +--------+------------------------------------+ > | u8 | u16 | > +--------+------------------------------------+ > | hello | \xFE\xFF\x00h\x00e\x00l\x00l\x00o | > +--------+------------------------------------+ > > You can see a bunch of examples in the tests here [1]. > > [1] > > https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestConvertFunctions.java > > Now specifically to your three examples: > > > SELECT CONVERT_TO('[ [1, 2], [3, 4], [5]]','UTF-8') AS MYCOL1 FROM > sys.version; > > File a bug. This should work. > > > select cast(x as BINARY(10)) foo from (values 1000)tbl(x); > > I'm pretty sure that SQL doesn't allow a cast from integer to varbinary, > thus a correct failure message. > > >select convert_to(x, 'INT') from (values 1000) tbl(x); > > The problem here is that Drill treats all number literals as BIGINT. > You're trying to convert to a four byte encoding. This doesn't work since > your input type requires eight bytes of precision. You either need to cast > to lower precision or change your encoding to BIGINT or some other large > precision encoding such as BIGINT_HADOOPV or BIGINT_BE. > -- With best wishes, Alex Ott http://alexott.net/ Twitter: alexott_en (English), alexott (Russian) Skype: alex.ott