If the column is delimited by some character, you can use the SPLIT()
function to separate the value into an array of values.  You can then use
the FLATTEN() function to separate the array of values into individual
records.

E.g., if your column has the value "a:b", where your delimiter is ":", you
would run the following query:

0: jdbc:drill:zk=local> select flatten(split(columns[0],':')) from
`/tmp/foo.csv`;
+---------+
| EXPR$0  |
+---------+
| a       |
| b       |
+---------+
2 rows selected (0.319 seconds)

Is that what you had in mind?

-- Zelaine

On Tue, Aug 30, 2016 at 7:17 AM, Robin Moffatt <
robin.moff...@rittmanmead.com> wrote:

> Hi,
>
> I'm trying to read a sequence file, in which the key is null and the value
> holds multiple columns [1], delimited by \x01. In Hive I simply define it
> as :
>
> CREATE EXTERNAL TABLE foo (col1 string, col2 string, col3 timestamp)
> ROW FORMAT DELIMITED
> STORED as sequencefile
> LOCATION '/user/oracle/foo/bar';
>
> In Drill I've got as far as
>
> SELECT CONVERT_FROM(binary_value, 'UTF8') from  `/user/oracle/foo/bar`
>
> which yields the data but as a single column. I can cast it to individual
> columns but this is no use if the field positions change
>
> SELECT substr(CONVERT_FROM(binary_value, 'UTF8'),5,1) as
> col0,substr(CONVERT_FROM(binary_value, 'UTF8'),7,13) as
> col1,substr(CONVERT_FROM(binary_value, 'UTF8'),20,20) as col3 from
>  `/user/oracle/seq/pdb.soe.logon` limit 5;
> +-------+----------------+-----------------------+
> | col0  |      col1      |         col3          |
> +-------+----------------+-----------------------+
> | I     | PDB.SOE.LOGON  | 2016-07-29 13:36:40  |
>
>
> Is there a way to treat a column as delimited and burst it out into
> multiple columns? Presumably I could somehow dump the string contents to
> CSV and then re-read it - but I'm interested here in using Drill the query
> existing data; wrangling it to suit Drill isn't really what I'm looking for
> (and maybe Drill just isn't the right tool here?).
>
>
> thanks,
>
> Robin.
>
> [1]
> https://docs.oracle.com/goldengate/bd1221/gg-bd/GADBD/
> GUID-85A82B2E-CD51-463A-8674-3D686C3C0EC0.htm#GADBD-GUID-
> 4CAFC347-0F7D-49AB-B293-EFBCE95B66D6
>

Reply via email to