Re: Accessing json fields within CSV file

2017-06-08 Thread Andries Engelbrecht
You can use convert_from and JSON data type.

0: jdbc:drill:> select t.col1, t.col2, t.conv.key1 as key1, t.conv.key2 as 
key2, t.col4 from
. . . . . . . > (select columns[0] as col1 , columns[1] as col2 , 
convert_from(columns[2], 'JSON') as conv  , columns[3] as col4 from 
`/flat/psv-json/json.tbl`) t;
+---+---+-+-+---+
| col1  | col2  |  key1   |  key2   | col4  |
+---+---+-+-+---+
| 1 | xyz   | value1  | value2  | abc   |




If you want to use functions like flatten you will need to make sure the JSON 
in represented in an array.
i.e. [{"key":1, "value": 1},{"key":2, "value":2}]

0: jdbc:drill:> select t.col1, t.col2, t.conv.key as key, t.conv.`value` as 
`value`, t.col4 from
. . . . . . . > (select columns[0] as col1, columns[1]as col2, 
flatten((convert_from(columns[2],'JSON'))) as conv,  columns[3] as col4 from 
`/flat/psv-json/json.tbl`) t;
+---+---+--++---+
| col1  | col2  | key  | value  | col4  |
+---+---+--++---+
| 1 | xyz   | 1| 1  | abc   |
| 1 | xyz   | 2| 2  | abc   |
+---+---+--++---+



--Andries




On 6/8/17, 2:22 AM, "ankit jain"  wrote:

Hi,
I have a few psv file with a few of the columns being a json key value map.
Example:

> 1|xyz|{"key1":"value1", "key2":"value2"}|abc|


I am converting these files to parquet format but want to convert the json
key and values to different columns. How is that possible?

end product being:
id name key1 key2 description
1 xyz value1 value2 abc

Right now am doing something like this but the json column wont explode:

CREATE TABLE dfs.data.`/logs/logsp/`  AS SELECT
> CAST(columns[0] AS INT)  `id`,
> columns[1] AS `name`,
> columns[2] AS `json_column`,
> columns[3] AS `description`,
> from dfs.data.`logs/events.tbl`;


And this is what I get

id name json_column description
1 xyz {"key1":"value1", "key2":"value2"} abc

Thanks in advance,
Ankit Jain




Accessing json fields within CSV file

2017-06-08 Thread ankit jain
Hi,
I have a few psv file with a few of the columns being a json key value map.
Example:

> 1|xyz|{"key1":"value1", "key2":"value2"}|abc|


I am converting these files to parquet format but want to convert the json
key and values to different columns. How is that possible?

end product being:
id name key1 key2 description
1 xyz value1 value2 abc

Right now am doing something like this but the json column wont explode:

CREATE TABLE dfs.data.`/logs/logsp/`  AS SELECT
> CAST(columns[0] AS INT)  `id`,
> columns[1] AS `name`,
> columns[2] AS `json_column`,
> columns[3] AS `description`,
> from dfs.data.`logs/events.tbl`;


And this is what I get

id name json_column description
1 xyz {"key1":"value1", "key2":"value2"} abc

Thanks in advance,
Ankit Jain