Hey all, to facilitate loading some data from JSON to Parquet, I am
loading it into "day"-based directories...

parqtable
|
|_______2015-11-01
|
|_______2015-11-02
|
|_______2015-11-03
|
|_______2015-11-04

That way I can do select * from `parqtable` where dir0 = '2015-11-01' and
other cool tricks. It also helps my data loading.

I am using the exact same query to load each day.

CREATE TABLE `parqtable/2015-11-01` as
(select field1, field2, field3, field4 from jsontable where dir0 =
'2015-11-01')

CREATE TABLE `parqtable/2015-11-02` as
(select field1, field2, field3, field4 from jsontable where dir0 =
'2015-11-02')

CREATE TABLE `parqtable/2015-11-03` as
(select field1, field2, field3, field4 from jsontable where dir0 =
'2015-11-03')

Etc
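
Since every day uses the same statement with only the date changed, the
statements could be generated rather than typed by hand. A minimal Python
sketch, assuming the table and field names shown above; the helper name
ctas_for_days is hypothetical, and it only builds the SQL text (it does not
connect to Drill or run anything):

```python
from datetime import date, timedelta

# Column list taken from the CTAS statements above.
FIELDS = ["field1", "field2", "field3", "field4"]


def ctas_for_days(start, end):
    """Build one CTAS statement per day from start to end, inclusive.

    Hypothetical helper: returns SQL strings only, nothing is executed.
    """
    statements = []
    day = start
    while day <= end:
        d = day.isoformat()  # e.g. 2015-11-01, matching the directory names
        statements.append(
            f"CREATE TABLE `parqtable/{d}` AS "
            f"(SELECT {', '.join(FIELDS)} FROM jsontable WHERE dir0 = '{d}')"
        )
        day += timedelta(days=1)
    return statements


# Print the statements for the first three days.
for stmt in ctas_for_days(date(2015, 11, 1), date(2015, 11, 3)):
    print(stmt)
```

Generating the text from one template at least rules out a hand-typed
difference between the per-day queries.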

This seems to work well except for one thing:

If I want to see the count per directory, this (what I thought was obvious)
query:

select dir0, count(*) from `parqtable` group by dir0

fails with

Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

Fragment: 2:8


I am not sure why this would be the case. The data is loaded by the same
query, so I would assume the schema is the same in every directory.

Thoughts on how to troubleshoot?

Thanks!

John
