Have you tried to create a view on top of parqtable, and then use the view?
I did a quick experiment that may help you. Created a dir structure as follows (using csv files - what I had on hand) orders |____month1 |____month2 . . |____month9 create or replace view dfs.views.ordersdir as select dir0, * from dfs.data.`/orders`; 0: jdbc:drill:> select * from dfs.views.ordersdir limit 2; +---------+-------------------------------------------------------------------+---------+ | dir0 | columns | dir00 | +---------+-------------------------------------------------------------------+---------+ | month1 | ["8000","January","2014-01-02 23:06:07","15349","ri","203","26"] | month1 | | month1 | ["8001","January","2014-01-31 03:40:21","16033","oh","113","23"] | month1 | +---------+-------------------------------------------------------------------+————+ (interesting to note dir00 in the output) But the group by on the dir works as it is a column defined in the view 0: jdbc:drill:> select dir0, count(*) from dfs.views.ordersdir group by dir0; +---------+---------+ | dir0 | EXPR$1 | +---------+---------+ | month1 | 9000 | | month6 | 16000 | | month9 | 15131 | | month2 | 12000 | | month3 | 16000 | | month7 | 15573 | | month5 | 16000 | | month4 | 12000 | | month8 | 15390 | +---------+————+ See if that works in your case. —Andries > On Nov 5, 2015, at 6:48 AM, John Omernik <[email protected]> wrote: > > Hey all, to facilitate loading of some data from JSON to Parquet, I am > using the the load into "day" based directories... > > parqtable > | > |_______2015-11-01 > | > |_______2015-11-02 > | > |_______2015-11-03 > | > |_______2015-11-04 > > That way I can do select * from `parqtable` where dir0 = '2015-11-01' and > other cool tricks. It also helps my data loading. > > I am using the exact same query to load each day. > > CREATE TABLE `parqtable/2015-11-01' as > (select field1, field2, field3, field4 from jsontable where dir0 = > '2015-11-01') > > CREATE TABLE `parqtable/2015-11-02' as > (select field1, field2, field3, field4 from jsontable where dir0 = > '2015-11-02') > > CREATE TABLE `parqtable/2015-11-03' as > (select field1, field2, field3, field4 from jsontable where dir0 = > '2015-11-03') > > Etc > > This seams to work well except for one thing: > > If I want to see the count per directory, this (what I thought was obvious) > query: > > select dir0, count(*) from `parqtable` group by dir0 > > fails with > > Error: UNSUPPORTED_OPERATION_ ERROR: Hash aggregate does not support schema > changes > > Fragment: 2:8 > > > I am not sure why this would be the case, the data is loaded by the same > query, I would assume the schema is the same.... > > Thoughts on how to troubleshoot? > > Thanks! > > John
