Hello Qiang, DRILL-4686 is reported to track this problem.
Thanks, Khurram On Wed, May 18, 2016 at 3:16 PM, qiang li <tiredqi...@gmail.com> wrote: > Ok, Thanks very much. > > 2016-05-18 17:44 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>: > >> Hello Qiang, >> >> Someone from our Drill team (in San Jose) will get back to you soon. I >> work from the India lab and I am in a different time zone as compared to >> San Jose office, some one from MapR San Jose will get back to you as soon >> as possible. >> >> Thanks, >> Khurram >> >> On Wed, May 18, 2016 at 3:09 PM, qiang li <tiredqi...@gmail.com> wrote: >> >>> Hi Khurram, Thanks very much to reproduce it, so what's the conclusion? >>> >>> Any idea how to sovle it? >>> >>> 2016-05-18 17:02 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>: >>> >>>> So I tried to create the table using HBase API (with no data inserted >>>> into table) and I got the query plan for drill 1.7.0 >>>> Drill 1.7.0-SNAPSHOT commit ID : 09b26277 >>>> >>>> 0: jdbc:drill:schema=dfs.tmp> describe browser_action2; >>>> +--------------+------------+--------------+ >>>> | COLUMN_NAME | DATA_TYPE | IS_NULLABLE | >>>> +--------------+------------+--------------+ >>>> | row_key | ANY | NO | >>>> | v | MAP | NO | >>>> +--------------+------------+--------------+ >>>> 2 rows selected (1.665 seconds) >>>> >>>> Table creation Java program >>>> >>>> {noformat} >>>> public class PutIntDataToHBase { >>>> public static void main(String args[]) throws IOException { >>>> Configuration conf = HBaseConfiguration.create(); >>>> conf.set("hbase.zookeeper.property.clientPort","5181"); >>>> HBaseAdmin admin = new HBaseAdmin(conf); >>>> if (admin.tableExists("browser_action2")) { >>>> admin.disableTable("browser_action2"); >>>> admin.deleteTable("browser_action2"); >>>> } >>>> >>>> byte[][] SPLIT_KEYS = >>>> {{'0'},{'1'},{'2'},{'3'},{'4'},{'5'},{'6'},{'7'},{'8'},{'9'}}; >>>> HTableDescriptor tableDesc = new >>>> HTableDescriptor(TableName.valueOf("browser_action2")); >>>> >>>> tableDesc.addFamily(new HColumnDescriptor("v")); >>>> admin.createTable(tableDesc,SPLIT_KEYS); >>>> >>>> } >>>> } >>>> {noformat} >>>> >>>> Query plan for the query that was reported as returning wrong results. >>>> >>>> {noformat} >>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select >>>> CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p from >>>> hbase.browser_action2 a where a.row_key > '0' group by a.`v`.`e0`; >>>> +------+------+ >>>> | text | json | >>>> +------+------+ >>>> | 00-00 Screen >>>> 00-01 Project(k=[$0], p=[$1]) >>>> 00-02 UnionExchange >>>> 01-01 Project(k=[CONVERT_FROMUTF8($0)], p=[$1]) >>>> 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >>>> 01-03 Project($f0=[$0], p=[$1]) >>>> 01-04 HashToRandomExchange(dist0=[[$0]]) >>>> 02-01 UnorderedMuxExchange >>>> 03-01 Project($f0=[$0], p=[$1], >>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >>>> 03-02 HashAgg(group=[{0}], p=[COUNT($0)]) >>>> 03-03 Project($f0=[ITEM($1, 'e0')]) >>>> 03-04 Scan(groupscan=[HBaseGroupScan >>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00, >>>> stopRow=, filter=null], columns=[`*`]]]) >>>> {noformat} >>>> >>>> and the query plan for the other problem query mentioned in the first >>>> email. >>>> >>>> {noformat} >>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select >>>> CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8') as k, >>>> . . . . . . . . . . . . . . > count(a.row_key) p from >>>> hbase.browser_action2 a group by >>>> . . . . . . . . . . . . . . > BYTE_SUBSTR(a.row_key, 1 , 9); >>>> +------+------+ >>>> | text | json | >>>> +------+------+ >>>> | 00-00 Screen >>>> 00-01 Project(k=[$0], p=[$1]) >>>> 00-02 UnionExchange >>>> 01-01 Project(k=[CONVERT_FROMUTF8($0)], p=[$1]) >>>> 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >>>> 01-03 Project($f0=[$0], p=[$1]) >>>> 01-04 HashToRandomExchange(dist0=[[$0]]) >>>> 02-01 UnorderedMuxExchange >>>> 03-01 Project($f0=[$0], p=[$1], >>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >>>> 03-02 HashAgg(group=[{0}], p=[COUNT($1)]) >>>> 03-03 Project($f0=[BYTE_SUBSTR($0, 1, 9)], >>>> row_key=[$0]) >>>> 03-04 Scan(groupscan=[HBaseGroupScan >>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=null, >>>> stopRow=null, filter=null], columns=[`*`]]]) >>>> {noformat} >>>> >>>> Thanks, >>>> Khurram >>>> >>>> On Wed, May 18, 2016 at 7:01 AM, qiang li <tiredqi...@gmail.com> wrote: >>>> >>>>> Yes. >>>>> I use hbase API to create it. >>>>> >>>>> The main code is: >>>>> >>>>> byte[][] SPLIT_KEYS = { {'0'}, {'1'}, {'2'}, {'3'}, {'4'}, {'5'}, {'6'}, >>>>> {'7'},{'8'}, {'9'} }; >>>>> TableName tableName = TableName.valueOf("browser_action2"); >>>>> >>>>> HTableDescriptor tableDesc = new HTableDescriptor(tableName); >>>>> HColumnDescriptor columnDesc = new HColumnDescriptor("v"); >>>>> tableDesc.addFamily(columnDesc); >>>>> >>>>> columnDesc.setCompressionType(Compression.Algorithm.SNAPPY); >>>>> columnDesc.setDataBlockEncoding(DataBlockEncoding.DIFF); >>>>> >>>>> admin.createTable(tableDesc, SPLIT_KEYS); >>>>> >>>>> >>>>> >>>>> >>>>> 2016-05-18 1:48 GMT+08:00 Zelaine Fong <zf...@maprtech.com>: >>>>> >>>>>> Can you provide the CREATE TABLE statement you used to reproduce this >>>>>> problem so we can try to reproduce it on our end. >>>>>> >>>>>> Thanks. >>>>>> >>>>>> -- Zelaine >>>>>> >>>>>> On Tue, May 17, 2016 at 4:50 AM, qiang li <tiredqi...@gmail.com> >>>>>> wrote: >>>>>> >>>>>> > Hi , >>>>>> > >>>>>> > I recently meet a issue that can not query the correct data from >>>>>> hbase with >>>>>> > sql by drill, can anybody help me. >>>>>> > >>>>>> > I test with the drill 1.6. >>>>>> > My hbase scheme: >>>>>> > rowkey: salt+day+event+uid + ts , eg: 120160411visituidts >>>>>> > cf : v >>>>>> > qualifier: v, e0, e1 >>>>>> > >>>>>> > The wrong result only happened when I use group by clause. >>>>>> > >>>>>> > This sql will not return correct result: >>>>>> > select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p >>>>>> from >>>>>> > hbase.browser_action2 a where a.row_key > '0' group by a.`v`.`e0`; >>>>>> > Part of explain of this sql is: >>>>>> > >>>>>> > 0: jdbc:drill:zk=rfdc5> explain plan for select >>>>>> CONVERT_FROM(a.`v`.`e0`, >>>>>> > 'UTF8') as k, count(a.`v`.`e0`) p from hbase.browser_action2 a where >>>>>> > a.row_key > '0' group by a.`v`.`e0`; >>>>>> > +------+------+ >>>>>> > | text | json | >>>>>> > +------+------+ >>>>>> > | 00-00 Screen >>>>>> > 00-01 Project(k=[$0], p=[$1]) >>>>>> > 00-02 UnionExchange >>>>>> > 01-01 Project(k=[CONVERT_FROMUTF8($0)], p=[$1]) >>>>>> > 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >>>>>> > 01-03 Project($f0=[$0], p=[$1]) >>>>>> > 01-04 HashToRandomExchange(dist0=[[$0]]) >>>>>> > 02-01 UnorderedMuxExchange >>>>>> > 03-01 Project($f0=[$0], p=[$1], >>>>>> > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >>>>>> > 03-02 HashAgg(group=[{0}], p=[COUNT($0)]) >>>>>> > 03-03 Project($f0=[ITEM($1, 'e0')]) >>>>>> > 03-04 Scan(groupscan=[HBaseGroupScan >>>>>> > [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, >>>>>> startRow=0\x00, >>>>>> > stopRow=, filter=null], columns=[`*`]]]) >>>>>> > >>>>>> > The data return very quickly , the result of this sql is : >>>>>> > +------+--------+ >>>>>> > | k | p | >>>>>> > +------+--------+ >>>>>> > | pay | 12180 | >>>>>> > +------+-------- >>>>>> > >>>>>> > But I have millons of data in the table. >>>>>> > >>>>>> > I tried to change the physical plan. if I change the json explain >>>>>> > *"columns" >>>>>> > : [ "`*`" ]* to *"columns" : [ "`v`.`e0`" ] *, it will return the >>>>>> correct >>>>>> > result. >>>>>> > >>>>>> > It seems the physical plan is not correct. >>>>>> > I also try to debug the sql parser to find out the reason, but its >>>>>> too >>>>>> > complicate. Can anyone help me. >>>>>> > >>>>>> > Also this sql have the same issue. >>>>>> > select CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8') as k, >>>>>> > count(a.row_key) p from hbase.browser_action2 a group by >>>>>> > BYTE_SUBSTR(a.row_key, 1 , 9); >>>>>> > I change the json explain *"columns" : [ "`*`" ]* to *"columns" : [ >>>>>> > "`row_key`" ] *, it will return the correct result. >>>>>> > >>>>>> >>>>> >>>>> >>>> >>> >> >