Hello Qiang,

DRILL-4686 is reported to track this problem.

Thanks,
Khurram

On Wed, May 18, 2016 at 3:16 PM, qiang li <tiredqi...@gmail.com> wrote:

> Ok, Thanks very much.
>
> 2016-05-18 17:44 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>:
>
>> Hello Qiang,
>>
>> Someone from our Drill team (in San Jose) will get back to you soon. I
>> work from the India lab and I am in a different time zone as compared to
>> San Jose office, some one from MapR San Jose will get back to you as soon
>> as possible.
>>
>> Thanks,
>> Khurram
>>
>> On Wed, May 18, 2016 at 3:09 PM, qiang li <tiredqi...@gmail.com> wrote:
>>
>>> Hi Khurram, Thanks very much to reproduce it, so what's the conclusion?
>>>
>>> Any idea how to sovle it?
>>>
>>> 2016-05-18 17:02 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>:
>>>
>>>> So I tried to create the table using HBase API (with no data inserted
>>>> into table) and I got the query plan for drill 1.7.0
>>>> Drill 1.7.0-SNAPSHOT  commit ID :  09b26277
>>>>
>>>> 0: jdbc:drill:schema=dfs.tmp> describe browser_action2;
>>>> +--------------+------------+--------------+
>>>> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
>>>> +--------------+------------+--------------+
>>>> | row_key      | ANY        | NO           |
>>>> | v            | MAP        | NO           |
>>>> +--------------+------------+--------------+
>>>> 2 rows selected (1.665 seconds)
>>>>
>>>> Table creation Java program
>>>>
>>>> {noformat}
>>>> public class PutIntDataToHBase {
>>>>     public static void main(String args[]) throws IOException {
>>>>         Configuration conf = HBaseConfiguration.create();
>>>>         conf.set("hbase.zookeeper.property.clientPort","5181");
>>>>         HBaseAdmin admin = new HBaseAdmin(conf);
>>>>         if (admin.tableExists("browser_action2")) {
>>>>             admin.disableTable("browser_action2");
>>>>             admin.deleteTable("browser_action2");
>>>>         }
>>>>
>>>>         byte[][] SPLIT_KEYS =
>>>> {{'0'},{'1'},{'2'},{'3'},{'4'},{'5'},{'6'},{'7'},{'8'},{'9'}};
>>>>         HTableDescriptor tableDesc = new
>>>>             HTableDescriptor(TableName.valueOf("browser_action2"));
>>>>
>>>>         tableDesc.addFamily(new HColumnDescriptor("v"));
>>>>         admin.createTable(tableDesc,SPLIT_KEYS);
>>>>
>>>>     }
>>>> }
>>>> {noformat}
>>>>
>>>> Query plan for the query that was reported as returning wrong results.
>>>>
>>>> {noformat}
>>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select
>>>> CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p from
>>>> hbase.browser_action2 a where a.row_key > '0'  group by a.`v`.`e0`;
>>>> +------+------+
>>>> | text | json |
>>>> +------+------+
>>>> | 00-00    Screen
>>>> 00-01      Project(k=[$0], p=[$1])
>>>> 00-02        UnionExchange
>>>> 01-01          Project(k=[CONVERT_FROMUTF8($0)], p=[$1])
>>>> 01-02            HashAgg(group=[{0}], p=[$SUM0($1)])
>>>> 01-03              Project($f0=[$0], p=[$1])
>>>> 01-04                HashToRandomExchange(dist0=[[$0]])
>>>> 02-01                  UnorderedMuxExchange
>>>> 03-01                    Project($f0=[$0], p=[$1],
>>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)])
>>>> 03-02                      HashAgg(group=[{0}], p=[COUNT($0)])
>>>> 03-03                        Project($f0=[ITEM($1, 'e0')])
>>>> 03-04                          Scan(groupscan=[HBaseGroupScan
>>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00,
>>>> stopRow=, filter=null], columns=[`*`]]])
>>>> {noformat}
>>>>
>>>> and the query plan for the other problem query mentioned in the first
>>>> email.
>>>>
>>>> {noformat}
>>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select
>>>> CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8')  as k,
>>>> . . . . . . . . . . . . . . > count(a.row_key) p from
>>>> hbase.browser_action2 a group by
>>>> . . . . . . . . . . . . . . > BYTE_SUBSTR(a.row_key, 1 , 9);
>>>> +------+------+
>>>> | text | json |
>>>> +------+------+
>>>> | 00-00    Screen
>>>> 00-01      Project(k=[$0], p=[$1])
>>>> 00-02        UnionExchange
>>>> 01-01          Project(k=[CONVERT_FROMUTF8($0)], p=[$1])
>>>> 01-02            HashAgg(group=[{0}], p=[$SUM0($1)])
>>>> 01-03              Project($f0=[$0], p=[$1])
>>>> 01-04                HashToRandomExchange(dist0=[[$0]])
>>>> 02-01                  UnorderedMuxExchange
>>>> 03-01                    Project($f0=[$0], p=[$1],
>>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)])
>>>> 03-02                      HashAgg(group=[{0}], p=[COUNT($1)])
>>>> 03-03                        Project($f0=[BYTE_SUBSTR($0, 1, 9)],
>>>> row_key=[$0])
>>>> 03-04                          Scan(groupscan=[HBaseGroupScan
>>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=null,
>>>> stopRow=null, filter=null], columns=[`*`]]])
>>>> {noformat}
>>>>
>>>> Thanks,
>>>> Khurram
>>>>
>>>> On Wed, May 18, 2016 at 7:01 AM, qiang li <tiredqi...@gmail.com> wrote:
>>>>
>>>>> Yes.
>>>>> I use hbase API to create it.
>>>>>
>>>>> The main code is:
>>>>>
>>>>> byte[][] SPLIT_KEYS = { {'0'}, {'1'}, {'2'}, {'3'}, {'4'}, {'5'}, {'6'}, 
>>>>> {'7'},{'8'}, {'9'} };
>>>>> TableName tableName = TableName.valueOf("browser_action2");
>>>>>
>>>>> HTableDescriptor tableDesc = new HTableDescriptor(tableName);
>>>>> HColumnDescriptor columnDesc = new HColumnDescriptor("v");
>>>>> tableDesc.addFamily(columnDesc);
>>>>>
>>>>> columnDesc.setCompressionType(Compression.Algorithm.SNAPPY);
>>>>> columnDesc.setDataBlockEncoding(DataBlockEncoding.DIFF);
>>>>>
>>>>> admin.createTable(tableDesc, SPLIT_KEYS);
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2016-05-18 1:48 GMT+08:00 Zelaine Fong <zf...@maprtech.com>:
>>>>>
>>>>>> Can you provide the CREATE TABLE statement you used to reproduce this
>>>>>> problem so we can try to reproduce it on our end.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -- Zelaine
>>>>>>
>>>>>> On Tue, May 17, 2016 at 4:50 AM, qiang li <tiredqi...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > Hi ,
>>>>>> >
>>>>>> > I recently meet a issue that can not query the correct data from
>>>>>> hbase with
>>>>>> > sql by drill, can anybody help me.
>>>>>> >
>>>>>> > I test with the drill 1.6.
>>>>>> > My hbase scheme:
>>>>>> > rowkey: salt+day+event+uid + ts , eg: 120160411visituidts
>>>>>> > cf : v
>>>>>> > qualifier: v, e0, e1
>>>>>> >
>>>>>> > The wrong result only happened when I use group by clause.
>>>>>> >
>>>>>> > This sql will not return correct result:
>>>>>> > select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
>>>>>> from
>>>>>> > hbase.browser_action2 a where a.row_key > '0'  group by a.`v`.`e0`;
>>>>>> > Part of explain of this sql is:
>>>>>> >
>>>>>> > 0: jdbc:drill:zk=rfdc5> explain plan for select
>>>>>> CONVERT_FROM(a.`v`.`e0`,
>>>>>> > 'UTF8') as k, count(a.`v`.`e0`) p from hbase.browser_action2 a where
>>>>>> > a.row_key > '0'  group by a.`v`.`e0`;
>>>>>> > +------+------+
>>>>>> > | text | json |
>>>>>> > +------+------+
>>>>>> > | 00-00    Screen
>>>>>> > 00-01      Project(k=[$0], p=[$1])
>>>>>> > 00-02        UnionExchange
>>>>>> > 01-01          Project(k=[CONVERT_FROMUTF8($0)], p=[$1])
>>>>>> > 01-02            HashAgg(group=[{0}], p=[$SUM0($1)])
>>>>>> > 01-03              Project($f0=[$0], p=[$1])
>>>>>> > 01-04                HashToRandomExchange(dist0=[[$0]])
>>>>>> > 02-01                  UnorderedMuxExchange
>>>>>> > 03-01                    Project($f0=[$0], p=[$1],
>>>>>> > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)])
>>>>>> > 03-02                      HashAgg(group=[{0}], p=[COUNT($0)])
>>>>>> > 03-03                        Project($f0=[ITEM($1, 'e0')])
>>>>>> > 03-04                          Scan(groupscan=[HBaseGroupScan
>>>>>> > [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2,
>>>>>> startRow=0\x00,
>>>>>> > stopRow=, filter=null], columns=[`*`]]])
>>>>>> >
>>>>>> > The data return very quickly , the result of this sql is :
>>>>>> > +------+--------+
>>>>>> > |  k   |   p    |
>>>>>> > +------+--------+
>>>>>> > | pay  | 12180  |
>>>>>> > +------+--------
>>>>>> >
>>>>>> > But I have millons of data in the table.
>>>>>> >
>>>>>> > I tried to change the physical plan.  if I change the json explain
>>>>>> > *"columns"
>>>>>> > : [ "`*`" ]*  to *"columns" : [ "`v`.`e0`" ] *, it will return the
>>>>>> correct
>>>>>> > result.
>>>>>> >
>>>>>> > It seems the physical plan is not correct.
>>>>>> > I also try to debug the sql parser to find out the reason, but its
>>>>>> too
>>>>>> > complicate. Can anyone help me.
>>>>>> >
>>>>>> > Also this sql have the same issue.
>>>>>> > select CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8')  as k,
>>>>>> > count(a.row_key) p from hbase.browser_action2 a group by
>>>>>> > BYTE_SUBSTR(a.row_key, 1 , 9);
>>>>>> > I change the json explain *"columns" : [ "`*`" ]*  to *"columns" : [
>>>>>> > "`row_key`" ] *, it will return the correct result.
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to