I upgraded to the latest CDH version, HBase 1.2.0-cdh5.7.0, and tested again. The results are correct now. Thanks for the help.
Even so, I think the query plan can still be optimized. Here is what I think could be improved: a. specify only the needed columns; b. remove the RowFilter when the scan's startRow and stopRow already satisfy the predicate. For example: select row_key from hbase.browser_action a where a.row_key > '0' and a.row_key < '1'. I think SQL like this would run faster without the RowFilter.

2016-05-24 9:29 GMT+08:00 qiang li <tiredqi...@gmail.com>:

> Yes, it seems like the same issue. I will upgrade it and test again. But
> do you think we can update the physical plan too? If we only want to query
> one qualifier, then the columns of the plan should only contain that
> qualifier instead of "*". Maybe that plan would query faster. Am I right?
>
> 2016-05-23 23:38 GMT+08:00 Krystal Nguyen <kngu...@maprtech.com>:
>
>> Hi Qiang,
>>
>> Looks like you might be encountering this issue:
>> https://issues.apache.org/jira/browse/DRILL-4271
>>
>> Thanks
>>
>> On Sun, May 22, 2016 at 8:38 PM, qiang li <tiredqi...@gmail.com> wrote:
>>
>> > I tested it step by step again, and finally found out that the issue
>> > happens only if the number of qualifiers is more than 3.
>> >
>> > It's weird, but this is the result of my tests.
>> >
>> > I tested about 10 thousand rows of data.
The length of the event is 6; the
>> > code I used to test is like below:
>> >
>> > String[] earr = action.getEvent().geteArr();
>> > for (int i = 0; i < 6; i++) {
>> >     put.addColumn(family, Bytes.toBytes("e" + i), Bytes.toBytes(earr[i]));
>> > }
>> >
>> > Then I tested step by step like below:
>> >
>> > put.addColumn(family, Bytes.toBytes("e0"), Bytes.toBytes("e0"));
>> > put.addColumn(family, Bytes.toBytes("e1"), Bytes.toBytes("e1"));
>> > put.addColumn(family, Bytes.toBytes("e2"), Bytes.toBytes("e2"));
>> > put.addColumn(family, Bytes.toBytes("e3"), Bytes.toBytes("e3"));
>> >
>> > 0: jdbc:drill:zk=rfdc5> select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k,
>> > count(a.`v`.`e0`) p from hbase.browser_action a where a.row_key > '0'
>> > group by a.`v`.`e0`;
>> > +-----+-------+
>> > |  k  |   p   |
>> > +-----+-------+
>> > | e0  | 3856  |
>> > +-----+-------+
>> >
>> > 0: jdbc:drill:zk=rfdc5> select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k,
>> > count(a.`v`.`e0`) p from hbase.browser_action a group by a.`v`.`e0`;
>> > +-----+-------+
>> > |  k  |   p   |
>> > +-----+-------+
>> > | e0  | 9094  |
>> > +-----+-------+
>> >
>> > If I put the qualifier "e3" into the table (that is, more than 3
>> > qualifiers), then the issue is reproduced.
>> >
>> > 2016-05-23 9:23 GMT+08:00 qiang li <tiredqi...@gmail.com>:
>> >
>> > > Sorry for the late reply.
>> > >
>> > > Yes, Hadoop 2.6.0-cdh5.4.5 and HBase 1.0.0-cdh5.4.5.
>> > >
>> > > 2016-05-20 23:06 GMT+08:00 Krystal Nguyen <kngu...@maprtech.com>:
>> > >
>> > >> Qiang, can you please let us know the HBase version and Hadoop
>> > >> distribution version that you are using.
>> > >>
>> > >> On Fri, May 20, 2016 at 8:03 AM, Krystal Nguyen <kngu...@maprtech.com>
>> > >> wrote:
>> > >>
>> > >> > Can you please let us know the hbase version and hadoop distribution
>> > >> > version that you are using.
>> > >> > >> > >> > >> > >> > On Fri, May 20, 2016 at 1:35 AM, qiang li <tiredqi...@gmail.com> >> > wrote: >> > >> > >> > >> >> Khurram , I send the mail again, the last mail forget to cc to >> > >> >> user@drill.apache.org >> > >> >> >> > >> >> The main process is the same, but my rowkey is more complicate, >> > >> >> Here is the detail I tested. >> > >> >> rowkey is like this : [salt 1byte string] + [day 8byte string] + >> > >> [event] + >> > >> >> [uid long] + [ts long] >> > >> >> also I have other qualifiers, only qualifier v:v is integer, the >> > others >> > >> >> are >> > >> >> string. >> > >> >> >> > >> >> example: >> > >> >> hbase(main):004:0> scan 'browser_action2', { LIMIT => 1} >> > >> >> ROW >> > >> COLUMN+CELL >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e0, >> > >> >> timestamp=1461839343076, value=pay >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e1, >> > >> >> timestamp=1461839343076, value=bijia >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e2, >> > >> >> timestamp=1461839343076, value=browser >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e3, >> > >> >> timestamp=1461839343076, value=* >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e4, >> > >> >> timestamp=1461839343076, value=* >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:e5, >> > >> >> timestamp=1461839343076, value=* >> > >> >> >> > >> 
>> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:h, >> > >> >> timestamp=1459771200000, value=20 >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:m, >> > >> >> timestamp=1459771200000, value=0 >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 020160404pay.bijia.browser\x00\x00qK>I\xD0w\x00\x00\x01S\xE >> > >> column=v:v, >> > >> >> timestamp=1459771200000, value=\x00\x00\x00\x17 >> > >> >> >> > >> >> >> > >> >> 1$\xD2\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 1 row(s) in 0.0410 seconds >> > >> >> >> > >> >> >> > >> >> Here is the example I how the issue look like: >> > >> >> >> > >> >> hbase(main):69904:0> scan 'browser_action2', {COLUMNS => ['v:e0'], >> > >> >> STARTROW=> '0'} >> > >> >> ........ >> > >> >> 920160410visit.bijia.test\xFFr\xC0o\x0B\x14\x0A\x16\x00\x00\ >> > >> column=v:e0, >> > >> >> timestamp=1463723029448, value=visit >> > >> >> >> > >> >> >> > >> >> x01T\x00\x0A\xFA\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 920160410visit.bijia.test\xFF\x96-\xE4\x0B\x9D\xAB]\x00\x00\ >> > >> column=v:e0, >> > >> >> timestamp=1463723029217, value=visit >> > >> >> >> > >> >> >> > >> >> x01T\x00\x0A\xFA\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 920160410visit.bijia.test\xFF\xE3\x80\xFAac\xA6\xCF\x00\x00\ >> > >> column=v:e0, >> > >> >> timestamp=1463723029295, value=visit >> > >> >> >> > >> >> >> > >> >> x01T\x00\x0A\xFA\x00 >> > >> >> >> > >> >> >> > >> >> >> > >> >> 9994 row(s) in 123.8650 seconds >> > >> >> >> > >> >> the drill result: >> > >> >> 0: jdbc:drill:zk=rfdc5> select CONVERT_FROM(a.`v`.`e0`, 'UTF8') >> as k, >> > >> >> count(a.`v`.`e0`) p from hbase.browser_action2 a where a.row_key > >> > '0' >> > >> >> group by a.`v`.`e0`; >> > >> >> +--------+-------+ >> > >> >> | k | p | >> > >> >> 
+--------+-------+ >> > >> >> | visit | 1216 | >> > >> >> +--------+-------+ >> > >> >> >> > >> >> >> > >> >> I find out that if the row size larger than 10000 will have the >> > issue. >> > >> >> The >> > >> >> result is right if less that 1000 rows. But not always that way. >> > >> >> What I can make sure is if I updated the columns in the physical >> plan >> > >> and >> > >> >> query by web UI , the result will be correct. >> > >> >> >> > >> >> >> > >> >> Thanks >> > >> >> >> > >> >> 2016-05-20 13:58 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>: >> > >> >> >> > >> >> > Qiang, can you please take a look at DRILL-4686 and confirm if >> the >> > >> data >> > >> >> > set used in my repro is the same as the one you have used. If >> the >> > >> data >> > >> >> set >> > >> >> > is different please let us know the type of data that you have >> used >> > >> in >> > >> >> your >> > >> >> > table. >> > >> >> > >> > >> >> > Aman - I will try to repro the problem on Drill 1.6.0 and share >> > >> results. >> > >> >> > >> > >> >> > Thanks, >> > >> >> > Khurram >> > >> >> > >> > >> >> > On Thu, May 19, 2016 at 11:23 PM, Aman Sinha < >> amansi...@apache.org >> > > >> > >> >> wrote: >> > >> >> > >> > >> >> >> Khurram, DRILL-4686 seems like a different issue...it is >> > reporting >> > >> an >> > >> >> >> error whereas the original problem from qiang was an incorrect >> > >> result. >> > >> >> >> Can >> > >> >> >> you use the same version (1.6) that he was using. Also, is the >> > data >> > >> >> set >> > >> >> >> similar ? If you are unable to repro the exact same issue, >> > perhaps >> > >> >> qiang >> > >> >> >> should file a JIRA with a smaller repro if possible. >> > >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> >> On Thu, May 19, 2016 at 8:35 AM, Khurram Faraaz < >> > >> kfar...@maprtech.com> >> > >> >> >> wrote: >> > >> >> >> >> > >> >> >> > Hello Qiang, >> > >> >> >> > >> > >> >> >> > DRILL-4686 is reported to track this problem. 
>> > >> >> >> >
>> > >> >> >> > Thanks,
>> > >> >> >> > Khurram
>> > >> >> >> >
>> > >> >> >> > On Wed, May 18, 2016 at 3:16 PM, qiang li <tiredqi...@gmail.com> wrote:
>> > >> >> >> >
>> > >> >> >> >> Ok, thanks very much.
>> > >> >> >> >>
>> > >> >> >> >> 2016-05-18 17:44 GMT+08:00 Khurram Faraaz <kfar...@maprtech.com>:
>> > >> >> >> >>
>> > >> >> >> >>> Hello Qiang,
>> > >> >> >> >>>
>> > >> >> >> >>> Someone from our Drill team (in San Jose) will get back to you soon.
>> > >> >> >> >>> I work from the India lab and am in a different time zone than the
>> > >> >> >> >>> San Jose office; someone from MapR San Jose will get back to you as
>> > >> >> >> >>> soon as possible.
>> > >> >> >> >>>
>> > >> >> >> >>> Thanks,
>> > >> >> >> >>> Khurram
>> > >> >> >> >>>
>> > >> >> >> >>> On Wed, May 18, 2016 at 3:09 PM, qiang li <tiredqi...@gmail.com> wrote:
>> > >> >> >> >>>
>> > >> >> >> >>>> Hi Khurram, thanks very much for reproducing it. So what's the
>> > >> >> >> >>>> conclusion?
>> > >> >> >> >>>>
>> > >> >> >> >>>> Any idea how to solve it?
>> > >> >> >> >>>> >> > >> >> >> >>>> 2016-05-18 17:02 GMT+08:00 Khurram Faraaz < >> > >> kfar...@maprtech.com>: >> > >> >> >> >>>> >> > >> >> >> >>>>> So I tried to create the table using HBase API (with no >> data >> > >> >> >> inserted >> > >> >> >> >>>>> into table) and I got the query plan for drill 1.7.0 >> > >> >> >> >>>>> Drill 1.7.0-SNAPSHOT commit ID : 09b26277 >> > >> >> >> >>>>> >> > >> >> >> >>>>> 0: jdbc:drill:schema=dfs.tmp> describe browser_action2; >> > >> >> >> >>>>> +--------------+------------+--------------+ >> > >> >> >> >>>>> | COLUMN_NAME | DATA_TYPE | IS_NULLABLE | >> > >> >> >> >>>>> +--------------+------------+--------------+ >> > >> >> >> >>>>> | row_key | ANY | NO | >> > >> >> >> >>>>> | v | MAP | NO | >> > >> >> >> >>>>> +--------------+------------+--------------+ >> > >> >> >> >>>>> 2 rows selected (1.665 seconds) >> > >> >> >> >>>>> >> > >> >> >> >>>>> Table creation Java program >> > >> >> >> >>>>> >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> public class PutIntDataToHBase { >> > >> >> >> >>>>> public static void main(String args[]) throws >> > IOException >> > >> { >> > >> >> >> >>>>> Configuration conf = HBaseConfiguration.create(); >> > >> >> >> >>>>> >> > >> conf.set("hbase.zookeeper.property.clientPort","5181"); >> > >> >> >> >>>>> HBaseAdmin admin = new HBaseAdmin(conf); >> > >> >> >> >>>>> if (admin.tableExists("browser_action2")) { >> > >> >> >> >>>>> admin.disableTable("browser_action2"); >> > >> >> >> >>>>> admin.deleteTable("browser_action2"); >> > >> >> >> >>>>> } >> > >> >> >> >>>>> >> > >> >> >> >>>>> byte[][] SPLIT_KEYS = >> > >> >> >> >>>>> >> > {{'0'},{'1'},{'2'},{'3'},{'4'},{'5'},{'6'},{'7'},{'8'},{'9'}}; >> > >> >> >> >>>>> HTableDescriptor tableDesc = new >> > >> >> >> >>>>> >> > >> >> HTableDescriptor(TableName.valueOf("browser_action2")); >> > >> >> >> >>>>> >> > >> >> >> >>>>> tableDesc.addFamily(new HColumnDescriptor("v")); >> > >> >> >> >>>>> admin.createTable(tableDesc,SPLIT_KEYS); >> > >> >> >> 
>>>>> >> > >> >> >> >>>>> } >> > >> >> >> >>>>> } >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> >> > >> >> >> >>>>> Query plan for the query that was reported as returning >> > wrong >> > >> >> >> results. >> > >> >> >> >>>>> >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select >> > >> >> >> >>>>> CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) >> p >> > >> from >> > >> >> >> >>>>> hbase.browser_action2 a where a.row_key > '0' group by >> > >> >> a.`v`.`e0`; >> > >> >> >> >>>>> +------+------+ >> > >> >> >> >>>>> | text | json | >> > >> >> >> >>>>> +------+------+ >> > >> >> >> >>>>> | 00-00 Screen >> > >> >> >> >>>>> 00-01 Project(k=[$0], p=[$1]) >> > >> >> >> >>>>> 00-02 UnionExchange >> > >> >> >> >>>>> 01-01 Project(k=[CONVERT_FROMUTF8($0)], p=[$1]) >> > >> >> >> >>>>> 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >> > >> >> >> >>>>> 01-03 Project($f0=[$0], p=[$1]) >> > >> >> >> >>>>> 01-04 HashToRandomExchange(dist0=[[$0]]) >> > >> >> >> >>>>> 02-01 UnorderedMuxExchange >> > >> >> >> >>>>> 03-01 Project($f0=[$0], p=[$1], >> > >> >> >> >>>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >> > >> >> >> >>>>> 03-02 HashAgg(group=[{0}], >> > p=[COUNT($0)]) >> > >> >> >> >>>>> 03-03 Project($f0=[ITEM($1, >> 'e0')]) >> > >> >> >> >>>>> 03-04 >> > Scan(groupscan=[HBaseGroupScan >> > >> >> >> >>>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, >> > >> >> >> startRow=0\x00, >> > >> >> >> >>>>> stopRow=, filter=null], columns=[`*`]]]) >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> >> > >> >> >> >>>>> and the query plan for the other problem query mentioned >> in >> > >> the >> > >> >> >> first >> > >> >> >> >>>>> email. >> > >> >> >> >>>>> >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> 0: jdbc:drill:schema=dfs.tmp> explain plan for select >> > >> >> >> >>>>> CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8') as >> k, >> > >> >> >> >>>>> . . . . . . . . . . . . . . 
> count(a.row_key) p from >> > >> >> >> >>>>> hbase.browser_action2 a group by >> > >> >> >> >>>>> . . . . . . . . . . . . . . > BYTE_SUBSTR(a.row_key, 1 , >> 9); >> > >> >> >> >>>>> +------+------+ >> > >> >> >> >>>>> | text | json | >> > >> >> >> >>>>> +------+------+ >> > >> >> >> >>>>> | 00-00 Screen >> > >> >> >> >>>>> 00-01 Project(k=[$0], p=[$1]) >> > >> >> >> >>>>> 00-02 UnionExchange >> > >> >> >> >>>>> 01-01 Project(k=[CONVERT_FROMUTF8($0)], p=[$1]) >> > >> >> >> >>>>> 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >> > >> >> >> >>>>> 01-03 Project($f0=[$0], p=[$1]) >> > >> >> >> >>>>> 01-04 HashToRandomExchange(dist0=[[$0]]) >> > >> >> >> >>>>> 02-01 UnorderedMuxExchange >> > >> >> >> >>>>> 03-01 Project($f0=[$0], p=[$1], >> > >> >> >> >>>>> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >> > >> >> >> >>>>> 03-02 HashAgg(group=[{0}], >> > p=[COUNT($1)]) >> > >> >> >> >>>>> 03-03 >> Project($f0=[BYTE_SUBSTR($0, 1, >> > >> 9)], >> > >> >> >> >>>>> row_key=[$0]) >> > >> >> >> >>>>> 03-04 >> > Scan(groupscan=[HBaseGroupScan >> > >> >> >> >>>>> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, >> > >> >> >> startRow=null, >> > >> >> >> >>>>> stopRow=null, filter=null], columns=[`*`]]]) >> > >> >> >> >>>>> {noformat} >> > >> >> >> >>>>> >> > >> >> >> >>>>> Thanks, >> > >> >> >> >>>>> Khurram >> > >> >> >> >>>>> >> > >> >> >> >>>>> On Wed, May 18, 2016 at 7:01 AM, qiang li < >> > >> tiredqi...@gmail.com> >> > >> >> >> >>>>> wrote: >> > >> >> >> >>>>> >> > >> >> >> >>>>>> Yes. >> > >> >> >> >>>>>> I use hbase API to create it. 
>> > >> >> >> >>>>>> >> > >> >> >> >>>>>> The main code is: >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> byte[][] SPLIT_KEYS = { {'0'}, {'1'}, {'2'}, {'3'}, >> {'4'}, >> > >> >> {'5'}, >> > >> >> >> {'6'}, {'7'},{'8'}, {'9'} }; >> > >> >> >> >>>>>> TableName tableName = >> TableName.valueOf("browser_action2"); >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> HTableDescriptor tableDesc = new >> > HTableDescriptor(tableName); >> > >> >> >> >>>>>> HColumnDescriptor columnDesc = new >> HColumnDescriptor("v"); >> > >> >> >> >>>>>> tableDesc.addFamily(columnDesc); >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> >> > columnDesc.setCompressionType(Compression.Algorithm.SNAPPY); >> > >> >> >> >>>>>> columnDesc.setDataBlockEncoding(DataBlockEncoding.DIFF); >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> admin.createTable(tableDesc, SPLIT_KEYS); >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> >> > >> >> >> >>>>>> 2016-05-18 1:48 GMT+08:00 Zelaine Fong < >> zf...@maprtech.com >> > >: >> > >> >> >> >>>>>> >> > >> >> >> >>>>>>> Can you provide the CREATE TABLE statement you used to >> > >> >> reproduce >> > >> >> >> this >> > >> >> >> >>>>>>> problem so we can try to reproduce it on our end. >> > >> >> >> >>>>>>> >> > >> >> >> >>>>>>> Thanks. >> > >> >> >> >>>>>>> >> > >> >> >> >>>>>>> -- Zelaine >> > >> >> >> >>>>>>> >> > >> >> >> >>>>>>> On Tue, May 17, 2016 at 4:50 AM, qiang li < >> > >> >> tiredqi...@gmail.com> >> > >> >> >> >>>>>>> wrote: >> > >> >> >> >>>>>>> >> > >> >> >> >>>>>>> > Hi , >> > >> >> >> >>>>>>> > >> > >> >> >> >>>>>>> > I recently meet a issue that can not query the >> correct >> > >> data >> > >> >> from >> > >> >> >> >>>>>>> hbase with >> > >> >> >> >>>>>>> > sql by drill, can anybody help me. >> > >> >> >> >>>>>>> > >> > >> >> >> >>>>>>> > I test with the drill 1.6. 
>> > >> >> >> >>>>>>> > My hbase scheme: >> > >> >> >> >>>>>>> > rowkey: salt+day+event+uid + ts , eg: >> > 120160411visituidts >> > >> >> >> >>>>>>> > cf : v >> > >> >> >> >>>>>>> > qualifier: v, e0, e1 >> > >> >> >> >>>>>>> > >> > >> >> >> >>>>>>> > The wrong result only happened when I use group by >> > clause. >> > >> >> >> >>>>>>> > >> > >> >> >> >>>>>>> > This sql will not return correct result: >> > >> >> >> >>>>>>> > select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, >> > >> >> count(a.`v`.`e0`) >> > >> >> >> p >> > >> >> >> >>>>>>> from >> > >> >> >> >>>>>>> > hbase.browser_action2 a where a.row_key > '0' group >> by >> > >> >> >> a.`v`.`e0`; >> > >> >> >> >>>>>>> > Part of explain of this sql is: >> > >> >> >> >>>>>>> > >> > >> >> >> >>>>>>> > 0: jdbc:drill:zk=rfdc5> explain plan for select >> > >> >> >> >>>>>>> CONVERT_FROM(a.`v`.`e0`, >> > >> >> >> >>>>>>> > 'UTF8') as k, count(a.`v`.`e0`) p from >> > >> hbase.browser_action2 >> > >> >> a >> > >> >> >> >>>>>>> where >> > >> >> >> >>>>>>> > a.row_key > '0' group by a.`v`.`e0`; >> > >> >> >> >>>>>>> > +------+------+ >> > >> >> >> >>>>>>> > | text | json | >> > >> >> >> >>>>>>> > +------+------+ >> > >> >> >> >>>>>>> > | 00-00 Screen >> > >> >> >> >>>>>>> > 00-01 Project(k=[$0], p=[$1]) >> > >> >> >> >>>>>>> > 00-02 UnionExchange >> > >> >> >> >>>>>>> > 01-01 Project(k=[CONVERT_FROMUTF8($0)], >> p=[$1]) >> > >> >> >> >>>>>>> > 01-02 HashAgg(group=[{0}], p=[$SUM0($1)]) >> > >> >> >> >>>>>>> > 01-03 Project($f0=[$0], p=[$1]) >> > >> >> >> >>>>>>> > 01-04 >> HashToRandomExchange(dist0=[[$0]]) >> > >> >> >> >>>>>>> > 02-01 UnorderedMuxExchange >> > >> >> >> >>>>>>> > 03-01 Project($f0=[$0], p=[$1], >> > >> >> >> >>>>>>> > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) >> > >> >> >> >>>>>>> > 03-02 HashAgg(group=[{0}], >> > >> >> p=[COUNT($0)]) >> > >> >> >> >>>>>>> > 03-03 Project($f0=[ITEM($1, >> > 'e0')]) >> > >> >> >> >>>>>>> > 03-04 >> > >> Scan(groupscan=[HBaseGroupScan >> > >> >> >> >>>>>>> > 
[HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00,
>> > >> >> >> >>>>>>> > stopRow=, filter=null], columns=[`*`]]])
>> > >> >> >> >>>>>>> >
>> > >> >> >> >>>>>>> > The data returns very quickly; the result of this SQL is:
>> > >> >> >> >>>>>>> > +------+--------+
>> > >> >> >> >>>>>>> > |  k   |   p    |
>> > >> >> >> >>>>>>> > +------+--------+
>> > >> >> >> >>>>>>> > | pay  | 12180  |
>> > >> >> >> >>>>>>> > +------+--------+
>> > >> >> >> >>>>>>> >
>> > >> >> >> >>>>>>> > But I have millions of rows in the table.
>> > >> >> >> >>>>>>> >
>> > >> >> >> >>>>>>> > I tried to change the physical plan. If I change the JSON plan's
>> > >> >> >> >>>>>>> > *"columns" : [ "`*`" ]* to *"columns" : [ "`v`.`e0`" ]*, it
>> > >> >> >> >>>>>>> > returns the correct result.
>> > >> >> >> >>>>>>> >
>> > >> >> >> >>>>>>> > It seems the physical plan is not correct.
>> > >> >> >> >>>>>>> > I also tried to debug the SQL parser to find out the reason, but
>> > >> >> >> >>>>>>> > it's too complicated. Can anyone help me?
>> > >> >> >> >>>>>>> >
>> > >> >> >> >>>>>>> > Also, this SQL has the same issue:
>> > >> >> >> >>>>>>> > select CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8') as k,
>> > >> >> >> >>>>>>> > count(a.row_key) p from hbase.browser_action2 a group by
>> > >> >> >> >>>>>>> > BYTE_SUBSTR(a.row_key, 1 , 9);
>> > >> >> >> >>>>>>> > If I change the JSON plan's *"columns" : [ "`*`" ]* to
>> > >> >> >> >>>>>>> > *"columns" : [ "`row_key`" ]*, it returns the correct result.
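Following up on the RowFilter suggestion at the top of this thread: when a predicate is purely a row-key range, the scan's startRow/stopRow alone can express it, which matches what the explain plans above already show for `row_key > '0'` (startRow=0\x00, filter=null). Below is a minimal sketch of how the bounds could be derived for a UTF-8 string row key; the class and method names are hypothetical illustrations, not Drill's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: derive HBase scan bounds from a pure row-key range
// predicate, so no RowFilter is needed at all (filter=null in the plan).
public class ScanBounds {

    // row_key > lower (exclusive): the smallest key strictly greater than
    // `lower` is `lower` with a single 0x00 byte appended.
    static byte[] startRow(String exclusiveLower) {
        byte[] b = exclusiveLower.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(b, b.length + 1); // copyOf zero-pads, appending 0x00
    }

    // row_key < upper (exclusive): HBase's stopRow is already exclusive,
    // so the bound can be used unchanged.
    static byte[] stopRow(String exclusiveUpper) {
        return exclusiveUpper.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Predicate from the thread: row_key > '0' and row_key < '1'
        byte[] start = startRow("0");
        byte[] stop = stopRow("1");
        System.out.println(Arrays.toString(start)); // [48, 0]  i.e. '0' + 0x00
        System.out.println(Arrays.toString(stop));  // [49]     i.e. '1'
    }
}
```

The appended 0x00 makes the lower bound exclusive, and since stopRow is exclusive by HBase's scan semantics, the pair (startRow='0'+0x00, stopRow='1') covers `row_key > '0' and row_key < '1'` exactly, with no filter to evaluate per row.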