Re: drill hbase logical plan with group by

2016-05-12 Thread Jinfeng Ni
Can you post the complete plan for the wrong result case?

From the SCAN part alone, it's not clear that the plan would produce a
wrong result. columns=[*] is not optimal for performance, but it does
not mean the plan would produce a wrong result.

In particular, you can check the JSON form of the physical plan,
especially the "Project" right after the SCAN, to see whether it
outputs the right result for the expression a.v.e0.


{
"pop" : "project",
"@id" : 196611,
"exprs" : [ {
  "ref" : "`$f0`",
  "expr" : "`v`.`e0`"
} ],
...
}
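
One way to get that JSON plan, as a sketch: prefix the failing query
with EXPLAIN PLAN FOR. Drill then returns the plan in a text form and a
json form instead of running the query. The statement below is simply
the query from your mail; nothing else about your setup is assumed.

```sql
-- Returns the plan without executing the query. In the "json" output,
-- look for the "project" operator that follows the HBase scan.
EXPLAIN PLAN FOR
select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
from hbase.browser_action2 a
where a.row_key > '0' group by a.`v`.`e0`;
```

Running the same statement for the query without the where clause gives
a second plan to compare against.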

To help us reproduce the problem, consider sharing a simplified dataset,
along with the expected result and the wrong result for your query.


On Thu, May 12, 2016 at 2:27 AM, qiang li wrote:
> Dear,
>
> Please help. I recently ran into an issue while testing Drill with HBase.
>
> I tested two SQL queries; one returns the wrong result and the other
> returns the correct result.
>
> the wrong sql:
> select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
> from hbase.browser_action2 a
> where a.row_key >'0' group by a.`v`.`e0`;
>
> 03-04  Scan(groupscan=[HBaseGroupScan
> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00,
> stopRow=, filter=null], *columns=[`*`]*]])
>
> the correct sql:
> select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
> from hbase.browser_action2 a group by a.`v`.`e0`;
>
> 03-04  Scan(groupscan=[HBaseGroupScan
> [HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=null,
> stopRow=null, filter=null], *columns=[`v`.`e0`]*]])
>
> As you can see, the difference between the two plans is the columns. With
> the where clause, the scan reads columns=[`*`], which is not what I want.
> Can anyone help? I do not know how to submit an issue for Drill at
> issues.apache.org.


drill hbase logical plan with group by

2016-05-12 Thread qiang li
Dear,

Please help. I recently ran into an issue while testing Drill with HBase.

I tested two SQL queries; one returns the wrong result and the other
returns the correct result.

the wrong sql:
select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
from hbase.browser_action2 a
where a.row_key >'0' group by a.`v`.`e0`;

03-04  Scan(groupscan=[HBaseGroupScan
[HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00,
stopRow=, filter=null], *columns=[`*`]*]])

the correct sql:
select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p
from hbase.browser_action2 a group by a.`v`.`e0`;

03-04  Scan(groupscan=[HBaseGroupScan
[HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=null,
stopRow=null, filter=null], *columns=[`v`.`e0`]*]])

As you can see, the difference between the two plans is the columns. With
the where clause, the scan reads columns=[`*`], which is not what I want.
Can anyone help? I do not know how to submit an issue for Drill at
issues.apache.org.