[ https://issues.apache.org/jira/browse/PHOENIX-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204396#comment-16204396 ]

Ethan Wang edited comment on PHOENIX-4283 at 10/14/17 1:27 AM:
---------------------------------------------------------------

So in GroupByCompiler, when a nested GROUP BY is evaluated, in this case it 
will try to coerce all of the leading ProjectedColumnExpressions (i.e., all 
except the last one) inside the expressions to the required types. E.g., for 
(A,C) it is (A) that gets converted; for (A,B,C,D,E) it is (A,B,C,D).

When doing so, IndexUtil.getIndexColumnDataType() treats BIGINT (PLong) as 
coercible to PDecimal (via PLong.isComparableTo()). As a result, all of the 
leading BIGINT ProjectedColumnExpressions are now cast to decimal. The decimal 
is later converted back to the appropriate type (Integer or Long).
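As a standalone illustration of why such a round trip is dangerous for a 
19-digit BIGINT (a sketch using plain long-to-double coercion, not Phoenix's 
actual PLong/PDecimal code path):

```java
public class PrecisionLoss {
    public static void main(String[] args) {
        // The value from the reproduction case below.
        long original = 4444444444444444444L;

        // A double carries only ~15-16 significant decimal digits, so a
        // 19-digit long is rounded when coerced to floating point and the
        // low-order digits cannot be recovered on the way back.
        double coerced = (double) original;
        long roundTripped = (long) coerced;

        System.out.println(original == roundTripped); // false
        System.out.println(roundTripped);             // 4444444444444444672
    }
}
```

Phoenix's PDecimal is backed by BigDecimal and is not inherently lossy in the 
same way, so the exact truncation path in this bug may differ; the sketch only 
shows that any coercion through a narrower numeric representation can silently 
corrupt BIGINT values of this magnitude.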

If I understand correctly, the reason behind this is that the coprocessor 
needs something to sort by in the case where the GROUP BY is not along the PK 
axis. However, when a region has no rows for such a GROUP BY, it needs a null 
placeholder, and DECIMAL is the "appropriate" type from which we can construct 
such a null. Is this correct?



> Group By statement truncating BIGINTs
> -------------------------------------
>
>                 Key: PHOENIX-4283
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4283
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Steven Sadowski
>            Assignee: Ethan Wang
>             Fix For: 4.12.1
>
>
> *Versions:*
> Phoenix 4.11.0
> HBase: 1.3.1
> (Amazon EMR: 5.8.0)
> *Steps to reproduce:*
> 1. From the `sqlline-thin.py` client setup the following table:
> {code:sql}
> CREATE TABLE test_table (
>     a BIGINT NOT NULL, 
>     c BIGINT NOT NULL
>     CONSTRAINT PK PRIMARY KEY (a, c)
> );
> UPSERT INTO test_table(a,c) VALUES(4444444444444444444, 5555555555555555555);
> SELECT a FROM (SELECT a, c FROM test_table GROUP BY a, c) GROUP BY a, c;
> {code}
> *Expected Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444444444  |
> +----------------------+
> {code}
> *Actual Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444000000  |
> +----------------------+
> {code}
> *Comments:*
> Having the two Group By statements together seems to truncate the last 6 or 
> so digits of the final result. Removing the outer (or either) group by will 
> produce the correct result.
> Please fix the Group by statement to not truncate the outer result's value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
