[jira] [Comment Edited] (PHOENIX-4283) Group By statement truncating BIGINTs

Ethan Wang (JIRA) Sun, 15 Oct 2017 12:47:21 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205268#comment-16205268 ]


Ethan Wang edited comment on PHOENIX-4283 at 10/15/17 7:46 PM:
---------------------------------------------------------------

Yes, when the cut-off happens, the "actualType" is Decimal and "this" is BigInt. The 
"actualType" is passed from RowKeyColumnExpression.fromType, which is Decimal in 
the nested group-by case. So, for comparison, in RowKeyColumnExpression:

 logic  :         type.coerceBytes(ptr, fromType);
*nested*:      BIGINT.coerceBytes(ptr, DECIMAL);
*normal*:     BIGINT.coerceBytes(ptr, BIGINT);

I think an issue may be that, in PLong, the coerceBytes() override shortens the 
pointer to the size of a LONG *regardless* of the "actualType", before passing it 
to super.coerceBytes(). Therefore, the cut-off always gets executed.

{code}
    @Override
    public void coerceBytes(ImmutableBytesWritable ptr, Object object, PDataType actualType,
            Integer maxLength, Integer scale, SortOrder actualModifier, Integer desiredMaxLength,
            Integer desiredScale, SortOrder expectedModifier) {
        // Decrease size of TIMESTAMP to size of LONG and continue coerce
        if (ptr.getLength() > getByteSize()) {
            ptr.set(ptr.get(), ptr.getOffset(), getByteSize());
        }
        super.coerceBytes(ptr, object, actualType, maxLength, scale, actualModifier,
                desiredMaxLength, desiredScale, expectedModifier);
    }
{code}
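
To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It does not use the Phoenix classes: `CoerceSketch`, `SimpleType`, `truncateAlways` and `truncateIfTimestamp` are hypothetical names invented for illustration, and plain byte arrays stand in for the `ImmutableBytesWritable` pointer. The guard on TIMESTAMP is only one possible reading of the code comment in the override above, not necessarily the fix that should be applied.

```java
import java.util.Arrays;

// Hypothetical stand-in for the PLong override; not the real Phoenix API.
final class CoerceSketch {
    // Simplified type tags, assumed for this sketch only.
    enum SimpleType { BIGINT, DECIMAL, TIMESTAMP }

    // A LONG occupies 8 bytes.
    static final int LONG_BYTE_SIZE = Long.BYTES;

    // Mirrors the behaviour described above: any value longer than 8 bytes
    // is shortened to 8 bytes, whatever the source type is.
    static byte[] truncateAlways(byte[] value, SimpleType actualType) {
        if (value.length > LONG_BYTE_SIZE) {
            return Arrays.copyOf(value, LONG_BYTE_SIZE);
        }
        return value;
    }

    // Assumed alternative: shorten only when the source type is TIMESTAMP,
    // which is the one case the code comment mentions.
    static byte[] truncateIfTimestamp(byte[] value, SimpleType actualType) {
        if (actualType == SimpleType.TIMESTAMP && value.length > LONG_BYTE_SIZE) {
            return Arrays.copyOf(value, LONG_BYTE_SIZE);
        }
        return value;
    }
}
```

With an 11-byte placeholder array tagged as DECIMAL, `truncateAlways` returns 8 bytes while `truncateIfTimestamp` returns all 11, so the bytes that a later DECIMAL-to-BIGINT conversion would need are still available in the second case.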


> Group By statement truncating BIGINTs
> -------------------------------------
>
>                 Key: PHOENIX-4283
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4283
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Steven Sadowski
>            Assignee: Ethan Wang
>             Fix For: 4.12.1
>
>
> *Versions:*
> Phoenix 4.11.0
> HBase: 1.3.1
> (Amazon EMR: 5.8.0)
> *Steps to reproduce:*
> 1. From the `sqlline-thin.py` client, set up the following table:
> {code:sql}
> CREATE TABLE test_table (
>     a BIGINT NOT NULL, 
>     c BIGINT NOT NULL
>     CONSTRAINT PK PRIMARY KEY (a, c)
> );
> UPSERT INTO test_table(a,c) VALUES(4444444444444444444, 5555555555555555555);
> SELECT a FROM (SELECT a, c FROM test_table GROUP BY a, c) GROUP BY a, c;
> {code}
> *Expected Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444444444  |
> +----------------------+
> {code}
> *Actual Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444000000  |
> +----------------------+
> {code}
> *Comments:*
> Having the two Group By statements together seems to truncate the last 6 or 
> so digits of the final result. Removing the outer (or either) group by will 
> produce the correct result.
> Please fix the Group by statement to not truncate the outer result's value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
