[jira] [Comment Edited] (PHOENIX-4139) select distinct with identical aggregations return weird values

Csaba Skrabak (JIRA) Wed, 08 Nov 2017 02:06:42 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243646#comment-16243646
 ]


Csaba Skrabak edited comment on PHOENIX-4139 at 11/8/17 10:05 AM:
------------------------------------------------------------------

In the ExpressionCompiler.wrapGroupByExpression(Expression) method, there is an 
indexOf call:
            int index = groupBy.getExpressions().indexOf(expression);

If there are two equal expressions in the groupBy, they should have different 
indexes in their accessors, but both get the same value from indexOf (which by 
design returns the first matching element). That index is then used just a few 
lines below:

                RowKeyValueAccessor accessor =
                        new RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
                expression = new RowKeyColumnExpression(expression, accessor,
                        groupBy.getKeyExpressions().get(index).getDataType());

This makes me think that the GroupBy fields should have more powerful data 
structures than Lists to store keyExpressions and expressions. But now I'm not 
sure what the whole GroupBy class is really used for. I don't want to tinker 
with it and maybe break the design until I understand.

So I think the list holds what we're doing a GROUP BY over, but its elements 
are wrong.
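
To illustrate the indexOf behaviour described above, here is a minimal, standalone sketch. It does not use any Phoenix classes: plain strings stand in for the group-by expressions, and the class and method names (IndexOfDemo, indexesViaIndexOf, indexesByPosition) are made up for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexOfDemo {

    // Hypothetical helper: for every element, record the index that
    // List.indexOf reports. Equal elements all get the index of the
    // first occurrence.
    static List<Integer> indexesViaIndexOf(List<String> expressions) {
        List<Integer> result = new ArrayList<>();
        for (String expression : expressions) {
            result.add(expressions.indexOf(expression));
        }
        return result;
    }

    // Hypothetical alternative: record the actual position of every
    // element, so duplicates keep distinct indexes.
    static List<Integer> indexesByPosition(List<String> expressions) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < expressions.size(); i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Strings standing in for group-by expressions (an assumption,
        // not the real contents of GroupBy.getExpressions()).
        List<String> groupBy = Arrays.asList("NAM", "TRIM(NAM)", "TRIM(NAM)");

        System.out.println(indexesViaIndexOf(groupBy)); // prints [0, 1, 1]
        System.out.println(indexesByPosition(groupBy)); // prints [0, 1, 2]
    }
}
```

With indexOf, both copies of the duplicated element map to index 1, which matches the suspicion that both accessors end up pointing at the same key expression. Whether tracking positions is the right fix for GroupBy is a separate question.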


> select distinct with identical aggregations return weird values 
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4139
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4139
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: minicluster
>            Reporter: Csaba Skrabak
>            Assignee: Csaba Skrabak
>            Priority: Minor
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4139.patch
>
>
> From sme-hbase hipchat room:
> Pulkit Bhardwaj·10:31
> i'm seeing a weird issue with phoenix, appreciate some thoughts
> Created a simple table in phoenix
> {noformat}
> 0: jdbc:phoenix:> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT
> . . . . . . . . > constraint my_pk primary key (id));
> 0: jdbc:phoenix:> upsert into test_select (nam, address,id) values('pulkit','badaun',1);
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | pulkit  | badaun   | 1   |
> +---------+----------+-----+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", nam from test_select;
> +--------------+---------+
> | test_column  |   NAM   |
> +--------------+---------+
> | harshit      | pulkit  |
> +--------------+---------+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam) from test_select;
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | harshit      | pulkitpulkit  | pulkitpulkit  |
> +--------------+----------------+----------------+
> {noformat}
> When I apply a trim on the nam column and use it multiple times, the output 
> has the cell data duplicated!
> {noformat}
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | harshit      | pulkitpulkitpulkit  | pulkitpulkitpulkit  | pulkitpulkitpulkit  |
> +--------------+-----------------------+-----------------------+-----------------------+
> {noformat}
> Wondering if someone has seen this before??
> One thing to note: if I remove the distinct 'harshit' as "test_column" part, 
> the issue is not seen.
> {noformat}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | pulkit     | pulkit     | pulkit     |
> +------------+------------+------------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
