[ https://issues.apache.org/jira/browse/PHOENIX-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212846#comment-16212846 ]
Csaba Skrabak commented on PHOENIX-4139:
----------------------------------------

The NUL-separated value is generated in GroupedAggregateRegionObserver.scanUnordered, but that is intentional. It is called from the HBase scan for each row.

RowKeyColumnExpression.evaluate is called by ExpressionProjector.getValue on each Tuple generated by scanUnordered. In the error case, this evaluate returns (sets ptr to) a string containing the NUL. Its accessor.getOffset and accessor.getLength do not return the correct values.

> select distinct with identical aggregations return weird values
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4139
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4139
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: minicluster
>            Reporter: Csaba Skrabak
>            Assignee: Csaba Skrabak
>            Priority: Minor
>             Fix For: 4.13.0
>
>         Attachments: PHOENIX-4139.patch
>
>
> From sme-hbase hipchat room:
> Pulkit Bhardwaj·10:31
> i'm seeing a weird issue with phoenix, appreciate some thoughts
> Created a simple table in phoenix
> {noformat}
> 0: jdbc:phoenix:> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT
> . . . . . . . . > constraint my_pk primary key (id));
> 0: jdbc:phoenix:> upsert into test_select (nam, address,id) values('pulkit','badaun',1);
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | pulkit  | badaun   | 1   |
> +---------+----------+-----+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", nam from test_select;
> +--------------+---------+
> | test_column  |   NAM   |
> +--------------+---------+
> | harshit      | pulkit  |
> +--------------+---------+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam) from test_select;
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | harshit      | pulkitpulkit   | pulkitpulkit   |
> +--------------+----------------+----------------+
> {noformat}
> When I apply a trim on the nam column and use it multiple times, the output
> has the cell data duplicated!
> {noformat}
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | harshit      | pulkitpulkitpulkit    | pulkitpulkitpulkit    | pulkitpulkitpulkit    |
> +--------------+-----------------------+-----------------------+-----------------------+
> {noformat}
> Wondering if someone has seen this before??
> One thing to note is, if I remove the -- distinct 'harshit' as "test_column"
> -- The issue is not seen
> {noformat}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | pulkit     | pulkit     | pulkit     |
> +------------+------------+------------+
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
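
To illustrate the comment at the top of this message, here is a minimal, self-contained sketch of how one value is sliced out of a NUL-separated key by offset and length. This is not Phoenix code: the class and the methods buildKey, fieldBounds and slice are hypothetical names invented for the illustration, and the real accessor logic may differ. It also assumes that the console in the report simply does not render the NUL byte, which would explain why the faulty value shows up as "pulkitpulkit".

```java
import java.nio.charset.StandardCharsets;

public class NulSeparatedKeySketch {
    // Assumption: variable-length fields are separated by a single NUL byte.
    static final byte SEPARATOR = 0;

    /** Joins the values into one key with a NUL byte between them. */
    static byte[] buildKey(String... values) {
        return String.join("\0", values).getBytes(StandardCharsets.UTF_8);
    }

    /** Returns {offset, length} of the field at the given zero-based position. */
    static int[] fieldBounds(byte[] key, int position) {
        int offset = 0;
        for (int i = 0; i < position; i++) {
            // Advance to the next separator, then step over it.
            while (offset < key.length && key[offset] != SEPARATOR) {
                offset++;
            }
            offset++;
        }
        if (offset > key.length) {
            throw new IllegalArgumentException("no field at position " + position);
        }
        int end = offset;
        while (end < key.length && key[end] != SEPARATOR) {
            end++;
        }
        return new int[] {offset, end - offset};
    }

    /** Decodes the byte range [offset, offset + length) of the key. */
    static String slice(byte[] key, int offset, int length) {
        return new String(key, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] key = buildKey("pulkit", "pulkit");

        // Correct bounds select exactly one field.
        int[] bounds = fieldBounds(key, 1);
        System.out.println(slice(key, bounds[0], bounds[1]));

        // Wrong bounds (here: the whole key) yield a value that still
        // contains the separator; with the NUL removed for display it
        // looks like the duplicated cell data from the report.
        String wrong = slice(key, 0, key.length);
        System.out.println(wrong.replace("\0", ""));
    }
}
```

With correct bounds the second field comes back as a single "pulkit"; when the offset and length cover the whole key, the result still contains the NUL and displays as the concatenated value.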