[ https://issues.apache.org/jira/browse/PHOENIX-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214243#comment-16214243 ]
Dumindu Buddhika commented on PHOENIX-4139:
-------------------------------------------

[~jamestaylor] The logic for setting hasSeparator is below:
{code:java}
this.hasSeparator = !isFixedLength && (datum != data.get(data.size()-1));
{code}
Here isFixedLength is false. In this scenario the same column expression, TRIM(NAM), is repeated, and I think datum references the same PDatum object for each repetition. Because of that, datum != data.get(data.size()-1) evaluates to false for the repeated columns, which is why hasSeparator is not set. We may need distinct PDatum objects here (though I do not know the performance impact of that), or we need to change this logic.

> select distinct with identical aggregations return weird values
> ----------------------------------------------------------------
>
> Key: PHOENIX-4139
> URL: https://issues.apache.org/jira/browse/PHOENIX-4139
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.12.0
> Environment: minicluster
> Reporter: Csaba Skrabak
> Assignee: Csaba Skrabak
> Priority: Minor
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4139.patch
>
>
> From sme-hbase hipchat room:
> Pulkit Bhardwaj·10:31
> I'm seeing a weird issue with Phoenix, appreciate some thoughts.
> Created a simple table in Phoenix:
> {noformat}
> 0: jdbc:phoenix:> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT
> . . . . . . . . > constraint my_pk primary key (id));
> 0: jdbc:phoenix:> upsert into test_select (nam, address,id) values('pulkit','badaun',1);
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> | NAM     | ADDRESS  | ID  |
> +---------+----------+-----+
> | pulkit  | badaun   | 1   |
> +---------+----------+-----+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", nam from test_select;
> +--------------+---------+
> | test_column  | NAM     |
> +--------------+---------+
> | harshit      | pulkit  |
> +--------------+---------+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam) from test_select;
> +--------------+----------------+----------------+
> | test_column  | TRIM(NAM)      | TRIM(NAM)      |
> +--------------+----------------+----------------+
> | harshit      | pulkitpulkit   | pulkitpulkit   |
> +--------------+----------------+----------------+
> {noformat}
> When I apply a trim on the nam column and use it multiple times, the output has the cell data duplicated!
> {noformat}
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  | TRIM(NAM)             | TRIM(NAM)             | TRIM(NAM)             |
> +--------------+-----------------------+-----------------------+-----------------------+
> | harshit      | pulkitpulkitpulkit    | pulkitpulkitpulkit    | pulkitpulkitpulkit    |
> +--------------+-----------------------+-----------------------+-----------------------+
> {noformat}
> Wondering if someone has seen this before?
> One thing to note: if I remove the distinct 'harshit' as "test_column" part, the issue is not seen.
> {noformat}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | pulkit     | pulkit     | pulkit     |
> +------------+------------+------------+
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
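
For illustration, here is a minimal Java sketch of the reference comparison discussed in the comment above. It does not use any Phoenix classes: SeparatorSketch, byReference, and byIndex are hypothetical names, plain objects stand in for PDatum, and isFixedLength is assumed to be false for every element. byIndex shows only one possible way to change the logic (comparing positions instead of references); it is an assumption, not the fix attached to the issue.

```java
import java.util.Arrays;
import java.util.List;

public class SeparatorSketch {

    // Mirrors the quoted expression with isFixedLength == false:
    // an element gets a separator unless it is, by reference, the last element.
    public static <T> boolean[] byReference(List<T> data) {
        boolean[] hasSeparator = new boolean[data.size()];
        T last = data.get(data.size() - 1);
        for (int i = 0; i < data.size(); i++) {
            hasSeparator[i] = data.get(i) != last; // reference comparison
        }
        return hasSeparator;
    }

    // Hypothetical alternative: decide by position, so a repeated
    // object earlier in the list still gets a separator.
    public static <T> boolean[] byIndex(List<T> data) {
        boolean[] hasSeparator = new boolean[data.size()];
        for (int i = 0; i < data.size(); i++) {
            hasSeparator[i] = i != data.size() - 1;
        }
        return hasSeparator;
    }

    public static void main(String[] args) {
        Object literal = new Object(); // stands in for 'harshit'
        Object trim = new Object();    // stands in for TRIM(NAM), reused twice
        List<Object> data = Arrays.asList(literal, trim, trim);

        // The first TRIM(NAM) shares its reference with the last element,
        // so it is treated as "last" and gets no separator.
        System.out.println(Arrays.toString(byReference(data))); // [true, false, false]
        System.out.println(Arrays.toString(byIndex(data)));     // [true, true, false]
    }
}
```

If the repeated expressions really do share one object, the missing separator after the first TRIM(NAM) would be consistent with the concatenated "pulkitpulkit" values in the report.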