[ https://issues.apache.org/jira/browse/PHOENIX-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787222#comment-15787222 ]
chenglei (comnetwork) edited comment on PHOENIX-3453 at 12/30/16 9:09 AM:
-------------------------------------------------------------

I wrote the following test case to reproduce this problem under 4.9.0. It simplifies the original test case by removing the index table and changing the type from CHAR(15) to INTEGER, which is easier to debug:

{code:borderStyle=solid}
CREATE TABLE GROUPBY3453_INT (
    ENTITY_ID INTEGER NOT NULL,
    CONTAINER_ID INTEGER NOT NULL,
    SCORE INTEGER NOT NULL,
    CONSTRAINT TEST_PK PRIMARY KEY (ENTITY_ID DESC,CONTAINER_ID DESC,SCORE DESC)
)

UPSERT INTO GROUPBY3453_INT VALUES (1,1,1)

select DISTINCT entity_id, score from ( select entity_id, score from GROUPBY3453_INT limit 1)
{code}

The expected result is:

{code:borderStyle=solid}
1 1
{code}

but the actual result is:

{code:borderStyle=solid}
-104 1
{code}

This problem can only be reproduced when the SQL has a subquery.

When I debugged into the source code, I found that the cause of the problem is the DISTINCT (or GROUP BY) clause in the outer query. In the following code from the GroupByCompiler.GroupBy.compile method, the "entity_id" column in GroupBy's expressions is a ProjectedColumnExpression, but at line 245 the "entity_id" column in GroupBy's keyExpressions is replaced by a CoerceExpression, which converts the column from PInteger to PDecimal:

{code:borderStyle=solid}
232    for (int i = expressions.size()-2; i >= 0; i--) {
233        Expression expression = expressions.get(i);
234        PDataType keyType = getGroupByDataType(expression);
235        if (keyType == expression.getDataType()) {
236            continue;
237        }
238        // Copy expressions only when keyExpressions will be different than expressions
239        if (keyExpressions == expressions) {
240            keyExpressions = new ArrayList<Expression>(expressions);
241        }
242        // Wrap expression in an expression that coerces the expression to the required type..
243        // This is done so that we have a way of expressing null as an empty key when more
244        // than one fixed and nullable types are used in a group by clause
245        keyExpressions.set(i, CoerceExpression.create(expression, keyType));
246    }
{code}

> Secondary index and query using distinct: Outer query results in ERROR 201
> (22000): Illegal data.
> CHAR types may only contain single byte characters
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3453
> URL: https://issues.apache.org/jira/browse/PHOENIX-3453
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.8.0
> Reporter: Joel Palmert
> Assignee: chenglei
>
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
>     ENTITY_ID CHAR(15) NOT NULL,
>     SCORE DOUBLE,
>     CONSTRAINT TEST_PK PRIMARY KEY (
>         ENTITY_ID
>     )
> ) VERSIONS=1, MULTI_TENANT=FALSE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (SCORE DESC, ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('entity1',1.1);
> SELECT DISTINCT entity_id, score
> FROM(
>     SELECT entity_id, score
>     FROM test.test
>     LIMIT 25
> );
> Output (in SQuirreL)
> ��������������� 1.1
> If you run it in SQuirreL it results in the entity_id column getting the
> above error value. Notice that if you remove the secondary index or DISTINCT
> you get the correct result.
> I've also run the query through the Phoenix java api. Then I get the
> following exception:
> Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. CHAR types may only contain single byte characters (????????????)
>     at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:454)
>     at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>     at org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:291)
>     at org.apache.phoenix.schema.types.PChar.toObject(PChar.java:121)
>     at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:997)
>     at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>     at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:608)
>     at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:621)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)