[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15617077#comment-15617077 ]

Hudson commented on PHOENIX-3396:

FAILURE: Integrated in Jenkins build Phoenix-master #1460 (See [https://builds.apache.org/job/Phoenix-master/1460/])
PHOENIX-3396 Valid Multi-byte strings whose total byte size is greater (jamestaylor: rev a097cffb1afeb94ae2ae101d655bfb362ee69845)
* (edit) phoenix-core/src/test/java/org/apache/phoenix/schema/MutationTest.java

> Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
> ---
>
> Key: PHOENIX-3396
> URL: https://issues.apache.org/jira/browse/PHOENIX-3396
> Project: Phoenix
> Issue Type: Bug
> Reporter: Jan Fernando
> Assignee: James Taylor
> Attachments: PHOENIX-3396.patch, PHOENIX-3396_v2.patch
>
> We allow users to insert multi-byte characters into VARCHAR columns that are part of a table or view's PK. We noticed that Strings that had a valid number of characters (i.e. were less than the max char length) were causing upserts to fail with the following exception:
> Caused by: java.sql.SQLException: ERROR 206 (22003): The data exceeds the max capacity for the data type. MYTABLE may not exceed 100 bytes ('緓嗝加슪䐤㵞靹疸芬꽣汚佃䘯茵䖻埾巆蕤ⱅ澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉ķ窯尬룗㚈Ꝝ퍛爃됰灁ᄠࢥ')
> There appears to be an issue in PTableImpl.newKey() where we check the maxLength in chars against the byte length in this check:
> maxLength != null && !type.isArrayType() && byteValue.length > maxLength
> To reproduce you can run the following:
> CREATE TABLE TEXT_FIELD_VALIDATION_PK (TEXT VARCHAR(20), TEXT1 VARCHAR(20) CONSTRAINT PK PRIMARY KEY (TEXT));
> UPSERT INTO TEXT_FIELD_VALIDATION_PK VALUES ('澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉碉', 'test');
> The string we insert into the column TEXT is 20 chars, but greater than 20 bytes.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
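The mismatch described in the report can be illustrated with a small standalone sketch. It is not Phoenix code: the class and method names are hypothetical, and only the comparison `byteValue.length > maxLength` comes from the issue description. It contrasts that byte-based comparison with one based on character count for the same declared limit.

```java
import java.nio.charset.StandardCharsets;

public class MaxLengthCheckSketch {
    /** The comparison quoted in the report: UTF-8 byte length against a limit declared in characters. */
    static boolean exceedsByBytes(String value, int maxLength) {
        byte[] byteValue = value.getBytes(StandardCharsets.UTF_8);
        return byteValue.length > maxLength;
    }

    /** The same limit compared against the number of characters (Unicode code points). */
    static boolean exceedsByChars(String value, int maxLength) {
        return value.codePointCount(0, value.length()) > maxLength;
    }

    public static void main(String[] args) {
        // Build ten CJK characters; each one takes three bytes in UTF-8.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            sb.append("\u4E2D\u6587");
        }
        String value = sb.toString();

        int maxLength = 20; // as in VARCHAR(20)
        System.out.println("rejected by byte check: " + exceedsByBytes(value, maxLength));
        System.out.println("rejected by char check: " + exceedsByChars(value, maxLength));
    }
}
```

With a limit of 20, the ten-character value occupies 30 bytes, so the byte-based comparison rejects it while the character-based one accepts it.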
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616129#comment-15616129 ]

Hudson commented on PHOENIX-3396:

FAILURE: Integrated in Jenkins build Phoenix-master #1458 (See [https://builds.apache.org/job/Phoenix-master/1458/])
PHOENIX-3396 Valid Multi-byte strings whose total byte size is greater (jamestaylor: rev 13ae3f38c11b62f52cacff9ee029955f2bef3b8b)
* (edit) phoenix-core/src/test/java/org/apache/phoenix/schema/MutationTest.java
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615043#comment-15615043 ]

Hudson commented on PHOENIX-3396:

FAILURE: Integrated in Jenkins build Phoenix-master #1456 (See [https://builds.apache.org/job/Phoenix-master/1456/])
PHOENIX-3396 Valid Multi-byte strings whose total byte size is greater (jamestaylor: rev a54a06cf566363054778dc60431553c6384ef34d)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/expression/function/ArrayModifierFunction.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/UpsertSelectIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/SchemaUtil.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PArrayDataType.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PVarchar.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixIndexBuilder.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/parse/ColumnDef.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDecimal.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/ArithmeticQueryIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/exception/SQLExceptionInfo.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/schema/MutationTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PVarbinary.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PBinaryBase.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/PTableImpl.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/expression/function/ArrayConcatFunction.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PChar.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PBinary.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/types/PDataType.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/exception/DataExceedsCapacityException.java
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614380#comment-15614380 ]

Samarth Jain commented on PHOENIX-3396:

+1, patch LGTM. Just one minor nit to fix on commit. In ArithmeticQueryIT, add a fail() here:
{code}
+try {
     stmt.execute();
     conn.commit();
+} catch (SQLException e) {
+    assertEquals(SQLExceptionCode.DATA_EXCEEDS_MAX_CAPACITY.getErrorCode(), e.getErrorCode());
+}
{code}
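The nit above concerns the usual expected-exception pattern: without a fail() after the commit, the test passes silently when no exception is thrown. The following is a plain-Java sketch of that pattern, not the actual ArithmeticQueryIT change. The `Upsert` interface, the local `fail` helper (standing in for JUnit's), and the class name are all hypothetical; the error code 206 is taken from the "ERROR 206 (22003)" message in the issue description.

```java
import java.sql.SQLException;

public class ExpectedExceptionSketch {
    // Error code from the "ERROR 206 (22003)" message quoted in the issue.
    static final int DATA_EXCEEDS_MAX_CAPACITY = 206;

    /** Hypothetical stand-in for stmt.execute() followed by conn.commit(). */
    interface Upsert {
        void executeAndCommit() throws SQLException;
    }

    /** Stand-in for JUnit's fail(), so the sketch needs no test library. */
    static void fail(String message) {
        throw new AssertionError(message);
    }

    /** Returns normally only if the upsert is rejected with the expected error code. */
    static void assertRejected(Upsert upsert) {
        try {
            upsert.executeAndCommit();
            // Reached only when no exception was thrown, which is itself a test failure.
            fail("Expected SQLException with error code " + DATA_EXCEEDS_MAX_CAPACITY);
        } catch (SQLException e) {
            if (e.getErrorCode() != DATA_EXCEEDS_MAX_CAPACITY) {
                throw new AssertionError("Unexpected error code: " + e.getErrorCode());
            }
        }
    }

    public static void main(String[] args) {
        assertRejected(() -> {
            throw new SQLException("The data exceeds the max capacity for the data type.", "22003", 206);
        });
        System.out.println("rejected as expected");
    }
}
```

Because `fail` throws an AssertionError rather than a SQLException, the catch block does not swallow it, and a statement that unexpectedly succeeds is reported as a failure.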
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602892#comment-15602892 ]

James Taylor commented on PHOENIX-3396:

We have a utility, StringUtil.calculateUTF8Length(byte[] bytes, int offset, int length, SortOrder sortOrder), that could be used in PTableImpl.newKey() to calculate the length based on the byte[] value (to avoid creating a String).
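To show how a character count can be derived from the byte[] value without creating a String, here is a minimal sketch. It is not the Phoenix StringUtil implementation: the class and method names are hypothetical, the input is assumed to be valid UTF-8, and the SortOrder parameter of the real utility is left out.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthSketch {
    /**
     * Counts the characters (Unicode code points) in a range of UTF-8 bytes
     * without decoding them into a String. Assumes the range holds valid UTF-8.
     */
    static int countUtf8Chars(byte[] bytes, int offset, int length) {
        int count = 0;
        for (int i = offset; i < offset + length; i++) {
            // Continuation bytes have the form 10xxxxxx; any other byte starts a new character.
            if ((bytes[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two CJK characters, three UTF-8 bytes each.
        byte[] utf8 = "\u4E2D\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes: " + utf8.length);
        System.out.println("chars: " + countUtf8Chars(utf8, 0, utf8.length));
    }
}
```

Counting lead bytes this way makes a single pass over the array and allocates nothing, which is the point of working on the byte[] directly.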
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602840#comment-15602840 ]

James Taylor commented on PHOENIX-3396:

No good reason, [~jfernando_sfdc]. I agree, it should consistently interpret the length as characters instead of bytes.
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602805#comment-15602805 ]

Jan Fernando commented on PHOENIX-3396:

[~giacomotaylor] Was there a reason this works differently for non-PK and PK fields? I only see this issue if a field is part of the PK. I think the behavior should be consistent in all cases.
[jira] [Commented] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596812#comment-15596812 ]

James Taylor commented on PHOENIX-3396:

Phoenix interprets the max as bytes instead of characters. That was done on purpose, but using characters instead would make more sense.