[jira] [Comment Edited] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602805#comment-15602805 ]

Jan Fernando edited comment on PHOENIX-3396 at 10/24/16 6:25 PM:
-----------------------------------------------------------------

[~giacomotaylor] Was there a reason this works differently with non-PK and PK fields? I only see this issue if a field is part of the PK. I think the behavior should be consistent in all cases.

was (Author: jfernando_sfdc):
[~giacomotaylor] Was there a reason this works differently with non-PK and PK fields. I only see this issue if a fields is part of the PK. I think the behavior should be consistent in all cases.

> Valid Multi-byte strings whose total byte size is greater than the max char
> limit cannot be inserted into VARCHAR fields in the PK
> ---
>
> Key: PHOENIX-3396
> URL: https://issues.apache.org/jira/browse/PHOENIX-3396
> Project: Phoenix
> Issue Type: Bug
> Reporter: Jan Fernando
>
> We allow users to insert multi-byte characters into VARCHAR columns that are
> part of a table or view's PK. We noticed that Strings that had a valid number
> of characters (i.e. were less than the max char length) were causing upserts
> to fail with the following exception:
>
> Caused by: java.sql.SQLException: ERROR 206 (22003): The data exceeds the max
> capacity for the data type. MYTABLE may not exceed 100 bytes
> ('緓嗝加슪䐤㵞靹疸芬꽣汚佃䘯茵䖻埾巆蕤ⱅ澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉ķ窯尬룗㚈Ꝝ퍛爃됰灁ᄠࢥ')
>
> There appears to be an issue in PTableImpl.newKey() where we check the
> maxLength in chars against the byte length in this check:
>
> maxLength != null && !type.isArrayType() && byteValue.length > maxLength
>
> To reproduce you can run the following:
>
> CREATE TABLE TEXT_FIELD_VALIDATION_PK (TEXT VARCHAR(20), TEXT1 VARCHAR(20)
> CONSTRAINT PK PRIMARY KEY (TEXT));
> UPSERT INTO TEXT_FIELD_VALIDATION_PK VALUES ('澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉碉', 'test');
>
> The string we insert into the column TEXT is 20 chars, but greater than 20
> bytes.
> -- This message was sent by Atlassian JIRA (v6.3.4#6332)
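
The mismatch described in the issue can be illustrated outside Phoenix with plain Java. This is a minimal sketch, not Phoenix code: the class and method names (VarcharLengthCheck, exceedsByBytes, exceedsByChars, repeat) are hypothetical, the value is assumed to be encoded as UTF-8, and the test string is 20 copies of a single 3-byte character rather than the string from the report.

```java
import java.nio.charset.StandardCharsets;

public class VarcharLengthCheck {

    // Same shape as the comparison quoted in the issue:
    // the encoded byte length is compared against the declared max length.
    static boolean exceedsByBytes(String value, int maxLength) {
        byte[] byteValue = value.getBytes(StandardCharsets.UTF_8);
        return byteValue.length > maxLength;
    }

    // Hypothetical alternative: compare the number of characters
    // (Unicode code points) against the declared max length.
    static boolean exceedsByChars(String value, int maxLength) {
        return value.codePointCount(0, value.length()) > maxLength;
    }

    // Helper that builds a string of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+6FB4 takes 3 bytes in UTF-8.
        String value = repeat('\u6FB4', 20);
        int maxLength = 20; // as in VARCHAR(20)

        System.out.println("chars: " + value.length());                                // 20
        System.out.println("bytes: " + value.getBytes(StandardCharsets.UTF_8).length); // 60
        System.out.println("exceeds by bytes: " + exceedsByBytes(value, maxLength));   // true
        System.out.println("exceeds by chars: " + exceedsByChars(value, maxLength));   // false
    }
}
```

A value that fits VARCHAR(20) when counted in characters is rejected when the same limit is applied to its encoded bytes, which matches the behavior reported for PK columns.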
[jira] [Comment Edited] (PHOENIX-3396) Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596812#comment-15596812 ]

James Taylor edited comment on PHOENIX-3396 at 10/22/16 1:41 AM:
-----------------------------------------------------------------

Phoenix interprets the max as bytes instead of characters. That was done on purpose, but using characters instead would make more sense. As a workaround, just increase the max length by a factor of 3; it won't have any impact on performance. To do that, you can follow the directions outlined here: http://search-hadoop.com/m/9UY0h21LtMS1KMm942&subj=Re+Can+I+change+a+String+column+s+size+and+preserve+the+data+

was (Author: jamestaylor):
Phoenix interprets the max as bytes instead of characters. That was done on purpose, but using characters instead would make more sense.
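
One way to see why a factor of 3 is enough for the workaround: in UTF-8, a character in the Basic Multilingual Plane takes at most 3 bytes, and a supplementary character takes 4 bytes but occupies 2 Java chars. The sketch below checks that bound in plain Java. It assumes UTF-8 encoding and a limit counted in Java chars (UTF-16 code units); the class and method names (Utf8Bound, utf8Length, worstCaseUtf8Bytes, withinBound) are hypothetical and not part of Phoenix.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bound {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Upper bound on UTF-8 bytes for a string of at most maxChars Java chars.
    static int worstCaseUtf8Bytes(int maxChars) {
        return 3 * maxChars;
    }

    // True if the encoded size stays within 3 bytes per Java char.
    static boolean withinBound(String s) {
        return utf8Length(s) <= worstCaseUtf8Bytes(s.length());
    }

    public static void main(String[] args) {
        String oneByte = "a";          // ASCII, 1 byte
        String twoBytes = "\u056A";    // Armenian letter, 2 bytes
        String threeBytes = "\u6FB4";  // CJK ideograph, 3 bytes
        // Supplementary character: 2 Java chars (a surrogate pair), 4 bytes.
        String fourBytes = new String(Character.toChars(0x1F600));

        for (String s : new String[] {oneByte, twoBytes, threeBytes, fourBytes}) {
            System.out.println(s.length() + " char(s), " + utf8Length(s)
                    + " byte(s), within bound: " + withinBound(s));
        }

        // A VARCHAR(20) limit meant as characters would need 60 bytes.
        System.out.println(worstCaseUtf8Bytes(20)); // 60
    }
}
```

Under these assumptions, a column declared as VARCHAR(20) with character semantics in mind would be redeclared with a max of 60 to admit any 20-char value.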