[ https://issues.apache.org/jira/browse/PHOENIX-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596812#comment-15596812 ]
James Taylor edited comment on PHOENIX-3396 at 10/22/16 1:41 AM:
-----------------------------------------------------------------

Phoenix interprets the max length as bytes instead of characters. That was done on purpose, but using characters instead would make more sense. As a workaround, just increase the max length by a factor of 3; it won't have any impact on performance. To do that, you can follow the directions outlined here: http://search-hadoop.com/m/9UY0h21LtMS1KMm942&subj=Re+Can+I+change+a+String+column+s+size+and+preserve+the+data+

was (Author: jamestaylor):
Phoenix interprets the max as bytes instead of characters. That was done on purpose, but using characters instead would make more sense.

> Valid Multi-byte strings whose total byte size is greater than the max char limit cannot be inserted into VARCHAR fields in the PK
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3396
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3396
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Jan Fernando
>
> We allow users to insert multi-byte characters into VARCHAR columns that are
> part of a table or view's PK. We noticed that Strings that had a valid number
> of characters (i.e. were less than the max char length) were causing upserts
> to fail with the following exception:
> Caused by: java.sql.SQLException: ERROR 206 (22003): The data exceeds the max
> capacity for the data type. MYTABLE may not exceed 100 bytes
> ('緓嗝加슪䐤㵞靹疸芬꽣汚佃䘯茵䖻埾巆蕤ⱅ澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉ķ窯尬룗㚈Ꝝ퍛爃됰灁ᄠࢥ')
> There appears to be an issue in PTableImpl.newKey() where we check the
> maxLength in chars against the byte length in this check:
> maxLength != null && !type.isArrayType() && byteValue.length > maxLength
> To reproduce you can run the following:
> CREATE TABLE TEXT_FIELD_VALIDATION_PK (TEXT VARCHAR(20), TEXT1 VARCHAR(20)
> CONSTRAINT PK PRIMARY KEY (TEXT));
> UPSERT INTO TEXT_FIELD_VALIDATION_PK VALUES ('澴粖蟤य褻酃岤豦팑薰鄩脼ժ끦碉碉碉碉碉碉', 'test');
> The string we insert into the column TEXT is 20 chars, but greater than 20
> bytes.
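
The mismatch described in this issue can be illustrated outside Phoenix with a minimal Java sketch. It assumes the VARCHAR value is serialized as UTF-8, and the class and method names below (Utf8LengthCheck, exceedsWhenCountedInBytes, and so on) are hypothetical helpers written for this illustration, not Phoenix APIs. The comparison only mirrors the shape of the byteValue.length > maxLength check quoted above. It also shows why tripling the declared length works for this kind of data: a character in the Basic Multilingual Plane takes at most 3 bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; none of this is Phoenix code.
final class Utf8LengthCheck {

    // Length in UTF-16 code units, which is what String.length() reports.
    static int charLength(String s) {
        return s.length();
    }

    // Length of the value once encoded as UTF-8 (assumed storage encoding).
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Same shape as the check quoted in the issue: the byte length is
    // compared against a limit that the user declared in characters.
    static boolean exceedsWhenCountedInBytes(String s, int maxLength) {
        return utf8ByteLength(s) > maxLength;
    }

    // Builds a string made of `count` copies of `c` (Java 8 compatible).
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Twenty copies of U+7889, a CJK character that needs 3 bytes in UTF-8.
        String value = repeat('\u7889', 20);
        int declaredMax = 20; // as in VARCHAR(20)

        System.out.println("chars: " + charLength(value));
        System.out.println("UTF-8 bytes: " + utf8ByteLength(value));
        System.out.println("rejected at VARCHAR(20): "
                + exceedsWhenCountedInBytes(value, declaredMax));
        // Workaround from the comment: declare three times the character limit.
        System.out.println("rejected at VARCHAR(60): "
                + exceedsWhenCountedInBytes(value, declaredMax * 3));
    }
}
```

Characters outside the Basic Multilingual Plane take 4 bytes in UTF-8 and count as two code units in String.length(), so whether a factor of 3 is sufficient for such data depends on how the limit is counted; the sketch does not cover that case.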