[ https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacky Li resolved CARBONDATA-3527.
----------------------------------
    Resolution: Fixed

> Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3527
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.6.0
>            Reporter: Zhichao Zhang
>            Assignee: Zhichao Zhang
>            Priority: Major
>             Fix For: 1.6.1
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> *Problem:*
> When a complex type value is represented by more than 32000 characters in a
> csv file, and data is loaded from such csv files with 'GLOBAL_SORT', the load
> throws a 'String length cannot exceed 32000 characters' exception.
> *Cause:*
> When data is loaded from csv files with 'GLOBAL_SORT', the files are read and
> the data is first stored in StringArrayRow, where every value is a string.
> When 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it checks
> the length of every value and throws the 'String length cannot exceed 32000
> characters' exception, even for complex type data that is stored as more than
> 32000 characters in the csv files.
> *Solution:*
> In 'FieldConverter.objectToString' (called in 'CarbonScalaUtil.getString'),
> skip the length check if the data type of the field is a complex type.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
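
For illustration, the solution described in the issue can be sketched roughly as below. This is a minimal sketch, not the actual CarbonData code: the class name LengthCheckSketch, the isComplexType flag, the method signature and the exception type are all hypothetical, and only the 32000 character limit and the error message text are taken from the issue.

```java
// Illustrative sketch only: LengthCheckSketch, its method signature and the
// exception type are hypothetical and are not CarbonData's real API.
final class LengthCheckSketch {

    // Limit quoted in the exception message from the issue.
    static final int MAX_CHARS = 32000;

    // Converts a value to its string form. The length check is applied only
    // to non-complex fields, which mirrors the fix described in the issue.
    static String objectToString(Object value, boolean isComplexType) {
        if (value == null) {
            return null;
        }
        String text = value.toString();
        // Complex type values may legitimately need more than MAX_CHARS
        // characters in a csv file, so they are exempt from the check.
        if (!isComplexType && text.length() > MAX_CHARS) {
            throw new IllegalArgumentException(
                "String length cannot exceed " + MAX_CHARS + " characters");
        }
        return text;
    }
}
```

With this shape, a long complex type value passes through unchanged, while a plain string column of the same length is still rejected.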