[ 
https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-3527.
----------------------------------
    Resolution: Fixed

> Throw 'String length cannot exceed 32000 characters' exception when load data 
> with 'GLOBAL_SORT' from csv which include big complex type data
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3527
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.6.0
>            Reporter: Zhichao  Zhang
>            Assignee: Zhichao  Zhang
>            Priority: Major
>             Fix For: 1.6.1
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> *Problem:*
> When a complex-type value takes more than 32000 characters to represent in a 
> csv file and the data is loaded from that file with 'GLOBAL_SORT', the load 
> throws a 'String length cannot exceed 32000 characters' exception.
> *Cause:*
> When loading from csv files with 'GLOBAL_SORT', the files are read and the 
> data is first stored in StringArrayRow, where every value is a string. When 
> 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it checks the 
> length of every value and throws the 'String length cannot exceed 32000 
> characters' exception, even for complex-type data that is stored as more 
> than 32000 characters in the csv files.
> *Solution:*
> In 'FieldConverter.objectToString' (called from 'CarbonScalaUtil.getString'), 
> skip the length check when the data type of the field is a complex type.
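
The solution described above can be illustrated with a minimal, self-contained sketch. This is not the actual CarbonData code: the class name, the method signature, the boolean isComplexType parameter and the exception type are all assumptions made for illustration. Only the 32000-character limit and the error message come from the issue text.

```java
// Hypothetical sketch of a length check that is skipped for complex types.
// It does not reproduce the real FieldConverter.objectToString signature.
final class LengthCheckSketch {

    // Limit quoted in the issue's error message.
    static final int MAX_CHARS = 32000;

    // Converts a value to its string form. The length check is applied
    // only when the field is NOT a complex type (assumed flag).
    static String objectToString(Object value, boolean isComplexType) {
        if (value == null) {
            return null;
        }
        String s = value.toString();
        if (!isComplexType && s.length() > MAX_CHARS) {
            throw new IllegalArgumentException(
                "String length cannot exceed " + MAX_CHARS + " characters");
        }
        return s;
    }

    // Builds a string of n copies of c (avoids String.repeat for older JDKs).
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String big = repeat('a', MAX_CHARS + 1);

        // A complex-type field passes through regardless of length.
        System.out.println(objectToString(big, true).length());

        // A plain string field over the limit is still rejected.
        try {
            objectToString(big, false);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the check stays in place for ordinary string columns and is bypassed only for complex-type fields, whose serialized csv form can legitimately be longer than any single string column.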



--
This message was sent by Atlassian Jira
(v8.3.4#803005)