Deng Changchun created SPARK-14524:
--------------------------------------

             Summary: In Spark SQL, columns of String type cannot be selected 
correctly, because of UTF8String, when executors are given more than 32G of memory.
                 Key: SPARK-14524
                 URL: https://issues.apache.org/jira/browse/SPARK-14524
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.2
         Environment: Centos
            Reporter: Deng Changchun
            Priority: Critical


(Related issue: https://github.com/apache/spark/pull/8210/files)

When we set 32G (or more) for the executor and select a column of String type, 
we get wrong results, such as:
'abcde'          (fewer than 8 chars) => ''        (nothing is shown)
'abcdefghijklmn' (more than 8 chars)  => 'ijklmn'  (the first 8 chars are cut off)

However, when we set 31G (or less) for the executor, everything works correctly.

We have also debugged this problem and found that Spark SQL uses UTF8String 
internally, which depends on properties of the local JVM's memory layout (see 
class 'org.apache.spark.unsafe.Platform'). Note that around the 32G heap 
boundary the JVM disables compressed ordinary object pointers (compressed 
oops), which changes the byte-array base offset that Platform reports.
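The symptoms above are consistent with an offset skew between two JVM layouts. A minimal sketch (hypothetical offsets of 16 and 24, standing in for the byte-array base offset with and without compressed oops; class and constant names are illustrative, not Spark's actual code) of how a mismatched base offset would shift or empty the string:

```java
// Sketch: a string recorded with one byte-array base offset but read back
// assuming another begins 8 bytes too far into the buffer. That drops the
// first 8 characters: 'abcdefghijklmn' -> 'ijklmn', 'abcde' -> ''.
public class OffsetMismatch {
    static final int WRITER_BYTE_ARRAY_OFFSET = 16; // hypothetical: compressed oops on
    static final int READER_BYTE_ARRAY_OFFSET = 24; // hypothetical: compressed oops off

    static String readSkewed(byte[] data) {
        // The reader skips (24 - 16) = 8 extra bytes before the payload.
        int skew = READER_BYTE_ARRAY_OFFSET - WRITER_BYTE_ARRAY_OFFSET;
        int start = Math.min(skew, data.length);
        int len = Math.max(data.length - skew, 0);
        return new String(data, start, len);
    }

    public static void main(String[] args) {
        System.out.println("'" + readSkewed("abcdefghijklmn".getBytes()) + "'");
        System.out.println("'" + readSkewed("abcde".getBytes()) + "'");
    }
}
```

This reproduces both observed cases: a string longer than 8 bytes loses its first 8 characters, and a string of 8 bytes or fewer comes back empty.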



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
