Deng Changchun created SPARK-14524:
--------------------------------------

             Summary: In SparkSQL, it can't be select column of String type because of UTF8String when setting more than 32G for executors.
                 Key: SPARK-14524
                 URL: https://issues.apache.org/jira/browse/SPARK-14524
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.2
         Environment: Centos
            Reporter: Deng Changchun
            Priority: Critical

(Related issue: https://github.com/apache/spark/pull/8210/files)

When we set 32G (or more) of memory for the executor and select a column of String type, the query returns wrong results, for example:

'abcde' (fewer than 8 chars) => '' (nothing is shown)
'abcdefghijklmn' (more than 8 chars) => 'ijklmn' (the first 8 chars are cut off)

However, when we set 31G (or less) for the executor, everything works correctly.

We have also debugged this problem and found that SparkSQL uses UTF8String internally, which depends on some properties of the local JVM memory allocation (see class 'org.apache.spark.unsafe.Platform').
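
The report does not say which JVM property is involved. One way to check is to print the byte-array base offset the JVM reports, once under a 31G heap and once under a 32G heap, and compare. The sketch below is a hypothetical diagnostic, not code from Spark. It assumes a HotSpot JVM, which by default turns off compressed object pointers once the maximum heap reaches roughly 32 GB, and it assumes this layout change is relevant to the 8-character shift described above. The class name ArrayBaseOffsetCheck is made up for illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical diagnostic (not part of Spark): prints the offset at which
// byte[] data starts inside an array object on the running JVM.
class ArrayBaseOffsetCheck {

    // Obtain the Unsafe singleton through reflection, since the
    // constructor is private and getUnsafe() rejects application code.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("sun.misc.Unsafe is not available", e);
        }
    }

    // Offset in bytes from the start of a byte[] object to its first element.
    static int byteArrayBaseOffset() {
        return loadUnsafe().arrayBaseOffset(byte[].class);
    }

    public static void main(String[] args) {
        // Run once with -Xmx31g and once with -Xmx32g and compare the lines.
        System.out.println("max heap (bytes):   " + Runtime.getRuntime().maxMemory());
        System.out.println("byte[] base offset: " + byteArrayBaseOffset());
    }
}
```

If the two runs print different offsets, any code that caches or hard-codes this value on one JVM and uses it on another would read string bytes from the wrong position. Whether that is what happens in this issue is an assumption to verify against the executor and driver settings.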