[ https://issues.apache.org/jira/browse/SPARK-12624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai resolved SPARK-12624.
------------------------------
    Resolution: Fixed
 Fix Version/s: 1.6.1
                2.0.0

Issue resolved by pull request 10886
[https://github.com/apache/spark/pull/10886]

> When schema is specified, we should give better error message if actual row
> length doesn't match
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12624
>                 URL: https://issues.apache.org/jira/browse/SPARK-12624
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>            Reporter: Reynold Xin
>            Priority: Blocker
>             Fix For: 2.0.0, 1.6.1
>
> The following code snippet reproduces this issue:
> {code}
> from pyspark.sql.types import StructType, StructField, IntegerType, StringType
> from pyspark.sql.types import Row
> schema = StructType([StructField("a", IntegerType()), StructField("b", StringType())])
> rdd = sc.parallelize(range(10)).map(lambda x: Row(a=x))
> df = sqlContext.createDataFrame(rdd, schema)
> df.show()
> {code}
> An unintuitive {{ArrayIndexOutOfBoundsException}} exception is thrown in this case:
> {code}
> ...
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>         at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.genericGet(rows.scala:227)
>         at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35)
>         at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.isNullAt(rows.scala:36)
> ...
> {code}
> We should give a better error message here.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
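As a rough illustration of the kind of check the issue asks for (a hedged sketch, not the actual patch from pull request 10886; the function name `check_row_length` and the exact message wording are assumptions), a schema-aware validation would compare the row's length against the schema's field count before conversion and raise a descriptive error instead of letting an `ArrayIndexOutOfBoundsException` surface later:

```python
# Hedged sketch of row-length validation against a declared schema.
# check_row_length and the message text are illustrative, not Spark's API.

def check_row_length(row, schema_fields):
    """Raise a descriptive error when the row length doesn't match the schema."""
    if len(row) != len(schema_fields):
        raise ValueError(
            "Length of object (%d) does not match with length of fields (%d)"
            % (len(row), len(schema_fields))
        )
    return row

schema_fields = ["a", "b"]  # the schema declares two fields

check_row_length((1, "x"), schema_fields)  # matching row passes through

try:
    check_row_length((1,), schema_fields)  # row carries only field "a"
except ValueError as e:
    print(e)
```

With such a check in the `createDataFrame` conversion path, the repro above would fail immediately with a message naming the mismatch rather than an index error deep inside Catalyst.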