scala.MatchError on SparkSQL when creating ArrayType of StructType

Hao Ren Fri, 05 Dec 2014 02:39:07 -0800

Hi, 

I am using SparkSQL on the 1.1.0 branch.


The following code leads to a scala.MatchError at
org.apache.spark.sql.catalyst.expressions.Cast.cast$lzycompute(Cast.scala:247):

val scm = StructType(inputRDD.schema.fields.init :+ 
      StructField("list", 
        ArrayType( 
          StructType( 
            Seq(StructField("date", StringType, nullable = false), 
              StructField("nbPurchase", IntegerType, nullable = false)))), 
        nullable = false)) 

// purchaseRDD is an RDD[sql.Row] whose schema corresponds to scm.
// It is transformed from inputRDD.
val schemaRDD = hiveContext.applySchema(purchaseRDD, scm) 
schemaRDD.registerTempTable("t_purchase") 

Here's the stack trace:
scala.MatchError: ArrayType(StructType(List(StructField(date,StringType,true), StructField(n_reachat,IntegerType,true))),true) (of class org.apache.spark.sql.catalyst.types.ArrayType)
        at org.apache.spark.sql.catalyst.expressions.Cast.cast$lzycompute(Cast.scala:247)
        at org.apache.spark.sql.catalyst.expressions.Cast.cast(Cast.scala:247)
        at org.apache.spark.sql.catalyst.expressions.Cast.eval(Cast.scala:263)
        at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:84)
        at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66)
        at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:50)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:149)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(InsertIntoHiveTable.scala:158)
        at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(InsertIntoHiveTable.scala:158)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

The strange thing is that nullable for the date and nbPurchase fields is
reported as true in the error, although it was set to false in the code. If I
set both to true, it works. But in fact, they should not be nullable.

Here's what I find at Cast.scala:247 on the 1.1.0 branch:

  private[this] lazy val cast: Any => Any = dataType match { 
    case StringType => castToString 
    case BinaryType => castToBinary 
    case DecimalType => castToDecimal 
    case TimestampType => castToTimestamp 
    case BooleanType => castToBoolean 
    case ByteType => castToByte 
    case ShortType => castToShort 
    case IntegerType => castToInt 
    case FloatType => castToFloat 
    case LongType => castToLong 
    case DoubleType => castToDouble 
  } 
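
To illustrate the failure mode, here is a minimal self-contained sketch with no
Spark dependency. The types and the CastSketch object are hypothetical
stand-ins, not the real Catalyst classes; the sketch only shows that a match
with no case for ArrayType throws scala.MatchError as soon as it is asked for
that type, which appears to be what happens with the match above.

```scala
// Hypothetical stand-ins for the Catalyst types; NOT the real Spark classes.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

object CastSketch {
  // Same shape as the match in Cast.scala: there is no case for ArrayType.
  // @unchecked only silences the compiler's exhaustivity warning.
  def castFor(dataType: DataType): Any => Any = (dataType: @unchecked) match {
    case StringType  => (v: Any) => v.toString
    case IntegerType => (v: Any) => v.toString.toInt
  }

  // Returns true when selecting the cast function throws scala.MatchError.
  def failsWithMatchError(dataType: DataType): Boolean =
    try { castFor(dataType); false }
    catch { case _: MatchError => true }

  def main(args: Array[String]): Unit = {
    println(failsWithMatchError(IntegerType))                                // prints false
    println(failsWithMatchError(ArrayType(StringType, containsNull = true))) // prints true
  }
}
```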

Any idea? Thank you. 

Hao



