Hi, I am new to Spark. When I try to write some simple tests in the Spark shell, I run into the following problem.
I created a very small text file named 5.txt:

1 2 3 4 5
1 2 3 4 5
1 2 3 4 5

and experimented in the Spark shell:

scala> val d5 = sc.textFile("5.txt").cache()
d5: org.apache.spark.rdd.RDD[String] = MappedRDD[91] at textFile at <console>:12

scala> d5.keyBy(_.split(" ")(0)).reduceByKey((v1, v2) => (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString).first

Then this error occurs:

14/04/18 00:20:11 ERROR Executor: Exception in task ID 36
java.lang.ArrayIndexOutOfBoundsException: 1
    at $line60.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:15)
    at $line60.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:15)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap$$anonfun$2.apply(ExternalAppendOnlyMap.scala:120)

When I delete one line from the file, leaving only two lines, the result is correct. I don't understand what the problem is. Please help me, thanks.
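For anyone who hits the same exception: reduceByKey feeds its own output back in as an argument on the next call, so the reduce function must return a value of the same shape it consumes. Here the first reduction returns a bare number string like "4", which has no second token, so the next call's v1.split(" ")(1) throws ArrayIndexOutOfBoundsException: 1. With only two lines there is a single reduction and the bug never surfaces. Below is a minimal sketch in plain Scala (no Spark needed) reproducing the shape mismatch and one possible fix; the variable names are mine, not from the original post:

```scala
// Three lines, as in the 5.txt example above.
val lines = Seq("1 2 3 4 5", "1 2 3 4 5", "1 2 3 4 5")

// Broken: the first reduce call returns "4" (a single token), so the
// second call evaluates "4".split(" ")(1) and throws
// java.lang.ArrayIndexOutOfBoundsException: 1.
// lines.reduce((v1, v2) =>
//   (v1.split(" ")(1).toInt + v2.split(" ")(1).toInt).toString)

// Fix: parse the field you want once, up front, so the reduce operates
// on values of a stable type (Int) that it also returns.
val total = lines.map(_.split(" ")(1).toInt).reduce(_ + _)
println(total)  // 6

// The equivalent sketch on the RDD (untested, assuming the same 5.txt):
// sc.textFile("5.txt")
//   .map { line => val f = line.split(" "); (f(0), f(1).toInt) }
//   .reduceByKey(_ + _)
//   .first
```

The same rule applies to any function passed to reduce or reduceByKey: it must be closed over its value type, i.e. accept and return the same kind of value.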