Is it possible that value.get("(area_code") or value.get("time_zone") returned null? ByteBufferUtil.string throws a NullPointerException when it is handed a null buffer. Note also that the column name string "(area_code" contains a stray opening parenthesis, so that lookup would not match any column and would return null.
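To illustrate the failure mode: java.util.Map.get returns null for any key that is not present (such as a mistyped column name), and decoding a null ByteBuffer then blows up. A minimal sketch of a null-safe decode, using a hypothetical safeString helper (not part of Cassandra's ByteBufferUtil) and plain UTF-8 decoding:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class SafeDecode {
    // Hypothetical helper: decode a ByteBuffer as UTF-8, or return a fallback
    // when the buffer is null (e.g. because the column name was mistyped).
    static String safeString(ByteBuffer buf, String fallback) {
        if (buf == null) {
            return fallback;
        }
        // duplicate() so decoding does not disturb the original buffer's position
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> row = new HashMap<>();
        row.put("area_code", StandardCharsets.UTF_8.encode("415"));
        row.put("time_zone", StandardCharsets.UTF_8.encode("PST"));

        // The mistyped key "(area_code" is absent, so get() returns null.
        System.out.println(row.get("(area_code"));                   // prints: null
        System.out.println(safeString(row.get("(area_code"), "?"));  // prints: ?
        System.out.println(safeString(row.get("area_code"), "?"));   // prints: 415
    }
}
```

Guarding the map function this way (or fixing the key) would turn the opaque NullPointerException into either a visible fallback value or an immediately obvious missing-column problem.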
On Thu, Feb 5, 2015 at 10:58 AM, oxpeople <vincent.y....@bankofamerica.com> wrote:

> I modified the code based on CassandraCQLTest to get the area-code count
> by time zone. I got an error when creating the new mapped RDD. Any help
> is appreciated. Thanks.
>
> ...
> val arecodeRdd = sc.newAPIHadoopRDD(job.getConfiguration(),
>   classOf[CqlPagingInputFormat],
>   classOf[java.util.Map[String, ByteBuffer]],
>   classOf[java.util.Map[String, ByteBuffer]])
>
> println("Count: " + arecodeRdd.count)  // got the right count
> // arecodeRdd.saveAsTextFile("/tmp/arecodeRddrdd.txt")
> val areaCodeSelectedRDD = arecodeRdd.map {
>   case (key, value) => {
>     (ByteBufferUtil.string(value.get("(area_code")),
>      ByteBufferUtil.string(value.get("time_zone")))  // failed
>   }
> }
> println("areaCodeRDD: " + areaCodeSelectedRDD.count)
>
> ...
>
> Here is the stack trace:
>
> 15/02/05 13:38:15 ERROR executor.Executor: Exception in task 109.0 in stage 1.0 (TID 366)
> java.lang.NullPointerException
>     at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
>     at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
>     at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:68)
>     at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:66)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1311)
>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
>     at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
>     at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>     at org.apache.spark.scheduler.Task.run(Task.scala:56)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> 15/02/05 13:38:15 INFO scheduler.TaskSetManager: Starting task 110.0 in stage 1.0 (TID 367, localhost, ANY, 1334 bytes)
> 15/02/05 13:38:15 INFO executor.Executor: Running task 110.0 in stage 1.0 (TID 367)
> 15/02/05 13:38:15 INFO rdd.NewHadoopRDD: Input split: ColumnFamilySplit((-8484684946848467066, '-8334833978340269788] @[127.0.0.1])
> 15/02/05 13:38:15 WARN scheduler.TaskSetManager: Lost task 109.0 in stage 1.0 (TID 366, localhost): java.lang.NullPointerException
>     [same stack trace as above]
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/get-null-potiner-exception-newAPIHadoopRDD-map-tp21520.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org