I modified the code based on CassandraCQLTest to get the area-code count
per time zone. I get an error in the map over the new RDD. Any help is
appreciated. Thanks.

...   val arecodeRdd = sc.newAPIHadoopRDD(job.getConfiguration(),
      classOf[CqlPagingInputFormat],
      classOf[java.util.Map[String,ByteBuffer]],
      classOf[java.util.Map[String,ByteBuffer]])

    println("Count: " + arecodeRdd.count) //got right count
  //  arecodeRdd.saveAsTextFile("/tmp/arecodeRddrdd.txt");
    val areaCodeSelectedRDD = arecodeRdd.map {
      case (key, value) =>
        (ByteBufferUtil.string(value.get("(area_code")),
         ByteBufferUtil.string(value.get("time_zone"))) // this line fails
    }
  println("areaCodeRDD: " + areaCodeSelectedRDD.count)

...
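
One thing I noticed while debugging: java.util.Map.get returns null for any
key the row doesn't contain (note the stray "(" in "(area_code" above), and
ByteBufferUtil.string throws an NPE when given null. A small self-contained
sketch of a null-tolerant lookup (plain Scala, no Spark or Cassandra on the
classpath; stringOrDefault and the sample values are hypothetical):

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object NullSafeLookup {
  // Decode a column's ByteBuffer to a UTF-8 string, or return a default
  // when the column is absent (i.e. Map.get returned null).
  def stringOrDefault(row: java.util.Map[String, ByteBuffer],
                      col: String,
                      default: String = ""): String =
    Option(row.get(col))
      .map(bb => StandardCharsets.UTF_8.decode(bb.duplicate()).toString)
      .getOrElse(default)

  def main(args: Array[String]): Unit = {
    val row = new java.util.HashMap[String, ByteBuffer]()
    row.put("area_code", ByteBuffer.wrap("408".getBytes(StandardCharsets.UTF_8)))

    println(stringOrDefault(row, "area_code"))   // prints 408
    println(stringOrDefault(row, "(area_code"))  // mistyped key: default, no NPE
  }
}
```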

Here is the stack trace:
15/02/05 13:38:15 ERROR executor.Executor: Exception in task 109.0 in stage 1.0 (TID 366)
java.lang.NullPointerException
        at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
        at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
        at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:68)
        at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:66)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1311)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
15/02/05 13:38:15 INFO scheduler.TaskSetManager: Starting task 110.0 in stage 1.0 (TID 367, localhost, ANY, 1334 bytes)
15/02/05 13:38:15 INFO executor.Executor: Running task 110.0 in stage 1.0 (TID 367)
15/02/05 13:38:15 INFO rdd.NewHadoopRDD: Input split: ColumnFamilySplit((-8484684946848467066, '-8334833978340269788] @[127.0.0.1])
15/02/05 13:38:15 WARN scheduler.TaskSetManager: Lost task 109.0 in stage 1.0 (TID 366, localhost): java.lang.NullPointerException
        (same java.lang.NullPointerException stack trace as above)

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/get-null-potiner-exception-newAPIHadoopRDD-map-tp21520.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
