Is it possible that value.get("(area_code") or value.get("time_zone") returned null? ByteBufferUtil.string throws a NullPointerException when it is handed a null buffer. Note also that the column name string "(area_code" contains a stray opening parenthesis, so that lookup would not match any column and would return null.
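To illustrate the failure mode: java.util.Map.get returns null for any key that is not present (such as a mistyped column name), and decoding a null ByteBuffer then blows up. A minimal sketch of a null-safe decode, using a hypothetical safeString helper (not part of Cassandra's ByteBufferUtil) and plain UTF-8 decoding:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class SafeDecode {
    // Hypothetical helper: decode a ByteBuffer as UTF-8, or return a fallback
    // when the buffer is null (e.g. because the column name was mistyped).
    static String safeString(ByteBuffer buf, String fallback) {
        if (buf == null) {
            return fallback;
        }
        // duplicate() so decoding does not disturb the original buffer's position
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> row = new HashMap<>();
        row.put("area_code", StandardCharsets.UTF_8.encode("415"));
        row.put("time_zone", StandardCharsets.UTF_8.encode("PST"));

        // The mistyped key "(area_code" is absent, so get() returns null.
        System.out.println(row.get("(area_code"));                   // prints: null
        System.out.println(safeString(row.get("(area_code"), "?"));  // prints: ?
        System.out.println(safeString(row.get("area_code"), "?"));   // prints: 415
    }
}
```

Guarding the map function this way (or fixing the key) would turn the opaque NullPointerException into either a visible fallback value or an immediately obvious missing-column problem.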
On Thu, Feb 5, 2015 at 10:58 AM, oxpeople <vincent.y....@bankofamerica.com> wrote:

> I modified the code based on CassandraCQLTest to get the area-code count
> by time zone. I got an error when creating the new mapped RDD. Any help
> is appreciated. Thanks.
>
> ...
> val arecodeRdd = sc.newAPIHadoopRDD(job.getConfiguration(),
>   classOf[CqlPagingInputFormat],
>   classOf[java.util.Map[String, ByteBuffer]],
>   classOf[java.util.Map[String, ByteBuffer]])
>
> println("Count: " + arecodeRdd.count)  // got the right count
> // arecodeRdd.saveAsTextFile("/tmp/arecodeRddrdd.txt")
> val areaCodeSelectedRDD = arecodeRdd.map {
>   case (key, value) => {
>     (ByteBufferUtil.string(value.get("(area_code")),
>      ByteBufferUtil.string(value.get("time_zone")))  // failed
>   }
> }
> println("areaCodeRDD: " + areaCodeSelectedRDD.count)
>
> ...
>
> Here is the stack trace:
>
> 15/02/05 13:38:15 ERROR executor.Executor: Exception in task 109.0 in stage 1.0 (TID 366)
> java.lang.NullPointerException
>     at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
>     at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
>     at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:68)
>     at org.apache.spark.examples.CassandraAreaCodeLocation$$anonfun$1.apply(CassandraAreaCodeLocation.scala:66)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1311)
>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:910)
>     at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
>     at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>     at org.apache.spark.scheduler.Task.run(Task.scala:56)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> 15/02/05 13:38:15 INFO scheduler.TaskSetManager: Starting task 110.0 in stage 1.0 (TID 367, localhost, ANY, 1334 bytes)
> 15/02/05 13:38:15 INFO executor.Executor: Running task 110.0 in stage 1.0 (TID 367)
> 15/02/05 13:38:15 INFO rdd.NewHadoopRDD: Input split: ColumnFamilySplit((-8484684946848467066, '-8334833978340269788] @[127.0.0.1])
> 15/02/05 13:38:15 WARN scheduler.TaskSetManager: Lost task 109.0 in stage 1.0 (TID 366, localhost): java.lang.NullPointerException
>     [same stack trace as above]
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/get-null-potiner-exception-newAPIHadoopRDD-map-tp21520.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org