Hi,

I don't think there's an NPE issue when using DStream.count(), even when no data 
is fed into Spark Streaming. I tested with Kafka in my local setup, and both 
cases work, with and without data being consumed.

You can see the details in ReceiverInputDStream: even when there is no data in a 
batch duration, it generates an empty BlockRDD, so the map() and transform() 
inside the count() operator never hit an NPE. I think the problem lies in your 
customized InputDStream. If its compute() returns None (or a null RDD) for an 
empty batch, then, as far as I can tell from the code, transform() gets handed a 
null parent RDD, which would explain the NPE you are seeing. You should make 
sure it generates an empty RDD even when there is no data coming in.
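
For illustration, here is a minimal sketch of what that could look like. 
MyInputDStream and the fetchRecords() helper are made-up placeholders standing 
in for your own code, not real Spark API:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{StreamingContext, Time}
    import org.apache.spark.streaming.dstream.InputDStream

    // Sketch of a custom input stream. The important part is that compute()
    // returns Some(empty RDD) rather than None when a batch has no data, so
    // downstream operators like count() always receive a valid RDD.
    class MyInputDStream(ssc_ : StreamingContext)
      extends InputDStream[String](ssc_) {

      override def start(): Unit = { /* set up your source here */ }
      override def stop(): Unit = { /* tear down your source here */ }

      override def compute(validTime: Time): Option[RDD[String]] = {
        val records = fetchRecords(validTime) // placeholder for your polling logic
        if (records.isEmpty)
          Some(context.sparkContext.emptyRDD[String]) // empty RDD, never None
        else
          Some(context.sparkContext.makeRDD(records))
      }

      // Placeholder: replace with however your stream actually pulls data.
      private def fetchRecords(validTime: Time): Seq[String] = Seq.empty
    }

With compute() written this way, stream.count().foreachRDD(rdd => 
println(rdd.first())) should print 0 for empty batches instead of throwing.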

Thanks
Jerry

-----Original Message-----
From: anoldbrain [mailto:anoldbr...@gmail.com] 
Sent: Wednesday, August 20, 2014 4:13 PM
To: u...@spark.incubator.apache.org
Subject: Re: NullPointerException from '.count.foreachRDD'

Looking at the source code of DStream.scala:


>   /**
>    * Return a new DStream in which each RDD has a single element
>    * generated by counting each RDD of this DStream.
>    */
>   def count(): DStream[Long] = {
>     this.map(_ => (null, 1L))
>         .transform(_.union(context.sparkContext.makeRDD(Seq((null, 0L)), 1)))
>         .reduceByKey(_ + _)
>         .map(_._2)
>   }

transform is the line throwing the NullPointerException. Can anyone give some 
hints as to what would cause "_" to be null (it is indeed null)? This only 
happens when there is no data to process.

When there's data, no NullPointerException is thrown, and all the 
processing/counting proceeds correctly.
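
For reference, the failing call is roughly of this shape (a sketch; myStream 
stands in for the DStream produced by my custom InputDStream):

    myStream.count().foreachRDD { rdd =>
      println("records in this batch: " + rdd.first())
    }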

I am using my own custom InputDStream. Could this be the source of the problem?



