If you just add the "extends Serializable" changes from that pull request, it should work.
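For anyone on an older checkout, the shape of the fix is simply making the example's event class serializable, since instances of it are stored and shuffled by Spark. A minimal sketch of what that looks like (the field names below are illustrative, not copied from the PR):

    // Illustrative sketch: a class whose instances flow through a DStream/RDD
    // must be serializable, otherwise block storage and shuffle fail with
    // java.io.NotSerializableException. Field names here are assumptions.
    class PageView(val url: String, val status: Int, val zipCode: Int,
        val userID: Int) extends Serializable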
On Tue, Oct 29, 2013 at 9:36 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> This was fixed on 0.8 branch and master:
> https://github.com/apache/incubator-spark/pull/63/files
>
> - Patrick
>
> On Tue, Oct 29, 2013 at 9:17 AM, Thunder Stumpges
> <thunder.stump...@gmail.com> wrote:
>> I vaguely remember running into this same error. It says
>> "java.io.NotSerializableException:
>> org.apache.spark.streaming.examples.clickstream.PageView"... can you
>> check the PageView class in the examples and make sure it has the
>> @serializable annotation? I seem to remember having to add it.
>>
>> Good luck,
>> Thunder
>>
>>
>> On Tue, Oct 29, 2013 at 6:54 AM, dachuan <hdc1...@gmail.com> wrote:
>>> Hi,
>>>
>>> I have tried the clickstream example and it runs into an exception. Has
>>> anybody seen this before?
>>>
>>> Since the program mentions "local[2]", I ran it on my local machine.
>>>
>>> Thanks in advance,
>>> dachuan.
>>>
>>> Log Snippet 1:
>>>
>>> 13/10/29 08:50:25 INFO scheduler.DAGScheduler: Submitting 46 missing tasks
>>> from Stage 12 (MapPartitionsRDD[63] at combineByKey at
>>> ShuffledDStream.scala:41)
>>> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Size of task 75 is 4230
>>> bytes
>>> 13/10/29 08:50:25 INFO local.LocalScheduler: Running 75
>>> 13/10/29 08:50:25 INFO spark.CacheManager: Cache key is rdd_9_0
>>> 13/10/29 08:50:25 INFO spark.CacheManager: Computing partition
>>> org.apache.spark.rdd.BlockRDDPartition@0
>>> 13/10/29 08:50:25 WARN storage.BlockManager: Putting block rdd_9_0 failed
>>> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Loss was due to
>>> java.io.NotSerializableException
>>> java.io.NotSerializableException:
>>> org.apache.spark.streaming.examples.clickstream.PageView
>>>
>>> Log Snippet 2:
>>>
>>> org.apache.spark.SparkException: Job failed: Task 12.0:0 failed more than 4
>>> times; aborting job java.io.NotSerializableException:
>>> org.apache.spark.streaming.examples.clickstream.PageView
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
>>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
>>>         at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
>>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
>>>
>>> Two commands that run this app:
>>>
>>> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewGenerator 44444 10
>>> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewStream errorRatePerZipCode localhost 44444
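For reference, the @serializable annotation mentioned above was the older Scala spelling of the same fix; it was deprecated in Scala 2.9 in favor of extending Serializable. A sketch only, with the same illustrative field names as before:

    // Pre-2.10 Scala idiom (deprecated): annotate the class instead of
    // extending Serializable. Field names here are assumptions.
    @serializable
    class PageView(val url: String, val status: Int, val zipCode: Int,
        val userID: Int)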