Your streaming job may have seemed to be running OK, but DStream
checkpointing must have been failing in the background. It would have been
visible in the log4j logs. In 1.4.0, we enabled fast failure for that, so
that checkpointing failures don't stay hidden in the background.

The fact that the serialization stack is not shown correctly is a known
bug in Spark 1.4.0; it is fixed in 1.4.1, which is about to come out in
the next couple of days. That should help you narrow down the culprit
preventing serialization.
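
The most common culprit in these cases is a non-serializable object
captured in the closure of a DStream function. Here is a minimal sketch
of the pattern (the Parser class, host, port, and checkpoint path are
all illustrative, not taken from your job):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Hypothetical helper that does NOT extend Serializable.
    class Parser {
      def parse(line: String): String = line.trim
    }

    val conf = new SparkConf().setAppName("closure-demo").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Enabling checkpointing is what triggers the serializability check
    // in StreamingContext.start() (via validate()) as of 1.4.0.
    ssc.checkpoint("hdfs:///tmp/checkpoints")

    val lines = ssc.socketTextStream("localhost", 9999)

    val parser = new Parser
    // BAD: this closure captures `parser`, so the DStream graph is not
    // serializable and start() throws NotSerializableException:
    // lines.map(l => parser.parse(l)).print()

    // OK: construct the non-serializable object inside the function,
    // so nothing unserializable is captured in the closure.
    lines.map(l => new Parser().parse(l)).print()

    ssc.start()
    ssc.awaitTermination()

If the object is expensive to construct, another option is to hold it in
a @transient lazy val so it is re-created on the executors instead of
being serialized with the closure.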

On Wed, Jul 15, 2015 at 1:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Can you show us your function(s) ?
>
> Thanks
>
> On Wed, Jul 15, 2015 at 12:46 PM, Chen Song <chen.song...@gmail.com>
> wrote:
>
>> The streaming job has been running OK in 1.2 and 1.3. After I upgraded to
>> 1.4, I started seeing the error below. It appears to fail in the validate
>> method of StreamingContext. Did anything change in 1.4.0 w.r.t. DStream
>> checkpointing?
>>
>> Detailed error from driver:
>>
>> 15/07/15 18:00:39 ERROR yarn.ApplicationMaster: User class threw
>> exception: java.io.NotSerializableException: DStream checkpointing has
>> been enabled but the DStreams with their functions are not serializable
>> Serialization stack:
>>
>> java.io.NotSerializableException: DStream checkpointing has been enabled
>> but the DStreams with their functions are not serializable
>> Serialization stack:
>>
>> at
>> org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:550)
>> at
>> org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:587)
>> at
>> org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:586)
>>
>> --
>> Chen Song
>>
>>
>
