Aha, you’re right, I made the wrong comparison. The reason might only be checkpointing :).
Thanks
Jerry

From: Tobias Pfeiffer [mailto:[email protected]]
Sent: Wednesday, January 28, 2015 10:39 AM
To: Shao, Saisai
Cc: user
Subject: Re: Why must the dstream.foreachRDD(...) parameter be serializable?

Hi,

thanks for the answers!

On Wed, Jan 28, 2015 at 11:31 AM, Shao, Saisai <[email protected]> wrote:
> Also this `foreachFunc` is more like an action function of RDD; think of rdd.foreach(func), in which `func` needs to be serializable. So maybe your way of using it is not the normal way :).

Yeah, I totally understand why func in rdd.foreach(func) must be serializable (because it's sent to the executors), but I didn't get why a function that's not shipped around must be serializable, too. The explanations made sense, though :-)

Thanks
Tobias
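For readers of the archive, here is a minimal sketch of the underlying point, using plain Java serialization rather than Spark itself. It assumes, as the thread suggests, that the function passed to foreachRDD may be serialized along with the rest of the streaming setup when checkpointing is enabled. All names here (`SerializableTask`, `Connection`, `canSerialize`) are hypothetical and not part of any Spark API. The sketch only shows that a function which never leaves the local JVM still fails to serialize once it captures a non-serializable object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ClosureSerializationSketch {

    // Hypothetical stand-in for a function type that a framework wants to serialize.
    interface SerializableTask extends Runnable, Serializable {}

    // Hypothetical resource that is NOT Serializable (e.g. a socket or DB handle).
    static class Connection {
        void send(String record) { /* no-op for the sketch */ }
    }

    // Tries to write the object with standard Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Captures nothing from the enclosing scope, so it serializes fine.
    static SerializableTask selfContainedTask() {
        return () -> System.out.println("processed batch");
    }

    // Captures a non-serializable Connection, so serialization fails,
    // even though the task is never sent to another machine.
    static SerializableTask capturingTask() {
        Connection conn = new Connection();
        return () -> conn.send("record");
    }

    public static void main(String[] args) {
        System.out.println("self-contained task serializable: " + canSerialize(selfContainedTask()));
        System.out.println("capturing task serializable: " + canSerialize(capturingTask()));
    }
}
```

The usual workaround in this situation is to create the non-serializable resource inside the function body instead of capturing it from the enclosing scope.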
