Aha, you’re right, I made the wrong comparison; the reason might only be checkpointing :).
Thanks
Jerry

From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Wednesday, January 28, 2015 10:39 AM
To: Shao, Saisai
Cc: user
Subject: Re: Why must the dstream.foreachRDD(...) parameter be serializable?

Hi,

thanks for the answers!

On Wed, Jan 28, 2015 at 11:31 AM, Shao, Saisai <saisai.s...@intel.com> wrote:

Also, this `foreachFunc` is more like an RDD action function; think of rdd.foreach(func), in which `func` needs to be serializable. So maybe the way you are using it is not the normal way :).

Yeah, I totally understand why func in rdd.foreach(func) must be serializable (because it's sent to the executors), but I didn't get why a function that's not shipped around must be serializable, too. The explanations made sense, though :-)

Thanks
Tobias
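
To illustrate the point discussed above: if checkpointing is indeed the reason, the function passed to foreachRDD would have to survive Java serialization even though it only ever runs on the driver. The following is a minimal sketch of that check in plain Scala, not Spark's actual code. `SerializationSketch`, `Connection`, `isSerializable`, `capturingFunc` and `plainFunc` are hypothetical names made up for this example.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object SerializationSketch {
  // Hypothetical stand-in for a driver-side resource (socket, DB handle, ...)
  // that does not implement java.io.Serializable.
  class Connection {
    def send(s: String): Unit = println(s)
  }

  // Tries to write the object with plain Java serialization, which is the
  // kind of check a checkpointing mechanism would effectively perform.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // A function that closes over a non-serializable Connection.
  def capturingFunc(): String => Unit = {
    val conn = new Connection
    s => conn.send(s)
  }

  // A function that captures nothing problematic.
  def plainFunc(): String => Unit = s => println(s)

  def main(args: Array[String]): Unit = {
    println("capturing closure serializable: " + isSerializable(capturingFunc()))
    println("plain closure serializable: " + isSerializable(plainFunc()))
  }
}
```

The capturing closure is rejected only because of what it drags along (the `Connection`), not because it is ever shipped to an executor, which matches the situation described in this thread.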