Aha, you’re right, my comparison was wrong; the requirement may exist only 
for checkpointing :).

Thanks
Jerry

From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Wednesday, January 28, 2015 10:39 AM
To: Shao, Saisai
Cc: user
Subject: Re: Why must the dstream.foreachRDD(...) parameter be serializable?

Hi,

thanks for the answers!

On Wed, Jan 28, 2015 at 11:31 AM, Shao, Saisai <saisai.s...@intel.com> wrote:
Also, this `foreachFunc` is more like an RDD action function; think of 
rdd.foreach(func), in which `func` needs to be serializable. So I think 
the way you are using it is not the usual way :).

Yeah I totally understand why func in rdd.foreach(func) must be serializable 
(because it's sent to the executors), but I didn't get why a function that's 
not shipped around must be serializable, too.

The explanations made sense, though :-)
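
For the archives, the checkpointing explanation can be illustrated without 
Spark at all. Below is a minimal sketch in plain Java, assuming that 
checkpointing writes the function object with ordinary Java serialization. 
The names `ClosureSerializationDemo`, `SerializableConsumer`, `Connection` 
and `canSerialize` are made up for illustration and are not Spark APIs. It 
shows that a function which never leaves the JVM can still fail to serialize 
once it captures a non-serializable object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Consumer;

class ClosureSerializationDemo {

    // Hypothetical function type that a framework could require to be serializable.
    interface SerializableConsumer<T> extends Consumer<T>, Serializable {}

    // Hypothetical resource that is deliberately NOT Serializable (think: a DB connection).
    static class Connection {
        void send(String s) { /* no-op for the demo */ }
    }

    // Attempts plain Java serialization into an in-memory buffer.
    // Returns false if any object in the graph is not serializable.
    static boolean canSerialize(Object o) {
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // A function that captures nothing from its enclosing scope.
    static SerializableConsumer<String> capturingNothing() {
        return s -> System.out.println(s);
    }

    // A function that captures a non-serializable local variable.
    static SerializableConsumer<String> capturingConnection() {
        Connection conn = new Connection();
        return s -> conn.send(s);
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(capturingNothing()));
        System.out.println(canSerialize(capturingConnection()));
    }
}
```

Neither function is ever sent to another machine here, yet only the first 
one can be written out. If the function passed to dstream.foreachRDD(...) is 
stored as part of the checkpointed state, the same constraint would apply to 
it even though it only runs on the driver.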

Thanks
Tobias

