Can't you send a special event through Spark Streaming once the list is updated? So you have your normal events and a special reload event.

On Jan 17, 2015, 15:06, "Ji ZHANG" <zhangj...@gmail.com> wrote:
> Hi,
>
> I want to join a DStream with some other dataset, e.g. join a click
> stream with a spam IP list. I can think of two possible solutions: one
> is to use a broadcast variable, and the other is to use the transform
> operation as described in the manual.
>
> But the problem is that the spam IP list will be updated outside of the
> Spark Streaming program, so how can the program be notified to reload
> the list?
>
> Broadcast variables won't work, since they are immutable.
>
> For the transform operation, is it costly to reload the RDD on every
> batch? If it is, and I use RDD.persist(), does that mean I need to
> launch a thread to regularly unpersist it so that it can pick up the
> updates?
>
> Any ideas will be appreciated. Thanks.
>
> --
> Jerry
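The transform approach discussed above boils down to: do a cheap staleness check on every micro-batch, and reload the reference list only when it actually changed. Here is a minimal, framework-free sketch of that pattern in plain Python (no Spark involved; `SpamListCache` and `process_batch` are hypothetical names standing in for the logic you would put inside the transform closure):

```python
# Sketch of the "reload inside transform" pattern: instead of broadcasting
# the spam list once, re-read it when the underlying file changes and
# filter each micro-batch against the current copy. All names here are
# illustrative, not Spark API.
import os
import tempfile


class SpamListCache:
    """Caches the spam IP set, reloading only when the file's mtime changes.

    This mirrors the cost trade-off in the question: a cheap mtime check
    per batch, and a full reload (the analogue of rebuilding the RDD)
    only when the list was actually updated outside the program.
    """

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.spam_ips = set()

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:  # list changed on disk -> reload it
            with open(self.path) as f:
                self.spam_ips = {line.strip() for line in f if line.strip()}
            self.mtime = mtime
        return self.spam_ips


def process_batch(events, cache):
    """Per-batch 'transform': drop click events whose IP is on the spam list."""
    spam = cache.get()
    return [e for e in events if e["ip"] not in spam]


if __name__ == "__main__":
    # Write a one-entry spam list to a temp file to exercise the cache.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("10.0.0.1\n")
        path = f.name

    cache = SpamListCache(path)
    batch = [{"ip": "10.0.0.1"}, {"ip": "10.0.0.2"}]
    print(process_batch(batch, cache))  # only the 10.0.0.2 event survives
```

In Spark Streaming itself this logic would live inside `dstream.transform { rdd => ... }`: the transform function is re-evaluated on the driver for every batch, so it can check whether the list changed and, if so, unpersist the old reference RDD and build and persist a new one before joining.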