I'm updating the Broadcast between batches, but I've ended up doing it in a listener, thanks!
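For anyone finding this thread later: a minimal sketch of what a listener-based refresh might look like. Everything here is an assumption about the approach described above, not Amit's actual code — in particular `loadReferenceData()` is a hypothetical stand-in for however the new broadcast value is obtained:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Driver-side holder for a broadcast that is re-published between batches.
object RefData {
  @volatile private var instance: Broadcast[Map[String, String]] = _

  def get(sc: SparkContext): Broadcast[Map[String, String]] = {
    if (instance == null) refresh(sc)
    instance
  }

  def refresh(sc: SparkContext): Unit = synchronized {
    if (instance != null) instance.unpersist(blocking = false)
    instance = sc.broadcast(loadReferenceData())
  }

  // Hypothetical: load the fresh reference data from wherever it lives.
  private def loadReferenceData(): Map[String, String] = Map.empty
}

// Refresh the broadcast on the driver once each batch completes.
class RefreshListener(sc: SparkContext) extends StreamingListener {
  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit =
    RefData.refresh(sc)
}
```

Registered with `ssc.addStreamingListener(new RefreshListener(ssc.sparkContext))`; since the listener runs on the driver, it always has a live SparkContext, which sidesteps the stale-context problem after checkpoint recovery.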
On Wed, Feb 8, 2017 at 12:31 AM Tathagata Das <tathagata.das1...@gmail.com> wrote:

> broadcasts are not saved in checkpoints. so you have to save it externally
> yourself, and recover it before restarting the stream from checkpoints.
>
> On Tue, Feb 7, 2017 at 3:55 PM, Amit Sela <amitsel...@gmail.com> wrote:
>
> I know this approach; the only thing is, it relies on the transformation
> being an RDD transformation as well, so it can be applied via foreachRDD,
> using the rdd's context to avoid a stale context after recovery/resume.
> My question is how to avoid a stale context in a DStream-only transformation
> such as updateStateByKey / mapWithState?
>
> On Tue, Feb 7, 2017 at 9:19 PM Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
>
> It's documented here:
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#accumulators-broadcast-variables-and-checkpoints
>
> On Tue, Feb 7, 2017 at 8:12 AM, Amit Sela <amitsel...@gmail.com> wrote:
>
> Hi all,
>
> I was wondering if anyone has ever used a broadcast variable within
> an updateStateByKey op.? Using it is straightforward, but I was wondering
> how it'll work after resuming from a checkpoint (using the rdd.context()
> trick is not possible here)?
>
> Thanks,
> Amit
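For context, the pattern the linked guide section describes is a lazily instantiated singleton, roughly like the sketch below (an illustration, not the guide's exact code — the `Seq("a", "b", "c")` payload is a placeholder). It survives checkpoint recovery because after a restart the JVM field is null, so the broadcast is simply re-created from the live context the next time driver code asks for it:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast

// Lazily instantiated broadcast singleton. After recovery from a checkpoint
// `instance` is null again, so getInstance re-broadcasts with the new,
// non-stale SparkContext instead of deserializing a dead one.
object WordBlacklist {
  @volatile private var instance: Broadcast[Seq[String]] = _

  def getInstance(sc: SparkContext): Broadcast[Seq[String]] = {
    if (instance == null) {
      synchronized {
        if (instance == null) {
          instance = sc.broadcast(Seq("a", "b", "c")) // placeholder payload
        }
      }
    }
    instance
  }
}
```

As the thread notes, this works naturally inside foreachRDD (where `rdd.sparkContext` gives driver-side access each batch), which is exactly why it does not directly carry over to a DStream-only op like updateStateByKey, whose update function runs on the executors.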