Hi,

Is there any way in Spark Streaming to keep data across multiple
micro-batches, for example in a HashMap or something similar?
Can anyone suggest how to keep data across iterations, where each
iteration is an RDD being processed in a JavaDStream?

This comes up especially when I am trying to update a model, compare
two sets of RDDs, or keep a global history of certain events that will
affect operations in future iterations.
I would like to keep some accumulated history for my calculations: not
the entire dataset, but certain persisted events that can be used when
processing future JavaDStream RDDs.
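
For illustration, here is roughly the kind of thing I have in mind,
sketched with updateStateByKey from the streaming API. The stream name
"events", the String/Integer types, and the checkpoint path are all
placeholders, and this assumes the Spark 1.x Java API (which uses
Guava's Optional):

    import java.util.List;

    import com.google.common.base.Optional;
    import org.apache.spark.api.java.function.Function2;
    import org.apache.spark.streaming.api.java.JavaPairDStream;

    // Assumes a streaming context with checkpointing enabled, e.g.
    //   jssc.checkpoint("/tmp/spark-checkpoint");  // path is a placeholder
    // and an existing JavaPairDStream<String, Integer> called "events"
    // (a placeholder name for my keyed event stream).

    JavaPairDStream<String, Integer> history = events.updateStateByKey(
        new Function2<List<Integer>, Optional<Integer>, Optional<Integer>>() {
          @Override
          public Optional<Integer> call(List<Integer> newValues,
                                        Optional<Integer> state) {
            // Start from the state carried over from earlier micro-batches.
            int sum = state.or(0);
            // Fold in the values that arrived in this micro-batch.
            for (Integer v : newValues) {
              sum += v;
            }
            // The returned value becomes the state for the next micro-batch.
            return Optional.of(sum);
          }
        });

As I understand it, the state returned for each key is handed back on
the next batch, so this would give me an accumulated history rather
than the whole dataset. Is this the right approach, or is there a
better way?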

Thanks
Nipun