Hi there,

I've built a Spark Streaming app that consumes events from Kafka, and I
need to keep some state between events. I've used mapWithState for that
successfully. The problem is that I want the state for every key to be
updated on every batchInterval, because the absence of events is also
significant to the use case. This doesn't seem possible with mapWithState,
unless I'm missing something.

Previously I looked at updateStateByKey, whose documentation says:
> In every batch, Spark will apply the state update function for all
> existing keys, regardless of whether they have new data in a batch or not.

That is what I want. However, I've seen several tutorials and blog posts
that advise against using updateStateByKey and recommend mapWithState
instead.
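
To make the per-batch behaviour I'm after concrete, here is a minimal sketch
of an update function in the shape that updateStateByKey expects, written in
Python. The function name update_idle_count and the state layout (total
events, consecutive batches without events) are only illustrative
assumptions, not my real code:

```python
from typing import Optional, Sequence, Tuple

# Hypothetical state layout: (total events seen, consecutive idle batches).
State = Tuple[int, int]


def update_idle_count(new_values: Sequence[int], state: Optional[State]) -> State:
    """Update function in the shape updateStateByKey expects.

    new_values holds this batch's values for the key and is empty when the
    key had no events in the batch. state is None the first time a key is seen.
    """
    total, idle_batches = state if state is not None else (0, 0)
    if new_values:
        # Events arrived: add them up and reset the idle counter.
        return (total + sum(new_values), 0)
    # No events this batch: the absence of events is recorded as well.
    return (total, idle_batches + 1)


# With a DStream of (key, count) pairs this would be wired up roughly as:
#   state_stream = pairs.updateStateByKey(update_idle_count)
# (checkpointing has to be enabled on the StreamingContext for this to run).
```

The branch for an empty new_values is the part I can't reproduce with
mapWithState, since as far as I can tell its function only runs for keys
that received data in the batch.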

So my questions:

- Can the mapWithState state function be called every batchInterval, even
when no events arrive in that interval?
- If not, is it okay to use updateStateByKey instead, or will it be
deprecated in the near future?
- If mapWithState doesn't support this, is there another way to update
state every batchInterval that still uses mapWithState, combined with
some other mechanism?

Thanks in advance!



