Hi David, Your information is very helpful! Thank you!
BroadcastStream can definitely do the job, but I think it makes the architecture kind of complicated, so it will be my last resort . I wonder if it’s possible to implement a clearAll() method for keyed states which clears user states for all namespaces, and does it violate the principle of keyed states? Thanks again! Best, Paul Lam > 在 2018年9月14日,16:00,David Anderson <da...@data-artisans.com> 写道: > > Paul, > > Theoretically, processing-time timers will get the job done, but yes, you'd > need a timer per key -- and folks who've tried this with millions of keys, > all firing at the same time, have reported that this behaves badly. For some > use cases it's workable to spread out the timers over an interval, like an > hour or two, to avoid this timer firing storm, but that doesn't sound like it > would work well for you. > > You might instead try using broadcast state to deal with this. You would > establish a broadcast stream connected to your keyed stream that acts as a > control stream for the keyed state. Then in the processBroadcastElement > method of a KeyedBroadcastProcessFunction you would use applyToKeyedState to > iterate over all the keyed state and clear everything. Unfortunately it's not > possible to use timers on broadcast state, so you'll have to find some other > way to trigger the event on the broadcast stream -- maybe a custom source > that uses a ProcessingTimeCallback to create events on the broadcast stream. > > David > > On Fri, Sep 14, 2018 at 7:18 AM Paul Lam <paullin3...@gmail.com > <mailto:paullin3...@gmail.com>> wrote: > > > > Hi vino, > > > > Thanks for the advice, but I think state TTL does not completely fit in my > > case. > > > > AFAIK, State TTL is per entry level and uses an inactive time threshold to > > expire entries, but I need a TTL for the whole MapState, which does not > > depend on when the entries are created or updated. Suppose I’m calculating > > stats of daily active users and use a userId field as key, I want the state > > totally truncated at the very beginning of each day. > > > > Thanks a lot! > > > > Best, > > Paul Lam > > > > > > 在 2018年9月14日,10:39,vino yang <yanghua1...@gmail.com > > <mailto:yanghua1...@gmail.com>> 写道: > > > > Hi Paul, > > > > Maybe you can try to understand the State TTL?[1] > > > > Thanks, vino. > > > > [1]: > > https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/state.html#state-time-to-live-ttl > > > > <https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/state.html#state-time-to-live-ttl> > > > > Paul Lam <paullin3...@gmail.com <mailto:paullin3...@gmail.com>> > > 于2018年9月12日周三 下午6:06写道: > >> > >> Hi, > >> > >> I’m using MapState to deduplicate some ids and the MapState needs to be > >> truncated periodically. I tried to use ProcessingTimeCallback to call > >> state.clear(), but in this way I can only clear the state for one key, and > >> actually I need a key group level cleanup. So I’m wondering is there any > >> best practice for my case? Thanks a lot! > >> > >> Best, > >> Paul Lam > > > > > > > -- > David Anderson | Training Coordinator | data Artisans > -- > Join Flink Forward - The Apache Flink Conference > Stream Processing | Event Driven | Real Time