Hi Jingsong,

you're right, it is indeed somewhat tricky to find a good data structure for out-of-core timers. That's why we have them in memory in Flink for now, and that's also why I'm afraid I don't have any good advice for you right now. We're aware of the problem in Flink, but we're not yet working on a concrete solution.

Cheers,
Aljoscha

On Tue, 24 Jan 2017 at 21:42 Dan Halperin <[email protected]> wrote:

> Hi Jingsong,
>
> Sorry for the delayed response; this email ended up being misclassified by
> my mail server and I missed it. Maybe Kenn or Aljoscha has suggestions on
> how runners can best implement timers?
>
> Dan
>
> On Thu, Jan 19, 2017 at 9:55 PM, lzljs3620320 <[email protected]> wrote:
>
> > Hi there,
> >
> > I'm working on the Beam integration for an internal system at Alibaba.
> > Currently most runners, such as Flink and Apex, keep timers in memory (I
> > do not know how Google Dataflow implements them). But in our scenario,
> > unbounded data has a large number of keys, which leads to OOM because the
> > timers are held in memory. So we want to store timers in state (RocksDB
> > on disk). The problem is how to extract the fired event-time timers when
> > the input watermark is refreshed. Do we have to scan all keys and timers?
> > (A timer is currently composed of key, id, namespace, timestamp, and
> > domain.) Is there a better implementation? I'm wondering if you could
> > give me some advice on how to implement timers in state efficiently.
> > Thank you!
> >
> > Best,
> > Jingsong Lee
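
For illustration only, and not something proposed by anyone in this thread: one way to avoid scanning all keys is to store each timer under a byte key that starts with its big-endian timestamp, so that an ordered store can return the fired timers with a single range scan up to the watermark. The sketch below uses a sorted in-memory list as a stand-in for such a store. The names `encode_timer`, `decode_timer`, and `SortedTimerStore` are hypothetical, and it assumes that key, namespace, and id are strings without NUL characters, that timestamps are signed 64-bit integers, and that a timer fires once its timestamp is at or below the watermark.

```python
import bisect
import struct

_SIGN_OFFSET = 1 << 63          # shifts signed 64-bit timestamps to unsigned
_MAX_UNSIGNED = (1 << 64) - 1


def encode_timer(timestamp, key, namespace, timer_id):
    """Encode a timer so that byte order of the result equals timestamp order."""
    prefix = struct.pack(">Q", timestamp + _SIGN_OFFSET)
    # Assumption: none of the three strings contains a NUL character.
    return prefix + "\x00".join((key, namespace, timer_id)).encode("utf-8")


def decode_timer(raw):
    """Inverse of encode_timer: returns (timestamp, key, namespace, timer_id)."""
    (shifted,) = struct.unpack(">Q", raw[:8])
    key, namespace, timer_id = raw[8:].decode("utf-8").split("\x00")
    return shifted - _SIGN_OFFSET, key, namespace, timer_id


class SortedTimerStore:
    """Hypothetical in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._entries = []  # encoded timers, kept in sorted byte order

    def set_timer(self, timestamp, key, namespace, timer_id):
        raw = encode_timer(timestamp, key, namespace, timer_id)
        i = bisect.bisect_left(self._entries, raw)
        if i == len(self._entries) or self._entries[i] != raw:
            self._entries.insert(i, raw)

    def delete_timer(self, timestamp, key, namespace, timer_id):
        raw = encode_timer(timestamp, key, namespace, timer_id)
        i = bisect.bisect_left(self._entries, raw)
        if i < len(self._entries) and self._entries[i] == raw:
            del self._entries[i]

    def advance_watermark(self, watermark):
        """Remove and return all timers with timestamp <= watermark."""
        bound = watermark + 1 + _SIGN_OFFSET
        if bound > _MAX_UNSIGNED:
            end = len(self._entries)
        else:
            # First entry whose timestamp prefix is >= watermark + 1.
            end = bisect.bisect_left(self._entries, struct.pack(">Q", bound))
        fired = [decode_timer(raw) for raw in self._entries[:end]]
        del self._entries[:end]
        return fired


store = SortedTimerStore()
store.set_timer(30, "user-b", "window-1", "t1")
store.set_timer(10, "user-a", "window-1", "t1")
store.set_timer(20, "user-a", "window-2", "t1")
fired = store.advance_watermark(15)
```

With this layout only the prefix of the index up to the watermark is touched. Deleting or resetting a timer requires knowing its old timestamp, so a real implementation would probably also need a per-key lookup from (key, namespace, id) to timestamp; that part is left out here.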
