Hey Klaus,

I don't think anyone has directly tried to address this by creating a different state store. The way we handle it right now is to have the task implement WindowableTask. That gives you a window() callback in your task, which you can configure to run every N milliseconds (task.window.ms). In window(), you can call range() or all() on the store to find old data, and delete() it from the store.
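To make the expiry idea concrete, here is a minimal sketch of the sweep that a window() callback could perform. It is not Samza code: a plain TreeMap stands in for the state store, and the names ExpiringStoreSketch, Timestamped, and sweep() are made up for illustration. It also assumes you store a write timestamp alongside each value, which nothing above requires. In a real task the scan would go through the store's all() or range() call, and the removal through delete().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted key-value state store (NOT Samza's API).
class ExpiringStoreSketch {

    // Assumption: each value carries the time at which it was written.
    static final class Timestamped {
        final String payload;
        final long writtenAtMs;

        Timestamped(String payload, long writtenAtMs) {
            this.payload = payload;
            this.writtenAtMs = writtenAtMs;
        }
    }

    private final TreeMap<String, Timestamped> store = new TreeMap<>();
    private final long ttlMs;

    ExpiringStoreSketch(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Write a value and remember when it was written.
    void put(String key, String payload, long nowMs) {
        store.put(key, new Timestamped(payload, nowMs));
    }

    // Returns the payload, or null if the key is absent.
    String get(String key) {
        Timestamped value = store.get(key);
        return value == null ? null : value.payload;
    }

    int size() {
        return store.size();
    }

    // The work a periodic window() callback would do: scan every entry,
    // collect the keys whose age has reached the TTL, then delete them.
    // Returns the number of entries removed.
    int sweep(long nowMs) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Timestamped> entry : store.entrySet()) {
            if (nowMs - entry.getValue().writtenAtMs >= ttlMs) {
                expired.add(entry.getKey());
            }
        }
        // Delete after the scan so the map is not modified while iterating.
        for (String key : expired) {
            store.remove(key);
        }
        return expired.size();
    }
}
```

A full scan like this costs time proportional to the size of the store on every window. If that becomes a problem, one option is to keep a second set of keys prefixed with the write timestamp, so that a range() call can fetch only the old entries.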
As for running a local MongoDB, this would work, but we tend to prefer embeddable databases that can be shipped as part of the Samza job (usually as a .jar in the .tgz file). The reason for this preference is that, in a multi-tenant environment (i.e. jobs running on YARN), it's often not feasible to run one-off software for a single job. One job might want MongoDB, another Redis, another Memcached, etc. What you end up with on your YARN cluster is the union of all those things running on every node. This makes life rough, operationally.

One thing to look into might be RocksDB. I did a bit of googling, and I see a couple of mentions of TTL support in it (https://github.com/facebook/rocksdb/tree/master/utilities/ttl and https://github.com/facebook/rocksdb/wiki/Time-to-Live), but I haven't gone any further than that. There are probably other embeddable TTL DBs that you could find as well.

Cheers,
Chris

On 1/9/14 8:47 AM, "Klaus Schaefers" <[email protected]> wrote:

>Hi,
>
>I was digging a little into Samza and saw that the state storage is based
>on LevelDB. This is very nice because it is really fast, but in my use cases I
>would need some kind of time-to-live variable attached to a key. Has
>anybody already tried to address this issue by including a different state
>storage like a local Mongo db or so?
>
>
>Cheers,
>
>Klaus
>
>
>
>--
>
>Klaus Schaefers
>Senior Optimization Manager
>
>Ligatus GmbH
>Hohenstaufenring 30-32
>D-50674 Köln
>
>Tel.: +49 (0) 221 / 56939 -784
>Fax: +49 (0) 221 / 56 939 - 599
>E-Mail: [email protected]
>Web: www.ligatus.de
>
>HRB Köln 56003
>Geschäftsführung:
>Dipl.-Kaufmann Lars Hasselbach, Dipl.-Kaufmann Klaus Ludemann,
>Dipl.-Wirtschaftsingenieur Arne Wolter
