Hey Klaus,

I don't think anyone has directly tried to address the issue by creating a
different state store. The way we handle it right now is by having your
task implement WindowableTask. You get a window() callback in your task,
which you can configure to run every N milliseconds (task.window.ms). In
this window() method, you can do a range() or all() call for old data, and
delete() it from the store.
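
To make that concrete, here is a minimal sketch of the sweep, not actual
Samza code. A TreeMap stands in for the task's store so the example runs on
its own, and TtlSweepSketch, Timestamped, put() and sweep() are made-up
names for illustration. It assumes you store the write time alongside each
value. In a real task the body of sweep() would live in window() and would
use the store's all() and delete() calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a task with a key-value store. This is NOT
// Samza's API; a TreeMap is used only so the sketch is self-contained.
class TtlSweepSketch {

    // Hypothetical wrapper: the payload plus the time it was written.
    static final class Timestamped {
        final String payload;
        final long writtenAtMs;

        Timestamped(String payload, long writtenAtMs) {
            this.payload = payload;
            this.writtenAtMs = writtenAtMs;
        }
    }

    private final TreeMap<String, Timestamped> store = new TreeMap<>();
    private final long ttlMs;

    TtlSweepSketch(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // What process() would do: write the value together with its timestamp.
    void put(String key, String payload, long nowMs) {
        store.put(key, new Timestamped(payload, nowMs));
    }

    // Returns the payload, or null if the key is absent (or was swept).
    String get(String key) {
        Timestamped value = store.get(key);
        return value == null ? null : value.payload;
    }

    // What window() would do: scan every entry (the all() step), collect
    // the keys whose age has reached the TTL, then delete them.
    // Returns the number of entries removed.
    int sweep(long nowMs) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Timestamped> entry : store.entrySet()) {
            if (nowMs - entry.getValue().writtenAtMs >= ttlMs) {
                expired.add(entry.getKey());
            }
        }
        // Delete after the scan so the map is not modified while iterating.
        for (String key : expired) {
            store.remove(key);
        }
        return expired.size();
    }
}
```

Note that a full scan touches every entry on every window, so its cost
grows with the size of the store. If that gets expensive, one option is to
keep a second index ordered by timestamp, so that a range() call only
touches the expired entries.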

As for running a local MongoDB, this would work, but we tend to prefer
embeddable databases that can be shipped as part of the Samza job (in the
.tgz file as a .jar, usually). The reason for this preference is that, in
a multi-tenant environment (i.e. jobs running on YARN), it's often not
feasible to run one-off software for a job. One job might want MongoDB,
another Redis, another Memcached, etc. What you end up with on your YARN
cluster is the union of all those things running on every node. This makes
life rough, operationally.

One thing to look into might be RocksDB. I did a bit of googling, and I
see a couple of mentions of TTL support in it
(https://github.com/facebook/rocksdb/tree/master/utilities/ttl and
https://github.com/facebook/rocksdb/wiki/Time-to-Live), but I haven't gone
any further than that. There are probably other embeddable DBs with TTL
support that you could find, as well.

Cheers,
Chris

On 1/9/14 8:47 AM, "Klaus Schaefers" <[email protected]> wrote:

>Hi,
>
>I was digging a little into Samza and saw that the state storage is based
>on LevelDB. This is very nice because it is really fast, but in my use
>cases I would need some kind of time-to-live variable attached to a key.
>Has anybody already tried to address this issue by including a different
>state storage, like a local MongoDB or so?
>
>
>Cheers,
>
>Klaus
>
>
>
>-- 
>
>Klaus Schaefers
>Senior Optimization Manager
>
>Ligatus GmbH
>Hohenstaufenring 30-32
>D-50674 Köln
>
>Tel.:  +49 (0) 221 / 56939 -784
>Fax:  +49 (0) 221 / 56 939 - 599
>E-Mail: [email protected]
>Web: www.ligatus.de
>
>HRB Köln 56003
>Geschäftsführung:
>Dipl.-Kaufmann Lars Hasselbach, Dipl.-Kaufmann Klaus Ludemann,
>Dipl.-Wirtschaftsingenieur Arne Wolter
