Hi List,

This is my first post here.

I'm a Riak (and Erlang) newbie. I'm currently trying to design an architecture that can handle the following needs:

- 95% of the work is logging data from sensors: once a reading is sent to a server, it is written to the database and never updated afterwards. For instance, if I read a sensor located in a particular place at a particular time, the sensor's value is stored with a timestamp and will never change: a new reading from the same sensor will have a different timestamp (and also a different key).

- I need to keep the details of each reading for each sensor for a few months (from time to time a process performs statistical analysis on many different readings from a single sensor or a group of sensors).

- There are many sensors, each sending many readings per day, so the system must be able to sustain millions of put() calls per day. But once a reading is considered obsolete, there is no point in keeping it in the database forever.
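To put the write load in perspective, here is a rough back-of-the-envelope sketch in Python. The function names and the traffic figures in the usage note are purely illustrative assumptions, not measurements from my system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_puts_per_second(puts_per_day):
    """Average write rate if the daily load were spread evenly over 24 hours."""
    return puts_per_day / SECONDS_PER_DAY


def peak_puts_per_second(puts_per_day, peak_share, peak_hours):
    """Write rate inside the peak window.

    peak_share: fraction of the daily traffic that arrives during the peak (0..1).
    peak_hours: length of the peak window in hours.
    Both parameters are hypothetical knobs for this estimate.
    """
    return puts_per_day * peak_share / (peak_hours * 3600)
```

For example, with an assumed 5 million readings per day, average_puts_per_second(5_000_000) is roughly 58 writes per second; if half of them arrived within a two-hour peak, peak_puts_per_second(5_000_000, 0.5, 2) would be roughly 347 writes per second.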

I need to solve the problem of data obsolescence, hence I'm considering using one ring per period of time (for instance a new ring at the beginning of each month, or even a new ring per week), while keeping a set of 'old' rings available only for reading. The application would handle access to several rings when it is asked to process data that spans multiple rings (I think I have a correct key layout/organization).
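To make the routing idea concrete, here is a minimal sketch in Python, assuming monthly periods and ring identifiers of the form 'YYYY-MM'. The names ring_id_for and rings_for_range are hypothetical, and how a ring identifier maps to an actual Riak cluster connection is deliberately left out.

```python
from datetime import datetime


def ring_id_for(ts):
    """Return the identifier of the monthly ring that holds a reading taken at ts."""
    return "%04d-%02d" % (ts.year, ts.month)


def rings_for_range(start, end):
    """List, in chronological order, every monthly ring a query over [start, end] touches."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    ids = []
    year, month = start.year, start.month
    # Walk month by month until the month containing `end` has been added.
    while (year, month) <= (end.year, end.month):
        ids.append("%04d-%02d" % (year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return ids
```

A query covering 15 November 2011 to 3 February 2012 would then have to read from four rings: '2011-11', '2011-12', '2012-01' and '2012-02'. Writes always go to ring_id_for(now), which is the 'current' ring.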

My goal is to be able to delete a whole ring when its data is no longer needed, instead of performing zillions of delete() queries in a single ring that would hold all the readings. So at the end of each period I create a new ring (it becomes the 'current' ring) and I delete an old one. I don't need fine-grained deletion or per-sensor deletion, but I do need to get rid of all readings older than X months, otherwise the database will become impossible to manage and back up at some point.
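The rotation rule could be sketched as follows in Python, again assuming monthly rings named 'YYYY-MM'. The helper names and the keep_months parameter are hypothetical; the sketch only decides which rings fall outside the retention window, it does not touch any Riak instance.

```python
def month_index(ring_id):
    """Convert a 'YYYY-MM' ring identifier into a running month count."""
    year, month = ring_id.split("-")
    return int(year) * 12 + (int(month) - 1)


def split_by_retention(ring_ids, current_ring, keep_months):
    """Split rings into (keep, drop).

    The retention window is the current ring plus the keep_months - 1
    rings before it; anything older lands in the drop list.
    """
    cutoff = month_index(current_ring) - (keep_months - 1)
    keep = sorted(r for r in ring_ids if month_index(r) >= cutoff)
    drop = sorted(r for r in ring_ids if month_index(r) < cutoff)
    return keep, drop
```

With rings '2011-10' through '2012-01', a current ring of '2012-01' and keep_months=3, the only ring to drop is '2011-10'. Dropping it is then a matter of stopping that instance and removing its data directory, with no delete() traffic at all.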

I would also like to avoid calling delete() on old readings while new readings are coming in, since I also have peak hours to manage.

Multiple rings would also ease backups, since any ring other than the 'current' one could be saved somewhere with the usual copy tools: no updates occur in an 'old' ring. Most of the data in the 'current' ring is also still held in the sensors, so if I lose the current ring I can rebuild it by having the sensors resend their data (a last resort in case of a catastrophic event).

It seems that LevelDB keeps the data of different buckets in the same set of files, which rules out the other solution I considered: keeping a single ring and obsoleting whole buckets instead.

Maybe this proposal is already a mistake in itself (comments welcome!). It raises the following questions (based on what I've understood of Erlang and Riak so far):

- When multiple rings are needed, it seems that each ring needs its own Riak instance on each node. But all the examples I have found run one Erlang VM per Riak instance, rather than several Riak instances in a single VM. Is there a way to run many Riak instances in a single VM? Erlang seems to be designed to sustain large numbers of processes in one VM rather than many VMs on a single host (for better resource management and monitoring). What is really possible with Riak regarding multiple instances?

- For the same reason, I'd like the server that processes the traffic coming from the sensors to run in the same VM, so that there is just one VM per physical machine. Is there any documentation about bundling a user-made Erlang application with Riak in the same .beam (a 'release'?)?

- Alternatively, can anyone tell me whether running many Erlang VMs, each with its own Riak instance, on a single node works efficiently?

If this approach is correct, then my main remaining problem is handling the different rings/Riak instances/VMs in the most efficient way, and I will have solved one difficult problem regarding the life cycle of the database.

Thanks for any hints or comments!

  Bernard

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
