Hi All.

I am looking at using Riak as a data store for time series data. We currently receive about 1.5 TB of data in JSON format, which I intend to persist in Riak. I am having some difficulty figuring out how to model it so that I can fulfill the use cases I have been handed.

The data is provided in several types of log formats with some common fields:

- timestamp
- geo
- s/w build #
- location #
- .... plus a whole bunch of other key-value pairs.

For the most part I will need to provide aggregated views based on geo. There are some views based on s/w build # and location #. The aggregation will be on an hourly basis.
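To make the aggregation concrete, here is a rough sketch of the kind of hourly roll-up I mean, done in plain Python over already-parsed records. The field names ("timestamp", "geo") and the assumption that the timestamp is epoch seconds in UTC are only for illustration; the real logs may differ.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_bucket(epoch_seconds):
    """Truncate an epoch timestamp (assumed UTC seconds) to its hour."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H")


def hourly_counts(records, field="geo"):
    """Count records per (hour, field value).

    `records` is an iterable of dicts; the "timestamp" key and the
    default "geo" field are assumed names, not the actual log schema.
    """
    counts = Counter()
    for rec in records:
        counts[(hour_bucket(rec["timestamp"]), rec[field])] += 1
    return counts
```

Passing field="build" or field="location" (again, made-up names) would give the other two views.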

The model that I came up with:

<log-format-type>[<hour>][<timestamp>-<msg-id>]: <json-body>

with indices on geo, s/w build # and location #.
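As a sketch of how I read that layout: the log format type and the hour together name the bucket, and the timestamp plus message id form the key. That split, the helper names, and the record field names below are my own assumptions. The only Riak-specific detail is that secondary index names end in _bin (string) or _int (integer).

```python
from datetime import datetime, timezone


def riak_location(log_type, epoch_seconds, msg_id):
    """Return (bucket, key) for one log message.

    One possible reading of <log-format-type>[<hour>][<timestamp>-<msg-id>]:
    bucket = "<log-format-type>-<hour>", key = "<timestamp>-<msg-id>".
    The timestamp is assumed to be epoch seconds in UTC.
    """
    hour = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y%m%d%H")
    bucket = f"{log_type}-{hour}"
    key = f"{epoch_seconds}-{msg_id}"
    return bucket, key


def secondary_indexes(record):
    """Build the index entries to attach to the stored object.

    The record keys ("geo", "build", "location") are hypothetical names
    for the common fields; values are stored as strings (_bin indexes).
    """
    return {
        "geo_bin": str(record["geo"]),
        "build_bin": str(record["build"]),
        "location_bin": str(record["location"]),
    }
```

With this layout an hourly view would touch only that hour's bucket and select on one index, e.g. geo_bin.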

I /think/ this will satisfy most of what I want to do, but I was wondering whether anyone else has had to solve this sort of problem and, if so, what their solution was.

I would also be interested in hearing about alternate structures or bad assumptions I am making here.

Thanks.
AM

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
