> From what I've seen from your estimation, the data amount you're
> going to store is huge. Not only that but also the bandwidth required
> is quite a lot. (Assuming you have a 200MBit connection and you send
> data over UDP (128 bytes in total = headers + payload), after a simple
> calculation it results that you'll only be able to handle 16384
> sensors. Thus maybe you should reduce the readings.)
>
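To make the quoted bandwidth estimate easier to discuss with the stakeholders, here is a minimal sketch of that kind of calculation. The link speed and the 128-byte packet size come from the quote above; the per-sensor sampling rate is not stated in this message, so `readings_per_s` is a placeholder, and the function name `max_sensors` is purely illustrative. It ignores everything below UDP (IP/Ethernet framing, retransmission at the application level, bursts), so treat the result as an upper bound rather than a capacity guarantee.

```python
def max_sensors(link_bits_per_s: float, packet_bytes: int, readings_per_s: float) -> int:
    """Upper bound on the number of sensors a link can carry.

    link_bits_per_s: raw link capacity in bits per second (assumed fully usable).
    packet_bytes:    size of one reading on the wire, headers + payload.
    readings_per_s:  readings each sensor sends per second (an assumption;
                     the thread does not state this value).
    """
    if link_bits_per_s <= 0 or packet_bytes <= 0 or readings_per_s <= 0:
        raise ValueError("all parameters must be positive")
    # Bits one sensor puts on the link every second.
    bits_per_sensor = packet_bytes * 8 * readings_per_s
    # Whole sensors that fit into the link capacity.
    return int(link_bits_per_s // bits_per_sensor)


if __name__ == "__main__":
    # 200 Mbit/s link, 128-byte packets, a few hypothetical sampling rates.
    for rate in (1.0, 10.0, 100.0):
        print(rate, max_sensors(200_000_000, 128, rate))
```

Plugging in the real sampling rate shows directly how much reducing the readings buys in terms of sensors per link.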
I need to give the other stakeholders an idea of the strategy and the costs involved, hence the effort to make things predictable, even under higher loads. I will assume that the connection can be handled as long as the whole equation makes sense.

> I wouldn't store the "data files" inside the embedded DB, but the
> actual raw readings.

While this is tempting, how would you see something like that working? One huge DB per source? As I said previously, in many cases this would involve a DB with around 1 billion records. In fact, many such DBs would be in use at the same time, and I can constrain the queries so that they never have to consider the full amount of data. It seems to me that I'd be giving up a good chance for optimization (premature and evil as it may be) by storing all data points.

And if you're suggesting one DB per batch, the batches would be relatively small. Wouldn't that create its own set of problems for the DB library (opening many files, closing them, etc.), when I can reduce the final range query to a sequential traversal of at most 3 x MaxN records?

Regards,
Paul
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
