Ryan, exactly. Eventually we will be storing data continuously on N beds in the ICU. If it's waveform data, it will probably be sampled at 125 Hz, which works out to about 3.9 billion points per bed per year, times N beds. I've been trying to figure out which search terms (e.g. "compound keys") to use to dive deeper into NoSQL solutions.
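A quick back-of-the-envelope check of that volume, assuming continuous 125 Hz sampling with no gaps (a simplifying assumption, ignoring leap years):

```python
# Points generated per bed per year at a continuous 125 Hz sample rate.
SAMPLE_RATE_HZ = 125
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # ignoring leap years

points_per_bed_per_year = SAMPLE_RATE_HZ * SECONDS_PER_YEAR
print(points_per_bed_per_year)  # 3942000000, i.e. ~3.9 billion
```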
You mention tall tables - this sounds consistent with what Erik and Andrey have said. Just to clarify my understanding: I'm probably looking at a single table with only one column (the value, which Andrey calls the "series"?) and billions of rows, right? In that case, the decision to break the values into multiple column families is purely a function of performance and how I want the data physically stored. Are there any other major points to consider when deciding which column families to have? (I drew this conclusion from your hbase-nosql presentation on SlideShare.)

Thanks all!

--Andrew

On Apr 24, 2010, at 12:59 PM, Ryan Rawson wrote:

> For example if you are storing timeseries data for a monitoring
> system, you might want to store it by row, since the number of points
> for a single system might be arbitrarily large (think: 2 years+ of
> data). In this case, if the expected data set size per row is larger
> than what a single machine could conceivably store, Cassandra would
> not work for you, since each row must be stored on a
> single (er, N) node(s).
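To make the tall-table / compound-key idea concrete, here's a minimal sketch of how one such row key might be built. The field names and byte widths (2-byte bed id, 8-byte millisecond timestamp) are my own illustrative assumptions, not anything specified in this thread:

```python
import struct

def make_row_key(bed_id: int, ts_millis: int) -> bytes:
    """Compound row key: 2-byte bed id + 8-byte timestamp, big-endian.

    Big-endian packing makes byte-wise (lexicographic) order match
    numeric order, so all samples for one bed sort contiguously by time.
    """
    return struct.pack(">HQ", bed_id, ts_millis)

# One tall-table row per sample: each row key maps to a single value cell.
k1 = make_row_key(7, 1_272_000_000_000)
k2 = make_row_key(7, 1_272_000_000_008)  # next 125 Hz sample, 8 ms later
assert k1 < k2  # HBase would store these rows adjacently, in time order
```

With this layout a time-range scan for one bed is just a row scan between two keys, and no single row ever grows unboundedly, which is the point Ryan makes about arbitrarily large per-row data sets.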