Hello all,

I am currently researching HBase (along with other column stores) as a potential solution for storing large amounts of physiologic data. In both Cassandra and HBase, it seems that column families need to be created administratively; Cassandra, however, would additionally require a server restart.
That said, the patient physiology that I'm looking to store is basically time-series data. We are pulling in A/D counts (or as converted to their physical units) for various physiologic parameters for patients that are in the intensive-care environment.

So, my first inclination was to model it as follows:

  - One single column family for "physiology"
  - Each row key is of the form "PatientName-PhysiologicParameter"
  - Each column name is the timestamp of the reading

So, say patient Bob is in the ICU and his arterial blood pressure, heart rate, and intracranial pressure are currently being monitored. This would result in the row keys:

  Bob-ABP
  Bob-HR
  Bob-ICP

The column names would be "2010-04-23 16:43:44" and so on.

Is this a reasonable way of accomplishing this? The bulk of the queries would be something like:

  - Give me all blood pressures for Bob between two dates
  - Give me all blood pressures and intracranial pressures for Bob from <date> until present

In other words, the queries will be very patient-centric, or patient-physiologic parameter-centric.

Thanks,
Andrew
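
P.S. To make the proposed layout concrete, here is a minimal Python sketch of the model and the two query shapes. It uses plain dictionaries as a stand-in for the "physiology" column family and does not touch the HBase API at all; the helper names (row_key, put, readings_between) and the sample values are made up purely for illustration. It relies on the fact that timestamps written as "YYYY-MM-DD HH:MM:SS" sort chronologically when compared as strings.

```python
from bisect import bisect_left, bisect_right

# Hypothetical in-memory stand-in for the "physiology" column family:
# row key -> {column name (timestamp string) -> reading}
physiology = {}

def row_key(patient, parameter):
    # Row keys follow the "PatientName-PhysiologicParameter" form.
    return f"{patient}-{parameter}"

def put(patient, parameter, timestamp, value):
    # Store one reading under the column named by its timestamp.
    physiology.setdefault(row_key(patient, parameter), {})[timestamp] = value

def readings_between(patient, parameter, start, end=None):
    """Return (timestamp, value) pairs with start <= timestamp <= end,
    in time order. end=None means "until present"."""
    columns = physiology.get(row_key(patient, parameter), {})
    names = sorted(columns)  # string order equals time order for this format
    lo = bisect_left(names, start)
    hi = len(names) if end is None else bisect_right(names, end)
    return [(t, columns[t]) for t in names[lo:hi]]

# Illustrative values only, not real patient data.
put("Bob", "ABP", "2010-04-23 16:43:44", 92)
put("Bob", "ABP", "2010-04-23 16:43:45", 93)
put("Bob", "ABP", "2010-04-24 09:00:00", 88)
put("Bob", "ICP", "2010-04-23 16:43:44", 11)

# "All blood pressures for Bob between two dates"
abp = readings_between("Bob", "ABP", "2010-04-23 00:00:00", "2010-04-23 23:59:59")

# "Blood pressures and intracranial pressures for Bob from <date> until present"
since = {p: readings_between("Bob", p, "2010-04-23 16:43:45") for p in ("ABP", "ICP")}
```

Each query touches only the rows for one patient and the requested parameters, which is the access pattern I am hoping the row-key scheme will support.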