Modeling column families

Andrew Nguyen Fri, 23 Apr 2010 16:46:57 -0700

Hello all,

I am currently researching HBase (along with other column stores) as a 
potential solution for storing large amounts of physiologic data. In both 
Cassandra and HBase, it seems that column families need to be created 
administratively; Cassandra would additionally require a server restart.


That said, the patient physiology that I'm looking to store is basically 
time-series data.  We are pulling in A/D counts (or the values converted to 
physical units) for various physiologic parameters for patients in the 
intensive-care environment.  So, my first inclination was to model it as 
follows:

One single column family for "physiology"

Each row key is of the form "PatientName-PhysiologicParameter" and each column 
name is the timestamp of the reading.

So, say patient Bob is in the ICU and his arterial blood pressure, heart rate, 
and intracranial pressure are currently being monitored.  This would result in 
the row keys:

Bob-ABP
Bob-HR
Bob-ICP

The column names would be "2010-04-23 16:43:44" and so on...
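
To make the layout concrete, here is a minimal sketch in Python. It uses a plain nested dict as a stand-in for the table and does not call any real HBase or Cassandra client. The helper names (row_key, column_name, put) and the reading values are hypothetical and only there for illustration.

```python
from datetime import datetime

# Single column family, as proposed above.
COLUMN_FAMILY = "physiology"


def row_key(patient, parameter):
    """Build a row key of the form "PatientName-PhysiologicParameter"."""
    return f"{patient}-{parameter}"


def column_name(ts):
    """Format a reading's timestamp as the column name."""
    return ts.strftime("%Y-%m-%d %H:%M:%S")


def put(table, patient, parameter, ts, value):
    """Store one reading: row -> column family -> column name -> value."""
    row = table.setdefault(row_key(patient, parameter), {})
    family = row.setdefault(COLUMN_FAMILY, {})
    family[column_name(ts)] = value


# In-memory stand-in for the table; the values are made up.
table = {}
reading_time = datetime(2010, 4, 23, 16, 43, 44)
put(table, "Bob", "ABP", reading_time, 93.0)
put(table, "Bob", "HR", reading_time, 72.0)
put(table, "Bob", "ICP", reading_time, 11.0)
```

Each monitored parameter gets its own row, and every new reading adds one column to that row.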

Is this a reasonable way of accomplishing this?  The bulk of the queries would 
be something like:

Give me all blood pressures for Bob between two dates
Give me all blood pressures and intracranial pressures for Bob from <date> 
until present
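
As a rough sketch of how these two queries map onto that layout: column names of the form "YYYY-MM-DD HH:MM:SS" sort lexicographically in chronological order, so a date range becomes a slice over the sorted column names. The code below again uses plain Python dicts rather than a real client API, and the function names and sample values are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def readings_between(row, start, end):
    """Return (column name, value) pairs with start <= name <= end.

    `row` maps timestamp column names to values. Zero-padded
    "YYYY-MM-DD HH:MM:SS" strings sort in time order, so string
    comparison is enough to select the range.
    """
    names = sorted(row)
    lo = bisect_left(names, start)
    hi = bisect_right(names, end)
    return [(name, row[name]) for name in names[lo:hi]]


def readings_since(table, patient, parameters, start):
    """Fetch several parameters for one patient from `start` until present."""
    end = "9999-12-31 23:59:59"  # sentinel upper bound meaning "until present"
    return {
        p: readings_between(table.get(f"{patient}-{p}", {}), start, end)
        for p in parameters
    }


# Hypothetical sample data, keyed by "PatientName-PhysiologicParameter".
table = {
    "Bob-ABP": {
        "2010-04-22 08:00:00": 90.0,
        "2010-04-23 16:43:44": 93.0,
        "2010-04-24 09:15:00": 95.0,
    },
    "Bob-ICP": {
        "2010-04-23 16:43:44": 11.0,
        "2010-04-24 09:15:00": 12.0,
    },
}

# "Give me all blood pressures for Bob between two dates"
abp = readings_between(
    table["Bob-ABP"], "2010-04-23 00:00:00", "2010-04-23 23:59:59"
)

# "Give me all blood pressures and intracranial pressures for Bob from <date>"
recent = readings_since(table, "Bob", ["ABP", "ICP"], "2010-04-23 00:00:00")
```

The first query touches a single row; the second touches one row per requested parameter, all sharing the "Bob-" key prefix.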

In other words, the queries will be very patient-centric, or centered on a 
patient and physiologic-parameter pair.

Thanks,
Andrew
