Versions in HBase are timestamps by default. If you intend to keep using timestamps as the version, what will happen when someone writes value_1 and value_2 at exactly the same time?

Regards,
Dhaval

----- Original Message -----
From: Sagar Naik <sn...@splunk.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Cc:
Sent: Friday, 24 January 2014 12:27 PM
Subject: HBase Design : Column name v/s Version

Hi,

I have a choice to maintain the data either as column values or as versioned data. This data is not a versioned copy per se. The access pattern is to get all the data every time.

So the schema choices are:

Schema 1:
1.   column_name/qualifier => data_1, column_value => value_1
1.a. column_name/qualifier => data_2, column_value => value_2,value_2.a
1.b. column_name/qualifier => data_3, column_value => value_3

To get all the values for "data", I will have to use a ColumnPrefixFilter with the prefix set to "data".

Schema 2:
2.   column_name/qualifier => data, version => 1, column_value => value_1
2.a. column_name/qualifier => data, version => 2, column_value => value_2,value_2.a
2.b. column_name/qualifier => data, version => 3, column_value => value_3

To get all the values for "data", I will do a simple get operation that fetches all the versions.

The number of versions can range from 10 to 100K.

Get performance should beat filter performance. Comparing 100K values will be costly as the number of versions increases.

I would like to know if there are drawbacks in going the version route.

-Sagar
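
The timestamp question raised in the reply can be illustrated with a small toy model. This is only a sketch in plain Java collections, not the HBase client API: the class `VersionCollisionSketch`, its methods, and the sample timestamp are all hypothetical names made up for illustration. It assumes, as I understand HBase's behavior, that a cell is identified by row, family, qualifier and timestamp, so a second write with the same coordinates hides the first one instead of adding a version.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT the HBase client API.
public class VersionCollisionSketch {

    // Models one row of one column family: qualifier -> (timestamp -> value).
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();

    // A put with the same qualifier and timestamp replaces the earlier value.
    public void put(String qualifier, long timestamp, String value) {
        cells.computeIfAbsent(qualifier, q -> new TreeMap<>()).put(timestamp, value);
    }

    // Number of versions retained for one qualifier (Schema 2 style read).
    public int versionCount(String qualifier) {
        TreeMap<Long, String> versions = cells.get(qualifier);
        return versions == null ? 0 : versions.size();
    }

    // Number of qualifiers starting with the prefix (Schema 1 style read).
    public int qualifierCount(String prefix) {
        int count = 0;
        for (String qualifier : cells.keySet()) {
            if (qualifier.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical value: both writes land in the same millisecond.
        long sameMillis = 1390595220000L;

        // Schema 2: one qualifier, the version is the timestamp.
        VersionCollisionSketch schema2 = new VersionCollisionSketch();
        schema2.put("data", sameMillis, "value_1");
        schema2.put("data", sameMillis, "value_2");
        System.out.println("Schema 2 versions kept: " + schema2.versionCount("data"));

        // Schema 1: distinct qualifiers, so equal timestamps do not collide.
        VersionCollisionSketch schema1 = new VersionCollisionSketch();
        schema1.put("data_1", sameMillis, "value_1");
        schema1.put("data_2", sameMillis, "value_2");
        System.out.println("Schema 1 columns kept: " + schema1.qualifierCount("data"));
    }
}
```

In this model the Schema 2 writes leave a single version while the Schema 1 writes leave two columns. If the version route is chosen, the writer would probably need to supply its own explicit, unique version numbers rather than rely on the default timestamp.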