Thank you all. Facts learned:
- Having 130 column families is too many. Don't do that.
- While scanning, the entire row is read for filtering, unless the HBASE-5416 technique is applied, which loads only the relevant column family. (But it seems one still can't load just the single column needed while scanning.)
- A big row size may not be good.

Currently it seems appropriate to follow the one-column solution that Alok Singh suggested, in part because there is currently no reasonable grouping of the fields.

Here is my current thinking:
- One column family, one column. The field name will be included in the rowkey.
- Eliminate filtering altogether (in most cases) by properly ordering the rowkey components.
- If filtering is absolutely needed, add a 'dummy' column family and apply the HBASE-5416 technique to minimize disk reads, since the field value can be large (~5MB). (This dummy column idea may not be right; I'm not sure, since I have not yet read the filtering section of the book I'm reading.)

I hope I am not missing or misunderstanding something... (I'm a total newbie. I started reading an HBase book just last week...)
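
To make the rowkey idea above concrete, here is a rough sketch in plain Python. It is not HBase client code: `SortedTable` is a toy stand-in for a table whose rows are sorted by rowkey bytes, and the `<record id><separator><field name>` layout, the `\x00` separator, and all names are my own assumptions for illustration. It only shows why putting the field name into the rowkey lets a prefix range scan replace a filter.

```python
from bisect import bisect_left

# Hypothetical separator; assumes it never occurs in record ids or field names.
SEP = b"\x00"


def make_rowkey(record_id: str, field: str) -> bytes:
    """Compose a rowkey as <record id> SEP <field name> (assumed layout)."""
    return record_id.encode("utf-8") + SEP + field.encode("utf-8")


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by rowkey bytes."""

    def __init__(self):
        self._keys = []     # rowkeys in ascending byte order
        self._values = {}   # rowkey -> single cell value (one family, one column)

    def put(self, rowkey: bytes, value: bytes) -> None:
        if rowkey not in self._values:
            self._keys.insert(bisect_left(self._keys, rowkey), rowkey)
        self._values[rowkey] = value

    def get(self, rowkey: bytes):
        return self._values.get(rowkey)

    def prefix_scan(self, prefix: bytes):
        """Yield (rowkey, value) for the contiguous key range starting with prefix."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._values[self._keys[i]]
            i += 1


table = SortedTable()
table.put(make_rowkey("rec2", "title"), b"B")
table.put(make_rowkey("rec1", "title"), b"A")
table.put(make_rowkey("rec1", "body"), b"large payload")
table.put(make_rowkey("rec10", "title"), b"C")

# All fields of rec1 form one contiguous range, so no filter is needed.
rec1_rows = list(table.prefix_scan(b"rec1" + SEP))
```

Including the separator in the scan prefix keeps `rec10` out of the `rec1` range. Which component should come first (record id or field name) depends on the access pattern; this sketch assumes reads are mostly per record.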