bq. add a 'dummy' column family and apply HBASE-5416 technique

Adding a dummy column family is not the way to utilize essential column family support. What would this dummy column family hold?

bq. since I have not read the filtering section of the book I'm reading yet

Once you finish reading, you can look at the unit test (TestJoinedScanners) from HBASE-5416. You would understand this feature better.

Cheers

On Tue, Aug 5, 2014 at 9:21 PM, innowireless TaeYun Kim <[email protected]> wrote:

> Thank you all.
>
> Facts learned:
>
> - Having 130 column families is too much. Don't do that.
> - While scanning, an entire row will be read for filtering, unless
> HBASE-5416 technique is applied which makes only relevant column family is
> loaded. (But it seems that still one can't load just a column needed while
> scanning)
> - Big row size is maybe not good.
>
> Currently it seems appropriate to follow the one-column solution that Alok
> Singh suggested, in part since currently there is no reasonable grouping of
> the fields.
>
> Here is my current thinking:
>
> - One column family, one column. Field name will be included in rowkey.
> - Eliminate filtering altogether (in most case) by properly ordering
> rowkey components.
> - If a filtering is absolutely needed, add a 'dummy' column family and
> apply HBASE-5416 technique to minimize disk read, since the field value can
> be large(~5MB). (This dummy column thing may not be right, I'm not sure,
> since I have not read the filtering section of the book I'm reading yet)
>
> Hope that I am not missing or misunderstanding something...
> (I'm a total newbie. I've started to read a HBase book since last week...)
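
For the "field name in rowkey, no filter" part of the plan quoted above, a rough sketch of the idea may help. It does not use the HBase client at all: a TreeMap with unsigned byte-wise ordering stands in for a one-family, one-column table. The names RowKeySketch, rowKey, bound, scanEntity and demoTable, the 0x00 separator, and the sample ids and values are all made up for illustration, and the sketch assumes the separator byte never appears inside an id.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: a sorted map stands in for a one-family, one-column table.
final class RowKeySketch {
    // Unsigned byte-wise lexicographic order, which is how HBase sorts row keys.
    static final Comparator<byte[]> BYTE_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // Returns the entity id bytes followed by a single terminator byte.
    static byte[] bound(String entityId, byte terminator) {
        byte[] id = entityId.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[id.length + 1];
        System.arraycopy(id, 0, out, 0, id.length);
        out[id.length] = terminator;
        return out;
    }

    // Composite key: entity id, 0x00 separator (hypothetical choice), field name.
    static byte[] rowKey(String entityId, String fieldName) {
        byte[] prefix = bound(entityId, (byte) 0x00);
        byte[] field = fieldName.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[prefix.length + field.length];
        System.arraycopy(prefix, 0, key, 0, prefix.length);
        System.arraycopy(field, 0, key, prefix.length, field.length);
        return key;
    }

    // Range scan [id+0x00, id+0x01): touches only this entity's rows, no filter.
    static List<String> scanEntity(NavigableMap<byte[], String> table, String entityId) {
        byte[] start = bound(entityId, (byte) 0x00);
        byte[] stop = bound(entityId, (byte) 0x01);
        return new ArrayList<>(table.subMap(start, true, stop, false).values());
    }

    // Made-up sample data: one row per (entity, field) pair.
    static NavigableMap<byte[], String> demoTable() {
        NavigableMap<byte[], String> table = new TreeMap<>(BYTE_ORDER);
        table.put(rowKey("user1", "name"), "Kim");
        table.put(rowKey("user1", "address"), "Seoul");
        table.put(rowKey("user10", "name"), "Lee");
        table.put(rowKey("user2", "name"), "Park");
        return table;
    }
}
```

With this sample data, scanEntity(demoTable(), "user1") returns the two user1 values in key order (address, then name) and skips user10 and user2. In a real table the same start and stop keys would be set on the Scan, so the rows of other entities are never read and no filter is needed for this access pattern.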
