I think giving every table a unique row-key prefix is the standard way of storing multiple virtual tables inside one BigTable table. You get data locality for each virtual table, and you can easily specify start and stop rows for a Scan.
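
To illustrate the start/stop-row point, here is a minimal sketch that does not use the HBase client at all. It models a table as a sorted map ordered by unsigned byte comparison, which is how row keys are ordered, and derives an exclusive stop row from a prefix. The class name, method names, and the "orders|" / "users|" prefixes are made up for illustration. In a real client the two byte arrays would be passed as the Scan's start and stop rows.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PrefixScanSketch {
    // Row keys sort as unsigned bytes, lexicographically.
    static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    // Smallest key that sorts after every key starting with `prefix`,
    // usable as an exclusive stop row. An empty array means "no upper
    // bound" (the prefix was empty or consisted only of 0xFF bytes).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--; // trailing 0xFF bytes cannot be incremented, so drop them
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++; // bump the last remaining byte
        return stop;
    }

    // Returns the values of one virtual table by reading only the
    // contiguous key range [prefix, stopRowForPrefix(prefix)).
    static List<String> scanVirtualTable(TreeMap<byte[], String> table, String prefix) {
        byte[] start = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        Map<byte[], String> range = (stop.length == 0)
                ? table.tailMap(start, true)
                : table.subMap(start, true, stop, false);
        return new ArrayList<>(range.values());
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> table = new TreeMap<>(ROW_ORDER);
        table.put("users|001".getBytes(StandardCharsets.UTF_8), "u1");
        table.put("orders|002".getBytes(StandardCharsets.UTF_8), "o2");
        table.put("items|007".getBytes(StandardCharsets.UTF_8), "i7");
        table.put("orders|001".getBytes(StandardCharsets.UTF_8), "o1");

        // Only the two "orders|" rows are visited; other prefixes are skipped.
        System.out.println(scanVirtualTable(table, "orders|"));
    }
}
```

Because all rows of one virtual table share a prefix, they sit next to each other in key order, so the range read never touches rows from the other virtual tables.
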
Assigning a separate CF to each virtual table is a bad idea because data from different virtual tables ends up mixed: the CF comes after the row key in the default BigTable (HBase) comparison routine, so rows are ordered by row key first, regardless of which virtual table they belong to.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [email protected]

________________________________________
From: Pan, Thomas [[email protected]]
Sent: Wednesday, February 15, 2012 1:57 PM
To: [email protected]
Subject: Scan performance on a big table as combination of multiple logic tables

Since HBase is tailored to handle one table very well, we are thinking of putting multiple tables into one big table, each on a different set of column families. Our use case is a full table scan with single column value filters. Since records from different "logical tables" are in different column families, could we speed up the scan by first checking the column family referenced by these single column value filters, before actually going through all the underlying K-V pairs? It would be great if the HBase code already works that way.

$0.02,
Thomas
