Hi

I'm thinking of creating a table that will hold millions of rows, and each
day I would insert and delete millions of rows to/from it.

Two questions:
1. I'm guessing HBase won't have any problems with this approach, but I just
wanted to check that I won't run into issues with region splits or
compactions. Can you think of any problems?
2. Let's say there are 6 million records in the table, and I do a full
table scan querying a single column in one column family, where the value
in each cell is either 1 or 0. Let's say it takes N seconds. Now I bulk
delete 5 million records (but do not run a compaction) and run the same
query again. Would I get a much faster response, or will HBase need to
perform the same amount of I/O as if all 6 million records were still
there? Once a major compaction is done, I assume the query would run
faster again. (A sketch of the experiment I have in mind follows below.)
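For concreteness, here is roughly the experiment I mean, sketched with the
HBase Java client. The table name "mytable", family "cf", qualifier "flag",
the row-key scheme, and the batch size are all made up for illustration:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BulkDeleteThenScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("mytable"))) {

                // Bulk delete 5 million rows, batching the Deletes
                // (row keys here are hypothetical: "row000000000" etc.).
                List<Delete> deletes = new ArrayList<>();
                for (long i = 0; i < 5_000_000L; i++) {
                    deletes.add(new Delete(Bytes.toBytes(String.format("row%09d", i))));
                    if (deletes.size() == 10_000) {
                        table.delete(deletes);
                        deletes = new ArrayList<>();
                    }
                }
                if (!deletes.isEmpty()) {
                    table.delete(deletes);
                }

                // Full table scan over the single 0/1 column, timed.
                Scan scan = new Scan();
                scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("flag"));
                long start = System.currentTimeMillis();
                long rows = 0;
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result r : scanner) {
                        rows++;
                    }
                }
                System.out.println(rows + " rows scanned in "
                        + (System.currentTimeMillis() - start) + " ms");
            }
        }
    }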

Also, most queries on the table would scan the entire table.
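If it matters, after the bulk delete I could force a major compaction from
the Java API before re-running the scan, along these lines (same made-up
table name as above; as I understand it, majorCompact only queues the
request asynchronously rather than waiting for it to finish):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class ForceMajorCompaction {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Admin admin = conn.getAdmin()) {
                // Request a major compaction so the delete tombstones and
                // the deleted cells are actually dropped from the HFiles.
                admin.majorCompact(TableName.valueOf("mytable"));
            }
        }
    }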


