Hi, I'm thinking of creating a table that will have millions of rows, and each day I would insert and delete millions of rows to/from it.
Two questions:

1. I'm guessing HBase won't have any problems with this approach, but I wanted to check whether I could run into issues with region splits or compaction. Can you think of any problems?

2. Say there are 6 million records in the table, and I run a full table scan querying a column family that has a single column, where the cell value is either 1 or 0. Say it takes N seconds. Now I bulk delete 5 million records (but do not run a compaction) and run the same query again. Would I get a much faster response, or will HBase need to perform the same amount of I/O as if the 6 million records were still there? Once compaction is done, the query would presumably run faster.

Also, most queries on the table would scan the entire table.
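To make the workload concrete, here is roughly the scan-then-bulk-delete pattern I have in mind (just a minimal sketch against the HBase Java client; the table name "mytable", family "f", qualifier "flag", the delete criterion, and the batch size are all placeholders):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanThenBulkDelete {

    private static final byte[] FAMILY = Bytes.toBytes("f");
    private static final byte[] QUALIFIER = Bytes.toBytes("flag");

    // The "N seconds" query: full scan over the single column, reading the 0/1 cell.
    static long fullScan(Table table) throws IOException {
        Scan scan = new Scan();
        scan.addColumn(FAMILY, QUALIFIER);
        scan.setCaching(1000);                 // fetch rows to the client in batches
        long ones = 0;
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result row : scanner) {
                byte[] value = row.getValue(FAMILY, QUALIFIER);
                if (value != null && value.length > 0 && value[0] == 1) {
                    ones++;
                }
            }
        }
        return ones;
    }

    // Bulk delete: collect row keys (by some criterion) and delete in batches.
    // This only writes delete markers (tombstones); no compaction is triggered here.
    static void bulkDelete(Table table) throws IOException {
        Scan scan = new Scan();
        scan.addColumn(FAMILY, QUALIFIER);
        List<Delete> batch = new ArrayList<>();
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result row : scanner) {
                byte[] value = row.getValue(FAMILY, QUALIFIER);
                if (value != null && value.length > 0 && value[0] == 0) {  // example criterion
                    batch.add(new Delete(row.getRow()));
                }
                if (batch.size() >= 10000) {
                    table.delete(batch);
                    batch = new ArrayList<>();
                }
            }
        }
        if (!batch.isEmpty()) {
            table.delete(batch);
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("mytable"))) {
            long before = fullScan(table);   // takes ~N seconds with 6M rows
            bulkDelete(table);               // drop ~5M rows, no compaction run
            long after = fullScan(table);    // is this much faster before compaction?
            System.out.println("ones before=" + before + ", after=" + after);
        }
    }
}

The question, in terms of this sketch, is whether the second fullScan() call gets cheaper as soon as the deletes are issued, or only after a major compaction has actually removed the deleted cells from the store files.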