Currently some system cleaning tasks read all the rows and then perform
some operations on them. This impacts other users who are being served at the
same time. System cleaning tasks are of lower priority, and I can delay the
requests, but I am just wondering if there is any way I can hook into
This doesn't answer your question per se, but this is how we dealt with
load on HBase at Lithium. We power klout.com with HBase. On a nightly
basis, we load user profile data and Klout scores for approx. 600 million
users into HBase. We also do maintenance on HBase such as major compactions
on a