Hi, what's the best way to automate major compactions without just enabling them during an off-peak period?
What I have been testing is a simple script that runs on every node in the cluster. It checks whether a major compaction is already running on that node and, if not, picks one region and runs a major compaction on it. It has been running for some time and has helped get our data into much better shape, but I'm no longer sure how to choose which region to compact. So far I have been reading rs-status#regionStoreStats for that node and choosing first the region with the largest number of storefiles, and then those with the largest storefile sizes. Is there something more intelligent I could or should do? Thanks a lot!
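For reference, the selection order described above can be sketched roughly as follows. This is only a minimal illustration: `RegionStats`, its field names, and `pick_region` are hypothetical stand-ins for whatever the script parses out of rs-status#regionStoreStats, not an HBase API, and the step that actually triggers the compaction is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RegionStats:
    """Hypothetical per-region record parsed from the region server status page."""
    name: str
    storefile_count: int
    storefile_size_mb: int


def pick_region(
    regions: Sequence[RegionStats], compaction_running: bool
) -> Optional[RegionStats]:
    """Return the next region to major-compact on this node, or None.

    Nothing is picked while a major compaction is already running.
    Otherwise the region with the most storefiles wins, and ties are
    broken by the larger total storefile size.
    """
    if compaction_running or not regions:
        return None
    return max(regions, key=lambda r: (r.storefile_count, r.storefile_size_mb))


# Made-up sample values, purely for illustration.
sample = [
    RegionStats("region-a", storefile_count=3, storefile_size_mb=900),
    RegionStats("region-b", storefile_count=7, storefile_size_mb=200),
    RegionStats("region-c", storefile_count=7, storefile_size_mb=450),
]
chosen = pick_region(sample, compaction_running=False)
print(chosen.name if chosen else "nothing to compact")
```

Keeping the ordering in a single sort key makes it easy to experiment with other criteria later by changing the tuple returned in the `key` function.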