Hi All,

I have timeseries data in which most of the regions are completely inactive. With my current set of resources and estimates, I would end up with close to 15 TB of data per RegionServer, and with a region size of about 15 GB this would mean 1,000 regions per RegionServer. On the whole I expect close to 150 TB of data, which would lead to close to 10,000 total regions, and I was thinking of handling it all with around 10-15 nodes.
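To make the arithmetic behind those numbers explicit, here is a rough back-of-envelope sketch. It assumes 1 TB = 1000 GB and ignores replication, compression and any per-file overhead; the function and variable names are just for illustration and are not anything from HBase.

```python
import math

GB_PER_TB = 1000  # assumption: decimal units


def region_count(data_tb: float, region_size_gb: float) -> int:
    """Number of regions needed to hold data_tb at a fixed region size."""
    return math.ceil(data_tb * GB_PER_TB / region_size_gb)


# Figures from this post.
data_per_server_tb = 15
region_size_gb = 15
total_data_tb = 150

regions_per_server = region_count(data_per_server_tb, region_size_gb)
total_regions = region_count(total_data_tb, region_size_gb)
min_servers = math.ceil(total_data_tb / data_per_server_tb)

print(regions_per_server)  # 1000
print(total_regions)       # 10000
print(min_servers)         # 10
```

So 10 nodes is the floor at 15 TB each; with 15 nodes the same 10,000 regions would come to roughly 667 regions per RegionServer.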
This is a write-intensive process and read QPS will be fairly low. Even at write time I expect only 1-3 regions per RegionServer to be actively written to. I wanted to know more about the memory overhead associated with completely inactive regions. Can someone please help me out with the details of the typical minimum memory overheads (memstore, block cache, indexes and bloom filters) for such inactive (cold) regions? If the overhead is nil or minuscule, then I should be able to comfortably run these RegionServers with ~10 GB RAM. Any other gotchas I need to be careful about here?

--cheers,
gaurav