There are 10 region servers, and I can schedule the compaction during the weekend, when the write load is negligible.
After reading the documentation, it's not clear how many HFiles are created once the bulk load finishes. Is it one HFile per reducer? My question is: is it recommended to run a major compaction after a bulk load if the number of regions on each region server is not too high?

On Thursday, July 30, 2015, Ted Yu <[email protected]> wrote:

> How many region servers do you have in the cluster ?
>
> Would there be concurrent write load on the cluster if you choose to run
> major compaction ? I ask this because the concurrent write would be slowed
> down by the major compaction and compacting 10 TB of data would take some
> time.
>
> Cheers
>
> On Wed, Jul 29, 2015 at 4:23 PM, Krishna <[email protected]> wrote:
>
> > Hi,
> >
> > I am planning to bulk-load about 10 TB of data to a table pre-split with
> > 30 regions with max region file size configured to 10 GB.
> >
> > Is it recommended that I run a major compaction when bulk-loading
> > finishes? How many HFiles does the reducer create?
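
For scale, here is a rough back-of-the-envelope sketch using the figures from this thread (10 TB, 30 pre-split regions, 10 GB max region file size, 10 region servers). It is only an estimate, not a statement of what HBase will actually produce. It assumes the data is spread evenly across regions, that there is a single column family, and that the bulk-load writer starts a new HFile once the current one reaches the max region file size. The function name and the `families` parameter are made up for illustration.

```python
import math

# Figures taken from the thread.
TOTAL_DATA_GB = 10 * 1024   # about 10 TB
NUM_REGIONS = 30            # table is pre-split into 30 regions
MAX_FILE_SIZE_GB = 10       # configured max region file size
NUM_REGION_SERVERS = 10


def estimate_hfiles(total_gb, regions, max_file_gb, families=1):
    """Estimate HFile counts under the assumptions stated above.

    Assumes an even key distribution and that a new file is started
    whenever the current one reaches max_file_gb (an assumption here).
    """
    per_region_gb = total_gb / regions
    files_per_region = math.ceil(per_region_gb / max_file_gb) * families
    total_files = files_per_region * regions
    return per_region_gb, files_per_region, total_files


per_region_gb, files_per_region, total_files = estimate_hfiles(
    TOTAL_DATA_GB, NUM_REGIONS, MAX_FILE_SIZE_GB
)
regions_per_server = NUM_REGIONS / NUM_REGION_SERVERS

print(f"data per region:      {per_region_gb:.1f} GB")
print(f"HFiles per region:    {files_per_region}")
print(f"HFiles in total:      {total_files}")
print(f"regions per server:   {regions_per_server:.0f}")
```

Under these assumptions each region receives far more data than the 10 GB max region file size, so the number of files per region, rather than the number of regions per server, is what the compaction would have to work through.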
