There are 10 region servers, and I can schedule compaction for the weekend,
when the write load is negligible.

After reading the documentation, it's not clear how many HFiles are created
once the bulk load finishes. Is it one HFile per reducer? My question is: is
it recommended to run a major compaction after a bulk load if the number of
regions on each region server is not too high?
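
For a sense of scale, here is a back-of-the-envelope estimate using the
numbers from the original post (10 TB, 30 regions, 10 GB max region file
size). It is only a sketch under stated assumptions: one reducer per region,
a single column family, 10 TB being the size of the generated HFiles, and each
reducer starting a new HFile once the current one reaches the max file size.
Whether the bulk-load job actually behaves this way is exactly what is being
asked, so treat the HFile count as a guess. The function name
estimate_bulk_load is made up for illustration.

```python
import math

TB = 1024 ** 4
GB = 1024 ** 3


def estimate_bulk_load(total_bytes, num_regions, max_file_bytes):
    """Rough per-region estimate for a bulk load into a pre-split table."""
    # Data landing in each region, assuming the split points spread it evenly.
    per_region = total_bytes / num_regions
    # Assumption: one reducer per region and one column family, with a new
    # HFile started whenever the current one reaches max_file_bytes.
    hfiles_per_region = math.ceil(per_region / max_file_bytes)
    # Number of regions needed for each region to hold at most max_file_bytes.
    regions_to_fit = math.ceil(total_bytes / max_file_bytes)
    return per_region, hfiles_per_region, regions_to_fit


per_region, hfiles, regions_to_fit = estimate_bulk_load(10 * TB, 30, 10 * GB)
print(f"data per region: {per_region / GB:.1f} GB")
print(f"HFiles per region (estimate): {hfiles}")
print(f"regions needed at 10 GB each: {regions_to_fit}")
```

With these inputs, each of the 30 regions would receive roughly 341 GB, far
above the 10 GB limit, so the regions would be expected to split many times
after the load regardless of whether a major compaction is run.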


On Thursday, July 30, 2015, Ted Yu <[email protected]> wrote:

> How many region servers do you have in the cluster?
>
> Would there be concurrent write load on the cluster if you choose to run
> major compaction? I ask this because the concurrent write would be slowed
> down by the major compaction and compacting 10 TB of data would take some
> time.
>
> Cheers
>
> On Wed, Jul 29, 2015 at 4:23 PM, Krishna <[email protected]> wrote:
>
> > Hi,
> >
> > I am planning to bulk-load about 10 TB of data to a table pre-split with
> > 30 regions with max region file size configured to 10 GB.
> >
> > Is it recommended that I run a major compaction when bulk-loading
> > finishes? How many HFiles does the reducer create?
> >
>
