The assumption is that one of the three copies of each HDFS block comprising your HFiles is stored on the local datanode.

That is what the major compaction process guarantees.
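
To make the locality argument concrete, here is a toy Java sketch. It is not HBase code: the class LocalitySketch, the method localityIndex, and the host names are made up for illustration. It counts the fraction of a region's HDFS blocks that have at least one replica on the region server's host. It assumes equal-sized blocks, and that files rewritten by a major compaction get one replica on the writing node.

```java
import java.util.List;
import java.util.Set;

public class LocalitySketch {
    // Fraction of blocks with at least one replica on the given host.
    // Hypothetical helper: assumes equal-sized blocks, so it counts blocks, not bytes.
    public static double localityIndex(List<Set<String>> blockReplicaHosts, String host) {
        if (blockReplicaHosts.isEmpty()) {
            return 0.0;
        }
        long local = blockReplicaHosts.stream()
                .filter(replicas -> replicas.contains(host))
                .count();
        return (double) local / blockReplicaHosts.size();
    }

    public static void main(String[] args) {
        String regionServer = "rs1"; // made-up host name

        // Region was moved to rs1; most blocks were written by other servers.
        List<Set<String>> beforeCompaction = List.of(
                Set.of("rs2", "rs3", "rs4"),
                Set.of("rs1", "rs2", "rs5"),
                Set.of("rs3", "rs4", "rs5"),
                Set.of("rs2", "rs4", "rs5"));

        // Assumed state after a major compaction run by rs1:
        // every rewritten block has one replica on rs1.
        List<Set<String>> afterCompaction = List.of(
                Set.of("rs1", "rs2", "rs3"),
                Set.of("rs1", "rs4", "rs5"));

        System.out.println(localityIndex(beforeCompaction, regionServer)); // prints 0.25
        System.out.println(localityIndex(afterCompaction, regionServer));  // prints 1.0
    }
}
```

In this model, a mapper scheduled on the region's host reads every block locally only in the second case, which is why the split carries the region location.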

On 5/26/17 9:59 AM, Rajeshkumar J wrote:
I have seen in the code that, while creating input splits, region info is
also sent with each split. Is there any reason for that, given that not all
of the hfiles are going to be on that server?

On Fri, May 26, 2017 at 7:06 PM, Ted Yu <yuzhih...@gmail.com> wrote:

Consider running major compaction, which restores data locality.

Thanks

On May 26, 2017, at 6:08 AM, Rajeshkumar J <rajeshkumarit8...@gmail.com> wrote:

Thanks Ted. If the data blocks of the hfile may not be on the same node as
the region server, then how is data locality achieved when mapreduce is run
over hbase tables?



On Fri, May 26, 2017 at 6:15 PM, Ted Yu <yuzhih...@gmail.com> wrote:

The hfiles of a region are stored on hdfs. By default, hdfs has a
replication factor of 3.
If you're not using the read replica feature, any single region is served by
one region server (however, the data blocks of the hfile may not be on the
same node as the region server).

Cheers

On Thu, May 25, 2017 at 11:45 PM, Rajeshkumar J <rajeshkumarit8...@gmail.com> wrote:

Hi,

   We have the region max file size set to 10 GB. Do the hfiles of a region
exist on the same region server, or will they be distributed?

Thanks


