HBase MapReduce rack local MAP tasks

2016-02-24 Thread onealbao
Sorry to bother. For my MapReduce framework, I split a scan into several sub-scans of one region; however, I still got many rack-local MAP tasks rather than data-local MAP tasks. Is this a problem with the mapred.max.split.size setting because of the HDFS default 64 MB block size? Or do I need to use some other taskSc

When split a region, how to get row keys efficiently instead of using midkey

2016-01-30 Thread onealbao
Hi, In the default region split policy, it first finds the largest store, then the largest store file, and finally gets the split point (midkey) of that store file. Is there any way to efficiently get all row keys of a store file? I tried to use ResultScanner with start/end row keys set, but I fo