[ https://issues.apache.org/jira/browse/HBASE-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775680#comment-13775680 ]
Nick Dimiduk commented on HBASE-9639: ------------------------------------- Yes, I think the comparator is in fact correct. The problem appears to be from SecureBulkLoadClient#bulkLoadHFiles sending bulk load requests to all regions in the table rather than the single region it intended. This is because is passes in EMPTY_BYTE_ARRAY for both start and end rows, which is also the invocation structure necessary for calling the whole table. > SecureBulkLoad dispatches file load requests to all Regions > ----------------------------------------------------------- > > Key: HBASE-9639 > URL: https://issues.apache.org/jira/browse/HBASE-9639 > Project: HBase > Issue Type: Bug > Components: Client, Coprocessors > Affects Versions: 0.95.2 > Environment: Hadoop2, Kerberos > Reporter: Nick Dimiduk > Assignee: Nick Dimiduk > Fix For: 0.98.0, 0.96.1 > > Attachments: HBASE-9639.00.patch > > > When running a bulk load on a secure environment and loading data into the > first region of a table, the request to load the HFile set is dispatched to > all Regions for the table. This is reproduced consistently by running > IntegrationTestBulkLoad on a secure cluster. The load fails with an exception > that looks like: > {noformat} > 2013-08-30 07:37:22,993 INFO [main] mapreduce.LoadIncrementalHFiles: Split > occured while grouping HFiles, retry attempt 1 with 3 files remaining to > group or split > 2013-08-30 07:37:22,999 ERROR [main] mapreduce.LoadIncrementalHFiles: > IOException during splitting > java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File > does not exist: > /user/hbase/test-data/c45ddfe9-ee30-4d32-8042-928db12b1cee/IntegrationTestBulkLoad-0/L/bf41ea13997b4e228d05e67ba7b1b686 > at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61) > at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:51) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1489) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1438) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1418) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1392) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:438) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:269) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59566) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042) > at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) > at java.util.concurrent.FutureTask.get(FutureTask.java:83) > at > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:403) > at > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:284) > at > org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.runLinkedListMRJob(IntegrationTestBulkLoad.java:200) > at > org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.testBulkLoad(IntegrationTestBulkLoad.java:133) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira