Using the client api, we can scan the snapshot using a single scanner. We currently create an instance of TableSnapshotScanner providing it with a unique dir location per scanner. We are currently on cdh 5.4.5 and using the hbase 1.0.0-cdh5.4.5 api. In order to get desired throughput, we tried to increase the number of parallel scanners but raising the no. of scanner instances throws the following exception, the occurrence increases as the no. of scanners goes up. Can the client api support multiple scanners for a single snapshot?
java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /hbase/archive/data/default/<tableName>/cb794cfb7948ba8b1f4e73b690dfbfe5/L/.links-a04e1a5b2141445eb1b9e2429f1eced2/cb794cfb7948ba8b1f4e73b690dfbfe5.<tableName> for DFSClient_NONMAPREDUCE_672650916_1 for client 10.10.2.90 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3077) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2783) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038) at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:162) at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:561) at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:237) at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:159) at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:812) at org.apache.hadoop.hbase.client.TableSnapshotScanner.init(TableSnapshotScanner.java:156) at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:124) at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:101) at net.connexity.aro.data.AudienceScanner.join(AudienceScanner.scala:68) at net.connexity.aro.actor.ScanActor.joinAudience(ScanActor.scala:190) at net.connexity.aro.actor.ScanActor$$anonfun$receive$1.applyOrElse(ScanActor.scala:90) at akka.actor.Actor$class.aroundReceive(Actor.scala:467) at net.connexity.aro.actor.ScanActor.aroundReceive(ScanActor.scala:36) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) at akka.dispatch.Mailbox.run(Mailbox.scala:220) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- View this message in context: http://apache-hbase.679495.n3.nabble.com/Parallel-scanning-of-snapshots-using-hbase-client-api-tp4077014.html Sent from the HBase Developer mailing list archive at Nabble.com.