TableSnapshotScanner itself does not support more than one scanner. Are you creating more than 1 TableSnapshotScanner in your parallel scan?
Every time a snapshot scanner is initiated, it will try to "restore" the snapshot to a temporary location outside the regular root directory in HDFS. You can try to give a different restore directory to each TableSnapshotScanner (a rough sketch is below the quoted message).

Enis

On Wed, Jan 6, 2016 at 5:51 PM, dbhogle <dbho...@connexity.com> wrote:
> Using the client api, we can scan the snapshot using a single scanner. We
> currently create an instance of TableSnapshotScanner providing it with a
> unique dir location per scanner. We are currently on cdh 5.4.5 and using
> the hbase 1.0.0-cdh5.4.5 api. In order to get desired throughput, we tried
> to increase the number of parallel scanners but raising the no. of scanner
> instances throws the following exception; the occurrence increases as the
> no. of scanners goes up.
> Can the client api support multiple scanners for a single snapshot?
>
> java.io.IOException: java.util.concurrent.ExecutionException:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
> failed to create file
> /hbase/archive/data/default/<tableName>/cb794cfb7948ba8b1f4e73b690dfbfe5/L/.links-a04e1a5b2141445eb1b9e2429f1eced2/cb794cfb7948ba8b1f4e73b690dfbfe5.<tableName>
> for DFSClient_NONMAPREDUCE_672650916_1 for client 10.10.2.90 because
> current leaseholder is trying to recreate file.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3077)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2783)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>
>     at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:162)
>     at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:561)
>     at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:237)
>     at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:159)
>     at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:812)
>     at org.apache.hadoop.hbase.client.TableSnapshotScanner.init(TableSnapshotScanner.java:156)
>     at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:124)
>     at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:101)
>     at net.connexity.aro.data.AudienceScanner.join(AudienceScanner.scala:68)
>     at net.connexity.aro.actor.ScanActor.joinAudience(ScanActor.scala:190)
>     at net.connexity.aro.actor.ScanActor$$anonfun$receive$1.applyOrElse(ScanActor.scala:90)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>     at net.connexity.aro.actor.ScanActor.aroundReceive(ScanActor.scala:36)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Parallel-scanning-of-snapshots-using-hbase-client-api-tp4077014.html
> Sent from the HBase Developer mailing list archive at Nabble.com.
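In case it helps, here is a rough, untested sketch against the 1.0 client API of fanning the scan out over several TableSnapshotScanner instances, each with its own restore directory and a disjoint row range. The snapshot name, restore base path, and split key are placeholders, not taken from your setup:

// Untested sketch: one TableSnapshotScanner per worker, each with its own
// restore directory and a disjoint row range. Snapshot name, base path and
// split keys below are placeholders.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.TableSnapshotScanner;
import org.apache.hadoop.hbase.util.Bytes;

public class ParallelSnapshotScan {
  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    final String snapshotName = "my_snapshot";              // placeholder
    // Row-range boundaries for the parallel scanners (placeholders);
    // null means "open ended" on that side.
    final byte[][] boundaries = { null, Bytes.toBytes("m"), null };

    ExecutorService pool = Executors.newFixedThreadPool(boundaries.length - 1);
    List<Future<Long>> futures = new ArrayList<>();
    for (int i = 0; i < boundaries.length - 1; i++) {
      final byte[] start = boundaries[i];
      final byte[] stop = boundaries[i + 1];
      futures.add(pool.submit(new Callable<Long>() {
        @Override
        public Long call() throws IOException {
          // A distinct restore dir per scanner, so each scanner restores the
          // snapshot into its own temporary location in HDFS.
          Path restoreDir = new Path("/tmp/snapshot-restore-" + UUID.randomUUID());
          Scan scan = new Scan();
          if (start != null) scan.setStartRow(start);
          if (stop != null) scan.setStopRow(stop);
          TableSnapshotScanner scanner =
              new TableSnapshotScanner(conf, restoreDir, snapshotName, scan);
          long count = 0;
          try {
            for (Result r = scanner.next(); r != null; r = scanner.next()) {
              count++;                                       // process each row here
            }
          } finally {
            scanner.close();
          }
          return count;
        }
      }));
    }
    long total = 0;
    for (Future<Long> f : futures) {
      total += f.get();
    }
    pool.shutdown();
    System.out.println("Scanned " + total + " rows");
  }
}

I am not certain the restore directories are cleaned up for you on close in that version, so you may want to delete them once the scans finish.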