Using the client API, we can scan a snapshot with a single scanner: we create an instance of TableSnapshotScanner and give each scanner its own unique restore directory. We are on CDH 5.4.5, using the HBase 1.0.0-cdh5.4.5 API. To reach the throughput we need, we tried running several scanners in parallel, but increasing the number of scanner instances throws the exception below, and it occurs more often as the scanner count goes up.
Can the client API support multiple scanners against a single snapshot? A minimal sketch of our setup follows.
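
For reference, here is a simplified sketch (in Scala, like the application classes in the trace) of how the scanners are created. The object name, the restore path, and the snapshot name "my_snapshot" are placeholders, not our actual identifiers; the real code opens the scanners from Akka actors rather than from Futures in a main method.

import java.util.UUID

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{Scan, TableSnapshotScanner}

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.concurrent.{Await, Future}

object SnapshotScanExample {

  // Each scanner gets its own unique restore directory, all pointing at the
  // same snapshot. Path and naming scheme are placeholders.
  def openScanner(conf: Configuration, snapshotName: String, scan: Scan): TableSnapshotScanner = {
    val restoreDir = new Path("/tmp/snapshot-restore/" + UUID.randomUUID().toString)
    new TableSnapshotScanner(conf, restoreDir, snapshotName, scan)
  }

  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()

    // Opening several scanners against the same snapshot concurrently is what
    // triggers the AlreadyBeingCreatedException in the trace below: each
    // constructor restores the snapshot, and judging by the path in the
    // exception the concurrent restores collide on link files under
    // /hbase/archive.
    val scanners = Future.sequence(
      (1 to 4).map(_ => Future(openScanner(conf, "my_snapshot", new Scan())))
    )

    Await.result(scanners, 10.minutes).foreach(_.close())
  }
}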

java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /hbase/archive/data/default/<tableName>/cb794cfb7948ba8b1f4e73b690dfbfe5/L/.links-a04e1a5b2141445eb1b9e2429f1eced2/cb794cfb7948ba8b1f4e73b690dfbfe5.<tableName> for DFSClient_NONMAPREDUCE_672650916_1 for client 10.10.2.90 because current leaseholder is trying to recreate file.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3077)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2783)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)

        at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:162)
        at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:561)
        at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:237)
        at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:159)
        at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:812)
        at org.apache.hadoop.hbase.client.TableSnapshotScanner.init(TableSnapshotScanner.java:156)
        at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:124)
        at org.apache.hadoop.hbase.client.TableSnapshotScanner.<init>(TableSnapshotScanner.java:101)
        at net.connexity.aro.data.AudienceScanner.join(AudienceScanner.scala:68)
        at net.connexity.aro.actor.ScanActor.joinAudience(ScanActor.scala:190)
        at net.connexity.aro.actor.ScanActor$$anonfun$receive$1.applyOrElse(ScanActor.scala:90)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
        at net.connexity.aro.actor.ScanActor.aroundReceive(ScanActor.scala:36)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
        at akka.dispatch.Mailbox.run(Mailbox.scala:220)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


