Hi Matteo,

I've posted the snapshot information here: http://pastebin.com/ZgDfH2pT
and the stack trace here: http://pastebin.com/GBQT3zdd

Thanks,

Sean

On Friday, 26 April, 2013 at 2:16 PM, Matteo Bertozzi wrote:

> Hey Sean,
>
> could you provide us the full stack trace of the FileNotFoundException
> Unable to open link
> and also the output of: hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo
> -files -stats -snapshot SNAPSHOT_NAME
> to give us a better idea of what is the state of the snapshot
>
> Thanks!
>
>
> On Fri, Apr 26, 2013 at 9:51 PM, Sean MacDonald <s...@opendns.com> wrote:
>
> > Hi Jon,
> >
> > I've actually discovered another issue with snapshot export. If you have a
> > region that has recently split and you take a snapshot of that table and
> > try to export it while the children still have references to the files in
> > the split parent, the files will not be transferred and will be counted in
> > the missing total. You end with error messages like:
> >
> > java.io.FileNotFoundException: Unable to open link:
> > org.apache.hadoop.hbase.io.HLogLink
> >
> > Please let me know if you would like any additional information.
> >
> > Thanks and have a great day,
> >
> > Sean
> >
> >
> > On Wednesday, 24 April, 2013 at 9:19 AM, Sean MacDonald wrote:
> >
> > > Hi Jon,
> > >
> > > No problem. We do have snapshots enabled on the target cluster, and we
> > > are using the default hfile archiver settings on both clusters.
> > >
> > > Thanks,
> > >
> > > Sean
> > >
> > >
> > > On Tuesday, 23 April, 2013 at 1:54 PM, Jonathan Hsieh wrote:
> > >
> > > > Sean,
> > > >
> > > > Thanks for finding this problem. Can you provide some more information
> > > > so that we can try to duplicate and fix this problem?
> > > >
> > > > Are snapshots on on the target cluster?
> > > > What are the hfile archiver settings in your hbase-site.xml on both
> > > > clusters?
> > > >
> > > > Thanks,
> > > > Jon.
> > > >
> > > >
> > > > On Mon, Apr 22, 2013 at 4:47 PM, Sean MacDonald <s...@opendns.com> wrote:
> > > >
> > > > > It looks like you can't export a snapshot to a running cluster or it
> > > > > will start cleaning up files from the archive after a period of time. I
> > > > > have turned off HBase on the destination cluster and the export is
> > > > > working as expected now.
> > > > >
> > > > > Sean
> > > > >
> > > > >
> > > > > On Monday, 22 April, 2013 at 9:22 AM, Sean MacDonald wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I am using HBase 0.94.6 on CDH 4.2 and trying to export a snapshot to
> > > > > > another cluster (also CDH 4.2), but this is failing repeatedly. The
> > > > > > table I am trying to export is approximately 4TB in size and has 10GB
> > > > > > regions. Each of the map jobs runs for about 6 minutes and appears to
> > > > > > be running properly, but then fails with a message like the following:
> > > > > >
> > > > > > 2013-04-22 16:12:50,699 WARN org.apache.hadoop.hdfs.DFSClient:
> > > > > > DataStreamer Exception
> > > > > > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > > > > > No lease on
> > > > > > /hbase/.archive/queries/533fcbb7858ef34b103a4f8804fa8719/d/651e974dafb64eefb9c49032aec4a35b
> > > > > > File does not exist. Holder DFSClient_NONMAPREDUCE_-192704511_1 does
> > > > > > not have any open files.
> > > > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
> > > > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
> > > > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
> > > > > > at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
> > > > > > at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
> > > > > > at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
> > > > > > at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
> > > > > > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
> > > > > > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
> > > > > > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
> > > > > > at java.security.AccessController.doPrivileged(Native Method)
> > > > > > at javax.security.auth.Subject.doAs(Subject.java:396)
> > > > > > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > > > > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
> > > > > >
> > > > > > I was able to see the file that the LeaseExpiredException mentions on
> > > > > > the destination cluster before the exception happened (it is gone
> > > > > > afterwards).
> > > > > >
> > > > > > Any help that could be provided in resolving this would be greatly
> > > > > > appreciated.
> > > > > >
> > > > > > Thanks and have a great day,
> > > > > >
> > > > > > Sean
> > > >
> > > >
> > > > --
> > > > // Jonathan Hsieh (shay)
> > > > // Software Engineer, Cloudera
> > > > // j...@cloudera.com