Also, here are the files in S3:

$ hadoop fs -ls s3n://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
Found 1 items
-rwxrwxrwx   1  741047906 2013-10-08 13:45 s3n://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
$ hadoop fs -ls s3://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
Found 1 items
-rwxrwxrwx   1  741047906 1970-01-01 00:00 s3://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1

Thank you,
Adrian

On Tue, Oct 8, 2013 at 6:44 PM, Adrian Sandulescu <[email protected]> wrote:

> Yes, I was just digging.
>
> From a successful s3n:// import:
>
> 2013-10-08 14:57:04,816 INFO org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy file input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1 output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
> 2013-10-08 14:57:04,965 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening 's3n://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1' for reading
> 2013-10-08 14:57:05,039 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1' for reading at position '0'
> 2013-10-08 14:57:05,299 INFO org.apache.hadoop.hbase.snapshot.ExportSnapshot: Skip copy v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1 to hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1, same file.
> 2013-10-08 14:57:05,300 INFO org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy completed for input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1 output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
>
> From a failed s3:// import:
>
> 2013-10-08 15:27:21,810 INFO org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy file input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1 output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
> 2013-10-08 15:27:21,834 ERROR org.apache.hadoop.hbase.snapshot.ExportSnapshot: Unable to open source file=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
> java.io.IOException: No such file.
>     at org.apache.hadoop.fs.s3.S3FileSystem.checkFile(S3FileSystem.java:181)
>     at org.apache.hadoop.fs.s3.S3FileSystem.open(S3FileSystem.java:246)
>     at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.tryOpen(FileLink.java:289)
>     at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.<init>(FileLink.java:120)
>     at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.<init>(FileLink.java:111)
>     at org.apache.hadoop.hbase.io.FileLink.open(FileLink.java:390)
>     at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.openSourceFile(ExportSnapshot.java:302)
>     at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:175)
>     at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:146)
>     at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:95)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>
> Thank you,
> Adrian
>
>
> On Tue, Oct 8, 2013 at 5:56 PM, Ted Yu <[email protected]> wrote:
>
>> bq. 13/10/08 13:32:01 INFO mapred.JobClient: MISSING_FILES=1
>>
>> Are you able to provide more context from the job output?
>>
>> Thanks
>>
>>
>> On Tue, Oct 8, 2013 at 6:35 AM, Adrian Sandulescu <[email protected]> wrote:
>>
>> > Hello everyone,
>> >
>> > I'm using this tool to export and "import" snapshots from S3:
>> >
>> > https://github.com/lospro7/snapshot-s3-util/blob/master/src/main/java/com/imgur/backup/SnapshotS3Util.java
>> >
>> > I'm using this tool because it seems like a better option than ExportTable,
>> > considering there isn't another HDFS cluster on hand.
>> >
>> > It uses the following trick to make ExportSnapshot "import" from S3 to the
>> > local HDFS:
>> >
>> > // Override dfs configuration to point to S3
>> > config.set("fs.default.name", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
>> > config.set("fs.defaultFS", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
>> > config.set("fs.s3.awsAccessKeyId", accessKey);
>> > config.set("fs.s3.awsSecretAccessKey", accessSecret);
>> > config.set("hbase.tmp.dir", "/tmp/hbase-${user.name}");
>> > config.set("hbase.rootdir", s3Url);
>> >
>> > Imports work great, but only when using the s3n:// protocol (which means an
>> > HFile limit of 5 GB).
>> > When using the s3:// protocol, I get the following:
>> >
>> > 13/10/08 13:32:01 INFO mapred.JobClient: MISSING_FILES=1
>> >
>> > The author said he wasn't able to debug it and just uses s3n:// until it
>> > becomes a problem.
>> >
>> > Has anyone encountered this when using ExportSnapshot?
>> > Can you please point me in the right direction?
>> >
>> > Adrian
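[Editor's note: for reference, a minimal standalone sketch of the override trick described in the thread, not taken from SnapshotS3Util itself. It assumes ExportSnapshot is run through ToolRunner with the -snapshot and -copy-to options; the credentials, bucket, rootdir layout, and snapshot name are placeholders.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
    import org.apache.hadoop.util.ToolRunner;

    public class SnapshotS3ImportSketch {
      public static void main(String[] args) throws Exception {
        // Placeholders -- substitute real values; avoid hard-coding credentials.
        String accessKey    = "YOUR_ACCESS_KEY";
        String accessSecret = "YOUR_SECRET_KEY";
        String bucketName   = "hbase-export";
        String s3protocol   = "s3n://";   // s3:// fails with MISSING_FILES, per the thread

        String s3Root = s3protocol + accessKey + ":" + accessSecret + "@" + bucketName;

        // Same override trick as in the quoted snippet: make the S3 bucket the
        // default FileSystem so ExportSnapshot reads the archived snapshot from S3.
        Configuration config = HBaseConfiguration.create();
        config.set("fs.default.name", s3Root);
        config.set("fs.defaultFS", s3Root);
        config.set("fs.s3.awsAccessKeyId", accessKey);
        config.set("fs.s3.awsSecretAccessKey", accessSecret);
        config.set("hbase.tmp.dir", "/tmp/hbase-${user.name}");
        config.set("hbase.rootdir", s3Root + "/hbase");   // assumed layout: snapshot was exported under /hbase

        // The "import" is simply an export whose destination is the local HDFS cluster.
        int rc = ToolRunner.run(config, new ExportSnapshot(), new String[] {
            "-snapshot", "my_snapshot",                    // hypothetical snapshot name
            "-copy-to",  "hdfs://mycluster:8020/hbase"
        });
        System.exit(rc);
      }
    }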
