Hello everyone,
I'm using this tool to export snapshots to S3 and "import" them back from S3:
https://github.com/lospro7/snapshot-s3-util/blob/master/src/main/java/com/imgur/backup/SnapshotS3Util.java
I'm using this tool because it seems like a better option than ExportTable,
considering there isn't another HDFS cluster on hand.
It uses the following trick to make ExportSnapshot "import" from S3 into the
local HDFS.
    // Override dfs configuration to point to S3
    config.set("fs.default.name", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
    config.set("fs.defaultFS", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
    config.set("fs.s3.awsAccessKeyId", accessKey);
    config.set("fs.s3.awsSecretAccessKey", accessSecret);
    config.set("hbase.tmp.dir", "/tmp/hbase-${user.name}");
    config.set("hbase.rootdir", s3Url);
Imports work great, but only when using the s3n:// protocol (which means an
HFile size limit of 5 GB).
When using the s3:// protocol, I get the following:
13/10/08 13:32:01 INFO mapred.JobClient: MISSING_FILES=1
The author said he wasn't able to debug it and just uses s3n:// until it
becomes a problem.
Has anyone encountered this when using ExportSnapshot?
Can you please point me in the right direction?
Adrian