On Fri, Oct 11, 2013 at 5:27 AM, Adrian Sandulescu <[email protected]> wrote:
> It's Apache Hadoop, but I see multi-part upload is in the works for this as
> well.
> https://issues.apache.org/jira/browse/HADOOP-9454

I didn't know about this ticket. That's a very good thing to have in Apache.

In the meantime, does EMR run a Hadoop version close to yours? If it does, and you're feeling brave enough, you can try a franken-build: grab the appropriate Hadoop jar from an EMR deployment and deploy it on your cluster. I don't know what else this might pull in, though.

> Another solution would be to limit the size of the HFiles to 5GB, but I
> don't know yet what effect this would have on cluster performance.

If that's your only option, that's your only option. Once the snapshot is hydrated from S3, you can always compact your table to clean things up. There's a rough shell sketch of both steps below.
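For reference, here's roughly what those two steps look like from the HBase shell. This is just a sketch: 'mytable' and 'mytable_snapshot' are placeholder names, and 5368709120 is 5GB in bytes, i.e. the ceiling for a single S3 PUT.

  # Cap store files at ~5GB for this table so no single HFile exceeds
  # what one S3 PUT can carry ('mytable' is a placeholder name).
  alter 'mytable', MAX_FILESIZE => '5368709120'

  # Once the exported snapshot is back on the cluster, restore it and
  # major-compact the table to merge the capped files back together.
  restore_snapshot 'mytable_snapshot'
  major_compact 'mytable'

The cluster-wide equivalent of the alter is hbase.hregion.max.filesize in hbase-site.xml.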
Good luck!
-n