Re: Accumulo with Native S3 Support

Jeremy Kepner Tue, 27 Jul 2021 10:31:11 -0700

If this works it will be great.
There might also be interest in creating a Lustre plugin.
Regards.  -Jeremy


On Tue, Jul 27, 2021 at 01:15:40PM -0400, Jeff Kubina wrote:
> All,
> 
> Some of AWS's back-end services use a version of Accumulo modified to use
> Amazon's S3 as its storage system. Amazon engineers forked Accumulo 2.0 and
> merged that S3 support into it <https://github.com/cmilbert/accumulo/>.
> Chris Milbert is the lead Amazon engineer who did the integration. Chris
> and I would like to jump-start the conversation about how best to initiate
> the pull request for these changes into Accumulo 2.1.
> 
> Mike Wall suggested using this as an opportunity to abstract out the
> storage system of Accumulo and make it pluggable. He suggested the
> following broad steps:
> 
>    1. Identify all the things HDFS provides such as read, write,
>    replication and failover.
>    2. Abstract out a file system interface with hooks for all those things
>    (one that does not require loading Hadoop jars).
>    3. Plug in HDFS as the default implementation of that interface, hiding
>    all Hadoop jars there.
>    4. Make another implementation that plugs in S3 and make it optionally
>    configurable.
>    5. Run tests to make sure we didn't break things with HDFS.
>    6. Run tests to see if S3 meets all the requirements.
> 
> Ed Coleman also suggested first forking Accumulo 2.1 and merging the S3
> changes into it.
> 
> Chris and I look forward to the discussion on how best to add S3 support to
> Accumulo.
> 
> Thanks,
> Jeff
> -- 
> Jeff Kubina
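
Mike Wall's steps quoted above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not code from Accumulo or from the S3 fork: the names StorageBackend, InMemoryBackend, StorageBackends, and StorageContract are hypothetical, and an in-memory map stands in for both the HDFS and the S3 implementations so the sketch runs without Hadoop or AWS libraries.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical interface for step 2; it is not part of Accumulo.
// It uses only JDK types, so callers never need Hadoop jars.
interface StorageBackend {
    void write(String path, byte[] data) throws IOException;
    byte[] read(String path) throws IOException;
    boolean exists(String path) throws IOException;
    void delete(String path) throws IOException;
}

// In-memory stand-in. A real HDFS or S3 implementation would live in its
// own module and keep its client libraries private to that module.
class InMemoryBackend implements StorageBackend {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    @Override public void write(String path, byte[] data) {
        files.put(path, data.clone());
    }

    @Override public byte[] read(String path) throws IOException {
        byte[] data = files.get(path);
        if (data == null) throw new FileNotFoundException(path);
        return data.clone();
    }

    @Override public boolean exists(String path) { return files.containsKey(path); }

    @Override public void delete(String path) { files.remove(path); }
}

// Steps 3 and 4: implementations are chosen by a configured name,
// and "hdfs" is used when nothing is configured.
final class StorageBackends {
    static final String DEFAULT_NAME = "hdfs";
    private static final Map<String, Supplier<StorageBackend>> FACTORIES =
            new ConcurrentHashMap<>();

    private StorageBackends() {}

    static void register(String name, Supplier<StorageBackend> factory) {
        FACTORIES.put(name, factory);
    }

    static StorageBackend create(String configuredName) {
        String name = (configuredName == null || configuredName.isEmpty())
                ? DEFAULT_NAME : configuredName;
        Supplier<StorageBackend> factory = FACTORIES.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown storage backend: " + name);
        }
        return factory.get();
    }
}

// Steps 5 and 6: one shared check that every implementation must pass.
final class StorageContract {
    private StorageContract() {}

    static void verify(StorageBackend backend) throws IOException {
        String path = "/contract/check";
        byte[] payload = {1, 2, 3};
        backend.write(path, payload);
        if (!backend.exists(path)) throw new AssertionError("write did not create " + path);
        if (!Arrays.equals(payload, backend.read(path))) throw new AssertionError("read mismatch");
        backend.delete(path);
        if (backend.exists(path)) throw new AssertionError("delete left " + path);
    }
}

class StorageSketch {
    public static void main(String[] args) throws IOException {
        // Both names map to the in-memory stand-in in this sketch.
        StorageBackends.register("hdfs", InMemoryBackend::new);
        StorageBackends.register("s3", InMemoryBackend::new);
        StorageContract.verify(StorageBackends.create(null)); // default: hdfs
        StorageContract.verify(StorageBackends.create("s3"));
    }
}
```

Running the same contract check against every registered name is one way to cover steps 5 and 6 with a single test suite. Replication and failover from step 1 are left out here because their shape depends on what each backend actually guarantees.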
