If this works it will be great. There might also be interest in creating a
Lustre plugin.

Regards,
-Jeremy

On Tue, Jul 27, 2021 at 01:15:40PM -0400, Jeff Kubina wrote:
> All,
>
> Some of AWS's back-end services use a version of Accumulo modified to use
> Amazon's S3 as its storage system. Amazon engineers forked Accumulo 2.0 and
> merged that S3 support into it <https://github.com/cmilbert/accumulo/>.
> Chris Milbert is the lead Amazon engineer who did the integration. Chris
> and I would like to jump start the conversation about how best to initiate
> the pull request for these changes into Accumulo 2.1.
>
> Mike Wall suggested using this as an opportunity to abstract out the
> storage system of Accumulo and make it pluggable. He suggested the
> following broad steps:
>
>    1. Identify all the things HDFS provides, such as read, write,
>    replication and failover.
>    2. Abstract out a file system interface with hooks for all those things
>    (and that does not require loading Hadoop jars).
>    3. Plug in HDFS as the default implementation of that interface, hiding
>    all Hadoop jars there.
>    4. Make another implementation that plugs in S3 and make it optionally
>    configurable.
>    5. Run tests to make sure we didn't break things with HDFS.
>    6. Run tests to see if S3 meets all the requirements.
>
> Ed Coleman also suggested first forking Accumulo 2.1 and merging the S3
> changes into it.
>
> Chris and I look forward to the discussion on how best to add S3 support to
> Accumulo.
>
> Thanks,
> Jeff
> --
> Jeff Kubina
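
To make the broad steps quoted above a bit more concrete, here is a minimal
Java sketch of what a pluggable storage interface might look like. Everything
in it is an assumption for illustration only: the names VolumeStore,
InMemoryStore and VolumeStores, and the "storage.backend" property key, are
hypothetical and are not part of Accumulo or of the fork linked above. Both
backends are in-memory stand-ins, so no Hadoop or AWS jars are loaded; a real
HDFS or S3 implementation would live behind the same interface.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Steps 1-2 (hypothetical): the operations the rest of the system needs
// from storage, with no Hadoop types in any signature.
interface VolumeStore {
    void write(String path, byte[] data);
    byte[] read(String path);
    boolean exists(String path);
    String scheme();
}

// In-memory stand-in used here for both backends. A real implementation
// would wrap an HDFS or S3 client and keep its jars out of the core.
class InMemoryStore implements VolumeStore {
    private final String scheme;
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    InMemoryStore(String scheme) {
        this.scheme = scheme;
    }

    @Override
    public void write(String path, byte[] data) {
        files.put(path, data.clone());
    }

    @Override
    public byte[] read(String path) {
        byte[] data = files.get(path);
        if (data == null) {
            throw new UncheckedIOException(new FileNotFoundException(path));
        }
        return data.clone();
    }

    @Override
    public boolean exists(String path) {
        return files.containsKey(path);
    }

    @Override
    public String scheme() {
        return scheme;
    }
}

// Steps 3-4 (hypothetical): HDFS is the default, S3 is opt-in via config.
class VolumeStores {
    static VolumeStore fromConfig(Map<String, String> config) {
        String backend = config.getOrDefault("storage.backend", "hdfs");
        switch (backend) {
            case "hdfs":
                return new InMemoryStore("hdfs");
            case "s3":
                return new InMemoryStore("s3");
            default:
                throw new IllegalArgumentException(
                    "Unknown storage backend: " + backend);
        }
    }
}

class StorageSketch {
    public static void main(String[] args) {
        VolumeStore store =
            VolumeStores.fromConfig(Map.of("storage.backend", "s3"));
        store.write("/tables/1/f0.rf", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(store.scheme() + " " + store.exists("/tables/1/f0.rf"));
    }
}
```

With a layout along these lines, steps 5 and 6 could share one contract test
suite that is run once per backend, so the same read/write expectations are
checked against HDFS and S3 alike.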
