[ 
https://issues.apache.org/jira/browse/HADOOP-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859063#comment-16859063
 ] 

Andrew Purtell commented on HADOOP-16355:
-----------------------------------------

Let's make sure just this part can be backported to branch-2.9 (unless use of 
S3Guard is strongly contraindicated on that branch).

> ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store
> --------------------------------------------------------------
>
>                 Key: HADOOP-16355
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16355
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs
>            Reporter: Mingliang Liu
>            Priority: Major
>
> When S3Guard was proposed, there were several valid reasons to choose 
> DynamoDB as its default backend store: 0) seamless integration with the AWS 
> ecosystem, e.g. the client library; 1) it is a managed web service with 
> near-zero operational cost, high availability, and effectively unlimited 
> scalability; 2) it is performant, with single-digit millisecond latency; 
> 3) it has been proven by Netflix's S3mper (not actively maintained) and 
> EMRFS (closed source and usage). Because the store is pluggable, 
> {{MetadataStore}} can be implemented on other backends, beyond the existing 
> null and in-memory local ones, without changing semantics.
> Here we propose {{ZookeeperMetadataStore}}, which uses ZooKeeper as the S3Guard 
> backend store. Its main motivation is to provide a new MetadataStore option 
> which:
>  # can be easily integrated, as ZooKeeper is heavily used in the Hadoop community
>  # offers acceptable performance, as both the client and the ZooKeeper ensemble 
> are usually "local" in a Hadoop cluster (ZK/HBase/Hive etc.)
>  # removes the DynamoDB dependency
> Obviously, not all use cases will prefer this to the default DynamoDB store. 
> For example, ZK might not scale well if there are dozens of S3 buckets and 
> each has millions of objects. Our use case targets HBase storing HFiles on S3 
> instead of HDFS. A complete solution for HBase on S3 needs both HBOSS (see 
> HBASE-22149), for recovering atomicity of metadata operations like rename, and 
> S3Guard, for consistent enumeration of and access to object store bucket 
> metadata. We would like to use ZooKeeper as the backend store for both.
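
To make the pluggability point in the description concrete, here is a minimal sketch, not the proposed implementation. It assumes a simplified, hypothetical interface (`SketchMetadataStore`, which is not Hadoop's real `MetadataStore` API) and a hypothetical `/s3guard/<bucket>/<key>` node layout that the issue does not specify. An in-memory sorted map stands in for the ZooKeeper ensemble so the example runs without any dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical, simplified store contract; NOT Hadoop's real MetadataStore API. */
interface SketchMetadataStore {
  void put(String bucket, String key, long length);
  Long get(String bucket, String key);                      // null if unknown
  List<String> listChildren(String bucket, String dirKey);  // direct children only
}

/** In-memory store keyed by znode-style paths, standing in for a ZooKeeper ensemble. */
class TreeBackedStore implements SketchMetadataStore {
  // Assumed root node; the proposal does not define a znode layout.
  static final String ROOT = "/s3guard";
  private final TreeMap<String, Long> nodes = new TreeMap<>();

  /** Maps an S3 bucket and object key to a hierarchical node path. */
  static String toNodePath(String bucket, String key) {
    // Strip leading/trailing slashes so "/a/b/" and "a/b" map to the same node.
    String trimmed = key.replaceAll("^/+|/+$", "");
    return trimmed.isEmpty() ? ROOT + "/" + bucket : ROOT + "/" + bucket + "/" + trimmed;
  }

  @Override
  public void put(String bucket, String key, long length) {
    nodes.put(toNodePath(bucket, key), length);
  }

  @Override
  public Long get(String bucket, String key) {
    return nodes.get(toNodePath(bucket, key));
  }

  @Override
  public List<String> listChildren(String bucket, String dirKey) {
    String prefix = toNodePath(bucket, dirKey) + "/";
    List<String> children = new ArrayList<>();
    // Keys sharing the prefix are contiguous in sorted order, so stop at the first miss.
    for (String path : nodes.tailMap(prefix, true).keySet()) {
      if (!path.startsWith(prefix)) {
        break;
      }
      String rest = path.substring(prefix.length());
      if (!rest.contains("/")) {
        children.add(rest);  // skip entries nested deeper than one level
      }
    }
    return children;
  }

  public static void main(String[] args) {
    SketchMetadataStore store = new TreeBackedStore();
    store.put("bucket", "hbase/data/f1", 10L);
    store.put("bucket", "hbase/data/f2", 20L);
    System.out.println(store.listChildren("bucket", "hbase/data"));
  }
}
```

Callers depend only on the interface, so a ZooKeeper-backed class could replace `TreeBackedStore` without changing them. A real implementation would also have to deal with ZooKeeper's per-znode data size limit and with the scaling concern raised in the description.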



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
