[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316202#comment-15316202 ]
Lei (Eddy) Xu commented on HDFS-7240:
-------------------------------------

Hi, [~anu]

Thanks a lot for organizing the meeting. I also have a few questions that I hope can be answered in the meeting:

* Since Ozone has decided to use range partitioning, how would key / data distribution achieve balance from the initial state? For example, if a user Foo runs Hive and creates 10GB of data, is this data distributed to up to 6 (containers) DNs?
* Would you explain the benefit of recovering a failed pipeline by using parallel writes to all 3 containers? It is not very clear in the design.
* It seems to me that in the new pipeline in Ozone, there are no multiple intermediate states for each chunk?
bq. due to immutability of chunks write chunk is an idempotent operation
How does Ozone differentiate a recovery write from a malicious (or buggy) re-write?
* You mentioned that the KMS/SCM separation is for future scalability. Do KMS / SCM maintain a 1:1, 1:n or n:m relationship? Though it is not in this phase, I'd like to know whether it has been considered. Btw, are they also Raft replicated?
* Is the Raft ring / leader per-container?
* For the pipeline, say we have a pipeline A->B->C. If the data is written successfully on A->B, and the metadata Raft writes succeed on B,C, IIUC, that is a What would be the result for a read request sent to A or C?
* How are container splits (merges, migrations) handled during writes?
* Since container size is determined by space usage instead of # of keys, would that result in large performance variance for list operations? {{# of DN reached for a list operation = total # of keys / (# of keys per container)}}, and # of keys per container is determined by the average object size in the container.

Thanks.
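To make the last question concrete, here is a rough sketch of the arithmetic in the formula above. It counts containers as a stand-in for DNs reached, and it assumes a hypothetical 5 GiB container size and a made-up function name; neither comes from the design document.

```python
import math


def containers_touched_by_list(total_keys: int,
                               container_size_bytes: int,
                               avg_object_size_bytes: int) -> int:
    """Estimate how many containers a full listing would touch.

    Implements: total # of keys / (# of keys per container), where
    # of keys per container = container size / average object size.
    All parameters are illustrative assumptions, not Ozone settings.
    """
    # A container always holds at least one key in this simple model.
    keys_per_container = max(1, container_size_bytes // avg_object_size_bytes)
    return math.ceil(total_keys / keys_per_container)


GIB = 1024 ** 3
MIB = 1024 ** 2

# Same number of keys, very different fan-out depending on object size.
small_objects = containers_touched_by_list(1_000_000, 5 * GIB, 1024)
large_objects = containers_touched_by_list(1_000_000, 5 * GIB, 100 * MIB)
print(small_objects, large_objects)
```

Under these assumed numbers, listing one million 1 KiB objects stays within a single container, while listing one million 100 MiB objects spans many thousands of containers, which is the variance the question is asking about.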
> Object store in HDFS
> --------------------
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS.
> As part of the federation work (HDFS-1052) we separated block storage as a
> generic storage layer. Using the Block Pool abstraction, new kinds of
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.