[ https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314609#comment-15314609 ]
Zhe Zhang commented on HDFS-9806:
---------------------------------

Thanks for posting the design and PoC [~chris.douglas]! It's really exciting to see this work moving forward. A few questions / comments about the current design doc:

# Having a {{PROVIDED}} storage type is an interesting idea. There are a few tricky issues, though. How should we update the over-replication logic to work with caching? If the replication factor is 1 and a {{PROVIDED}} block is cached by a DN, the NN will try to remove the excess replica, right? If we specify a replication factor > 1, the NN will always try to create DN-local replicas, which is probably not what we want from opportunistic caching logic. How should we specify a preference for caching on SSD vs. HDD? How do {{Mover}} and {{Balancer}} fit in?
# bq. blocks in the PROVIDED storage type are not included by any Datanode as part of its block report.
Does a DN still report connectivity to the {{PROVIDED}} store to the NN at each BR? I guess an alternative is for the NN itself to periodically check connectivity.
# Per section 3.4, I think the NN also needs a "PROVIDED store client" anyway, right?
bq. Data and metadata in the external store can change out-of-band (e.g., daily log data uploaded).
This would be a tricky case to handle. How are directories persisted in the external store? Consider the case below:
#* An empty HDFS cluster is built on WASB (only {{/}})
#* {{mkdir /data}} is issued through HDFS. The metadata should be persisted in WASB in some form, right?
#* {{/data/log1.txt}} is uploaded by some other WASB client (not the HDFS on top of it)
#* {{ls /data}} is done through HDFS. I guess the HDFS NN can check the WASB data structure for {{/data}} and pick up the update.
#* What about when another directory {{/jobs}} is created through another WASB client? Are we assuming HDFS has created a data structure in WASB to track the root dir {{/}}?
# I think more details could be added to Section 2 for clarification.
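To make the over-replication question in item 1 concrete, below is a minimal, self-contained sketch contrasting two policies the NN's replication monitor could adopt when a {{PROVIDED}} replica exists. This is not the actual {{BlockManager}} code; all class and method names are hypothetical.

```java
// Hypothetical sketch: how the NN could count a PROVIDED replica when
// deciding excess replicas. Illustrative only, not actual HDFS code.
public class ProvidedReplicaPolicy {

  // Policy 1: a cached DN copy counts like any other replica. With
  // replication = 1 and one PROVIDED copy, the cached DN copy becomes
  // "excess" and would be scheduled for deletion -- the problem raised above.
  static int excessIfCacheCounts(int replication, int dnReplicas,
                                 boolean hasProvided) {
    int total = dnReplicas + (hasProvided ? 1 : 0);
    return Math.max(0, total - replication);
  }

  // Policy 2: the PROVIDED copy alone satisfies the replication target;
  // DN-local copies are an opportunistic cache, reclaimed by a cache
  // eviction policy rather than by the over-replication logic.
  static int excessIfCacheExempt(int replication, int dnReplicas,
                                 boolean hasProvided) {
    if (hasProvided) {
      return 0; // cached DN copies are never "excess"
    }
    return Math.max(0, dnReplicas - replication);
  }
}
```

Under policy 1, replication = 1 plus one cached DN copy yields one excess replica; under policy 2 the cached copy survives, which matches the opportunistic-caching behavior the question is after.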
In particular, per the above comment, is this work mainly intended for "using a big external store to back a single smaller HDFS"? Or is the above "out-of-band update" use case also important? Would it be better to have a phase 1 for the single-HDFS use case (no other updates to the external store)?

> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>
>                 Key: HDFS-9806
>                 URL: https://issues.apache.org/jira/browse/HDFS-9806
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Chris Douglas
>         Attachments: HDFS-9806-design.001.pdf
>
> In addition to heterogeneous media, many applications work with heterogeneous
> storage systems. The guarantees and semantics provided by these systems are
> often similar, but not identical to those of
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
> Any client accessing multiple storage systems is responsible for reasoning
> about each system independently, and must propagate and renew credentials for
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to
> immutable file regions, opaque IDs, or other tokens that represent a
> consistent view of the data. While correctness for arbitrary operations
> requires careful coordination between stores, in practice we can provide
> workable semantics with weaker guarantees.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
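As one possible reading of the "immutable file regions" idea in the issue description, a {{PROVIDED}} block could carry an alias into the external store. A hedged sketch follows; the class, its fields, and the example URI are all hypothetical, not part of the posted design.

```java
// Hypothetical sketch of a block alias into an external store, per the
// "immutable file regions" idea in the issue description. Illustrative only.
public class ProvidedBlockAlias {
  final String storeUri; // e.g. a WASB path; hypothetical example
  final long offset;     // byte offset of the block within the remote file
  final long length;     // block length in bytes

  ProvidedBlockAlias(String storeUri, long offset, long length) {
    this.storeUri = storeUri;
    this.offset = offset;
    this.length = length;
  }

  // Map the i-th HDFS block of a remote file to its immutable region.
  static ProvidedBlockAlias aliasFor(String uri, long fileLen,
                                     long blockSize, int blockIndex) {
    long offset = blockSize * (long) blockIndex;
    if (offset >= fileLen) {
      throw new IllegalArgumentException("block index past EOF");
    }
    return new ProvidedBlockAlias(uri, offset,
        Math.min(blockSize, fileLen - offset));
  }
}
```

For example, a 300-byte remote file with a 128-byte block size maps block 2 to offset 256 with length 44; only the final block is short.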