[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078421#comment-16078421 ]
Rakesh R commented on HDFS-12090:
---------------------------------

Thanks for posting the design doc. It looks really nice! Some comments/questions:
# {quote} The backup starts only when the StoragePolicy of the specified hdfs directory (or any subtree within) is set to include the PROVIDED StorageType. {quote} So, does this mean there is an ordering between these operations? What if the directory already has the PROVIDED storage policy set, and the command to create the mount point is issued afterwards? I assume the movement will be triggered once the user invokes the HDFS-10285 {{dfs#satisfyStoragePolicy}} API. In that case, before calling the satisfy API, the user has to set the storage policy and create the mount, which can be done in either order, right?
# {quote} Set the StoragePolicy of hdfs://data/2016/jan/ to DISK:2, PROVIDED:1. This starts backing up data in hdfs://data/2016/jan/ {quote} How do you handle newly written files? While writing data to a file, the client will respect the storage policy and do pipeline writes to the provided store. I'm confused about how -backup differs from -ephemeral for such writes.
# {quote} For FileSystems that share the semantics of directories, permissions etc. with HDFS, backing up metadata involves traversing the subtree under hdfs://srcPath (recursively) and mkdir all the directories on the PROVIDED store, with the necessary permissions. {quote} Will this {{backing up metadata}} be the responsibility of the system admin before triggering the backup call?
# {quote} We note that irrespective of where in the write process the C-DN fails (i.e., which step in Figure 1 or 2 it fails), re-executing the backup operation (while potentially wasteful) will overwrite the earlier data (in the PROVIDED store or AliasMap), and will eventually result in a successful backup. {quote} The idea looks good. {{BackupManager}} coordinates backing up of individual files, which involves a sequence of steps: step-1: create the file metadata.
step-2: write the individual blocks (sequentially or in parallel, via concat_temp_files/append/multipart etc.) to the PROVIDED store. step-3: perform concat/finalize_multipart etc. These steps have to be performed atomically, and a failure in between shouldn't leave a non-recoverable state. Overwriting is a simple approach, but I'm afraid it may not work with all external provided stores. For example, if some external store doesn't allow overwrites, then our retry logic has to first delete the blob and then write it back. In that case, we may need to expose interfaces to plug in vendor-specific logic.
# {quote} the blob can be named using the absolute path name of the file – for example, a blob for file foo.txt under /users/user1/home/ directory can be named /users/user1/home/foo.txt (This is the convention used in various blob stores today to represent namespaces). {quote} I'm just adding a corner case, but it may happen: the same provided store could be used by two HDFS clusters in which the same path exists. So, the admin should be careful while configuring the same provided store for different clusters.
# {quote} Updates to data from HDFS can happen either (a) synchronously (write-through) or (b) lazily (write-back). Writing data synchronously to the PROVIDED store can be done as an extension to the existing write pipeline: one of the 3 datanodes in the pipeline writes to the PROVIDED store as part of the pipeline {quote} I'd prefer the lazy write-back. We ([~rakeshr]/[~umamaheswararao]) have tested the {{s3-fuse}} approach and observed high latency during write ops, which resulted in many client socket timeout exceptions due to the slowness of the external cold store.
# {quote} To unmount a PROVIDED store, the administrator can issue the following command: {quote} Many sub-directories would have the PROVIDED storage policy set, which would result in failures during write/append ops.
Maybe we could also think about traversing and resetting all the directories' storage policies to the default, or providing fallback storages for replication. The same problem would also occur if somebody sets the PROVIDED storage policy without a mount point and then performs pipeline writes.
# {quote}Changes to SPS.{quote} Presently, SPS doesn't do a recursive directory scan to satisfy sub-trees; we intentionally kept the block movement simple. A user can still iteratively scan the subtree and invoke the {{dfs#satisfyStoragePolicy}} API, if needed. I assume this proposal expects a recursive traversal of the subtree, right? If yes, we can capture this task as well and discuss ways to extend SPS without much overhead.
# {{Adding a point about EC files}} - Presently, the supported storage policies for EC files are All_SSD, Hot and Cold. Since an EC file has data + parity blocks, we may need to treat EC as a special case and introduce a new policy, then back up only the EC {{data}} blocks to the {{PROVIDED}} storage type and skip the {{parity}} blocks.

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in
> external storage systems accessible through HDFS. However, HDFS-9806 is
> limited to data being read through HDFS. This JIRA will deal with how data
> can be written to such {{PROVIDED}} storages from HDFS.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
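P.S. Sketching the delete-then-write retry suggested in point 4: the {{ProvidedStore}} interface below is purely hypothetical (real stores such as S3 or WASB differ, which is exactly why a pluggable vendor hook may be needed); it only illustrates how re-executing a failed backup stays idempotent even on a store that rejects overwrites.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store interface (illustrative only; not from the design doc). */
interface ProvidedStore {
  boolean exists(String blob);
  void delete(String blob);
  /** Assumed to fail if the blob already exists (no overwrite support). */
  void create(String blob, byte[] data);
}

/** Delete-then-write retry: clears any partial blob left by an earlier
 *  failed attempt before writing, so re-executing the backup is idempotent. */
class IdempotentBackup {
  static void backupBlob(ProvidedStore store, String blob, byte[] data) {
    if (store.exists(blob)) {
      store.delete(blob); // remove partial data from a failed attempt
    }
    store.create(blob, data);
  }
}

/** Minimal in-memory fake standing in for an external store. */
class InMemoryStore implements ProvidedStore {
  final Map<String, byte[]> blobs = new HashMap<>();
  public boolean exists(String blob) { return blobs.containsKey(blob); }
  public void delete(String blob) { blobs.remove(blob); }
  public void create(String blob, byte[] data) {
    if (blobs.containsKey(blob)) {
      throw new IllegalStateException("overwrite not supported: " + blob);
    }
    blobs.put(blob, data);
  }
}
```

With this shape, the {{BackupManager}} retry path never needs to know whether the underlying store supports overwrites; the vendor-specific behaviour stays behind the interface.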