[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078421#comment-16078421 ]
Rakesh R commented on HDFS-12090:
---------------------------------

Thanks for posting the design doc. It looks really nice! Some comments/questions:
# {quote} The backup starts only when the StoragePolicy of the specified hdfs directory (or any subtree within) is set to include the PROVIDED StorageType. {quote} So, does this mean there is an ordering between these operations? What if the directory already has the PROVIDED storage policy set, and the command to create the mount point is issued afterwards? I assume the movement will be triggered once the user invokes the HDFS-10285 {{dfs#satisfyStoragePolicy}} API. In that case, before calling the satisfy API, the user has to set the storage policy and create the mount, which can be done in either order, right?
# {quote} Set the StoragePolicy of hdfs://data/2016/jan/ to DISK:2, PROVIDED:1. This starts backing up data in hdfs://data/2016/jan/ {quote} How do you handle newly written files? While writing data to a file, the client will respect the storage policy and do pipeline writes to the provided store. I'm confused about how -backup differs from -ephemeral for such writes.
# {quote} For FileSystems that share the semantics of directories, permissions etc. with HDFS, backing up metadata involves traversing the subtree under hdfs://srcPath (recursively) and mkdir all the directories on the PROVIDED store, with the necessary permissions. {quote} Will this {{backing up metadata}} be the responsibility of the system admin before triggering the backup call?
# {quote} We note that irrespective of where in the write process the C-DN fails (i.e., which step in Figure 1 or 2 it fails), re-executing the backup operation (while potentially wasteful) will overwrite the earlier data (in the PROVIDED store or AliasMap), and will eventually result in a successful backup. {quote} The idea looks good. {{BackupManager}} coordinates backing up of individual files, which involves a sequence of steps: step-1: create the file metadata.
step-2: write the individual blocks (sequentially or in parallel, via concat_temp_files/append/multipart etc.) to the PROVIDED store. step-3: perform concat/finalize_multipart etc. These steps have to be performed atomically, and a failure in between shouldn't leave a non-recoverable state. Overwriting is a simple approach, but I'm afraid it may not work with all external provided stores. For example, if some external store doesn't allow overwrites, then our retry logic has to first delete the blob and then write it back. In that case, we may need to expose interfaces to plug in vendor-specific logic.
# {quote} the blob can be named using the absolute path name of the file – for example, a blob for file foo.txt under /users/user1/home/ directory can be named /users/user1/home/foo.txt (This is the convention used in various blob stores today to represent namespaces). {quote} I'm just adding a corner case, but it may happen: the same provided store could be used by two HDFS clusters in which the same path exists. So, the admin should be careful while configuring the same provided store for different clusters.
# {quote} Updates to data from HDFS can happen either (a) synchronously (write-through) or (b) lazily (write-back). Writing data synchronously to the PROVIDED store can be done as an extension to the existing write pipeline: one of the 3 datanodes in the pipeline writes to the PROVIDED store as part of the pipeline {quote} I'd prefer the lazy write-back. We ([~rakeshr]/[~umamaheswararao]) have tested the {{s3-fuse}} approach and observed high latency during write ops, which resulted in many client socket timeout exceptions due to the slowness of the external cold store.
# {quote} To unmount a PROVIDED store, the administrator can issue the following command: {quote} Many sub-directories would have the PROVIDED storage policy set, which would result in failures during write/append ops.
Maybe we could also think about traversing and resetting all the directories' storage policies to the default, or providing fallback storages for replication. The same problem would also occur if somebody sets the PROVIDED storage policy without a mount point and then performs pipeline writes.
# {quote}Changes to SPS.{quote} Presently, SPS doesn't do a recursive directory scan to satisfy sub-trees; we intentionally kept the block movement simple. A user can still iteratively scan the subtree and invoke the {{dfs#satisfyStoragePolicy}} API, if needed. I assume this proposal expects a recursive traversal of the subtree, right? If yes, we can capture this task as well and discuss ways to extend SPS without much overhead.
# {{Adding a point about EC files}} - Presently, the supported storage policies for EC files are All_SSD, Hot and Cold. Since an EC file has data + parity blocks, we may need to treat EC as a special case and introduce a new policy, then back up only the EC {{data}} blocks to the {{PROVIDED}} storage type and skip the {{parity}} blocks.

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in
> external storage systems accessible through HDFS. However, HDFS-9806 is
> limited to data being read through HDFS. This JIRA will deal with how data
> can be written to such {{PROVIDED}} storages from HDFS.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
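P.S. Sketching the delete-then-write retry suggested in point 4: the {{ProvidedStore}} interface below is purely hypothetical (real stores such as S3 or WASB differ, which is exactly why a pluggable vendor hook may be needed); it only illustrates how re-executing a failed backup stays idempotent even on a store that rejects overwrites.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store interface (illustrative only; not from the design doc). */
interface ProvidedStore {
  boolean exists(String blob);
  void delete(String blob);
  /** Assumed to fail if the blob already exists (no overwrite support). */
  void create(String blob, byte[] data);
}

/** Delete-then-write retry: clears any partial blob left by an earlier
 *  failed attempt before writing, so re-executing the backup is idempotent. */
class IdempotentBackup {
  static void backupBlob(ProvidedStore store, String blob, byte[] data) {
    if (store.exists(blob)) {
      store.delete(blob); // remove partial data from a failed attempt
    }
    store.create(blob, data);
  }
}

/** Minimal in-memory fake standing in for an external store. */
class InMemoryStore implements ProvidedStore {
  final Map<String, byte[]> blobs = new HashMap<>();
  public boolean exists(String blob) { return blobs.containsKey(blob); }
  public void delete(String blob) { blobs.remove(blob); }
  public void create(String blob, byte[] data) {
    if (blobs.containsKey(blob)) {
      throw new IllegalStateException("overwrite not supported: " + blob);
    }
    blobs.put(blob, data);
  }
}
```

With this shape, the {{BackupManager}} retry path never needs to know whether the underlying store supports overwrites; the vendor-specific behaviour stays behind the interface.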