[ https://issues.apache.org/jira/browse/HDFS-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972384#comment-16972384 ]

Wei-Chiu Chuang commented on HDFS-14978:
----------------------------------------

bq. What is the client behavior during the CAS operation OP_SWAP_BLOCK_LIST
This operation is atomic. Semantically, it is similar to truncating the file to 
zero length and then appending erasure-coded blocks to it. 
Assuming neither file is open: a getBlockLocations() call for $src prior to 
swapBlockList() returns the replicated block list. Once a client has the 
located block list, it has the block tokens too, so it should be able to keep 
reading without problems even though the namespace has changed.
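
A minimal sketch of that read path through the public FileSystem API (the path 
and read loop are illustrative; OP_SWAP_BLOCK_LIST / swapBlockList() is the 
proposed operation, not an existing client call):
{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadAcrossSwap {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path src = new Path("/data/cold/part-0"); // hypothetical path

    // open() fetches the located blocks (and their block tokens) for
    // the replicated file up front.
    try (FSDataInputStream in = fs.open(src)) {
      // If the proposed swap replaces $src's block list now, this
      // stream keeps reading the replicated blocks it already holds
      // tokens for; only a fresh open() would see the EC blocks.
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) > 0) {
        // consume buf[0..n)
      }
    }
  }
}
{noformat}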

> In-place Erasure Coding Conversion
> ----------------------------------
>
>                 Key: HDFS-14978
>                 URL: https://issues.apache.org/jira/browse/HDFS-14978
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: In-place Erasure Coding Conversion.pdf
>
>
> HDFS Erasure Coding is a new feature added in Apache Hadoop 3.0. It uses 
> encoding algorithms to reduce disk space usage while retaining the redundancy 
> necessary for data recovery. It was a huge amount of work, but it is only now 
> getting adopted, almost 2 years later.
> One usability problem that’s blocking users from adopting HDFS Erasure Coding 
> is that existing replicated files have to be copied to an EC-enabled 
> directory explicitly. Renaming a file/directory to an EC-enabled directory 
> does not automatically convert the blocks. Therefore users typically perform 
> the following steps to erasure-code existing files:
> {noformat}
> hdfs dfs -mkdir $tmp                                # create $tmp directory
> hdfs ec -setPolicy -path $tmp -policy RS-6-3-1024k  # set an EC policy on it
> hadoop distcp $src $tmp                             # distcp $src to $tmp
> hdfs dfs -rm -r $src                                # delete $src
> hdfs dfs -mv $tmp $src                              # mv $tmp $src
> {noformat}
> There are several reasons why this is not popular:
> * Complex. The process involves several steps: distcp the data to a temporary 
> destination; delete the source file; move the destination to the source path.
> * Availability: there is a short period where nothing exists at the source 
> path, and jobs may fail unexpectedly.
> * Overhead. During the copy phase, there is a point in time where the source 
> and destination files all exist at once, potentially exhausting disk space.
> * Not snapshot-friendly. If a snapshot is taken prior to performing the 
> conversion, the source (replicated) files are preserved in the cluster 
> too. Therefore, the conversion actually increases storage space usage.
> * Not management-friendly. This approach changes the file's inode number, 
> modification time and access time. Erasure-coded files are supposed to store 
> cold data, but this conversion makes the data “hot” again.
> * Bulky. It’s either all or nothing. The directory may already be partially 
> erasure-coded, but this approach simply erasure-codes everything again.
> To ease data management, we should offer a utility tool to convert replicated 
> files to erasure-coded files in-place.
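> As a rough sketch, the copy-based workaround above looks like this through the 
> Java API; paths and the policy choice are illustrative, and the copy step is 
> elided. Note the window between delete() and rename() where nothing exists at 
> the source path:
> {noformat}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.DistributedFileSystem;
>
> public class CopyBasedConversion {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     DistributedFileSystem dfs =
>         (DistributedFileSystem) new Path("/").getFileSystem(conf);
>
>     Path src = new Path("/data/cold");     // hypothetical source
>     Path tmp = new Path("/data/.cold.ec"); // hypothetical temp dir
>
>     dfs.mkdirs(tmp);
>     dfs.setErasureCodingPolicy(tmp, "RS-6-3-1024k"); // a built-in policy
>     // ... copy the data from src into tmp here, e.g. via a distcp job ...
>
>     dfs.delete(src, true); // availability gap opens: nothing at src
>     dfs.rename(tmp, src);  // gap closes; inode, mtime and atime changed
>   }
> }
> {noformat}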


