[ https://issues.apache.org/jira/browse/HDFS-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202419#comment-15202419 ]
Zhe Zhang commented on HDFS-8030:
---------------------------------

Thanks [~umamaheswararao] for the helpful feedback!

bq. As this design tries to convert files into EC mode from normal file layout, Blockgroups needs to be created later when converting. But block groups generally we allocate continuous blockids, but here how do we make that continuous blockids when converting?
bq. Does this create overheads on memory as we need to track blockGroups separately and if the blockids are not continuous as discussed in #1

The current design does not assume continuous block IDs within a block group. We therefore incur additional memory overhead to store the mapping from each block group to its blocks, but this overhead is partially offset by the reduction in replicas.

Generating parity data in a streaming fashion sounds like a good idea.

I think contiguous EC will introduce new {{ErasureCodingPolicy}} instances, which will then be handled by the current {{ErasureCodingPolicy}} design:
{code}
Each individual directory can be configured with an EC policy with command `hdfs erasurecode -setPolicy`. When a file is created, it will inherit the EC policy from its nearest ancestor directory to determine how its blocks are stored.
{code}

> HDFS Erasure Coding Phase II -- EC with contiguous layout
> ---------------------------------------------------------
>
>                 Key: HDFS-8030
>                 URL: https://issues.apache.org/jira/browse/HDFS-8030
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFSErasureCodingPhaseII-20151204.pdf
>
>
> Data redundancy form -- replication or erasure coding, should be orthogonal
> to block layout -- contiguous or striped. This JIRA explores the combination
> of {{Erasure Coding}} + {{Contiguous}} block layout.
> As will be detailed in the design document, key benefits include preserving
> block locality, and easy conversion between hot and cold modes.
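
For readers following the policy discussion in the comment above, the nearest-ancestor inheritance rule it quotes can be sketched roughly as below. This is only an illustration under stated assumptions, not the HDFS implementation: the class {{EcPolicyLookup}}, its methods, and the policy name used in the usage note are all hypothetical, and policies are represented as plain strings rather than {{ErasureCodingPolicy}} objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolves a file's EC policy from its nearest
// ancestor directory that has one configured. Not an HDFS API.
public class EcPolicyLookup {
    // Directory path -> policy name (names are illustrative strings).
    private final Map<String, String> policyByDir = new HashMap<>();

    // Analogous in spirit to configuring a directory with a policy.
    public void setPolicy(String dir, String policyName) {
        policyByDir.put(normalize(dir), policyName);
    }

    // Walk up from the file's parent directory towards the root and
    // return the first policy found; empty if no ancestor has one.
    public Optional<String> resolve(String filePath) {
        String dir = parent(normalize(filePath));
        while (dir != null) {
            String policy = policyByDir.get(dir);
            if (policy != null) {
                return Optional.of(policy);
            }
            dir = parent(dir);
        }
        return Optional.empty();
    }

    // Require absolute paths and drop a trailing slash (except for "/").
    private static String normalize(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + path);
        }
        if (path.length() > 1 && path.endsWith("/")) {
            return path.substring(0, path.length() - 1);
        }
        return path;
    }

    // Parent of an absolute path; null once the root has been reached.
    private static String parent(String path) {
        if (path.equals("/")) {
            return null;
        }
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }
}
```

Under this sketch, after calling setPolicy("/cold", "RS-6-3-contiguous") (a made-up policy name), a file at /cold/logs/a.log resolves to that policy, while a file under a directory with no configured ancestor resolves to empty. A new contiguous policy would then need no special handling in the lookup itself, which is the point made in the comment.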
-- This message was sent by Atlassian JIRA (v6.3.4#6332)