[ 
https://issues.apache.org/jira/browse/HDFS-11382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886759#comment-15886759
 ] 

Andrew Wang commented on HDFS-11382:
------------------------------------

Thanks for the rev Manoj, LGTM overall, +1 pending these little nits:

* FSDirWriteFileOp: typo "replictaionFactor" in addFileForEditLog
* Unused import for Preconditions in INodeFileAttributes

[~ehiggs], my concern is that encoding whether a file is erasure coded in both 
the EC policy and the BlockTypeProto fields opens us up to possible incongruity 
between the two fields. Since I'm not proposing we do away with BlockType 
entirely, I double checked the Precondition checks we have in this patch, and 
it looks okay.

Also as an FYI, HDFS-8030 wants to implement "contiguous EC," so we need a JIRA 
to rename CONTIGUOUS to REPLICATED. I filed HDFS-11465 for this if you want to 
pick it up, should be pretty easy to do this refactoring with IDE assistance.


> Persist Erasure Coding Policy ID in a new optional field in INodeFile in 
> FSImage
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11382
>                 URL: https://issues.apache.org/jira/browse/HDFS-11382
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11382.01.patch, HDFS-11382.02.patch, 
> HDFS-11382.03.patch, HDFS-11382.04.patch
>
>
> For Erasure Coded files, replication field in INodeFile message is re-used  
> to store the EC Policy ID. 
> *FSDirWriteFileOp#addFile*
> {noformat}
>   private static INodesInPath addFile(
>       FSDirectory fsd, INodesInPath existing, byte[] localName,
>       PermissionStatus permissions, short replication, long 
> preferredBlockSize,
>       String clientName, String clientMachine)
>       throws IOException {
>     .. .. ..
>     try {
>       ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
>           getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
>       if (ecPolicy != null) {
>         replication = ecPolicy.getId();   <===
>       }
>       final BlockType blockType = ecPolicy != null?
>           BlockType.STRIPED : BlockType.CONTIGUOUS;
>       INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
>           modTime, modTime, replication, preferredBlockSize, blockType);
>       newNode.setLocalName(localName);
>       newNode.toUnderConstruction(clientName, clientMachine);
>       newiip = fsd.addINode(existing, newNode, permissions.getPermission());
> {noformat}
> With HDFS-11268 fix, {{FSImageFormatPBINode#Loader#loadInodeFile}} is rightly 
> getting the EC ID from the replication field and then uses the right Policy 
> to construct the blocks.
> *FSImageFormatPBINode#Loader#loadInodeFile*
> {noformat}
>       ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
>           ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
>           null;
> {noformat}
> The original intention was to re-use the replication field so the in-memory 
> representation would be compact. But, this isn't necessary for the on-disk 
> representation. replication is an optional field, and if we add another 
> optional field for the EC policy, it won't be any extra space.
> Also, we need to make sure to have the appropriate asserts in place to make 
> sure both fields aren't set for the same INodeField.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to