[ 
https://issues.apache.org/jira/browse/HDFS-11382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852374#comment-15852374
 ] 

Andrew Wang commented on HDFS-11382:
------------------------------------

Thanks for working on this Manoj! Overall looks good, some review comments:

* Noticed we now have a BlockTypeProto as well as erasureCodingPolicyID now in 
the INodeFile PB. The history here is that we used to have an {{isStriped}} 
boolean, which turned into the BlockTypeProto enum in HDFS-10759. Now that we 
have this optional field, it seems like BlockTypeProto is unnecessary. 
[~ehiggs], any thoughts on this?
* Related, I'm wondering if we should make {{isStriped}} just be {{return 
getBlockType() == STRIPED}}, looks equivalent to me.
* FSImageFormatPBINode, it'd be better to check if the policy is present with 
{{f.hasErasureCodingPolicyID}} rather than getting based on if the block type 
is striped. This way we don't accidentally get the default value of 0.
* INodeFile, I think having the combined {{replicationOrEcPolicyId}} parameter 
is confusing from the caller's perspective. It'd be better to have separate 
parameters that pass null or -1 if not set, and combine as appropriate in 
INodeFile.HeaderFormat#toLong (with requisite Precondition checks). This 
relates to the comment in toLong about this being hacky, I dug up this comment 
from Zhe during the initial review:

{quote}
The biggest concern is INodeFile constructor – related to that, the toLong 
method. Currently when isStriped, we just interpret replication as the EC 
policy ID. This looks pretty hacky. But it looks pretty tricky to fix. How 
about adding a TODO here and file a follow-on? I think what we need to do is to 
add some util methods to convert both 0 + replicationFactor and 1 + ecPolicy 
into a short typed variable, like blockLayoutRedudancy, and pass it to the 
constructor.
{quote}

Seems like this JIRA is a good time to fix the hack :)

* FSImageFormatPBInode#buildINodeFile, it looks like we unconditionally set the 
EC policy ID. Shouldn't we only set when there is a policy set, similar to the 
AclFeature and XAttrFeature? A unit test for this would be good.
* INodeFile#getFileReplication, IIRC S3 returns a replication factor of 1. 
Could you check this and use the same setting for our striped files? IMO 1 is 
the most reasonable value here, since each data block has a repl factor of one. 
The javadoc also needs to be updated. We also should check any for callers that 
assume this is the EC policy ID, if you have't yet.
* Is the PBXMLWriter change intended for this JIRA?

> Persist Erasure Coding Policy ID in a new optional field in INodeFile in 
> FSImage
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11382
>                 URL: https://issues.apache.org/jira/browse/HDFS-11382
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11382.01.patch
>
>
> For Erasure Coded files, replication field in INodeFile message is re-used  
> to store the EC Policy ID. 
> *FSDirWriteFileOp#addFile*
> {noformat}
>   private static INodesInPath addFile(
>       FSDirectory fsd, INodesInPath existing, byte[] localName,
>       PermissionStatus permissions, short replication, long 
> preferredBlockSize,
>       String clientName, String clientMachine)
>       throws IOException {
>     .. .. ..
>     try {
>       ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
>           getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
>       if (ecPolicy != null) {
>         replication = ecPolicy.getId();   <===
>       }
>       final BlockType blockType = ecPolicy != null?
>           BlockType.STRIPED : BlockType.CONTIGUOUS;
>       INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
>           modTime, modTime, replication, preferredBlockSize, blockType);
>       newNode.setLocalName(localName);
>       newNode.toUnderConstruction(clientName, clientMachine);
>       newiip = fsd.addINode(existing, newNode, permissions.getPermission());
> {noformat}
> With HDFS-11268 fix, {{FSImageFormatPBINode#Loader#loadInodeFile}} is rightly 
> getting the EC ID from the replication field and then uses the right Policy 
> to construct the blocks.
> *FSImageFormatPBINode#Loader#loadInodeFile*
> {noformat}
>       ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
>           ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
>           null;
> {noformat}
> The original intention was to re-use the replication field so the in-memory 
> representation would be compact. But, this isn't necessary for the on-disk 
> representation. replication is an optional field, and if we add another 
> optional field for the EC policy, it won't be any extra space.
> Also, we need to make sure to have the appropriate asserts in place to make 
> sure both fields aren't set for the same INodeField.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to