[jira] [Commented] (HDFS-11382) Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage

Andrew Wang (JIRA) Thu, 23 Feb 2017 17:03:06 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-11382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881673#comment-15881673
 ]


Andrew Wang commented on HDFS-11382:
------------------------------------

Hi Manoj, thanks for the rev! Looks like my last wave of comments were mostly 
addressed, a few follow-ups:

* FSDirWriteOp#addFileForEditLog and #addFile, would prefer that we set up the 
parameters for newINodeFile outside of the call, like how we set up 
{{blockType}}. This is a little cleaner.
* [~ehiggs] hasn't responded, but I'd like to remove {{blockType}} from the 
INodeFile PB definition since it looks extraneous to me, and is extra overhead. 
BlockType can remain as an in-memory abstraction.
* When reading and writing the fsimage, shouldn't we set replication XOR ec 
policy? This is in FSImageFormatPBINode. I'd also prefer we setup the 
parameters outside of the call, similar to above review comments
* FSImageFormatPBSnapshot, if we use the boxed Short and Byte, we can avoid the 
ternary checks right? Paranoia precondition checks are never a bad idea if you 
want to add some.
* INodeFile#DEFAULT_REPL_FOR_STRIPED_BLOCKS could use a javadoc comment, we can 
also explain why we chose "1" for the value
* It looks like we use {{getSysPoliciesMinID}} and {{MaxID}} to validate that 
the policy ID is valid; could we just check that {{getPolicyByPolicyID}} isn't 
null instead? I get that this is more efficient, but it's cleaner to use the 
helper function instead, and we can optimize the helper to use a hashtable 
later if this becomes a bottleneck.
* The new INodeFile header helpers look good, except it seems like we have some 
boiler plate now that callers need to do around blockType. I think what Zhe 
wanted was for callers to use a function like your {{getBlockLayoutRedundancy}} 
directly, which would in turn call into the more specific {{getContiguous}} and 
{{getStriped}} that would be the two halves of the {{blockType == STRIPED}} if 
statement currently in {{getBlockLayoutRedundancy}}. Hopefully that makes 
sense, I can provide a code snippet if it's unclear.

For the findbugs issue, we should add this to the hdfs findbugsExcludeFile.xml, 
and possibly a code comment near the signed OR for the future.

> Persist Erasure Coding Policy ID in a new optional field in INodeFile in 
> FSImage
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11382
>                 URL: https://issues.apache.org/jira/browse/HDFS-11382
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11382.01.patch, HDFS-11382.02.patch
>
>
> For Erasure Coded files, replication field in INodeFile message is re-used  
> to store the EC Policy ID. 
> *FSDirWriteFileOp#addFile*
> {noformat}
>   private static INodesInPath addFile(
>       FSDirectory fsd, INodesInPath existing, byte[] localName,
>       PermissionStatus permissions, short replication, long 
> preferredBlockSize,
>       String clientName, String clientMachine)
>       throws IOException {
>     .. .. ..
>     try {
>       ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
>           getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
>       if (ecPolicy != null) {
>         replication = ecPolicy.getId();   <===
>       }
>       final BlockType blockType = ecPolicy != null?
>           BlockType.STRIPED : BlockType.CONTIGUOUS;
>       INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
>           modTime, modTime, replication, preferredBlockSize, blockType);
>       newNode.setLocalName(localName);
>       newNode.toUnderConstruction(clientName, clientMachine);
>       newiip = fsd.addINode(existing, newNode, permissions.getPermission());
> {noformat}
> With HDFS-11268 fix, {{FSImageFormatPBINode#Loader#loadInodeFile}} is rightly 
> getting the EC ID from the replication field and then uses the right Policy 
> to construct the blocks.
> *FSImageFormatPBINode#Loader#loadInodeFile*
> {noformat}
>       ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
>           ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
>           null;
> {noformat}
> The original intention was to re-use the replication field so the in-memory 
> representation would be compact. But, this isn't necessary for the on-disk 
> representation. replication is an optional field, and if we add another 
> optional field for the EC policy, it won't be any extra space.
> Also, we need to make sure to have the appropriate asserts in place to make 
> sure both fields aren't set for the same INodeField.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-11382) Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage

Reply via email to