[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120237#comment-15120237
 ] 

Jing Zhao commented on HDFS-9260:
---------------------------------

Thanks for the great work, [~sfriberg]. The patch looks good overall. Some 
comments and questions:
# For {{DatanodeProtocol#blockReport}}, can we move the {{boolean sorted}} flag 
inside {{BlockReportContext}} and make its default value false? In this way we 
can keep backward compatibility between the DN and NN (though allowing an old 
DN to send reports to the NN can lead to bad performance).
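To illustrate, a minimal sketch (a hypothetical simplification, not the actual Hadoop class) of how a {{sorted}} flag defaulting to false inside {{BlockReportContext}} keeps old DataNodes compatible:

```java
// Hypothetical simplification of BlockReportContext: the "sorted" flag
// defaults to false, so reports from old DataNodes that never set it are
// still accepted and simply take the slower unsorted processing path.
class BlockReportContext {
    private final boolean sorted;

    // Old callers use this constructor and default to unsorted.
    BlockReportContext() {
        this(false);
    }

    BlockReportContext(boolean sorted) {
        this.sorted = sorted;
    }

    boolean isSorted() {
        return sorted;
    }
}
```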
# Need to update the javadoc of {{ReplicaMap#map}}.
# {{DatanodeStorageInfo#addBlockInitial(BlockInfo)}} has not been called.
# In {{BlockManager#removeBlocksAssociatedTo}} and {{removeZombieReplicas}}, 
it looks like we remove the block from the {{DatanodeStorageInfo}} twice. Can 
we modify {{removeStoredBlock}} to avoid the second remove operation?
{code}
    for (DatanodeStorageInfo storage : node.getStorageInfos()) {
      final Iterator<BlockInfo> it = storage.getBlockIterator();
      while (it.hasNext()) {
        BlockInfo block = it.next();
        // The block must be removed from the DatanodeStorageInfo using the
        // iterator to avoid ConcurrentModificationException in the
        // underlying storage
        it.remove();
        removeStoredBlock(block, node);
      }
    }
{code}
# In {{reportDiffSorted}}, since we already convert the replica ID at the 
beginning, we can pass {{replicaID}} in when we call {{getStoredBlock}}. 
Alternatively, we can call {{getStoredBlock}} directly at the beginning.
{code}
      if (BlockIdManager.isStripedBlockID(replicaID)
          && (!hasNonEcBlockUsingStripedID ||
              !blocksMap.containsBlock(replica))) {
        replicaID = BlockIdManager.convertToStripedID(replicaID);
      }
{code}
# The following TODO has been deleted from the current trunk.
{code}
    if (invalidateBlocks.contains(dn, replica)) {
      /*
       * TODO: following assertion is incorrect, see HDFS-2668 assert
       * storedBlock.findDatanode(dn) < 0 : "Block " + block +
       * " in recentInvalidatesSet should not appear in DN " + dn;
       */
      return;
    }
{code}
# It's better to avoid duplicated code between 
{{processAndHandleReportedBlock}} and {{reportDiffSortedInner}}. The logic for 
processing a reported block is complicated; having copies in two different 
places adds an extra maintenance burden.
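As a generic sketch of that refactoring (all names and the counter are hypothetical, not the patch's actual code), both entry points could delegate to a single shared routine:

```java
// Generic sketch of the suggested refactoring: both the incremental-report
// and full-report entry points delegate to one shared routine, so the
// reported-block handling logic lives in a single place.
class BlockReportProcessor {
    int sharedCalls = 0; // visible for the sketch; counts shared-path use

    void processAndHandleReportedBlock(String blockId) {
        // incremental block report entry point
        handleReportedBlockCommon(blockId);
    }

    void reportDiffSortedInner(String blockId) {
        // full block report entry point
        handleReportedBlockCommon(blockId);
    }

    private void handleReportedBlockCommon(String blockId) {
        // the single copy of the reported-block decision logic would live here
        sharedCalls++;
    }
}
```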
# In {{TreeSet#removeElementAt}}, the compaction currently only merges the 
node into next/prev. Do you also want to check whether we can merge next/prev 
into the current node if next/prev's size is 1? (It looks like Apache 
Harmony's implementation has this extra check.)
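A minimal sketch of that extra check (hypothetical {{Node}} type with fixed-capacity element arrays, not the actual patch code): besides folding the current node into a neighbor, a size-1 neighbor can be folded into the current node when there is room.

```java
// Hypothetical sketch: in addition to folding the current node into a
// neighbor, a neighbor holding a single element can be folded into the
// current node when the current node has spare capacity.
class Node {
    static final int CAPACITY = 64;
    Object[] entries = new Object[CAPACITY];
    int size;
    Node prev, next;

    void compactAfterRemoval() {
        if (next != null && next.size == 1 && size < CAPACITY) {
            // pull the single element of the next node into this node
            entries[size++] = next.entries[0];
            unlink(next);
        } else if (prev != null && prev.size == 1 && size < CAPACITY) {
            // shift right to keep sorted order, then prepend prev's element
            System.arraycopy(entries, 0, entries, 1, size);
            entries[0] = prev.entries[0];
            size++;
            unlink(prev);
        }
    }

    private void unlink(Node n) {
        if (n.prev != null) n.prev.next = n.next;
        if (n.next != null) n.next.prev = n.prev;
    }
}
```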
# Maybe we should have an upper bound on the total number of storage 
compactions done in each iteration, with the next iteration continuing from 
the stopping point of the previous one. Considering that calculating the fill 
ratio also needs to go through the whole tree, each compaction iteration will 
go through at least (total_number_blocks * replication_factor / 64) tree 
nodes. This can be a big workload, and even in the best scenario (i.e., no 
compaction is necessary) the read lock will still be held for a while. 
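A minimal sketch of bounded, resumable compaction (hypothetical names, not the patch's API): each pass visits at most a fixed number of positions, and the next pass resumes from a stored cursor so no single pass holds the lock for the whole tree.

```java
// Hypothetical sketch of bounded, resumable compaction: each pass visits at
// most maxPerPass positions and the next pass resumes where the previous
// pass stopped.
class BoundedCompactor {
    private int cursor = 0; // resume point carried between passes

    /** Visits at most maxPerPass positions; returns how many were visited. */
    int compactPass(int[] work, int maxPerPass) {
        int visited = 0;
        while (visited < maxPerPass && cursor < work.length) {
            // ... compact the node at work[cursor] here ...
            cursor++;
            visited++;
        }
        if (cursor >= work.length) {
            cursor = 0; // full sweep finished; start over next time
        }
        return visited;
    }
}
```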
# It would also be helpful if you could post performance numbers for the 
compaction.

> Improve performance and GC friendliness of startup and FBRs
> -----------------------------------------------------------
>
>                 Key: HDFS-9260
>                 URL: https://issues.apache.org/jira/browse/HDFS-9260
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, performance
>    Affects Versions: 2.7.1
>            Reporter: Staffan Friberg
>            Assignee: Staffan Friberg
>         Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFSBenchmarks.zip, 
> HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
