[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

Colin Patrick McCabe (JIRA) Tue, 26 Jan 2016 15:31:38 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118269#comment-15118269
 ]


Colin Patrick McCabe commented on HDFS-9260:
--------------------------------------------

bq. Should I convert storages field to private? (The triplets field was 
protected)

That sounds like a good idea.  We have functions to manipulate these fields, so 
the subclasses don't need to directly poke at the fields.

If that involves code changes to the subclasses, though, let's just do it in a 
follow-on JIRA.  This is one case where checkstyle is not that useful, since 
it's warning about a problem that already exists before this patch (and isn't 
even really a "problem," just an infelicity).

{code}
-  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY 
= "dfs.namenode.replication.max-streams-hard-limit";
+  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY =
+      "dfs.namenode.replication.max-streams-hard-limit";
{code}
Can we skip this change?  It's not really part of this work and it makes the 
diff bigger.

{code}
+  public static final String DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_KEY
+      = "dfs.namenode.storageinfo.efficiency.interval";
+  public static final int DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_DEFAULT
+      = 600;
{code}
Should be something like {{dfs.namenode.storageinfo.defragmenter.interval.ms}} 
to indicate that it's a scan interval, and that it's in milliseconds.

{code}
-      String poolId, StorageBlockReport[] reports, BlockReportContext context)
+      String poolId, StorageBlockReport[] reports, boolean sorted,
+      BlockReportContext context)
{code}
Another unnecessary diff

{code}
+    Object[] old = storages;
+    storages = new DatanodeStorageInfo[(last+num)];
+    System.arraycopy(old, 0, storages, 0, last);
{code}
Now "old" can have type {{DatanodeStorageInfo[]}} rather than {{Object[]}}, 
right?

{code}
+      storageInfoMonitorThread.interrupt();
{code}
Maybe this should be something like {{storageInfoDefragmenterThread}}? 
"monitor" suggests something like the block scanner, not defragmentation (at 
least in my mind?)

{code}
import org.apache.hadoop.hdfs.util.TreeSet;
{code}
I think this might be less confusing if you called it {{ChunkedTreeSet}}.  If I 
were just a new developer looking at the code (or even an experienced 
developer), I wouldn't really expect us to be using something called 
{{TreeSet}} which was actually completely different than {{java.util.TreeSet}}.

{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
@@ -246,6 +246,7 @@ message BlockReportRequestProto {
   required string blockPoolId = 2;
   repeated StorageBlockReportProto reports = 3;
   optional BlockReportContextProto context = 4;
+  optional bool sorted = 5 [default = false];
 }
{code}
The other fields in {{BlockReportRequestProto}} have comments explaining what 
they are.  Let's add one for "sorted"

> Improve performance and GC friendliness of startup and FBRs
> -----------------------------------------------------------
>
>                 Key: HDFS-9260
>                 URL: https://issues.apache.org/jira/browse/HDFS-9260
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, performance
>    Affects Versions: 2.7.1
>            Reporter: Staffan Friberg
>            Assignee: Staffan Friberg
>         Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFSBenchmarks.zip, 
> HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

Reply via email to