[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.018.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, > HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.017.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFS-9260.016.patch, HDFS-9260.017.patch, HDFSBenchmarks.zip, > HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.016.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFS-9260.016.patch, HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.015.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.014.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFSBenchmarks.zip, > HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.012.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.013.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.011.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFSBenchmarks.zip, > HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFSBenchmarks2.zip > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-9260.010.patch Avoid LinkedList allocations > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFSBenchmarks.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFSBenchmarks.zip Microbenchmarks > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFSBenchmarks.zip > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Description: This patch changes the datastructures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC friendly handling of full block reports. Would like to hear peoples feedback on this change. was: This patch changes the datastructures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC friendly handling of full block reports. Would like to hear peoples feedback on this change and also some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
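The description says that keeping BlockInfos and Replicas sorted allows faster handling of full block reports. One way to see why is a single-pass merge over two sorted sequences of block IDs. The sketch below is only an illustration under that assumption and is not taken from the patch: the class and method names are hypothetical, and the boxed result lists are used for brevity rather than GC friendliness.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from the HDFS-9260 patch.
final class SortedDiffSketch {
  /** Result of comparing stored block IDs with reported block IDs. */
  static final class Diff {
    final List<Long> toAdd = new ArrayList<>();    // reported but not stored
    final List<Long> toRemove = new ArrayList<>(); // stored but not reported
  }

  /** Both arrays must be sorted ascending; runs in O(stored + reported). */
  static Diff diff(long[] stored, long[] reported) {
    Diff d = new Diff();
    int i = 0;
    int j = 0;
    while (i < stored.length && j < reported.length) {
      if (stored[i] == reported[j]) {
        i++; // present on both sides, nothing to do
        j++;
      } else if (stored[i] < reported[j]) {
        d.toRemove.add(stored[i++]); // stored ID missing from the report
      } else {
        d.toAdd.add(reported[j++]);  // reported ID not yet stored
      }
    }
    while (i < stored.length) {
      d.toRemove.add(stored[i++]);
    }
    while (j < reported.length) {
      d.toAdd.add(reported[j++]);
    }
    return d;
  }

  public static void main(String[] args) {
    Diff d = diff(new long[] {1, 2, 4, 7}, new long[] {2, 3, 7, 9});
    System.out.println("add " + d.toAdd + ", remove " + d.toRemove);
  }
}
```

Because neither side needs a hash lookup or a temporary set, the comparison touches each entry once, which is the property the sorted layout is meant to provide.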
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-9260:
--
    Attachment: FBR processing.png

I've read the doc now. Sorry I commented before doing so. The results are interesting until the final details about a 4x reduction in block updates. Here are some basic specs to consider:
* 10-80k adds/min
* job submissions increasing replication factor to 10
* at least 1 node/day decommissioning or going dead with 100k-400k blocks
* every few weeks entire racks (40 nodes) are decommissioned for refresh or reallocation
* balancer is constantly churning to populate recommissioned dead nodes

That's a lot of IBRs, which is why a 4x degradation is quite concerning.

The block report processing times seem a bit high in the tests. :) I'll attach an image of the BR processing times for some of our busiest clusters. They span the gamut from 100M-300M blocks with roughly the same number of files. We got a huge improvement from my BR encoding change + per-storage reports.

BTW, I had/have a working patch that replaced the triplets with sparse yet densely packed 2-dimensional primitive arrays. Everything is linked via indices to greatly reduce the dirty cards to scan. Need to dig up the jira when my head is above water.
> Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change and also some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Staffan Friberg updated HDFS-9260:
--
    Attachment: HDFS-9260.009.patch

Fix for the timed-out test org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks.
During iteration, elements need to be removed through the iterator and not directly from the tree, to avoid a concurrent modification exception.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode, performance
> Affects Versions: 2.7.1
> Reporter: Staffan Friberg
> Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf,
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch,
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch,
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to
> keep them sorted. This allows faster and more GC friendly handling of full
> block reports.
> Would like to hear peoples feedback on this change and also some help
> investigating/understanding a few outstanding issues if we are interested in
> moving forward with this.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
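The difference between removing through the iterator and removing from the collection can be shown with a minimal sketch. It uses java.util.TreeSet as a stand-in, since the API of the patch's own org.apache.hadoop.hdfs.util.TreeSet is not shown in this thread. The class name, method name, and the threshold condition are hypothetical.

```java
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical illustration using java.util.TreeSet, not the patch's TreeSet.
final class IteratorRemovalSketch {
  /** Removes every id below the threshold while iterating; returns the count. */
  static int removeBelow(TreeSet<Long> ids, long threshold) {
    int removed = 0;
    Iterator<Long> it = ids.iterator();
    while (it.hasNext()) {
      long id = it.next();
      if (id < threshold) {
        // Calling ids.remove(id) here would modify the set behind the
        // iterator's back; fail-fast iterators generally respond with a
        // ConcurrentModificationException on a later call.
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    TreeSet<Long> ids = new TreeSet<>();
    for (long i = -3; i <= 3; i++) {
      ids.add(i);
    }
    int removed = removeBelow(ids, 0);
    System.out.println(removed + " removed, remaining " + ids);
  }
}
```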
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Staffan Friberg updated HDFS-9260:
--
    Attachment: HDFS-9260.008.patch

Using the right name of the bug on the patch...

Fixed white spaces and findbugs warnings.

The remaining warnings should hopefully be OK.

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java:60:35: Variable 'storages' must be private and have accessor methods.
Same as triplets was before.

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:204:16: Variable 'storageInfoMonitorThread' must be private and have accessor methods.
Same as replicationMonitor.

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:1: File length is 4,427 lines (max allowed is 2,000).
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:2487: Comment matches to-do format 'TODO:'.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:2501: Comment matches to-do format 'TODO:'.
The file was already long before, and the TODOs are kept from the earlier version of diffReport.

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:19: Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:28: Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:35: Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:43: Inner assignments should be avoided.
Can change to separate lines writing null, but the current version is more compact in the clear method, setting them all to null.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode, performance
> Affects Versions: 2.7.1
> Reporter: Staffan Friberg
> Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf,
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch,
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch,
> HDFS-7435.007.patch, HDFS-9260.008.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to
> keep them sorted. This allows faster and more GC friendly handling of full
> block reports.
> Would like to hear peoples feedback on this change and also some help
> investigating/understanding a few outstanding issues if we are interested in
> moving forward with this.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
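For readers unfamiliar with the "Inner assignments should be avoided" warning, the two styles being weighed for the clear method look roughly like this. This is a sketch only: the field names are invented for illustration and are not the fields of the patch's TreeSet.

```java
// Hypothetical illustration; field names are not taken from the patch.
final class ClearSketch {
  Object root;
  Object first;
  Object last;
  Object cursor;
  Object cache;

  /** Compact form: four chained (inner) assignments on one line. */
  void clearCompact() {
    root = first = last = cursor = cache = null;
  }

  /** Equivalent form that avoids inner assignments by using one statement per field. */
  void clearSeparate() {
    root = null;
    first = null;
    last = null;
    cursor = null;
    cache = null;
  }

  boolean allNull() {
    return root == null && first == null && last == null
        && cursor == null && cache == null;
  }

  public static void main(String[] args) {
    ClearSketch c = new ClearSketch();
    c.root = "x";
    c.clearCompact();
    System.out.println("all null after clearCompact: " + c.allNull());
  }
}
```

Both methods leave the object in the same state, so the choice is purely about style.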
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-7435.007.patch Fixed comments, white spaces and most of the 80 width warnings > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change and also some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Staffan Friberg updated HDFS-9260:
--
    Attachment: HDFS-7435.005.patch

Fix the last todos.

Handles a new NN with old DNs (unsorted entries). This is inefficient since the NN needs to sort the entries. However, it should only be a problem during the upgrade cycle, and it is avoidable if the DNs are updated first.

Added a StorageInfoMonitor thread that can compact the TreeSet if the fill ratio gets too low.

Added a test to check that unsorted entries are handled correctly.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode, performance
> Affects Versions: 2.7.1
> Reporter: Staffan Friberg
> Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf,
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch,
> HDFS-7435.004.patch, HDFS-7435.005.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to
> keep them sorted. This allows faster and more GC friendly handling of full
> block reports.
> Would like to hear peoples feedback on this change and also some help
> investigating/understanding a few outstanding issues if we are interested in
> moving forward with this.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
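The extra cost for a new NN receiving unsorted entries from an old DN can be pictured as a check-then-sort step in front of the normal sorted processing. The sketch below assumes the report is available as a plain array of block IDs and that sortedness is detected by scanning; both are assumptions for illustration, and the names are hypothetical. The patch may detect this differently.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from the HDFS-9260 patch.
final class ReportOrderSketch {
  /** Returns true if the ids are in ascending order. */
  static boolean isSorted(long[] ids) {
    for (int i = 1; i < ids.length; i++) {
      if (ids[i - 1] > ids[i]) {
        return false;
      }
    }
    return true;
  }

  /**
   * Returns the ids in ascending order. Sorted input is returned as is
   * (cheap path); unsorted input is copied and sorted, which is the
   * O(n log n) extra work the receiver has to pay.
   */
  static long[] ensureSorted(long[] ids) {
    if (isSorted(ids)) {
      return ids;
    }
    long[] copy = ids.clone();
    Arrays.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    long[] fromOldNode = {42, 7, 19};
    System.out.println(Arrays.toString(ensureSorted(fromOldNode)));
  }
}
```

When senders already deliver sorted data, only the linear scan remains, which matches the remark that the overhead is limited to the upgrade cycle.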
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-7435.006.patch Merged and diff:ed again due to conflict > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change and also some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Status: Patch Available (was: Open) > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch > > > This patch changes the datastructures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear peoples feedback on this change and also some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Staffan Friberg updated HDFS-9260:
--
    Attachment: HDFS-7435.004.patch

All HDFS tests should now pass.

Fixes include:
* Correctly handle EC blocks, whose IDs are negative and need to be masked
* The initial report might sometimes report into a storage that already contains new blocks reported by an incremental block report; addStoredLast will fall back on a regular add if the entries are not sorted
* Remove debugging output
* Remove unused GSet import
* Invalidate list must be a Block and not a Replica

Left to do:
* Handle old nodes which don't send data sorted
* Add a 'sorted' field in the report PB
* Figure out how to handle reports when the cluster contains negative entries that are not EC blocks

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode, performance
> Affects Versions: 2.7.1
> Reporter: Staffan Friberg
> Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf,
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch,
> HDFS-7435.004.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to
> keep them sorted. This allows faster and more GC friendly handling of full
> block reports.
> Would like to hear peoples feedback on this change and also some help
> investigating/understanding a few outstanding issues if we are interested in
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
> The issues seems to be that the blocks are not on any storage, so no
> replication can occur causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete things seem to work
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fails in this class.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
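The addStoredLast fallback mentioned in the fixes can be pictured as an append fast path on a sorted container. The sketch below is a guess at the general idea only: it uses an ArrayList of block IDs rather than the patch's tree, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not the patch's data structure or API.
final class SortedIdsSketch {
  private final List<Long> ids = new ArrayList<>();

  /**
   * Fast path for sorted input: append if id is greater than the current
   * maximum. Otherwise fall back to the regular ordered add.
   */
  boolean addLast(long id) {
    if (ids.isEmpty() || ids.get(ids.size() - 1) < id) {
      ids.add(id);
      return true;
    }
    return add(id);
  }

  /** Regular add: binary search for the position, reject duplicates. */
  boolean add(long id) {
    int pos = Collections.binarySearch(ids, Long.valueOf(id));
    if (pos >= 0) {
      return false; // already present
    }
    ids.add(-pos - 1, id); // insertion point encoded by binarySearch
    return true;
  }

  List<Long> view() {
    return Collections.unmodifiableList(ids);
  }

  public static void main(String[] args) {
    SortedIdsSketch s = new SortedIdsSketch();
    s.addLast(1);
    s.addLast(5);
    s.addLast(3); // out of order, handled by the fallback
    System.out.println(s.view());
  }
}
```

The fast path keeps the common case (a fully sorted initial report) cheap, while an entry that arrived earlier through an incremental report only costs one ordered insert.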
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Description: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC-friendly handling of full block reports. Would like to hear people's feedback on this change, and would also appreciate some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. was: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC-friendly handling of full block reports. Would like to hear people's feedback on this change, and would also appreciate some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. There seem to be some timing issues that I hit when testing the patch. I am not sure whether it is a bug in the patch or something else (most likely the former)... Tests that fail for me: The issue seems to be that the blocks are not on any storage, so no replication can occur, causing the tests to fail in different ways. TestDecommission.testDecommission If I add a short sleep after the cleanup/delete, things seem to work. TestDFSStripedOutputStreamWithFailure A couple of tests fail in this class. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-7435.002.patch Merged with the latest head. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS Block and Replica Management 20151013.pdf > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9260: Assignee: Staffan Friberg > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: (was: HDFS Block and Replica Management 20151013.pdf) > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Description: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC-friendly handling of full block reports. Would like to hear people's feedback on this change, and would also appreciate some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. There seem to be some timing issues that I hit when testing the patch. I am not sure whether it is a bug in the patch or something else (most likely the former)... Tests that fail for me: The issue seems to be that the blocks are not on any storage, so no replication can occur, causing the tests to fail in different ways. TestDecommission.testDecommission If I add a short sleep after the cleanup/delete, things seem to work. TestDFSStripedOutputStreamWithFailure A couple of tests fail in this class. was: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC-friendly handling of full block reports. Would like to hear people's feedback on this change, and would also appreciate some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. There seem to be some timing issues that I hit when testing the patch. I am not sure whether it is a bug in the patch or something else (most likely the former)... Tests that fail for me: The issue seems to be that the blocks is not on any storage, so no replication can occurs causing the tests to fail in different ways. TestDecommission.testDecommission If I add a short sleep after the cleanup/delete, things seem to work. TestDFSStripedOutputStreamWithFailure A couple of tests fail in this class. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-7435.003.patch Added a null check when creating the iterator of storages. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: HDFS Block and Replica Management 20151013.pdf, > HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFS-7435.001.patch > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg > Attachments: HDFS-7435.001.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change, and would also appreciate some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. > There seem to be some timing issues that I hit when testing the patch. I am not sure > whether it is a bug in the patch or something else (most likely the former)... > Tests that fail for me: >The issue seems to be that the blocks are not on any storage, so no > replication can occur, causing the tests to fail in different ways. >TestDecommission.testDecommission >If I add a short sleep after the cleanup/delete, things seem to work. >TestDFSStripedOutputStreamWithFailure >A couple of tests fail in this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)