[ https://issues.apache.org/jira/browse/HDFS-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289278#comment-14289278 ]
Hudson commented on HDFS-7575:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #79 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/79/])
HDFS-7575. Upgrade should generate a unique storage ID for each volume. (Contributed by Arpit Agarwal) (arp: rev d34074e237ee10b83aeb02294f595714d43e39e4)
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom22FixesStorageIDs.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom26PreservesStorageIDs.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeLayoutUpgrade.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom22via26FixesStorageIDs.tgz
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom22via26FixesStorageIDs.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/UpgradeUtilities.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom22FixesStorageIDs.tgz
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testUpgradeFrom26PreservesStorageIDs.tgz
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeStorage.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeStartupFixesLegacyStorageIDs.java
HDFS-7575. Fix CHANGES.txt (arp: rev 792b7d337a0edbc3da74099709c197e85ce38097)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Upgrade should generate a unique storage ID for each volume
> -----------------------------------------------------------
>
>                 Key: HDFS-7575
>                 URL: https://issues.apache.org/jira/browse/HDFS-7575
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0, 2.5.0, 2.6.0
>            Reporter: Lars Francke
>            Assignee: Arpit Agarwal
>            Priority: Critical
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7575.01.patch, HDFS-7575.02.patch, HDFS-7575.03.binary.patch, HDFS-7575.03.patch, HDFS-7575.04.binary.patch, HDFS-7575.04.patch, HDFS-7575.05.binary.patch, HDFS-7575.05.patch, testUpgrade22via24GeneratesStorageIDs.tgz, testUpgradeFrom22GeneratesStorageIDs.tgz, testUpgradeFrom24PreservesStorageId.tgz
>
>
> Before HDFS-2832 each DataNode would have a unique storageId, which included its IP address. Since HDFS-2832 the DataNodes have a unique storageId per storage directory, which is just a random UUID.
> They send reports per storage directory in their heartbeats. This heartbeat is processed on the NameNode in the {{DatanodeDescriptor#updateHeartbeatState}} method. Pre HDFS-2832 this would just store the information per DataNode. After the patch, though, each DataNode can have multiple different storages, so the information is stored in a map keyed by the storage Id.
> This works fine for all clusters that have been installed post HDFS-2832, as they get a UUID for their storage Id. So a DN with 8 drives has a map with 8 different keys.
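> As a toy illustration of that keying (made-up IDs and capacities, not the actual HDFS code):
> {code:title=StorageIdCollisionSketch.java}
> import java.util.HashMap;
> import java.util.Map;
> import java.util.UUID;
>
> public class StorageIdCollisionSketch {
>   public static void main(String[] args) {
>     Map<String, Long> storageMap = new HashMap<String, Long>();
>
>     // Fresh post HDFS-2832 install: every directory reports its own
>     // random UUID, so 8 drives produce 8 distinct keys.
>     for (int drive = 0; drive < 8; drive++) {
>       storageMap.put("DS-" + UUID.randomUUID(), 4000000000000L);
>     }
>     System.out.println(storageMap.size()); // 8
>
>     // Upgraded pre HDFS-2832 cluster: all directories kept the identical
>     // legacy Id, so every put hits the same key and one entry survives.
>     storageMap.clear();
>     for (int drive = 0; drive < 8; drive++) {
>       storageMap.put("DS-10.2.3.4-50010-1394136603111", 4000000000000L);
>     }
>     System.out.println(storageMap.size()); // 1
>   }
> }
> {code}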
> On each Heartbeat the Map is searched and updated ({{DatanodeStorageInfo storage = storageMap.get(s.getStorageID());}}):
> {code:title=DatanodeStorageInfo}
>   void updateState(StorageReport r) {
>     capacity = r.getCapacity();
>     dfsUsed = r.getDfsUsed();
>     remaining = r.getRemaining();
>     blockPoolUsed = r.getBlockPoolUsed();
>   }
> {code}
> On clusters that were upgraded from a pre HDFS-2832 version, though, the storage Id has not been rewritten (at least not on the four clusters I checked), so each directory will have the exact same storageId. That means there'll be only a single entry in the {{storageMap}} and it'll be overwritten by a random {{StorageReport}} from the DataNode. This can be seen in the {{updateState}} method above: it just assigns the capacity from the received report; instead it should probably sum the values of all reports received in a heartbeat.
> The Balancer seems to be one of the only things that actually uses this information, so it now considers the utilization of a random drive per DataNode for balancing purposes.
> Things get even worse when a drive has been added or replaced, as this drive will now get a new storage Id, so there'll be two entries in the storageMap. As new drives are usually empty, this skews the balancer's decision in a way that this node will never be considered over-utilized.
> Another problem is that old StorageReports are never removed from the storageMap. So if I replace a drive and it gets a new storage Id, the old one will still be in place and used for all calculations by the Balancer until a restart of the NameNode.
> I can try providing a patch that does the following:
> * Instead of using a Map I could just store the array we receive, or, instead of storing an array, sum up the values for reports with the same Id (see the sketch below)
> * On each heartbeat clear the map (so we know we have up-to-date information)
> Does that sound sensible?
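> To make the first option concrete, here is a rough, untested sketch ({{addState}} is a hypothetical helper and the {{DatanodeStorageInfo}} constructor call is illustrative; the rest only approximates the current code):
> {code:title=Sketch: aggregate reports per heartbeat}
>   void updateHeartbeatState(StorageReport[] reports) {
>     // Rebuild the map from scratch on every heartbeat so stale entries
>     // for removed or replaced drives cannot linger until a NN restart.
>     storageMap.clear();
>     for (StorageReport r : reports) {
>       String id = r.getStorage().getStorageID();
>       DatanodeStorageInfo info = storageMap.get(id);
>       if (info == null) {
>         info = new DatanodeStorageInfo(this, r.getStorage());
>         storageMap.put(id, info);
>       }
>       // += instead of =, so reports sharing a legacy Id are summed
>       // rather than overwriting each other.
>       info.addState(r.getCapacity(), r.getDfsUsed(), r.getRemaining(),
>           r.getBlockPoolUsed());
>     }
>   }
> {code}


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)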