[ https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124731#comment-14124731 ]
James Thomas commented on HDFS-6981:
------------------------------------

{code}
+      if (!storagesWithRollingUpgradeMarker.contains(bpRoot.toString()) &&
+          !markerFile.exists()) {
+        LOG.info("Created " + markerFile);
+        markerFile.createNewFile();
+        storagesWithRollingUpgradeMarker.add(bpRoot.toString());
+        storagesWithoutRollingUpgradeMarker.remove(bpRoot.toString());
+      }
{code}
could be
{code}
+      if (!storagesWithRollingUpgradeMarker.contains(bpRoot.toString())) {
+        if (!markerFile.exists()) {
+          LOG.info("Created " + markerFile);
+          markerFile.createNewFile();
+          storagesWithRollingUpgradeMarker.add(bpRoot.toString());
+          storagesWithoutRollingUpgradeMarker.remove(bpRoot.toString());
+        } else {
+          storagesWithRollingUpgradeMarker.add(bpRoot.toString());
+        }
+      }
{code}
and similarly for {{clearRollingUpgradeMarkers}}. These changes ensure that the cache stays in sync with the filesystem state and reduce the number of filesystem operations. It also seems to me like the in-memory cache could just be two volatile booleans (e.g. {{storagesHaveRollingUpgradeMarker}} and {{storagesDoNotHaveRollingUpgradeMarker}}) rather than two sets. Could the set of storages possibly change during the rolling upgrade? Otherwise things look good. Tests are solid.
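As a standalone illustration of the suggested restructuring, here is a minimal, hypothetical sketch (class and marker-file names are invented for the example, not taken from the HDFS source): the cache is updated whether or not the marker file had to be created, so a second call skips the filesystem entirely.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the cache-in-sync pattern from the review comment.
public class RollingUpgradeMarkerSketch {
    private final Set<String> storagesWithRollingUpgradeMarker = new HashSet<>();
    private final Set<String> storagesWithoutRollingUpgradeMarker = new HashSet<>();

    /** Returns true if the filesystem had to be consulted, false on a cache hit. */
    public boolean createRollingUpgradeMarker(File bpRoot) throws IOException {
        File markerFile = new File(bpRoot, "rollingUpgrade.marker");
        if (storagesWithRollingUpgradeMarker.contains(bpRoot.toString())) {
            return false;  // cache hit: no exists()/createNewFile() calls at all
        }
        if (!markerFile.exists()) {
            markerFile.createNewFile();
        }
        // Record the marker in the cache in both branches, so the cache
        // mirrors the on-disk state even when the file already existed.
        storagesWithRollingUpgradeMarker.add(bpRoot.toString());
        storagesWithoutRollingUpgradeMarker.remove(bpRoot.toString());
        return true;
    }

    public static void main(String[] args) throws IOException {
        RollingUpgradeMarkerSketch sketch = new RollingUpgradeMarkerSketch();
        File bpRoot = Files.createTempDirectory("bpRoot").toFile();
        System.out.println(sketch.createRollingUpgradeMarker(bpRoot)); // true: touched fs
        System.out.println(sketch.createRollingUpgradeMarker(bpRoot)); // false: cache hit
    }
}
```

If the set of storages cannot change mid-upgrade, the two sets in this sketch could indeed collapse to the two volatile booleans proposed above.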
> DN upgrade with layout version change should not use trash
> ----------------------------------------------------------
>
>                 Key: HDFS-6981
>                 URL: https://issues.apache.org/jira/browse/HDFS-6981
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0
>            Reporter: James Thomas
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-6981.01.patch, HDFS-6981.02.patch, HDFS-6981.03.patch, HDFS-6981.04.patch, HDFS-6981.05.patch, HDFS-6981.06.patch
>
>
> Post HDFS-6800, we can encounter the following scenario:
> # We start with DN software version -55 and initiate a rolling upgrade to version -56
> # We delete some blocks, and they are moved to trash
> # We roll back to DN software version -55 using the -rollback flag – since we are running the old code (prior to this patch), we will restore the previous directory but will not delete the trash
> # We append to some of the blocks that were deleted in step 2
> # We then restart a DN that contains blocks that were appended to – since the trash still exists, it will be restored at this point, the appended-to blocks will be overwritten, and we will lose the appended data
> So I think we need to avoid writing anything to the trash directory if we have a previous directory.
> Thanks to [~james.thomas] for reporting this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)