[ https://issues.apache.org/jira/browse/HDFS-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116257#comment-14116257 ]
James Thomas commented on HDFS-6800:
------------------------------------

I tested that rolling upgrade to and rolling upgrade rollback from DN layout version -56 (HDFS-6482) works without data loss. I also validated the first, third, and fourth bullets above (the fourth by doing a rolling upgrade to the original software version). These tests were run on a single node with a single NN (no HA) and a single DN, and one or two files were added to the DFS in each case.

The second bullet unfortunately raises an issue with this patch -- we can encounter the following scenario:

(1) We start with DN software version -55 and initiate a rolling upgrade to version -56
(2) We delete some blocks, and they are moved to trash
(3) We roll back to DN software version -55 using the {{-rollback}} flag -- since we are running the old code (prior to this patch), we will restore the {{previous}} directory but will not delete the trash
(4) We append to some of the blocks that were deleted in step 2
(5) We then restart a DN that contains blocks that were appended to -- since the trash still exists, it will be restored at this point, the appended-to blocks will be overwritten, and we will lose the appended data

So I think we need to avoid writing anything to the trash directory if we have a {{previous}} directory. I have not thought through this completely. [~arpitagarwal], any thoughts?
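
To make the five-step scenario above easier to reason about, here is a minimal sketch of it as a toy in-memory model in Python. It is not DataNode code: {{StorageDir}}, {{delete_block}}, {{guard_trash}}, and the other names are hypothetical, and the restore-on-restart behavior is simplified from the description above. The {{guard_trash}} flag stands in for the proposed rule of not writing to trash while a {{previous}} directory exists.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StorageDir:
    """Toy stand-in for one DN storage directory (hypothetical model)."""
    current: Dict[str, bytes] = field(default_factory=dict)
    previous: Optional[Dict[str, bytes]] = None  # None means no previous dir
    trash: Dict[str, bytes] = field(default_factory=dict)


def start_layout_upgrade(sd: StorageDir) -> None:
    # Step 1: the layout upgrade snapshots current into previous.
    sd.previous = dict(sd.current)


def delete_block(sd: StorageDir, block_id: str, guard_trash: bool) -> None:
    # Step 2: during the rolling upgrade, deleted blocks go to trash.
    data = sd.current.pop(block_id)
    if guard_trash and sd.previous is not None:
        # Proposed guard (assumption): previous already preserves the
        # pre-upgrade state, so nothing is written to trash.
        return
    sd.trash[block_id] = data


def rollback_with_old_software(sd: StorageDir) -> None:
    # Step 3: the old code restores previous but leaves trash in place.
    sd.current = sd.previous
    sd.previous = None


def append_to_block(sd: StorageDir, block_id: str, extra: bytes) -> None:
    # Step 4: a client appends to a block that was restored by rollback.
    sd.current[block_id] += extra


def restart(sd: StorageDir) -> None:
    # Step 5: leftover trash is restored on restart, overwriting current.
    sd.current.update(sd.trash)
    sd.trash.clear()


def run_scenario(guard_trash: bool) -> bytes:
    sd = StorageDir(current={"blk_1": b"old"})
    start_layout_upgrade(sd)
    delete_block(sd, "blk_1", guard_trash)
    rollback_with_old_software(sd)
    append_to_block(sd, "blk_1", b"+new")
    restart(sd)
    return sd.current["blk_1"]


if __name__ == "__main__":
    print("without guard:", run_scenario(guard_trash=False))
    print("with guard:   ", run_scenario(guard_trash=True))
```

In this model the appended bytes are lost without the guard and survive with it. Whether the same holds in the real DataNode depends on details the model leaves out, such as how trash is keyed per block pool.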
> Support Datanode layout changes with rolling upgrade
> ----------------------------------------------------
>
>                 Key: HDFS-6800
>                 URL: https://issues.apache.org/jira/browse/HDFS-6800
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Colin Patrick McCabe
>            Assignee: James Thomas
>             Fix For: 3.0.0, 2.6.0
>
>         Attachments: HDFS-6800.2.patch, HDFS-6800.3.patch, HDFS-6800.4.patch,
> HDFS-6800.5.patch, HDFS-6800.6.patch, HDFS-6800.7.patch, HDFS-6800.8.patch,
> HDFS-6800.patch
>
>
> We need to handle attempts to rolling-upgrade the DataNode to a new storage
> directory layout.
> One approach is to disallow such upgrades. If we choose this approach, we
> should make sure that the system administrator gets a helpful error message
> and a clean failure when trying to use rolling upgrade to a version that
> doesn't support it. Based on the compatibility guarantees described in
> HDFS-5535, this would mean that *any* future DataNode layout changes would
> require a major version upgrade.
> Another approach would be to support rolling upgrade from an old DN storage
> layout to a new layout. This approach requires us to change our
> documentation to explain to users that they should supply the {{\-rollback}}
> command on the command-line when re-starting the DataNodes during rolling
> rollback. Currently the documentation just says to restart the DataNode
> normally.
> Another issue here is that the DataNode's usage message describes rollback
> options that no longer exist. The help text says that the DN supports
> {{\-rollingupgrade rollback}}, but this option was removed by HDFS-6005.

--
This message was sent by Atlassian JIRA
(v6.2#6252)