[ 
https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122262#comment-14122262
 ] 

James Thomas commented on HDFS-6981:
------------------------------------

[~arpitagarwal], thanks for the patch. A couple high-level comments:
* The {{Preconditions.checkState}} calls can fail. Consider the case where a 
block {{B1}} is created after the rolling upgrade is started and is deleted 
before the upgrade is finalized. Since your code avoids adding a block to the 
trash only if the block exists in the {{previous}} directory, {{B1}} will be 
added to the trash. So both the trash and the {{previous}} directory will exist 
concurrently. This behavior is not incorrect (the same block does not exist in 
both the trash and {{previous}}), but it will cause the 
{{Preconditions.checkState}} calls to fail. It seems like you can either remove 
these calls or avoid adding a block to the trash if any {{previous}} directory 
exists at all, rather than checking whether the block exists in {{previous}}.
* Suppose that we finalize a rolling upgrade with layout version change at time 
{{T1}}, and then start another rolling upgrade with layout version change soon 
afterwards at time {{T2}}. Suppose also that there is no block report between 
{{T1}} and {{T2}}, so the old {{previous}} directory is not deleted. Now a 
block {{B2}} existing in the old {{previous}} is deleted, and it is not added 
to the trash by the rules in this patch. Next, this DN is restarted, the old 
{{previous}} directory is deleted, and a new {{previous}} directory without 
{{B2}} is created, and we have lost {{B2}}. This is admittedly an extreme 
corner case, but I think that the existence of {{previous}} for some time after 
finalization is somewhat worrisome. It's probably a good idea to rename 
{{previous}} to something else (e.g. {{finalized.tmp}}) when 
{{signalRollingUpgrade}} is called with null, and then actually do the deletion 
asynchronously in the background.

{code}
+      // Delete the second file. Ensure that its block file is in previous.
+      File blockFile1 = getBlockForFile(paths[1], true);
+      fs.delete(paths[1], false);
+      ensureBlockFileInPrevious(blockFile1);
{code}
Might want to assert that this file doesn't exist in trash.

Another issue with this set of changes is that if we call roll back to an old 
DN software version with {{-rollback}} and the upgrade did not involve a layout 
version change, we'll actually fail to restore the trash and there could be 
missing blocks as a result. I'm not sure what the best way to handle this is 
... we might need to instruct administrators to use {{-rollback}} only when 
rolling back after an upgrade involving a DN layout version change.

> DN upgrade with layout version change should not use trash
> ----------------------------------------------------------
>
>                 Key: HDFS-6981
>                 URL: https://issues.apache.org/jira/browse/HDFS-6981
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0
>            Reporter: James Thomas
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-6981.01.patch, HDFS-6981.02.patch, 
> HDFS-6981.03.patch
>
>
> Post HDFS-6800, we can encounter the following scenario:
> # We start with DN software version -55 and initiate a rolling upgrade to 
> version -56
> # We delete some blocks, and they are moved to trash
> # We roll back to DN software version -55 using the -rollback flag – since we 
> are running the old code (prior to this patch), we will restore the previous 
> directory but will not delete the trash
> # We append to some of the blocks that were deleted in step 2
> # We then restart a DN that contains blocks that were appended to – since the 
> trash still exists, it will be restored at this point, the appended-to blocks 
> will be overwritten, and we will lose the appended data
> So I think we need to avoid writing anything to the trash directory if we 
> have a previous directory.
> Thanks to [~james.thomas] for reporting this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to