[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443527#comment-17443527 ]

Tsz-wo Sze commented on HDFS-16322:
-----------------------------------

> ... However, under concurrency, after the first execution of truncate(...), 
> concurrent requests from other clients may append new data and change the 
> file length. When truncate(...) is retried after that, it finds that the file 
> has not been truncated to the same length and truncates it again, which 
> causes data loss.

This is not a data loss case.  For a concurrent truncate and append, the NN 
may execute them in any order.  This case simply becomes append first and 
then truncate.
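
To make the ordering argument concrete, here is a tiny standalone model 
(illustrative only, not HDFS code; the 100/50/20 byte counts are made up) of 
the two serializations the NN may choose for a concurrent truncate and append:

public class TruncateAppendOrder {
  // Model truncate/append purely as operations on the file length.
  static long truncate(long len, long newLen) { return Math.min(len, newLen); }
  static long append(long len, long extra)    { return len + extra; }

  public static void main(String[] args) {
    long initial = 100, target = 50, appended = 20;
    // Serialization A: truncate first, then append -> 50 + 20 = 70
    System.out.println(append(truncate(initial, target), appended));
    // Serialization B: append first, then truncate -> min(120, 50) = 50
    System.out.println(truncate(append(initial, appended), target));
    // Both final lengths are legal outcomes of the concurrent calls; the
    // retried truncate merely realizes serialization B.
  }
}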

> The NameNode implementation of ClientProtocol.truncate(...) can cause data 
> loss.
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-16322
>                 URL: https://issues.apache.org/jira/browse/HDFS-16322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>         Environment: The runtime environment is Ubuntu 18.04, Java 1.8.0_222 
> and Apache Maven 3.6.0. 
> The bug can be reproduced by the testMultipleTruncate() test in the 
> attachment. First, replace the file TestFileTruncate.java under the directory 
> "hadoop-3.3.1-src/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/"
>  with the attachment. Then run "mvn test 
> -Dtest=org.apache.hadoop.hdfs.server.namenode.TestFileTruncate#testMultipleTruncate"
>  to run the test case. Finally, the "assertFileLength(p, n+newLength)" at 
> line 199 of TestFileTruncate.java will fail, because the retry of truncate() 
> changes the file size and causes data loss.
>            Reporter: nhaorand
>            Priority: Major
>         Attachments: TestFileTruncate.java
>
>
> The NameNode implementation of ClientProtocol.truncate(...) can cause data 
> loss. If the DFSClient misses the first response of a truncate RPC call, the 
> retried call (which the retry cache should catch) will truncate the file 
> again and cause data loss.
> HDFS-7926 avoids repeated execution of truncate(...) by checking whether the 
> file is already being truncated to the same length. However, under 
> concurrency, after the first execution of truncate(...), concurrent requests 
> from other clients may append new data and change the file length. When 
> truncate(...) is retried after that, it finds that the file has not been 
> truncated to the same length and truncates it again, which causes data loss.
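
For reference, a rough sketch of the reported sequence against the public 
FileSystem API (a sketch only: the path, byte counts, and waitForTruncate 
helper are illustrative assumptions, it assumes fs.defaultFS points at a 
running HDFS cluster, and the second truncate call stands in for the 
client-side retry of the RPC whose first response was dropped):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class TruncateRetryRace {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    Path p = new Path("/tmp/truncate-retry-demo");

    try (FSDataOutputStream out = dfs.create(p, true)) {
      out.write(new byte[100]);                   // initial length: 100
    }

    waitForTruncate(dfs, p, dfs.truncate(p, 50)); // first truncate -> 50

    try (FSDataOutputStream out = dfs.append(p)) {
      out.write(new byte[20]);                    // concurrent append -> 70
    }

    // Stand-in for the retried RPC: the file length no longer matches the
    // requested length, so the truncate is executed again and the 20
    // appended bytes are removed.
    waitForTruncate(dfs, p, dfs.truncate(p, 50));
    System.out.println("final length = " + dfs.getFileStatus(p).getLen()); // 50
    dfs.delete(p, false);
  }

  // truncate() returns false when block recovery is needed; wait until the
  // file is closed again before issuing the next operation.
  private static void waitForTruncate(DistributedFileSystem dfs, Path p,
      boolean done) throws Exception {
    while (!done && !dfs.isFileClosed(p)) {
      Thread.sleep(100);
    }
  }
}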



