[jira] [Commented] (HDFS-16322) The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443570#comment-17443570 ]

Tsz-wo Sze commented on HDFS-16322:
-----------------------------------

> ... (NOTE: t0 < t1 < t2 < t3)

... It is correct if c1 succeeds but c0 gets FileAlreadyExistsException.

> The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
>
> Key: HDFS-16322
> URL: https://issues.apache.org/jira/browse/HDFS-16322
> Project: Hadoop HDFS
> Issue Type: Bug
> Environment: The runtime environment is Ubuntu 18.04, Java 1.8.0_222 and Apache Maven 3.6.0.
> The bug can be reproduced by the testMultipleTruncate() test in the attachment. First, replace the file TestFileTruncate.java under the directory "hadoop-3.3.1-src/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/" with the attachment. Then run "mvn test -Dtest=org.apache.hadoop.hdfs.server.namenode.TestFileTruncate#testMultipleTruncate" to run the test case. Finally, the "assertFileLength(p, n+newLength)" at line 199 of TestFileTruncate.java will fail, because the retry of truncate() changes the file size and causes data loss.
> Reporter: nhaorand
> Priority: Major
> Attachments: TestFileTruncate.java
>
> The NameNode implementation of ClientProtocol.truncate(...) can cause data loss. If the DFSClient drops the first response of a truncate RPC call, the retry through the retry cache will truncate the file again and cause data loss.
> HDFS-7926 avoids repeated execution of truncate(...) by checking whether the file has already been truncated to the same length. However, under concurrency, after the first execution of truncate(...), concurrent requests from other clients may append new data and change the file length. When truncate(...) is retried after that, it will find that the file has not been truncated to the same length and will truncate it again, which causes data loss.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16322) The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443566#comment-17443566 ]

Tsz-wo Sze commented on HDFS-16322:
-----------------------------------

[~hexiaoqiao], the given case is not data loss. As long as the NameNode has not responded to Client A, the NameNode may insert any operations before the truncate. This is the correct concurrent behavior.
[jira] [Commented] (HDFS-16322) The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443550#comment-17443550 ]

Xiaoqiao He commented on HDFS-16322:
------------------------------------

Thanks [~szetszwo] for your response. Consider the following case:
A. Client A requests truncate for file foo at time t0.
B. The NameNode executes the truncate request and responds to Client A at time t1.
C. Client B requests truncate and append for file foo at time t2; the file length does not change, but the content actually does.
D. Client A does not receive the response and retries it (the end user cannot control this action). The NameNode re-executes the request, because there is no RetryCache entry for the truncate request, and then responds at t3.
After t3, the file content will not be as expected. (NOTE: t0 < t1 < t2 < t3)
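The race in steps A–D above hinges on whether the NameNode can tell a retried truncate from a fresh one. The toy model below (hypothetical names; a sketch of the idea, not the actual NameNode code) contrasts the HDFS-7926-style length check with a RetryCache keyed by client ID and call ID: the cache replays the original response on retry, while the length check alone re-truncates and drops the appended bytes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the truncate-retry race. All names are hypothetical;
// this sketches the idea, not the HDFS implementation.
class TruncateRetrySketch {
    static long fileLength = 10;  // toy "file": just a length
    // (clientId:callId) -> cached response
    static Map<String, Boolean> retryCache = new HashMap<>();

    // HDFS-7926-style guard: skip only if the file already has the target length.
    static boolean truncateLengthCheckOnly(long newLength) {
        if (fileLength == newLength) {
            return true;            // looks like the earlier truncate already ran
        }
        fileLength = newLength;     // otherwise truncate (or re-truncate!)
        return true;
    }

    // RetryCache-style guard: a retried call replays the cached response
    // instead of re-executing the operation.
    static boolean truncateWithRetryCache(String clientId, int callId,
                                          long newLength) {
        String key = clientId + ":" + callId;
        Boolean cached = retryCache.get(key);
        if (cached != null) {
            return cached;          // replay; do not touch the file
        }
        boolean result = truncateLengthCheckOnly(newLength);
        retryCache.put(key, result);
        return result;
    }

    public static void main(String[] args) {
        truncateWithRetryCache("A", 1, 5); // t0/t1: truncate to 5; response lost
        fileLength += 3;                   // t2: client B appends; length becomes 8
        truncateWithRetryCache("A", 1, 5); // t3: retry replayed from the cache
        System.out.println(fileLength);    // 8: the append survives

        // With only the length check, the retry cannot be recognized (8 != 5),
        // so it truncates again and client B's bytes are lost.
        truncateLengthCheckOnly(5);
        System.out.println(fileLength);    // 5
    }
}
```

Under this model, the reported bug corresponds to the second outcome; the replies in this thread argue the reordering is nevertheless valid concurrent behavior, since the NameNode never acknowledged the first truncate.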
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681327 ]

ASF GitHub Bot logged work on HDFS-16315:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Nov/21 04:59
Start Date: 15/Nov/21 04:59
Worklog Time Spent: 10m

Work Description: tomscut commented on pull request #3643:
URL: https://github.com/apache/hadoop/pull/3643#issuecomment-968539533

> Changes LGTM.

Thanks @ayushtkn for your review.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 681327)
Time Spent: 2h 40m (was: 2.5h)

> Add metrics related to Transfer and NativeCopy for DataNode
>
> Key: HDFS-16315
> URL: https://issues.apache.org/jira/browse/HDFS-16315
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: tomscut
> Assignee: tomscut
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2021-11-11-08-26-33-074.png
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Datanodes already have Read, Write, Sync and Flush metrics. We should add NativeCopy and Transfer as well.
> Here is a partial look after the change:
> !image-2021-11-11-08-26-33-074.png|width=205,height=235!
[jira] [Commented] (HDFS-16322) The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443527#comment-17443527 ]

Tsz-wo Sze commented on HDFS-16322:
-----------------------------------

> ... However, under concurrency, after the first execution of truncate(...), concurrent requests from other clients may append new data and change the file length. When truncate(...) is retried after that, it will find the file has not been truncated with the same length and truncate it again, which causes data loss.

This is not a data loss case. For concurrent truncate and append, the NameNode can execute them in any order. This case just becomes first append, then truncate.
[jira] [Work logged] (HDFS-16323) DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
[ https://issues.apache.org/jira/browse/HDFS-16323?focusedWorklogId=681306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681306 ]

ASF GitHub Bot logged work on HDFS-16323:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Nov/21 02:39
Start Date: 15/Nov/21 02:39
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #3659:
URL: https://github.com/apache/hadoop/pull/3659#issuecomment-968464431

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 1m 12s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 35m 14s | | trunk passed |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | checkstyle | 0m 58s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 21s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 1m 24s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 3m 22s | | trunk passed |
| +1 :green_heart: | shadedclient | 25m 18s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 17s | | the patch passed |
| +1 :green_heart: | compile | 1m 17s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javac | 1m 17s | | the patch passed |
| +1 :green_heart: | compile | 1m 7s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | javac | 1m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 52s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3659/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | mvnsite | 1m 15s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 3m 21s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 25s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 363m 29s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3659/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | 470m 44s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3659/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3659 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux e8a6e1732a52 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 4ce8882038f2e4a06b17ccc8ae65eac0dd277bb9 |
| Default Java | Private
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681296&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681296 ]

ASF GitHub Bot logged work on HDFS-16315:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Nov/21 01:40
Start Date: 15/Nov/21 01:40
Worklog Time Spent: 10m

Work Description: tomscut commented on pull request #3643:
URL: https://github.com/apache/hadoop/pull/3643#issuecomment-968433719

The failed unit test is unrelated to the change and works fine locally. @tasanuma @ayushtkn Please take a look. Thank you very much.

Issue Time Tracking
-------------------
Worklog Id: (was: 681296)
Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681294&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681294 ]

ASF GitHub Bot logged work on HDFS-16315:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Nov/21 01:26
Start Date: 15/Nov/21 01:26
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #3643:
URL: https://github.com/apache/hadoop/pull/3643#issuecomment-968428495

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 16m 47s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 12m 53s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 21m 23s | | trunk passed |
| +1 :green_heart: | compile | 21m 43s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 18m 58s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | checkstyle | 3m 42s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 10s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 17s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 3m 19s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 5m 42s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 38s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 7s | | the patch passed |
| +1 :green_heart: | compile | 21m 0s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javac | 21m 0s | | the patch passed |
| +1 :green_heart: | compile | 18m 56s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | javac | 18m 56s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 3m 33s | | the patch passed |
| +1 :green_heart: | mvnsite | 3m 12s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 16s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 3m 20s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 6m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 47s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 17m 21s | | hadoop-common in the patch passed. |
| -1 :x: | unit | 376m 9s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3643/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 7s | | The patch does not generate ASF License warnings. |
| | | 610m 38s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3643/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3643 |
| Optional Tests | dupname asflicense mvnsite codespell markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle |
| uname | Linux af9395139069 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / cdad4d553862c1b0e31a3a470c313c11f44df1d5 |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions |
[jira] [Work logged] (HDFS-16323) DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
[ https://issues.apache.org/jira/browse/HDFS-16323?focusedWorklogId=681255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681255 ]

ASF GitHub Bot logged work on HDFS-16323:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 14/Nov/21 18:46
Start Date: 14/Nov/21 18:46
Worklog Time Spent: 10m

Work Description: virajjasani opened a new pull request #3659:
URL: https://github.com/apache/hadoop/pull/3659

### Description of PR
DatanodeHttpServer#getFilterHandlers uses the handler state map only to query whether the given DataNode HTTP server filter handler class exists in the map and, if not, to initialize the ChannelHandler by invoking a specific parameterized constructor of the class. However, this handler state map is never used to upsert any data.

### How was this patch tested?
Local testing with a mini cluster.

### For code changes:
- [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

Issue Time Tracking
-------------------
Worklog Id: (was: 681255)
Remaining Estimate: 0h
Time Spent: 10m

> DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
>
> Key: HDFS-16323
> URL: https://issues.apache.org/jira/browse/HDFS-16323
> Project: Hadoop HDFS
> Issue Type: Task
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> DatanodeHttpServer#getFilterHandlers uses the handler state map just to query whether the given DataNode HTTP server filter handler class exists in the map and, if not, initializes the Channel handler by invoking a specific parameterized constructor of the class. However, this handler state map is never used to upsert any data.
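The redundancy described in this PR can be illustrated with a small reflection sketch. The class and method names below are hypothetical stand-ins (for Netty's ChannelHandler and Hadoop's Configuration), not the actual DatanodeHttpServer code: because no code path ever writes to the state map, the containsKey branch is always taken and the map lookup is dead code.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Sketch of the pattern described in HDFS-16323; simplified, hypothetical names.
class FilterHandlerSketch {
    interface ChannelHandler {}   // stand-in for Netty's ChannelHandler
    static class Conf {}          // stand-in for Hadoop's Configuration

    static class ExampleFilter implements ChannelHandler {
        public ExampleFilter(Conf conf) {}  // the parameterized constructor
    }

    // Mirrors the shape of the lookup: consult a state map that no code path
    // ever writes to, then fall back to reflective construction.
    static ChannelHandler getFilterHandler(
            Class<? extends ChannelHandler> clazz, Conf conf,
            Map<Class<?>, ChannelHandler> handlerState)
            throws ReflectiveOperationException {
        if (!handlerState.containsKey(clazz)) {  // always true: map is never upserted
            Constructor<? extends ChannelHandler> ctor =
                clazz.getDeclaredConstructor(Conf.class);
            return ctor.newInstance(conf);
        }
        return handlerState.get(clazz);          // dead branch
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Map<Class<?>, ChannelHandler> state = new HashMap<>();
        ChannelHandler h = getFilterHandler(ExampleFilter.class, new Conf(), state);
        System.out.println(h != null);       // true: handler constructed reflectively
        System.out.println(state.isEmpty()); // true: the map is never written,
                                             // so the parameter can be removed
    }
}
```

Since the map stays empty forever, the lookup always misses and the parameter can simply be dropped, which is the cleanup the PR proposes.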
[jira] [Updated] (HDFS-16323) DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
[ https://issues.apache.org/jira/browse/HDFS-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-16323:
----------------------------------
Labels: pull-request-available (was: )
[jira] [Created] (HDFS-16323) DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
Viraj Jasani created HDFS-16323:
-----------------------------------

Summary: DatanodeHttpServer doesn't require handler state map while retrieving filter handlers
Key: HDFS-16323
URL: https://issues.apache.org/jira/browse/HDFS-16323
Project: Hadoop HDFS
Issue Type: Task
Reporter: Viraj Jasani
Assignee: Viraj Jasani

DatanodeHttpServer#getFilterHandlers uses the handler state map just to query whether the given DataNode HTTP server filter handler class exists in the map and, if not, initializes the Channel handler by invoking a specific parameterized constructor of the class. However, this handler state map is never used to upsert any data.
[jira] [Work logged] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?focusedWorklogId=681238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681238 ]

ASF GitHub Bot logged work on HDFS-16318:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 14/Nov/21 15:15
Start Date: 14/Nov/21 15:15
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #3649:
URL: https://github.com/apache/hadoop/pull/3649#issuecomment-968309570

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 1m 21s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 35m 8s | | trunk passed |
| +1 :green_heart: | compile | 1m 0s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 0m 53s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | checkstyle | 0m 26s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 56s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 42s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 2m 36s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 16s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 49s | | the patch passed |
| +1 :green_heart: | compile | 0m 52s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javac | 0m 52s | | the patch passed |
| +1 :green_heart: | compile | 0m 43s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | javac | 0m 43s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 18s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3649/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt) | hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 8 new + 31 unchanged - 0 fixed = 39 total (was 31) |
| +1 :green_heart: | mvnsite | 0m 48s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 32s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 0m 29s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| -1 :x: | spotbugs | 2m 40s | [/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3649/2/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.html) | hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 24m 32s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 15s | | hadoop-hdfs-client in the patch passed. |
| -1 :x: | asflicense | 0m 30s | [/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3649/2/artifact/out/results-asflicense.txt) | The patch generated 1 ASF License warnings. |
| | | 101m 7s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
| | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSInputStream.currentNode; locked 94% of time; Unsynchronized access at DFSInputStream.java:[line 260] |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3649/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3649 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
|
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681237 ]

ASF GitHub Bot logged work on HDFS-16315:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 14/Nov/21 15:10
Start Date: 14/Nov/21 15:10
Worklog Time Spent: 10m

Work Description: tomscut commented on a change in pull request #3643:
URL: https://github.com/apache/hadoop/pull/3643#discussion_r748867878

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
## @@ -1830,4 +1830,58 @@ public void testReleaseVolumeRefIfExceptionThrown() throws IOException {
       cluster.shutdown();
     }
   }
+
+  @Test(timeout = 3)
+  public void testTransferAndNativeCopyMetrics() {
+    Configuration config = new HdfsConfiguration();
+    config.setInt(
+        DFSConfigKeys.DFS_DATANODE_FILEIO_PROFILING_SAMPLING_PERCENTAGE_KEY,
+        100);
+    config.set(DFSConfigKeys.DFS_METRICS_PERCENTILES_INTERVALS_KEY,
+        "60,300,1500");
+    MiniDFSCluster cluster = null;
+    try {
+      cluster = new MiniDFSCluster.Builder(config)
+          .numDataNodes(1)
+          .storageTypes(new StorageType[]{StorageType.DISK, StorageType.DISK})
+          .storagesPerDatanode(2)
+          .build();
+      FileSystem fs = cluster.getFileSystem();
+      DataNode dataNode = cluster.getDataNodes().get(0);
+
+      // Create file that has one block with one replica.
+      Path filePath = new Path(name.getMethodName());
+      DFSTestUtil.createFile(fs, filePath, 100, (short) 1, 0);
+      ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, filePath);
+
+      // Copy a new replica to other volume.
+      FsDatasetImpl fsDataSetImpl = (FsDatasetImpl) dataNode.getFSDataset();
+      ReplicaInfo newReplicaInfo = createNewReplicaObj(block, fsDataSetImpl);
+      fsDataSetImpl.finalizeNewReplica(newReplicaInfo, block);
+
+      // Get the volume where the original replica resides.
+      FsVolumeSpi volume = null;
+      for (FsVolumeSpi fsVolumeReference :
+          fsDataSetImpl.getFsVolumeReferences()) {
+        if (!fsVolumeReference.getStorageID()
+            .equals(newReplicaInfo.getStorageUuid())) {
+          volume = fsVolumeReference;
+        }
+      }
+
+      // Assert metrics.
+      DataNodeVolumeMetrics metrics = volume.getMetrics();
+      assertEquals(2, metrics.getTransferIoSampleCount());
+      assertEquals(3, metrics.getTransferIoQuantiles().length);
+      assertEquals(2, metrics.getNativeCopyIoSampleCount());
+      assertEquals(3, metrics.getNativeCopyIoQuantiles().length);
+    } catch (Exception ex) {
+      LOG.info("Exception in testTransferAndNativeCopyMetrics ", ex);
+      fail("MoveBlock operation should succeed");

Review comment: Thanks @ayushtkn for your advice, I will fix it.

Issue Time Tracking
-------------------
Worklog Id: (was: 681237)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681236&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681236 ] ASF GitHub Bot logged work on HDFS-16315: - Author: ASF GitHub Bot Created on: 14/Nov/21 15:09 Start Date: 14/Nov/21 15:09 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3643: URL: https://github.com/apache/hadoop/pull/3643#discussion_r748867659

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
## @@ -1237,7 +1238,6 @@ public void testMoveBlockSuccess() {
     FsDatasetImpl fsDataSetImpl = (FsDatasetImpl) dataNode.getFSDataset();
     ReplicaInfo newReplicaInfo = createNewReplicaObj(block, fsDataSetImpl);
     fsDataSetImpl.finalizeNewReplica(newReplicaInfo, block);
-

Review comment: I'm sorry. It's my bad. Issue Time Tracking --- Worklog Id: (was: 681236) Time Spent: 2h (was: 1h 50m) > Add metrics related to Transfer and NativeCopy for DataNode > --- > > Key: HDFS-16315 > URL: https://issues.apache.org/jira/browse/HDFS-16315 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Attachments: image-2021-11-11-08-26-33-074.png > > Time Spent: 2h > Remaining Estimate: 0h > > Datanodes already have Read, Write, Sync and Flush metrics. We should add > NativeCopy and Transfer as well. > Here is a partial look after the change: > !image-2021-11-11-08-26-33-074.png|width=205,height=235!
[jira] [Commented] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
[ https://issues.apache.org/jira/browse/HDFS-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443357#comment-17443357 ] Ayush Saxena commented on HDFS-16319: - Committed to trunk and branch-3.3. Thanx [~tomscut] for the contribution!!! > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount > > > Key: HDFS-16319 > URL: https://issues.apache.org/jira/browse/HDFS-16319 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See > [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808].
[jira] [Work logged] (HDFS-16315) Add metrics related to Transfer and NativeCopy for DataNode
[ https://issues.apache.org/jira/browse/HDFS-16315?focusedWorklogId=681226&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681226 ] ASF GitHub Bot logged work on HDFS-16315: - Author: ASF GitHub Bot Created on: 14/Nov/21 14:51 Start Date: 14/Nov/21 14:51 Worklog Time Spent: 10m Work Description: ayushtkn commented on a change in pull request #3643: URL: https://github.com/apache/hadoop/pull/3643#discussion_r748865034

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
## @@ -1237,7 +1238,6 @@ public void testMoveBlockSuccess() {
     FsDatasetImpl fsDataSetImpl = (FsDatasetImpl) dataNode.getFSDataset();
     ReplicaInfo newReplicaInfo = createNewReplicaObj(block, fsDataSetImpl);
     fsDataSetImpl.finalizeNewReplica(newReplicaInfo, block);
-

Review comment: nit: avoid this

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
## @@ -1830,4 +1830,58 @@ public void testReleaseVolumeRefIfExceptionThrown() throws IOException {
     cluster.shutdown();
   }
 }
+
+  @Test(timeout = 3)
+  public void testTransferAndNativeCopyMetrics() {
+    Configuration config = new HdfsConfiguration();
+    config.setInt(
+        DFSConfigKeys.DFS_DATANODE_FILEIO_PROFILING_SAMPLING_PERCENTAGE_KEY,
+        100);
+    config.set(DFSConfigKeys.DFS_METRICS_PERCENTILES_INTERVALS_KEY,
+        "60,300,1500");
+    MiniDFSCluster cluster = null;
+    try {
+      cluster = new MiniDFSCluster.Builder(config)
+          .numDataNodes(1)
+          .storageTypes(new StorageType[]{StorageType.DISK, StorageType.DISK})
+          .storagesPerDatanode(2)
+          .build();
+      FileSystem fs = cluster.getFileSystem();
+      DataNode dataNode = cluster.getDataNodes().get(0);
+
+      // Create file that has one block with one replica.
+      Path filePath = new Path(name.getMethodName());
+      DFSTestUtil.createFile(fs, filePath, 100, (short) 1, 0);
+      ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, filePath);
+
+      // Copy a new replica to other volume.
+      FsDatasetImpl fsDataSetImpl = (FsDatasetImpl) dataNode.getFSDataset();
+      ReplicaInfo newReplicaInfo = createNewReplicaObj(block, fsDataSetImpl);
+      fsDataSetImpl.finalizeNewReplica(newReplicaInfo, block);
+
+      // Get the volume where the original replica resides.
+      FsVolumeSpi volume = null;
+      for (FsVolumeSpi fsVolumeReference :
+          fsDataSetImpl.getFsVolumeReferences()) {
+        if (!fsVolumeReference.getStorageID()
+            .equals(newReplicaInfo.getStorageUuid())) {
+          volume = fsVolumeReference;
+        }
+      }
+
+      // Assert metrics.
+      DataNodeVolumeMetrics metrics = volume.getMetrics();
+      assertEquals(2, metrics.getTransferIoSampleCount());
+      assertEquals(3, metrics.getTransferIoQuantiles().length);
+      assertEquals(2, metrics.getNativeCopyIoSampleCount());
+      assertEquals(3, metrics.getNativeCopyIoQuantiles().length);
+    } catch (Exception ex) {
+      LOG.info("Exception in testTransferAndNativeCopyMetrics ", ex);
+      fail("MoveBlock operation should succeed");

Review comment: No need to have a catch block, let the exception raised be propagated. You can even consider using try with resources for ``cluster = new MiniDFSCluster.Builder(config)`` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681226) Time Spent: 1h 50m (was: 1h 40m) > Add metrics related to Transfer and NativeCopy for DataNode > --- > > Key: HDFS-16315 > URL: https://issues.apache.org/jira/browse/HDFS-16315 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Attachments: image-2021-11-11-08-26-33-074.png > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Datanodes already have Read, Write, Sync and Flush metrics. We should add > NativeCopy and Transfer as well. > Here is a partial look after the change: > !image-2021-11-11-08-26-33-074.png|width=205,height=235! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail:
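The reviewer's two suggestions above can be sketched as follows. `FakeCluster` is a stand-in invented for this example so that it runs without Hadoop; the pattern transfers to `MiniDFSCluster` on versions where it implements `AutoCloseable` (with `close()` delegating to `shutdown()`):

```java
// Sketch of the review advice: (1) drop the catch-and-fail block and let an
// unexpected exception propagate, failing the test on its own; (2) manage
// the cluster with try-with-resources instead of a finally/shutdown pair.
// FakeCluster is a made-up stand-in for MiniDFSCluster.
class FakeCluster implements AutoCloseable {
    static int closedCount = 0;

    void doWork() {
        // the test body (create file, copy replica, assert metrics) goes here
    }

    @Override
    public void close() { // shutdown() in the real class
        closedCount++;
    }
}

class TryWithResourcesDemo {
    static void runTest() {
        try (FakeCluster cluster = new FakeCluster()) {
            cluster.doWork();
        } // close() runs even if doWork() throws, and the exception propagates
    }
}
```

Compared with catch-and-fail, the propagated exception keeps its original stack trace, which makes a failing MiniDFSCluster test much easier to diagnose.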
[jira] [Work logged] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
[ https://issues.apache.org/jira/browse/HDFS-16319?focusedWorklogId=681225=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681225 ] ASF GitHub Bot logged work on HDFS-16319: - Author: ASF GitHub Bot Created on: 14/Nov/21 14:48 Start Date: 14/Nov/21 14:48 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3653: URL: https://github.com/apache/hadoop/pull/3653#issuecomment-968304600 > Thanx @tomscut for the contribution! Thanks @ayushtkn for the merge. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681225) Time Spent: 1h 10m (was: 1h) > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount > > > Key: HDFS-16319 > URL: https://issues.apache.org/jira/browse/HDFS-16319 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.3.2 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See > [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
[ https://issues.apache.org/jira/browse/HDFS-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16319. - Fix Version/s: 3.4.0 3.3.2 Hadoop Flags: Reviewed Resolution: Fixed > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount > > > Key: HDFS-16319 > URL: https://issues.apache.org/jira/browse/HDFS-16319 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.3.2 > > Time Spent: 1h > Remaining Estimate: 0h > > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See > [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
[ https://issues.apache.org/jira/browse/HDFS-16319?focusedWorklogId=681223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681223 ] ASF GitHub Bot logged work on HDFS-16319: - Author: ASF GitHub Bot Created on: 14/Nov/21 14:40 Start Date: 14/Nov/21 14:40 Worklog Time Spent: 10m Work Description: ayushtkn merged pull request #3653: URL: https://github.com/apache/hadoop/pull/3653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681223) Time Spent: 50m (was: 40m) > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount > > > Key: HDFS-16319 > URL: https://issues.apache.org/jira/browse/HDFS-16319 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See > [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
[ https://issues.apache.org/jira/browse/HDFS-16319?focusedWorklogId=681224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681224 ] ASF GitHub Bot logged work on HDFS-16319: - Author: ASF GitHub Bot Created on: 14/Nov/21 14:40 Start Date: 14/Nov/21 14:40 Worklog Time Spent: 10m Work Description: ayushtkn commented on pull request #3653: URL: https://github.com/apache/hadoop/pull/3653#issuecomment-968303352 Thanx @tomscut for the contribution!!! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681224) Time Spent: 1h (was: 50m) > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount > > > Key: HDFS-16319 > URL: https://issues.apache.org/jira/browse/HDFS-16319 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See > [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16321. - Fix Version/s: 3.4.0 3.3.2 Hadoop Flags: Reviewed Resolution: Fixed > Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP > - > > Key: HDFS-16321 > URL: https://issues.apache.org/jira/browse/HDFS-16321 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.3.1 >Reporter: guo >Assignee: guo >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.3.2 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > `TestAvailableSpaceRackFaultTolerantBPP` seems setting invalid param(valid in > `TestAvailableSpaceBlockPlacementPolicy`), we can fix it to avoid further > trouble. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443355#comment-17443355 ] Ayush Saxena commented on HDFS-16321: - Committed to trunk and branch-3.3. Thanx [~philipse] for the contribution!!! > Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP > - > > Key: HDFS-16321 > URL: https://issues.apache.org/jira/browse/HDFS-16321 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.3.1 >Reporter: guo >Assignee: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > `TestAvailableSpaceRackFaultTolerantBPP` seems setting invalid param(valid in > `TestAvailableSpaceBlockPlacementPolicy`), we can fix it to avoid further > trouble. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?focusedWorklogId=681222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681222 ] ASF GitHub Bot logged work on HDFS-16321: - Author: ASF GitHub Bot Created on: 14/Nov/21 14:29 Start Date: 14/Nov/21 14:29 Worklog Time Spent: 10m Work Description: ayushtkn merged pull request #3655: URL: https://github.com/apache/hadoop/pull/3655 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681222) Time Spent: 1h 20m (was: 1h 10m) > Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP > - > > Key: HDFS-16321 > URL: https://issues.apache.org/jira/browse/HDFS-16321 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.3.1 >Reporter: guo >Assignee: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > `TestAvailableSpaceRackFaultTolerantBPP` seems setting invalid param(valid in > `TestAvailableSpaceBlockPlacementPolicy`), we can fix it to avoid further > trouble. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reassigned HDFS-16321: --- Assignee: guo > Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP > - > > Key: HDFS-16321 > URL: https://issues.apache.org/jira/browse/HDFS-16321 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.3.1 >Reporter: guo >Assignee: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > `TestAvailableSpaceRackFaultTolerantBPP` seems setting invalid param(valid in > `TestAvailableSpaceBlockPlacementPolicy`), we can fix it to avoid further > trouble. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443351#comment-17443351 ] Janus Chow edited comment on HDFS-16320 at 11/14/21, 2:13 PM: -- [~hexiaoqiao] {quote}I mean that DataNode has the total information to decide if he is SLOW based on response time or throughput rather than based on command from NameNode. Furthermore there is possible to false positive at NameNode side. {quote} In my opinion, the slownode information on the NameNode is essentially a union of reports from the DataNodes. All the slownodes are reported by other DataNodes (calculated from statistics), and the NameNode summarizes the reports and picks the most-reported DataNodes. Up to this point, the "slownode" data should be fairly reliable. {quote}I am not against the idea but we should have more proper way to solve this problem. {quote} I tried to find other ways to identify a slownode, especially from the DataNodes themselves. But after checking the implementation of "OutlierDetector.java" and "DataNodePeerMetrics.java", I think the current calculation is already very good at spotting slownodes. {quote}client and DataNode/Pipeline communication could estimate if there are slow nodes and which one is slow {quote} Since in a pipeline the client only talks to the first DataNode, it could be difficult to track the slowness between the three DataNodes. I think that is why slowness is only calculated for the penultimate node and the last node. Another point about this ticket: it is more of a slowness statement. Until now, the DataNode only shows the state of slowness tagged by each NameNode in its metrics; it is a real-time status updated by heartbeat. > Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information.
[jira] [Commented] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443351#comment-17443351 ] Janus Chow commented on HDFS-16320: --- {quote}I mean that DataNode has the total information to decide if he is SLOW based on response time or throughput rather than based on command from NameNode. Furthermore there is possible to false positive at NameNode side. {quote} In my opinion, the slownode information on the NameNode is essentially a union of reports from the DataNodes. All the slownodes are reported by other DataNodes (calculated from statistics), and the NameNode summarizes the reports and picks the most-reported DataNodes. Up to this point, the "slownode" data should be fairly reliable. {quote}I am not against the idea but we should have more proper way to solve this problem. {quote} I tried to find other ways to identify a slownode, especially from the DataNodes themselves. But after checking the implementation of "OutlierDetector.java" and "DataNodePeerMetrics.java", I think the current calculation is already very good at spotting slownodes. {quote}client and DataNode/Pipeline communication could estimate if there are slow nodes and which one is slow {quote} Since in a pipeline the client only talks to the first DataNode, it could be difficult to track the slowness between the three DataNodes. I think that is why slowness is only calculated for the penultimate node and the last node. Another point about this ticket: it is more of a slowness statement. Until now, the DataNode only shows the state of slowness tagged by each NameNode in its metrics; it is a real-time status updated by heartbeat.
> Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
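For context on the detection praised in the comment above: the general technique behind `DataNodePeerMetrics`/`OutlierDetector`-style peer monitoring is to flag a node whose aggregate latency sits far above the median of its peers, using the median absolute deviation (MAD) as the spread estimate. The sketch below illustrates that idea in a self-contained way; the threshold constant and all names are invented for the example and do not reproduce Hadoop's actual implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified illustration of median/MAD-based upward-outlier detection,
// the general idea behind slow-peer reporting. Constants are invented.
class SlowPeerSketch {
    static double median(List<Double> xs) {
        List<Double> s = new ArrayList<>(xs);
        Collections.sort(s);
        int n = s.size();
        return n % 2 == 1 ? s.get(n / 2)
                          : (s.get(n / 2 - 1) + s.get(n / 2)) / 2.0;
    }

    /** Returns the peers whose latency is far above the group's median. */
    static Map<String, Double> findOutliers(Map<String, Double> latencyByPeer) {
        List<Double> values = new ArrayList<>(latencyByPeer.values());
        double med = median(values);
        List<Double> deviations = new ArrayList<>();
        for (double v : values) {
            deviations.add(Math.abs(v - med));
        }
        double mad = median(deviations);
        double threshold = med + 3.0 * mad; // "3 MADs above the median"
        Map<String, Double> slow = new TreeMap<>();
        for (Map.Entry<String, Double> e : latencyByPeer.entrySet()) {
            if (e.getValue() > threshold) {
                slow.put(e.getKey(), e.getValue());
            }
        }
        return slow;
    }
}
```

Because the median and MAD are robust statistics, one pathologically slow peer does not drag the threshold up, which is exactly the property that makes this kind of detector resistant to the false positives discussed in the thread.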
[jira] [Commented] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443341#comment-17443341 ] guo commented on HDFS-16318: Thanks [~hexiaoqiao] for your note, have just updated it > Add exception blockinfo > --- > > Key: HDFS-16318 > URL: https://issues.apache.org/jira/browse/HDFS-16318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > we may suffer a `Could not obtain the last block location` exception, but we > may be reading more than one file, and the following exception cannot guide us to > find the problem block or dn info. we can add more info in the log to help > us. > `2021-11-12 14:01:59,633 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available. Datanodes might not have reported blocks > completely. Will retry for 3 times` > `2021-11-12 14:02:03,724 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available. Datanodes might not have reported blocks > completely. Will retry for 2 times` > `2021-11-12 14:02:07,726 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available. Datanodes might not have reported blocks > completely. Will retry for 1 times` > `Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251) > ... 11 more` > `Caused by: java.io.IOException: Could not obtain the last block locations.
> at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:291) > at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:264) > at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1535) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312) > at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:162) > at > org.apache.hadoop.fs.viewfs.ChRootedFileSystem.open(ChRootedFileSystem.java:261) > at > org.apache.hadoop.fs.viewfs.ViewFileSystem.open(ViewFileSystem.java:463) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768) > at org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:109) > at > org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:66) > ... 15 more` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guo updated HDFS-16318: --- Description: we may suffer a `Could not obtain the last block location` exception, but we may be reading more than one file, and the following exception cannot guide us to find the problem block or dn info. we can add more info in the log to help us. `2021-11-12 14:01:59,633 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 3 times` `2021-11-12 14:02:03,724 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 2 times` `2021-11-12 14:02:07,726 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 1 times` `Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251) ... 11 more` `Caused by: java.io.IOException: Could not obtain the last block locations.
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:291) at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:264) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1535) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312) at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:162) at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.open(ChRootedFileSystem.java:261) at org.apache.hadoop.fs.viewfs.ViewFileSystem.open(ViewFileSystem.java:463) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768) at org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:109) at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:66) ... 15 more` > Add exception blockinfo > --- > > Key: HDFS-16318 > URL: https://issues.apache.org/jira/browse/HDFS-16318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > we may suffer a `Could not obtain the last block location` exception, but we > may be reading more than one file, and the following exception cannot guide us to > find the problem block or dn info. we can add more info in the log to help > us. > `2021-11-12 14:01:59,633 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available. Datanodes might not have reported blocks > completely. Will retry for 3 times` > `2021-11-12 14:02:03,724 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available.
Datanodes might not have reported blocks > completely. Will retry for 2 times` > `2021-11-12 14:02:07,726 WARN [main] org.apache.hadoop.hdfs.DFSClient: Last > block locations not available. Datanodes might not have reported blocks > completely. Will retry for 1 times` > `Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251) > ... 11 more` > `Caused by: java.io.IOException: Could not obtain the last block locations. > at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:291) > at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:264) > at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1535) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299) > at >
[jira] [Work logged] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?focusedWorklogId=681218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681218 ] ASF GitHub Bot logged work on HDFS-16318: - Author: ASF GitHub Bot Created on: 14/Nov/21 13:36 Start Date: 14/Nov/21 13:36 Worklog Time Spent: 10m Work Description: GuoPhilipse commented on a change in pull request #3649: URL: https://github.com/apache/hadoop/pull/3649#discussion_r748855728
## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
## @@ -257,7 +257,7 @@ void openInfo(boolean refreshLocatedBlocks) throws IOException {
 // locations will not be available with NN for getting the length. Lets
 // retry for 3 times to get the length.
 if (lastBlockBeingWrittenLength == -1) {
-  DFSClient.LOG.warn("Last block locations not available. "
+  DFSClient.LOG.warn("Last block locations " + getCurrentBlock() + " not available. "
Review comment: Thanks @Hexiaoqiao for your review. I have just updated the `src` info in the log, but I have no idea about the log test cases. Could you give me some advice? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681218) Time Spent: 40m (was: 0.5h) > Add exception blockinfo > --- > > Key: HDFS-16318 > URL: https://issues.apache.org/jira/browse/HDFS-16318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
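The review discussion above is about enriching the "Last block locations not available" warning with the current block (and, per review feedback, the file path `src`) so that a job reading many files can tell which file and block triggered the retry. A minimal sketch of the idea, with the class and method names assumed for illustration (this is not the actual DFSInputStream code):

```java
// Sketch of the proposed log enrichment: include block and src in the warning.
// All names here (LastBlockWarn, buildWarning) are hypothetical.
public class LastBlockWarn {

    // Builds the warning text the patch would emit, naming the block and file
    // so the problem replica can be located among many open files.
    static String buildWarning(String src, String block, int retriesLeft) {
        return "Last block locations of " + block + " in file " + src
            + " not available. Datanodes might not have reported blocks"
            + " completely. Will retry for " + retriesLeft + " times";
    }

    public static void main(String[] args) {
        System.out.println(
            buildWarning("/warehouse/t1/part-00000", "blk_1073741825_1001", 3));
    }
}
```

With this, the warning pinpoints the affected file/block instead of the generic message quoted in the issue description.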
[jira] [Work logged] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?focusedWorklogId=681216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681216 ] ASF GitHub Bot logged work on HDFS-16321: - Author: ASF GitHub Bot Created on: 14/Nov/21 13:10 Start Date: 14/Nov/21 13:10 Worklog Time Spent: 10m Work Description: GuoPhilipse commented on pull request #3655: URL: https://github.com/apache/hadoop/pull/3655#issuecomment-968288277 @ayushtkn Could you kindly help check the timeout cases? I have tried twice; the errors seem unrelated to this patch, and they occur in different methods:
`[ERROR] testResponseCode(org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract) Time elapsed: 30.014 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 3 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:426)
at java.util.concurrent.FutureTask.get(FutureTask.java:204)
at org.junit.internal.runners.statements.FailOnTimeout.getResult(FailOnTimeout.java:167)
at org.junit.internal.runners.statements.FailOnTimeout.evaluate(FailOnTimeout.java:128)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)`
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681216) Time Spent: 1h 10m (was: 1h) > Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP > - > > Key: HDFS-16321 > URL: https://issues.apache.org/jira/browse/HDFS-16321 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > `TestAvailableSpaceRackFaultTolerantBPP` seems to set an invalid param (one that is valid in > `TestAvailableSpaceBlockPlacementPolicy`); we can fix it to avoid further > trouble. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443283#comment-17443283 ] Xiaoqiao He commented on HDFS-16320: Thanks [~Symious] for your quick response. It is a reasonable case for me. I mean that the DataNode has all the information needed to decide whether it is SLOW, based on response time or throughput, rather than relying on a command from the NameNode. Furthermore, false positives are possible on the NameNode side. I am not against the idea, but we should find a more proper way to solve this problem. IMO, client and DataNode/Pipeline communication could estimate whether there are slow nodes and which one is slow. FYI. Thanks. > Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16321) Fix invalid config in TestAvailableSpaceRackFaultTolerantBPP
[ https://issues.apache.org/jira/browse/HDFS-16321?focusedWorklogId=681207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681207 ] ASF GitHub Bot logged work on HDFS-16321: - Author: ASF GitHub Bot Created on: 14/Nov/21 09:42 Start Date: 14/Nov/21 09:42 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3655: URL: https://github.com/apache/hadoop/pull/3655#issuecomment-968257217 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 55s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 37m 36s | | trunk passed |
| +1 :green_heart: | compile | 1m 22s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | checkstyle | 1m 0s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 21s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 3m 18s | | trunk passed |
| +1 :green_heart: | shadedclient | 25m 6s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 1m 13s | | the patch passed |
| +1 :green_heart: | compile | 1m 17s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javac | 1m 17s | | the patch passed |
| +1 :green_heart: | compile | 1m 7s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | javac | 1m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 52s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3655/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) |
| +1 :green_heart: | mvnsite | 1m 14s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 47s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 1m 18s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 3m 21s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 6s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| -1 :x: | unit | 354m 40s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3655/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 462m 56s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
| | hadoop.hdfs.TestViewDistributedFileSystemContract |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3655/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3655 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux b1ad24666443 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / dcc21b479d735ebf5f8c4a07b4e6f6156d026f9f |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions |
[jira] [Commented] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443276#comment-17443276 ] Janus Chow commented on HDFS-16320: --- [~hexiaoqiao] Thank you for the review. The issue we met is that we have clients writing to a slownode, and it took a very long time to finish writing a normal file. After we checked the metrics, we found we can avoid creating the pipeline on slownodes with "dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled" set to true. It will work fine for new clients, but clients already using the slownode in their pipeline have to suffer the slownode. (Maybe the slownode is reported by this pipeline.) Since when clients are writing data, only clients and datanodes are communicating, so even if NameNode has the information that a datanode in the pipeline is slow, clients can not do too much to avoid it. Our proposal would be to let Datanodes get the information from heartbeat reports; then during writing, datanodes can report it to clients, and clients can choose to rebuild the pipeline to improve the writing performance. > Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443276#comment-17443276 ] Janus Chow edited comment on HDFS-16320 at 11/14/21, 9:41 AM: -- [~hexiaoqiao] Thank you for the review. The issue we met is we have clients writing to the slownode and it took a very long time to finish writing for a normal file. After we checked the metrics, we found we can avoid the pipeline creating on the slownodes with "dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled" set to true. It will work fine for new clients, but for clients already using the slownode as pipeline, they have to suffer the slownode. (Maybe the slownode is reported by this pipeline.) Since when clients are writing data, it will only be clients and datanodes communicating, so even NameNode has the information that the datanode in the pipeline is slow, clients can not do too much to avoid it. Our proposal would be, to let Datanodes get the information from heartbeats reports, then during the writing, datanodes can report it to clients, then clients can choose to rebuild the pipeline to improve the writing performance. was (Author: symious): [~hexiaoqiao] Thank you for the review. The issue we met is we have clients writing to the slownode and it took a very long time to finish writing for a normal file. After we checked the metrics, we found we can avoid the pipeline creating on the slownodes with "dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled" set to true. It will work fine for new clients, but for clients already using the slownode as pipeline, they have to suffer the slownode. (Maybe the slownode is reported by this pipeline.) Since when clients are writing data, it will only be clients and datanodes communicating, so even NameNode has the information that the datanode in the pipeline is slow, clients can do too much to avoid it. 
Our proposal would be, to let Datanodes get the information from heartbeats reports, then during the writing, datanodes can report it to clients, then clients can choose to rebuild the pipeline to improve the writing performance. > Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
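The proposal in this thread can be sketched end to end: the NameNode flags a node as slow in the heartbeat response, the DataNode surfaces that flag in its pipeline ack, and the client reacts by rebuilding the write pipeline without the slow node. Everything below is a hypothetical illustration (class and method names invented, not the HDFS-16320 patch, which would work on DatanodeProtocol and the real ack path):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative-only sketch of the slownode feedback loop proposed above.
public class SlowNodePipeline {

    // Client-side reaction to a pipeline ack that flags slowNode: keep writing
    // with the remaining nodes (a real client would also pick a replacement
    // datanode from the NameNode).
    static List<String> rebuildWithout(List<String> pipeline, String slowNode) {
        List<String> rebuilt = new ArrayList<>(pipeline);
        rebuilt.remove(slowNode);
        return rebuilt;
    }

    public static void main(String[] args) {
        // dn2 was flagged slow by the NameNode via heartbeat and reported to
        // the client in the DataNode's ack.
        List<String> pipeline = Arrays.asList("dn1", "dn2", "dn3");
        System.out.println(rebuildWithout(pipeline, "dn2")); // [dn1, dn3]
    }
}
```

This is exactly the gap the comment describes: placement policy can exclude slow nodes for new pipelines, but only an ack-time signal lets an already-open pipeline drop one.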
[jira] [Work logged] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?focusedWorklogId=681204=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-681204 ] ASF GitHub Bot logged work on HDFS-16318: - Author: ASF GitHub Bot Created on: 14/Nov/21 09:18 Start Date: 14/Nov/21 09:18 Worklog Time Spent: 10m Work Description: Hexiaoqiao commented on a change in pull request #3649: URL: https://github.com/apache/hadoop/pull/3649#discussion_r748825616
## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
## @@ -257,7 +257,7 @@ void openInfo(boolean refreshLocatedBlocks) throws IOException {
 // locations will not be available with NN for getting the length. Lets
 // retry for 3 times to get the length.
 if (lastBlockBeingWrittenLength == -1) {
-  DFSClient.LOG.warn("Last block locations not available. "
+  DFSClient.LOG.warn("Last block locations " + getCurrentBlock() + " not available. "
Review comment: Also adding `src` would be more helpful IMO. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 681204) Time Spent: 0.5h (was: 20m) > Add exception blockinfo > --- > > Key: HDFS-16318 > URL: https://issues.apache.org/jira/browse/HDFS-16318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16318) Add exception blockinfo
[ https://issues.apache.org/jira/browse/HDFS-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443273#comment-17443273 ] Xiaoqiao He commented on HDFS-16318: [~philipse] would you mind adding some information to the `Description` about what the issue is and how it is fixed? Thanks. > Add exception blockinfo > --- > > Key: HDFS-16318 > URL: https://issues.apache.org/jira/browse/HDFS-16318 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.1 >Reporter: guo >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16320) Datanode retrieve slownode information from NameNode
[ https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443268#comment-17443268 ] Xiaoqiao He commented on HDFS-16320: Thanks [~Symious] for your report and patch. IMO it is a little tricky for the DataNode to get its slownode status from the NameNode. In theory, the DataNode has all the information needed to decide whether it is slow by itself, rather than following a NameNode command. Would you mind offering some more information about your plan for using this status? Thanks. > Datanode retrieve slownode information from NameNode > > > Key: HDFS-16320 > URL: https://issues.apache.org/jira/browse/HDFS-16320 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Janus Chow >Assignee: Janus Chow >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The current information of slownode is reported by reportingNode, and stored > in NameNode. > This ticket is to let the slownode retrieve the information from NameNode, so > that it can do other performance improvement actions based on this > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16322) The NameNode implementation of ClientProtocol.truncate(...) can cause data loss.
[ https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443263#comment-17443263 ] Xiaoqiao He commented on HDFS-16322: Thanks [~Nsupyq] for your report. It is an interesting case. I am not sure whether it was reasonable to mark truncate `idempotent` at HDFS-7926, since it could cause data loss when a client retries a request after a network or other issue. IMO, it could be fixed by revisiting how `idempotent` is applied to truncate. {quote}"idempotent" means applying the same operations multiple times will get the same result. If there is an append in the middle, the retry could get different results. E.g. getPermission is idempotent. However, if there is a setPermission (or delete, rename, etc.) in the middle, the retry of getPermission could get a different result.{quote} Just noticed that [~szetszwo] left this comment at HDFS-7926; I am not sure this explanation still holds. For example, Client A requests `create` with the overwrite option; it executes successfully on the NameNode side but the response does not reach Client A, so it retries. Before the retry reaches the NameNode, another Client B deletes this file. The retry is then invoked and returns the previous result because of the retry cache. It is the same case as `truncate`. cc [~shv], [~szetszwo] would you mind giving some suggestions? Thanks. > The NameNode implementation of ClientProtocol.truncate(...) can cause data > loss. > > > Key: HDFS-16322 > URL: https://issues.apache.org/jira/browse/HDFS-16322 > Project: Hadoop HDFS > Issue Type: Bug > Environment: The runtime environment is Ubuntu 18.04, Java 1.8.0_222 > and Apache Maven 3.6.0. > The bug can be reproduced by the testMultipleTruncate() in the > attachment. First, replace the file TestFileTruncate.java under the directory > "hadoop-3.3.1-src/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/" > with the attachment. 
Then run "mvn test > -Dtest=org.apache.hadoop.hdfs.server.namenode.TestFileTruncate#testMultipleTruncate" > to run the testcase. Finally, the "assertFileLength(p, n+newLength)" at line 199 > of TestFileTruncate.java will fail, because the retry of truncate() > changes the file size and causes data loss. >Reporter: nhaorand >Priority: Major > Attachments: TestFileTruncate.java > > > The NameNode implementation of ClientProtocol.truncate(...) can cause data > loss. If the dfsclient drops the first response of a truncate RPC call, the retry > by the retry cache will truncate the file again and cause data loss. > HDFS-7926 avoids repeated execution of truncate(...) by checking if the file > is already being truncated to the same length. However, under concurrency, > after the first execution of truncate(...), concurrent requests from other > clients may append new data and change the file length. When truncate(...) is > retried after that, it will find the file has not been truncated to the > same length and truncate it again, which causes data loss. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
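The race reported in HDFS-16322 can be condensed into a few lines. This is an illustrative simulation, not the NameNode implementation: it models only the HDFS-7926 check ("a retried truncate is a no-op if the file is already at the requested length") and shows how an append landing between the first execution and the retry defeats that check, so the retry truncates again and silently drops the appended bytes.

```java
// Simulation of the truncate retry-cache race described in HDFS-16322.
// Class and field names are invented for illustration.
public class TruncateRetryRace {

    long fileLength;

    TruncateRetryRace(long initialLength) {
        fileLength = initialLength;
    }

    // NameNode-side truncate, approximating the HDFS-7926 retry check.
    void truncate(long newLength) {
        if (fileLength == newLength) {
            return; // looks like a completed retry: succeed without change
        }
        fileLength = newLength; // otherwise truncate for real
    }

    public static void main(String[] args) {
        TruncateRetryRace file = new TruncateRetryRace(100);
        file.truncate(50);     // client A truncates; the RPC response is lost
        file.fileLength += 50; // client B appends 50 bytes (length back to 100)
        file.truncate(50);     // client A's retry: length != 50, truncates AGAIN
        // Client B's appended data is gone: the length is 50, not 100.
        System.out.println(file.fileLength); // 50
    }
}
```

Without the intervening append, the retry would hit the `fileLength == newLength` branch and be a harmless no-op, which is exactly why HDFS-7926's check is only safe in the absence of concurrent writers.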