[jira] [Commented] (HDFS-14757) TestBalancerRPCDelay.testBalancerRPCDelay failed

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023435#comment-17023435
 ] 

Ayush Saxena commented on HDFS-14757:
-

[~ahussein] maybe you can raise another Jira for this and put the stack trace 
you got there, and then we can resolve that, since Wei-Chiu raised this Jira 
for a different error.

> TestBalancerRPCDelay.testBalancerRPCDelay failed
> 
>
> Key: HDFS-14757
> URL: https://issues.apache.org/jira/browse/HDFS-14757
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Wei-Chiu Chuang
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-14757.001.patch
>
>
> {noformat}
> Error Message
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.spyFSNamesystem(TestBalancer.java:1948)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
> Stacktrace
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.spyFSNamesystem(TestBalancer.java:1948)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.spyFSNamesystem(TestBalancer.java:1957)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:811)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerRPCDelay(TestBalancer.java:1976)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay.testBalancerRPCDelay(TestBalancerRPCDelay.java:30)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
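
For context, a minimal, hypothetical sketch of how this Mockito misuse arises 
and how it is normally resolved. The interfaces and names below are 
illustrative only, not the actual TestBalancer/FSNamesystem code:

{code:java}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

public class StubbingExample {
  interface Report { boolean isOk(); }
  interface NameSystem { Report getReport(); }

  public static void main(String[] args) {
    NameSystem ns = mock(NameSystem.class);

    // Broken: a when(...) that is never completed. Mockito only detects the
    // problem on the *next* mock interaction, which is why the reported line
    // can differ from the real culprit.
    //   when(ns.getReport());   // missing .thenReturn(...)/.thenThrow(...)

    // Also broken: inlining mock() inside thenReturn() (Mockito issue 53).
    //   when(ns.getReport()).thenReturn(mock(Report.class));

    // Correct: create the nested mock first, then finish the stubbing.
    Report report = mock(Report.class);
    when(report.isOk()).thenReturn(true);
    when(ns.getReport()).thenReturn(report);

    System.out.println(ns.getReport().isOk()); // prints: true
  }
}
{code}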



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run due to failure in TestDataNodeVolumeFailureToleration

2020-01-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023434#comment-17023434
 ] 

Hudson commented on HDFS-15128:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17899 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17899/])
HDFS-15128. Unit test failing to clean testing data and crashed future 
(ayushsaxena: rev 6d008c0d39185f18dbec4676f4d0e7ef77104eb7)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java


> Unit test failing to clean testing data and crashed future Maven test run due 
> to failure in TestDataNodeVolumeFailureToleration
> ---
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Fix For: 3.3.0
>
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
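
A minimal sketch of the fix described later in this thread: chmod the 
directories back to 755 before the assertion, so a failing assertion can no 
longer skip the cleanup. The names (`dirs`, `volumesFailed`, 
`prepareDirToFail`, `restartDatanodes`) follow the snippet quoted above; this 
is an illustration of the idea, not the verbatim HDFS-15128-001.patch:

{code:java}
try {
  for (int i = 0; i < volumesFailed; i++) {
    prepareDirToFail(dirs[i]);          // chmods dirs[i] to perm 000
  }
  restartDatanodes(volumesTolerated, manageDfsDirs);
} catch (DiskErrorException e) {
  // expected when more volumes fail than are tolerated
} finally {
  // ... existing cleanup ...
}

// Undo the 000 perms unconditionally, so a failing assertion can no longer
// leave behind a directory that Maven cannot delete.
for (File dir : dirs) {
  FileUtil.chmod(dir.toString(), "755");
}

// The assertion is now the last statement of the method.
assertEquals(expectedBPServiceState, bpServiceState);
{code}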

[jira] [Updated] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run due to failure in TestDataNodeVolumeFailureToleration

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15128:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Unit test failing to clean testing data and crashed future Maven test run due 
> to failure in TestDataNodeVolumeFailureToleration
> ---
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Fix For: 3.3.0
>
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
> move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` to 
> the last line of this function.

[jira] [Commented] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run due to failure in TestDataNodeVolumeFailureToleration

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023428#comment-17023428
 ] 

Ayush Saxena commented on HDFS-15128:
-

Committed to trunk.

Thanx [~ctest.team] for the contribution.

> Unit test failing to clean testing data and crashed future Maven test run due 
> to failure in TestDataNodeVolumeFailureToleration
> ---
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
> move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` to 
> the last line of this function.

[jira] [Assigned] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run due to failure in TestDataNodeVolumeFailureToleration

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HDFS-15128:
---

Assignee: Ctest

> Unit test failing to clean testing data and crashed future Maven test run due 
> to failure in TestDataNodeVolumeFailureToleration
> ---
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
> move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` to 
> the last line of this function. This fixes the bug and does not change the 
> original test logic.

[jira] [Updated] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run due to failure in TestDataNodeVolumeFailureToleration

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15128:

Summary: Unit test failing to clean testing data and crashed future Maven 
test run due to failure in TestDataNodeVolumeFailureToleration  (was: Unit test 
failing to clean testing data and crashed future Maven test run)

> Unit test failing to clean testing data and crashed future Maven test run due 
> to failure in TestDataNodeVolumeFailureToleration
> ---
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
> move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` to 
> the last line of this function.

[jira] [Commented] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023420#comment-17023420
 ] 

Ayush Saxena commented on HDFS-15128:
-

Thanx [~ctest.team] for the update. The failed test doesn't seem related. 
v001 LGTM +1

> Unit test failing to clean testing data and crashed future Maven test run
> -
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but later 
> fails to chmod this directory back to a valid perm if the assertion inside 
> the function fails. Any subsequent `mvn test` command then fails to run, 
> because Maven does not have permission to clean the temporarily-generated 
> directory that still has perm 000. See below for the buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
>     prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>   ...
> } finally {
>   ...
> }
>
> assertEquals(expectedBPServiceState, bpServiceState);
>
> for (File dir : dirs) {
>   FileUtil.chmod(dir.toString(), "755");
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` caused the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test had created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build: it does not have permission to clean 
> this temporarily-generated directory. For details of the failure, see below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
> purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
> to perm 000 and, at the end of the function, changes the perm of this 
> directory back to 755. However, an assertion runs in this function before 
> the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm can be restored to 755. Hence, this 
> directory cannot later be removed by Maven when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
> move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` to 
> the last line of this function. This fixes the bug and does not change the 
> original test logic.

[jira] [Commented] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023419#comment-17023419
 ] 

Ayush Saxena commented on HDFS-15144:
-

Can you help me understand how reversing the order fixes this issue?

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt, 
> HDFS-15144.001.patch
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was 
> supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop's inner block to {{cluster.restartDataNode(i, true)}}, 
> the test failed with the stack trace below. I suspect that one of the 
> datanodes does not start properly after the restart.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.805 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] 
> testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
>   Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023398#comment-17023398
 ] 

Hadoop QA commented on HDFS-15128:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 16s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}197m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15128 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991802/HDFS-15128-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8259f6eca033 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 839e607 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28709/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28709/testReport/ |
| Max. process+thread count | 2730 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28709/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This 

[jira] [Commented] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023391#comment-17023391
 ] 

Hadoop QA commented on HDFS-15144:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15144 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991805/HDFS-15144.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 636c9d93df2b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 839e607 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28710/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28710/testReport/ |
| Max. process+thread count | 4355 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28710/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Updated] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15144:
-
Attachment: HDFS-15144.001.patch

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt, 
> HDFS-15144.001.patch
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was 
> supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop's inner block to {{cluster.restartDataNode(i, true)}}, 
> the test failed with the stack trace below. I suspect that one of the 
> datanodes does not start properly after the restart.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.805 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] 
> testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
>   Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023341#comment-17023341
 ] 

Ahmed Hussein commented on HDFS-15144:
--

There was a problem when restarting the datanodes. One of the DNs would not 
receive the restart and would stay inactive following the injected disk failure.
I reversed the order in which the datanodes are restarted, and this seems to 
fix the issue.
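
A sketch of what that reversed loop could look like, using the 
{{MiniDFSCluster#restartDataNode(int, boolean)}} call from the snippet quoted 
below; this illustrates the idea and is not the verbatim HDFS-15144.001.patch:

{code:java}
// Restore the failed volumes first, as in the original test.
DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);

// Restart each of the three datanodes exactly once, walking the indices in
// reverse. If the cluster re-appends a restarted node at the end of its
// datanode list, ascending indices can skip a node (which would explain the
// DN that never received its restart); descending indices avoid that shift.
for (int i = 2; i >= 0; i--) {
  cluster.restartDataNode(i, true); // keepPort = true
}
{code}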

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was 
> supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop's inner block to {{cluster.restartDataNode(i, true)}}, 
> the test failed with the stack trace below. I suspect that one of the 
> datanodes does not start properly after the restart.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.805 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] 
> testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
>   Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15144:
-
Status: Patch Available  (was: Open)

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was 
> supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop's inner block to {{cluster.restartDataNode(i, true)}}, 
> the test failed with the stack trace below. I suspect that one of the 
> datanodes does not start properly after the restart.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.805 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] 
> testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
>   Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
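
For illustration, a minimal sketch of the corrected loop (identifiers such as 
{{cluster}} are assumed from the quoted test; the index {{i}} simply replaces 
the hard-coded 0):

{code:java}
// Restart each of the three datanodes once, instead of restarting DN-0
// three times. The second argument keeps the same port so the restarted
// datanode re-registers as itself.
for (int i = 0; i < 3; i++) {
  cluster.restartDataNode(i, true);
}
// wait for the restarted datanodes to heartbeat back in
Thread.sleep(6000);
{code}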



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run

2020-01-24 Thread Ctest (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023330#comment-17023330
 ] 

Ctest commented on HDFS-15128:
--

[~ayushtkn], thank you for the comment. I have uploaded a new patch, 
*HDFS-15128-001.patch*, reflecting the suggested change.

> Unit test failing to clean testing data and crashed future Maven test run
> -
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but fails 
> to chmod the directory back to a valid perm if the assertion inside the 
> function fails. Any subsequent `mvn test` command then fails to run, because 
> Maven cannot build itself: it does not have permission to clean the 
> temporarily-generated directory that still has perm 000. See below for the 
> buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
> prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>  ...
> } finally {
> ...
> }
>  
>   assertEquals(expectedBPServiceState, bpServiceState);
>  
>   for (File dir : dirs) {
> FileUtil.chmod(dir.toString(), "755");
>   }
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` causes the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test has created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build itself: it does not have permission to 
> clean this temporarily-generated directory. For details of the failure, see 
> below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
>  purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
>  to have perm 000, and at the end of the function it changes the perm of 
> this directory back to 755. However, an assertion sits in the function 
> before the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm is restored to 755. Hence, Maven is 
> later unable to remove this directory when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
>  move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` 
> to the last line of the function. This resolves the bug and does not change 
> the test outcome. 
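
A rough sketch of that reordering (identifiers taken from the quoted snippet; 
restoring the permissions in the {{finally}} block would be an equivalent 
alternative):

{code:java}
// Restore the permissions first, then assert. A failed assertion can no
// longer leave perm-000 directories behind for Maven to choke on.
for (File dir : dirs) {
  FileUtil.chmod(dir.toString(), "755");
}
assertEquals(expectedBPServiceState, bpServiceState);
{code}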

[jira] [Updated] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run

2020-01-24 Thread Ctest (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ctest updated HDFS-15128:
-
Attachment: (was: HDFS-15128-000.patch)

> Unit test failing to clean testing data and crashed future Maven test run
> -
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but fails 
> to chmod the directory back to a valid perm if the assertion inside the 
> function fails. Any subsequent `mvn test` command then fails to run, because 
> Maven cannot build itself: it does not have permission to clean the 
> temporarily-generated directory that still has perm 000. See below for the 
> buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
> prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>  ...
> } finally {
> ...
> }
>  
>   assertEquals(expectedBPServiceState, bpServiceState);
>  
>   for (File dir : dirs) {
> FileUtil.chmod(dir.toString(), "755");
>   }
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` causes the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test has created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build itself: it does not have permission to 
> clean this temporarily-generated directory. For details of the failure, see 
> below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
>  purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
>  to have perm 000, and at the end of the function it changes the perm of 
> this directory back to 755. However, an assertion sits in the function 
> before the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm is restored to 755. Hence, Maven is 
> later unable to remove this directory when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
>  move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` 
> to the last line of the function. This resolves the bug and does not change 
> the test outcome. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HDFS-15128) Unit test failing to clean testing data and crashed future Maven test run

2020-01-24 Thread Ctest (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ctest updated HDFS-15128:
-
Attachment: HDFS-15128-001.patch

> Unit test failing to clean testing data and crashed future Maven test run
> -
>
> Key: HDFS-15128
> URL: https://issues.apache.org/jira/browse/HDFS-15128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, test
>Affects Versions: 3.2.1
>Reporter: Ctest
>Priority: Critical
>  Labels: easyfix, patch, test
> Attachments: HDFS-15128-000.patch, HDFS-15128-001.patch
>
>
> The actively-used test helper function `testVolumeConfig` in 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration` 
> chmods a directory to the invalid perm 000 for testing purposes, but fails 
> to chmod the directory back to a valid perm if the assertion inside the 
> function fails. Any subsequent `mvn test` command then fails to run, because 
> Maven cannot build itself: it does not have permission to clean the 
> temporarily-generated directory that still has perm 000. See below for the 
> buggy code snippet.
> {code:java}
> try {
>   for (int i = 0; i < volumesFailed; i++) {
> prepareDirToFail(dirs[i]); // this will chmod dirs[i] to perm 000
>   }
>   restartDatanodes(volumesTolerated, manageDfsDirs);
> } catch (DiskErrorException e) {
>  ...
> } finally {
> ...
> }
>  
>   assertEquals(expectedBPServiceState, bpServiceState);
>  
>   for (File dir : dirs) {
> FileUtil.chmod(dir.toString(), "755");
>   }
> }
> {code}
> The failure of the statement `assertEquals(expectedBPServiceState, 
> bpServiceState)` causes the function to terminate without executing 
> `FileUtil.chmod(dir.toString(), "755")` for each temporary directory with 
> invalid perm 000 that the test has created. 
>  
> *Consequence*
> Any subsequent `mvn test` command fails to run if this test has failed 
> before, because Maven cannot build itself: it does not have permission to 
> clean this temporarily-generated directory. For details of the failure, see 
> below:
> {noformat}
> [INFO] --- maven-antrun-plugin:1.7:run (create-log-dir) @ hadoop-hdfs ---
> [INFO] Executing tasks
>  
> main:
> [delete] Deleting directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  8.349 s
> [INFO] Finished at: 2019-12-27T03:53:04-06:00
> [INFO] 
> 
> [ERROR] Failed to execute 
> goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-log-dir) on 
> project hadoop-hdfs: An Ant BuildException has occured: Unable to delete 
> directory 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
> [ERROR] around Ant part ...<delete dir="/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data"/>...
>  @ 4:105 in 
> /home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException{noformat}
>  
> *Root Cause*
> The test helper function 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`
>  purposely sets the directory 
> `/home/ctest/app/Ctest-Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current`
>  to have perm 000, and at the end of the function it changes the perm of 
> this directory back to 755. However, an assertion sits in the function 
> before the perm can be changed back. Once this assertion fails, the function 
> terminates before the directory's perm is restored to 755. Hence, Maven is 
> later unable to remove this directory when executing `mvn test`. 
>  
> *Fix*
> In 
> `org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration#testVolumeConfig`,
>  move the assertion `assertEquals(expectedBPServiceState, bpServiceState)` 
> to the last line of the function. This resolves the bug and does not change 
> the test outcome. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023275#comment-17023275
 ] 

Stephen O'Donnell edited comment on HDFS-7175 at 1/24/20 9:28 PM:
--

Yea, I ran it without the -showprogress switch, which gave this truncated 
output:

{code}
hdfs fsck /
2020-01-24 11:52:24,105 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via http://localhost:9870/fsck?ugi=sodonnell&path=%2F
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:52:24 GMT 2020

.
..
..
..
..

 Missing block groups:  0
 Corrupt block groups:  0
 Missing internal blocks:   0
 Blocks queued for replication: 0
FSCK ended at Fri Jan 24 11:52:26 GMT 2020 in 1196 milliseconds
{code}

Note there are now 10 dots per line, while previously there would have been 
100 per line.

I also ran with -showprogress to ensure that still works, and it logs the 
expected warning:

{code}
hdfs fsck / -showprogress
2020-01-24 11:55:08,414 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via 
http://localhost:9870/fsck?ugi=sodonnell&showprogress=1&path=%2F
The fsck switch -showprogress is deprecated and no longer has any effect. 
Progress is now shown by default.
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:55:09 GMT 2020

.

{code}

Note that I was not able to test a long-running fsck, but the dots being 
printed are what is required to keep the connection open.
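
To illustrate why the dots are enough (a hypothetical helper, not the actual 
NamenodeFsck code): as long as the server writes and flushes something before 
the client's read timeout elapses, the read never times out.

{code:java}
import java.io.PrintWriter;

// Sketch: print one dot per checked file, wrapped at a fixed line width,
// and flush immediately so the byte reaches the client's socket.
class FsckProgress {
  private final PrintWriter out;
  private long filesChecked = 0;

  FsckProgress(PrintWriter out) {
    this.out = out;
  }

  void oneFileChecked(int lineWidth) {
    out.print('.');                    // one byte of visible progress
    if (++filesChecked % lineWidth == 0) {
      out.println();                   // wrap the dot line
    }
    out.flush();                       // push the byte out right away
  }
}
{code}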


was (Author: sodonnell):
Yea, I ran it without the -showprogress switch, which gave this truncated 
output:

{code}
hdfs fsck /
2020-01-24 11:52:24,105 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via http://localhost:9870/fsck?ugi=sodonnell&path=%2F
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:52:24 GMT 2020

.
..
..
..
..

 Missing block groups:  0
 Corrupt block groups:  0
 Missing internal blocks:   0
 Blocks queued for replication: 0
FSCK ended at Fri Jan 24 11:52:26 GMT 2020 in 1196 milliseconds
{code}

Note there are now 10 dots per line, while previously there would have been 
100 per line.

I also ran with -showprogress to ensure that still works, and it logs the 
expected warning:

{code}
hdfs fsck / -showprogress
2020-01-24 11:55:08,414 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via 
http://localhost:9870/fsck?ugi=sodonnell&showprogress=1&path=%2F
The fsck switch -showprogress is deprecated and no longer has any effect. 
Progress is now shown by default.
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:55:09 GMT 2020

.

{code}

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.30
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch, 
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> 

[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023275#comment-17023275
 ] 

Stephen O'Donnell commented on HDFS-7175:
-

Yea, I ran it without the -showprogress switch, which gave this truncated 
output:

{code}
hdfs fsck /
2020-01-24 11:52:24,105 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via http://localhost:9870/fsck?ugi=sodonnell&path=%2F
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:52:24 GMT 2020

.
..
..
..
..

 Missing block groups:  0
 Corrupt block groups:  0
 Missing internal blocks:   0
 Blocks queued for replication: 0
FSCK ended at Fri Jan 24 11:52:26 GMT 2020 in 1196 milliseconds
{code}

Note there are now 10 dots per line, while previously there would have been 
100 per line.

I also ran with -showprogress to ensure that still works, and it logs the 
expected warning:

{code}
hdfs fsck / -showprogress
2020-01-24 11:55:08,414 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
Connecting to namenode via 
http://localhost:9870/fsck?ugi=sodonnell&showprogress=1&path=%2F
The fsck switch -showprogress is deprecated and no longer has any effect. 
Progress is now shown by default.
FSCK started by sodonnell (auth:SIMPLE) from /127.0.0.1 for path / at Fri Jan 
24 11:55:09 GMT 2020

.

{code}

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.30
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch, 
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To 

[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023273#comment-17023273
 ] 

Wei-Chiu Chuang commented on HDFS-7175:
---

Makes sense to me [~sodonnell]. Have you verified that fsck prints the dots as 
expected to keep the connection open?

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.30
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch, 
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023207#comment-17023207
 ] 

Ayush Saxena commented on HDFS-15143:
-

Test failures not related

> LocatedStripedBlock returns wrong block type
> 
>
> Key: HDFS-15143
> URL: https://issues.apache.org/jira/browse/HDFS-15143
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15143-01.patch, HDFS-15143-02.patch
>
>
> LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, which should 
> actually be {{STRIPED}}.
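
One plausible shape of the fix (an assumption here, not verified against the 
attached patches) is to override the accessor in {{LocatedStripedBlock}} so it 
stops inheriting the {{CONTIGUOUS}} default from {{LocatedBlock}}:

{code:java}
@Override
public BlockType getBlockType() {
  return BlockType.STRIPED;
}
{code}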



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023122#comment-17023122
 ] 

Hadoop QA commented on HDFS-15143:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
1s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
51s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 58s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}184m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15143 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991745/HDFS-15143-02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 85fa09d78d62 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8390547 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| 

[jira] [Commented] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023079#comment-17023079
 ] 

Hadoop QA commented on HDFS-15143:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15143 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991743/HDFS-15143-01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aede7cb7af85 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8390547 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 

[jira] [Commented] (HDFS-15140) Replace FoldedTreeSet in Datanode with SortedSet or TreeMap

2020-01-24 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023077#comment-17023077
 ] 

Stephen O'Donnell commented on HDFS-15140:
--

One problem I have with this change is that I have no idea why the 
FoldedTreeSet structure degrades so badly after some period of time. It would 
be great to figure out why that happens and fix it.

The datanode originally used a ResizableGSet for the block map, which is 
unsorted. However, along with the introduction of FoldedTreeSet, we introduced 
sorted block reports and the namenode now relies upon them being sorted.

Therefore, there are (at least) two choices here:

1. Replace FoldedTreeSet with another sorted structure. Eg TreeMap would work. 

2. Switch back to the GSet in the datanode, and then sort the blocks each time 
we create a block report. 

In benchmarks I ran, TreeMap seems to perform marginally better than 
FoldedTreeSet on adds and deletes. For gets, FoldedTreeSet is marginally 
faster for sets under 8M entries, and TreeMap is faster for larger sets. The 
performance gap seems to widen in TreeMap's favour as the sets get bigger.

However, a major plus point for FoldedTreeSet is its reduced memory overhead. 
Loading each structure with 1M entries and capturing a heap dump, we can see 
TreeMap has an overhead of about 40 bytes per entry, as each object is stored 
in a TreeMap$Entry:

{code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:       1000000       40000000  java.util.TreeMap$Entry
   2:       1000129       24003096  java.lang.Long
   3:       1000000       24000000  com.sodonnell.BlockMock
{code}

FoldedTreeSet however, has an overhead of about 5 bytes per entry due to 
storing up to 64 objects in each tree node:

{code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:       1000000       24000000  com.sodonnell.BlockMock
   2:         15941        4263344  [Ljava.lang.Object;
   3:         15625         875000  com.sodonnell.FoldedTreeSet$Node
{code}

Therefore switching to TreeMap would cost about (40 - 5) * 1M blocks = 33.37MB 
of additional heap per 1M blocks and give comparable get, delete and add 
performance.

For option (2), we could capture the list of blocks for a block report, then 
drop the DN lock and sort them. Block Reports are sent per volume, so sorting 
volume by volume would be doable. 

Benchmarking a sort on an array of objects, where the comparison is performed 
against a long, gives:

{code}
Time taken to sort 500000 elements: 338
Time taken to sort 1000000 elements: 337
Time taken to sort 2000000 elements: 632
Time taken to sort 4000000 elements: 1438
Time taken to sort 8000000 elements: 3244
Time taken to sort 16000000 elements: 7260
{code}

The extra sort would require materializing another list of references to the 
objects, but even for 20M blocks (a lot for a DN) at 4 bytes per reference 
this is only about 76MB, which would be quickly GC'ed afterwards.
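
As a rough sketch of what option (2) could look like (with hypothetical types; 
the real datanode replica classes differ):

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Snapshot one volume's (unsorted) replicas while holding the dataset
// lock, then sort the snapshot by block ID with the lock released. The
// block map itself can stay an unsorted GSet; only the block report is
// sorted, on demand, per volume.
class BlockReportSorter {

  // Hypothetical stand-in for the real replica type.
  static class Block {
    private final long blockId;
    Block(long blockId) { this.blockId = blockId; }
    long getBlockId() { return blockId; }
  }

  static List<Block> sortedReportFor(Iterable<Block> volumeReplicas,
                                     Object datasetLock) {
    List<Block> snapshot = new ArrayList<>();
    synchronized (datasetLock) {       // brief: only copies references
      for (Block b : volumeReplicas) {
        snapshot.add(b);
      }
    }
    // Sort outside the lock; the namenode expects the report ordered.
    snapshot.sort(Comparator.comparingLong(Block::getBlockId));
    return snapshot;
  }
}
{code}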

> Replace FoldedTreeSet in Datanode with SortedSet or TreeMap
> ---
>
> Key: HDFS-15140
> URL: https://issues.apache.org/jira/browse/HDFS-15140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>
> Based on the problems discussed in HDFS-15131, I would like to explore 
> replacing the FoldedTreeSet structure in the datanode with a builtin Java 
> equivalent - either SortedSet or TreeMap.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15144:
-
 Attachment: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt
Component/s: datanode

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart the Datanodes. However, the loop body restarts DN-0 three 
> times. As a result, the JUnit test does not really exercise the scenario it 
> was supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the inner block of the loop to {{cluster.restartDataNode(i, 
> true)}}, the test failed with the stack trace below. I suspect that one of 
> the datanodes does not start properly after the restart.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.805 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] 
> testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
>   Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated HDFS-15144:
-
Description: 
{{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
times to restart the Datanodes. However, the loop body restarts DN-0 three 
times. As a result, the JUnit test does not really exercise the scenario it 
was supposed to.



{code:java}
DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
for (int i = 0; i < 3; i++) {
  cluster.restartDataNode(0, true);
}
// wait for heartbeat
Thread.sleep(6000);
storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
.getStorageTypeStats();
storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
assertEquals(6, storageTypeStats.getNodesInService());
{code}

When I changed the inner block of the loop to {{cluster.restartDataNode(i, 
true)}}, the test failed with the stack trace below. I suspect that one of 
the datanodes does not start properly after the restart.


{code:bash}
[INFO] Running 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.805 
s <<< FAILURE! - in 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
[ERROR] 
testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)
  Time elapsed: 17.682 s  <<< FAILURE!
java.lang.AssertionError: expected:<6> but was:<5>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{code}


  was:
{{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
times to restart the Datanodes. However, the loop body restarts DN-0 three 
times. As a result, the JUnit test does not really exercise the scenario it 
was supposed to.
When I changed the loop, the test failed with the following stack trace.


> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> ---
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart the Datanodes. However, the loop body restarts DN-0 three 
> times. As a result, the JUnit test does not really exercise the scenario it 
> was supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
> .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the inner block of the loop to {{cluster.restartDataNode(i, 
> true)}}, the test failed with the stack trace below. 

[jira] [Commented] (HDFS-15119) Allow expiration of cached locations in DFSInputStream

2020-01-24 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023045#comment-17023045
 ] 

Kihwal Lee commented on HDFS-15119:
---

I've committed this to trunk, branch-3.2 and branch-3.1. For branch-2.10, the 
patch needs a bit of change. Please post a patch for branch-2.10.

> Allow expiration of cached locations in DFSInputStream
> --
>
> Key: HDFS-15119
> URL: https://issues.apache.org/jira/browse/HDFS-15119
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15119.001.patch, HDFS-15119.002.patch, 
> HDFS-15119.003.patch
>
>
> Staleness and other transient conditions can affect reads for a long time 
> since the block locations may not be re-fetched. It makes sense to make 
> cached locations expire.
> For example, we may not take advantage of local-reads since the nodes are 
> blacklisted and have not been updated.
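
A hypothetical sketch of the idea (made-up names, not the committed 
implementation):

{code:java}
import java.util.function.Supplier;

// Treat cached block locations as stale once a configurable interval has
// elapsed, and re-fetch them from the namenode on the next access.
class ExpiringLocations<T> {
  private final long expiryMs;   // e.g. driven by a client conf key
  private long fetchedAtMs;
  private T locations;

  ExpiringLocations(long expiryMs) {
    this.expiryMs = expiryMs;
  }

  synchronized T get(Supplier<T> refetch) {
    long now = System.currentTimeMillis();
    if (locations == null || now - fetchedAtMs > expiryMs) {
      locations = refetch.get();   // ask the namenode again
      fetchedAtMs = now;
    }
    return locations;
  }
}
{code}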



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15144) TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect

2020-01-24 Thread Ahmed Hussein (Jira)
Ahmed Hussein created HDFS-15144:


 Summary: 
TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
 Key: HDFS-15144
 URL: https://issues.apache.org/jira/browse/HDFS-15144
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


{{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
times to restart the Datanodes. However, the loop body restarts DN-0 three 
times. As a result, the JUnit test does not really exercise the scenario it 
was supposed to.
When I changed the loop, the test failed with the following stack trace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15119) Allow expiration of cached locations in DFSInputStream

2020-01-24 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-15119:
--
Fix Version/s: 3.2.2
   3.1.4
   3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Allow expiration of cached locations in DFSInputStream
> --
>
> Key: HDFS-15119
> URL: https://issues.apache.org/jira/browse/HDFS-15119
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15119.001.patch, HDFS-15119.002.patch, 
> HDFS-15119.003.patch
>
>
> Staleness and other transient conditions can affect reads for a long time 
> since the block locations may not be re-fetched. It makes sense to make 
> cached locations expire.
> For example, we may not take advantage of local-reads since the nodes are 
> blacklisted and have not been updated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15119) Allow expiration of cached locations in DFSInputStream

2020-01-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023023#comment-17023023
 ] 

Hudson commented on HDFS-15119:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17897 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17897/])
HDFS-15119. Allow expiration of cached locations in DFSInputStream. (kihwal: 
rev d10f77e3c91225f86ed9c0f0e6a9adf2e1434674)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStreamBlockLocations.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java


> Allow expiration of cached locations in DFSInputStream
> --
>
> Key: HDFS-15119
> URL: https://issues.apache.org/jira/browse/HDFS-15119
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: HDFS-15119.001.patch, HDFS-15119.002.patch, 
> HDFS-15119.003.patch
>
>
> Staleness and other transient conditions can affect reads for a long time 
> since the block locations may not be re-fetched. It makes sense to make 
> cached locations expire.
> For example, we may not take advantage of local-reads since the nodes are 
> blacklisted and have not been updated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022997#comment-17022997
 ] 

Hadoop QA commented on HDFS-7175:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 23m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
|   | hadoop.hdfs.TestDeadNodeDetection |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-7175 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991733/HDFS-7157.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 82f9632c86e1 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 978c487 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28706/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28706/testReport/ |
| Max. process+thread count | 3595 (vs. ulimit of 

[jira] [Commented] (HDFS-15119) Allow expiration of cached locations in DFSInputStream

2020-01-24 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022988#comment-17022988
 ] 

Kihwal Lee commented on HDFS-15119:
---

Here is my +1.

> Allow expiration of cached locations in DFSInputStream
> --
>
> Key: HDFS-15119
> URL: https://issues.apache.org/jira/browse/HDFS-15119
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: HDFS-15119.001.patch, HDFS-15119.002.patch, 
> HDFS-15119.003.patch
>
>
> Staleness and other transient conditions can affect reads for a long time 
> because the block locations may not be re-fetched. It makes sense to let 
> cached locations expire.
> For example, we may not take advantage of local reads when the nodes are 
> blacklisted and their status has not been updated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15143:

Attachment: HDFS-15143-02.patch

> LocatedStripedBlock returns wrong block type
> 
>
> Key: HDFS-15143
> URL: https://issues.apache.org/jira/browse/HDFS-15143
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15143-01.patch, HDFS-15143-02.patch
>
>
> LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, when it should 
> actually be {{STRIPED}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022956#comment-17022956
 ] 

Ayush Saxena commented on HDFS-15143:
-

{{BlockPlacementPolicy}} has a {{getPolicy}} method that returns a policy based 
on the block type.
I observed this while trying to retrieve the BPP corresponding to a block type. 
It seems this path is used by {{FSCK}} too, so the wrong type would cause a 
problem there as well.
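
For context, a minimal sketch of the shape the fix presumably takes: 
{{LocatedBlock}} appears to default the type to {{CONTIGUOUS}}, so the striped 
subclass needs to override it. Constructors and all other members are omitted, 
and the exact signature is an assumption based on the names in this 
discussion, not the attached patch.

{noformat}
import org.apache.hadoop.hdfs.protocol.BlockType;

public class LocatedStripedBlock extends LocatedBlock {
  // Constructors omitted for brevity. The base class is assumed to return
  // BlockType.CONTIGUOUS, which is what callers such as the getPolicy
  // lookup mentioned above would otherwise see for striped blocks.
  @Override
  public BlockType getBlockType() {
    return BlockType.STRIPED;
  }
}
{noformat}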

> LocatedStripedBlock returns wrong block type
> 
>
> Key: HDFS-15143
> URL: https://issues.apache.org/jira/browse/HDFS-15143
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15143-01.patch
>
>
> LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, when it should 
> actually be {{STRIPED}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15143:

Status: Patch Available  (was: Open)

> LocatedStripedBlock returns wrong block type
> 
>
> Key: HDFS-15143
> URL: https://issues.apache.org/jira/browse/HDFS-15143
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15143-01.patch
>
>
> LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, when it should 
> actually be {{STRIPED}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15143:

Attachment: HDFS-15143-01.patch

> LocatedStripedBlock returns wrong block type
> 
>
> Key: HDFS-15143
> URL: https://issues.apache.org/jira/browse/HDFS-15143
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15143-01.patch
>
>
> LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, when it should 
> actually be {{STRIPED}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-7175:

 Target Version/s: 3.3.0  (was: )
Affects Version/s: 3.3.0
   Status: Patch Available  (was: Open)

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch, 
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple of ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-7175:

Attachment: HDFS-7157.004.patch

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch, 
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple of ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15143) LocatedStripedBlock returns wrong block type

2020-01-24 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15143:
---

 Summary: LocatedStripedBlock returns wrong block type
 Key: HDFS-15143
 URL: https://issues.apache.org/jira/browse/HDFS-15143
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


LocatedStripedBlock returns the block type as {{CONTIGUOUS}}, when it should 
actually be {{STRIPED}}.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022879#comment-17022879
 ] 

Stephen O'Donnell commented on HDFS-7175:
-

This issue has been dormant for a long time, but as I mentioned in HDFS-2538, 
we are starting to see a lot of fsck timeout issues, caused by -showprogress 
being off by default.

As we know fsck will fail on a large cluster without -showprogress, I would 
like to suggest we do the following:

1) Deprecate the -showprogress switch. For compatibility reasons, leave it in 
the code for now, but have it log a warning and have no effect if it is passed. 
Instead, progress will always be printed.
2) Change the logic to print a dot for every 100 files processed, rather than 
every file.
3) Flush the output buffer every 1000 items processed (includes directories and 
symlinks as well as files) rather than 100.

I did consider the merits of adding a -quiet switch, but as that would cause 
timeouts on medium and large clusters, it seems like a pointless addition.

With the above changes, we will cut down on the volume of progress output 
significantly, while avoiding the timeouts caused by zero progress reporting. I 
will attach a patch for this shortly.
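
A rough sketch of points 2) and 3), assuming fsck keeps reporting progress 
through the {{PrintWriter}} it already writes results to. The class, method, 
and field names here are illustrative, not taken from the patch:

{noformat}
import java.io.PrintWriter;

class FsckProgressReporter {
  private long files = 0;
  private long items = 0;   // files + directories + symlinks

  // Called once per item the check visits.
  void reportProgress(PrintWriter out, boolean isFile) {
    if (isFile && ++files % 100 == 0) {
      out.print('.');       // one dot per 100 files instead of per file
    }
    if (++items % 1000 == 0) {
      out.flush();          // keep bytes flowing so the client's HTTP
    }                       // read timeout never fires
  }
}
{noformat}

This cuts the progress output by roughly two orders of magnitude relative to 
the old per-file dot, while still guaranteeing the client sees regular traffic.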

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, 
> HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple of ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-7175) Client-side SocketTimeoutException during Fsck

2020-01-24 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell reassigned HDFS-7175:
---

Assignee: Stephen O'Donnell  (was: Subbu Subramaniam)

> Client-side SocketTimeoutException during Fsck
> --
>
> Key: HDFS-7175
> URL: https://issues.apache.org/jira/browse/HDFS-7175
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Carl Steinbach
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, 
> HDFS-7175.patch
>
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally 
> be enabled with the -showprogress option). We have observed that without 
> status reporting the client will abort with read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck / 
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation: 
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) 
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
>   at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
>   at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there's nothing for the client to read, it will abort if the time 
> required to complete the fsck operation is longer than the client's read 
> timeout setting.
> I can think of a couple of ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server-side write (and flush) zeros to the wire and instruct the 
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server-side will trigger 
> an HTTP response with a zero length payload. This may be enough to keep the 
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13179) TestLazyPersistReplicaRecovery#testDnRestartWithSavedReplicas fails intermittently

2020-01-24 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022774#comment-17022774
 ] 

Hadoop QA commented on HDFS-13179:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 56s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}168m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-13179 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991711/HDFS-13179.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b32cc5950b55 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 978c487 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28705/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28705/testReport/ |
| Max. process+thread count | 2986 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28705/console |
| Powered by | Apache