[jira] [Issue Comment Deleted] (HDFS-15202) HDFS-client: boost ShortCircuit Cache

2020-05-02 Thread Danil Lipovoy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danil Lipovoy updated HDFS-15202:
-
Comment: was deleted

(was: [~leosun08]
Could you take a look, please?)

> HDFS-client: boost ShortCircuit Cache
> -
>
> Key: HDFS-15202
> URL: https://issues.apache.org/jira/browse/HDFS-15202
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
> Environment: 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 8 RegionServers (2 by host)
> 8 tables by 64 regions by 1.88 Gb data in each = 900 Gb total
> Random read in 800 threads via YCSB and a little bit updates (10% of reads)
>Reporter: Danil Lipovoy
>Assignee: Danil Lipovoy
>Priority: Minor
> Attachments: HDFS_CPU_full_cycle.png, cpu_SSC.png, cpu_SSC2.png, 
> hdfs_cpu.png, hdfs_reads.png, hdfs_scc_3_test.png, 
> hdfs_scc_test_full-cycle.png, locks.png, requests_SSC.png
>
>
> I want to propose a way to improve the read performance of the HDFS client. The 
> idea: create a few instances of the ShortCircuit cache instead of one. 
> The key points:
> 1. Create an array of caches (its size is set by 
> clientShortCircuitNum = *dfs.client.short.circuit.num*, see the pull 
> requests below):
> {code:java}
> private ClientContext(String name, DfsClientConf conf, Configuration config) {
> ...
> shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
> for (int i = 0; i < this.clientShortCircuitNum; i++) {
>   this.shortCircuitCache[i] = ShortCircuitCache.fromConf(scConf);
> }
> {code}
> 2. Then distribute blocks across the caches:
> {code:java}
>   public ShortCircuitCache getShortCircuitCache(long idx) {
> return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
>   }
> {code}
> 3. And how to call it:
> {code:java}
> ShortCircuitCache cache = 
> clientContext.getShortCircuitCache(block.getBlockId());
> {code}
> The last digit of the block ID is evenly distributed from 0 to 9, so all 
> caches fill approximately evenly, which is good for performance. The attached 
> results are from a load test reading HDFS via HBase with 
> clientShortCircuitNum = 1 vs 3: throughput grows by ~30% while CPU usage 
> rises by about 15%. 
> Hope this is interesting for someone.
> I am ready to explain any unobvious details.
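As a side note, a minimal standalone sketch (not part of the patch) of the
block-ID-to-cache mapping shown above; the cache count of 3 and the sequential
block IDs are only illustrative assumptions:
{code:java}
// Standalone illustration of getShortCircuitCache(blockId): block IDs are
// mapped to cache slots by modulo, so sequential IDs spread evenly.
public class ShortCircuitCacheSpread {
  public static void main(String[] args) {
    final int clientShortCircuitNum = 3;  // illustrative value of dfs.client.short.circuit.num
    long[] hitsPerCache = new long[clientShortCircuitNum];

    // Simulate a range of block IDs; HDFS tends to allocate them sequentially.
    for (long blockId = 1_073_741_825L; blockId < 1_073_741_825L + 1_000_000L; blockId++) {
      int cacheIdx = (int) (blockId % clientShortCircuitNum);
      hitsPerCache[cacheIdx]++;
    }

    for (int i = 0; i < clientShortCircuitNum; i++) {
      System.out.printf("cache[%d] selected %d times%n", i, hitsPerCache[i]);
    }
    // Each cache is selected ~333,333 times, i.e. the load is spread evenly.
  }
}
{code}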



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-05-02 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15255:
---
Attachment: HDFS-15255-findbugs-test.001.patch

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15255-findbugs-test.001.patch, 
> HDFS-15255.001.patch, HDFS-15255.002.patch, HDFS-15255.003.patch, 
> HDFS-15255.004.patch, HDFS-15255.005.patch, HDFS-15255.006.patch, 
> experiment-find-bugs.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, the 
> current logic considers only the distance between the client and the DataNode 
> when the client reads the data. I think it should also consider the 
> StorageType of the replica and prefer the node with the faster StorageType 
> when the distances are the same.
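For illustration only, a minimal, self-contained sketch of the tie-breaking
idea described above; the Replica class, the distance values, and the
SSD-before-HDD ranking are hypothetical stand-ins, not the actual
DatanodeManager#sortLocatedBlock() code:
{code:java}
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a located replica: network distance plus storage rank.
class Replica {
  final String node;
  final int distance;     // network distance from the client
  final int storageRank;  // 0 = SSD, 1 = HDD (lower is faster)

  Replica(String node, int distance, int storageRank) {
    this.node = node;
    this.distance = distance;
    this.storageRank = storageRank;
  }
}

public class SortByDistanceThenStorageType {
  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1-hdd", 2, 1),
        new Replica("dn2-ssd", 2, 0),   // same distance as dn1, but SSD
        new Replica("dn3-hdd", 4, 1));

    // Primary key: distance (current behavior); secondary key: storage rank,
    // so the SSD replica wins when distances are equal.
    replicas.sort(Comparator
        .comparingInt((Replica r) -> r.distance)
        .thenComparingInt(r -> r.storageRank));

    replicas.forEach(r -> System.out.println(r.node));
    // Prints: dn2-ssd, dn1-hdd, dn3-hdd
  }
}
{code}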



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-05-02 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15255:
---
Attachment: (was: HDFS-15255-findbugs-test.001.patch)

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch, 
> HDFS-15255.003.patch, HDFS-15255.004.patch, HDFS-15255.005.patch, 
> HDFS-15255.006.patch, experiment-find-bugs.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, the 
> current logic considers only the distance between the client and the DataNode 
> when the client reads the data. I think it should also consider the 
> StorageType of the replica and prefer the node with the faster StorageType 
> when the distances are the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-05-02 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15255:
---
Attachment: HDFS-15255-findbugs-test.001.patch

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch, 
> HDFS-15255.003.patch, HDFS-15255.004.patch, HDFS-15255.005.patch, 
> HDFS-15255.006.patch, experiment-find-bugs.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, the 
> current logic considers only the distance between the client and the DataNode 
> when the client reads the data. I think it should also consider the 
> StorageType of the replica and prefer the node with the faster StorageType 
> when the distances are the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098216#comment-17098216
 ] 

Hudson commented on HDFS-15325:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18210 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18210/])
HDFS-15325. TestRefreshCallQueue is failing due to changed CallQueue (liuml07: 
rev 44de193bec3eb9cc0e95f8b2f609060dd914eb94)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/TestRefreshCallQueue.java


> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15325.001.patch
>
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.
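For context, a minimal sketch (with hypothetical class and parameter names,
not Hadoop's actual CallQueueManager code) of why a reflectively constructed
subclass breaks when the expected constructor gains a parameter:
{code:java}
import java.lang.reflect.Constructor;
import java.util.concurrent.LinkedBlockingQueue;

public class ReflectiveQueueConstruction {
  // Hypothetical subclass that only declares the *old* two-argument constructor.
  public static class MockQueue<E> extends LinkedBlockingQueue<E> {
    public MockQueue(int capacity, String namespace) {
      super(capacity);
    }
  }

  public static void main(String[] args) throws Exception {
    // The framework now looks up a three-argument constructor (a new parameter
    // was added), so the lookup fails with NoSuchMethodException and the caller
    // reports that the queue "could not be constructed."
    try {
      Constructor<?> ctor = MockQueue.class.getDeclaredConstructor(
          int.class, String.class, Object.class /* newly added parameter */);
      ctor.newInstance(128, "ns", null);
    } catch (NoSuchMethodException e) {
      System.out.println("MockQueue could not be constructed: " + e);
    }
  }
}
{code}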



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15326) TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-15326:
-
Description: 
Sample failing build: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
Sample failing stack:
{code}
Error Message

Wrongly computed block reconstruction work

Stacktrace

java.lang.AssertionError: Wrongly computed block reconstruction work
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{code}

  was:
Sample failing build: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
Sample failing stack:
{code}
Error Message
Wrongly computed block reconstruction work
Stacktrace
java.lang.AssertionError: Wrongly computed block reconstruction work
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{code}


> TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently
> 
>
> Key: HDFS-15326
> URL: https://issues.apache.org/jira/browse/HDFS-15326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Sample failing build: 
> https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
> Sample failing stack:
> {code}
> Error Message
> Wrongly computed block reconstruction work
> Stacktrace
> java.lang.AssertionError: Wrongly computed block reconstruction work
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
>   at

[jira] [Commented] (HDFS-15327) TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3 fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098214#comment-17098214
 ] 

Mingliang Liu commented on HDFS-15327:
--

Might be related to HDFS-15326. Linking it here.

> TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3 fails 
> intermittently
> --
>
> Key: HDFS-15327
> URL: https://issues.apache.org/jira/browse/HDFS-15327
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Sample stack trace:
> {code}
> Error Message
> ecReconstructionBytesRead should be  expected:<6501170> but was:<0>
> Stacktrace
> java.lang.AssertionError: ecReconstructionBytesRead should be  
> expected:<6501170> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3(TestDataNodeErasureCodingMetrics.java:150)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> Sample failing build:
> - https://builds.apache.org/job/PreCommit-HDFS-Build/29226/testReport/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15327) TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3 fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)
Mingliang Liu created HDFS-15327:


 Summary: 
TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3 fails 
intermittently
 Key: HDFS-15327
 URL: https://issues.apache.org/jira/browse/HDFS-15327
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, erasure-coding, test
Affects Versions: 3.4.0
Reporter: Mingliang Liu


Sample stack trace:

{code}
Error Message
ecReconstructionBytesRead should be  expected:<6501170> but was:<0>
Stacktrace
java.lang.AssertionError: ecReconstructionBytesRead should be  
expected:<6501170> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testReconstructionBytesPartialGroup3(TestDataNodeErasureCodingMetrics.java:150)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{code}

Sample failing build:
- https://builds.apache.org/job/PreCommit-HDFS-Build/29226/testReport/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15326) TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-15326:
-
Issue Type: Bug  (was: Test)

> TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently
> 
>
> Key: HDFS-15326
> URL: https://issues.apache.org/jira/browse/HDFS-15326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Sample failing build: 
> https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
> Sample failing stack:
> {code}
> Error Message
> Wrongly computed block reconstruction work
> Stacktrace
> java.lang.AssertionError: Wrongly computed block reconstruction work
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15326) TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-15326:
-
Component/s: erasure-coding

> TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently
> 
>
> Key: HDFS-15326
> URL: https://issues.apache.org/jira/browse/HDFS-15326
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> Sample failing build: 
> https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
> Sample failing stack:
> {code}
> Error Message
> Wrongly computed block reconstruction work
> Stacktrace
> java.lang.AssertionError: Wrongly computed block reconstruction work
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-15325:
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}}. Thanks [~shv] for filing this and finding the root cause. 
Thanks [~fengnanli] for providing a patch. Thanks [~ayushtkn] for the discussion.

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15325.001.patch
>
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098198#comment-17098198
 ] 

Hadoop QA commented on HDFS-15325:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 32s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
41s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 17s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}196m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29226/artifact/out/Dockerfile
 |
| JIRA Issue | HDFS-15325 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13001910/HDFS-15325.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux c024f658c2a4 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / f40dacdc2e4 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| unit | 
https://bui

[jira] [Comment Edited] (HDFS-15264) Backport HDFS-13571,HDFS-15149 to branch-3.1, branch-2.10

2020-05-02 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098195#comment-17098195
 ] 

Lisheng Sun edited comment on HDFS-15264 at 5/3/20, 3:04 AM:
-

The findbugs warnings should not be related to this patch.
[~ayushtkn] Could you help confirm whether any of the findbugs warnings are 
related to this patch?
{code:java}
CodeWarning
WMI org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator
Bug type WMI_WRONG_MAP_ITERATOR (click for details)
In class org.apache.hadoop.hdfs.server.protocol.SlowDiskReports
In method org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object)
Field org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.slowDisks
At SlowDiskReports.java:[line 105]

Bad practice Warnings
CodeWarning
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat
Dodgy code Warnings
CodeWarning
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path, 
BasicFileAttributes) due to return value of called method
UC  Useless condition: it's known that argv.length >= 1 at this point
UC  Useless condition: it's known that numBlocks == -1 at this point
{code}

1. The backport to branch-3.1 is the same as trunk, i.e. unchanged.

2. There are a few small changes when backporting the patch to branch-2.10:
a.
{code:java}
 /**
   * Add datanode to suspectNodes and suspectAndDeadNodes.
   */
  public synchronized void addNodeToDetect(DFSInputStream dfsInputStream,
...  
{color:red}suspectAndDeadNodes.put(dfsInputStream, datanodeInfos);{color}
 ...
  }
{code}
  Map#putIfAbsent in DeadNodeDetector#addNodeToDetect is changed to Map#put, 
since Map#putIfAbsent is only available from JDK 1.8 and branch-2.10 is 
compiled with JDK 7.
b. The lambda in TestDeadNodeDetection#startWaitForDeadNodeThread is changed to 
a normal method.




was (Author: leosun08):
The findbugs warnings should not be related to this patch.
[~ayushtkn] Could you help confirm whether any of the findbugs warnings are 
related to this patch?
{code:java}
CodeWarning
WMI org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator
Bug type WMI_WRONG_MAP_ITERATOR (click for details)
In class org.apache.hadoop.hdfs.server.protocol.SlowDiskReports
In method org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object)
Field org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.slowDisks
At SlowDiskReports.java:[line 105]

Bad practice Warnings
CodeWarning
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat
Dodgy code Warnings
CodeWarning
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method
NP  Possible null pointer dereferen

[jira] [Commented] (HDFS-15264) Backport HDFS-13571,HDFS-15149 to branch-3.1, branch-2.10

2020-05-02 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098195#comment-17098195
 ] 

Lisheng Sun commented on HDFS-15264:


The findbugs warnings should not be related to this patch.
[~ayushtkn] Could you help confirm whether any of the findbugs warnings are 
related to this patch?
{code:java}
CodeWarning
WMI org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator
Bug type WMI_WRONG_MAP_ITERATOR (click for details)
In class org.apache.hadoop.hdfs.server.protocol.SlowDiskReports
In method org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object)
Field org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.slowDisks
At SlowDiskReports.java:[line 105]

Bad practice Warnings
CodeWarning
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat
ME  
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat
Dodgy code Warnings
CodeWarning
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method
NP  Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path, 
BasicFileAttributes) due to return value of called method
UC  Useless condition: it's known that argv.length >= 1 at this point
UC  Useless condition: it's known that numBlocks == -1 at this point
{code}
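As background on the WMI_WRONG_MAP_ITERATOR warning quoted above, a generic
illustration (not the SlowDiskReports code itself) of the keySet-versus-entrySet
pattern that findbugs flags:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class KeySetVsEntrySet {
  public static void main(String[] args) {
    Map<String, Long> slowDisks = new HashMap<>();
    slowDisks.put("disk1", 120L);
    slowDisks.put("disk2", 340L);

    // Pattern findbugs flags (WMI_WRONG_MAP_ITERATOR): iterating the key set
    // and then calling get() performs an extra lookup per key.
    for (String disk : slowDisks.keySet()) {
      Long latency = slowDisks.get(disk);
      System.out.println(disk + " -> " + latency);
    }

    // Preferred form: iterate the entry set so key and value come together.
    for (Map.Entry<String, Long> e : slowDisks.entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue());
    }
  }
}
{code}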

1. The backport to branch-3.1 is the same as trunk, i.e. unchanged.
2. There are a few small changes when backporting the patch to branch-2.10:
a.
{code:java}
 /**
   * Add datanode to suspectNodes and suspectAndDeadNodes.
   */
  public synchronized void addNodeToDetect(DFSInputStream dfsInputStream,
...  
{color:red}suspectAndDeadNodes.put(dfsInputStream, datanodeInfos);{color}
 ...
  }
{code}
  Map#putIfAbsent in DeadNodeDetector#addNodeToDetect is changed to Map#put, 
since Map#putIfAbsent is only available from JDK 1.8 and branch-2.10 is 
compiled with JDK 7 (see the sketch after this list).
b. The lambda in TestDeadNodeDetection#startWaitForDeadNodeThread is changed to 
a normal method.
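A small illustration (with a hypothetical map, not the actual DeadNodeDetector
fields) of the Map#putIfAbsent-to-Map#put change in point 2a; Map#putIfAbsent
is a Java 8 default method, so a JDK 7 build needs either a containsKey guard
or a plain put:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class PutIfAbsentBackport {
  public static void main(String[] args) {
    Map<String, String> suspectNodes = new HashMap<String, String>();
    suspectNodes.put("dn1", "first");

    // Java 8+: only inserts when the key is absent (default method on Map).
    // suspectNodes.putIfAbsent("dn1", "second");   // would keep "first"

    // JDK 7-compatible equivalent: guard the put with containsKey. A bare put,
    // as used in the branch-2.10 backport, overwrites any existing value instead.
    if (!suspectNodes.containsKey("dn1")) {
      suspectNodes.put("dn1", "second");
    }

    System.out.println(suspectNodes.get("dn1"));  // prints "first"
  }
}
{code}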



> Backport HDFS-13571,HDFS-15149 to branch-3.1, branch-2.10
> -
>
> Key: HDFS-15264
> URL: https://issues.apache.org/jira/browse/HDFS-15264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15264-branch-2.10.001.patch, 
> HDFS-15264-branch-2.10.002.patch, HDFS-15264-branch-2.10.003.patch, 
> HDFS-15264-branch-2.10.004.patch, HDFS-15264-branch-2.10.005.patch, 
> HDFS-15264-branch-3.1.001.patch, HDFS-15264-branch-3.1.002.patch, 
> HDFS-15264-branch-3.1.003.patch, HDFS-15264-branch-3.1.004.patch, 
> HDFS-15264.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15323) StandbyNode fails transition to active due to insufficient transaction tailing

2020-05-02 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098168#comment-17098168
 ] 

Hadoop QA commented on HDFS-15323:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
44s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} branch-2.10 passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-2.10 passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.10 passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} branch-2.10 passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
35s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
33s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in branch-2.10 has 10 
extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTokens |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | ha

[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098149#comment-17098149
 ] 

Mingliang Liu commented on HDFS-15325:
--

+1.

Will commit after a clean QA. 

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-15325.001.patch
>
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-15325:
--
Attachment: HDFS-15325.001.patch

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-15325.001.patch
>
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-15325:
--
Status: Patch Available  (was: Open)

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-15325.001.patch
>
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15323) StandbyNode fails transition to active due to insufficient transaction tailing

2020-05-02 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15323:
---
Attachment: HDFS-15323-branch-2.10.002.patch

> StandbyNode fails transition to active due to insufficient transaction tailing
> --
>
> Key: HDFS-15323
> URL: https://issues.apache.org/jira/browse/HDFS-15323
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, qjm
>Affects Versions: 2.7.7
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: HDFS-15323-branch-2.10.002.patch, 
> HDFS-15323.000.unitTest.patch, HDFS-15323.001.patch, HDFS-15323.002.patch
>
>
> StandbyNode is asked to {{transitionToActive()}}. If it fell too far behind 
> in tailing journal transactions (from QJM), it can crash with an 
> {{IllegalStateException}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15323) StandbyNode fails transition to active due to insufficient transaction tailing

2020-05-02 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098135#comment-17098135
 ] 

Hadoop QA commented on HDFS-15323:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m  
3s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 31s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}206m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.TestRefreshCallQueue |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29225/artifact/out/Dockerfile
 |
| JIRA Issue | HDFS-15323 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13001899/HDFS-15323.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 352987ed0a28 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / f40dacdc2e4 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29225/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org

[jira] [Assigned] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li reassigned HDFS-15325:
-

Assignee: Fengnan Li

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Fengnan Li
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098128#comment-17098128
 ] 

Fengnan Li commented on HDFS-15325:
---

I will provide a patch shortly. [~liuml07] [~ayushtkn]

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15320) StringIndexOutOfBoundsException in HostRestrictingAuthorizationFilter

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-15320:
-
Fix Version/s: 3.4.0
   3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to {{branch-3.3}} and {{trunk}} branches. Thanks for filing this and 
providing a patch, [~aajisaka]. Thanks for your review, [~clayb].

> StringIndexOutOfBoundsException in HostRestrictingAuthorizationFilter
> -
>
> Key: HDFS-15320
> URL: https://issues.apache.org/jira/browse/HDFS-15320
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HostRestrictingAuthorizationFilter (HDFS-14234) is 
> enabled
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
>
> When there is a request to "http://<host>:<port>/" without the "webhdfs/v1" 
> suffix, the DN returns a 500 response code and throws 
> StringIndexOutOfBoundsException as follows: 
> {noformat}
> 2020-05-01 16:10:20,220 ERROR 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler:
>  Exception in HostRestrictingAuthorizationFilterHandler
> java.lang.StringIndexOutOfBoundsException: String index out of range: -10
> at java.base/java.lang.String.substring(String.java:1841)
> at 
> org.apache.hadoop.hdfs.server.common.HostRestrictingAuthorizationFilter.handleInteraction(HostRestrictingAuthorizationFilter.java:234)
> at 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler.channelRead0(HostRestrictingAuthorizationFilterHandler.java:155)
> at 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler.channelRead0(HostRestrictingAuthorizationFilterHandler.java:58)
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:328)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:302)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
> at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
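To make the failure mode above concrete, here is a minimal, hedged sketch of how stripping a fixed path prefix with substring() throws for a bare "/" request, plus a defensive variant. The class and prefix constant are illustrative assumptions, not the actual HostRestrictingAuthorizationFilter code.

{code:java}
// Illustrative sketch only; names are assumptions, not the real filter code.
public class PrefixStripSketch {

  // Assumed WebHDFS path prefix (11 characters), used only for illustration.
  private static final String PREFIX = "/webhdfs/v1";

  // Buggy variant: for uri = "/", substring(11) is asked for 1 - 11 = -10
  // characters and throws StringIndexOutOfBoundsException; the -10 matches
  // the stack trace above (exact message text varies by JDK).
  static String stripUnchecked(String uri) {
    return uri.substring(PREFIX.length());
  }

  // Defensive variant: only strip the prefix when it is actually present,
  // so a request without it can be rejected instead of causing a 500.
  static String stripChecked(String uri) {
    if (!uri.startsWith(PREFIX)) {
      throw new IllegalArgumentException("Not a WebHDFS path: " + uri);
    }
    return uri.substring(PREFIX.length());
  }

  public static void main(String[] args) {
    System.out.println(stripChecked("/webhdfs/v1/tmp/file")); // -> /tmp/file
    try {
      stripUnchecked("/");
    } catch (StringIndexOutOfBoundsException e) {
      System.out.println("Unchecked strip failed: " + e.getMessage());
    }
  }
}
{code}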



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Closed] (HDFS-15324) TestRefreshCallQueue::testRefresh fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu closed HDFS-15324.


> TestRefreshCallQueue::testRefresh fails intermittently
> --
>
> Key: HDFS-15324
> URL: https://issues.apache.org/jira/browse/HDFS-15324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: namenode, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> This test seems flaky. It failed previously (see HDFS-10253) and is now 
> failing intermittently in {{trunk}}. Not sure whether we should use the same 
> Mock-based fix that was used in HDFS-10253.
> Sample failing builds:
> - 
> https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/29221/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/
> Sample failing stack is:
> {code}
> Error Message
> org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
> Stacktrace
> java.lang.RuntimeException: 
> org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
>   at 
> org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:193)
>   at 
> org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:83)
>   at org.apache.hadoop.ipc.Server.(Server.java:3087)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1039)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:427)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:475)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:857)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:763)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:1014)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:987)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1756)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1332)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1101)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:974)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:906)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:534)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:493)
>   at 
> org.apache.hadoop.TestRefreshCallQueue.setUp(TestRefreshCallQueue.java:67)
>   at 
> org.apache.hadoop.TestRefreshCallQueue.testRefresh(TestRefreshCallQueue.java:115)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.jav

[jira] [Resolved] (HDFS-15324) TestRefreshCallQueue::testRefresh fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HDFS-15324.
--
Resolution: Duplicate

> TestRefreshCallQueue::testRefresh fails intermittently
> --
>
> Key: HDFS-15324
> URL: https://issues.apache.org/jira/browse/HDFS-15324
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: namenode, test
>Affects Versions: 3.4.0
>Reporter: Mingliang Liu
>Priority: Major
>
> This test seems flaky. It failed previously (see HDFS-10253) and is now 
> failing intermittently in {{trunk}}. Not sure whether we should use the same 
> Mock-based fix that was used in HDFS-10253.
> Sample failing builds:
> - 
> https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/29221/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/
> Sample failing stack is:
> {code}
> Error Message
> org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
> Stacktrace
> java.lang.RuntimeException: 
> org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
>   at 
> org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:193)
>   at 
> org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:83)
>   at org.apache.hadoop.ipc.Server.(Server.java:3087)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1039)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:427)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:475)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:857)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:763)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:1014)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:987)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1756)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1332)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1101)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:974)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:906)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:534)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:493)
>   at 
> org.apache.hadoop.TestRefreshCallQueue.setUp(TestRefreshCallQueue.java:67)
>   at 
> org.apache.hadoop.TestRefreshCallQueue.testRefresh(TestRefreshCallQueue.java:115)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.exe

[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098099#comment-17098099
 ] 

Mingliang Liu commented on HDFS-15325:
--

Yeah, thanks [~ayushtkn].

[~shv] and I filed at the same time. I can close the other one, since this one 
clearly has the "is broken by" field set.



> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15320) StringIndexOutOfBoundsException in HostRestrictingAuthorizationFilter

2020-05-02 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098092#comment-17098092
 ] 

Hudson commented on HDFS-15320:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18209 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18209/])
HDFS-15320. StringIndexOutOfBoundsException in (github: rev 
f40dacdc2e42abced39f2706d1e3319430577269)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HostRestrictingAuthorizationFilter.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestHostRestrictingAuthorizationFilter.java


> StringIndexOutOfBoundsException in HostRestrictingAuthorizationFilter
> -
>
> Key: HDFS-15320
> URL: https://issues.apache.org/jira/browse/HDFS-15320
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HostRestrictingAuthorizationFilter (HDFS-14234) is 
> enabled
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> When there is a request to "http://:/" without the "webhdfs/v1" 
> suffix, the DN returns a 500 response code and throws a 
> StringIndexOutOfBoundsException as follows: 
> {noformat}
> 2020-05-01 16:10:20,220 ERROR 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler:
>  Exception in HostRestrictingAuthorizationFilterHandler
> java.lang.StringIndexOutOfBoundsException: String index out of range: -10
> at java.base/java.lang.String.substring(String.java:1841)
> at 
> org.apache.hadoop.hdfs.server.common.HostRestrictingAuthorizationFilter.handleInteraction(HostRestrictingAuthorizationFilter.java:234)
> at 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler.channelRead0(HostRestrictingAuthorizationFilterHandler.java:155)
> at 
> org.apache.hadoop.hdfs.server.datanode.web.HostRestrictingAuthorizationFilterHandler.channelRead0(HostRestrictingAuthorizationFilterHandler.java:58)
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:328)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:302)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
> at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15323) StandbyNode fails transition to active due to insufficient transaction tailing

2020-05-02 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098090#comment-17098090
 ] 

Konstantin Shvachko commented on HDFS-15323:


Fixed the checkstyle warning. TestRefreshCallQueue is failing because of 
HDFS-15325. TestUnderReplicatedBlocks passed locally.

> StandbyNode fails transition to active due to insufficient transaction tailing
> --
>
> Key: HDFS-15323
> URL: https://issues.apache.org/jira/browse/HDFS-15323
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, qjm
>Affects Versions: 2.7.7
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: HDFS-15323.000.unitTest.patch, HDFS-15323.001.patch, 
> HDFS-15323.002.patch
>
>
> StandbyNode is asked to {{transitionToActive()}}. If it fell too far behind 
> in tailing journal transactions (from QJM), it can crash with an 
> {{IllegalStateException}}.
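As a rough illustration of the failure mode described above (not the actual NameNode code; the fields, precondition check, and catch-up loop are assumptions for illustration only), a transition that requires all known journal transactions to have been applied will throw if tailing lagged:

{code:java}
// Illustrative sketch only; the real logic lives in the NameNode/edit-log
// tailing code, which this does not reproduce.
public class FailoverSketch {

  private long lastAppliedTxId; // highest transaction applied from the journal
  private long lastKnownTxId;   // highest transaction the journal reports

  void transitionToActive() {
    // If the standby fell too far behind in tailing, this precondition fails.
    if (lastAppliedTxId < lastKnownTxId) {
      throw new IllegalStateException("Cannot become active: applied txid "
          + lastAppliedTxId + " is behind journal txid " + lastKnownTxId);
    }
    // ... proceed with activation ...
  }

  // Catching up on the remaining edits before activating avoids the crash.
  void catchUpThenActivate() {
    while (lastAppliedTxId < lastKnownTxId) {
      lastAppliedTxId++; // stand-in for applying one more tailed edit
    }
    transitionToActive();
  }
}
{code}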



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15323) StandbyNode fails transition to active due to insufficient transaction tailing

2020-05-02 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15323:
---
Attachment: HDFS-15323.002.patch

> StandbyNode fails transition to active due to insufficient transaction tailing
> --
>
> Key: HDFS-15323
> URL: https://issues.apache.org/jira/browse/HDFS-15323
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, qjm
>Affects Versions: 2.7.7
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: HDFS-15323.000.unitTest.patch, HDFS-15323.001.patch, 
> HDFS-15323.002.patch
>
>
> StandbyNode is asked to {{transitionToActive()}}. If it fell too far behind 
> in tailing journal transactions (from QJM), it can crash with an 
> {{IllegalStateException}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098088#comment-17098088
 ] 

Ayush Saxena commented on HDFS-15325:
-

Same as HDFS-15324?

cc [~liuml07] 

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15325:
---
Description: {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, 
as it is missing a parameter in the constructor, which was added by 
HADOOP-17010.  (was: {{TestRefreshCallQueue.MockCallQueue}} cannot be 
instantiated, as it is missing a parameter in the constructor, which was added 
by )

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by HADOOP-17010.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15325:
---
Description: {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, 
as it is missing a parameter in the constructor, which was added by 

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestRefreshCallQueue.MockCallQueue}} cannot be instantiated, as it is 
> missing a parameter in the constructor, which was added by 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15326) TestDataNodeErasureCodingMetrics::testFullBlock fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)
Mingliang Liu created HDFS-15326:


 Summary: TestDataNodeErasureCodingMetrics::testFullBlock fails 
intermittently
 Key: HDFS-15326
 URL: https://issues.apache.org/jira/browse/HDFS-15326
 Project: Hadoop HDFS
  Issue Type: Test
  Components: datanode, test
Affects Versions: 3.4.0
Reporter: Mingliang Liu


Sample failing build: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testFullBlock/
Sample failing stack:
{code}
Error Message
Wrongly computed block reconstruction work
Stacktrace
java.lang.AssertionError: Wrongly computed block reconstruction work
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.doTest(TestDataNodeErasureCodingMetrics.java:205)
at 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testFullBlock(TestDataNodeErasureCodingMetrics.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Moved] (HDFS-15325) TestRefreshCallQueue is failing due to changed CallQueue constructor

2020-05-02 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko moved HADOOP-17026 to HDFS-15325:
-

Component/s: (was: test)
 test
Key: HDFS-15325  (was: HADOOP-17026)
Project: Hadoop HDFS  (was: Hadoop Common)

> TestRefreshCallQueue is failing due to changed CallQueue constructor
> 
>
> Key: HDFS-15325
> URL: https://issues.apache.org/jira/browse/HDFS-15325
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13179) TestLazyPersistReplicaRecovery#testDnRestartWithSavedReplicas fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098085#comment-17098085
 ] 

Mingliang Liu commented on HDFS-13179:
--

This is still happening in {{trunk}}; see e.g. 
[https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop.hdfs.server.datanode.fsdataset.impl/TestLazyPersistReplicaRecovery/testDnRestartWithSavedReplicas/]

> TestLazyPersistReplicaRecovery#testDnRestartWithSavedReplicas fails 
> intermittently
> --
>
> Key: HDFS-13179
> URL: https://issues.apache.org/jira/browse/HDFS-13179
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Gabor Bota
>Assignee: Ahmed Hussein
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-13179-branch-2.10.003.patch, HDFS-13179.001.patch, 
> HDFS-13179.002.patch, HDFS-13179.003.patch, test runs.zip
>
>
> The error is caused by a TimeoutException: the test waits to ensure that the 
> file is replicated to DISK storage, but the replication cannot finish within 
> the 30s timeout in ensureFileReplicasOnStorageType(). The file is still on 
> RAM_DISK, so there is no data loss.
> Adding the following to TestLazyPersistReplicaRecovery.java:56 essentially 
> fixes the flakiness. 
> {code:java}
> try {
>   ensureFileReplicasOnStorageType(path1, DEFAULT);
> }catch (TimeoutException t){
>   LOG.warn("We got \"" + t.getMessage() + "\" so trying to find data on 
> RAM_DISK");
>   ensureFileReplicasOnStorageType(path1, RAM_DISK);
> }
>   }
> {code}
> Some thoughts:
> * Successful and failed runs look similar up to the point when the datanode 
> restarts. The restart line in the log is: LazyPersistTestCase - 
> Restarting the DataNode
> * There is a line which only occurs in the failed test: *addStoredBlock: 
> Redundant addStoredBlock request received for blk_1073741825_1001 on node 
> 127.0.0.1:49455 size 5242880*
> * This redundant request at BlockManager#addStoredBlock could be the main 
> reason for the test failure. Something wrong with the gen stamp? Corrupt 
> replicas? 
> =
> Current fail ratio based on my test of TestLazyPersistReplicaRecovery: 
> 1000 runs, 34 failures (3.4% fail)
> Failure rate analysis:
> TestLazyPersistReplicaRecovery.testDnRestartWithSavedReplicas: 3.4%
> 33 failures caused by: {noformat}
> java.util.concurrent.TimeoutException: Timed out waiting for condition. 
> Thread diagnostics: Timestamp: 2018-01-05 11:50:34,964 "IPC Server handler 6 
> on 39589" 
> {noformat}
> 1 failure caused by: {noformat}
> java.net.BindException: Problem binding to [localhost:56729] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery.testDnRestartWithSavedReplicas(TestLazyPersistReplicaRecovery.java:49)
>  Caused by: java.net.BindException: Address already in use at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery.testDnRestartWithSavedReplicas(TestLazyPersistReplicaRecovery.java:49)
> {noformat}
> =
> Example stacktrace:
> {noformat}
> Timed out waiting for condition. Thread diagnostics:
> Timestamp: 2017-11-01 10:36:49,499
> "Thread-1" prio=5 tid=13 runnable
> java.lang.Thread.State: RUNNABLE
> at java.lang.Thread.dumpThreads(Native Method)
> at java.lang.Thread.getAllStackTraces(Thread.java:1610)
> at 
> org.apache.hadoop.test.TimedOutTestsListener.buildThreadDump(TimedOutTestsListener.java:87)
> at 
> org.apache.hadoop.test.TimedOutTestsListener.buildThreadDiagnosticString(TimedOutTestsListener.java:73)
> at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:369)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:140)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery.testDnRestartWithSavedReplicas(TestLazyPersistReplicaRecovery.java:54)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> ...
> {noformat}
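To make the 30s-timeout behaviour in the quoted description concrete, here is a minimal, hedged sketch of the poll-until-timeout pattern that ensureFileReplicasOnStorageType() relies on (the helper below is an illustrative stand-in, not the actual GenericTestUtils/LazyPersistTestCase code):

{code:java}
// Illustrative sketch of a poll-until-timeout check; names and intervals
// are assumptions, not the actual Hadoop test utility code.
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public final class WaitForSketch {

  // Polls the condition every intervalMs until it holds or timeoutMs elapses.
  static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("Timed out waiting for condition.");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // Example: wait until 100 ms have passed, polling every 10 ms,
    // with a 30-second budget (mirroring the test's 30s timeout).
    waitFor(() -> System.currentTimeMillis() - start >= 100, 10, 30_000);
    System.out.println("Condition met within the timeout.");
  }
}
{code}

When the condition cannot become true within the budget (here, the replica never landing on DISK in time), such a helper throws the TimeoutException seen in the stack trace above, which is why the quoted workaround falls back to checking RAM_DISK.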



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail:

[jira] [Created] (HDFS-15324) TestRefreshCallQueue::testRefresh fails intermittently

2020-05-02 Thread Mingliang Liu (Jira)
Mingliang Liu created HDFS-15324:


 Summary: TestRefreshCallQueue::testRefresh fails intermittently
 Key: HDFS-15324
 URL: https://issues.apache.org/jira/browse/HDFS-15324
 Project: Hadoop HDFS
  Issue Type: Test
  Components: namenode, test
Affects Versions: 3.4.0
Reporter: Mingliang Liu


This test seems flaky. It failed previously (see HDFS-10253) and is now 
failing intermittently in {{trunk}}. Not sure whether we should use the same 
Mock-based fix that was used in HDFS-10253.

Sample failing builds:
- 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1992/2/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/
- 
https://builds.apache.org/job/PreCommit-HDFS-Build/29221/testReport/org.apache.hadoop/TestRefreshCallQueue/testRefresh/

Sample failing stack is:
{code}
Error Message
org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
Stacktrace
java.lang.RuntimeException: 
org.apache.hadoop.TestRefreshCallQueue$MockCallQueue could not be constructed.
at 
org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:193)
at 
org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:83)
at org.apache.hadoop.ipc.Server.(Server.java:3087)
at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1039)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:427)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:475)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:857)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:763)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:1014)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:987)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1756)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1332)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1101)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:974)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:906)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:534)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:493)
at 
org.apache.hadoop.TestRefreshCallQueue.setUp(TestRefreshCallQueue.java:67)
at 
org.apache.hadoop.TestRefreshCallQueue.testRefresh(TestRefreshCallQueue.java:115)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProc