[jira] [Commented] (HDFS-17533) RBF: Unit tests that use embedded SQL failing in CI

2024-05-20 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848013#comment-17848013
 ] 

Shilun Fan commented on HDFS-17533:
---

[~simbadzina] Thank you for the feedback! We upgraded Derby from 10.14.2.0 to 
10.17.1.0 in [https://github.com/apache/hadoop/pull/6816], and no unit test 
failures were observed at that time. I will roll back #6816.

> RBF: Unit tests that use embedded SQL failing in CI
> ---
>
> Key: HDFS-17533
> URL: https://issues.apache.org/jira/browse/HDFS-17533
> Project: Hadoop HDFS
> Issue Type: Test
> Reporter: Simbarashe Dzinamarira
> Assignee: Simbarashe Dzinamarira
> Priority: Major
>
> In the CI runs for RBF, the following two tests are failing:
> {noformat}
> [ERROR] Failures: 
> [ERROR] 
> org.apache.hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl.null
> [ERROR]   Run 1: TestSQLDelegationTokenSecretManagerImpl Multiple Failures (2 
> failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:TokenStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:TokenStore;drop=true
> [ERROR]   Run 2: TestSQLDelegationTokenSecretManagerImpl Multiple Failures (2 
> failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:TokenStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:TokenStore;drop=true
> [ERROR]   Run 3: TestSQLDelegationTokenSecretManagerImpl Multiple Failures (2 
> failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:TokenStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:TokenStore;drop=true
> [INFO] 
> [ERROR] 
> org.apache.hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL.null
> [ERROR]   Run 1: TestStateStoreMySQL Multiple Failures (2 failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:StateStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:StateStore;drop=true
> [ERROR]   Run 2: TestStateStoreMySQL Multiple Failures (2 failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:StateStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:StateStore;drop=true
> [ERROR]   Run 3: TestStateStoreMySQL Multiple Failures (2 failures)
>   java.sql.SQLException: No suitable driver found for 
> jdbc:derby:memory:StateStore;create=true
>   java.lang.RuntimeException: java.sql.SQLException: No suitable driver 
> found for jdbc:derby:memory:StateStore;drop=true {noformat}
> [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6804/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt]
>  
> I believe the fix is to register the driver first: 
> [https://dev.mysql.com/doc/connector-j/en/connector-j-usagenotes-connect-drivermanager.html]
> [https://stackoverflow.com/questions/22384710/java-sql-sqlexception-no-suitable-driver-found-for-jdbcmysql-localhost3306]
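
The "No suitable driver found" text is produced by java.sql.DriverManager when no registered driver accepts the JDBC URL. The idea of registering the driver explicitly before connecting can be sketched as follows. This is only a sketch: the driver class name org.apache.derby.jdbc.EmbeddedDriver is an assumption that should be checked against the Derby version on the test classpath, and DerbyDriverCheck with its methods is a hypothetical helper, not code from the Hadoop tests.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class DerbyDriverCheck {
  // Assumed driver class name; verify it against the Derby jars actually in use.
  static final String DERBY_DRIVER = "org.apache.derby.jdbc.EmbeddedDriver";

  /** Tries to load (and thereby register) a driver class; false if it cannot be loaded. */
  static boolean tryLoadDriver(String className) {
    try {
      Class.forName(className);
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Opens and closes a connection; returns null on success, else the error message. */
  static String probe(String jdbcUrl) {
    try (Connection c = DriverManager.getConnection(jdbcUrl)) {
      return null;
    } catch (SQLException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    String url = "jdbc:derby:memory:TokenStore;create=true";
    System.out.println("driver loaded: " + tryLoadDriver(DERBY_DRIVER));
    System.out.println("probe result: " + probe(url));
  }
}
```

If tryLoadDriver returns false, the driver class is not on the classpath at all, and registering it in the test would not help; the dependency itself would need attention.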



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-14 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HDFS-17520.
---
Fix Version/s: 3.4.1, 3.5.0
Hadoop Flags: Reviewed
Target Version/s: 3.4.1, 3.5.0
Resolution: Fixed

> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> {code:java}
> [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
> [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
> Time elapsed: 2.086 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
> property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true"",
> "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
> 13:02:51 UTC 2024."]>
> to contain subsequence:
>  <["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true""]>
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}
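
In the trace above, the actual and expected lists differ only in the first element: the actual line carries a "Reconfiguring status for node [...]: " prefix. The "to contain subsequence" wording looks like an AssertJ containsSubsequence failure, and a subsequence check that compares elements by equality cannot match the prefixed line. The plain-Java sketch below illustrates the difference. SubsequenceCheck is a hypothetical helper, the strings are copied from the log (their whitespace may differ from the real output), and this is not the fix applied in the patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

final class SubsequenceCheck {
  static final List<String> ACTUAL = Arrays.asList(
      "Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed property "
          + "dfs.datanode.peer.stats.enabled",
      " From: \"false\"",
      " To: \"true\"",
      "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 13:02:51 UTC 2024.");

  static final List<String> EXPECTED = Arrays.asList(
      "SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
      " From: \"false\"",
      " To: \"true\"");

  /** Returns true if each expected element matches, in order, some element of actual. */
  static boolean containsSubsequence(List<String> actual, List<String> expected,
                                     BiPredicate<String, String> matches) {
    int next = 0;
    for (String line : actual) {
      if (next < expected.size() && matches.test(line, expected.get(next))) {
        next++;
      }
    }
    return next == expected.size();
  }

  public static void main(String[] args) {
    // Equality-based matching fails because of the node prefix on the first line.
    System.out.println(containsSubsequence(ACTUAL, EXPECTED, String::equals));
    // Suffix-based matching tolerates the prefix.
    System.out.println(containsSubsequence(ACTUAL, EXPECTED, String::endsWith));
  }
}
```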






[jira] [Updated] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-14 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17520:
--
Affects Version/s: 3.4.0

> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.4.0
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>






[jira] [Updated] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-14 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17520:
--
Component/s: hdfs

> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.4.0
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>






[jira] [Updated] (HDFS-17522) JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection

2024-05-12 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17522:
--
Affects Version/s: 3.5.0

> JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection
> --
>
> Key: HDFS-17522
> URL: https://issues.apache.org/jira/browse/HDFS-17522
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: journal-node
> Affects Versions: 3.0.0-alpha1, 3.5.0
> Reporter: wangzhihui
> Priority: Major
> Labels: pull-request-available
>
> [HDFS-10579|https://issues.apache.org/jira/browse/HDFS-10579] added 
> X-FRAME-OPTIONS protection for the NameNode and DataNode, but the JournalNode 
> web interfaces still lack it.
>  






[jira] [Commented] (HDFS-17366) NameNode Fine-Grained Locking via Namespace Tree

2024-05-07 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844480#comment-17844480
 ] 

Shilun Fan commented on HDFS-17366:
---

[~xuzq_zander] [~ferhui] I believe we should organize an online meeting to 
discuss this, so that everyone has a clearer understanding of the goals and 
development plans. Thank you once again!

> NameNode Fine-Grained Locking via Namespace Tree
> 
>
> Key: HDFS-17366
> URL: https://issues.apache.org/jira/browse/HDFS-17366
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Attachments: NameNode Fine-Grained Locking Based On Directory Tree.pdf
>
>
> As we all know, the write performance of the NameNode is limited by the global 
> lock. We aim to enable fine-grained locking based on the namespace tree to 
> improve the performance of NameNode write operations.
> There are multiple motivations for creating this ticket:
>  * We have implemented this fine-grained locking and gained a nearly 7x 
> performance improvement in our prod environment.
>  * Other companies have made similar improvements on their internal branches. 
> Internal branches differ considerably from the community code, so there has 
> been little feedback or discussion in the community.
>  * The topic of fine-grained locking has been discussed for a very long time, 
> but without any results so far.
>  
> We implemented this fine-grained locking based on the namespace tree to 
> maximize concurrency for disjoint or independent operations.
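
To illustrate how tree-based locking can let disjoint operations proceed concurrently, here is a minimal sketch of generic hierarchical path locking: read locks on every ancestor directory and a write lock on the target path. It is not the design from the attached PDF or from any Hadoop branch; PathLockTable and its methods are hypothetical, and it uses non-blocking tryLock only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class PathLockTable {
  private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String path) {
    return locks.computeIfAbsent(path, p -> new ReentrantReadWriteLock());
  }

  /** Lists the ancestors of an absolute path, from "/" down to its parent. */
  static List<String> ancestors(String path) {
    List<String> out = new ArrayList<>();
    if (path.equals("/")) {
      return out; // the root has no ancestors
    }
    out.add("/");
    int idx = path.indexOf('/', 1);
    while (idx > 0) {
      out.add(path.substring(0, idx));
      idx = path.indexOf('/', idx + 1);
    }
    return out;
  }

  /**
   * Tries to read-lock every ancestor and write-lock the path itself.
   * Returns the held locks in acquisition order, or null if any lock was busy.
   */
  List<Lock> tryWriteLock(String path) {
    List<Lock> held = new ArrayList<>();
    for (String ancestor : ancestors(path)) {
      Lock read = lockFor(ancestor).readLock();
      if (!read.tryLock()) {
        release(held);
        return null;
      }
      held.add(read);
    }
    Lock write = lockFor(path).writeLock();
    if (!write.tryLock()) {
      release(held);
      return null;
    }
    held.add(write);
    return held;
  }

  /** Releases locks in reverse acquisition order; call from the acquiring thread. */
  static void release(List<Lock> held) {
    for (int i = held.size() - 1; i >= 0; i--) {
      held.get(i).unlock();
    }
  }
}
```

With this scheme, a writer on /a/b and a writer on /a/c share only read locks on / and /a, so they do not block each other, while a writer on /a conflicts with both.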






[jira] [Resolved] (HDFS-17367) Add PercentUsed for Different StorageTypes in JMX

2024-04-27 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HDFS-17367.
---
Fix Version/s: 3.4.1, 3.5.0
Hadoop Flags: Reviewed
Resolution: Fixed

> Add PercentUsed for Different StorageTypes in JMX
> -
>
> Key: HDFS-17367
> URL: https://issues.apache.org/jira/browse/HDFS-17367
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: metrics, namenode
> Affects Versions: 3.5.0
> Reporter: Hualong Zhang
> Assignee: Hualong Zhang
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> Currently, the NameNode only displays PercentUsed for the entire cluster. We 
> plan to add corresponding PercentUsed metrics for different StorageTypes.
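
As a rough illustration of a per-StorageType PercentUsed metric, the sketch below computes used/capacity as a percentage for each type. The enum Kind is a hypothetical stand-in for Hadoop's StorageType, the class and method names are invented for this example, and returning 0 for a non-positive capacity is a choice made here, not necessarily what the NameNode does.

```java
import java.util.EnumMap;
import java.util.Map;

final class StorageTypeUsage {
  /** Hypothetical stand-in for Hadoop's StorageType enum. */
  enum Kind { DISK, SSD, ARCHIVE }

  /** Returns used/capacity as a percentage, or 0 when capacity is not positive. */
  static float percentUsed(long usedBytes, long capacityBytes) {
    return capacityBytes <= 0 ? 0.0f : (usedBytes * 100.0f) / capacityBytes;
  }

  /** Maps each storage type to its percentage; each value is {usedBytes, capacityBytes}. */
  static Map<Kind, Float> percentUsedByType(Map<Kind, long[]> usedAndCapacity) {
    Map<Kind, Float> out = new EnumMap<>(Kind.class);
    for (Map.Entry<Kind, long[]> e : usedAndCapacity.entrySet()) {
      out.put(e.getKey(), percentUsed(e.getValue()[0], e.getValue()[1]));
    }
    return out;
  }
}
```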






[jira] [Updated] (HDFS-17429) Datatransfer sender.java LOG variable uses interface's, causing log fileName mistake

2024-03-26 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17429:
--
Affects Version/s: 3.4.0

> Datatransfer sender.java LOG variable uses interface's, causing log fileName 
> mistake
> 
>
> Key: HDFS-17429
> URL: https://issues.apache.org/jira/browse/HDFS-17429
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.4.0
> Reporter: Zhongkun Wu
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> 2024-03-18 16:34:40,274 TRACE datatransfer.DataTransferProtocol: :80 Sending 
> DataTransferOp OpReadBlockProto: header {
>  
> The log message above suggests it was emitted by DataTransferProtocol, but it 
> was not. It comes from Sender.java, which is in the same package and 
> implements DataTransferProtocol.
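
The mechanism behind this mislabelling can be shown in a few lines: a field declared in a Java interface is implicitly public static final, so an implementing class that uses LOG without declaring its own ends up with the interface's logger. The sketch below uses java.util.logging instead of the logging library Hadoop uses, purely to stay self-contained; DemoProtocol, InheritingSender and FixedSender are hypothetical names, and the actual patch may differ.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for an interface that declares a shared logger. */
interface DemoProtocol {
  // Interface fields are implicitly public static final.
  Logger LOG = Logger.getLogger(DemoProtocol.class.getName());
}

/** Uses the inherited LOG, so its messages carry the interface's logger name. */
final class InheritingSender implements DemoProtocol {
  String loggerName() {
    return LOG.getName();
  }
}

/** Declares its own LOG, which hides the field inherited from the interface. */
final class FixedSender implements DemoProtocol {
  private static final Logger LOG = Logger.getLogger(FixedSender.class.getName());

  String loggerName() {
    return LOG.getName();
  }
}
```

Declaring a class-level logger in the implementing class is one way to make the log line name the class that actually emitted it.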






[jira] [Updated] (HDFS-17429) Datatransfer sender.java LOG variable uses interface's, causing log fileName mistake

2024-03-26 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17429:
--
Target Version/s: 3.4.1, 3.5.0

> Datatransfer sender.java LOG variable uses interface's, causing log fileName 
> mistake
> 
>
> Key: HDFS-17429
> URL: https://issues.apache.org/jira/browse/HDFS-17429
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.4.0
> Reporter: Zhongkun Wu
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> 2024-03-18 16:34:40,274 TRACE datatransfer.DataTransferProtocol: :80 Sending 
> DataTransferOp OpReadBlockProto: header {
>  
> The log message above suggests it was emitted by DataTransferProtocol, but it 
> was not. It comes from Sender.java, which is in the same package and 
> implements DataTransferProtocol.






[jira] [Resolved] (HDFS-17429) Datatransfer sender.java LOG variable uses interface's, causing log fileName mistake

2024-03-26 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HDFS-17429.
---
Fix Version/s: 3.4.1, 3.5.0
Hadoop Flags: Reviewed
Resolution: Fixed

> Datatransfer sender.java LOG variable uses interface's, causing log fileName 
> mistake
> 
>
> Key: HDFS-17429
> URL: https://issues.apache.org/jira/browse/HDFS-17429
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Zhongkun Wu
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> 2024-03-18 16:34:40,274 TRACE datatransfer.DataTransferProtocol: :80 Sending 
> DataTransferOp OpReadBlockProto: header {
>  
> The log message above suggests it was emitted by DataTransferProtocol, but it 
> was not. It comes from Sender.java, which is in the same package and 
> implements DataTransferProtocol.






[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.

2024-03-06 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824205#comment-17824205
 ] 

Shilun Fan commented on HDFS-17299:
---

Don't worry, hadoop-3.4.0-RC3 is a release packaged from the tag 
[release-3.4.0-RC3|https://github.com/apache/hadoop/releases/tag/release-3.4.0-RC3], 
and this commit has no impact on hadoop-3.4.0-RC3. We usually backport to 
branch-3.4.

> HDFS is not rack failure tolerant while creating a new file.
> 
>
> Key: HDFS-17299
> URL: https://issues.apache.org/jira/browse/HDFS-17299
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Rushabh Shah
> Assignee: Ritesh
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
> Attachments: repro.patch
>
>
> Recently we saw an HBase cluster outage when we mistakenly brought down 1 AZ.
> Our configuration:
> 1. We use 3 Availability Zones (AZs) for fault tolerance.
> 2. We use BlockPlacementPolicyRackFaultTolerant as the block placement policy.
> 3. We use the following configuration parameters: 
> dfs.namenode.heartbeat.recheck-interval: 600000 (ms)
> dfs.heartbeat.interval: 3 (s)
> So it will take 1230000 ms (20.5 mins) to detect that a datanode is dead.
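
The 20.5-minute figure is consistent with the NameNode's heartbeat expiry formula, 2 * recheck interval + 10 * heartbeat interval, assuming a recheck interval of 600000 ms and a heartbeat interval of 3 s. The arithmetic is sketched below; HeartbeatExpiry and its method are hypothetical names used only for this illustration.

```java
final class HeartbeatExpiry {
  /** Dead-datanode detection window: 2 * recheck interval + 10 * heartbeat interval. */
  static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
    return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
  }

  public static void main(String[] args) {
    long ms = expireIntervalMs(600_000L, 3L);
    // 2 * 600000 + 10 * 3000 = 1230000 ms, i.e. 20.5 minutes
    System.out.println(ms + " ms = " + (ms / 60_000.0) + " min");
  }
}
```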
>  
> Steps to reproduce:
>  # Bring down 1 AZ.
>  # HBase (HDFS client) tries to create a file (WAL file) and then calls 
> hflush on the newly created file.
>  # DataStreamer is not able to find block locations that satisfy the rack 
> placement policy (one copy in each rack, which essentially means one copy in 
> each AZ).
>  # Since all the datanodes in that AZ are down but the namenode still 
> considers them alive, the client gets different datanodes, but all of them 
> are still in the same AZ. See logs below.
>  # HBase is not able to create a WAL file and it aborts the region server.
>  
> Relevant logs from hdfs client and namenode
>  
> {noformat}
> 2023-12-16 17:17:43,818 INFO  [on default port 9000] FSNamesystem.audit - 
> allowed=true  ugi=hbase/ (auth:KERBEROS) ip=  
> cmd=create  src=/hbase/WALs/  dst=null
> 2023-12-16 17:17:43,978 INFO  [on default port 9000] hdfs.StateChange - 
> BLOCK* allocate blk_1214652565_140946716, replicas=:50010, 
> :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,061 INFO  [Thread-39087] hdfs.DataStreamer - Exception in 
> createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with 
> firstBadLink as :50010
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
> at 
> org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
> at 
> org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,061 WARN  [Thread-39087] hdfs.DataStreamer - Abandoning 
> BP-179318874--1594838129323:blk_1214652565_140946716
> 2023-12-16 17:17:44,179 WARN  [Thread-39087] hdfs.DataStreamer - Excluding 
> datanode 
> DatanodeInfoWithStorage[:50010,DS-a493abdb-3ac3-49b1-9bfb-848baf5c1c2c,DISK]
> 2023-12-16 17:17:44,339 INFO  [on default port 9000] hdfs.StateChange - 
> BLOCK* allocate blk_1214652580_140946764, replicas=:50010, 
> :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,369 INFO  [Thread-39087] hdfs.DataStreamer - Exception in 
> createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with 
> firstBadLink as :50010
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
> at 
> org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
> at 
> org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,369 WARN  [Thread-39087] hdfs.DataStreamer - Abandoning 
> BP-179318874-NN-IP-1594838129323:blk_1214652580_140946764
> 2023-12-16 17:17:44,454 WARN  [Thread-39087] hdfs.DataStreamer - Excluding 
> datanode 
> DatanodeInfoWithStorage[AZ-2-dn-2:50010,DS-46bb45cc-af89-46f3-9f9d-24e4fdc35b6d,DISK]
> 2023-12-16 17:17:44,522 INFO  [on default port 9000] hdfs.StateChange - 
> BLOCK* allocate blk_1214652594_140946796, replicas=:50010, 
> :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,712 INFO  [Thread-39087] hdfs.DataStreamer - Exception in 
> createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with 
> firstBadLink as :50010
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus

[jira] [Updated] (HDFS-15217) Add more information to longest write/read lock held log

2024-03-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15217:
--
Affects Version/s: 3.4.0

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.4.0
> Reporter: Toshihiro Suzuki
> Assignee: Toshihiro Suzuki
> Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log would be useful for 
> troubleshooting.
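
One way to picture the request is a lock wrapper whose unlock call takes the operation name and target path, and remembers them for the longest hold seen so far. The sketch below is only an illustration of that idea. ReportingWriteLock, its methods and the report format are hypothetical, and it is not the change made to FSNamesystemLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReportingWriteLock {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // The two fields below are only touched while the write lock is held.
  private long acquiredAtNanos;
  private long longestHeldMs = -1;
  private volatile String longestHeldReport = "";

  void writeLock() {
    lock.writeLock().lock();
    if (lock.getWriteHoldCount() == 1) {
      acquiredAtNanos = System.nanoTime(); // time only the outermost acquisition
    }
  }

  /** Releases the lock, recording opName and path if this was the longest hold so far. */
  void writeUnlock(String opName, String path) {
    if (lock.getWriteHoldCount() == 1) {
      long heldMs = (System.nanoTime() - acquiredAtNanos) / 1_000_000L;
      if (heldMs > longestHeldMs) {
        longestHeldMs = heldMs;
        longestHeldReport = "Longest write-lock held for " + heldMs
            + "ms (op=" + opName + ", path=" + path + ")";
      }
    }
    lock.writeLock().unlock();
  }

  String longestHeldReport() {
    return longestHeldReport;
  }
}
```

A caller would wrap an operation as writeLock(), do the work, then writeUnlock("delete", "/some/path"), so the report can point at the operation and path rather than only a stack trace.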






[jira] [Updated] (HDFS-15217) Add more information to longest write/read lock held log

2024-03-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15217:
--
Target Version/s: 3.4.0

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namanode
> Affects Versions: 3.4.0
> Reporter: Toshihiro Suzuki
> Assignee: Toshihiro Suzuki
> Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log would make 
> troubleshooting easier.
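
As a rough illustration of the request above, the sketch below formats a lock-held report line that carries the operation name and target path. The class name, method name, and message layout are hypothetical and are not taken from FSNamesystemLock; the values mirror the delete example in the quoted log.

```java
import java.util.Locale;

/** Hypothetical sketch; not the FSNamesystemLock implementation. */
public class LockReportSketch {

  /**
   * Builds a report line that includes the operation name and target path.
   * A missing path is rendered as "n/a" so the line stays well-formed.
   */
  static String formatReport(String opName, String path, long heldMs) {
    String target = (path == null) ? "n/a" : path;
    return String.format(Locale.ROOT,
        "Longest write-lock held for %dms by %s (path=%s)",
        heldMs, opName, target);
  }

  public static void main(String[] args) {
    // Example values only: a 6ms delete on an assumed path.
    System.out.println(formatReport("delete", "/tmp/example", 6));
  }
}
```

With the operation and path on the same line as the hold time, an operator would not need to infer the target from the stack trace alone.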






[jira] [Updated] (HDFS-15217) Add more information to longest write/read lock held log

2024-03-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15217:
--
Component/s: namanode

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namanode
>Affects Versions: 3.4.0
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log would make 
> troubleshooting easier.






[jira] [Commented] (HDFS-17403) jenkins build failing for HDFS 3.5.0-SNAPSHOT

2024-02-29 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822299#comment-17822299
 ] 

Shilun Fan commented on HDFS-17403:
---

[~szetszwo] This is due to a failed compilation of the Maven plugin.

 

We can check the following link for more details: 
[https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6549/2/artifact/out/branch-mvninstall-root.txt].

 
{code:java}
[INFO] Apache Hadoop Maven Plugins  FAILURE [ 40.643 s]
.
[ERROR] unable to create new native thread -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/OutOfMemoryError {code}
 

If we trigger a recompilation, the issue should be resolved. This may depend on 
the state of the container at runtime, but in most cases, we can successfully 
compile.

 

# PR-6595 compile report

[https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6595/1/artifact/out/branch-mvninstall-root.txt]

# PR-6594 compile report

https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6594/2/artifact/out/branch-mvninstall-root.txt

> jenkins build failing for HDFS 3.5.0-SNAPSHOT
> -
>
> Key: HDFS-17403
> URL: https://issues.apache.org/jira/browse/HDFS-17403
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Reporter: Tsz-wo Sze
>Priority: Major
>
> See 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6549/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt
> {code}
> [INFO] ---< org.apache.hadoop:hadoop-hdfs >---
> [INFO] Building Apache Hadoop HDFS 3.5.0-SNAPSHOT
> [INFO] ---[ jar ]---
> [WARNING] The POM for org.apache.hadoop:hadoop-maven-plugins:jar:3.5.0-SNAPSHOT is missing, no dependency information available
> [INFO]
> [INFO] BUILD FAILURE
> {code}






[jira] [Updated] (HDFS-17146) Use the dfsadmin -reconfig command to initiate reconfiguration on all decommissioning datanodes.

2024-02-17 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17146:
--
 Component/s: dfsadmin
   Fix Version/s: 3.4.1
Target Version/s: 3.5.0

> Use the dfsadmin -reconfig command to initiate reconfiguration on all 
> decommissioning datanodes.
> 
>
> Key: HDFS-17146
> URL: https://issues.apache.org/jira/browse/HDFS-17146
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsadmin, hdfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1, 3.5.0
>
>
> It would be useful if the *DFSAdmin* -reconfig command could start 
> reconfiguration on all decommissioning datanodes in one bulk operation.






[jira] [Resolved] (HDFS-17146) Use the dfsadmin -reconfig command to initiate reconfiguration on all decommissioning datanodes.

2024-02-17 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HDFS-17146.
---
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Use the dfsadmin -reconfig command to initiate reconfiguration on all 
> decommissioning datanodes.
> 
>
> Key: HDFS-17146
> URL: https://issues.apache.org/jira/browse/HDFS-17146
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> It would be useful if the *DFSAdmin* -reconfig command could start 
> reconfiguration on all decommissioning datanodes in one bulk operation.






[jira] [Updated] (HDFS-16616) Remove the use if Sets#newHashSet and Sets#newTreeSet

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16616:
--
  Component/s: hdfs-common
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove the use if Sets#newHashSet and Sets#newTreeSet 
> --
>
> Key: HDFS-16616
> URL: https://issues.apache.org/jira/browse/HDFS-16616
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-common
>Affects Versions: 3.4.0
>Reporter: Samrat Deb
>Assignee: Samrat Deb
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> As part of removing the Guava dependencies, HADOOP-17115, HADOOP-17721, 
> HADOOP-17722 and HADOOP-17720 have been fixed.
> Currently the code calls util functions to create HashSet and TreeSet 
> instances throughout the repo. These calls add little, because internally 
> they just invoke new HashSet<> / new TreeSet<> from java.util.
> This task is to clean up all of these redundant set-creation calls.
> Before the move to Java 8, sets were created using Guava functions and APIs. 
> Now that the code has moved away from Guava, the util code in Hadoop looks like:
> 1. 
> public static <E extends Comparable> TreeSet<E> newTreeSet() { return new TreeSet<E>(); }
> 2. 
> public static <E> HashSet<E> newHashSet() { return new HashSet<E>(); }
> These methods do nothing beyond adding an extra layer of function call. 
> Please refer to the task 
> https://issues.apache.org/jira/browse/HADOOP-17726
> Can anyone review whether this ticket adds value to the code? 
> Looking forward to input and thoughts. If it does not add any value, we can 
> close it and not move forward with the changes.
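
The cleanup described above amounts to calling the JDK constructors directly instead of going through a wrapper. A minimal sketch, using only java.util; the class name, method names, and sample values are made up for illustration and do not come from the Hadoop code base:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: sets created with the JDK constructors, no wrapper. */
public class DirectSetCreation {

  /** Replaces a wrapper-style newTreeSet() call with the diamond operator. */
  static Set<String> sortedNames() {
    Set<String> names = new TreeSet<>();
    names.add("nn2");
    names.add("nn1");
    return names; // TreeSet iterates in natural (sorted) order
  }

  /** Replaces a wrapper-style newHashSet() call with the diamond operator. */
  static Set<Integer> uniquePorts() {
    Set<Integer> ports = new HashSet<>();
    ports.add(8020);
    ports.add(8020); // duplicate is ignored by the set
    return ports;
  }

  public static void main(String[] args) {
    System.out.println(sortedNames());
    System.out.println(uniquePorts().size());
  }
}
```

Since Java 7 the diamond operator infers the type argument, so a factory method whose only purpose was type inference no longer saves any typing.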






[jira] [Updated] (HDFS-16522) Set Http and Ipc ports for Datanodes in MiniDFSCluster

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16522:
--
  Component/s: tets
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.5, 3.4.0
Affects Version/s: 3.3.5
   3.4.0

> Set Http and Ipc ports for Datanodes in MiniDFSCluster
> --
>
> Key: HDFS-16522
> URL: https://issues.apache.org/jira/browse/HDFS-16522
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: tets
>Affects Versions: 3.4.0, 3.3.5
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We should provide options to set Http and Ipc ports for Datanodes in 
> MiniDFSCluster.






[jira] [Updated] (HDFS-16502) Reconfigure Block Invalidate limit

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16502:
--
  Component/s: block placement
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.5, 3.4.0
Affects Version/s: 3.3.5
   3.4.0

> Reconfigure Block Invalidate limit
> --
>
> Key: HDFS-16502
> URL: https://issues.apache.org/jira/browse/HDFS-16502
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: block placement
>Affects Versions: 3.4.0, 3.3.5
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Depending on the cluster load, it can be helpful to tune the block 
> invalidate limit (dfs.block.invalidate.limit). As of today, the only way to 
> do this without restarting the Namenode is to reconfigure the heartbeat 
> interval, because the effective limit is computed as 
> {code:java}
> Math.max(heartbeatInt*20, blockInvalidateLimit){code}
> This logic is not straightforward, and operators are usually not aware of it 
> because it is not documented. Updating the heartbeat interval is also not 
> desirable in all cases.
> We should provide the ability to alter the block invalidation limit on a 
> live cluster without affecting the heartbeat interval, so that load can be 
> adjusted at the Datanode level.
> We should also take this opportunity to move the (heartbeatInterval * 20) 
> computation into a common method.
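
To make the coupling concrete, here is a small sketch of the expression quoted above, with the (heartbeatInterval * 20) term pulled into its own method as the issue proposes. The class and method names are hypothetical, and the sample values (a 3-second heartbeat and a configured limit of 1000) are assumptions chosen for illustration, not values read from a cluster.

```java
/** Illustrative sketch only; not the actual Namenode code. */
public class BlockInvalidateLimitSketch {

  /** Multiplier taken from the expression quoted in the issue. */
  static final int HEARTBEAT_MULTIPLIER = 20;

  /** Hypothetical common method for the (heartbeatInterval * 20) term. */
  static int limitFromHeartbeat(int heartbeatIntervalSeconds) {
    return heartbeatIntervalSeconds * HEARTBEAT_MULTIPLIER;
  }

  /** Effective limit: the larger of the heartbeat term and the configured value. */
  static int effectiveLimit(int heartbeatIntervalSeconds, int configuredLimit) {
    return Math.max(limitFromHeartbeat(heartbeatIntervalSeconds), configuredLimit);
  }

  public static void main(String[] args) {
    // Heartbeat term is 60, so the configured value wins.
    System.out.println(effectiveLimit(3, 1000));
    // Heartbeat term is 1200, which overrides the configured value.
    System.out.println(effectiveLimit(60, 1000));
  }
}
```

The second call shows why the issue asks for a separate knob: raising the heartbeat interval changes the limit as a side effect, whether or not that was the operator's intent.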






[jira] [Updated] (HDFS-16481) Provide support to set Http and Rpc ports in MiniJournalCluster

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16481:
--
  Component/s: test
 Target Version/s: 3.3.5, 3.4.0
Affects Version/s: 3.3.5
   3.4.0

> Provide support to set Http and Rpc ports in MiniJournalCluster
> ---
>
> Key: HDFS-16481
> URL: https://issues.apache.org/jira/browse/HDFS-16481
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: test
>Affects Versions: 3.4.0, 3.3.5
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> We should provide support for clients to set Http and Rpc ports of 
> JournalNodes in MiniJournalCluster.






[jira] [Updated] (HDFS-16054) Replace Guava Lists usage by Hadoop's own Lists in hadoop-hdfs-project

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16054:
--
  Component/s: hdfs-common
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Replace Guava Lists usage by Hadoop's own Lists in hadoop-hdfs-project
> --
>
> Key: HDFS-16054
> URL: https://issues.apache.org/jira/browse/HDFS-16054
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-common
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>







[jira] [Updated] (HDFS-16435) Remove no need TODO comment for ObserverReadProxyProvider

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16435:
--
  Component/s: namanode
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove no need TODO comment for ObserverReadProxyProvider
> -
>
> Key: HDFS-16435
> URL: https://issues.apache.org/jira/browse/HDFS-16435
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: namanode
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Based on the discussion in 
> [HDFS-13923|https://issues.apache.org/jira/browse/HDFS-13923], we don't think 
> we need to add a configuration to turn observer reads on/off.
> So I suggest removing the `TODO comment` that is no longer needed.






[jira] [Updated] (HDFS-16541) Fix a typo in NameNodeLayoutVersion.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16541:
--
  Component/s: namanode
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix a typo in NameNodeLayoutVersion.
> 
>
> Key: HDFS-16541
> URL: https://issues.apache.org/jira/browse/HDFS-16541
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: namanode
>Affects Versions: 3.4.0
>Reporter: ZhiWei Shi
>Assignee: ZhiWei Shi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Fix a typo in NameNodeLayoutVersion.






[jira] [Updated] (HDFS-16587) Allow configuring Handler number for the JournalNodeRpcServer

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16587:
--
  Component/s: journal-node
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Allow configuring Handler number for the JournalNodeRpcServer
> -
>
> Key: HDFS-16587
> URL: https://issues.apache.org/jira/browse/HDFS-16587
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: journal-node
>Affects Versions: 3.4.0
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> We can allow configuring the handler number for the JournalNodeRpcServer.






[jira] [Updated] (HDFS-16339) Show the threshold when mover threads quota is exceeded

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16339:
--
  Component/s: datanode
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.4, 3.3.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   3.4.0

> Show the threshold when mover threads quota is exceeded
> ---
>
> Key: HDFS-16339
> URL: https://issues.apache.org/jira/browse/HDFS-16339
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: datanode
>Affects Versions: 3.4.0, 3.3.2, 3.2.4
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
> Attachments: image-2021-11-20-17-23-04-924.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Show the threshold when mover threads quota is exceeded in 
> DataXceiver#replaceBlock and DataXceiver#copyBlock.
> !image-2021-11-20-17-23-04-924.png|width=1233,height=124!






[jira] [Updated] (HDFS-16335) Fix HDFSCommands.md

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16335:
--
  Component/s: documentation
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.2, 3.4.0
Affects Version/s: 3.3.2
   3.4.0

> Fix HDFSCommands.md
> ---
>
> Key: HDFS-16335
> URL: https://issues.apache.org/jira/browse/HDFS-16335
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: documentation
>Affects Versions: 3.4.0, 3.3.2
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Fix HDFSCommands.md.






[jira] [Updated] (HDFS-16326) Simplify the code for DiskBalancer

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16326:
--
  Component/s: diskbalancer
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.4, 3.3.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   3.4.0

> Simplify the code for DiskBalancer
> --
>
> Key: HDFS-16326
> URL: https://issues.apache.org/jira/browse/HDFS-16326
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: diskbalancer
>Affects Versions: 3.4.0, 3.3.2, 3.2.4
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Simplify the code for DiskBalancer.






[jira] [Updated] (HDFS-16319) Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16319:
--
  Component/s: metrics
 Target Version/s: 3.3.2, 3.4.0
Affects Version/s: 3.3.2
   3.4.0

> Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount
> 
>
> Key: HDFS-16319
> URL: https://issues.apache.org/jira/browse/HDFS-16319
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: metrics
>Affects Versions: 3.4.0, 3.3.2
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount. See 
> [HDFS-15808|https://issues.apache.org/jira/browse/HDFS-15808].






[jira] [Updated] (HDFS-16298) Improve error msg for BlockMissingException

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16298:
--
  Component/s: hdfs-client
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.4, 3.3.2, 2.10.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   2.10.2
   3.4.0

> Improve error msg for BlockMissingException
> ---
>
> Key: HDFS-16298
> URL: https://issues.apache.org/jira/browse/HDFS-16298
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs-client
>Affects Versions: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>
> Attachments: image-2021-11-04-15-28-05-886.png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When the client fails to obtain a block, a BlockMissingException is thrown. 
> To make such failures easier to analyze, we can add the relevant location 
> information to the error message.
> !image-2021-11-04-15-28-05-886.png|width=624,height=144!






[jira] [Updated] (HDFS-16312) Fix typo for DataNodeVolumeMetrics and ProfilingFileIoEvents

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16312:
--
  Component/s: datanode
   metrics
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.4, 3.3.2, 2.10.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   2.10.2
   3.4.0

> Fix typo for DataNodeVolumeMetrics and ProfilingFileIoEvents
> 
>
> Key: HDFS-16312
> URL: https://issues.apache.org/jira/browse/HDFS-16312
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: datanode, metrics
>Affects Versions: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fix typo for DataNodeVolumeMetrics and ProfilingFileIoEvents.






[jira] [Updated] (HDFS-16280) Fix typo for ShortCircuitReplica#isStale

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16280:
--
  Component/s: hdfs-client
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix typo for ShortCircuitReplica#isStale
> 
>
> Key: HDFS-16280
> URL: https://issues.apache.org/jira/browse/HDFS-16280
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs-client
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Fix typo for ShortCircuitReplica#isStale.






[jira] [Updated] (HDFS-16281) Fix flaky unit tests failed due to timeout

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16281:
--
  Component/s: test
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix flaky unit tests failed due to timeout
> --
>
> Key: HDFS-16281
> URL: https://issues.apache.org/jira/browse/HDFS-16281
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> I found that this unit test 
> *_TestViewFileSystemOverloadSchemeWithHdfsScheme_* failed several times due 
> to timeout. Can we change the timeout for some methods from _*3s*_ to *_30s_* 
> to be consistent with the other methods?
> {code:java}
> [ERROR] Tests run: 19, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 65.39 s <<< FAILURE! - in org.apache.hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS
> [ERROR] testNflyRepair(org.apache.hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS)  Time elapsed: 4.132 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 3000 milliseconds
>  at java.lang.Object.wait(Native Method)
>  at java.lang.Object.wait(Object.java:502)
>  at org.apache.hadoop.util.concurrent.AsyncGet$Util.wait(AsyncGet.java:59)
>  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1577)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1535)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1432)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
>  at com.sun.proxy.$Proxy26.setTimes(Unknown Source)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setTimes(ClientNamenodeProtocolTranslatorPB.java:1059)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  at com.sun.proxy.$Proxy27.setTimes(Unknown Source)
>  at org.apache.hadoop.hdfs.DFSClient.setTimes(DFSClient.java:2658)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$37.doCall(DistributedFileSystem.java:1978)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$37.doCall(DistributedFileSystem.java:1975)
>  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at org.apache.hadoop.hdfs.DistributedFileSystem.setTimes(DistributedFileSystem.java:1988)
>  at org.apache.hadoop.fs.FilterFileSystem.setTimes(FilterFileSystem.java:542)
>  at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.setTimes(ChRootedFileSystem.java:328)
>  at org.apache.hadoop.fs.viewfs.NflyFSystem$NflyOutputStream.commit(NflyFSystem.java:439)
>  at org.apache.hadoop.fs.viewfs.NflyFSystem$NflyOutputStream.close(NflyFSystem.java:395)
>  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
>  at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>  at org.apache.hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeWithHdfsScheme.writeString(TestViewFileSystemOverloadSchemeWithHdfsScheme.java:685)
>  at org.apache.hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeWithHdfsScheme.testNflyRepair(TestViewFileSystemOverloadSchemeWithHdfsScheme.java:622)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at org.

[jira] [Updated] (HDFS-16194) Simplify the code with DatanodeID#getXferAddrWithHostname

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16194:
--
  Component/s: datanode
   metrics
   namanode
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Simplify the code with DatanodeID#getXferAddrWithHostname   
> 
>
> Key: HDFS-16194
> URL: https://issues.apache.org/jira/browse/HDFS-16194
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: datanode, metrics, namanode
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Simplify the code with DatanodeID#getXferAddrWithHostname.






[jira] [Updated] (HDFS-16131) Show storage type for failed volumes on namenode web

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16131:
--
  Component/s: namanode
   ui
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Show storage type for failed volumes on namenode web
> 
>
> Key: HDFS-16131
> URL: https://issues.apache.org/jira/browse/HDFS-16131
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: namanode, ui
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: failed-volumes.jpg
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> To make it easy to query the storage type of failed volumes, we can display 
> it on the Namenode web UI.






[jira] [Updated] (HDFS-16110) Remove unused method reportChecksumFailure in DFSClient

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16110:
--
  Component/s: dfsclient
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove unused method reportChecksumFailure in DFSClient
> ---
>
> Key: HDFS-16110
> URL: https://issues.apache.org/jira/browse/HDFS-16110
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: dfsclient
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Remove the unused method reportChecksumFailure and fix some code style issues 
> in DFSClient along the way.






[jira] [Updated] (HDFS-16106) Fix flaky unit test TestDFSShell

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16106:
--
  Component/s: test
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix flaky unit test TestDFSShell
> 
>
> Key: HDFS-16106
> URL: https://issues.apache.org/jira/browse/HDFS-16106
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This unit test occasionally fails.
> The value set for dfs.namenode.accesstime.precision is too low, so the access 
> time can be updated several times while the test method runs, which eventually 
> makes the assertion fail.
> IMO, dfs.namenode.accesstime.precision should be greater than or equal to the 
> timeout (120s) of TestDFSShell#testCopyCommandsWithPreserveOption(), or 
> be set directly to 0 to disable this feature.
>  
> {code:java}
> [ERROR] Tests run: 52, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 106.778 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSShell
> [ERROR] 
> testCopyCommandsWithPreserveOption(org.apache.hadoop.hdfs.TestDFSShell)  Time 
> elapsed: 2.353 s  <<< FAILURE! java.lang.AssertionError: 
> expected:<1625095098319> but was:<1625095099374> at 
> org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.TestDFSShell.testCopyCommandsWithPreserveOption(TestDFSShell.java:2282)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> [ERROR] 
> testCopyCommandsWithPreserveOption(org.apache.hadoop.hdfs.TestDFSShell)  Time 
> elapsed: 2.467 s  <<< FAILURE! java.lang.AssertionError: 
> expected:<1625095192527> but was:<1625095193950> at 
> org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.TestDFSShell.testCopyCommandsWithPreserveOption(TestDFSShell.java:2323)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
> [ERROR] 
> testCopyCommandsWithPreserveOption(org.apache.hadoop.hdfs.TestDFSShell)  Time 
> elapsed: 2.173 s  <<< FAILURE! java.lang.AssertionError: 
> expected:<1625095196756> but was:<1625095197975> at 
> org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> org.junit.Assert.assertEquals(Assert.java:633) at 
> org.apache.hadoop.hdfs.TestDFSShell.tes

[jira] [Updated] (HDFS-16089) EC: Add metric EcReconstructionValidateTimeMillis for StripedBlockReconstructor

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16089:
--
  Component/s: erasure-coding
   metrics
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.2, 3.4.0
Affects Version/s: 3.3.2
   3.4.0

> EC: Add metric EcReconstructionValidateTimeMillis for 
> StripedBlockReconstructor
> ---
>
> Key: HDFS-16089
> URL: https://issues.apache.org/jira/browse/HDFS-16089
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: erasure-coding, metrics
>Affects Versions: 3.4.0, 3.3.2
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Add metric EcReconstructionValidateTimeMillis for StripedBlockReconstructor, 
> so that we can track the elapsed time of striped block reconstruction.






[jira] [Updated] (HDFS-16104) Remove unused parameter and fix java doc for DiskBalancerCLI

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16104:
--
  Component/s: diskbalancer
   documentation
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove unused parameter and fix java doc for DiskBalancerCLI
> 
>
> Key: HDFS-16104
> URL: https://issues.apache.org/jira/browse/HDFS-16104
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: diskbalancer, documentation
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove unused parameter and fix java doc for DiskBalancerCLI.






[jira] [Updated] (HDFS-16079) Improve the block state change log

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16079:
--
  Component/s: block placement
 Target Version/s: 3.3.2, 3.4.0
Affects Version/s: 3.3.2
   3.4.0

> Improve the block state change log
> --
>
> Key: HDFS-16079
> URL: https://issues.apache.org/jira/browse/HDFS-16079
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: block placement
>Affects Versions: 3.4.0, 3.3.2
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Improve the block state change log. Add readOnlyReplicas and 
> replicasOnStaleNodes. 






[jira] [Updated] (HDFS-16078) Remove unused parameters for DatanodeManager.handleLifeline()

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16078:
--
  Component/s: namanode
 Target Version/s: 3.3.2, 3.2.3, 3.4.0
Affects Version/s: 3.3.2
   3.2.3
   3.4.0

> Remove unused parameters for DatanodeManager.handleLifeline()
> -
>
> Key: HDFS-16078
> URL: https://issues.apache.org/jira/browse/HDFS-16078
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: namanode
>Affects Versions: 3.4.0, 3.2.3, 3.3.2
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove unused parameters (blockPoolId, maxTransfers) for 
> DatanodeManager.handleLifeline().






[jira] [Updated] (HDFS-15991) Add location into datanode info for NameNodeMXBean

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15991:
--
  Component/s: metrics
   namanode
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> Add location into datanode info for NameNodeMXBean
> --
>
> Key: HDFS-15991
> URL: https://issues.apache.org/jira/browse/HDFS-15991
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: metrics, namanode
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Add location into datanode info for NameNodeMXBean.






[jira] [Assigned] (HDFS-16535) SlotReleaser should reuse the domain socket based on socket paths

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan reassigned HDFS-16535:
-

Assignee: Quanlong Huang

> SlotReleaser should reuse the domain socket based on socket paths
> -
>
> Key: HDFS-16535
> URL: https://issues.apache.org/jira/browse/HDFS-16535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HDFS-13639 improves the performance of short-circuit shm slot releasing by 
> reusing the domain socket that the client previously used to send the release 
> request to the DataNode.
> This works well when only one DataNode is co-located with the client (which is 
> true in most production environments). However, if we launch multiple 
> DataNodes on a machine (usually for testing, e.g. Impala's end-to-end tests), 
> the request could be sent to the wrong DataNode. See an example in 
> IMPALA-11234.
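
The idea in the issue title, reusing a domain socket per socket path rather than one socket for all DataNodes, could be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeSocket` and `SocketByPathCache` are hypothetical names invented for the example and are not Hadoop classes, and the real SlotReleaser logic is not shown in this message.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a domain socket; not the Hadoop DomainSocket class. */
final class FakeSocket {
  final String path;

  FakeSocket(String path) {
    this.path = path;
  }
}

/** Hypothetical cache that keeps one reusable socket per domain socket path. */
final class SocketByPathCache {
  private final Map<String, FakeSocket> sockets = new HashMap<>();
  private int opened = 0;

  /** Returns the cached socket for this path, opening a new one only if none exists yet. */
  synchronized FakeSocket get(String socketPath) {
    return sockets.computeIfAbsent(socketPath, p -> {
      opened++;
      return new FakeSocket(p);
    });
  }

  /** Number of sockets opened so far (one per distinct path). */
  synchronized int openedCount() {
    return opened;
  }
}
```

With such a cache, two DataNodes on the same machine that listen on different socket paths never share a socket, so a release request for one cannot be sent to the other.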






[jira] [Updated] (HDFS-15951) Remove unused parameters in NameNodeProxiesClient

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15951:
--
  Component/s: hdfs-client
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> Remove unused parameters in NameNodeProxiesClient
> -
>
> Key: HDFS-15951
> URL: https://issues.apache.org/jira/browse/HDFS-15951
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs-client
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Remove unused parameters in org.apache.hadoop.hdfs.NameNodeProxiesClient.






[jira] [Updated] (HDFS-15975) Use LongAdder instead of AtomicLong

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15975:
--
  Component/s: metrics
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> Use LongAdder instead of AtomicLong
> ---
>
> Key: HDFS-15975
> URL: https://issues.apache.org/jira/browse/HDFS-15975
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: metrics
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> When counting some indicators, we can use LongAdder instead of AtomicLong to 
> improve performance. The long value is not an atomic snapshot in LongAdder, 
> but I think we can tolerate that.
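
As a rough illustration of the trade-off described above, the following sketch counts from several threads with both classes. The class and method names (`CounterSketch`, `countWithLongAdder`, `countWithAtomicLong`) are made up for this example and do not come from the patch. Note that `LongAdder.sum()` is exact once all writer threads have finished; it is only during concurrent updates that it is not an atomic snapshot.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

final class CounterSketch {
  /** Increments a LongAdder from several threads and returns the final sum. */
  static long countWithLongAdder(int threads, int perThread) {
    LongAdder adder = new LongAdder();
    runAll(threads, () -> {
      for (int i = 0; i < perThread; i++) {
        adder.increment(); // spreads contention across internal cells
      }
    });
    return adder.sum(); // exact here because all writers have been joined
  }

  /** Same workload with AtomicLong, where every update contends on one value. */
  static long countWithAtomicLong(int threads, int perThread) {
    AtomicLong counter = new AtomicLong();
    runAll(threads, () -> {
      for (int i = 0; i < perThread; i++) {
        counter.incrementAndGet();
      }
    });
    return counter.get();
  }

  /** Starts the given number of threads running the task and waits for all of them. */
  private static void runAll(int threads, Runnable task) {
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(task);
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
  }
}
```

Both methods return the same total; the difference is in how much the threads contend while updating, which is why LongAdder tends to suit frequently updated, rarely read metrics.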






[jira] [Updated] (HDFS-15938) Fix java doc in FSEditLog

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15938:
--
  Component/s: documentation
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix java doc in FSEditLog
> -
>
> Key: HDFS-15938
> URL: https://issues.apache.org/jira/browse/HDFS-15938
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Fix java doc in 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog#logAddCacheDirectiveInfo.






[jira] [Updated] (HDFS-15906) Close FSImage and FSNamesystem after formatting is complete

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15906:
--
  Component/s: namanode
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.3, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.4.0

> Close FSImage and FSNamesystem after formatting is complete
> ---
>
> Key: HDFS-15906
> URL: https://issues.apache.org/jira/browse/HDFS-15906
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: namanode
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Close FSImage and FSNamesystem after formatting is complete. 
> org.apache.hadoop.hdfs.server.namenode#format.






[jira] [Updated] (HDFS-15892) Add metric for editPendingQ in FSEditLogAsync

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15892:
--
  Component/s: metrics
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.3, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.4.0

> Add metric for editPendingQ in FSEditLogAsync
> -
>
> Key: HDFS-15892
> URL: https://issues.apache.org/jira/browse/HDFS-15892
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: metrics
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> To monitor editPendingQ in FSEditLogAsync, we add a metric 
> and log a message when the queue is full.






[jira] [Updated] (HDFS-15870) Remove unused configuration dfs.namenode.stripe.min

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15870:
--
  Component/s: configuration
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.3, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.4.0

> Remove unused configuration dfs.namenode.stripe.min
> ---
>
> Key: HDFS-15870
> URL: https://issues.apache.org/jira/browse/HDFS-15870
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: configuration
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Remove unused configuration dfs.namenode.stripe.min.






[jira] [Updated] (HDFS-15854) Make some parameters configurable for SlowDiskTracker and SlowPeerTracker

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15854:
--
  Component/s: block placement
 Target Version/s: 3.3.5, 3.4.0
Affects Version/s: 3.3.5
   3.4.0

> Make some parameters configurable for SlowDiskTracker and SlowPeerTracker
> -
>
> Key: HDFS-15854
> URL: https://issues.apache.org/jira/browse/HDFS-15854
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: block placement
>Affects Versions: 3.4.0, 3.3.5
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Make some parameters configurable for SlowDiskTracker and SlowPeerTracker. 
> Related to https://issues.apache.org/jira/browse/HDFS-15814.






[jira] [Updated] (HDFS-13274) RBF: Extend RouterRpcClient to use multiple sockets

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-13274:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Extend RouterRpcClient to use multiple sockets
> ---
>
> Key: HDFS-13274
> URL: https://issues.apache.org/jira/browse/HDFS-13274
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> HADOOP-13144 introduces the ability to create multiple connections for the 
> same user and use different sockets. The RouterRpcClient should use this 
> approach to get better throughput.






[jira] [Updated] (HDFS-16598) Fix DataNode FsDatasetImpl lock issue without GS checks.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16598:
--
  Component/s: datanode
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix DataNode FsDatasetImpl lock issue without GS checks.
> 
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a 
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> Tracing the root cause shows that this bug was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534]: the 
> block GS on the client may be smaller than the one on the DN when pipeline recovery fails.






[jira] [Updated] (HDFS-16600) Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16600:
--
  Component/s: datanode
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.
> -
>
> Key: HDFS-16600
> URL: https://issues.apache.org/jira/browse/HDFS-16600
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The UT 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction 
> failed because of a deadlock, which was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534]. 
> Deadlock:
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.createRbw (line 1588) needs a read lock
> try (AutoCloseableLock lock = lockManager.readLock(LockLevel.BLOCK_POOl,
>     b.getBlockPoolId()))
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.evictBlocks (line 3526) needs a write lock
> try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid))
> {code}
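
Why a read lock followed by a write lock on the same block pool can hang may be easier to see with plain JDK locks. The sketch below assumes that both acquisitions happen on the same thread and that the lock behaves like `java.util.concurrent.locks.ReentrantReadWriteLock`, which does not support upgrading a read lock to a write lock; neither assumption is confirmed by this message. `LockUpgradeSketch` is a hypothetical name, and `tryLock()` is used instead of `lock()` so the example returns rather than blocking forever.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LockUpgradeSketch {
  /** Returns whether the write lock could be taken while this thread still holds the read lock. */
  static boolean canUpgradeReadToWrite() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.readLock().lock();
    try {
      // With lock() instead of tryLock() this call would block forever.
      boolean acquired = lock.writeLock().tryLock();
      if (acquired) {
        lock.writeLock().unlock();
      }
      return acquired;
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Returns whether the write lock can be taken once the read lock has been released. */
  static boolean canTakeWriteAfterReleasingRead() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.readLock().lock();
    lock.readLock().unlock();
    boolean acquired = lock.writeLock().tryLock();
    if (acquired) {
      lock.writeLock().unlock();
    }
    return acquired;
  }
}
```

Under these assumptions the first method returns false and the second returns true, which is the general shape of the problem: the write lock is only obtainable once the read lock is no longer held.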






[jira] [Updated] (HDFS-16526) Add metrics for slow DataNode

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16526:
--
  Component/s: datanode
   metrics
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Add metrics for slow DataNode
> -
>
> Key: HDFS-16526
> URL: https://issues.apache.org/jira/browse/HDFS-16526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, metrics
>Affects Versions: 3.4.0
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Metrics-html.png
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Add some more metrics for slow datanode operations - FlushOrSync, 
> PacketResponder send ACK.






[jira] [Updated] (HDFS-16488) [SPS]: Expose metrics to JMX for external SPS

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16488:
--
  Component/s: metrics
   sps
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> [SPS]: Expose metrics to JMX for external SPS
> -
>
> Key: HDFS-16488
> URL: https://issues.apache.org/jira/browse/HDFS-16488
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: metrics, sps
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2022-02-26-22-15-25-543.png
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, external SPS has no monitoring metrics. We do not know how many 
> blocks are waiting to be processed, how many blocks are waiting to be 
> retried, and how many blocks have been migrated.
> We can expose these metrics in JMX for easy collection and display by 
> monitoring systems.
> !image-2022-02-26-22-15-25-543.png|width=631,height=170!
> For example, in our cluster, we exposed these metrics to JMX, collected them 
> with JMX-Exporter and Prometheus, and finally displayed them in Grafana.






[jira] [Updated] (HDFS-16460) [SPS]: Handle failure retries for moving tasks

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16460:
--
  Component/s: sps
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> [SPS]: Handle failure retries for moving tasks
> --
>
> Key: HDFS-16460
> URL: https://issues.apache.org/jira/browse/HDFS-16460
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: sps
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Handle failure retries for moving tasks. 






[jira] [Updated] (HDFS-16484) [SPS]: Fix an infinite loop bug in SPSPathIdProcessor thread

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16484:
--
  Component/s: sps
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.5, 3.2.4, 3.4.0
Affects Version/s: 3.3.5
   3.2.4
   3.4.0

> [SPS]: Fix an infinite loop bug in SPSPathIdProcessor thread 
> -
>
> Key: HDFS-16484
> URL: https://issues.apache.org/jira/browse/HDFS-16484
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: sps
>Affects Versions: 3.4.0, 3.2.4, 3.3.5
>Reporter: qinyuren
>Assignee: qinyuren
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.5
>
> Attachments: image-2022-02-25-14-35-42-255.png
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> We ran SPS in our cluster and found this log. The 
> SPSPathIdProcessor thread enters an infinite loop and prints the same log all 
> the time.
> !image-2022-02-25-14-35-42-255.png|width=682,height=195!
> If the SPSPathIdProcessor thread gets an inodeId whose path does not exist, 
> it enters an infinite loop and can no longer work normally. 
> The reason is that #ctxt.getNextSPSPath() returns an inodeId whose path does not 
> exist. The inodeId is never reset to null, so the thread holds on to this 
> inodeId forever.
> {code:java}
> public void run() {
>   LOG.info("Starting SPSPathIdProcessor!.");
>   Long startINode = null;
>   while (ctxt.isRunning()) {
>     try {
>       if (!ctxt.isInSafeMode()) {
>         if (startINode == null) {
>           startINode = ctxt.getNextSPSPath();
>         } // else same id will be retried
>         if (startINode == null) {
>           // Waiting for SPS path
>           Thread.sleep(3000);
>         } else {
>           ctxt.scanAndCollectFiles(startINode);
>           // check if directory was empty and no child added to queue
>           DirPendingWorkInfo dirPendingWorkInfo =
>               pendingWorkForDirectory.get(startINode);
>           if (dirPendingWorkInfo != null
>               && dirPendingWorkInfo.isDirWorkDone()) {
>             ctxt.removeSPSHint(startINode);
>             pendingWorkForDirectory.remove(startINode);
>           }
>         }
>         startINode = null; // Current inode successfully scanned.
>       }
>     } catch (Throwable t) {
>       String reClass = t.getClass().getName();
>       if (InterruptedException.class.getName().equals(reClass)) {
>         LOG.info("SPSPathIdProcessor thread is interrupted. Stopping..");
>         break;
>       }
>       LOG.warn("Exception while scanning file inodes to satisfy the policy",
>           t);
>       try {
>         Thread.sleep(3000);
>       } catch (InterruptedException e) {
>         LOG.info("Interrupted while waiting in SPSPathIdProcessor", t);
>         break;
>       }
>     }
>   }
> }
> {code}
>  
>  
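
One possible way to avoid holding a bad inodeId forever is to reset it in the failure path as well as in the success path. The following is a simplified, self-contained simulation of that idea and not necessarily the fix applied in the patch: `PathIdLoopSketch`, `drain`, and the `missing` set are hypothetical, the real `ctxt` API is replaced by a plain queue, and the loop stops when the queue is empty instead of sleeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;

final class PathIdLoopSketch {
  /**
   * Processes ids in queue order. An id whose scan fails is dropped instead of
   * being retried forever; ids in `missing` simulate inodes whose path is gone.
   */
  static List<Long> drain(Queue<Long> pending, Set<Long> missing) {
    List<Long> scanned = new ArrayList<>();
    Long startINode = null;
    while (true) {
      if (startINode == null) {
        startINode = pending.poll();
      }
      if (startINode == null) {
        break; // sketch only: the real thread would sleep and keep running
      }
      try {
        if (missing.contains(startINode)) {
          throw new IllegalStateException("no path for inode " + startINode);
        }
        scanned.add(startINode);
        startINode = null; // current inode successfully scanned
      } catch (IllegalStateException e) {
        // Assumption of this sketch: give up on the id so the next one can be fetched.
        startINode = null;
      }
    }
    return scanned;
  }
}
```

Whether a failing id should be dropped immediately or only after a bounded number of retries is a design decision that this sketch does not settle.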






[jira] [Updated] (HDFS-15987) Improve oiv tool to parse fsimage file in parallel with delimited format

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15987:
--
  Component/s: tools
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Improve oiv tool to parse fsimage file in parallel with delimited format
> 
>
> Key: HDFS-15987
> URL: https://issues.apache.org/jira/browse/HDFS-15987
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.4.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Improve_oiv_tool_001.pdf
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The purpose of this Jira is to improve the oiv tool so that it parses an 
> fsimage file with sub-sections (see -HDFS-14617-) in parallel when using the 
> Delimited format. 
> 1. Serial parsing is time-consuming
> The time to serially parse a large fsimage with the Delimited format (e.g. 
> `hdfs oiv -p Delimited -t  ...`) breaks down as follows: 
> {code:java}
> 1) Loading string table: -> Not time consuming.
> 2) Loading inode references: -> Not time consuming
> 3) Loading directories in INode section: -> Slightly time consuming (3%)
> 4) Loading INode directory section:  -> A bit time consuming (11%)
> 5) Output:   -> Very time consuming (86%){code}
> Therefore, the output stage is the one that benefits most from 
> parallelization.
> 2. How to output in parallel
> The sub-sections are grouped in order. Each thread processes one group and 
> writes it to its own output file, and the per-thread files are merged at the 
> end.
> 3. The result of a test
> {code:java}
>  input fsimage file info:
>  3.4G, 12 sub-sections, 55976500 INodes
>  ------------------------------------------
>  Threads  TotalTime  OutputTime  MergeTime
>  1        18m37s     16m18s      –
>  4        8m7s       4m49s       41s
> {code}
>  
>  
>  
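The group, write and merge scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the oiv implementation: ParallelOutputSketch, groupInOrder and writeAndMerge are hypothetical names, and plain strings stand in for fsimage sub-sections.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelOutputSketch {
  /** Splits items into at most `groups` contiguous groups, preserving order. */
  public static <T> List<List<T>> groupInOrder(List<T> items, int groups) {
    List<List<T>> result = new ArrayList<>();
    int size = (items.size() + groups - 1) / groups; // ceiling division
    for (int i = 0; i < items.size(); i += size) {
      result.add(items.subList(i, Math.min(i + size, items.size())));
    }
    return result;
  }

  /** Writes each group to its own temp file in parallel, then merges them in group order. */
  public static String writeAndMerge(List<String> subSections, int threads)
      throws IOException, InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    Path merged = Files.createTempFile("oiv-merged", ".txt");
    List<Path> parts = new ArrayList<>();
    try {
      List<Future<Path>> futures = new ArrayList<>();
      for (List<String> group : groupInOrder(subSections, threads)) {
        futures.add(pool.submit(() -> {
          Path part = Files.createTempFile("oiv-part", ".txt");
          Files.write(part, group, StandardCharsets.UTF_8); // one line per entry
          return part;
        }));
      }
      for (Future<Path> f : futures) {
        parts.add(f.get()); // collecting in submission order keeps the output ordered
      }
      try (OutputStream out = Files.newOutputStream(merged)) {
        for (Path part : parts) {
          Files.copy(part, out);
        }
      }
      return new String(Files.readAllBytes(merged), StandardCharsets.UTF_8);
    } finally {
      pool.shutdown();
      for (Path part : parts) {
        Files.deleteIfExists(part);
      }
      Files.deleteIfExists(merged);
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.print(writeAndMerge(List.of("s0", "s1", "s2", "s3", "s4"), 2));
  }
}
```

Because the groups are contiguous and merged in submission order, the merged file has the same line order as a serial run. The merge step is the extra cost reported in the MergeTime column.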






[jira] [Updated] (HDFS-16477) [SPS]: Add metric PendingSPSPaths for getting the number of paths to be processed by SPS

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16477:
--
  Component/s: sps
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> [SPS]: Add metric PendingSPSPaths for getting the number of paths to be 
> processed by SPS
> 
>
> Key: HDFS-16477
> URL: https://issues.apache.org/jira/browse/HDFS-16477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: sps
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Currently we have no way of knowing how many paths are waiting to be 
> processed when using the SPS feature. We should add a PendingSPSPaths metric 
> that reports the number of paths to be processed by SPS in the NameNode.






[jira] [Updated] (HDFS-16499) [SPS]: Should not start indefinitely while another SPS process is running

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16499:
--
  Component/s: sps
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> [SPS]: Should not start indefinitely while another SPS process is running
> -
>
> Key: HDFS-16499
> URL: https://issues.apache.org/jira/browse/HDFS-16499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: sps
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Normally, only one SPS process can run at a time. If one process is already 
> running and another is started, the new process retries indefinitely. I 
> think, in this case, it should exit immediately.






[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-13248:
--
  Component/s: rbf
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.5, 2.10.2, 3.4.0
Affects Version/s: 3.3.5
   2.10.2
   3.4.0

> RBF: Namenode need to choose block location for the client
> --
>
> Key: HDFS-13248
> URL: https://issues.apache.org/jira/browse/HDFS-13248
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0, 2.10.2, 3.3.5
>Reporter: Wu Weiwei
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.5
>
> Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, 
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, 
> HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality 
> Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When a put operation is executed via the router, the NameNode chooses the 
> block location for the router, not for the real client. This affects the 
> file's locality.
> I think on both NameNode and Router, we should add a new addBlock method, or 
> add a parameter to the current addBlock method, to pass the real client 
> information.






[jira] [Updated] (HDFS-16458) [SPS]: Fix bug for unit test of reconfiguring SPS mode

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16458:
--
  Component/s: sps
   test
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> [SPS]: Fix bug for unit test of reconfiguring SPS mode
> --
>
> Key: HDFS-16458
> URL: https://issues.apache.org/jira/browse/HDFS-16458
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: sps, test
>Affects Versions: 3.4.0
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In TestNameNodeReconfigure#verifySPSEnabled, {*}isSPSRunning{*} is compared 
> with itself in assertEquals, so the assertion can never fail.
> In addition, since the *internal SPS* has been removed, the *spsService 
> daemon* will not start within StoragePolicySatisfyManager. I think the 
> relevant code can be removed to simplify the test.
> IMO, after reconfiguring the SPS mode, we just need to confirm whether the 
> mode is correct and whether spsManager is NULL.
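To illustrate why comparing a value with itself hides failures, here is a minimal dependency-free sketch. The assertEquals helper below is a stand-in written for this example, not JUnit's, and the variable names are illustrative.

```java
import java.util.Objects;

public class SelfComparisonSketch {
  /** Minimal stand-in for JUnit's assertEquals so the sketch needs no dependencies. */
  public static void assertEquals(Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  /** Returns true if the check passed, false if it raised an AssertionError. */
  public static boolean passes(Runnable check) {
    try {
      check.run();
      return true;
    } catch (AssertionError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    boolean expectedRunning = true;
    boolean isSPSRunning = false; // the state under test is wrong on purpose

    // Comparing the value with itself passes no matter what the value is.
    System.out.println(passes(() -> assertEquals(isSPSRunning, isSPSRunning)));
    // Comparing against the expected value catches the wrong state.
    System.out.println(passes(() -> assertEquals(expectedRunning, isSPSRunning)));
  }
}
```

The first check prints true even though the state is wrong, and the second prints false.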






[jira] [Updated] (HDFS-16222) Fix ViewDFS with mount points for HDFS only API

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16222:
--
  Component/s: viewfs
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix ViewDFS with mount points for HDFS only API
> ---
>
> Key: HDFS-16222
> URL: https://issues.apache.org/jira/browse/HDFS-16222
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.4.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: test_to_repro.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Presently, for HDFS-specific APIs (the ones not present in ViewFileSystem), 
> the resolved path appears to be wrong.






[jira] [Updated] (HDFS-16231) Fix TestDataNodeMetrics#testReceivePacketSlowMetrics

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16231:
--
  Component/s: datanode
   metrics
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix TestDataNodeMetrics#testReceivePacketSlowMetrics
> 
>
> Key: HDFS-16231
> URL: https://issues.apache.org/jira/browse/HDFS-16231
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, metrics
>Affects Versions: 3.4.0
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> TestDataNodeMetrics#testReceivePacketSlowMetrics fails with stacktrace:
> {code:java}
> java.lang.AssertionError: Expected exactly one metric for name TotalPacketsReceived 
> Expected :1
> Actual   :0
>  
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at org.apache.hadoop.test.MetricsAsserts.checkCaptured(MetricsAsserts.java:278)
>   at org.apache.hadoop.test.MetricsAsserts.getLongCounter(MetricsAsserts.java:237)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketSlowMetrics(TestDataNodeMetrics.java:200)
> {code}
> {code:java}
> // Wrong metric names in the current code, e.g. TotalPacketsReceived,
> // TotalPacketsSlowWriteToMirror, TotalPacketsSlowWriteToDisk,
> // TotalPacketsSlowWriteToOsCache
>   MetricsRecordBuilder dnMetrics = getMetrics(datanode.getMetrics().name());
>   assertTrue("More than 1 packet received",
>   getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
>   assertTrue("More than 1 slow packet to mirror",
>   getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
>   assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
>   assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
> {code}






[jira] [Updated] (HDFS-16192) ViewDistributedFileSystem#rename wrongly using src in the place of dst.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-16192:
--
  Component/s: viewfs
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.2, 3.4.0
Affects Version/s: 3.3.2
   3.4.0

> ViewDistributedFileSystem#rename wrongly using src in the place of dst.
> ---
>
> Key: HDFS-16192
> URL: https://issues.apache.org/jira/browse/HDFS-16192
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.4.0, 3.3.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In ViewDistributedFileSystem, we mistakenly use the src path in place of the 
> dst path when finding mount path info.






[jira] [Updated] (HDFS-15671) TestBalancerRPCDelay#testBalancerRPCDelayQpsDefault fails on Trunk

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15671:
--
  Component/s: balancer
   test
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> TestBalancerRPCDelay#testBalancerRPCDelayQpsDefault fails on Trunk
> --
>
> Key: HDFS-15671
> URL: https://issues.apache.org/jira/browse/HDFS-15671
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, test
>Affects Versions: 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: TestBalancerRPCDelay.testBalancerRPCDelayQpsDefault.log
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> qbt report shows failures on TestBalancer
> {code:bash}
> org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay.testBalancerRPCDelayQpsDefault
> Failing for the past 1 build (Since Failed#317 )
> Took 45 sec.
> Error Message
> Timed out waiting for /tmp.txt to reach 20 replicas
> Stacktrace
> java.util.concurrent.TimeoutException: Timed out waiting for /tmp.txt to reach 20 replicas
>   at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:829)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancer.createFile(TestBalancer.java:319)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:865)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerRPCDelay(TestBalancer.java:2193)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay.testBalancerRPCDelayQpsDefault(TestBalancerRPCDelay.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Updated] (HDFS-15973) RBF: Add permission check before doing router federation rename.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15973:
--
  Component/s: rbf
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Add permission check before doing router federation rename.
> 
>
> Key: HDFS-15973
> URL: https://issues.apache.org/jira/browse/HDFS-15973
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15973.001.patch, HDFS-15973.002.patch, 
> HDFS-15973.003.patch, HDFS-15973.004.patch, HDFS-15973.005.patch, 
> HDFS-15973.006.patch, HDFS-15973.007.patch, HDFS-15973.008.patch, 
> HDFS-15973.009.patch, HDFS-15973.010.patch
>
>
> The router federation rename lacks a permission check, which is a security 
> issue.






[jira] [Updated] (HDFS-13975) TestBalancer#testMaxIterationTime fails sporadically

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-13975:
--
 Component/s: balancer
  test
Target Version/s: 3.2.3, 2.10.2, 3.3.1, 3.4.0

> TestBalancer#testMaxIterationTime fails sporadically
> 
>
> Key: HDFS-13975
> URL: https://issues.apache.org/jira/browse/HDFS-13975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, test
>Affects Versions: 3.2.0
>Reporter: Jason Darrell Lowe
>Assignee: Toshihiko Uchida
>Priority: Major
>  Labels: flaky-test, pull-request-available
> Fix For: 3.3.1, 3.4.0, 2.10.2, 3.2.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A number of precommit builds have seen this test fail like this:
> {noformat}
> java.lang.AssertionError: Unexpected iteration runtime: 4021ms > 3.5s
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testMaxIterationTime(TestBalancer.java:1649)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}






[jira] [Updated] (HDFS-15848) Snapshot Operations: Add debug logs at the entry point

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15848:
--
  Component/s: snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Snapshot Operations: Add debug logs at the entry point
> --
>
> Key: HDFS-15848
> URL: https://issues.apache.org/jira/browse/HDFS-15848
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Affects Versions: 3.4.0
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: HDFS-15848.001.patch, HDFS-15848.002.patch, 
> HDFS-15848.003.patch, HDFS-15848.004.patch
>
>
> Add debug logs at the entry point for various Snapshot Operations






[jira] [Updated] (HDFS-15847) create client protocol: add ecPolicyName & storagePolicy param to debug statement string

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15847:
--
  Component/s: erasure-coding
   namanode
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> create client protocol: add ecPolicyName & storagePolicy param to debug 
> statement string 
> -
>
> Key: HDFS-15847
> URL: https://issues.apache.org/jira/browse/HDFS-15847
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, namanode
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15847.0001.patch
>
>
> The debug statement for create (ClientProtocol) ==> namesystem.startFileInt 
> does not print the "ecPolicyName & storagePolicy" params. It would be good to 
> add these params to the debug statement.






[jira] [Updated] (HDFS-15834) Remove the usage of org.apache.log4j.Level

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15834:
--
  Component/s: hdfs-common
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove the usage of org.apache.log4j.Level
> --
>
> Key: HDFS-15834
> URL: https://issues.apache.org/jira/browse/HDFS-15834
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-common
>Affects Versions: 3.4.0
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Replace org.apache.log4j.Level with org.slf4j.event.Level in hadoop-hdfs.






[jira] [Updated] (HDFS-15820) Ensure snapshot root trash provisioning happens only post safe mode exit

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15820:
--
  Component/s: snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Ensure snapshot root trash provisioning happens only post safe mode exit
> 
>
> Key: HDFS-15820
> URL: https://issues.apache.org/jira/browse/HDFS-15820
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Affects Versions: 3.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts 
> along with the trash emptier service, but the namenode might not be out of 
> safe mode by then. This can make the snapshot trash dir creation fail, 
> thereby crashing the namenode. The idea here is to trigger snapshot trash 
> provisioning only after safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
>   at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /upgrade/.Trash. Name node is in safe mode.
> {code}
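The idea of running provisioning only after safe mode exit can be sketched as a deferred-task queue. This is a simplified illustration, not NameNode code: SafeModeDeferSketch, runAfterSafeMode and leaveSafeMode are hypothetical names, and the sketch is single-threaded.

```java
import java.util.ArrayList;
import java.util.List;

public class SafeModeDeferSketch {
  private boolean inSafeMode = true; // assumed: the node starts in safe mode
  private final List<Runnable> onSafeModeExit = new ArrayList<>();

  /** Runs the task now if safe mode is off, otherwise queues it for later. */
  public void runAfterSafeMode(Runnable task) {
    if (inSafeMode) {
      onSafeModeExit.add(task);
    } else {
      task.run();
    }
  }

  /** Leaves safe mode and runs every queued task in submission order. */
  public void leaveSafeMode() {
    inSafeMode = false;
    for (Runnable task : onSafeModeExit) {
      task.run();
    }
    onSafeModeExit.clear();
  }

  public static void main(String[] args) {
    SafeModeDeferSketch node = new SafeModeDeferSketch();
    node.runAfterSafeMode(() -> System.out.println("provisioning snapshot trash roots"));
    System.out.println("startup finished, still in safe mode");
    node.leaveSafeMode();
  }
}
```

With this ordering, a directory creation that safe mode would reject is never attempted during startup.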






[jira] [Updated] (HDFS-15817) Rename snapshots while marking them deleted

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15817:
--
  Component/s: snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Rename snapshots while marking them deleted 
> 
>
> Key: HDFS-15817
> URL: https://issues.apache.org/jira/browse/HDFS-15817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Affects Versions: 3.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> With the ordered snapshot feature turned on, a snapshot is only marked as 
> deleted, and not actually deleted, if it is not the oldest one. Since the 
> snapshot is only marked deleted, creating a new snapshot with the same name 
> as the one marked deleted will fail. To mitigate such problems, the idea 
> here is to rename the snapshot being marked as deleted by appending the 
> deletion timestamp along with the snapshot id to its name.
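A small sketch of the renaming idea, assuming a hypothetical name format of name#timestamp#id. The actual separator and format used by HDFS are not stated here, and DeletedSnapshotNameSketch is an illustrative class, not Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

public class DeletedSnapshotNameSketch {
  private final Map<String, Integer> snapshotsByName = new HashMap<>();
  private int nextId = 0;

  /** Hypothetical format: original name, deletion timestamp and snapshot id. */
  public static String deletedName(String name, long deletionTimeMs, int snapshotId) {
    return name + "#" + deletionTimeMs + "#" + snapshotId;
  }

  /** Creates a snapshot and returns its id; fails if the name is already taken. */
  public int create(String name) {
    if (snapshotsByName.containsKey(name)) {
      throw new IllegalStateException("snapshot already exists: " + name);
    }
    snapshotsByName.put(name, nextId);
    return nextId++;
  }

  /** Marks a snapshot deleted by renaming it, which frees the original name. */
  public String markDeleted(String name, long deletionTimeMs) {
    Integer id = snapshotsByName.remove(name);
    if (id == null) {
      throw new IllegalArgumentException("no such snapshot: " + name);
    }
    String renamed = deletedName(name, deletionTimeMs, id);
    snapshotsByName.put(renamed, id);
    return renamed;
  }

  public static void main(String[] args) {
    DeletedSnapshotNameSketch dir = new DeletedSnapshotNameSketch();
    dir.create("s1");
    System.out.println(dir.markDeleted("s1", 1000L));
    System.out.println(dir.create("s1")); // the name can be reused
  }
}
```

Including the snapshot id keeps the renamed entries unique even when two snapshots with the same name are marked deleted at the same timestamp.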






[jira] [Updated] (HDFS-15767) RBF: Router federation rename of directory.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15767:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Router federation rename of directory.
> ---
>
> Key: HDFS-15767
> URL: https://issues.apache.org/jira/browse/HDFS-15767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15767.001.patch, HDFS-15767.002.patch, 
> HDFS-15767.003.patch, HDFS-15767.004.patch, HDFS-15767.005.patch, 
> HDFS-15767.006.patch, HDFS-15767.007.patch
>
>
> This Jira tries to support renaming a directory across namespaces using the 
> fedbalance framework. 
> We can do the router federation rename when:
>  # Both the src and dst have only one remote location.
>  # The src and dst remote locations are in different namespaces.
>  # The src is a directory (Fedbalance depends on snapshot).
>  # The dst doesn't exist.
> We can implement router federation rename of a file in a new task so the 
> patch won't be too big to review.
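The four preconditions listed above can be expressed as a single predicate. The sketch below is illustrative only: FederationRenameCheckSketch, Location and canFederationRename are hypothetical names and do not reflect the router's actual classes or method signatures.

```java
import java.util.List;

public class FederationRenameCheckSketch {
  /** Hypothetical value type: one resolved remote location of a mount path. */
  public static final class Location {
    public final String nameservice;
    public final String path;

    public Location(String nameservice, String path) {
      this.nameservice = nameservice;
      this.path = path;
    }
  }

  /** Returns true only when all four preconditions from the description hold. */
  public static boolean canFederationRename(List<Location> src, List<Location> dst,
      boolean srcIsDirectory, boolean dstExists) {
    return src.size() == 1 && dst.size() == 1                      // one remote location each
        && !src.get(0).nameservice.equals(dst.get(0).nameservice)  // different namespaces
        && srcIsDirectory                                          // snapshots need a directory
        && !dstExists;                                             // dst must not exist
  }

  public static void main(String[] args) {
    List<Location> src = List.of(new Location("ns0", "/a"));
    List<Location> dst = List.of(new Location("ns1", "/b"));
    System.out.println(canFederationRename(src, dst, true, false));
    System.out.println(canFederationRename(src, src, true, false));
  }
}
```

A rename within the same namespace fails the second condition and would presumably be handled by the ordinary rename path instead.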






[jira] [Updated] (HDFS-15672) TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy fails on trunk

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15672:
--
  Component/s: balancer
   test
 Target Version/s: 3.2.3, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.4.0

> TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy 
> fails on trunk
> ---
>
> Key: HDFS-15672
> URL: https://issues.apache.org/jira/browse/HDFS-15672
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, test
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Ahmed Hussein
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> qbt report shows the following error:
> {code:bash}
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy
> Failing for the past 1 build (Since Failed#317 )
> Took 10 min.
> Error Message
> test timed out after 60 milliseconds
> Stacktrace
> org.junit.runners.model.TestTimedOutException: test timed out after 60 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.sleep(TestBalancerWithMultipleNameNodes.java:353)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.wait(TestBalancerWithMultipleNameNodes.java:159)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:175)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:550)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy(TestBalancerWithMultipleNameNodes.java:609)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Updated] (HDFS-15762) TestMultipleNNPortQOP#testMultipleNNPortOverwriteDownStream fails intermittently

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15762:
--
  Component/s: hdfs
   test
 Target Version/s: 3.2.3, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.4.0

> TestMultipleNNPortQOP#testMultipleNNPortOverwriteDownStream fails 
> intermittently
> 
>
> Key: HDFS-15762
> URL: https://issues.apache.org/jira/browse/HDFS-15762
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, test
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
>  Labels: flaky-test, pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: PR2585#1-TestMultipleNNPortQOP-output.txt
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> This unit test failed in https://github.com/apache/hadoop/pull/2585 due to an 
> AssertionError.
> {code}
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.TestMultipleNNPortQOP.testMultipleNNPortOverwriteDownStream(TestMultipleNNPortQOP.java:267)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> The failure occurred at the following assertion.
> {code}
>   doTest(fsPrivacy, PATH1);
>   for (int i = 0; i < 2; i++) {
>     DataNode dn = dataNodes.get(i);
>     SaslDataTransferClient saslClient = dn.getSaslClient();
>     String qop = null;
>     // It may take some time for the qop to populate
>     // to all DNs, check in a loop.
>     for (int trial = 0; trial < 10; trial++) {
>       qop = saslClient.getTargetQOP();
>       if (qop != null) {
>         break;
>       }
>       Thread.sleep(100);
>     }
>     assertEquals("auth", qop);
>   }
> {code}
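
The retry loop above exits as soon as the QOP is non-null and gives up after a fixed ten attempts. As an illustration only (this is not the fix applied in the pull request), a deadline-based poll that waits for a specific expected value could look like the sketch below. `PollSketch` and `pollUntil` are hypothetical names that use only the JDK; Hadoop's own test code has `GenericTestUtils.waitFor` for this kind of wait.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class PollSketch {
  private PollSketch() {}

  /**
   * Polls the supplier until the predicate accepts its value or the timeout
   * elapses. Returns the last observed value either way, so the caller can
   * assert on it and see what was actually read.
   */
  static <T> T pollUntil(Supplier<T> supplier, Predicate<T> accepted,
      long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      T value = supplier.get();
      if (value != null && accepted.test(value)) {
        return value;
      }
      if (System.nanoTime() - deadline >= 0) {
        return value; // timed out: hand back whatever was seen last
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return value;
      }
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated source whose value only becomes available on the third poll.
    String qop = pollUntil(() -> ++calls[0] >= 3 ? "auth" : null,
        "auth"::equals, 10, 1000);
    System.out.println(qop + " after " + calls[0] + " polls");
  }
}
```

In the test above, the supplier would be `saslClient::getTargetQOP` and the result would be passed to `assertEquals("auth", ...)`. Whether waiting longer would actually remove the flakiness is not established by this report.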




[jira] [Updated] (HDFS-14558) RBF: Isolation/Fairness documentation

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-14558:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Isolation/Fairness documentation
> -
>
> Key: HDFS-14558
> URL: https://issues.apache.org/jira/browse/HDFS-14558
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: CR Hota
>Assignee: Fengnan Li
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-14558.001.patch, HDFS-14558.002.patch, 
> HDFS-14558.003.patch
>
>
> Documentation is needed to make users aware of this feature HDFS-14090.






[jira] [Updated] (HDFS-15702) Fix intermittent failure of TestDecommission#testAllocAndIBRWhileDecommission

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15702:
--
  Component/s: hdfs
   test
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Fix intermittent failure of TestDecommission#testAllocAndIBRWhileDecommission
> --
>
> Key: HDFS-15702
> URL: https://issues.apache.org/jira/browse/HDFS-15702
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, test
>Affects Versions: 3.4.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {noformat}
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.TestDecommission.testAllocAndIBRWhileDecommission(TestDecommission.java:1025)
> {noformat}






[jira] [Updated] (HDFS-15766) RBF: MockResolver.getMountPoints() breaks the semantic of FileSubclusterResolver.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15766:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: MockResolver.getMountPoints() breaks the semantic of 
> FileSubclusterResolver.
> -
>
> Key: HDFS-15766
> URL: https://issues.apache.org/jira/browse/HDFS-15766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15766.001.patch, HDFS-15766.002.patch, 
> HDFS-15766.003.patch
>
>
> MockResolver.getMountPoints() breaks the semantics of 
> FileSubclusterResolver.getMountPoints(). Currently it returns null when the 
> path is a mount point and no mount points are under the path. 
> {quote}Return zero-length list if the path is a mount point but there are no 
> mount points under the path.
> {quote}
>  
> This is required by router federation rename. I found this bug while writing 
> a unit test for the RBF rename. Let's fix it here to avoid mixing it up with 
> the router federation rename.
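
A minimal sketch of the contract quoted above, using a hypothetical in-memory resolver rather than the real `MockResolver`. The class and method names here are illustrative, and returning null for a path that neither is nor contains a mount point is an assumption; the description only confirms the zero-length-list case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a resolver; not Hadoop's MockResolver. */
final class MountPointSketch {
  private final Set<String> mounts = new TreeSet<>();

  void addMount(String path) {
    mounts.add(path);
  }

  /**
   * Returns the names of the immediate children of path that lead to mount
   * points; a zero-length list if path is a mount point with nothing under
   * it; null otherwise (the null case is an assumption of this sketch).
   */
  List<String> getMountPoints(String path) {
    String prefix = path.endsWith("/") ? path : path + "/";
    Set<String> children = new TreeSet<>();
    for (String mount : mounts) {
      if (mount.startsWith(prefix) && mount.length() > prefix.length()) {
        String rest = mount.substring(prefix.length());
        int slash = rest.indexOf('/');
        children.add(slash < 0 ? rest : rest.substring(0, slash));
      }
    }
    if (!children.isEmpty()) {
      return new ArrayList<>(children);
    }
    if (mounts.contains(path)) {
      return new ArrayList<>(); // mount point with no mounts below: empty, not null
    }
    return null;
  }

  public static void main(String[] args) {
    MountPointSketch resolver = new MountPointSketch();
    resolver.addMount("/a");
    resolver.addMount("/a/b");
    resolver.addMount("/c");
    System.out.println(resolver.getMountPoints("/a"));
    System.out.println(resolver.getMountPoints("/c"));
    System.out.println(resolver.getMountPoints("/x"));
  }
}
```

Callers such as a rename implementation can then tell "mount point with no children" (empty list) apart from "not found" (null) without an extra lookup.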






[jira] [Updated] (HDFS-15748) RBF: Move the router related part from hadoop-federation-balance module to hadoop-hdfs-rbf.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15748:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Move the router related part from hadoop-federation-balance module to 
> hadoop-hdfs-rbf.
> ---
>
> Key: HDFS-15748
> URL: https://issues.apache.org/jira/browse/HDFS-15748
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15748.001.patch, HDFS-15748.002.patch, 
> HDFS-15748.003.patch, HDFS-15748.004.patch
>
>







[jira] [Updated] (HDFS-15648) TestFileChecksum should be parameterized

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15648:
--
  Component/s: test
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> TestFileChecksum should be parameterized
> 
>
> Key: HDFS-15648
> URL: https://issues.apache.org/jira/browse/HDFS-15648
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {{TestFileChecksumCompositeCrc}} extends {{TestFileChecksum}}, overriding 3 
> methods that each return a constant True/False flag.
> The subclass is unnecessary and causes confusion, with two different jiras 
> filed while the main bug should be in {{TestFileChecksum}}.
> {{TestFileChecksum}} should be parameterized instead.






[jira] [Updated] (HDFS-15677) TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15677:
--
  Component/s: rbf
   test
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk
> 
>
> Key: HDFS-15677
> URL: https://issues.apache.org/jira/browse/HDFS-15677
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf, test
>Affects Versions: 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> qbt report (Nov 8, 2020, 11:28 AM) shows failures in 
> testGetCachedDatanodeReport






[jira] [Updated] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15674:
--
  Component/s: datanode
   test
 Target Version/s: 3.3.6, 3.4.0
Affects Version/s: 3.3.6
   3.4.0

> TestBPOfferService#testMissBlocksWhenReregister fails on trunk
> --
>
> Key: HDFS-15674
> URL: https://issues.apache.org/jira/browse/HDFS-15674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Ahmed Hussein
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in 
> testMissBlocksWhenReregister 






[jira] [Updated] (HDFS-15643) EC: Fix checksum computation in case of native encoders

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15643:
--
  Component/s: erasure-coding
 Target Version/s: 3.2.3, 3.3.1, 3.2.2, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.2.2
   3.4.0

> EC: Fix checksum computation in case of native encoders
> ---
>
> Key: HDFS-15643
> URL: https://issues.apache.org/jira/browse/HDFS-15643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>Reporter: Ahmed Hussein
>Assignee: Ayush Saxena
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HDFS-15643-01.patch, Test-Fix-01.patch, 
> TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery17.log, 
> org.apache.hadoop.hdfs.TestFileChecksum-output.txt, 
> org.apache.hadoop.hdfs.TestFileChecksum.txt
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> There are many failures in {{TestFileChecksumCompositeCrc}}. The test cases 
> {{testStripedFileChecksumWithMissedDataBlocksRangeQueryXX}} fail. The 
> following are sample stack traces from two of them, Query7 and Query8.
> {code:bash}
> org.apache.hadoop.fs.PathIOException: `/striped/stripedFileChecksum1': Fail 
> to get block checksum for 
> LocatedStripedBlock{BP-1812707539-172.17.0.3-1602771351154:blk_-9223372036854775792_1001;
>  getBlockSize()=37748736; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:36687,DS-b00139f0-4f28-4870-8f72-b726bd339e23,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36303,DS-49a3c58e-da4a-4256-b1f9-893e4003ec94,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43975,DS-ac278858-b6c8-424f-9e20-58d718dabe31,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:37507,DS-17f9d8d8-f8d3-443b-8df7-29416a2f5cb0,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:36441,DS-7e9d19b5-6220-465f-b33e-f8ed0e60fb07,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:42555,DS-ce679f5e-19fe-45b0-a0cd-8d8bec2f4735,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:39093,DS-4a7f54bb-dd39-4b5b-8dee-31a1b565cd7f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:41699,DS-e1f939f3-37e7-413e-a522-934243477d81,DISK]];
>  indices=[1, 2, 3, 4, 5, 6, 7, 8]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$StripedFileNonStripedChecksumComputer.checksumBlocks(FileChecksumHelper.java:640)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:252)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumInternal(DFSClient.java:1851)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getFileChecksumWithCombineMode(DFSClient.java:1871)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1902)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$34.doCall(DistributedFileSystem.java:1899)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1916)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:584)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:295)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery7(TestFileChecksum.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> {code:bash}
> Error Message
> `/striped/stripedFileChecksum1': Fail to get block checksum for 
> LocatedStripedBlock{BP-1299291876-172.17.0.3-1

[jira] [Updated] (HDFS-15460) TestFileCreation#testServerDefaultsWithMinimalCaching fails intermittently

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15460:
--
  Component/s: hdfs
   test
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> TestFileCreation#testServerDefaultsWithMinimalCaching fails intermittently
> --
>
> Key: HDFS-15460
> URL: https://issues.apache.org/jira/browse/HDFS-15460
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, test
>Affects Versions: 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available, test
> Fix For: 3.4.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {{TestFileCreation.testServerDefaultsWithMinimalCaching}} fails 
> intermittently on trunk
> {code:bash}
> [ERROR] Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 103.413 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileCreation
> [ERROR] 
> testServerDefaultsWithMinimalCaching(org.apache.hadoop.hdfs.TestFileCreation) 
>  Time elapsed: 2.435 s  <<< FAILURE!
> java.lang.AssertionError: expected:<402653184> but was:<268435456>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.hdfs.TestFileCreation.testServerDefaultsWithMinimalCaching(TestFileCreation.java:279)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}






[jira] [Updated] (HDFS-9776) TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-9776:
-
  Component/s: test
 Target Version/s: 3.2.3, 3.3.1, 3.2.2, 3.4.0
Affects Version/s: 3.2.3
   3.3.1
   3.2.2
   3.4.0

> TestHAAppend#testMultipleAppendsDuringCatchupTailing is flaky
> -
>
> Key: HDFS-9776
> URL: https://issues.apache.org/jira/browse/HDFS-9776
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>Reporter: Vinayakumar B
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.2.3
>
> Attachments: TestHAAppend.testMultipleAppendsDuringCatchupTailing.log
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Initial analysis of a recent test failure in 
> {{TestHAAppend#testMultipleAppendsDuringCatchupTailing}}
> [here|https://builds.apache.org/job/PreCommit-HDFS-Build/14420/testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestHAAppend/testMultipleAppendsDuringCatchupTailing/]
>  
> has found that if the Active NameNode goes down immediately after a truncate 
> operation, but before the BlockRecovery command is sent to the datanode, 
> then the block will never be truncated.






[jira] [Updated] (HDFS-15640) Add diff threshold to FedBalance

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15640:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Add diff threshold to FedBalance
> 
>
> Key: HDFS-15640
> URL: https://issues.apache.org/jira/browse/HDFS-15640
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15640.001.patch, HDFS-15640.002.patch, 
> HDFS-15640.003.patch, HDFS-15640.004.patch
>
>
> Currently the DistCpProcedure must submit distcp round by round until 
> there is no diff before it can go to the final distcp stage. This condition 
> is very strict. During the incremental copy stage, if the diff size is under 
> the given threshold, we don't need to wait until there is no diff. We can 
> start the final distcp directly.
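
The proposed condition can be sketched as a small loop. This is a hypothetical model, not the actual `DistCpProcedure` code: `DiffThresholdSketch` and `incrementalRounds` are invented names, the diff size is assumed to be a count of diff entries, and "under the threshold" is assumed to mean less than or equal to it.

```java
import java.util.function.IntSupplier;

/** Hypothetical model of the round loop; not the real DistCpProcedure. */
final class DiffThresholdSketch {
  private DiffThresholdSketch() {}

  /**
   * Runs incremental copy rounds while the current diff size exceeds the
   * threshold, then returns how many rounds were needed before the final
   * distcp stage could start. A threshold of 0 models the strict behavior.
   */
  static int incrementalRounds(IntSupplier currentDiffSize, Runnable copyRound,
      int threshold) {
    int rounds = 0;
    while (currentDiffSize.getAsInt() > threshold) {
      copyRound.run();
      rounds++;
    }
    return rounds;
  }

  public static void main(String[] args) {
    // Simulated diff sizes observed before each successive round.
    int[] diffs = {120, 40, 7, 2, 0};
    int[] idx = {0};
    int strict = incrementalRounds(() -> diffs[idx[0]], () -> idx[0]++, 0);
    idx[0] = 0;
    int relaxed = incrementalRounds(() -> diffs[idx[0]], () -> idx[0]++, 10);
    System.out.println("strict=" + strict + ", relaxed=" + relaxed);
  }
}
```

With the simulated sequence, the strict condition needs four rounds while a threshold of 10 needs two, which is the saving the issue describes.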






[jira] [Updated] (HDFS-15614) Initialize snapshot trash root during NameNode startup if enabled

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15614:
--
  Component/s: namanode
   snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Initialize snapshot trash root during NameNode startup if enabled
> -
>
> Key: HDFS-15614
> URL: https://issues.apache.org/jira/browse/HDFS-15614
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namanode, snapshots
>Affects Versions: 3.4.0
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) snapshot trash root for all existing snapshottable 
> directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to 
> {{true}}. So admins won't have to run {{dfsadmin -provisionTrash}} manually 
> on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the 
> client side. But in order for the NN to create it at startup, the logic must 
> also be implemented on the server side, which is also required by WebHDFS 
> (HDFS-15612).
> 2. Alternatively, we can provide an extra parameter to the 
> {{-provisionTrash}} command like: {{dfsadmin -provisionTrash -all}} to 
> initialize/provision trash root on all existing snapshottable dirs.






[jira] [Updated] (HDFS-15598) ViewHDFS#canonicalizeUri should not be restricted to DFS only API.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15598:
--
 Component/s: viewfs
Target Version/s: 3.4.0

> ViewHDFS#canonicalizeUri should not be restricted to DFS only API.
> --
>
> Key: HDFS-15598
> URL: https://issues.apache.org/jira/browse/HDFS-15598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As part of Hive partition verification, an insert failed because 
> canonicalizeUri is restricted to DFS only. This can be relaxed by delegating 
> to vfs#canonicalizeUri.






[jira] [Updated] (HDFS-15585) ViewDFS#getDelegationToken should not throw UnsupportedOperationException.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15585:
--
 Component/s: viewfs
Target Version/s: 3.3.1, 3.4.0

> ViewDFS#getDelegationToken should not throw UnsupportedOperationException.
> --
>
> Key: HDFS-15585
> URL: https://issues.apache.org/jira/browse/HDFS-15585
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When starting Hive in a secure environment, it throws 
> UnsupportedOperationException from ViewDFS.
> at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:736) 
> ~[hive-service-3.1.3000.7.2.3.0-54.jar:3.1.3000.7.2.3.0-54]
>   at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1077)
>  ~[hive-service-3.1.3000.7.2.3.0-54.jar:3.1.3000.7.2.3.0-54]
>   ... 9 more
> Caused by: java.lang.UnsupportedOperationException
>   at 
> org.apache.hadoop.hdfs.ViewDistributedFileSystem.getDelegationToken(ViewDistributedFileSystem.java:1042)
>  ~[hadoop-hdfs-client-3.1.1.7.2.3.0-54.jar:?]
>   at 
> org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
>  ~[hadoop-common-3.1.1.7.2.3.0-54.jar:?]
>   at 
> org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
>  ~[hadoop-common-3.1.1.7.2.3.0-54.jar:?]
>   at 
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:140)
>  ~[tez-api-0.9.1.7.2.3.0-54.jar:0.9.1.7.2.3.0-54]
>   at 
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:101)
>  ~[tez-api-0.9.1.7.2.3.0-54.jar:0.9.1.7.2.3.0-54]
>   at 
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(TokenCache.java:77)
>  ~[tez-api-0.9.1.7.2.3.0-54.jar:0.9.1.7.2.3.0-54]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createLlapCredentials(TezSessionState.java:443)
>  ~[hive-exec-3.1.3000.7.2.3.0-54.jar:3.1.3000.7.2.3.0-54]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:354)
>  ~[hive-exec-3.1.3000.7.2.3.0-54.jar:3.1.3000.7.2.3.0-54]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:313)
>  ~[hive-exec-3.1.3000.7.2.3.0-54.jar:3.1.3000.7.2.3.0-54]






[jira] [Updated] (HDFS-15532) listFiles on root/InternalDir will fail if fallback root has file

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15532:
--
 Component/s: viewfs
Target Version/s: 3.3.1, 3.4.0

> listFiles on root/InternalDir will fail if fallback root has file
> -
>
> Key: HDFS-15532
> URL: https://issues.apache.org/jira/browse/HDFS-15532
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The listFiles implementation gets the RemoteIterator created in 
> InternalViewFSDirFs, as the root is an InternalViewFSDir.  
> If there is a fallback and a file exists at the root level, it will have 
> been collected when collecting locatedStatuses. 
> When iterating over that fallback's file from the RemoteIterator (which 
> was returned from InternalViewFSDirFs), the iterator's next() will call 
> getFileBlockLocations if it's a file.
> {code:java}
> @Override
> public LocatedFileStatus next() throws IOException {
>   System.out.println(this);
>   if (!hasNext()) {
>     throw new NoSuchElementException("No more entries in " + f);
>   }
>   FileStatus result = stats[i++];
>   // for files, use getBlockLocations(FileStatus, int, int) to avoid
>   // calling getFileStatus(Path) to load the FileStatus again
>   BlockLocation[] locs = result.isFile() ?
>       getFileBlockLocations(result, 0, result.getLen()) :
>       null;
>   return new LocatedFileStatus(result, locs);
> }{code}
>  
> This getFileBlockLocations call will be made on InternalViewFSDirFs, as the 
> iterator was originally created from that fs. 
> InternalViewFSDirFs#getFileBlockLocations does not handle fallback cases. 
> It always expects "/", which means it always assumes a directory.
> But with a fallback, returning the iterator from InternalViewFSDirFs will 
> create problems.
> We probably need to handle the fallback case in getFileBlockLocations as 
> well. (A fallback should be the only reason for a call to reach 
> InternalViewFSDirFs with a path other than "/".)
>  






[jira] [Updated] (HDFS-15558) ViewDistributedFileSystem#recoverLease should call super.recoverLease when there are no mounts configured

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15558:
--
  Component/s: viewfs
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> ViewDistributedFileSystem#recoverLease should call super.recoverLease when 
> there are no mounts configured
> -
>
> Key: HDFS-15558
> URL: https://issues.apache.org/jira/browse/HDFS-15558
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: viewfs
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>







[jira] [Updated] (HDFS-15496) Add UI for deleted snapshots

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15496:
--
  Component/s: snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Add UI for deleted snapshots
> 
>
> Key: HDFS-15496
> URL: https://issues.apache.org/jira/browse/HDFS-15496
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Affects Versions: 3.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
> Fix For: 3.4.0
>
>
> Add UI for deleted snapshots
> a) Show the list of snapshots per snapshottable directory
> b) Add deleted status in the JMX output for the Snapshot along with a snap ID
> e) The NN UI should sort the snapshots by snap ID.






[jira] [Updated] (HDFS-15518) Wrong operation name in FsNamesystem for listSnapshots

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15518:
--
  Component/s: snapshots
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Wrong operation name in FsNamesystem for listSnapshots
> --
>
> Key: HDFS-15518
> URL: https://issues.apache.org/jira/browse/HDFS-15518
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Affects Versions: 3.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Aryan Gupta
>Priority: Major
> Fix For: 3.4.0
>
>
> The list snapshots operation uses listSnapshotDirectory as the operation 
> name string instead of ListSnapshot.
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L7026






[jira] [Updated] (HDFS-15374) Add documentation for fedbalance tool

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15374:
--
  Component/s: documentation
   rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Add documentation for fedbalance tool
> -
>
> Key: HDFS-15374
> URL: https://issues.apache.org/jira/browse/HDFS-15374
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation, rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: BalanceProcedureScheduler.png, 
> FedBalance_Screenshot1.jpg, FedBalance_Screenshot2.jpg, 
> FedBalance_Screenshot3.jpg, HDFS-15374.001.patch, HDFS-15374.002.patch, 
> HDFS-15374.003.patch, HDFS-15374.004.patch, HDFS-15374.005.patch
>
>
> Add documentation for fedbalance tool.






[jira] [Updated] (HDFS-15410) Add separated config file hdfs-fedbalance-default.xml for fedbalance tool

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15410:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Add separated config file hdfs-fedbalance-default.xml for fedbalance tool
> -
>
> Key: HDFS-15410
> URL: https://issues.apache.org/jira/browse/HDFS-15410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15410.001.patch, HDFS-15410.002.patch, 
> HDFS-15410.003.patch, HDFS-15410.004.patch, HDFS-15410.005.patch
>
>
> Add a separate config file named hdfs-fedbalance-default.xml for fedbalance 
> tool configs, similar to distcp-default.xml for the distcp tool.






[jira] [Updated] (HDFS-15346) FedBalance tool implementation

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15346:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> FedBalance tool implementation
> --
>
> Key: HDFS-15346
> URL: https://issues.apache.org/jira/browse/HDFS-15346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch, 
> HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch, 
> HDFS-15346.006.patch, HDFS-15346.007.patch, HDFS-15346.008.patch, 
> HDFS-15346.009.patch, HDFS-15346.010.patch, HDFS-15346.011.patch, 
> HDFS-15346.012.patch
>
>
> This Jira implements the HDFS FedBalance tool based on the basic framework 
> in HDFS-15340. The whole process of the HDFS federation balance tool is 
> implemented in this Jira. See the documentation at HDFS-15374/patch-v05 for 
> a detailed description of the HDFS fedbalance tool.






[jira] [Updated] (HDFS-15340) RBF: Implement BalanceProcedureScheduler basic framework

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15340:
--
  Component/s: rbf
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RBF: Implement BalanceProcedureScheduler basic framework
> 
>
> Key: HDFS-15340
> URL: https://issues.apache.org/jira/browse/HDFS-15340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch, HDFS-15340.004.patch, HDFS-15340.005.patch, 
> HDFS-15340.006.patch, HDFS-15340.007.patch, HDFS-15340.008.patch
>
>
> This Jira implements the basic framework (Balance Procedure Scheduler) of the 
> HDFS federation balance tool. 
> The Balance Procedure Scheduler implements a state machine. It’s responsible 
> for scheduling a balance job, including submitting, running, delaying and 
> recovering it. See the documentation at HDFS-15374/patch-v05 for a detailed 
> description of the state machine.
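
As a rough illustration of the submit/run/delay/recover cycle mentioned above, here is a minimal sketch. It is not the code from the attached patches: BalanceSchedulerSketch, Procedure and Job are hypothetical names, and the delay is only counted rather than scheduled. The idea shown is that a job records the index of its next procedure, so running it again from that index doubles as recovery.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scheduler; names are illustrative, not the classes from the patch. */
class BalanceSchedulerSketch {
  /** One step of a job. Returns true when finished, false to be retried. */
  interface Procedure {
    boolean execute();
  }

  /** A job is an ordered list of procedures plus the index of the next one. */
  static final class Job {
    final List<Procedure> procedures = new ArrayList<>();
    int next = 0;    // a real scheduler would persist this for recovery
    int delays = 0;  // how many times a procedure asked to be retried

    boolean isDone() {
      return next >= procedures.size();
    }
  }

  /** Runs the job from its recorded position; also serves as "recover". */
  static void run(Job job) {
    while (!job.isDone()) {
      if (job.procedures.get(job.next).execute()) {
        job.next++;    // advance the state machine
      } else {
        job.delays++;  // a real scheduler would re-queue after a delay
      }
    }
  }

  public static void main(String[] args) {
    Job job = new Job();
    int[] attempts = {0};
    job.procedures.add(() -> true);                // finishes immediately
    job.procedures.add(() -> ++attempts[0] >= 3);  // succeeds on 3rd attempt
    run(job);
    System.out.println("done=" + job.isDone() + " delays=" + job.delays);
  }
}
```

See the documentation referenced in the description for the actual state machine; this sketch makes no claim about its states or transitions.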






[jira] [Updated] (HDFS-15146) TestBalancerRPCDelay. testBalancerRPCDelayQpsDefault fails intermittently

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15146:
--
  Component/s: balancer
   test
 Target Version/s: 2.10.1, 3.2.2, 3.3.0, 3.4.0
Affects Version/s: 2.10.1
   3.2.2
   3.3.0
   3.4.0

> TestBalancerRPCDelay. testBalancerRPCDelayQpsDefault fails intermittently
> -
>
> Key: HDFS-15146
> URL: https://issues.apache.org/jira/browse/HDFS-15146
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, test
>Affects Versions: 3.3.0, 3.2.2, 2.10.1, 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.2.2, 2.10.1, 3.4.0
>
> Attachments: HDFS-15146-branch-2.10.001.patch, HDFS-15146.001.patch
>
>
> TestBalancerRPCDelay.testBalancerRPCDelayQpsDefault fails intermittently when 
> the number of blocks does not match the expected value. In 
> {{testBalancerRPCDelay}}, it seems that some datanodes are not yet up by the 
> time we fetch the block locations.
> I see the following stack trace:
> {code:bash}
> [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 39.969 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay
> [ERROR] testBalancerRPCDelayQpsDefault(org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay)  Time elapsed: 12.035 s  <<< FAILURE!
> java.lang.AssertionError: Number of getBlocks should be not less than 20
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerRPCDelay(TestBalancer.java:2197)
>   at org.apache.hadoop.hdfs.server.balancer.TestBalancerRPCDelay.testBalancerRPCDelayQpsDefault(TestBalancerRPCDelay.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
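
A common way to remove this kind of startup race from a test is to poll for the precondition before asserting. The sketch below is a generic, self-contained polling helper, not the change in the attached patches; WaitForSketch and waitFor are hypothetical names, and the "datanodes are up" check is replaced by a stand-in condition.

```java
import java.util.function.BooleanSupplier;

/** Generic polling helper; a sketch, not the fix from the attached patch. */
class WaitForSketch {
  /**
   * Polls the check every intervalMs until it returns true or timeoutMs
   * elapses. Returns whether the condition was met in time.
   */
  static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;  // timed out before the condition held
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve the interrupt flag
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Stand-in for "all datanodes have registered".
    boolean ready = waitFor(() -> System.currentTimeMillis() - start >= 50, 10, 1000);
    System.out.println("ready=" + ready);
  }
}
```

In a test, the block locations would be fetched only after such a wait returns true, and the test would fail fast with a clear message otherwise.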






[jira] [Updated] (HDFS-15898) Test case TestOfflineImageViewer fails

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15898:
--
  Component/s: test
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> Test case TestOfflineImageViewer fails
> --
>
> Key: HDFS-15898
> URL: https://issues.apache.org/jira/browse/HDFS-15898
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.3.1, 3.4.0
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The following 3 test cases failed locally:
> TestOfflineImageViewer#testWriterOutputEntryBuilderForFile
>  
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/file,5,2000-01-01 00:00,2000-01-01 00:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
> Actual   :/path/file,5,2000-01-01 08:00,2000-01-01 08:00,1024,3,3072,0,0,-rwx-wx-w-+,user_1,group_1
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForFile(TestOfflineImageViewer.java:760){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForDirectory
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/dir,0,2000-01-01 00:00,1970-01-01 00:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
> Actual   :/path/dir,0,2000-01-01 08:00,1970-01-01 08:00,0,0,0,700,1000,drwx-wx-w-+,user_1,group_1
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForDirectory(TestOfflineImageViewer.java:768){code}
> TestOfflineImageViewer#testWriterOutputEntryBuilderForSymlink
> {code:java}
> org.junit.ComparisonFailure: org.junit.ComparisonFailure:
> Expected :/path/sym,0,2000-01-01 00:00,2000-01-01 00:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
> Actual   :/path/sym,0,2000-01-01 08:00,2000-01-01 08:00,0,0,0,0,0,-rwx-wx-w-,user_1,group_1
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testWriterOutputEntryBuilderForSymlink(TestOfflineImageViewer.java:776){code}
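
In all three failures the actual timestamps are exactly eight hours ahead of the expected ones. That pattern is consistent with the timestamps being formatted in the machine's local time zone (for example UTC+8) rather than in UTC, although the description does not state the cause, so treat this as an assumption. The sketch below uses only the JDK to show how the same instant renders differently per zone; TimeZoneFormatSketch and format are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Shows how one instant renders differently depending on the time zone. */
class TimeZoneFormatSketch {
  /** Formats epoch milliseconds as "yyyy-MM-dd HH:mm" in the given zone. */
  static String format(long epochMillis, String zoneId) {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
    fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
    return fmt.format(new Date(epochMillis));
  }

  public static void main(String[] args) {
    long y2k = 946684800000L;  // 2000-01-01T00:00:00Z
    System.out.println(format(y2k, "UTC"));        // zone-independent expectation
    System.out.println(format(y2k, "GMT+08:00"));  // what a UTC+8 machine prints
    System.out.println(format(0L, "GMT+08:00"));   // the epoch on a UTC+8 machine
  }
}
```

If that assumption holds, pinning the formatter or the test to a fixed zone would make the expected strings independent of where the test runs.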





