[jira] [Commented] (HDFS-15765) Add support for Kerberos and Basic Auth in webhdfs

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697229#comment-17697229
 ] 

ASF GitHub Bot commented on HDFS-15765:
---

saxenapranav commented on code in PR #5447:
URL: https://github.com/apache/hadoop/pull/5447#discussion_r1127353729


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/URLConnectionFactory.java:
##
@@ -120,7 +136,7 @@ public HttpURLConnection configure(HttpURLConnection connection)
   }
 }
 
-return conn;
+return new BasicAuthConfigurator(conn, basicAuthCredentials);

Review Comment:
   should we have something like:
   ```
   private ConnectionConfigurator getConfigurator(ConnectionConfigurator configurator, String basicAuthCred) {
     if (basicAuthCred != null && !basicAuthCred.isEmpty()) {
       return new BasicAuthConfigurator(configurator, basicAuthCred);
     } else {
       return configurator;
     }
   }
   ```
   and instead of creating a new BasicAuthConfigurator object directly, we call this 
new method; if a BasicAuthConfigurator object can be made, the method will 
return that new object.
   What do you think?
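The suggested helper boils down to a conditional wrap: return a wrapping configurator only when credentials are present, otherwise hand back the original unchanged. A standalone sketch of that semantics (the type names here are illustrative stand-ins, not Hadoop's ConnectionConfigurator or the PR's BasicAuthConfigurator):

```java
// Illustrative model of the suggested factory method: wrap only when
// Basic-auth credentials are non-null and non-empty.
final class ConfiguratorFactory {
  interface Configurator {}

  // stand-in for the PR's BasicAuthConfigurator
  static final class BasicAuth implements Configurator {
    final Configurator parent;
    final String credentials;
    BasicAuth(Configurator parent, String credentials) {
      this.parent = parent;
      this.credentials = credentials;
    }
  }

  static Configurator getConfigurator(Configurator parent, String basicAuthCred) {
    if (basicAuthCred != null && !basicAuthCred.isEmpty()) {
      return new BasicAuth(parent, basicAuthCred);
    }
    return parent;
  }
}

public class FactoryDemo {
  public static void main(String[] args) {
    ConfiguratorFactory.Configurator base = new ConfiguratorFactory.Configurator() {};
    // empty credentials: the original configurator is returned unchanged
    System.out.println(ConfiguratorFactory.getConfigurator(base, "") == base);
    // real credentials: a wrapping BasicAuth configurator is returned
    System.out.println(ConfiguratorFactory.getConfigurator(base, "user:pass")
        instanceof ConfiguratorFactory.BasicAuth);
  }
}
```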



##
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/TestBasicAuthConfigurator.java:
##
@@ -0,0 +1,67 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdfs.web;
+
+import org.apache.hadoop.security.authentication.client.ConnectionConfigurator;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+import java.io.IOException;
+import java.net.HttpURLConnection;
+
+public class TestBasicAuthConfigurator {
+  @Test
+  public void testNullCredentials() throws IOException {
+    ConnectionConfigurator conf = new BasicAuthConfigurator(null, null);
+    HttpURLConnection conn = Mockito.mock(HttpURLConnection.class);
+    conf.configure(conn);
+    Mockito.verify(conn, Mockito.never()).setRequestProperty(Mockito.any(), Mockito.any());
+  }
+
+  @Test
+  public void testEmptyCredentials() throws IOException {
+    ConnectionConfigurator conf = new BasicAuthConfigurator(null, "");
+    HttpURLConnection conn = Mockito.mock(HttpURLConnection.class);
+    conf.configure(conn);
+    Mockito.verify(conn, Mockito.never()).setRequestProperty(Mockito.any(), Mockito.any());
+  }
+
+  @Test
+  public void testCredentialsSet() throws IOException {
+    ConnectionConfigurator conf = new BasicAuthConfigurator(null, "user:pass");
+    HttpURLConnection conn = Mockito.mock(HttpURLConnection.class);
+    conf.configure(conn);
+    Mockito.verify(conn, Mockito.times(1)).setRequestProperty(
+        "AUTHORIZATION",
+        "Basic dXNlcjpwYXNz"
+    );
+  }
+
+  @Test
+  public void testParentConfigurator() throws IOException {
+    ConnectionConfigurator parent = Mockito.mock(ConnectionConfigurator.class);
+    ConnectionConfigurator conf = new BasicAuthConfigurator(parent, "user:pass");
+    HttpURLConnection conn = Mockito.mock(HttpURLConnection.class);
+    conf.configure(conn);
+    Mockito.verify(conn, Mockito.times(1)).setRequestProperty(
+        "AUTHORIZATION",
+        "Basic dXNlcjpwYXNz"

Review Comment:
   should we have `"Basic " + Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8))`
   
   as it's not clear from the test how `user:pass` converts to `dXNlcjpwYXNz`
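For reference, the literal asserted in the test can be reproduced with the JDK's Base64 encoder; a minimal standalone sketch (the credentials string is the test's own example):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeaderDemo {
  public static void main(String[] args) {
    // RFC 7617 Basic scheme: base64(user-id ":" password), UTF-8 bytes here
    String credentials = "user:pass";
    String header = "Basic " + Base64.getEncoder()
        .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    System.out.println(header); // Basic dXNlcjpwYXNz
  }
}
```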





> Add support for Kerberos and Basic Auth in webhdfs
> --
>
> Key: HDFS-15765
> URL: https://issues.apache.org/jira/browse/HDFS-15765
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hadoop-client
>Reporter: Pushpendra Singh
>Priority: Minor
>  Labels: pull-request-available
>
> webhdfs HTTP operations such as get (GetOpParam.java) and the other HTTP 
> operations have 'requireAuth' set to false and are expected to work with a 
> delegation token only. However, delegation token authentication is not 
> supported when working with webhdfs over Apache Knox; we should support 
> Kerberos authentication (SPNEGO) or Basic 

[jira] [Commented] (HDFS-16930) Update the wrapper for fuse-dfs

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697159#comment-17697159
 ] 

ASF GitHub Bot commented on HDFS-16930:
---

jojochuang commented on code in PR #5449:
URL: https://github.com/apache/hadoop/pull/5449#discussion_r1127102374


##
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_dfs_wrapper.sh:
##
@@ -31,22 +31,13 @@ if [ "$JAVA_HOME" = "" ]; then
 export  JAVA_HOME=/usr/local/java
 fi
 
-if [ "$LD_LIBRARY_PATH" = "" ]; then
-export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/$OS_ARCH/server:/usr/local/lib
-fi
-
-while IFS= read -r -d '' file
-do
-  export CLASSPATH=$CLASSPATH:$file
-done < <(find "$HADOOP_HOME/hadoop-client" -name "*.jar" -print0)
-
 while IFS= read -r -d '' file
 do
   export CLASSPATH=$CLASSPATH:$file
-done < <(find "$HADOOP_HOME/hadoop-hdfs-project" -name "*.jar" -print0)
+done < <(find "$HADOOP_HOME/hadoop-tools/hadoop-distcp" -name "*.jar" -print0)

Review Comment:
   It's been a while, but I don't think fuse-dfs relies on distcp jars.





> Update the wrapper for fuse-dfs
> ---
>
> Key: HDFS-16930
> URL: https://issues.apache.org/jira/browse/HDFS-16930
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Reporter: Chao-Heng Lee
>Assignee: Chao-Heng Lee
>Priority: Minor
>  Labels: pull-request-available
>
> The fuse_dfs_wrapper.sh script hasn't been updated for quite a long time.
> Although the documentation mentions that the script may not work out of the 
> box, it would be clearer to update any outdated paths.
> For example, from
> {code:java}
> export 
> LIBHDFS_PATH="$HADOOP_HOME/hadoop-hdfs-project/hadoop-hdfs-native-client/target/usr/local/lib"{code}
> to
> {code:java}
> export 
> LIBHDFS_PATH="$HADOOP_HOME/hadoop-hdfs-project/hadoop-hdfs-native-client/target/native/target/usr/local/lib"{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

2023-03-06 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-16934.
---
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

Fixed; ran the new test on 3.3.5 to verify the backport/conflict resolution was OK.

> org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
> -
>
> Key: HDFS-16934
> URL: https://issues.apache.org/jira/browse/HDFS-16934
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsadmin, test
>Affects Versions: 3.4.0, 3.3.5, 3.3.9
>Reporter: Steve Loughran
>Assignee: Shilun Fan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> Jenkins test failure, as the logged output is in the wrong order for the 
> assertions. HDFS-16624 flipped the order; without that, this would have 
> worked.
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
> {code}
> Here the code is asserting about the contents of the output, 
> {code}
> assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
> assertTrue("SUCCESS: Changed property 
> dfs.datanode.peer.stats.enabled".equals(outs.get(2))
> || "SUCCESS: Changed property 
> dfs.datanode.peer.stats.enabled".equals(outs.get(1)));  // here
> assertTrue("\tFrom: \"false\"".equals(outs.get(3)) || "\tFrom: 
> \"false\"".equals(outs.get(2)));
> assertTrue("\tTo: \"true\"".equals(outs.get(4)) || "\tTo: 
> \"true\"".equals(outs.get(3)))
> {code}
> If you look at the log, the actual line appears in that list, just in a 
> different place: a race condition.
> {code}
> 2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - 
> dfsadmin -status -livenodes output:
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring 
> status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 
> and finished at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring 
> status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 
> 2023SUCCESS: Changed property dfs.datanode.peer.stats.enabled
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  From: "false"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  To: "true"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  and finished 
> at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: 
> Changed property dfs.datanode.peer.stats.enabled
> {code}
> we have a race condition in output generation, and the assertions are clearly 
> too brittle.
> For the 3.3.5 release I'm not going to make this a blocker. What I will do is 
> propose that the asserts move to AssertJ with an assertion that the 
> collection "containsExactlyInAnyOrder" all the strings.
> That will
> 1. not be brittle.
> 2. give nice errors on failure
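The semantics of the proposed order-independent assertion can be sketched with plain JDK code. AssertJ's containsExactlyInAnyOrder checks for the same elements with the same multiplicities, ignoring order; the sample strings below are taken from the test output quoted above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReconfigOutputCheck {
  // Order-independent check with the same semantics as AssertJ's
  // containsExactlyInAnyOrder: same elements, same multiplicities, any order.
  static boolean containsExactlyInAnyOrder(List<String> actual, String... expected) {
    List<String> a = new ArrayList<>(actual);
    List<String> e = new ArrayList<>(Arrays.asList(expected));
    Collections.sort(a);
    Collections.sort(e);
    return a.equals(e);
  }

  public static void main(String[] args) {
    // simulated dfsadmin output, interleaved differently by the race
    List<String> outs = Arrays.asList(
        "\tFrom: \"false\"",
        "SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
        "\tTo: \"true\"");
    boolean ok = containsExactlyInAnyOrder(outs,
        "SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
        "\tFrom: \"false\"",
        "\tTo: \"true\"");
    System.out.println(ok ? "match" : "no match");
  }
}
```

Unlike the index-based assertTrue chains, this passes however the logging threads interleave.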



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696994#comment-17696994
 ] 

ASF GitHub Bot commented on HDFS-16934:
---

steveloughran merged PR #5434:
URL: https://github.com/apache/hadoop/pull/5434




> org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
> -
>
> Key: HDFS-16934
> URL: https://issues.apache.org/jira/browse/HDFS-16934
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsadmin, test
>Affects Versions: 3.4.0, 3.3.5, 3.3.9
>Reporter: Steve Loughran
>Assignee: Shilun Fan
>Priority: Minor
>  Labels: pull-request-available
>
> Jenkins test failure, as the logged output is in the wrong order for the 
> assertions. HDFS-16624 flipped the order; without that, this would have 
> worked.
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
> {code}
> Here the code is asserting about the contents of the output, 
> {code}
> assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
> assertTrue("SUCCESS: Changed property 
> dfs.datanode.peer.stats.enabled".equals(outs.get(2))
> || "SUCCESS: Changed property 
> dfs.datanode.peer.stats.enabled".equals(outs.get(1)));  // here
> assertTrue("\tFrom: \"false\"".equals(outs.get(3)) || "\tFrom: 
> \"false\"".equals(outs.get(2)));
> assertTrue("\tTo: \"true\"".equals(outs.get(4)) || "\tTo: 
> \"true\"".equals(outs.get(3)))
> {code}
> If you look at the log, the actual line appears in that list, just in a 
> different place: a race condition.
> {code}
> 2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - 
> dfsadmin -status -livenodes output:
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring 
> status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 
> and finished at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring 
> status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 
> 2023SUCCESS: Changed property dfs.datanode.peer.stats.enabled
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  From: "false"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  To: "true"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  and finished 
> at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  
> tools.TestDFSAdmin 
> (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: 
> Changed property dfs.datanode.peer.stats.enabled
> {code}
> we have a race condition in output generation, and the assertions are clearly 
> too brittle.
> For the 3.3.5 release I'm not going to make this a blocker. What I will do is 
> propose that the asserts move to AssertJ with an assertion that the 
> collection "containsExactlyInAnyOrder" all the strings.
> That will
> 1. not be brittle.
> 2. give nice errors on failure



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16939) Fix the thread safety bug in LowRedundancyBlocks

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696910#comment-17696910
 ] 

ASF GitHub Bot commented on HDFS-16939:
---

Hexiaoqiao commented on PR #5450:
URL: https://github.com/apache/hadoop/pull/5450#issuecomment-1456026324

   @zhangshuyan0 Would you mind checking whether we need to backport this to other active 
branches?




> Fix the thread safety bug in LowRedundancyBlocks
> 
>
> Key: HDFS-16939
> URL: https://issues.apache.org/jira/browse/HDFS-16939
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The remove method in LowRedundancyBlocks is not protected by synchronized. 
> This method is private and is called by BlockManager. As a result, 
> priorityQueues has the risk of being accessed concurrently by multiple 
> threads.
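The described fix amounts to giving the private removal path the same monitor as the public mutators, so every access to priorityQueues is serialized. A minimal illustrative sketch (class and method shapes are made up for the demo, not the actual Hadoop code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the pattern: every method that touches
// priorityQueues must hold the same monitor (the instance lock).
class LowRedundancyBlocksSketch {
  private final List<List<Long>> priorityQueues = new ArrayList<>();

  LowRedundancyBlocksSketch(int levels) {
    for (int i = 0; i < levels; i++) {
      priorityQueues.add(new ArrayList<>());
    }
  }

  synchronized void add(long blockId, int priority) {
    priorityQueues.get(priority).add(blockId);
  }

  // Without 'synchronized', this method (called from another class, as the
  // private remove is called from BlockManager) could race with add();
  // declaring it synchronized restores the invariant.
  synchronized boolean remove(long blockId, int priority) {
    return priorityQueues.get(priority).remove(Long.valueOf(blockId));
  }
}

public class Demo {
  public static void main(String[] args) {
    LowRedundancyBlocksSketch q = new LowRedundancyBlocksSketch(3);
    q.add(42L, 1);
    System.out.println(q.remove(42L, 1)); // true
  }
}
```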



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16939) Fix the thread safety bug in LowRedundancyBlocks

2023-03-06 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HDFS-16939.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fix the thread safety bug in LowRedundancyBlocks
> 
>
> Key: HDFS-16939
> URL: https://issues.apache.org/jira/browse/HDFS-16939
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The remove method in LowRedundancyBlocks is not protected by synchronized. 
> This method is private and is called by BlockManager. As a result, 
> priorityQueues has the risk of being accessed concurrently by multiple 
> threads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16939) Fix the thread safety bug in LowRedundancyBlocks

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696906#comment-17696906
 ] 

ASF GitHub Bot commented on HDFS-16939:
---

Hexiaoqiao commented on PR #5450:
URL: https://github.com/apache/hadoop/pull/5450#issuecomment-145600

   Committed to trunk. @zhangshuyan0, thanks for your contributions. 
@saxenapranav, thanks for your reviews.




> Fix the thread safety bug in LowRedundancyBlocks
> 
>
> Key: HDFS-16939
> URL: https://issues.apache.org/jira/browse/HDFS-16939
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
>
> The remove method in LowRedundancyBlocks is not protected by synchronized. 
> This method is private and is called by BlockManager. As a result, 
> priorityQueues has the risk of being accessed concurrently by multiple 
> threads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16939) Fix the thread safety bug in LowRedundancyBlocks

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696905#comment-17696905
 ] 

ASF GitHub Bot commented on HDFS-16939:
---

Hexiaoqiao merged PR #5450:
URL: https://github.com/apache/hadoop/pull/5450




> Fix the thread safety bug in LowRedundancyBlocks
> 
>
> Key: HDFS-16939
> URL: https://issues.apache.org/jira/browse/HDFS-16939
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
>
> The remove method in LowRedundancyBlocks is not protected by synchronized. 
> This method is private and is called by BlockManager. As a result, 
> priorityQueues has the risk of being accessed concurrently by multiple 
> threads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696893#comment-17696893
 ] 

ASF GitHub Bot commented on HDFS-16934:
---

hadoop-yetus commented on PR #5434:
URL: https://github.com/apache/hadoop/pull/5434#issuecomment-1455978247

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  13m  9s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 39s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 204m 15s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 323m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/7/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5434 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 4c14c0a79a4a 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 482f0f1fc9016b940ffa14f9c530cba39449bfd7 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
 /usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/7/testReport/ |
   | Max. process+thread count | 3590 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/7/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-16867) Exiting Mover due to an exception in MoverMetrics.create

2023-03-06 Thread Hongbing Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696878#comment-17696878
 ] 

Hongbing Wang commented on HDFS-16867:
--

[~Happy-shi] Is this still being followed up?
I had the same problem with the balancer.
{code:java}
2023-03-06 17:40:53,264 ERROR org.apache.hadoop.hdfs.server.balancer.Balancer: 
Exiting balancer due an exception
org.apache.hadoop.metrics2.MetricsException: Metrics source 
Balancer-BP-332003681-10.196.164.22-1648632173322 already exists!
        at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:225)
        at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:198)
        at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
        at 
org.apache.hadoop.hdfs.server.balancer.BalancerMetrics.create(BalancerMetrics.java:55)
        at 
org.apache.hadoop.hdfs.server.balancer.Balancer.(Balancer.java:344)
        at 
org.apache.hadoop.hdfs.server.balancer.Balancer.doBalance(Balancer.java:809)
        at 
org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:847)
        at 
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:952)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at 
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:1102){code}
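The failure mode in both stack traces is a second registration under an already-used metrics source name. A minimal model of that behavior (the registry class and its API here are illustrative, not the actual Hadoop metrics2 API):

```java
import java.util.HashMap;
import java.util.Map;

// Toy registry reproducing the "already exists" behavior: registering the
// same source name twice throws, which is what happens when a second
// Balancer/Mover is constructed for the same block pool.
class MetricsRegistrySketch {
  private final Map<String, Object> sources = new HashMap<>();

  void register(String name, Object source) {
    if (sources.putIfAbsent(name, source) != null) {
      throw new IllegalStateException("Metrics source " + name + " already exists!");
    }
  }
}

public class DuplicateSourceDemo {
  public static void main(String[] args) {
    MetricsRegistrySketch ms = new MetricsRegistrySketch();
    ms.register("Balancer-BP-1", new Object());
    try {
      // second registration for the same (hypothetical) block pool name
      ms.register("Balancer-BP-1", new Object());
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```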

> Exiting Mover due to an exception in MoverMetrics.create
> 
>
> Key: HDFS-16867
> URL: https://issues.apache.org/jira/browse/HDFS-16867
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZhiWei Shi
>Assignee: ZhiWei Shi
>Priority: Major
>  Labels: pull-request-available
>
> After the Mover process has been running for a period of time, it exits 
> unexpectedly and an error is reported in the log
> {code:java}
> [hdfs@${hostname} hadoop-3.3.2-nn]$ nohup bin/hdfs mover -p 
> /test-mover-jira9534 > mover.log.jira9534.20221209.2 &
> [hdfs@{hostname}  hadoop-3.3.2-nn]$ tail -f mover.log.jira9534.20221209.2
> ...
> 22/12/09 14:22:32 INFO balancer.Dispatcher: Start moving 
> blk_1073911285_170466 with size=134217728 from 10.108.182.205:800:DISK to 
> ${ip1}:800:ARCHIVE through ${ip2}:800
> 22/12/09 14:22:32 INFO balancer.Dispatcher: Successfully moved 
> blk_1073911285_170466 with size=134217728 from 10.108.182.205:800:DISK to 
> ${ip1}:800:ARCHIVE through ${ip2}:800
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Stopping Mover metrics 
> system...
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Mover metrics system stopped.
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Mover metrics system shutdown 
> complete.
> Dec 9, 2022, 2:22:42 PM  Mover took 13mins, 19sec
> 22/12/09 14:22:42 ERROR mover.Mover: Exiting Mover due to an exception
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> Mover-${BlockpoolID} already exists!
> at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
> at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
> at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
> at 
> org.apache.hadoop.hdfs.server.mover.MoverMetrics.create(MoverMetrics.java:49)
> at org.apache.hadoop.hdfs.server.mover.Mover.(Mover.java:162)
> at org.apache.hadoop.hdfs.server.mover.Mover.run(Mover.java:684)
> at org.apache.hadoop.hdfs.server.mover.Mover$Cli.run(Mover.java:826)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:81)
> at org.apache.hadoop.hdfs.server.mover.Mover.main(Mover.java:908) 
> {code}
> 1. "final ExitStatus r = m.run()" returns right after scheduling just one of the replicas.
> 2. When "r == ExitStatus.IN_PROGRESS", iter.remove() won't run.
> 3. "new Mover" and "this.metrics = MoverMetrics.create(this)" are executed multiple 
> times for the same nnc, which leads to the error.
> {code:java}
> //Mover.java
>  for (final StorageType t : diff.existing) {
>   for (final MLocation ml : locations) {
> final Source source = storages.getSource(ml);
> if (ml.storageType == t && source != null) {
>   // try to schedule one replica move.
>   if (scheduleMoveReplica(db, source, diff.expected)) { // 1、return only 
> after scheduled one of replica         
>  return true;
>   }
> }
>   }
> }
> while (connectors.size() > 0) {
>   Collections.shuffle(connectors);
>   Iterator iter = connectors.iterator();
>   while (iter.hasNext()) {
> NameNodeConnector nnc = iter.next();
> //3、Execute “new Mover” and “this.metrics = MoverMetrics.create(this)” 
> multiple times for the same nnc,which leads to the error
>      final Mover m = new Mover(nnc, conf, retryCount,  

[jira] [Commented] (HDFS-16867) Exiting Mover due to an exception in MoverMetrics.create

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696877#comment-17696877
 ] 

ASF GitHub Bot commented on HDFS-16867:
---

whbing commented on PR #5203:
URL: https://github.com/apache/hadoop/pull/5203#issuecomment-1455937257

   @Jing9 @Happy-shi Hello, is anyone following up on this issue? I had the same 
problem with the balancer.
   ```java
   2023-03-06 17:40:53,264 ERROR 
org.apache.hadoop.hdfs.server.balancer.Balancer: Exiting balancer due an 
exception
   org.apache.hadoop.metrics2.MetricsException: Metrics source 
Balancer-BP-332003681-10.196.164.22-1648632173322 already exists!
   at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:225)
   at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:198)
   at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
   at 
org.apache.hadoop.hdfs.server.balancer.BalancerMetrics.create(BalancerMetrics.java:55)
   at 
org.apache.hadoop.hdfs.server.balancer.Balancer.(Balancer.java:344)
   at 
org.apache.hadoop.hdfs.server.balancer.Balancer.doBalance(Balancer.java:809)
   at 
org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:847)
   at 
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:952)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
   at 
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:1102)
   ```




> Exiting Mover due to an exception in MoverMetrics.create
> 
>
> Key: HDFS-16867
> URL: https://issues.apache.org/jira/browse/HDFS-16867
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZhiWei Shi
>Assignee: ZhiWei Shi
>Priority: Major
>  Labels: pull-request-available
>
> After the Mover process has been running for a period of time, it exits 
> unexpectedly and an error is reported in the log
> {code:java}
> [hdfs@${hostname} hadoop-3.3.2-nn]$ nohup bin/hdfs mover -p 
> /test-mover-jira9534 > mover.log.jira9534.20221209.2 &
> [hdfs@{hostname}  hadoop-3.3.2-nn]$ tail -f mover.log.jira9534.20221209.2
> ...
> 22/12/09 14:22:32 INFO balancer.Dispatcher: Start moving 
> blk_1073911285_170466 with size=134217728 from 10.108.182.205:800:DISK to 
> ${ip1}:800:ARCHIVE through ${ip2}:800
> 22/12/09 14:22:32 INFO balancer.Dispatcher: Successfully moved 
> blk_1073911285_170466 with size=134217728 from 10.108.182.205:800:DISK to 
> ${ip1}:800:ARCHIVE through ${ip2}:800
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Stopping Mover metrics 
> system...
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Mover metrics system stopped.
> 22/12/09 14:22:42 INFO impl.MetricsSystemImpl: Mover metrics system shutdown 
> complete.
> Dec 9, 2022, 2:22:42 PM  Mover took 13mins, 19sec
> 22/12/09 14:22:42 ERROR mover.Mover: Exiting Mover due to an exception
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> Mover-${BlockpoolID} already exists!
> at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
> at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
> at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
> at 
> org.apache.hadoop.hdfs.server.mover.MoverMetrics.create(MoverMetrics.java:49)
> at org.apache.hadoop.hdfs.server.mover.Mover.<init>(Mover.java:162)
> at org.apache.hadoop.hdfs.server.mover.Mover.run(Mover.java:684)
> at org.apache.hadoop.hdfs.server.mover.Mover$Cli.run(Mover.java:826)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:81)
> at org.apache.hadoop.hdfs.server.mover.Mover.main(Mover.java:908) 
> {code}
> 1. “final ExitStatus r = m.run()” returns right after scheduling just one 
> replica move.
> 2. When “r == ExitStatus.IN_PROGRESS”, iter.remove() is not executed.
> 3. As a result, “new Mover” and “this.metrics = MoverMetrics.create(this)” are 
> executed multiple times for the same nnc, which leads to the error.
> {code:java}
> //Mover.java
>  for (final StorageType t : diff.existing) {
>   for (final MLocation ml : locations) {
> final Source source = storages.getSource(ml);
> if (ml.storageType == t && source != null) {
>   // try to schedule one replica move.
>   if (scheduleMoveReplica(db, source, diff.expected)) { // 1. returns right
> after scheduling one replica move
>  return true;
>   }
> }
>   }
> }
> while (connectors.size() > 0) {
>   Collections.shuffle(connectors);
>   Iterator iter = connectors.iterator();
>   while (iter.hasNext()) {
> NameNodeConnector nnc = iter.next();
> // 3. Executes “new Mover” again for the same nnc
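The failure mode quoted above, in both the Balancer and Mover reports, is a second registration of an already-registered metrics source name. One way to make such registration idempotent is to key sources by name and return the existing one on repeat calls. A minimal, hypothetical sketch: `registerOnce` and the map-based registry are illustrative only, not Hadoop's actual MetricsSystem API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetricsGuardSketch {
  private static final Map<String, Object> SOURCES = new ConcurrentHashMap<>();

  // Register a metrics source only once per name; later calls with the same
  // name return the already-registered source instead of failing with
  // "Metrics source ... already exists!".
  static Object registerOnce(String name, Object source) {
    return SOURCES.computeIfAbsent(name, n -> source);
  }

  public static void main(String[] args) {
    Object first = registerOnce("Mover-BP-example", new Object());
    Object second = registerOnce("Mover-BP-example", new Object());
    System.out.println(first == second); // prints "true": second call was a no-op
  }
}
```

The actual fix could instead ensure create() runs once per NameNodeConnector; the sketch only shows the idempotency idea.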

[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696844#comment-17696844
 ] 

ASF GitHub Bot commented on HDFS-16853:
---

steveloughran commented on PR #5366:
URL: https://github.com/apache/hadoop/pull/5366#issuecomment-1455856950

   Owen actually fixed this properly.




> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> because HADOOP-18324
> ---
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.5
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.5
>
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> with the error message "Waiting for cluster to become active". The blocking 
> jstack is as below:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to 
> localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x
> 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007430a9ec0> (a 
> java.util.concurrent.SynchronousQueue$TransferQueue)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
>         at 
> java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
>         at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1482)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>         at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient
> SideTranslatorPB.java:168)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
>         at java.lang.Thread.run(Thread.java:748)  {code}
> After looking into the code, I found that this bug was introduced by 
> HADOOP-18324: RpcRequestSender exited without cleaning up the 
> rpcRequestQueue, which left BPServiceActor blocked while sending a request.
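The hang in the jstack above is characteristic of `SynchronousQueue.put()`: it parks the producer until a consumer takes the item, so if the consumer thread (RpcRequestSender) has exited, the producer waits forever. An illustrative sketch of the hazard, not the Hadoop fix: a timed `offer()` returns instead of hanging when no consumer is present.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

public class QueueHangSketch {
  public static void main(String[] args) throws InterruptedException {
    SynchronousQueue<String> queue = new SynchronousQueue<>();
    // No consumer thread is running, so put() would park this caller
    // indefinitely -- exactly the WAITING (parking) state in the jstack.
    // offer() with a timeout gives up and returns false instead.
    boolean handedOff = queue.offer("rpcRequest", 100, TimeUnit.MILLISECONDS);
    System.out.println(handedOff); // prints "false": no consumer took the item
  }
}
```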



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696845#comment-17696845
 ] 

ASF GitHub Bot commented on HDFS-16853:
---

steveloughran closed pull request #5366: HDFS-16853. IPC shutdown failures.
URL: https://github.com/apache/hadoop/pull/5366




> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> because HADOOP-18324
> ---
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.5
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.5
>
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> with the error message "Waiting for cluster to become active". The blocking 
> jstack is as below:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to 
> localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x
> 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007430a9ec0> (a 
> java.util.concurrent.SynchronousQueue$TransferQueue)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
>         at 
> java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
>         at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1482)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>         at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient
> SideTranslatorPB.java:168)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
>         at java.lang.Thread.run(Thread.java:748)  {code}
> After looking into the code, I found that this bug was introduced by 
> HADOOP-18324: RpcRequestSender exited without cleaning up the 
> rpcRequestQueue, which left BPServiceActor blocked while sending a request.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16939) Fix the thread safety bug in LowRedundancyBlocks

2023-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696792#comment-17696792
 ] 

ASF GitHub Bot commented on HDFS-16939:
---

saxenapranav commented on code in PR #5450:
URL: https://github.com/apache/hadoop/pull/5450#discussion_r1126115742


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java:
##
@@ -369,7 +369,7 @@ synchronized boolean remove(BlockInfo block,
* @return true if the block was found and removed from one of the priority
* queues
*/
-  boolean remove(BlockInfo block, int priLevel) {
+  synchronized boolean remove(BlockInfo block, int priLevel) {

Review Comment:
   Agreeing with both @Hexiaoqiao and @zhangshuyan0.
   Thanks @zhangshuyan0 for taking the suggestion.





> Fix the thread safety bug in LowRedundancyBlocks
> 
>
> Key: HDFS-16939
> URL: https://issues.apache.org/jira/browse/HDFS-16939
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
>
> The remove(BlockInfo, int) method in LowRedundancyBlocks is not protected by 
> synchronized. This method is package-private and is called by BlockManager. 
> As a result, priorityQueues risks being accessed concurrently by multiple 
> threads.
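The diff above adds `synchronized` so that this remove path holds the same intrinsic lock as the other accessors. A minimal sketch of the pattern, with simplified types standing in for BlockInfo and the HDFS queue internals:

```java
import java.util.ArrayList;
import java.util.List;

public class QueueSketch {
  // Stand-in for priorityQueues: one list of block ids per priority level.
  private final List<List<Integer>> priorityQueues = new ArrayList<>();

  public QueueSketch(int levels) {
    for (int i = 0; i < levels; i++) {
      priorityQueues.add(new ArrayList<>());
    }
  }

  synchronized void add(int block, int priLevel) {
    priorityQueues.get(priLevel).add(block);
  }

  // Without `synchronized`, this method could run concurrently with add()
  // and corrupt the unsynchronized ArrayLists; marking it synchronized makes
  // it mutually exclusive with every other synchronized accessor of this
  // object, which is the fix the patch applies.
  synchronized boolean remove(int block, int priLevel) {
    return priorityQueues.get(priLevel).remove(Integer.valueOf(block));
  }
}
```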



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org