[jira] [Comment Edited] (HDFS-16029) Divide by zero bug in InstrumentationService.java

2023-06-08 Thread Anton Kutuzov (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730831#comment-17730831
 ] 

Anton Kutuzov edited comment on HDFS-16029 at 6/9/23 6:45 AM:
--

More precisely, we get an ArrayIndexOutOfBoundsException, because 
before the division 'last' is used as an array index:
{code:java}
.
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last];
. {code}
So this code will throw ArrayIndexOutOfBoundsException:
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getValues(); {code}
and this
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getJSON(); 

{code}
I propose adding a check for -1 in getValues, so that in this case the method 
returns an array of zeros.
{code:java}
long[] values = new long[4];
if (last < 0) {
  return values;
}
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last]; 

{code}
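The proposed guard can be sketched end to end as follows. This is a simplified stand-in, not the real InstrumentationService.Timer: the class name GuardedTimer and the add helper are assumptions made for illustration, and locking is omitted. It shows that returning early when last is -1 avoids both the array access and the later division by limit.

```java
/** Simplified stand-in for InstrumentationService.Timer; not the real Hadoop class. */
class GuardedTimer {
  static final int LAST_TOTAL = 0;
  static final int LAST_OWN = 1;
  static final int AVG_TOTAL = 2;
  static final int AVG_OWN = 3;

  private final long[] own;
  private final long[] total;
  private final int size;
  private int last = -1;   // -1 means no sample has been recorded yet
  private boolean full;

  GuardedTimer(int size) {
    this.size = size;
    this.own = new long[size];
    this.total = new long[size];
  }

  /** Hypothetical helper that records one sample in the ring buffer. */
  void add(long ownTime, long totalTime) {
    last = (last + 1) % size;
    full = full || last == size - 1;
    own[last] = ownTime;
    total[last] = totalTime;
  }

  long[] getValues() {
    long[] values = new long[4];
    if (last < 0) {
      return values;       // nothing recorded: all zeros, no indexing, no division
    }
    values[LAST_TOTAL] = total[last];
    values[LAST_OWN] = own[last];
    int limit = full ? size : (last + 1);   // at least 1 here because last >= 0
    for (int i = 0; i < limit; i++) {
      values[AVG_TOTAL] += total[i];
      values[AVG_OWN] += own[i];
    }
    values[AVG_TOTAL] /= limit;
    values[AVG_OWN] /= limit;
    return values;
  }
}
```

With this guard, calling getValues() on a freshly constructed timer yields four zeros instead of throwing.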
 



> Divide by zero bug in InstrumentationService.java
> -
>
> Key: HDFS-16029
> URL: https://issues.apache.org/jira/browse/HDFS-16029
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: libhdfs
> Reporter: Yiyuan GUO
> Priority: Major
> Labels: easy-fix, security
>
> In the file _lib/service/instrumentation/InstrumentationService.java,_ the 
> method 
>  _Timer.getValues_ has the following 
> [code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/service/instrumentation/InstrumentationService.java#L236]:
> {code:java}
> long[] getValues() {
> ..
> int limit = (full) ? size : (last + 1);
> ..
> values[AVG_TOTAL] = values[AVG_TOTAL] / limit;
> }
> {code}
> The variable _limit_ is used as a divisor. However, its value may be equal to 
> _last + 1,_ which can be zero since _last_ is initialized to -1 in the 
> [constructor|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/service/instrumentation/InstrumentationService.java#L222]:
> {code:java}
> public Timer(int size) {
> ...
> last = -1;
> }
> {code}
> Thus, a divide by zero problem can happen.
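The two failure modes discussed in this issue can be reproduced in isolation with a small sketch. The class and method names below are illustrative only and are not part of Hadoop; the snippet mirrors the state of a freshly constructed timer, where last is -1 and therefore limit is 0.

```java
/** Minimal reproduction of the two failure modes; names are illustrative, not Hadoop code. */
class EmptyTimerFailures {
  /** Mirrors the index access: reading total[last] while last is still -1. */
  static String indexWithInitialLast() {
    long[] total = new long[2];
    int last = -1;
    try {
      long unused = total[last];
      return "no exception";
    } catch (ArrayIndexOutOfBoundsException e) {
      return "ArrayIndexOutOfBoundsException";
    }
  }

  /** Mirrors the division: limit = last + 1 is 0 when nothing was recorded. */
  static String divideByInitialLimit() {
    long sum = 0;
    int last = -1;
    int limit = last + 1;
    try {
      long avg = sum / limit;   // integer division by zero throws in Java
      return "no exception";
    } catch (ArithmeticException e) {
      return "ArithmeticException";
    }
  }
}
```

Both paths throw, so whichever statement runs first in the real method determines which exception a caller sees.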
>   



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-16029) Divide by zero bug in InstrumentationService.java

2023-06-08 Thread Anton Kutuzov (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730831#comment-17730831
 ] 

Anton Kutuzov edited comment on HDFS-16029 at 6/9/23 6:34 AM:
--

More precisely, we get an ArrayIndexOutOfBoundsException, because 
before the division 'last' is used as an array index:
{code:java}
.
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last];
. {code}
So this code will throw ArrayIndexOutOfBoundsException:
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getValues(); {code}
and this
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getJSON(); 

{code}
I propose adding a check for -1 in getValues:
{code:java}
long[] values = new long[4];
if (last < 0) {
  return values;
}
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last]; {code}









[jira] [Commented] (HDFS-16029) Divide by zero bug in InstrumentationService.java

2023-06-08 Thread Anton Kutuzov (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730831#comment-17730831
 ] 

Anton Kutuzov commented on HDFS-16029:
--

More precisely, we get an ArrayIndexOutOfBoundsException, because 
before the division 'last' is used as an array index:
{code:java}
.
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last];
. {code}
So this code will throw ArrayIndexOutOfBoundsException:
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getValues(); {code}
and this
{code:java}
InstrumentationService.Timer timer = new InstrumentationService.Timer(2);
long[] values = timer.getJSON(); 

{code}
I propose adding a check for -1 in getValues:
{code:java}
long[] values = new long[4];
if (last < 0) {
  return values;
}
values[LAST_TOTAL] = total[last];
values[LAST_OWN] = own[last]; {code}







[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730792#comment-17730792
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

ayushtkn commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1223781601


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -259,6 +272,39 @@ private static String[] getUserGroupForTesting() {
     return groupsForTesting;
   }
 
+  @Test
+  public void testGetTopTokenRealOwners() throws Exception {
+    // Create conf and start routers with only an RPC service
+    Configuration conf = initSecurity();
+
+    Configuration routerConf = new RouterConfigBuilder()
+        .metrics()
+        .rpc()
+        .build();
+    conf.addResource(routerConf);
+
+    Router router = initializeAndStartRouter(conf);
+
+    // Create credentials
+    UserGroupInformation ugi = UserGroupInformation.createUserForTesting("router", getUserGroupForTesting());
+    Credentials creds = RouterSecurityManager.createCredentials(router, ugi, "some_renewer");

Review Comment:
   Assigning the result to `creds` may not be required.





> RBF: top real owners metrics can't been parsed json string
> --
>
> Key: HDFS-16946
> URL: https://issues.apache.org/jira/browse/HDFS-16946
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.4.0
> Reporter: Max Xie
> Assignee: Nishtha Shah
> Priority: Minor
> Labels: pull-request-available
> Attachments: image-2023-03-09-22-24-39-833.png
>
>
> HDFS-15447 added top real owners metrics for delegation tokens, but the 
> metrics cannot be parsed as a JSON string.
> The RBFMetrics$getTopTokenRealOwners method just returns 
> `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1`
> !image-2023-03-09-22-24-39-833.png!
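The symptom above, a value of the form ClassName@hash, is what Java's default Object.toString() produces for a class that does not override it. The sketch below illustrates one way to avoid that by serializing each field explicitly. OwnerCount and TopOwnersJson are hypothetical stand-ins, not the Hadoop NameValuePair or RBFMetrics classes, and the hand-written JSON builder is only an assumption chosen to keep the example dependency-free.

```java
import java.util.List;

/** Stand-in for a name/value pair with no toString() override; not the Hadoop class. */
class OwnerCount {
  private final String name;
  private final long value;

  OwnerCount(String name, long value) {
    this.name = name;
    this.value = value;
  }

  String getName() { return name; }
  long getValue() { return value; }
}

class TopOwnersJson {
  /** Serializes each pair field by field instead of relying on Object.toString(). */
  static String toJson(List<OwnerCount> pairs) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < pairs.size(); i++) {
      OwnerCount p = pairs.get(i);
      if (i > 0) {
        sb.append(',');
      }
      sb.append("{\"name\":\"").append(escape(p.getName()))
        .append("\",\"value\":").append(p.getValue()).append('}');
    }
    return sb.append(']').toString();
  }

  /** Escapes only backslash and double quote; enough for this sketch. */
  static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

In real code a JSON library would normally handle escaping and structure; the point here is only that the fields must be written out explicitly.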






[jira] [Commented] (HDFS-17030) Limit wait time for getHAServiceState in ObserverReaderProxy

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730763#comment-17730763
 ] 

ASF GitHub Bot commented on HDFS-17030:
---

hadoop-yetus commented on PR #5700:
URL: https://github.com/apache/hadoop/pull/5700#issuecomment-1583560536

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  19m 32s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  22m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 48s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m 36s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 12s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   6m  1s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 48s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 47s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 30s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   5m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 59s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 59s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 48s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 16s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 223m 24s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/19/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 371m 22s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/19/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5700 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 16a5ef6f944a 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 54867487cb36504fec1f2885e6ea6233d9556610 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1

[jira] [Commented] (HDFS-17030) Limit wait time for getHAServiceState in ObserverReaderProxy

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730762#comment-17730762
 ] 

ASF GitHub Bot commented on HDFS-17030:
---

hadoop-yetus commented on PR #5700:
URL: https://github.com/apache/hadoop/pull/5700#issuecomment-1583542874

   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  19m 34s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 16s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m  6s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 16s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 51s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 15s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 37s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 19s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m  0s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m  0s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 52s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   4m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 56s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  1s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 14s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 22s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 203m 35s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 339m 44s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/20/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5700 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 1878df179392 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cf48c697a219614460e7fbca240ff63e9bd7e111 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/20/testReport/ |
   | Max. process+thread count | 3219 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
   |

[jira] [Updated] (HDFS-17024) Potential data race introduced by HDFS-15865

2023-06-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-17024:
---
Target Version/s: 3.3.9  (was: 3.3.6)

> Potential data race introduced by HDFS-15865
> 
>
> Key: HDFS-17024
> URL: https://issues.apache.org/jira/browse/HDFS-17024
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: dfsclient
> Affects Versions: 3.3.1
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Labels: pull-request-available
>
> After HDFS-15865, we found client aborted due to an NPE.
> {noformat}
> 2023-04-10 16:07:43,409 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: * ABORTING region server kqhdp36,16020,1678077077562: Replay of WAL required. Forcing server shutdown *
> org.apache.hadoop.hbase.DroppedSnapshotException: region: WAFER_ALL,16|CM RIE.MA1|CP1114561.18|PROC|,1625899466315.0fbdf0f1810efa9e68af831247e6555f.
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2870)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2539)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2511)
>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2401)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:613)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:582)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:69)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:362)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:880)
>     at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:781)
>     at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:898)
>     at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:850)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:76)
>     at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
>     at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishClose(HFileWriterImpl.java:859)
>     at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:687)
>     at org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:393)
>     at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
>     at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:78)
>     at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1047)
>     at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2349)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2806)
> {noformat}
> This is only possible if a data race happened. Filing this Jira to improve the 
> data and eliminate the data race.
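As a general illustration of how a data race can surface as a NullPointerException, the sketch below shows a check-then-use pattern on a shared field. It is not the DataStreamer code and makes no claim about the actual cause in this issue; SharedHolder and the Runnable hook are assumptions, with the hook standing in deterministically for another thread that clears the field between the null check and the use.

```java
/** Illustrative only: a shared field that another thread may set to null. */
class SharedHolder {
  private volatile StringBuilder buffer = new StringBuilder("data");

  /** In a real program this would be called from a different thread. */
  void close() {
    buffer = null;
  }

  /** Racy: the field is read twice, so it can become null between check and use. */
  int racyLength(Runnable betweenCheckAndUse) {
    if (buffer != null) {
      betweenCheckAndUse.run();   // stands in for an unlucky interleaving
      return buffer.length();     // may throw NullPointerException
    }
    return -1;
  }

  /** Safer: read the field once into a local and only use the local. */
  int snapshotLength(Runnable betweenCheckAndUse) {
    StringBuilder local = buffer;
    if (local != null) {
      betweenCheckAndUse.run();
      return local.length();      // the local reference cannot be cleared by close()
    }
    return -1;
  }
}
```

Taking a local snapshot removes the second read of the field; whether that or proper synchronization is the right remedy depends on the invariants the real code needs.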






[jira] [Updated] (HDFS-16839) It should consider EC reconstruction work when we determine if a node is busy

2023-06-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-16839:
---
Component/s: ec
 erasure-coding

> It should consider EC reconstruction work when we determine if a node is busy
> -
>
> Key: HDFS-16839
> URL: https://issues.apache.org/jira/browse/HDFS-16839
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ec, erasure-coding
> Reporter: Kidd5368
> Assignee: Kidd5368
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> In chooseSourceDatanodes(), I think it is more reasonable to take EC 
> reconstruction work into consideration when determining whether a node is 
> busy.
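The idea can be sketched with a toy model. Everything here is hypothetical: NodeLoad, its fields, and the maxStreams threshold are assumptions for illustration and do not reflect the actual chooseSourceDatanodes() implementation or its configuration. The sketch only contrasts a busy check that counts replication work alone with one that also counts pending EC reconstruction work.

```java
/** Illustrative model of a datanode's pending work; field names are assumptions. */
class NodeLoad {
  final int replicationStreams;      // blocks queued for replication
  final int ecReconstructionTasks;   // blocks queued for EC reconstruction

  NodeLoad(int replicationStreams, int ecReconstructionTasks) {
    this.replicationStreams = replicationStreams;
    this.ecReconstructionTasks = ecReconstructionTasks;
  }

  /** Busy check that looks at replication work only. */
  static boolean isBusyReplicationOnly(NodeLoad n, int maxStreams) {
    return n.replicationStreams >= maxStreams;
  }

  /** Busy check that also counts EC reconstruction work. */
  static boolean isBusy(NodeLoad n, int maxStreams) {
    return n.replicationStreams + n.ecReconstructionTasks >= maxStreams;
  }
}
```

For example, a node with 1 replication stream and 3 EC reconstruction tasks is not busy under the first check with a threshold of 2, but is busy under the second.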






[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730601#comment-17730601
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

hadoop-yetus commented on PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#issuecomment-1582766074

   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   1m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 16s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5696/11/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 50s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  21m 43s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 35s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 130m  9s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5696/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5696 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 699598558ce7 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a21b79bb95d94b0090fa763f442676f5d5ffdd6b |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5696/11/testReport/ |
   | Max. process+thread count | 2426 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/ha

[jira] [Commented] (HDFS-17030) Limit wait time for getHAServiceState in ObserverReaderProxy

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730577#comment-17730577
 ] 

ASF GitHub Bot commented on HDFS-17030:
---

hadoop-yetus commented on PR #5700:
URL: https://github.com/apache/hadoop/pull/5700#issuecomment-1582602740

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  0s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  17m 37s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  22m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 57s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m 40s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  8s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   6m  1s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 35s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 43s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 29s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   5m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 56s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 18s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 226m 31s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/18/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 372m 45s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/18/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5700 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 82cae10efc75 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 
19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d7364efcb81dd4283b596640a393016fa290ccc4 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | 

[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730555#comment-17730555
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

NishthaShah commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1222939776


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -81,6 +87,17 @@ public static void createMockSecretManager() throws 
IOException {
   @Rule
   public ExpectedException exceptionRule = ExpectedException.none();
 
+  private Router initializeAndStartRouter(Configuration configuration) {
+    Router router = new Router();
+    try {
+      router.init(configuration);
+      router.start();
+    } catch (MetricsException e) {
+      //do nothing

Review Comment:
   @ayushtkn Are you running via the command line and still seeing the failure?
   The full test passes consistently for me from the command line; via IDE 
debug/run it passes only intermittently (3 of 5 runs).
   
   Edit: The latest commit eliminates the flaky behaviour.





> RBF: top real owners metrics can't been parsed json string
> --
>
> Key: HDFS-16946
> URL: https://issues.apache.org/jira/browse/HDFS-16946
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Nishtha Shah
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-03-09-22-24-39-833.png
>
>
> HDFS-15447 added top real owners metrics for delegation tokens, but the 
> metrics cannot be parsed as a JSON string.
> The RBFMetrics$getTopTokenRealOwners method just returns 
> `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1`
> !image-2023-03-09-22-24-39-833.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730526#comment-17730526
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

NishthaShah commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1222939776


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -81,6 +87,17 @@ public static void createMockSecretManager() throws 
IOException {
   @Rule
   public ExpectedException exceptionRule = ExpectedException.none();
 
+  private Router initializeAndStartRouter(Configuration configuration) {
+    Router router = new Router();
+    try {
+      router.init(configuration);
+      router.start();
+    } catch (MetricsException e) {
+      //do nothing

Review Comment:
   @ayushtkn Are you running via the command line and still seeing the failure?
   The full test passes consistently for me from the command line; via IDE 
debug/run it passes only intermittently (3 of 5 runs).





> RBF: top real owners metrics can't been parsed json string
> --
>
> Key: HDFS-16946
> URL: https://issues.apache.org/jira/browse/HDFS-16946
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Nishtha Shah
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-03-09-22-24-39-833.png
>
>
> HDFS-15447 added top real owners metrics for delegation tokens, but the 
> metrics cannot be parsed as a JSON string.
> The RBFMetrics$getTopTokenRealOwners method just returns 
> `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1`
> !image-2023-03-09-22-24-39-833.png!






[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730525#comment-17730525
 ] 

ASF GitHub Bot commented on HDFS-16898:
---

Hexiaoqiao commented on PR #5408:
URL: https://github.com/apache/hadoop/pull/5408#issuecomment-1582450852

   @hfutatzhanghb Please check whether the failed unit tests are related to this change.




> Remove write lock for processCommandFromActor of DataNode to reduce impact on 
> heartbeat
> ---
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Currently, the processCommandFromActor method contains the following code:
>  
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
>     return processCommandFromActive(cmd, actor);
>   } else {
>     return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not 
> released until it finishes.
>  
> This may block the updateActorStatesFromHeartbeat call in offerService. 
> Furthermore, it can make the datanode's lastcontact value very high, and the 
> datanode is even marked dead once lastcontact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
> this, resp.getNameNodeHaState());{code}
> Here we can make the write lock more fine-grained in the 
> processCommandFromActor method to address this problem.
>  
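One way to read "fine-grained" here is to hold the lock only long enough to decide which path applies, and then run the potentially slow command handling with no lock held. The sketch below is a standalone illustration under that assumption, not the HDFS patch: the class `ActorCommandDispatcher`, the string-valued actors, and the `Function` handlers are hypothetical stand-ins for the offer service, its actors, and processCommandFromActive/processCommandFromStandby.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical stand-in for the offer-service class; not HDFS code.
class ActorCommandDispatcher {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Function<String, Boolean> activeHandler;
  private final Function<String, Boolean> standbyHandler;
  private String activeActor; // guarded by lock

  ActorCommandDispatcher(String activeActor,
      Function<String, Boolean> activeHandler,
      Function<String, Boolean> standbyHandler) {
    this.activeActor = activeActor;
    this.activeHandler = activeHandler;
    this.standbyHandler = standbyHandler;
  }

  // Heartbeat-style state update: takes the write lock only briefly.
  void updateActive(String actor) {
    lock.writeLock().lock();
    try {
      activeActor = actor;
    } finally {
      lock.writeLock().unlock();
    }
  }

  boolean processCommandFromActor(String cmd, String actor) {
    boolean fromActive;
    // Hold the lock only while reading the shared state.
    lock.readLock().lock();
    try {
      fromActive = actor.equals(activeActor);
    } finally {
      lock.readLock().unlock();
    }
    // The potentially slow handler runs with no lock held.
    return fromActive ? activeHandler.apply(cmd) : standbyHandler.apply(cmd);
  }

  // True if any thread currently holds the read or write lock.
  boolean isLocked() {
    return lock.isWriteLocked() || lock.getReadLockCount() > 0;
  }
}
```

The trade-off in this sketch is that the active actor can change between the check and the handler call, so a real change would need to confirm that the handlers tolerate a stale decision.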






[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730523#comment-17730523
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

NishthaShah commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1222881620


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -81,6 +87,17 @@ public static void createMockSecretManager() throws 
IOException {
   @Rule
   public ExpectedException exceptionRule = ExpectedException.none();
 
+  private Router initializeAndStartRouter(Configuration configuration) {
+    Router router = new Router();
+    try {
+      router.init(configuration);
+      router.start();
+    } catch (MetricsException e) {
+      //do nothing

Review Comment:
   Sure, I can explore DefaultMetricsSystem.setMiniClusterMode(true);
   For me, when running via the command line, testNotRunningSecretManager 
never failed. But yes, now when I run/debug via the IDE configurations, it fails.
   Edit: Via the IDE it fails only sometimes, not every time.





> RBF: top real owners metrics can't been parsed json string
> --
>
> Key: HDFS-16946
> URL: https://issues.apache.org/jira/browse/HDFS-16946
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Nishtha Shah
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-03-09-22-24-39-833.png
>
>
> HDFS-15447 added top real owners metrics for delegation tokens, but the 
> metrics cannot be parsed as a JSON string.
> The RBFMetrics$getTopTokenRealOwners method just returns 
> `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1`
> !image-2023-03-09-22-24-39-833.png!






[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730504#comment-17730504
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

NishthaShah commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1222881620


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -81,6 +87,17 @@ public static void createMockSecretManager() throws 
IOException {
   @Rule
   public ExpectedException exceptionRule = ExpectedException.none();
 
+  private Router initializeAndStartRouter(Configuration configuration) {
+    Router router = new Router();
+    try {
+      router.init(configuration);
+      router.start();
+    } catch (MetricsException e) {
+      //do nothing

Review Comment:
   Sure, I can explore DefaultMetricsSystem.setMiniClusterMode(true);
   For me, when running via the command line, testNotRunningSecretManager 
never failed. But yes, now when I run/debug via the IDE configurations, it fails.





> RBF: top real owners metrics can't been parsed json string
> --
>
> Key: HDFS-16946
> URL: https://issues.apache.org/jira/browse/HDFS-16946
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Nishtha Shah
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-03-09-22-24-39-833.png
>
>
> HDFS-15447 added top real owners metrics for delegation tokens, but the 
> metrics cannot be parsed as a JSON string.
> The RBFMetrics$getTopTokenRealOwners method just returns 
> `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1`
> !image-2023-03-09-22-24-39-833.png!






[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Fix Version/s: 3.3.6

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.
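
To make the two block IDs in the logs easier to compare, the sketch below decodes them into a block group ID and an index within the group. It assumes the usual HDFS layout for striped blocks (negative IDs, with the low four bits holding the internal block index); the class `StripedBlockIds` and its method names are illustrative, not HDFS APIs, and the sketch says nothing about the root cause of the bug.

```java
// Illustrative helper; the constant mirrors the assumed striped-block ID
// layout (low 4 bits = index of the internal block within its group).
class StripedBlockIds {
  static final long BLOCK_GROUP_INDEX_MASK = 15L;

  // Striped (erasure-coded) block IDs are assumed to be negative.
  static boolean isStriped(long blockId) {
    return blockId < 0;
  }

  // Clears the index bits to recover the ID of the enclosing block group.
  static long blockGroupId(long blockId) {
    return blockId & ~BLOCK_GROUP_INDEX_MASK;
  }

  // Extracts the position of the internal block inside its group.
  static int indexInGroup(long blockId) {
    return (int) (blockId & BLOCK_GROUP_INDEX_MASK);
  }

  public static void main(String[] args) {
    long[] ids = {-9223372036848404320L, -9223372036848404319L};
    for (long id : ids) {
      System.out.println(id + " -> group " + blockGroupId(id)
          + ", index " + indexInGroup(id));
    }
    // -9223372036848404320 -> group -9223372036848404320, index 0
    // -9223372036848404319 -> group -9223372036848404320, index 1
  }
}
```

Under these assumptions, both IDs in the logs belong to the same block group: datanode1 reported internal block 0, while the replica it was later asked to delete is internal block 1.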






[jira] [Commented] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730500#comment-17730500
 ] 

Xiaoqiao He commented on HDFS-17003:


Cherry-picked to branch-3.3. 

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.






[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Component/s: namenode

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.






[jira] [Resolved] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HDFS-17003.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.






[jira] [Commented] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730488#comment-17730488
 ] 

ASF GitHub Bot commented on HDFS-17003:
---

Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1582282121

   Committed to trunk. Thanks @hfutatzhanghb for your contributions, and 
@zhangshuyan0, @sodonnel for the reviews!




> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.






[jira] [Commented] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730486#comment-17730486
 ] 

ASF GitHub Bot commented on HDFS-17003:
---

Hexiaoqiao merged PR #5643:
URL: https://github.com/apache/hadoop/pull/5643




> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.






[jira] [Commented] (HDFS-16946) RBF: top real owners metrics can't been parsed json string

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730485#comment-17730485
 ] 

ASF GitHub Bot commented on HDFS-16946:
---

ayushtkn commented on code in PR #5696:
URL: https://github.com/apache/hadoop/pull/5696#discussion_r1222765730


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/security/TestRouterSecurityManager.java:
##
@@ -81,6 +87,17 @@ public static void createMockSecretManager() throws 
IOException {
   @Rule
   public ExpectedException exceptionRule = ExpectedException.none();
 
+  private Router initializeAndStartRouter(Configuration configuration) {
+    Router router = new Router();
+    try {
+      router.init(configuration);
+      router.start();
+    } catch (MetricsException e) {
+      //do nothing

Review Comment:
   I don't think we need this try-catch logic at all; instead, put 
   ```
   DefaultMetricsSystem.setMiniClusterMode(true);
   ```
   in the ``BeforeClass``
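   
   For context on why a single setting could replace the try-catch: presumably the MetricsException comes from registering a metrics source under a name that already exists, and mini-cluster mode relaxes that check by handing out a unique name instead. The following is a self-contained toy model of that assumed behaviour, not Hadoop code; `ToyMetricsRegistry` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; ToyMetricsRegistry is hypothetical and not part of Hadoop.
class ToyMetricsRegistry {
  private final Map<String, Integer> seen = new HashMap<>();
  private boolean miniClusterMode;

  void setMiniClusterMode(boolean on) {
    miniClusterMode = on;
  }

  // Registers a source and returns the name it was actually stored under.
  String register(String name) {
    Integer count = seen.get(name);
    if (count == null) {
      seen.put(name, 1);
      return name;
    }
    if (!miniClusterMode) {
      // Strict mode: a duplicate name is treated as an error.
      throw new IllegalStateException(
          "Metrics source " + name + " already exists!");
    }
    // Relaxed mode: derive a unique name rather than failing.
    seen.put(name, count + 1);
    return name + "-" + count;
  }
}
```

   In this model, starting a second router in the same JVM fails in strict mode and succeeds under a suffixed name once the flag is set, which is why setting it once before the tests run would make a per-test catch block unnecessary.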
   
   And can you run the test locally as well? For me, if 
``testNotRunningSecretManager`` runs after your test, it fails:
   
   ```
   java.lang.AssertionError: Expecting 
org.apache.hadoop.service.ServiceStateException with text Failed to create 
SecretManager but got :  Expected to find 'Failed to create SecretManager' but 
got unexpected exception: org.apache.hadoop.service.ServiceStateException: 
org.apache.hadoop.security.KerberosAuthException: failure to login: for 
principal: router/localh...@example.com from keytab 
/Users/ayushsaxena/code/hadoop-code/hadoop/hadoop-hdfs-project/hadoop-hdfs-rbf/target/test/data/SecurityConfUtil/test.keytab
 javax.security.auth.login.LoginException: Integrity check on decrypted field 
failed (31) - PREAUTH_FAILED
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:174)
at 
org.apache.hadoop.hdfs.server.federation.security.TestRouterSecurityManager.lambda$testNotRunningSecretManager$1(TestRouterSecurityManager.java:327)
at 
org.apache.hadoop.test.LambdaTestUtils.lambda$intercept$0(LambdaTestUtils.java:534)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:498)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:529)
at 
org.apache.hadoop.hdfs.server.federation.security.TestRouterSecurityManager.testNotRunningSecretManager(TestRouterSecurityManager.java:326)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
at 
com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(

[jira] [Updated] (HDFS-17003) Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

2023-06-08 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-17003:
---
Summary: Erasure Coding: invalidate wrong block after reporting bad blocks 
from datanode  (was: Erasure coding: invalidate wrong block after reporting bad 
blocks from datanode)

> Erasure Coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Critical
>  Labels: pull-request-available
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause data 
> loss. Some logs from our production cluster are shown below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not found.{code}
>  
> This phenomenon can be reproduced.
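
The two block IDs in the logs above differ only in their low bits. The sketch below decodes them under the assumption (not stated in this issue) that a striped block ID keeps the index of the internal block within its block group in the low four bits. The class and method names are hypothetical and do not come from the Hadoop code base. Under that assumption, datanode1 reported internal block 0 as bad, while the replica it later failed to delete was internal block 1 of the same group, which is consistent with the NameNode picking the wrong internal block to invalidate.

```java
public class StripedBlockIdDemo {
  // Assumption: the low 4 bits of a striped (erasure-coded) block ID hold the
  // index of the internal block inside its block group.
  static final long BLOCK_GROUP_INDEX_MASK = 0xF;

  // Clears the index bits to obtain the ID shared by the whole block group.
  static long blockGroupId(long blockId) {
    return blockId & ~BLOCK_GROUP_INDEX_MASK;
  }

  // Extracts the index of the internal block within the group.
  static int blockIndex(long blockId) {
    return (int) (blockId & BLOCK_GROUP_INDEX_MASK);
  }

  public static void main(String[] args) {
    // IDs copied from the NameNode and datanode1 logs in this issue.
    long reportedByDatanode1 = -9223372036848404320L;
    long deleteFailedOnDatanode1 = -9223372036848404319L;

    System.out.println("reported:      group=" + blockGroupId(reportedByDatanode1)
        + " index=" + blockIndex(reportedByDatanode1));
    System.out.println("delete failed: group=" + blockGroupId(deleteFailedOnDatanode1)
        + " index=" + blockIndex(deleteFailedOnDatanode1));
  }
}
```

Both IDs map to the same group ID with indices 0 and 1, so the mismatch is between internal blocks of one block group, not between unrelated blocks.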



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17037) Consider nonDfsUsed when running balancer

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730472#comment-17730472
 ] 

ASF GitHub Bot commented on HDFS-17037:
---

hadoop-yetus commented on PR #5715:
URL: https://github.com/apache/hadoop/pull/5715#issuecomment-1582218750

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  18m 38s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 18s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m  8s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 37s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 25s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 14s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 55s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   4m 55s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  hadoop-hdfs-project: The patch generated 0 new + 143 unchanged - 1 fixed = 143 total (was 144)  |
   | +1 :green_heart: |  mvnsite  |   1m 56s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 57s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 24s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 255m 54s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 391m 50s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5715/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5715 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 7d1894527cad 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 86c4919792e86c08b90626583ad880fbe89edbbc |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5715/5/testReport/ |
   | Max. process+thread count | 2898 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hado

[jira] [Resolved] (HDFS-17035) FsVolumeImpl#getActualNonDfsUsed may return negative value

2023-06-08 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-17035.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> FsVolumeImpl#getActualNonDfsUsed may return negative value
> --
>
> Key: HDFS-17035
> URL: https://issues.apache.org/jira/browse/HDFS-17035
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>







[jira] [Commented] (HDFS-17035) FsVolumeImpl#getActualNonDfsUsed may return negative value

2023-06-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730463#comment-17730463
 ] 

Ayush Saxena commented on HDFS-17035:
-

Committed to trunk.

Thanks [~zhanghaobo] for the contribution, and [~hexiaoqiao] and [~zhangshuyan] 
for the review!

> FsVolumeImpl#getActualNonDfsUsed may return negative value
> --
>
> Key: HDFS-17035
> URL: https://issues.apache.org/jira/browse/HDFS-17035
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17035) FsVolumeImpl#getActualNonDfsUsed may return negative value

2023-06-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730460#comment-17730460
 ] 

ASF GitHub Bot commented on HDFS-17035:
---

ayushtkn merged PR #5708:
URL: https://github.com/apache/hadoop/pull/5708




> FsVolumeImpl#getActualNonDfsUsed may return negative value
> --
>
> Key: HDFS-17035
> URL: https://issues.apache.org/jira/browse/HDFS-17035
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>



