[jira] [Commented] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648930#comment-17648930
 ] 

ASF GitHub Bot commented on HDFS-16870:
---

hadoop-yetus commented on PR #5237:
URL: https://github.com/apache/hadoop/pull/5237#issuecomment-1356382743

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  0s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  42m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 32s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 42s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m  1s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 395m 37s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5237/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 516m 54s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5237/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5237 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d20bfd7407bd 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 43d48f53c7e1fa2e0c7efc8bce71ae7c79933dfa |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5237/1/testReport/ |
   | Max. process+thread count | 2077 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648910#comment-17648910
 ] 

ASF GitHub Bot commented on HDFS-16870:
---

ayushtkn commented on code in PR #5237:
URL: https://github.com/apache/hadoop/pull/5237#discussion_r1051411641


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -5893,7 +5893,7 @@ void reportBadBlocks(LocatedBlock[] blocks) throws IOException {
 String[] storageIDs = blocks[i].getStorageIDs();
 for (int j = 0; j < nodes.length; j++) {
   NameNode.stateChangeLog.info("*DIR* reportBadBlocks for block: {} on"
-  + " datanode: {}", blk, nodes[j].getXferAddr());
+  + " datanode: {}" + " client: {}", blk, nodes[j].getXferAddr(), Server.getRemoteIp());

Review Comment:
   Instead of ``Server.getRemoteIp()``, use ``getClientMachine()``; it also
handles calls routed via RBF.
   Fetching this IP address should happen outside the lock; doing it inside the
lock will hurt performance.
   You are logging this inside the loop, so better to extract it into a variable
outside the loop.
   
   reportBadBlocks exists in both ClientProtocol and DatanodeProtocol. You can
check here for pointers:
   
https://github.com/apache/hadoop/blob/ca3526da9283500643479e784a779fb7898b6627/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L1533-L1534
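The review suggestion above (resolve the client address once, outside the lock, and reuse it inside the loop) can be sketched as follows. This is a minimal, self-contained illustration of the hoisting pattern, not the actual FSNamesystem code: the `ReentrantLock`, the `resolveClientMachine()` helper, and the printf-based log line are stand-ins for the NameNode's write lock, `getClientMachine()`, and `NameNode.stateChangeLog`.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class HoistedLookupSketch {
  private static final ReentrantLock writeLock = new ReentrantLock();

  // Stand-in for a potentially expensive per-call lookup such as getClientMachine().
  static String resolveClientMachine() {
    return "10.0.0.7";
  }

  // Hypothetical analogue of the reportBadBlocks logging loop.
  static void reportBadBlocks(List<String> datanodeAddrs) {
    // Hoisted: resolve the client address once, before taking the lock, so the
    // lookup never runs under the lock and never repeats per loop iteration.
    final String clientMachine = resolveClientMachine();

    writeLock.lock();
    try {
      for (String dn : datanodeAddrs) {
        System.out.printf("*DIR* reportBadBlocks on datanode: %s client: %s%n",
            dn, clientMachine);
      }
    } finally {
      writeLock.unlock();
    }
  }

  public static void main(String[] args) {
    reportBadBlocks(List.of("dn1:9866", "dn2:9866"));
  }
}
```

The same shape applies inside FSNamesystem: compute the client identity before `writeLock()`, then reference the local variable in the log statement.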





> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>  Labels: pull-request-available
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode address is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16869) Fail to start namenode owing to 0 size of clientid recorded in edit log.

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648897#comment-17648897
 ] 

ASF GitHub Bot commented on HDFS-16869:
---

hadoop-yetus commented on PR #5235:
URL: https://github.com/apache/hadoop/pull/5235#issuecomment-1356225204

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 10s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  22m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m  9s | 
[/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5235/1/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-common in trunk failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  25m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  25m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  22m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 59s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5235/1/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 1 new + 7 
unchanged - 0 fixed = 8 total (was 7)  |
   | +1 :green_heart: |  mvnsite  |   1m 35s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 59s | 
[/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5235/1/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-common in the patch failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 43s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 10s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 25s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 222m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5235/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5235 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 604a53f2d73d 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
 

[jira] [Commented] (HDFS-16869) Fail to start namenode owing to 0 size of clientid recorded in edit log.

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648895#comment-17648895
 ] 

ASF GitHub Bot commented on HDFS-16869:
---

slfan1989 commented on PR #5235:
URL: https://github.com/apache/hadoop/pull/5235#issuecomment-1356204734

   > We first encountered this issue in Hadoop 3.3.1 when rolling-upgrading from
3.1.1 to 3.3.1; it may cause NameNode start failure, but only occasionally, not
every time.
   
   Thank you very much for reporting this issue and contributing a fix, but can
you explain why this modification solves the issue?




> Fail to start namenode owing to 0 size of clientid recorded in edit log.
> 
>
> Key: HDFS-16869
> URL: https://issues.apache.org/jira/browse/HDFS-16869
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Major
>  Labels: pull-request-available
>
> We first encountered this issue in version 3.3.1 while upgrading from 3.1.1 to
> 3.3.1; it may cause NameNode start failure, but only occasionally, not every
> time.
> The root cause of the zero-length clientId has still not been found after
> long-term investigation.
> So we add a protective check here to exclude zero-length clientIds from being
> added to the cache.






[jira] [Commented] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648892#comment-17648892
 ] 

ASF GitHub Bot commented on HDFS-16870:
---

slfan1989 commented on code in PR #5237:
URL: https://github.com/apache/hadoop/pull/5237#discussion_r1051385874


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -5893,7 +5893,7 @@ void reportBadBlocks(LocatedBlock[] blocks) throws IOException {
 String[] storageIDs = blocks[i].getStorageIDs();
 for (int j = 0; j < nodes.length; j++) {
   NameNode.stateChangeLog.info("*DIR* reportBadBlocks for block: {} on"
-  + " datanode: {}", blk, nodes[j].getXferAddr());
+  + " datanode: {}" + " client: {}", blk, nodes[j].getXferAddr(), Server.getRemoteIp());

Review Comment:
   From my personal point of view, reportBadBlocks should be reported by the DN;
what does this have to do with the client? `Server.getRemoteIp()` is somewhat
expensive; is there a good enough reason for us to do this?





> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>  Labels: pull-request-available
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode address is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.






[jira] [Updated] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16870:
--
Labels: pull-request-available  (was: )

> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>  Labels: pull-request-available
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode address is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.






[jira] [Commented] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648885#comment-17648885
 ] 

ASF GitHub Bot commented on HDFS-16870:
---

Daniel-009497 opened a new pull request, #5237:
URL: https://github.com/apache/hadoop/pull/5237

   There are two scenarios involved in reportBadBlocks:
   1-An HDFS client will report a bad block to the NameNode once the block size
or data is not consistent with its meta;
   2-A DataNode will report a bad block to the NameNode via heartbeat if a
replica stored on the DataNode is corrupted or has been modified.
   
   As of now, when the NameNode processes a reportBadBlocks RPC request, only the
DataNode address is logged.
   The client IP should also be logged to distinguish where the report comes
from, which is very useful for troubleshooting.




> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode address is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.






[jira] [Updated] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread Daniel Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Ma updated HDFS-16870:
-
Description: 
There are two scenarios involved in reportBadBlocks.
1-An HDFS client will report a bad block to the NameNode once the block size is
inconsistent with its meta;
2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
stored on the DataNode is corrupted or has been modified.

Currently, when the NameNode processes a reportBadBlocks RPC request, only the
DataNode address is recorded in the log message.
The client IP should also be recorded to distinguish where the report comes
from, which is very useful for troubleshooting.

  was:
There are two scenarios involved in reportBadBlocks.
1-An HDFS client will report a bad block to the NameNode once the block size is
inconsistent with its meta;
2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
stored on the DataNode is corrupted or has been modified.

Currently, when the NameNode processes a reportBadBlocks RPC request, only the
DataNode IP is recorded in the log message.
The client IP should also be recorded to distinguish where the report comes
from, which is very useful for troubleshooting.


> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode address is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.






[jira] [Updated] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread Daniel Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Ma updated HDFS-16870:
-
Description: 
There are two scenarios involved in reportBadBlocks.
1-An HDFS client will report a bad block to the NameNode once the block size is
inconsistent with its meta;
2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
stored on the DataNode is corrupted or has been modified.

Currently, when the NameNode processes a reportBadBlocks RPC request, only the
DataNode IP is recorded in the log message.
The client IP should also be recorded to distinguish where the report comes
from, which is very useful for troubleshooting.

> Client ip should also be recorded when NameNode is processing reportBadBlocks
> -
>
> Key: HDFS-16870
> URL: https://issues.apache.org/jira/browse/HDFS-16870
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Daniel Ma
>Priority: Trivial
>
> There are two scenarios involved in reportBadBlocks.
> 1-An HDFS client will report a bad block to the NameNode once the block size is
> inconsistent with its meta;
> 2-A DataNode will report a bad block to the NameNode via heartbeat if a replica
> stored on the DataNode is corrupted or has been modified.
> Currently, when the NameNode processes a reportBadBlocks RPC request, only the
> DataNode IP is recorded in the log message.
> The client IP should also be recorded to distinguish where the report comes
> from, which is very useful for troubleshooting.






[jira] [Created] (HDFS-16870) Client ip should also be recorded when NameNode is processing reportBadBlocks

2022-12-17 Thread Daniel Ma (Jira)
Daniel Ma created HDFS-16870:


 Summary: Client ip should also be recorded when NameNode is 
processing reportBadBlocks
 Key: HDFS-16870
 URL: https://issues.apache.org/jira/browse/HDFS-16870
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Daniel Ma









[jira] [Commented] (HDFS-16869) Fail to start namenode owing to 0 size of clientid recorded in edit log.

2022-12-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648876#comment-17648876
 ] 

ASF GitHub Bot commented on HDFS-16869:
---

Daniel-009497 opened a new pull request, #5235:
URL: https://github.com/apache/hadoop/pull/5235

   We first encountered this issue in Hadoop 3.3.1 while rolling-upgrading from
3.1.1 to 3.3.1; it may cause NameNode start failure, but only occasionally, not
every time.
   
   The root cause of the zero-length clientId is still under investigation.
   So here we add a protective check to exclude zero-length clientIds from being
added to the cache.
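The protective check described above can be sketched as below. This is a hedged, self-contained illustration only: a plain `HashMap` stands in for the NameNode's retry cache, and `addCacheEntry` is a hypothetical helper name chosen for the sketch, not the patch's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ClientIdGuardSketch {
  // A plain map standing in for the NameNode's retry cache.
  private final Map<String, Integer> retryCache = new HashMap<>();

  // Reject entries whose clientId is missing or zero-length instead of caching
  // them; per the report, a zero-length clientId recorded in the edit log is
  // what occasionally broke NameNode startup.
  boolean addCacheEntry(byte[] clientId, int callId) {
    if (clientId == null || clientId.length == 0) {
      return false; // skip the malformed entry rather than poisoning the cache
    }
    retryCache.put(new String(clientId, StandardCharsets.UTF_8), callId);
    return true;
  }

  public static void main(String[] args) {
    ClientIdGuardSketch cache = new ClientIdGuardSketch();
    System.out.println(cache.addCacheEntry(new byte[0], 1));   // rejected
    System.out.println(
        cache.addCacheEntry("client-1".getBytes(StandardCharsets.UTF_8), 2));
  }
}
```

The guard sidesteps the symptom during edit-log replay; as the reviewer notes, it does not explain where the zero-length clientId comes from.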




> Fail to start namenode owing to 0 size of clientid recorded in edit log.
> 
>
> Key: HDFS-16869
> URL: https://issues.apache.org/jira/browse/HDFS-16869
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Major
>
> We first encountered this issue in version 3.3.1 while upgrading from 3.1.1 to
> 3.3.1; it may cause NameNode start failure, but only occasionally, not every
> time.
> The root cause of the zero-length clientId has still not been found after
> long-term investigation.
> So we add a protective check here to exclude zero-length clientIds from being
> added to the cache.






[jira] [Updated] (HDFS-16869) Fail to start namenode owing to 0 size of clientid recorded in edit log.

2022-12-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16869:
--
Labels: pull-request-available  (was: )

> Fail to start namenode owing to 0 size of clientid recorded in edit log.
> 
>
> Key: HDFS-16869
> URL: https://issues.apache.org/jira/browse/HDFS-16869
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Major
>  Labels: pull-request-available
>
> We first encountered this issue in version 3.3.1 while upgrading from 3.1.1 to
> 3.3.1; it may cause NameNode start failure, but only occasionally, not every
> time.
> The root cause of the zero-length clientId has still not been found after
> long-term investigation.
> So we add a protective check here to exclude zero-length clientIds from being
> added to the cache.






[jira] [Updated] (HDFS-16869) Fail to start namenode owing to 0 size of clientid recorded in edit log.

2022-12-17 Thread Daniel Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Ma updated HDFS-16869:
-
Description: 
We first encountered this issue in version 3.3.1 while upgrading from 3.1.1 to
3.3.1; it may cause NameNode start failure, but only occasionally, not every
time.

The root cause of the zero-length clientId has still not been found after
long-term investigation.
So we add a protective check here to exclude zero-length clientIds from being
added to the cache.

  was:
The root cause of the zero-length clientId has still not been found.
So we add a protective check here to exclude zero-length clientIds from being
added to the cache.


> Fail to start namenode owing to 0 size of clientid recorded in edit log.
> 
>
> Key: HDFS-16869
> URL: https://issues.apache.org/jira/browse/HDFS-16869
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Major
>
> We first encountered this issue in version 3.3.1 while upgrading from 3.1.1 to
> 3.3.1; it may cause NameNode start failure, but only occasionally, not every
> time.
> The root cause of the zero-length clientId has still not been found after
> long-term investigation.
> So we add a protective check here to exclude zero-length clientIds from being
> added to the cache.


