[jira] [Commented] (HDFS-17001) Support getStatus API in WebHDFS

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721208#comment-17721208
 ] 

ASF GitHub Bot commented on HDFS-17001:
---

zhtttylz commented on code in PR #5628:
URL: https://github.com/apache/hadoop/pull/5628#discussion_r1189414594


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java:
##
@@ -2255,6 +2256,35 @@ public void testFileLinkStatus() throws Exception {
 }
   }
 
+  @Test
+  public void testFsStatus() throws Exception {
+    final Configuration conf = WebHdfsTestUtil.createConf();
+    try {
+      cluster = new MiniDFSCluster.Builder(conf)
+          .numDataNodes(1)
+          .build();
+      cluster.waitActive();
+
+      final WebHdfsFileSystem webHdfs =
+          WebHdfsTestUtil.getWebHdfsFileSystem(conf,
+              WebHdfsConstants.WEBHDFS_SCHEME);
+
+      final String path = "/foo";
+      OutputStream os = webHdfs.create(new Path(path));
+      os.write(new byte[1024]);
+
+      FsStatus fsStatus = webHdfs.getStatus(new Path("/"));
+      Assert.assertNotNull(fsStatus);
+
+      // used, free and capacity are non-negative longs
+      Assert.assertTrue(fsStatus.getUsed() >= 0);
+      Assert.assertTrue(fsStatus.getRemaining() >= 0);
+      Assert.assertTrue(fsStatus.getCapacity() >= 0);

Review Comment:
   Thank you for your valuable suggestion. I greatly appreciate it and will 
promptly make the necessary changes to the code!
   





> Support getStatus API in WebHDFS
> 
>
> Key: HDFS-17001
> URL: https://issues.apache.org/jira/browse/HDFS-17001
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-05-08-14-34-51-873.png
>
>
> WebHDFS should support getStatus:
> !image-2023-05-08-14-34-51-873.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-17002) Erasure coding:Generate parity blocks in time to prevent file corruption

2023-05-09 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721199#comment-17721199
 ] 

farmmamba edited comment on HDFS-17002 at 5/10/23 5:58 AM:
---

[~ayushtkn] , Yes, sir. This is not a bug; the type of this Jira is just an 
improvement. Yes, the client won't read parity blocks when all data blocks are 
healthy.

When the DirectoryScanner is not running, we know nothing about the parity 
blocks even if they get corrupted.

So I am wondering whether we should sample-check the correctness of the parity 
blocks with some probability when reading EC files, or use some other method 
to prevent parity blocks from silently breaking down.

What's your opinion? Looking forward to your reply.


was (Author: zhanghaobo):
[~ayushtkn] , Yes, sir. This is not a bug; the type of this Jira is just an 
improvement. Yes, the client won't read parity blocks when all data blocks are 
healthy.

When the DirectoryScanner is not running, we know nothing about the parity 
blocks even if they get corrupted.

So I am wondering whether we should sample-check the correctness of the parity 
blocks with some probability when reading EC files, or use some other method 
to prevent parity blocks from silently breaking down.

> Erasure coding:Generate parity blocks in time to prevent file corruption
> 
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the scenario below when using the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted in this time period, then 
> this file will be corrupted and cannot be read by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17002) Erasure coding:Generate parity blocks in time to prevent file corruption

2023-05-09 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721199#comment-17721199
 ] 

farmmamba commented on HDFS-17002:
--

[~ayushtkn] , Yes, sir. This is not a bug; the type of this Jira is just an 
improvement. Yes, the client won't read parity blocks when all data blocks are 
healthy.

When the DirectoryScanner is not running, we know nothing about the parity 
blocks even if they get corrupted.

So I am wondering whether we should sample-check the correctness of the parity 
blocks with some probability when reading EC files, or use some other method 
to prevent parity blocks from silently breaking down.
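
A purely hypothetical sketch of the sampling idea (nothing here exists in 
Hadoop today; the constant, method, and probability value are invented for 
illustration, with java.util.concurrent.ThreadLocalRandom assumed imported):

{code:java}
// With a small, configurable probability per EC read, also fetch the parity
// blocks and verify their checksums, so silent parity corruption is caught
// without waiting for a DirectoryScanner pass.
private static final double PARITY_VERIFY_PROBABILITY = 0.01; // assumed knob

private boolean shouldVerifyParity() {
  return ThreadLocalRandom.current().nextDouble() < PARITY_VERIFY_PROBABILITY;
}
{code}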

> Erasure coding:Generate parity blocks in time to prevent file corruption
> 
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the scenario below when using the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted in this time period, then 
> this file will be corrupted and cannot be read by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16991) Fix testMkdirsRaceWithObserverRead

2023-05-09 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HDFS-16991:
---

Assignee: fanluo

> Fix testMkdirsRaceWithObserverRead
> --
>
> Key: HDFS-16991
> URL: https://issues.apache.org/jira/browse/HDFS-16991
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.3.4
>Reporter: fanluo
>Assignee: fanluo
>Priority: Minor
>  Labels: pull-request-available
>
> The test case testMkdirsRaceWithObserverRead in TestObserverNode 
> sometimes fails like this:
> {code:java}
> java.lang.AssertionError: Client #1 lastSeenStateId=-9223372036854775808 
> activStateId=5
> null    at org.junit.Assert.fail(Assert.java:89)
>     at org.junit.Assert.assertTrue(Assert.java:42)
>     at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestObserverNode.testMkdirsRaceWithObserverRead(TestObserverNode.java:607)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  {code}
> I think the Thread.sleep() should be moved into the sub-threads, like this:
> {code:java}
> public void run() {
>   try {
>     fs.mkdirs(DIR_PATH);
>     Thread.sleep(150); // wait until mkdir is logged
>     clientState.lastSeenStateId = HATestUtil.getLastSeenStateId(fs);
>     assertSentTo(fs, 0);
>     FileStatus stat = fs.getFileStatus(DIR_PATH);
>     assertSentTo(fs, 2);
>     assertTrue("Should be a directory", stat.isDirectory());
>   } catch (FileNotFoundException ioe) {
>     clientState.fnfe = ioe;
>   } catch (Exception e) {
>     fail("Unexpected exception: " + e);
>   }
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17001) Support getStatus API in WebHDFS

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721184#comment-17721184
 ] 

ASF GitHub Bot commented on HDFS-17001:
---

ayushtkn commented on code in PR #5628:
URL: https://github.com/apache/hadoop/pull/5628#discussion_r1189349033


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java:
##
@@ -725,4 +726,21 @@ public static Map toJsonMap(BlockLocation[] locations)
 m.put(BlockLocation.class.getSimpleName(), blockLocations);
 return m;
   }
+
+  public static String toJsonString(FsStatus status)
+      throws IOException {
+    return toJsonString(FsStatus.class, toJsonMap(status));
+  }
+
+  public static Map toJsonMap(FsStatus status)
+      throws IOException {

Review Comment:
   This doesn't throw IOE, so it can be removed. Once you remove it from here, I 
think the throws IOE can be removed from the method above as well.
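   
   A minimal sketch of the shape this could take once the unnecessary throws 
clauses are dropped (the map body and the Map<String, Object>/TreeMap types are 
assumptions for illustration; the "used"/"remaining"/"capacity" keys follow the 
WebHDFS.md snippet later in this review):
   
{code:java}
public static String toJsonString(FsStatus status) {
  return toJsonString(FsStatus.class, toJsonMap(status));
}

public static Map<String, Object> toJsonMap(FsStatus status) {
  // Keys mirror the documented FsStatus JSON object.
  final Map<String, Object> m = new TreeMap<>();
  m.put("used", status.getUsed());
  m.put("remaining", status.getRemaining());
  m.put("capacity", status.getCapacity());
  return m;
}
{code}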



##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java:
##
@@ -2178,6 +2179,19 @@ HdfsFileStatus decodeResponse(Map json) {
 return status.makeQualified(getUri(), f);
   }
 
+  @Override
+  public FsStatus getStatus(Path f) throws IOException {

Review Comment:
   nit:
   change ``f`` to ``path``



##
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md:
##
@@ -1190,6 +1191,28 @@ See also: 
[FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html).getLinkTarget
 
 See also: 
[FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html).getFileLinkInfo
 
+### Get Status
+
+* Submit a HTTP GET request.
+
curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETSTATUS"
+
+  The client receives a response with a [`FsStatus` JSON 
object](#FsStatus_JSON_Schema):
+
+HTTP/1.1 200 OK
+Content-Type: application/json
+Transfer-Encoding: chunked
+
+{
+"FsStatus": {
+"used": 0,
+"remaining": 0,
+"capacity":0
+}

Review Comment:
   Can you try it on an actual cluster and get a better example, rather than 
having all zeros?
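   
   For reference, a non-empty response would look something like this (numbers 
purely illustrative, not taken from a real cluster):
   
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

{
    "FsStatus": {
        "used": 29229154304,
        "remaining": 238370447360,
        "capacity": 267599601664
    }
}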



##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java:
##
@@ -2255,6 +2256,35 @@ public void testFileLinkStatus() throws Exception {
 }
   }
 
+  @Test
+  public void testFsStatus() throws Exception {
+    final Configuration conf = WebHdfsTestUtil.createConf();
+    try {
+      cluster = new MiniDFSCluster.Builder(conf)
+          .numDataNodes(1)

Review Comment:
   The number of datanodes is 1 by default, so this line isn't required.
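   
   That is, the builder call would reduce to:
   
{code:java}
cluster = new MiniDFSCluster.Builder(conf).build();
{code}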



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java:
##
@@ -1535,6 +1543,10 @@ public Void run() throws IOException {
 };
   }
 
+  private long getStateAtIndex(long[] states, int index) {
+    return states.length > index ? states[index] : -1;
+  }

Review Comment:
   The same method is defined in DFSClient; can we make the definition over there 
``public static`` and use it here as well, rather than defining it twice?
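   
   A sketch of the suggested refactor (the method body is copied from the patch 
above; the call-site comment is illustrative):
   
{code:java}
// In DFSClient: widen the existing helper so other callers can reuse it.
public static long getStateAtIndex(long[] states, int index) {
  return states.length > index ? states[index] : -1;
}

// NamenodeWebHdfsMethods would then call
// DFSClient.getStateAtIndex(states, index) instead of keeping a private copy.
{code}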



##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java:
##
@@ -2255,6 +2256,35 @@ public void testFileLinkStatus() throws Exception {
 }
   }
 
+  @Test
+  public void testFsStatus() throws Exception {
+    final Configuration conf = WebHdfsTestUtil.createConf();
+    try {
+      cluster = new MiniDFSCluster.Builder(conf)
+          .numDataNodes(1)
+          .build();
+      cluster.waitActive();
+
+      final WebHdfsFileSystem webHdfs =
+          WebHdfsTestUtil.getWebHdfsFileSystem(conf,
+              WebHdfsConstants.WEBHDFS_SCHEME);
+
+      final String path = "/foo";
+      OutputStream os = webHdfs.create(new Path(path));
+      os.write(new byte[1024]);
+
+      FsStatus fsStatus = webHdfs.getStatus(new Path("/"));
+      Assert.assertNotNull(fsStatus);
+
+      // used, free and capacity are non-negative longs
+      Assert.assertTrue(fsStatus.getUsed() >= 0);
+      Assert.assertTrue(fsStatus.getRemaining() >= 0);
+      Assert.assertTrue(fsStatus.getCapacity() >= 0);

Review Comment:
   There is already a static import for these; no need for the Assert. prefix.
   
   Rather than just asserting they aren't negative, can you get the values from 
DistributedFileSystem and validate that they are the same?
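   
   A sketch of the suggested cross-check (assuming assertEquals is covered by the 
existing static imports and that cluster.getFileSystem() returns the backing 
DistributedFileSystem; used/remaining can drift on a busy cluster, so this 
should run while the cluster is quiescent):
   
{code:java}
FsStatus webHdfsStatus = webHdfs.getStatus(new Path("/"));
FsStatus dfsStatus = cluster.getFileSystem().getStatus(new Path("/"));
assertEquals(dfsStatus.getCapacity(), webHdfsStatus.getCapacity());
assertEquals(dfsStatus.getUsed(), webHdfsStatus.getUsed());
assertEquals(dfsStatus.getRemaining(), webHdfsStatus.getRemaining());
{code}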





> Support getStatus API in WebHDFS
> 
>
> Key: HDFS-17001
> URL: https://issues.apache.org/jira/browse/HDFS-17001
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-05-08-14-34-51-873.png

[jira] [Commented] (HDFS-17002) Erasure coding:Generate parity blocks in time to prevent file corruption

2023-05-09 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721176#comment-17721176
 ] 

Ayush Saxena commented on HDFS-17002:
-

When reading an EC file, if all the data blocks are there, as far as I know the 
client won't even bother about the parity blocks.

It can get all the data from the data blocks themselves; it doesn't need to go 
to the parity blocks. In the case you are talking about, where a parity block 
on a physical datanode gets corrupted, most probably the DirectoryScanner 
would detect and take care of that.

Doesn't sound like a bug to me on a quick read.

> Erasure coding:Generate parity blocks in time to prevent file corruption
> 
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the scenario below when using the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted in this time period, then 
> this file will be corrupted and cannot be read by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17002) Erasure coding:Generate parity blocks in time to prevent file corruption

2023-05-09 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721174#comment-17721174
 ] 

farmmamba commented on HDFS-17002:
--

Hi, [~sodonnell] , thanks for your reply. I have done some tests on this 
case, as follows.

Suppose we use the RS-6-3-1024k EC policy and have (d1, d2, d3, d4, d5, d6, 
r1, r2, r3) of file test.txt.

1. echo 0 > r1; echo 0 > r2; echo 0 > r3.

2. hdfs dfs -cat test.txt

3. fsck test.txt

I found the r1, r2, r3 parity blocks are still the old ones. That is to say, 
reading an EC file does not trigger parity block reconstruction promptly.

 

> Erasure coding:Generate parity blocks in time to prevent file corruption
> 
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the scenario below when using the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted in this time period, then 
> this file will be corrupted and cannot be read by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721159#comment-17721159
 ] 

ASF GitHub Bot commented on HDFS-13507:
---

ZanderXu commented on PR #4990:
URL: https://github.com/apache/hadoop/pull/4990#issuecomment-1541260457

   > no problem sir, thank you! Since you are anyway going to resolve the merge 
conflicts, would you like to do it after #5554 gets merged? That way, you will 
only need a one-time effort.
   
   Copy, sir. HDFS-16978 is a very useful patch; thanks for your great work.




> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible, pull-request-available
> Attachments: HDFS-13507-HDFS-13891.003.patch, 
> HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch, HDFS-13507.003.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note: this is an incompatible change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16990) HttpFS Add Support getFileLinkStatus API

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721155#comment-17721155
 ] 

ASF GitHub Bot commented on HDFS-16990:
---

zhtttylz commented on code in PR #5602:
URL: https://github.com/apache/hadoop/pull/5602#discussion_r1189306979


##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java:
##
@@ -2056,6 +2060,27 @@ private void testGetSnapshotDiffListing() throws Exception {
 }
   }
 
+  private void testGetFileLinkStatus() throws Exception {
+    if (isLocalFS()) {
+      // do not test the the symlink for local FS.

Review Comment:
   Thank you so much for your assistance in reviewing the code! I truly 
appreciate your valuable feedback and will make the necessary modifications.





> HttpFS Add Support getFileLinkStatus API
> 
>
> Key: HDFS-16990
> URL: https://issues.apache.org/jira/browse/HDFS-16990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
>
> HttpFS should implement the *getFileLinkStatus* API already implemented in 
> WebHDFS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-09 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721156#comment-17721156
 ] 

farmmamba commented on HDFS-17003:
--

[~hexiaoqiao] , Hi, sir. The data loss can be reproduced as below; the main 
reason is at the end.

Suppose we have d1-d6, r1-r3 of a file test.txt.

1. echo 0 > d1 and echo 0 > d2

2. hdfs dfs -cat test.txt to report bad d1 and d2 and reconstruct d1 to d1', 
d2 to d2'. This will only invalidate d2 because of the NameNode logic, so we 
still have the corrupt d1.

3. echo 0 > d1' and echo 0 > d2', then execute hdfs dfs -cat test.txt to 
reconstruct d1' to d1'', d2' to d2''.

4. Then echo 0 > r1; echo 0 > r2; echo 0 > r3.

5. Wait a moment; the file is corrupted and cannot be recovered.

The main reason for this case is that d1 and d1' are not deleted in time, and 
the NameNode detects the excess blocks and then deletes the healthy block d1''.

For more detail, see the code in the BlockManager#addStoredBlock method:
{code:java}
if ((corruptReplicasCount > 0) && (numLiveReplicas >= fileRedundancy)) {
  invalidateCorruptReplicas(storedBlock, reportedBlock, num);
}{code}
 

If we destroy two data blocks of an EC stripe, HDFS will reconstruct those two 
data blocks and send IBRs to the NameNode, which executes the 
BlockManager#addStoredBlock method. When receiving the second data block's 
IBR, the NameNode enters the if condition above; the parameter passed there is 
reportedBlock. In the invalidateCorruptReplicas method, corrupt blocks are 
added to InvalidateBlocks according to the reportedBlock parameter, so this 
logic skips invalidating the block that sent its IBR first.
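
A paraphrased sketch of the loop being described (not the exact Hadoop source; 
the collection name is hypothetical):

{code:java}
// Every node holding a corrupt replica gets the single reportedBlock queued.
// For a striped (EC) group each datanode stores a different internal block ID,
// so the internal block from the first datanode's IBR is never invalidated.
for (DatanodeDescriptor node : nodesWithCorruptReplica) { // hypothetical name
  addToInvalidates(reportedBlock, node);
}
{code}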

 

> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Priority: Critical
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause 
> data loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16990) HttpFS Add Support getFileLinkStatus API

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721154#comment-17721154
 ] 

ASF GitHub Bot commented on HDFS-16990:
---

zhtttylz commented on code in PR #5602:
URL: https://github.com/apache/hadoop/pull/5602#discussion_r1189304987


##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:
##
@@ -546,6 +546,14 @@ public InputStream run() throws Exception {
   response = Response.ok(json).type(MediaType.APPLICATION_JSON).build();
   break;
 }
+    case GETFILELINKSTATUS: {
+      FSOperations.FSFileLinkStatus command =
+          new FSOperations.FSFileLinkStatus(path);
+      Map js = fsExecute(user, command);

Review Comment:
   I will modify the code.





> HttpFS Add Support getFileLinkStatus API
> 
>
> Key: HDFS-16990
> URL: https://issues.apache.org/jira/browse/HDFS-16990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
>
> HttpFS should implement the *getFileLinkStatus* API already implemented in 
> WebHDFS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16990) HttpFS Add Support getFileLinkStatus API

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721152#comment-17721152
 ] 

ASF GitHub Bot commented on HDFS-16990:
---

zhtttylz commented on code in PR #5602:
URL: https://github.com/apache/hadoop/pull/5602#discussion_r1189303216


##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java:
##
@@ -1743,6 +1744,18 @@ public BlockLocation[] getFileBlockLocations(final FileStatus status,
 return getFileBlockLocations(status.getPath(), offset, length);
   }
 
+  @Override
+  public FileStatus getFileLinkStatus(final Path f) throws IOException {

Review Comment:
   Thank you for your valuable suggestion. I greatly appreciate it and will 
promptly make the necessary changes to the code!





> HttpFS Add Support getFileLinkStatus API
> 
>
> Key: HDFS-16990
> URL: https://issues.apache.org/jira/browse/HDFS-16990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
>
> HttpFS should implement the *getFileLinkStatus* API already implemented in 
> WebHDFS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721151#comment-17721151
 ] 

ASF GitHub Bot commented on HDFS-13507:
---

virajjasani commented on PR #4990:
URL: https://github.com/apache/hadoop/pull/4990#issuecomment-1541211627

   no problem sir, thank you!
   Since you are anyway going to resolve the merge conflicts, would you like to do 
it after #5554 gets merged? That way, you will only need a one-time effort.




> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible, pull-request-available
> Attachments: HDFS-13507-HDFS-13891.003.patch, 
> HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch, HDFS-13507.003.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note: this is an incompatible change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721145#comment-17721145
 ] 

ASF GitHub Bot commented on HDFS-13507:
---

ZanderXu commented on PR #4990:
URL: https://github.com/apache/hadoop/pull/4990#issuecomment-1541198183

   @virajjasani @ayushtkn Thanks for your reminder, and so sorry for missing it.
   
   I will update this PR later.




> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible, pull-request-available
> Attachments: HDFS-13507-HDFS-13891.003.patch, 
> HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch, HDFS-13507.003.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note: this is an incompatible change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721142#comment-17721142
 ] 

ASF GitHub Bot commented on HDFS-13507:
---

virajjasani commented on PR #4990:
URL: https://github.com/apache/hadoop/pull/4990#issuecomment-1541193032

   @ZanderXu I have a PR that touches similar code. If you don't have 
bandwidth, can I take up this PR, since either of the PRs will need to resolve 
conflicts anyway?
   Thanks




> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible, pull-request-available
> Attachments: HDFS-13507-HDFS-13891.003.patch, 
> HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch, HDFS-13507.003.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd, to make it consistent with RPC 
> calls.
> Note: this is an incompatible change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16965) Add switch to decide whether to enable native codec.

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721133#comment-17721133
 ] 

ASF GitHub Bot commented on HDFS-16965:
---

tomscut commented on code in PR #5520:
URL: https://github.com/apache/hadoop/pull/5520#discussion_r1189267598


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/CodecUtil.java:
##
@@ -170,8 +174,14 @@ private static String[] getRawCoderNames(
 
   private static RawErasureEncoder createRawEncoderWithFallback(
   Configuration conf, String codecName, ErasureCoderOptions coderOptions) {
+    boolean ISALEnabled = conf.getBoolean(IO_ERASURECODE_CODEC_NATIVE_ENABLED_KEY,

Review Comment:
   Please change the case of this variable.
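   
   The camelCase fix would look something like this (the name of the default 
constant is an assumption):
   
{code:java}
boolean isalEnabled = conf.getBoolean(
    IO_ERASURECODE_CODEC_NATIVE_ENABLED_KEY,
    IO_ERASURECODE_CODEC_NATIVE_ENABLED_DEFAULT); // assumed default constant
{code}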





> Add switch to decide whether to enable native codec.
> 
>
> Key: HDFS-16965
> URL: https://issues.apache.org/jira/browse/HDFS-16965
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Affects Versions: 3.3.4
>Reporter: WangYuanben
>Priority: Minor
>  Labels: pull-request-available
>
> Sometimes we need to create a codec without ISA-L, while priority is given to 
> the native codec by default. So it is necessary to add a switch to decide 
> whether to enable the native codec.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16965) Add switch to decide whether to enable native codec.

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721132#comment-17721132
 ] 

ASF GitHub Bot commented on HDFS-16965:
---

tomscut commented on code in PR #5520:
URL: https://github.com/apache/hadoop/pull/5520#discussion_r1189266222


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/CodecUtil.java:
##
@@ -78,6 +78,10 @@ public final class CodecUtil {
   public static final String IO_ERASURECODE_CODEC_XOR_RAWCODERS_KEY =
   IO_ERASURECODE_CODEC + "xor.rawcoders";
 
+  public static final String IO_ERASURECODE_CODEC_NATIVE_ENABLED_KEY = "io.erasurecode.codec.native.enabled";

Review Comment:
   Hi @YuanbenWang , please fix the checkstyle. The other changes look good to 
me. Thanks.
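   
   Presumably the checkstyle complaint is the line length; wrapping the 
initializer would address it:
   
{code:java}
public static final String IO_ERASURECODE_CODEC_NATIVE_ENABLED_KEY =
    "io.erasurecode.codec.native.enabled";
{code}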





> Add switch to decide whether to enable native codec.
> 
>
> Key: HDFS-16965
> URL: https://issues.apache.org/jira/browse/HDFS-16965
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Affects Versions: 3.3.4
>Reporter: WangYuanben
>Priority: Minor
>  Labels: pull-request-available
>
> Sometimes we need to create a codec without ISA-L, while priority is given to 
> the native codec by default. So it is necessary to add a switch to decide 
> whether to enable the native codec.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16990) HttpFS Add Support getFileLinkStatus API

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721083#comment-17721083
 ] 

ASF GitHub Bot commented on HDFS-16990:
---

ayushtkn commented on code in PR #5602:
URL: https://github.com/apache/hadoop/pull/5602#discussion_r1189092588


##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java:
##
@@ -1743,6 +1744,18 @@ public BlockLocation[] getFileBlockLocations(final FileStatus status,
 return getFileBlockLocations(status.getPath(), offset, length);
   }
 
+  @Override
+  public FileStatus getFileLinkStatus(final Path f) throws IOException {

Review Comment:
   nit:
   can you change the variable name, instead of  ``f`` use ``path``



##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java:
##
@@ -2265,4 +2265,38 @@ public Map execute(FileSystem fs) throws IOException {
   "because the file system is not DistributedFileSystem.");
 }
   }
+
+  /**
+   * Executor that performs a linkFile-status FileSystemAccess files
+   * system operation.
+   */
+  @InterfaceAudience.Private
+  public static class FSFileLinkStatus
+      implements FileSystemAccess.FileSystemExecutor {
+    private Path path;

Review Comment:
   can be ``final``



##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java:
##
@@ -2265,4 +2265,38 @@ public Map execute(FileSystem fs) throws IOException {
   "because the file system is not DistributedFileSystem.");
 }
   }
+
+  /**
+   * Executor that performs a linkFile-status FileSystemAccess files
+   * system operation.
+   */
+  @InterfaceAudience.Private
+  public static class FSFileLinkStatus

Review Comment:
   Add ``` @SuppressWarnings("rawtypes")```



##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:
##
@@ -546,6 +546,14 @@ public InputStream run() throws Exception {
   response = Response.ok(json).type(MediaType.APPLICATION_JSON).build();
   break;
 }
+    case GETFILELINKSTATUS: {
+      FSOperations.FSFileLinkStatus command =
+          new FSOperations.FSFileLinkStatus(path);
+      Map js = fsExecute(user, command);

Review Comment:
   add  ```@SuppressWarnings("rawtypes")```



##
hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java:
##
@@ -2056,6 +2060,27 @@ private void testGetSnapshotDiffListing() throws Exception {
 }
   }
 
+  private void testGetFileLinkStatus() throws Exception {
+    if (isLocalFS()) {
+      // do not test the the symlink for local FS.

Review Comment:
   nit:
   two times ``the the``





> HttpFS Add Support getFileLinkStatus API
> 
>
> Key: HDFS-16990
> URL: https://issues.apache.org/jira/browse/HDFS-16990
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
>
> HttpFS should implement the *getFileLinkStatus* API already implemented in 
> WebHDFS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17002) Erasure coding:Generate parity blocks in time to prevent file corruption

2023-05-09 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721080#comment-17721080
 ] 

Stephen O'Donnell commented on HDFS-17002:
--

If some of the parity blocks go missing, the Namenode should detect this and 
reconstruct them. Have you seen some example where this did not happen? Have 
you any more details or do you know the source of the problem?

> Erasure coding:Generate parity blocks in time to prevent file corruption
> 
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the scenario below when using the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted in this time period, then 
> this file will be corrupted and cannot be read by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721067#comment-17721067
 ] 

ASF GitHub Bot commented on HDFS-17005:
---

hadoop-yetus commented on PR #5633:
URL: https://github.com/apache/hadoop/pull/5633#issuecomment-1540791812

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 35s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  27m 39s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 52s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 9 unchanged - 
0 fixed = 14 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 42s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 235m 35s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 356m 36s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5633 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux dc7bbf6f6843 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 
19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / eba77970a0085c8d90af225c1c35840f59ba3b0f |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
 /usr/lib/jvm/java-8-op

[jira] [Commented] (HDFS-16978) RBF: Admin command to support bulk add of mount points

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721034#comment-17721034
 ] 

ASF GitHub Bot commented on HDFS-16978:
---

virajjasani commented on code in PR #5554:
URL: https://github.com/apache/hadoop/pull/5554#discussion_r1188968292


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java:
##
@@ -462,6 +484,142 @@ public int run(String[] argv) throws Exception {
 return exitCode;
   }
 
+  /**
+   * Add all mount point entries provided in the request.
+   *
+   * @param parameters Parameters for the mount points.
+   * @param i Current index on the parameters array.
+   * @return True if adding all mount points was successful, False otherwise.
+   * @throws IOException If the RPC call to add the mount points fail.
+   */
+  private boolean addAllMount(String[] parameters, int i) throws IOException {
+    List<AddMountAttributes> addMountAttributesList = new ArrayList<>();
+    Set<String> mounts = new HashSet<>();
+    while (i < parameters.length) {
+      AddMountAttributes addMountAttributes = getAddMountAttributes(parameters, i, true);
+      if (addMountAttributes == null) {
+        return false;
+      }
+      if (mounts.contains(addMountAttributes.getMount())) {
+        System.err.println("Multiple inputs for mount: " + addMountAttributes.getMount());
+        return false;
+      }
+      mounts.add(addMountAttributes.getMount());

Review Comment:
   > btw. earlier this was a success case for you, the later entry used to take 
precedence. I tried it
   
   I forgot to reply to this yesterday: yes, this was a success case, and only the 
later entry was taking precedence before this addendum; that is due to the 
nature of the state store putAll implementation.
   But it is good for us to prevent such inputs with a proper error message so that 
the user can rectify them. That way, we won't have to rely on the state store 
impl to resolve it with different behavior (i.e. we guard against any behavior 
changes of the putAll impl in all state stores).





> RBF: Admin command to support bulk add of mount points
> --
>
> Key: HDFS-16978
> URL: https://issues.apache.org/jira/browse/HDFS-16978
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> All state store implementations support adding multiple state store records 
> using a single putAll() implementation. We should provide a new router admin 
> API to support bulk addition of mount table entries that can utilize this 
> bulk add implementation at the state store level.
> For more than one mount point to be added, the goals of bulk addition should be
>  # To reduce frequent router calls
>  # To avoid frequent state store cache refreshes with each single mount 
> point addition



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16978) RBF: Admin command to support bulk add of mount points

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721008#comment-17721008
 ] 

ASF GitHub Bot commented on HDFS-16978:
---

virajjasani commented on PR #5554:
URL: https://github.com/apache/hadoop/pull/5554#issuecomment-1540534240

   @ayushtkn @goiri @simbadzina latest review comments are addressed




> RBF: Admin command to support bulk add of mount points
> --
>
> Key: HDFS-16978
> URL: https://issues.apache.org/jira/browse/HDFS-16978
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> All state store implementations support adding multiple state store records 
> using a single putAll() implementation. We should provide a new router admin 
> API to support bulk addition of mount table entries that can utilize this 
> bulk add implementation at the state store level.
> For more than one mount point to be added, the goals of bulk addition should be
>  # To reduce frequent router calls
>  # To avoid frequent state store cache refreshes with each single mount 
> point addition



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720934#comment-17720934
 ] 

ASF GitHub Bot commented on HDFS-17005:
---

hadoop-yetus commented on PR #5633:
URL: https://github.com/apache/hadoop/pull/5633#issuecomment-1540132696

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  2s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 59s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  4s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  4s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/1/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   0m 53s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 9 unchanged - 
0 fixed = 14 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  4s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 241m  1s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 353m 44s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestObserverNode |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5633/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5633 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux fa1f8b1b9989 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 
19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4ea58b6d2b9c85aeda038836c5f0e86a975584bb 

[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-09 Thread farmmamba (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

farmmamba updated HDFS-17003:
-
Description: 
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes the 
wrong block to invalidate. This is dangerous behaviour and may cause data loss. 
Some logs from our production cluster are below:

 

NameNode log:
{code:java}
2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
reportBadBlocks for block: 
BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
datanode1:50010

2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
reportBadBlocks for block: 
BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
datanode2:50010{code}
datanode1 log:
{code:java}
2023-05-08 21:23:49,088 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
/data7/hadoop/hdfs/datanode

2023-05-08 21:24:00,509 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to 
delete replica blk_-9223372036848404319_1471186: ReplicaInfo not found.{code}
 

This phenomenon can be reproduced.

  was:
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes the 
wrong block to invalidate. This is dangerous behaviour and may cause data loss. 
Some logs from our production cluster are below:

 

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
reportBadBlocks for block: 
BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on datanode: 
datanode1:50010 {code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-932824627--1680179358678:blk_-9223372036846808880_1669008
 on /data1/hadoop/hdfs/datanode

2023-05-08 14:39:47,338 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to 
delete replica blk_-9223372036846808879_1669008: ReplicaInfo
not found. {code}
 

This phenomenon can be reproduced.


> Erasure coding: invalidate wrong block after reporting bad blocks from 
> datanode
> ---
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: farmmamba
>Priority: Critical
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes 
> the wrong block to invalidate. This is dangerous behaviour and may cause 
> data loss. Some logs from our production cluster are below:
>  
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on datanode: 
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* 
> reportBadBlocks for block: 
> BP-932824627--1680179358678:blk_-9223372036848404319_1471186 on datanode: 
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-932824627--1680179358678:blk_-9223372036848404320_1471186 on 
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not 
> found.{code}
>  
> This phenomenon can be reproduced.
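
The two block IDs in these logs differ only in their lowest bits. Striped (EC) 
block IDs are negative, and HDFS reserves the low four bits of the ID for the 
internal block's index within its block group. A minimal illustrative sketch of 
that encoding (the class and method names are made up for the demo, not the 
actual Hadoop source), run against the IDs from the logs above:
{code:java}
public class StripedBlockIdDemo {
  // Low 4 bits of a striped block ID = index of the internal block
  // within its block group.
  static final long BLOCK_GROUP_INDEX_MASK = 15;

  static long blockGroupId(long internalBlockId) {
    return internalBlockId & ~BLOCK_GROUP_INDEX_MASK;
  }

  static int blockIndex(long internalBlockId) {
    return (int) (internalBlockId & BLOCK_GROUP_INDEX_MASK);
  }

  public static void main(String[] args) {
    long reported    = -9223372036848404320L; // bad block reported by datanode1
    long invalidated = -9223372036848404319L; // block datanode1 failed to delete
    System.out.println(blockGroupId(reported) == blockGroupId(invalidated)); // true
    System.out.println(blockIndex(reported));    // 0
    System.out.println(blockIndex(invalidated)); // 1
  }
}
{code}
So datanode1 reported internal block 0 of the group as bad, yet was asked to 
invalidate internal block 1, which lives on datanode2. That pattern is 
consistent with an index mix-up when the NameNode maps the reported block back 
to a replica.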



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17001) Support getStatus API in WebHDFS

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720885#comment-17720885
 ] 

ASF GitHub Bot commented on HDFS-17001:
---

zhtttylz commented on PR #5628:
URL: https://github.com/apache/hadoop/pull/5628#issuecomment-1539812793

   @ayushtkn @slfan1989 Would you be so kind as to review my pull request, 
please?   Thank you very much!




> Support getStatus API in WebHDFS
> 
>
> Key: HDFS-17001
> URL: https://issues.apache.org/jira/browse/HDFS-17001
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Assignee: Hualong Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-05-08-14-34-51-873.png
>
>
> WebHDFS should support getStatus:
> !image-2023-05-08-14-34-51-873.png!
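
For reference, a hedged usage sketch of the new API from the client side. 
FileSystem#getStatus and FsStatus already exist in the Hadoop FileSystem API; 
this PR wires them through WebHDFS (the NameNode URI below is illustrative):
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;
import org.apache.hadoop.fs.Path;

public class WebHdfsGetStatusDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Illustrative WebHDFS endpoint; adjust host/port for your cluster.
    FileSystem fs =
        FileSystem.get(URI.create("webhdfs://namenode:9870/"), conf);
    FsStatus status = fs.getStatus(new Path("/"));
    System.out.printf("capacity=%d used=%d remaining=%d%n",
        status.getCapacity(), status.getUsed(), status.getRemaining());
    fs.close();
  }
}
{code}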



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17001) Support getStatus API in WebHDFS

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720863#comment-17720863
 ] 

ASF GitHub Bot commented on HDFS-17001:
---

hadoop-yetus commented on PR #5628:
URL: https://github.com/apache/hadoop/pull/5628#issuecomment-1539706289

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m 21s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 15s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 51s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   6m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 35s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m  4s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   4m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 59s |  |  the patch passed with JDK 
Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   6m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 38s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 202m 52s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  unit  |  20m 57s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 363m 51s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5628/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5628 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
|
   | uname | Linux 76df79e93db0 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 
19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 343e54bdd321d346a8844bf975a212e6cfacd2da |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
 /usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5628/3/testReport/ |
   

[jira] [Updated] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread Prateek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prateek Agarwal updated HDFS-17005:
---
Status: Patch Available  (was: Open)

> NameJournalStatus JMX is not updated with new JN IP address on JN host change
> -
>
> Key: HDFS-17005
> URL: https://issues.apache.org/jira/browse/HDFS-17005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node
>Affects Versions: 3.3.4, 2.10.2, 2.8.2
>Reporter: Prateek Agarwal
>Priority: Major
>  Labels: pull-request-available
>
> Whenever a JournalNode host gets replaced, 'org.apache.hadoop.ipc.Client' is 
> able to pick up the new JournalNode IP by re-resolving the JN hostname.
> However, the JMX metrics exposed by the NameNode still report the stale JN IP 
> address. For example:
> {code}
> "NameJournalStatus" : "[{\"manager\":\"QJM to [10.92.29.151:8485, 
> 10.94.17.167:8485, 10.92.59.158:8485]\",\"stream\":\"Writing segment 
> beginning at txid 93886612. \\n10.92.29.151:8485 (Written txid 93894676), 
> 10.94.17.167:8485 (Written txid 93894674 (2 txns/1ms behind)), 
> 10.92.59.158:8485 (Written txid 
> 93894676)\",\"disabled\":\"false\",\"required\":\"true\"},{\"manager\":\"FileJournalManager(root=/data/1/dfs/name-data)\",\"stream\":\"EditLogFileOutputStream(/data/1/dfs/name-data/current/edits_inprogress_00093886612)\",\"disabled\":\"false\",\"required\":\"false\"}]",
> {code}
> The IP address '10.92.29.151' isn't updated even after the JN host resolves 
> to a new IP address, say '10.36.72.221'.
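
The underlying caching behaviour can be reproduced with the plain JDK: a 
java.net.InetSocketAddress resolves its host once at construction and then 
keeps the cached IP, so any status string built from a cached address keeps 
showing the old IP until the address object is re-created. A minimal sketch 
(the hostname is illustrative; the QJM-specific plumbing is what the PR 
changes):
{code:java}
import java.net.InetSocketAddress;

public class StaleJnAddressDemo {
  public static void main(String[] args) {
    // Resolves the JN hostname once and caches the resulting IP.
    InetSocketAddress addr = new InetSocketAddress("jn1.example.com", 8485);
    System.out.println(addr); // e.g. jn1.example.com/10.92.29.151:8485

    // If the host's DNS record later changes, 'addr' is stale.
    // Re-creating the address forces a fresh DNS lookup:
    InetSocketAddress refreshed =
        new InetSocketAddress(addr.getHostName(), addr.getPort());
    System.out.println(refreshed); // reflects the new IP, e.g. 10.36.72.221
  }
}
{code}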



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17005:
--
Labels: pull-request-available  (was: )

> NameJournalStatus JMX is not updated with new JN IP address on JN host change
> -
>
> Key: HDFS-17005
> URL: https://issues.apache.org/jira/browse/HDFS-17005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node
>Affects Versions: 2.8.2, 2.10.2, 3.3.4
>Reporter: Prateek Agarwal
>Priority: Major
>  Labels: pull-request-available
>
> Whenever a JournalNode host gets replaced, 'org.apache.hadoop.ipc.Client' is 
> able to pick up the new JournalNode IP by re-resolving the JN hostname.
> However, the JMX metrics exposed by the NameNode still report the stale JN IP 
> address. For example:
> {code}
> "NameJournalStatus" : "[{\"manager\":\"QJM to [10.92.29.151:8485, 
> 10.94.17.167:8485, 10.92.59.158:8485]\",\"stream\":\"Writing segment 
> beginning at txid 93886612. \\n10.92.29.151:8485 (Written txid 93894676), 
> 10.94.17.167:8485 (Written txid 93894674 (2 txns/1ms behind)), 
> 10.92.59.158:8485 (Written txid 
> 93894676)\",\"disabled\":\"false\",\"required\":\"true\"},{\"manager\":\"FileJournalManager(root=/data/1/dfs/name-data)\",\"stream\":\"EditLogFileOutputStream(/data/1/dfs/name-data/current/edits_inprogress_00093886612)\",\"disabled\":\"false\",\"required\":\"false\"}]",
> {code}
> The IP address '10.92.29.151' isn't updated even after the JN host resolves 
> to a new IP address, say '10.36.72.221'.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720830#comment-17720830
 ] 

ASF GitHub Bot commented on HDFS-17005:
---

prat0318 opened a new pull request, #5633:
URL: https://github.com/apache/hadoop/pull/5633

   
   
   ### Description of PR
   Update JMX JN IP address when JN host changes
   
   ### How was this patch tested?
   Unit tests and manual deployment of the build on clusters
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> NameJournalStatus JMX is not updated with new JN IP address on JN host change
> -
>
> Key: HDFS-17005
> URL: https://issues.apache.org/jira/browse/HDFS-17005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node
>Affects Versions: 2.8.2, 2.10.2, 3.3.4
>Reporter: Prateek Agarwal
>Priority: Major
>
> Whenever a JournalNode host gets replaced, 'org.apache.hadoop.ipc.Client' is 
> able to pick up the new JournalNode IP by re-resolving the JN hostname.
> However, the JMX metrics exposed by the NameNode still report the stale JN IP 
> address. For example:
> {code}
> "NameJournalStatus" : "[{\"manager\":\"QJM to [10.92.29.151:8485, 
> 10.94.17.167:8485, 10.92.59.158:8485]\",\"stream\":\"Writing segment 
> beginning at txid 93886612. \\n10.92.29.151:8485 (Written txid 93894676), 
> 10.94.17.167:8485 (Written txid 93894674 (2 txns/1ms behind)), 
> 10.92.59.158:8485 (Written txid 
> 93894676)\",\"disabled\":\"false\",\"required\":\"true\"},{\"manager\":\"FileJournalManager(root=/data/1/dfs/name-data)\",\"stream\":\"EditLogFileOutputStream(/data/1/dfs/name-data/current/edits_inprogress_00093886612)\",\"disabled\":\"false\",\"required\":\"false\"}]",
> {code}
> The IP address '10.92.29.151' isn't updated even after the JN host resolves 
> to a new IP address, say '10.36.72.221'.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-17005) NameJournalStatus JMX is not updated with new JN IP address on JN host change

2023-05-09 Thread Prateek Agarwal (Jira)
Prateek Agarwal created HDFS-17005:
--

 Summary: NameJournalStatus JMX is not updated with new JN IP 
address on JN host change
 Key: HDFS-17005
 URL: https://issues.apache.org/jira/browse/HDFS-17005
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: journal-node
Affects Versions: 3.3.4, 2.10.2, 2.8.2
Reporter: Prateek Agarwal


Whenever a JournalNode host gets replaced, 'org.apache.hadoop.ipc.Client' is 
able to pick up the new JournalNode IP by re-resolving the JN hostname.

However, the JMX metrics exposed by the NameNode still report the stale JN IP 
address. For example:
{code}
"NameJournalStatus" : "[{\"manager\":\"QJM to [10.92.29.151:8485, 
10.94.17.167:8485, 10.92.59.158:8485]\",\"stream\":\"Writing segment beginning 
at txid 93886612. \\n10.92.29.151:8485 (Written txid 93894676), 
10.94.17.167:8485 (Written txid 93894674 (2 txns/1ms behind)), 
10.92.59.158:8485 (Written txid 
93894676)\",\"disabled\":\"false\",\"required\":\"true\"},{\"manager\":\"FileJournalManager(root=/data/1/dfs/name-data)\",\"stream\":\"EditLogFileOutputStream(/data/1/dfs/name-data/current/edits_inprogress_00093886612)\",\"disabled\":\"false\",\"required\":\"false\"}]",
{code}

The IP address '10.92.29.151' isn't updated even after the JN host resolves to 
a new IP address, say '10.36.72.221'.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org