[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=676130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676130
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 02:01
Start Date: 04/Nov/21 02:01
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676130)
Time Spent: 6.5h  (was: 6h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupted, and the
> block meta (checksum) cannot detect the corruption in some cases, such as EC
> reconstruction; related issues are HDFS-14768, HDFS-15186 and HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check an erasure coded file
> for data corruption in any of its block groups under conditions other than
> EC reconstruction, or when the HDFS-15759 feature (validation during EC
> reconstruction) is not enabled (it is disabled by default).
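
The tool this issue adds is exposed through the `hdfs debug` admin command; a minimal usage sketch against a live cluster (the file path is hypothetical; the success line is the message the PR's tests check for):

```
$ hdfs debug verifyEC -file /ec/myfile
All EC block group status: OK
```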



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=676007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676007
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:49
Start Date: 04/Nov/21 01:49
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741582271



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##########
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 60000)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+    cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    cluster.waitActive();
     assertTrue(runCmd(new String[] {
         "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
         "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 60000)
+  public void testVerifyECCommand() throws Exception {
+    final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+        SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+    cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+    cluster.waitActive();
+    DistributedFileSystem fs = cluster.getFileSystem();
+
+    assertEquals("ret: 1, verifyEC -file <file>  Verify HDFS erasure coding on " +
+        "all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+    assertEquals("ret: 1, File /bar does not exist.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    fs.create(new Path("/bar")).close();
+    assertEquals("ret: 1, File /bar is not erasure coded.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    final Path ecDir = new Path("/ec");
+    fs.mkdir(ecDir, FsPermission.getDirDefault());
+    fs.enableErasureCodingPolicy(ecPolicy.getName());
+    fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+    assertEquals("ret: 1, File /ec is not a regular file.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+    fs.create(new Path(ecDir, "foo"));
+    assertEquals("ret: 1, File /ec/foo is not closed.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+    final short repl = 1;
+    final long k = 1024;
+    final long m = k * k;
+    final long seed = 0x1234567L;
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+        .contains("All EC block group status: OK"));

Review comment:
   Thanks, that's good advice; updated.
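
The file sizes exercised in the quoted test map onto different EC block-group shapes (sub-cell, single cell, partial stripe, full stripe, multi-stripe). As a hedged sketch of the arithmetic, assuming the RS(3,2) policy's default 1 MiB cell size (the `CELL` and `DATA_UNITS` names are illustrative, not from the patch):

```shell
# Compute how many cells and stripes each test file size occupies
# for an RS(3,2) layout with 1 MiB cells.
CELL=$((1024 * 1024))   # default EC cell size for RS-3-2-1024k
DATA_UNITS=3            # data units per stripe in RS(3,2)
for SIZE in 65535 $((256 * 1024)) $CELL $((2 * CELL)) $((3 * CELL)) $((5 * CELL)); do
  CELLS=$(( (SIZE + CELL - 1) / CELL ))                 # ceil(size / cell)
  STRIPES=$(( (CELLS + DATA_UNITS - 1) / DATA_UNITS ))  # ceil(cells / data units)
  echo "size=$SIZE cells=$CELLS stripes=$STRIPES"
done
```

For example, the 2 MiB file fills two of the three data cells of one stripe (a partial stripe), while the 5 MiB file spills into a second stripe.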






Issue Time Tracking
---

Worklog Id: (was: 676007)
Time Spent: 6h 20m  (was: 6h 10m)


[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675976
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:46
Start Date: 04/Nov/21 01:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958791127








Issue Time Tracking
---

Worklog Id: (was: 675976)
Time Spent: 6h 10m  (was: 6h)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675747&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675747
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:24
Start Date: 04/Nov/21 01:24
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958887599








Issue Time Tracking
---

Worklog Id: (was: 675747)
Time Spent: 6h  (was: 5h 50m)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675726
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:22
Start Date: 04/Nov/21 01:22
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958610440








Issue Time Tracking
---

Worklog Id: (was: 675726)
Time Spent: 5h 50m  (was: 5h 40m)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675606&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675606
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:07
Start Date: 04/Nov/21 01:07
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741582271









Issue Time Tracking
---

Worklog Id: (was: 675606)
Time Spent: 5h 40m  (was: 5.5h)


[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675587&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675587
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:05
Start Date: 04/Nov/21 01:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958791127








Issue Time Tracking
---

Worklog Id: (was: 675587)
Time Spent: 5.5h  (was: 5h 20m)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675499
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:56
Start Date: 04/Nov/21 00:56
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593








Issue Time Tracking
---

Worklog Id: (was: 675499)
Time Spent: 5h 20m  (was: 5h 10m)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675250
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:32
Start Date: 04/Nov/21 00:32
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958887599








Issue Time Tracking
---

Worklog Id: (was: 675250)
Time Spent: 5h 10m  (was: 5h)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675220
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:27
Start Date: 04/Nov/21 00:27
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958610440








Issue Time Tracking
---

Worklog Id: (was: 675220)
Time Spent: 5h  (was: 4h 50m)







[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675094
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:10
Start Date: 04/Nov/21 00:10
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741582271









Issue Time Tracking
---

Worklog Id: (was: 675094)
Time Spent: 4h 50m  (was: 4h 40m)


[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=675074&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675074
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:08
Start Date: 04/Nov/21 00:08
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958791127








Issue Time Tracking
---

Worklog Id: (was: 675074)
Time Spent: 4h 40m  (was: 4.5h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block
> metadata (checksum) cannot detect the corruption in some cases, such as EC
> reconstruction; related issues: HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group
> of an erasure coded file has data corruption arising from conditions other
> than EC reconstruction, or when HDFS-15759 (validation during EC
> reconstruction) is not enabled (it is disabled by default).
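The failure mode described above, and the check a verification tool performs, can be illustrated with a toy sketch. This is not Hadoop's implementation (the real tool streams block groups through block readers and a `RawErasureDecoder`); it uses simple XOR parity as a stand-in for Reed-Solomon, and the class and method names are made up. The idea: recompute parity from the data units and compare it with the stored parity, which flags a bad block group even when each block's own checksum matches its (wrong) contents.

```java
import java.util.Arrays;

// Toy model of EC verification. With XOR parity (a stand-in for
// Reed-Solomon), a block group is consistent only if the parity unit
// equals the XOR of all data units. After a faulty reconstruction each
// block's checksum can still match its own (wrong) data, but this
// cross-block check fails.
public class EcVerifySketch {
  static byte[] xorParity(byte[][] dataUnits) {
    byte[] parity = new byte[dataUnits[0].length];
    for (byte[] unit : dataUnits) {
      for (int i = 0; i < parity.length; i++) {
        parity[i] ^= unit[i];
      }
    }
    return parity;
  }

  static boolean verifyGroup(byte[][] dataUnits, byte[] storedParity) {
    // Recompute parity from the data units and compare with what is stored.
    return Arrays.equals(xorParity(dataUnits), storedParity);
  }

  public static void main(String[] args) {
    byte[][] data = { {1, 2, 3}, {4, 5, 6} };
    byte[] parity = xorParity(data);                // consistent group
    System.out.println(verifyGroup(data, parity));  // true
    data[0][1] ^= 0x7f;                             // silently corrupt one data unit
    System.out.println(verifyGroup(data, parity));  // false
  }
}
```

The same principle scales to RS(d, p): decode/encode the stripe from the data units and compare each recomputed parity unit against the stored one.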






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674926
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 20:20
Start Date: 03/Nov/21 20:20
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593


   




Issue Time Tracking
---

Worklog Id: (was: 674926)
Time Spent: 4.5h  (was: 4h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674870&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674870
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 19:18
Start Date: 03/Nov/21 19:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-959845514


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 1m 2s |  | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s |  | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s |  | codespell was not available. |
   | +0 :ok: | markdownlint | 0m 0s |  | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s |  | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s |  | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 34m 29s |  | trunk passed |
   | +1 :green_heart: | compile | 1m 23s |  | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | compile | 1m 17s |  | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | checkstyle | 0m 58s |  | trunk passed |
   | +1 :green_heart: | mvnsite | 1m 23s |  | trunk passed |
   | +1 :green_heart: | javadoc | 0m 57s |  | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 1m 26s |  | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 3m 16s |  | trunk passed |
   | +1 :green_heart: | shadedclient | 24m 37s |  | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 1m 14s |  | the patch passed |
   | +1 :green_heart: | compile | 1m 21s |  | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javac | 1m 21s |  | the patch passed |
   | +1 :green_heart: | compile | 1m 11s |  | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | javac | 1m 11s |  | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s |  | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 0m 52s |  | the patch passed |
   | +1 :green_heart: | mvnsite | 1m 15s |  | the patch passed |
   | +1 :green_heart: | javadoc | 0m 49s |  | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 1m 20s |  | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 3m 22s |  | the patch passed |
   | +1 :green_heart: | shadedclient | 25m 1s |  | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | -1 :x: | unit | 349m 27s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 37s |  | The patch does not generate ASF License warnings. |
   |  |  | 454m 41s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   |  | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithShortCircuitRead |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint |
   | uname | Linux 1298076a1247 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 51e61547d07d9a0c236b89e5b804aaa8f362f28d |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Test Results | 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674615
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 13:52
Start Date: 03/Nov/21 13:52
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-959130953


   Thanks, looks good. I will commit when the CI checks come back.




Issue Time Tracking
---

Worklog Id: (was: 674615)
Time Spent: 4h 10m  (was: 4h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674557&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674557
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 11:44
Start Date: 03/Nov/21 11:44
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958953019


   @sodonnel Thanks, documentation file `HDFSCommands.md` is updated.




Issue Time Tracking
---

Worklog Id: (was: 674557)
Time Spent: 4h  (was: 3h 50m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674524
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 10:45
Start Date: 03/Nov/21 10:45
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958887599


   @cndaimin I was about to commit this, and I remembered we should update the 
documentation to include this command. The documentation is in a markdown file 
and gets published with the release, like here:
   
   
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#Debug_Commands
   
   That page is generated from:
   
   ```
   hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
   ```
   
   Would you be able to add a section for this new command under the 
Debug_Commands section please?
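For reference, a minimal Debug Commands entry assembled from the usage strings in this patch might look like the following; the exact wording that landed in `HDFSCommands.md` may differ, and the option description here is illustrative.

```
### `verifyEC`

Usage: `hdfs debug verifyEC -file <file>`

| COMMAND_OPTION | Description |
|:---- |:---- |
| `-file` *file* | The path of the erasure coded file to verify. |

Verify HDFS erasure coding on all block groups of the file.
```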




Issue Time Tracking
---

Worklog Id: (was: 674524)
Time Spent: 3h 50m  (was: 3h 40m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674488&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674488
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 09:48
Start Date: 03/Nov/21 09:48
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958791127


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 52s |  | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s |  | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s |  | codespell was not available. |
   | +1 :green_heart: | @author | 0m 0s |  | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s |  | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 35m 22s |  | trunk passed |
   | +1 :green_heart: | compile | 1m 32s |  | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | compile | 1m 17s |  | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | checkstyle | 0m 59s |  | trunk passed |
   | +1 :green_heart: | mvnsite | 1m 35s |  | trunk passed |
   | +1 :green_heart: | javadoc | 0m 59s |  | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 1m 25s |  | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 3m 41s |  | trunk passed |
   | +1 :green_heart: | shadedclient | 25m 47s |  | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 1m 18s |  | the patch passed |
   | +1 :green_heart: | compile | 1m 19s |  | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javac | 1m 19s |  | the patch passed |
   | +1 :green_heart: | compile | 1m 9s |  | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | javac | 1m 9s |  | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s |  | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 0m 52s |  | the patch passed |
   | +1 :green_heart: | mvnsite | 1m 16s |  | the patch passed |
   | +1 :green_heart: | javadoc | 0m 48s |  | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 1m 18s |  | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 3m 22s |  | the patch passed |
   | +1 :green_heart: | shadedclient | 24m 50s |  | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 324m 13s |  | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 39s |  | The patch does not generate ASF License warnings. |
   |  |  | 431m 32s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 04f9538a1b9b 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 21c1887fd7d0ede169c42e11b0c793c717dc7c47 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/3/testReport/ |
   | Max. process+thread count | 1996 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674341
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 02:44
Start Date: 03/Nov/21 02:44
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958610440


   @sodonnel Thanks for your review.
   Update: removed the unused import and added a test that verifies a file with 2 block groups.




Issue Time Tracking
---

Worklog Id: (was: 674341)
Time Spent: 3.5h  (was: 3h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674337
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 02:38
Start Date: 03/Nov/21 02:38
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741582271



##
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 60000)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+    cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    cluster.waitActive();
     assertTrue(runCmd(new String[] {
         "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
         "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 60000)
+  public void testVerifyECCommand() throws Exception {
+    final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+        SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+    cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+    cluster.waitActive();
+    DistributedFileSystem fs = cluster.getFileSystem();
+
+    assertEquals("ret: 1, verifyEC -file <file>  Verify HDFS erasure coding on " +
+        "all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+    assertEquals("ret: 1, File /bar does not exist.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    fs.create(new Path("/bar")).close();
+    assertEquals("ret: 1, File /bar is not erasure coded.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    final Path ecDir = new Path("/ec");
+    fs.mkdir(ecDir, FsPermission.getDirDefault());
+    fs.enableErasureCodingPolicy(ecPolicy.getName());
+    fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+    assertEquals("ret: 1, File /ec is not a regular file.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+    fs.create(new Path(ecDir, "foo"));
+    assertEquals("ret: 1, File /ec/foo is not closed.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+    final short repl = 1;
+    final long k = 1024;
+    final long m = k * k;
+    final long seed = 0x1234567L;
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+        .contains("All EC block group status: OK"));
+

Review comment:
   Thanks, that's good advice; updated.






Issue Time Tracking
---

Worklog Id: (was: 674337)
Time Spent: 3h 20m  (was: 3h 10m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674054
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 60000)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+    cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    cluster.waitActive();
     assertTrue(runCmd(new String[] {
         "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
         "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 60000)
+  public void testVerifyECCommand() throws Exception {
+    final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+        SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+    cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+    cluster.waitActive();
+    DistributedFileSystem fs = cluster.getFileSystem();
+
+    assertEquals("ret: 1, verifyEC -file <file>  Verify HDFS erasure coding on " +
+        "all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+    assertEquals("ret: 1, File /bar does not exist.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    fs.create(new Path("/bar")).close();
+    assertEquals("ret: 1, File /bar is not erasure coded.",
+        runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+    final Path ecDir = new Path("/ec");
+    fs.mkdir(ecDir, FsPermission.getDirDefault());
+    fs.enableErasureCodingPolicy(ecPolicy.getName());
+    fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+    assertEquals("ret: 1, File /ec is not a regular file.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+    fs.create(new Path(ecDir, "foo"));
+    assertEquals("ret: 1, File /ec/foo is not closed.",
+        runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+    final short repl = 1;
+    final long k = 1024;
+    final long m = k * k;
+    final long seed = 0x1234567L;
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+        .contains("All EC block group status: OK"));
+    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+        .contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block groups, so we test the command looping over more than 1 block? You are using EC 3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 block groups, with a length of 3MB each. Each block would then have a single 1MB EC chunk in it.
   
   In `DFSTestUtil` there is a method to pass the blocksize already, so the test would be almost the same as the ones above:
   
   ```
     public static void createFile(FileSystem fs, Path fileName, int bufferLen,
         long fileLen, long blockSize, short replFactor, long seed)
   ```
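The block-group arithmetic in the comment above can be checked with a small sketch (a hypothetical helper, not a Hadoop API): for an RS(d, p) policy with block size B, each block group carries d * B bytes of file data, so a 6 MiB file under RS-3-2 with a 1 MiB block size spans 2 block groups of 3 MiB data each.

```java
// Hypothetical arithmetic helper illustrating the layout claim above:
// number of EC block groups = ceil(fileLen / (dataUnits * blockSize)).
public class EcLayoutMath {
  static long blockGroups(long fileLen, int dataUnits, long blockSize) {
    long groupDataSize = dataUnits * blockSize;           // 3 MiB for RS-3-2 at 1 MiB blocks
    return (fileLen + groupDataSize - 1) / groupDataSize; // ceiling division
  }

  public static void main(String[] args) {
    final long MB = 1024 * 1024;
    // 6 MiB file, RS-3-2, 1 MiB block size => 2 block groups
    System.out.println(blockGroups(6 * MB, 3, MB)); // prints 2
  }
}
```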






Issue Time Tracking
---

Worklog Id: (was: 674054)
Time Spent: 2h 50m  (was: 2h 40m)

> Debug tool to verify the correctness of erasure coding on file
> 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674089
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:39
Start Date: 02/Nov/21 21:39
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 
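
The `blockNumExpected` computation in the quoted hunk (truncated above) decides 
how many blocks a group must report before verification proceeds: a group 
shorter than a full stripe stores data in fewer than `dataBlkNum` blocks, but 
always carries all parity blocks. A standalone sketch of that formula, with 
RS-3-2 and a 1MB cell size assumed for illustration (the class name is 
hypothetical):

```java
public class ExpectedBlocksSketch {
    // Mirrors the blockNumExpected formula from the quoted patch.
    static int expectedBlocks(long blockGroupSize, int dataBlkNum,
                              int parityBlkNum, int cellSize) {
        // Number of cells needed for the data, capped at dataBlkNum,
        // plus every parity block.
        return Math.min(dataBlkNum, (int) ((blockGroupSize - 1) / cellSize + 1))
            + parityBlkNum;
    }

    public static void main(String[] args) {
        int cell = 1 << 20;  // 1 MB cell size (assumed)
        System.out.println(expectedBlocks(65535, 3, 2, cell));     // 1 data + 2 parity = 3
        System.out.println(expectedBlocks(3L << 20, 3, 2, cell));  // 3 data + 2 parity = 5
        System.out.println(expectedBlocks(5L << 20, 3, 2, cell));  // capped at 3 data + 2 = 5
    }
}
```

A block group reporting fewer indices than this value is flagged as 
under-erasure-coded by the check above.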

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674056
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:35
Start Date: 02/Nov/21 21:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674046
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have addressed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.




Issue Time Tracking
---

Worklog Id: (was: 674046)
Time Spent: 2h 40m  (was: 2.5h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupted, and in some 
> cases, such as EC reconstruction, the block metadata (checksum) cannot detect 
> the corruption; related issues are HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is disabled (it is off by default).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673905=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673905
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:19
Start Date: 02/Nov/21 21:19
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - There is just one style issue detected and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.




Issue Time Tracking
---

Worklog Id: (was: 673905)
Time Spent: 2.5h  (was: 2h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673661=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673661
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:22
Start Date: 02/Nov/21 18:22
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have addressed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.




Issue Time Tracking
---

Worklog Id: (was: 673661)
Time Spent: 2h 20m  (was: 2h 10m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673584
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:14
Start Date: 02/Nov/21 18:14
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673512=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673512
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:05
Start Date: 02/Nov/21 18:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673509
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:05
Start Date: 02/Nov/21 18:05
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 6)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 6)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file <file>  Verify HDFS erasure coding on " +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block 
groups, so we test the command looping over more than one block group? You are 
using EC 3-2, so write a 6MB file with a 1MB block size. That should create 2 
block groups, each holding 3MB of data. Each block would then contain a single 
1MB EC chunk. 
   
   In `DFSTestUtil` there is already a method that accepts the block size, so 
the test would be almost the same as the ones above:
   
   ```
 public static void createFile(FileSystem fs, Path fileName, int bufferLen,
 long fileLen, long blockSize, short replFactor, long seed)
   ```






Issue Time Tracking
---

Worklog Id: (was: 673509)
Time Spent: 1h 50m  (was: 1h 40m)

> Debug tool to verify the correctness of erasure coding on file
> 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673320=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673320
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:46
Start Date: 02/Nov/21 17:46
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - There is just one style issue detected and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.




Issue Time Tracking
---

Worklog Id: (was: 673320)
Time Spent: 1h 40m  (was: 1.5h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) cannot detect the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186 and HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).
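For context, the HDFS-15759 validation referenced in the description is gated by a DataNode setting; the key name below is taken from that JIRA and should be treated as an assumption to verify against your release. A sketch of enabling it in hdfs-site.xml:

```xml
<!-- Hypothetical illustration (key name assumed from HDFS-15759):
     enable on-the-fly validation of EC reconstruction output,
     which the description above notes is off by default. -->
<property>
  <name>dfs.datanode.ec.reconstruction.validation</name>
  <value>true</value>
</property>
```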



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673293
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:16
Start Date: 02/Nov/21 17:16
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - There is just one style issue detected and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.




Issue Time Tracking
---

Worklog Id: (was: 673293)
Time Spent: 1.5h  (was: 1h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) cannot detect the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186 and HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673291
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:15
Start Date: 02/Nov/21 17:15
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 60000)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 60000)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file <file>  Verify HDFS erasure coding on " +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block 
groups, so we test the command looping over more than 1 block? You are using EC 
3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 
block groups, with a length of 3MB each. Each block would then have a single 
1MB EC chunk in it. 
   
   In `DFSTestUtil` there is a method to pass the blocksize already, so the 
test would be almost the same as the ones above:
   
   ```
 public static void createFile(FileSystem fs, Path fileName, int bufferLen,
 long fileLen, long blockSize, short replFactor, long seed)
   ```
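The sizing arithmetic behind this suggestion can be sanity-checked with a small standalone sketch (plain Java, no HDFS dependencies). The RS-3-2 parameters, the 1 MB block size and the 6 MB file length are taken from the review comment above; the class and method names are illustrative only:

```java
public class EcLayoutMath {
  // Bytes of file data one striped block group can hold:
  // dataUnits logical blocks of blockSize bytes each.
  public static long groupCapacity(int dataUnits, long blockSize) {
    return dataUnits * blockSize;
  }

  // Number of block groups needed for a file of fileLen bytes
  // (ceiling division by the group capacity).
  public static long blockGroups(long fileLen, int dataUnits, long blockSize) {
    long cap = groupCapacity(dataUnits, blockSize);
    return (fileLen + cap - 1) / cap;
  }

  public static void main(String[] args) {
    final long MB = 1024L * 1024L;
    // Reviewer's scenario: RS-3-2 (3 data units), 1 MB block size, 6 MB file.
    System.out.println("block groups = " + blockGroups(6 * MB, 3, MB)); // 2
    System.out.println("bytes per group = " + groupCapacity(3, MB));    // 3 MB
  }
}
```

This reproduces the reviewer's claim: a 6 MB file with a 1 MB block size under a 3-data-unit policy yields 2 block groups of 3 MB of data each.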






Issue Time Tracking
---

Worklog Id: (was: 673291)
Time Spent: 1h 20m  (was: 1h 10m)

> Debug tool to verify the correctness of erasure coding on file
> 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673280
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 16:53
Start Date: 02/Nov/21 16:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673053
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:44
Start Date: 02/Nov/21 09:44
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have fixed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.




Issue Time Tracking
---

Worklog Id: (was: 673053)
Time Spent: 1h  (was: 50m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) cannot detect the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186 and HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).






[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673051
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:39
Start Date: 02/Nov/21 09:39
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740874125



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService<Integer> readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 

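The under-erasure-coded check quoted above (cut off by the mail archiver) rests on one formula: a group needs min(dataBlkNum, ceil(blockSize / cellSize)) data blocks plus all parity blocks. A minimal standalone sketch of that arithmetic, with RS-3-2 and a 1 MB cell size assumed from the surrounding discussion (class and method names are illustrative):

```java
public class ExpectedBlockNum {
  // Mirrors the blockNumExpected computation in VerifyECCommand#verifyBlockGroup:
  // small groups occupy fewer data cells than dataBlkNum, but every group
  // always carries the full set of parity blocks.
  public static int expected(long groupSize, int dataBlkNum, int parityBlkNum,
      int cellSize) {
    // ceil(groupSize / cellSize), written as in the quoted code.
    int dataCells = (int) ((groupSize - 1) / cellSize + 1);
    return Math.min(dataBlkNum, dataCells) + parityBlkNum;
  }

  public static void main(String[] args) {
    final int MB = 1024 * 1024;
    // Full 3 MB group under RS-3-2: 3 data + 2 parity blocks expected.
    System.out.println(expected(3L * MB, 3, 2, MB)); // 5
    // Tiny 65535-byte group: a single data cell, still 2 parity blocks.
    System.out.println(expected(65535L, 3, 2, MB));  // 3
  }
}
```

If `blockGroup.getBlockIndices().length` falls below this value, the quoted code reports the group as under-erasure-coded.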
[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673033=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673033
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:05
Start Date: 02/Nov/21 09:05
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService<Integer> readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=672650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672650
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 01/Nov/21 12:54
Start Date: 01/Nov/21 12:54
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740188593



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService<Integer> readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=672649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672649
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 01/Nov/21 12:52
Start Date: 01/Nov/21 12:52
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740187297



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List<String> args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService<Integer> readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file <file>",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+    System.err.println("File " + file + " does not exist.");
+    return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+    System.err.println("File " + file + " is not a regular file.");
+    return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+    System.err.println("File " + file + " is not closed.");
+    return 1;
+  }
+  this.useDNHostname = getConf().getBoolean(
+      DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+      DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+      DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+      DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+    System.err.println("File " + file + " is not erasure coded.");
+    return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), ecPolicy.getCodecName(),
+      new ErasureCoderOptions(
+          ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+      DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+          new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+    System.out.println("Checking EC block group: blk_" + locatedBlock.getBlock().getBlockId());
+    LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+    try {
+      verifyBlockGroup(blockGroup);
+      System.out.println("Status: OK");
+    } catch (Exception e) {
+      System.err.println("Status: ERROR, message: " + e.getMessage());
+      return 1;
+    } finally {
+      closeBlockReaders();
+    }
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws Exception {
+  final LocatedBlock[] indexedBlocks = StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+      cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+      (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+    throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long 
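
The under-erasure-coding check in the patch above counts how many internal blocks a striped block group should carry: the data blocks actually spanned by the group's size (a short last group may use fewer than `dataBlkNum` cells), plus all parity blocks. A standalone sketch of that arithmetic, assuming the common RS-6-3-1024k policy (the class and helper names here are illustrative, not part of the patch):

```java
// Illustration of the blockNumExpected computation from verifyBlockGroup().
// Assumes RS-6-3-1024k: 6 data units, 3 parity units, 1 MiB cell size.
public class ExpectedBlockCount {
  // Expected internal blocks = min(dataBlkNum, data cells spanned) + parity.
  static int expectedBlocks(long blockGroupSize, int dataBlkNum,
      int parityBlkNum, int cellSize) {
    // Ceiling division: number of cells needed to hold blockGroupSize bytes.
    int dataCells = (int) ((blockGroupSize - 1) / cellSize + 1);
    return Math.min(dataBlkNum, dataCells) + parityBlkNum;
  }

  public static void main(String[] args) {
    final int data = 6, parity = 3, cell = 1024 * 1024;
    // A full stripe of data (6 MiB) spans all 6 data blocks + 3 parity.
    System.out.println(expectedBlocks(6L * cell, data, parity, cell)); // 9
    // A 1-byte group occupies a single cell: 1 data block + 3 parity.
    System.out.println(expectedBlocks(1L, data, parity, cell)); // 4
  }
}
```

If the block group reports fewer located blocks than this expected count, the tool flags the group as under-erasure-coded rather than attempting to verify it.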

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-10-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=670741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670741
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 27/Oct/21 13:54
Start Date: 27/Oct/21 13:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-952955590


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 29s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 31s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 322m 17s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 426m 19s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ba38ee2e0214 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c33bb018ac0a5a6d4365e98f4e123d263732555f |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/1/testReport/ |
   | Max. process+thread count | 2058 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/1/console |
   | versions |