[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-24 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557731#comment-14557731
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thank you for reviewing, Nicholas!

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma
 Attachments: HDFS-7687.1.patch, HDFS-7687.2.patch, HDFS-7687.3.patch, 
 HDFS-7687.4.patch


 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14554055#comment-14554055
 ] 

Hadoop QA commented on HDFS-7687:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  1s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734400/HDFS-7687.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0e4f108 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11081/console |


This message was automatically generated.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-19 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550079#comment-14550079
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thank you for your help, Jing!



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-18 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549798#comment-14549798
 ] 

Jing Zhao commented on HDFS-7687:
-

FYI, I've merged HDFS-8405 into the feature branch.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544498#comment-14544498
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---


- Question: Do we need {{ErasureCodingResult}}s when we support multiple 
{{ECSchema}}s?

- Some suggestions on the terms:
||Replication||Erasure Coding||
| block | block group |
| replica | ec-block |
| UNDER MIN REPL'D BLOCKS | UNRECOVERABLE BLOCK GROUPS |
| DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY | MIN REQUIRED EC BLOCK# |
| Minimally replicated blocks | Minimally erasure-coded block groups |
| Over-replicated blocks | Over-erasure-coded block groups |
| Under-replicated blocks | Under-erasure-coded block groups |
| Mis-replicated blocks | Unsatisfactory placement block groups |
| Default replication factor | Default schema |
| Average block replication | Average block group size |
| Missing replicas | Missing ec-blocks |
| Decommissioned Replicas | Decommissioned ec-blocks |
| Decommissioning Replicas | Decommissioning ec-blocks |

- It would be good to add two new classes, ReplicationResult and
ErasureCodingResult.  Then we can rename AbstractResult back to Result.
- minReplication should remain final.  The subclasses can initialize it through
the super constructor, i.e.
{code}
  static abstract class Result {
    ...

    final int minReplication;

    Result(int minReplication) {
      this.minReplication = minReplication;
    }

    ...
  }

  @VisibleForTesting
  static class ReplicationResult extends Result {
    final short replication;

    ReplicationResult(Configuration conf) {
      super(conf.getInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY,
                        DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_DEFAULT));
      this.replication = (short)conf.getInt(DFSConfigKeys.DFS_REPLICATION_KEY,
                                            DFSConfigKeys.DFS_REPLICATION_DEFAULT);
    }
    ...
  }

  @VisibleForTesting
  static class ErasureCodingResult extends Result {
    final String ecSchema;

    ErasureCodingResult(Configuration conf) {
      this(ErasureCodingSchemaManager.getSystemDefaultSchema());
    }

    ErasureCodingResult(ECSchema ecSchema) {
      super(ecSchema.getNumDataUnits());
      this.ecSchema = ecSchema.getSchemaName();
    }

    ...
  }
{code}
- The check method can be simplified as below:
{code}
final Result r = file.getReplication() == 0 ? ecRes : replRes;
collectFileSummary(path, file, r, blocks);
if (showprogress && (replRes.totalFiles + ecRes.totalFiles) % 100 == 0) {
  out.println();
  out.flush();
}
collectBlocksSummary(parent, file, r, blocks);
{code}
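
Editor's sketch: the two review points above (a final {{minReplication}} set through the super constructor, and choosing the result object by the file's replication value) can be tried out without Hadoop on the classpath. This is only an illustration under stated assumptions, not the patch: {{FsckResultSketch}}, {{pick}}, the plain-int constructors, and the values 1, 3, 6 and "RS-6-3" are hypothetical stand-ins for what the real classes read from {{Configuration}} and {{ECSchema}}.

```java
// Standalone sketch. The real classes are nested in NamenodeFsck and take
// their values from Configuration and ECSchema; here they are plain arguments.
final class FsckResultSketch {

  abstract static class Result {
    final int minReplication;   // stays final; set once via the super constructor
    long totalFiles;

    Result(int minReplication) {
      this.minReplication = minReplication;
    }
  }

  static final class ReplicationResult extends Result {
    final short replication;

    ReplicationResult(int minReplication, short replication) {
      super(minReplication);
      this.replication = replication;
    }
  }

  static final class ErasureCodingResult extends Result {
    final String ecSchema;

    ErasureCodingResult(String schemaName, int numDataUnits) {
      super(numDataUnits);      // the review passes the schema's data unit count
      this.ecSchema = schemaName;
    }
  }

  // Mirrors "file.getReplication() == 0 ? ecRes : replRes" from the review.
  static Result pick(short fileReplication, ReplicationResult replRes,
      ErasureCodingResult ecRes) {
    return fileReplication == 0 ? ecRes : replRes;
  }

  public static void main(String[] args) {
    ReplicationResult replRes = new ReplicationResult(1, (short) 3);
    ErasureCodingResult ecRes = new ErasureCodingResult("RS-6-3", 6);
    Result r = pick((short) 0, replRes, ecRes);
    r.totalFiles++;
    System.out.println(r.minReplication + " " + ecRes.totalFiles);
  }
}
```

Because both subclasses share the {{Result}} base type, the summary methods only need one code path, which is the point of the simplification suggested in the review.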




[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544513#comment-14544513
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

BTW, this is a bug.  The following
{code}
  res.append("\n  ").append("DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY:\t")
     .append(minReplication);
{code}
should be
{code}
  res.append("\n  ").append(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY + ":\t")
     .append(minReplication);
{code}
Let's fix it in trunk first.  See if you also want to do some code refactoring in
trunk.  Filed HDFS-8405 and assigned it to you.
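
Editor's sketch: the difference between the two forms is easy to miss, so here is a minimal standalone comparison. It assumes the constant's value is "dfs.namenode.replication.min"; {{MinReplicationLabelSketch}}, {{buggy}} and {{fixed}} are hypothetical names, and the local constant is only a stand-in for the one in {{DFSConfigKeys}}.

```java
final class MinReplicationLabelSketch {
  // Stand-in for DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY (assumed value).
  static final String DFS_NAMENODE_REPLICATION_MIN_KEY = "dfs.namenode.replication.min";

  // Buggy form: the constant's Java name sits inside the string literal,
  // so the report prints the identifier instead of the configuration key.
  static String buggy(int minReplication) {
    return new StringBuilder()
        .append("\n  ").append("DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY:\t")
        .append(minReplication)
        .toString();
  }

  // Fixed form: the constant is evaluated and only ":\t" is a literal.
  static String fixed(int minReplication) {
    return new StringBuilder()
        .append("\n  ").append(DFS_NAMENODE_REPLICATION_MIN_KEY + ":\t")
        .append(minReplication)
        .toString();
  }

  public static void main(String[] args) {
    System.out.println(buggy(1));
    System.out.println(fixed(1));
  }
}
```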



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-14 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544801#comment-14544801
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thanks for your detailed review! I will recreate a patch.

# {quote}
Question: Do we need {{ErasureCodingResult}}s when we support multiple 
{{ECSchema}}s?
{quote}
The expected number of ec-blocks is calculated per file in this patch, so I
think one {{ErasureCodingResult}} can handle multiple {{ECSchema}}s. It will
also be able to handle {{EC+Contiguous}}.
# As I mentioned in the last comment, I only used replication terms for
variables. Is that acceptable, or should I define new variables for EC?

And thanks for creating a new JIRA and assigning it to me!
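
Editor's sketch: the argument that a single result object can cover several schemas rests on computing the expected block count per file rather than once per result. A minimal illustration follows, assuming full block groups only; {{PerFileExpectedSketch}}, {{Schema}}, {{EcSummary}} and the (3,2) schema are hypothetical and do not come from the patch.

```java
final class PerFileExpectedSketch {

  // Hypothetical stand-in for an EC schema: data and parity unit counts only.
  static final class Schema {
    final int dataUnits;
    final int parityUnits;

    Schema(int dataUnits, int parityUnits) {
      this.dataUnits = dataUnits;
      this.parityUnits = parityUnits;
    }

    int expectedBlocksPerGroup() {
      return dataUnits + parityUnits;
    }
  }

  // One accumulator shared by all EC files, whatever their schema.
  static final class EcSummary {
    long totalBlockGroups;
    long expectedEcBlocks;
    long missingEcBlocks;

    // The expected count comes from the schema of the file being checked.
    void addGroup(Schema schema, int liveBlocks) {
      int expected = schema.expectedBlocksPerGroup();
      totalBlockGroups++;
      expectedEcBlocks += expected;
      missingEcBlocks += Math.max(0, expected - liveBlocks);
    }
  }

  public static void main(String[] args) {
    EcSummary summary = new EcSummary();
    summary.addGroup(new Schema(6, 3), 8);  // one ec-block missing
    summary.addGroup(new Schema(3, 2), 5);  // complete group, different schema
    System.out.println(summary.totalBlockGroups + " groups, "
        + summary.missingEcBlocks + " missing of " + summary.expectedEcBlocks);
  }
}
```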



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-14 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544842#comment-14544842
 ] 

Takanobu Asanuma commented on HDFS-7687:


OK, I agree with you.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544829#comment-14544829
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

For the variables, let's keep using terms of replication for the moment.  I 
think it will be more confusing if we change them to some abstract names. 



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542993#comment-14542993
 ] 

Hadoop QA commented on HDFS-7687:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12732725/HDFS-7687.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0e85044 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10964/console |


This message was automatically generated.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-12 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539863#comment-14539863
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thank you for your kind review! I'd like to recreate the patch with your advice.

{quote}
Tried to run TestFsck but it fails with NullPointerException as shown below. 
Could you take a look? 
{quote}

Oh, the test passed yesterday, but it fails now. Some recent commits may have
caused the error. I will investigate and create a JIRA if necessary.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-11 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539187#comment-14539187
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

- ECResult and Result have a lot of duplication.  Let's create a base class,
say AbstractResult, and move the shared code there.
- Similarly, collectECBlockGroupsSummary/collectECFileSummary and
collectReplicatedBlocksSummary/collectReplicatedFileSummary also have
duplication.  Let's create some methods to share the code.
- Tried to run TestFsck but it fails with NullPointerException as shown below.
Could you take a look?  It seems that there is a bug in computing quota for EC
files.  If that is the case, could you file a JIRA?
{code}
Tests run: 23, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 68.833 sec  FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFsck
testECFsck(org.apache.hadoop.hdfs.server.namenode.TestFsck)  Time elapsed: 1.912 sec   ERROR!
java.lang.NullPointerException: null
    at org.apache.hadoop.hdfs.server.namenode.QuotaCounts.add(QuotaCounts.java:82)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.computeQuotaUsage(INodeFile.java:665)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:885)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:881)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:866)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:849)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:692)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:996)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:702)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:644)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:809)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:793)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1481)
    at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1114)
    at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:985)
    at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:814)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:471)
    at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:430)
    at org.apache.hadoop.hdfs.server.namenode.TestFsck.testECFsck(TestFsck.java:1673)
{code}



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537619#comment-14537619
 ] 

Hadoop QA commented on HDFS-7687:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 27s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac |   1m 34s | The patch appears to cause the build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12731849/HDFS-7687.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4536399 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10907/console |


This message was automatically generated.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-10 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537535#comment-14537535
 ] 

Takanobu Asanuma commented on HDFS-7687:


Additional comments for HDFS-7687.2.patch:
# I created {{NamenodeFsck.ECResult}}, modeled on {{NamenodeFsck.Result}}.
{{NamenodeFsck.ECResult}} is for EC, and {{NamenodeFsck.Result}} is for
replication.
# {{totalDirs}} and {{totalSymlinks}} are common to EC and replication in the
result, so I moved them to variables of {{NamenodeFsck}}.
# I renamed {{collectFileSummary}} to {{collectReplicatedFileSummary}}, and
{{collectBlocksSummary}} to {{collectReplicatedBlocksSummary}}. I only renamed
the methods here, so if there are changes in trunk, they can be merged easily.
# I use EC terminology in some places instead of replication terminology, as
follows:
## ReplicatedBlock(s) -> BlockGroup(s)
## Replica(s) -> InternalBlock(s)
## totalReplicasPerBlock -> totalInternalBlocksPerGroup
# I have a concern that these are fairly large changes in the output. Should we
mark it as incompatible?

Please review my patch and consider my concern. If there is no problem, I will
add more unit tests. Thank you.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-05-08 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14534140#comment-14534140
 ] 

Takanobu Asanuma commented on HDFS-7687:


I'm sorry that my initial patch is too complicated and easily affected by
changes in trunk.
I'm now implementing the code in a simpler way. I will be able to submit a
better patch by next Monday.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-30 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521322#comment-14521322
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thank you for the information.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518265#comment-14518265
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

Both JIRAs are merged to the branch now.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-27 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513637#comment-14513637
 ] 

Takanobu Asanuma commented on HDFS-7687:


This ticket now depends on some commits in trunk, such as HDFS-7993,
HDFS-8215 and so on. I will submit patches after those commits are merged into
HDFS-7285.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505467#comment-14505467
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

For #1, see if you want to create a JIRA for trunk to do some refactoring first.

For #2, you may include the test here or in a separate JIRA.  Both are fine.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505463#comment-14505463
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

The items look good.  Just a minor point: A Corrupt EC block group could have
>= 6 blocks but some of the blocks are corrupted.

bq. ... in (6,3)-Reed-Solomon, these groups have more than 9 blocks. (Are there such cases?)

Yes, it is possible.  E.g. a datanode D0 dies and an EC block in D0 is
reconstructed in another datanode D1. Later on, D0 comes back.  Then, both D0
and D1 have the same EC block and the block group could have more than 9 blocks.
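
Editor's sketch: the D0/D1 scenario can be replayed with a small map from block index to the datanodes reporting a copy. This only illustrates the counting and is not how the namenode tracks block groups; {{OverStoredGroupSketch}}, its methods, and the node names N1..N8 are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OverStoredGroupSketch {
  // Block index within the group -> datanodes that report a copy of it.
  private final Map<Integer, List<String>> locations = new HashMap<>();

  void report(int blockIndex, String datanode) {
    locations.computeIfAbsent(blockIndex, k -> new ArrayList<>()).add(datanode);
  }

  // Number of different block indices that have at least one copy.
  int distinctBlocks() {
    return locations.size();
  }

  // Total number of stored copies, counting duplicates of the same index.
  int storedCopies() {
    int copies = 0;
    for (List<String> nodes : locations.values()) {
      copies += nodes.size();
    }
    return copies;
  }

  public static void main(String[] args) {
    OverStoredGroupSketch group = new OverStoredGroupSketch();
    group.report(0, "D0");                  // block 0 originally lives on D0
    for (int i = 1; i < 9; i++) {
      group.report(i, "N" + i);             // the other 8 blocks of a (6,3) group
    }
    group.report(0, "D1");                  // D0 died, block 0 rebuilt on D1, D0 returned
    System.out.println(group.distinctBlocks() + " distinct, "
        + group.storedCopies() + " stored");
  }
}
```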



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504914#comment-14504914
 ] 

Takanobu Asanuma commented on HDFS-7687:


Sorry for the delay, [~szetszwo].
I'm mainly changing the code in {{NamenodeFsck.check}} to handle EC, and I'm
going to add some metrics for EC, modeled on the replication ones. Could you
please check these metrics?

{{Total EC block groups}}:
The number of all EC block groups in HDFS.

{{Minimally stored block groups}}:
The number of EC block groups which have enough blocks to recover. For example,
in (6,3)-Reed-Solomon, these groups have at least 6 blocks.

{{Over EC block groups}}:
The number of EC block groups which have excess blocks for some reason. For
example, in (6,3)-Reed-Solomon, these groups have more than 9 blocks. (Are
there such cases?)

{{Under EC block groups}}:
The number of EC block groups which have lost blocks.

{{Mis EC block groups}}:
The number of EC block groups whose rack locations are invalid.

{{Default EC schema}}:
This is usually SYS-DEFAULT-RS-6-3. I think this will be set by a
configuration file later.

{{Corrupt EC block groups}}:
The number of EC block groups which don't have enough blocks to recover. For
example, in (6,3)-Reed-Solomon, these groups have fewer than 6 blocks, so they
can't be recovered.
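
Editor's sketch: the block-count thresholds in the proposed metrics can be written down as a small classifier. It is a simplification under stated assumptions: the states are made mutually exclusive here, only healthy blocks are counted, and rack placement is ignored. {{BlockGroupStateSketch}}, {{State}}, {{classify}} and {{minimallyStored}} are hypothetical names, not part of the patch.

```java
final class BlockGroupStateSketch {

  enum State { CORRUPT, UNDER, FULL, OVER }

  // Classifies one block group by its number of healthy (non-corrupt) blocks.
  static State classify(int healthyBlocks, int dataUnits, int parityUnits) {
    int expected = dataUnits + parityUnits;
    if (healthyBlocks < dataUnits) {
      return State.CORRUPT;   // too few blocks left to decode the data
    }
    if (healthyBlocks < expected) {
      return State.UNDER;     // recoverable, but some blocks are lost
    }
    if (healthyBlocks > expected) {
      return State.OVER;      // excess copies are present
    }
    return State.FULL;
  }

  // "Minimally stored": enough blocks remain to recover the data.
  static boolean minimallyStored(int healthyBlocks, int dataUnits) {
    return healthyBlocks >= dataUnits;
  }

  public static void main(String[] args) {
    // (6,3)-Reed-Solomon: 6 data units and 3 parity units per group.
    for (int blocks = 5; blocks <= 10; blocks++) {
      System.out.println(blocks + " -> " + classify(blocks, 6, 3));
    }
  }
}
```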



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504916#comment-14504916
 ] 

Takanobu Asanuma commented on HDFS-7687:


I also have some other thoughts. Should I create separate tickets for the
items below?
# {{NamenodeFsck.check}} is a large method. If I add the code to handle EC to
this method, it will become larger and more complicated, so we should refactor
it later.
# We should add some tests about fsck for EC.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505896#comment-14505896
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

Yes, refactoring in trunk first.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505877#comment-14505877
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thanks for your review, Nicholas!

bq. A Corrupt EC block group could have >= 6 blocks but some of the blocks are
corrupted.
bq. Yes, it is possible. E.g. a datanode D0 dies and an EC block in D0 is
reconstructed in another datanode D1. Later on, D0 comes back. Then, both D0
and D1 have the same EC block and the block group could have more than 9 blocks.
OK, I understand.

bq. For #1, see if you want to create a JIRA for trunk to do some refactoring
first.
Do you mean that, if some refactoring is needed, we should do it in the trunk
branch first, before we add the logic to handle EC?



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505925#comment-14505925
 ] 

Takanobu Asanuma commented on HDFS-7687:


I understand. Thank you.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505897#comment-14505897
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

Yes, refactoring in trunk first.



[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-21 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506098#comment-14506098
 ] 

Takanobu Asanuma commented on HDFS-7687:


I created a JIRA to refactor {{NamenodeFsck#check}} in HDFS-8215.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-14 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495178#comment-14495178
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thanks for your comments, Nicholas. I will write the patch following your 
suggestions.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494803#comment-14494803
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

It looks good.  I only have a minor suggestion on the section titles:
- Replication: => Replicated Blocks:
- EC: => Erasure Coded Blocks:

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-13 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493560#comment-14493560
 ] 

Takanobu Asanuma commented on HDFS-7687:


I wrote a simple test for fsck:
{code:java}
// Create an EC directory and a striped file with 4 block groups in it.
Path ecDirPath = new Path("/striped");
Path ecFilePath = new Path(ecDirPath, "ecfile");
int numBlocks = 4;
DFSTestUtil.createECFile(cluster, ecFilePath, ecDirPath, numBlocks,
    NUM_STRIPE_PER_BLOCK);
// Run fsck on the root path.
runFsck(conf, 0, true, "/");
{code}

The results are here:
{noformat}
Status: HEALTHY
 Total size:                    12582912 B
 Total dirs:                    2
 Total files:                   1
 Total symlinks:                0
 Total blocks (validated):      4 (avg. block size 3145728 B)
 Minimally replicated blocks:   4 (100.0 %)
 Over-replicated blocks:        4 (100.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         4 (100.0 %)
 Default replication factor:    3
 Average block replication:     9.0
 Corrupt blocks:                0
 Missing replicas:              0 (NaN %)
 Number of data-nodes:          9
 Number of racks:               1
FSCK ended at Tue Apr 14 13:04:16 JST 2015 in 9 milliseconds


The filesystem under path '/' is HEALTHY
{noformat}

From the results, each {{BlockInfoStriped}} (an EC block group) is counted as 
an over-replicated block, because the current fsck only understands 
replication. I think fsck should report replicated blocks and EC block groups 
separately. For example:

{noformat}
Status:
 Total size:
 Total dirs:
 Total files:
 Total symlinks:
 Number of data-nodes:
 Number of racks:
 
Replication:
 Total blocks (validated):
 Minimally replicated blocks:
 Over-replicated blocks:
 Under-replicated blocks:
 Mis-replicated blocks:
 Default replication factor:
 Average block replication:
 Corrupt blocks:
 Missing replicas:
 
EC:
 Total EC block groups (validated):
 Over EC block groups:
 Under EC block groups:
 Mis EC block groups:
 Default EC schema:
 Corrupt EC block groups:
 Missing EC block groups:
{noformat} 

How does that look?
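
To make the idea concrete, here is a minimal Java sketch of keeping separate counters for replicated blocks and EC block groups. It is not code from {{NamenodeFsck}}: {{FsckSummarySketch}}, {{BlockView}}, {{Tally}} and both methods are hypothetical names used only for illustration. It also assumes that an EC block group is healthy when it has 9 live internal blocks (for example a 6 data + 3 parity schema), which is an assumption and not something stated above. The first method mimics a replication-only check and shows why four block groups with 9 locations each are all counted as over-replicated when compared against a replication factor of 3.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: these classes and methods are hypothetical, not HDFS APIs. */
public class FsckSummarySketch {

  /** Stand-in for one block (or one EC block group) as fsck might see it. */
  static final class BlockView {
    final boolean striped;    // true for an EC block group
    final int liveLocations;  // live replicas, or live internal blocks
    final int expected;       // replication factor, or data + parity units (assumed)

    BlockView(boolean striped, int liveLocations, int expected) {
      this.striped = striped;
      this.liveLocations = liveLocations;
      this.expected = expected;
    }
  }

  /** Counters for one section of the report. */
  static final class Tally {
    int total;
    int over;
    int under;

    void add(BlockView b) {
      total++;
      if (b.liveLocations > b.expected) {
        over++;
      } else if (b.liveLocations < b.expected) {
        under++;
      }
    }
  }

  /** Mimics a replication-only check: every block is compared with one factor. */
  static int overReplicatedByReplicationOnly(List<BlockView> blocks, int replication) {
    int over = 0;
    for (BlockView b : blocks) {
      if (b.liveLocations > replication) {
        over++;
      }
    }
    return over;
  }

  /** Builds a report with separate sections for replicated blocks and EC block groups. */
  static String report(List<BlockView> blocks) {
    Tally replicated = new Tally();
    Tally ec = new Tally();
    for (BlockView b : blocks) {
      (b.striped ? ec : replicated).add(b);
    }
    StringBuilder sb = new StringBuilder();
    sb.append("Replicated Blocks:\n");
    sb.append(" Total blocks (validated): ").append(replicated.total).append('\n');
    sb.append(" Over-replicated blocks: ").append(replicated.over).append('\n');
    sb.append(" Under-replicated blocks: ").append(replicated.under).append('\n');
    sb.append("Erasure Coded Blocks:\n");
    sb.append(" Total EC block groups (validated): ").append(ec.total).append('\n');
    sb.append(" Over EC block groups: ").append(ec.over).append('\n');
    sb.append(" Under EC block groups: ").append(ec.under).append('\n');
    return sb.toString();
  }

  public static void main(String[] args) {
    List<BlockView> blocks = new ArrayList<>();
    // Four EC block groups, each with 9 live internal blocks (assumed healthy at 9).
    for (int i = 0; i < 4; i++) {
      blocks.add(new BlockView(true, 9, 9));
    }
    // Compared against replication factor 3, all four groups look over-replicated.
    System.out.println("Replication-only over count: "
        + overReplicatedByReplicationOnly(blocks, 3));
    // With separate tallies, the same groups are reported in the EC section only.
    System.out.print(report(blocks));
  }
}
```

The point of the separate {{Tally}} objects is only that each kind of block is compared against its own expected count. The real counters and section titles would of course follow whatever the patch settles on.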

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-03-24 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377686#comment-14377686
 ] 

Takanobu Asanuma commented on HDFS-7687:


I created a JIRA about this problem in HDFS-7981.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-03-23 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375658#comment-14375658
 ] 

Takanobu Asanuma commented on HDFS-7687:


Currently, {{testStoragePoliciesCK()}} in {{TestFsck}} fails in the EC branch.

A part of the test result is odd:
{noformat}
Blocks NOT satisfying the specified storage policy:
Storage Policy    Specified Storage Policy    # of blocks    % of blocks
DISK:3(EC)        HOT                         1              33.%
{noformat}

I investigated the cause and found that the storage policy logic treats the 
HOT policy as the EC policy. We should fix this problem first. 
May I create a separate JIRA for it?

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-03-17 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364675#comment-14364675
 ] 

Takanobu Asanuma commented on HDFS-7687:


I'd like to work on this ticket. Could you please assign it to me? Thank you.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.





[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-03-17 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364850#comment-14364850
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
---

Thanks Takanobu.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under replicated and corrupted 
 EC files.


