[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307210#comment-15307210
 ] 

Rakesh R commented on HDFS-9833:


Thanks [~drankye] for the detailed reviews. Attached a patch addressing these 
comments.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.
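
(For orientation, a rough sketch of the recompute idea described above. The 
helper {{readFromLiveSources}}, {{missingIndex}}, and the per-cell loop are 
illustrative assumptions, not the actual patch; the real block checksum is an 
MD5 over per-chunk CRCs rather than a single running checksum.)
{code}
// Rough sketch only (helper names are assumptions, not the actual patch):
// decode the missing block cell by cell and feed the result to the checksum
// computation in memory instead of writing it to a target datanode.
byte[] decoded = new byte[cellSize];
for (long pos = 0; pos < blockLen; pos += cellSize) {
  // one cell from each surviving block in the group, nulls for erased units
  ByteBuffer[] inputs = readFromLiveSources(pos);
  decoder.decode(inputs, new int[]{missingIndex},
      new ByteBuffer[]{ByteBuffer.wrap(decoded)});
  checksum.update(decoded, 0, decoded.length);   // no remote write needed
}
{code}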



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9833:
---
Attachment: HDFS-9833-08.patch

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10468) HDFS read ends up ignoring an interrupt

2016-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307164#comment-15307164
 ] 

Hadoop QA commented on HDFS-10468:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 34s 
{color} | {color:red} hadoop-hdfs-project: The patch generated 2 new + 126 
unchanged - 1 fixed = 128 total (was 127) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s 
{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 17s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 104m 28s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807025/HDFS-10468.001.patch |
| JIRA Issue | HDFS-10468 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 25c969128609 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 93d8a7f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15607/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15607/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit test logs |  

[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl

2016-05-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307156#comment-15307156
 ] 

Allen Wittenauer commented on HDFS-10370:
-

In addition, there are some other issues, at least for trunk:

* Lots of caps being used for non-globals
* No declares for local-to-function vars
* Single quote shell outs are deprecated
* Secure mode daemons do not have the necessary code here
* The comment in hadoop-env.sh is not nearly good enough.
* This is applying what is marked as datanode to literally everything, 
including end user commands

This last point gets to [~jzhuge]'s question:

bq. Any generic way to run numactl for Hadoop daemons with different numactl 
args?

Yes.  This should be written similarly to how hadoop_verify_user is built.  
Inside hadoop_java_exec (and the other places where we launch bits), it should 
check whether HADOOP_NUMACTL_$\{command}_OPTS is set; $\{command} will equal 
'datanode' or 'namenode' or whatever.  If it is set, then execute via numactl; 
if not, run it unchanged.  So move this to be a HADOOP jira and let's make this 
functionality generic.
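
For example, a minimal sketch of that check (the wrapper below is illustrative, 
not the actual hadoop-functions.sh code):
{code}
# Sketch: look up a per-command override such as HADOOP_NUMACTL_datanode_OPTS
# via bash indirect expansion, and only wrap the launch in numactl when the
# override is set and numactl is available on the PATH.
function hadoop_java_exec
{
  local command=$1
  local class=$2
  shift 2

  local numactl_var="HADOOP_NUMACTL_${command}_OPTS"
  local numactl_opts="${!numactl_var}"

  if [[ -n "${numactl_opts}" ]] && command -v numactl >/dev/null 2>&1; then
    #shellcheck disable=SC2086
    exec numactl ${numactl_opts} "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
  else
    #shellcheck disable=SC2086
    exec "${JAVA}" "-Dproc_${command}" ${HADOOP_OPTS} "${class}" "$@"
  fi
}
{code}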

I'd actually recommend this NOT go into branch-2:

* This is going to be a significantly larger patch since there aren't any 
common code paths to launch java
* It really needs to be forward compatible, being added so late in branch-2's 
lifecycle

> Allow DataNode to be started with numactl
> -
>
> Key: HDFS-10370
> URL: https://issues.apache.org/jira/browse/HDFS-10370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Dave Marion
>Assignee: Dave Marion
> Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, 
> HDFS-10370-3.patch, HDFS-10370-branch-2.004.patch, HDFS-10370.004.patch
>
>
> Allow numactl constraints to be applied to the datanode process. The 
> implementation I have in mind involves two environment variables (enable and 
> parameters) in the datanode startup process. Basically, if enabled and 
> numactl exists on the system, then start the java process using it. Provide a 
> default set of parameters, and allow the user to override the default. Wiring 
> this up for the non-jsvc use case seems straightforward. Not sure how this 
> can be supported using jsvc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307142#comment-15307142
 ] 

Kai Zheng commented on HDFS-9833:
-

Thanks [~rakeshr] for the update. It looks much closer now. I have two more 
comments:

1. Would it be good to use {{StripedReconstructionInfo}} in the 
{{StripedReconstructor}} family? I mean, we can convert 
{{BlockECReconstructionInfo}} to {{StripedReconstructionInfo}} to pass down, and 
we can use it in the base constructor and so on (see the sketch after these 
comments).
{code}
+  public StripedBlockChecksumReconstructor(ErasureCodingWorker worker,
+      ExtendedBlock blockGroup, DatanodeInfo[] srcNodes, byte[] liveIndices,
+      int[] targetIndices, ErasureCodingPolicy ecPolicy,
+      DataOutputBuffer checksumWriter) throws IOException
{code}

2. How about {{ECBlockInfo}} => {{LiveBlockInfo}}, and 
{{setChecksumProperties}} => {{setOrCheckChecksumProperties}}?

+1 once addressed.
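
A rough sketch of the shape point 1 suggests (the field set and the base 
constructor are guesses based on this discussion, not the actual patch):
{code}
// Sketch: bundle the reconstruction parameters once, then pass the holder
// down through the StripedReconstructor family.
class StripedReconstructionInfo {
  private final ExtendedBlock blockGroup;
  private final ErasureCodingPolicy ecPolicy;
  private final byte[] liveIndices;
  private final DatanodeInfo[] srcNodes;
  private final int[] targetIndices;

  StripedReconstructionInfo(ExtendedBlock blockGroup,
      ErasureCodingPolicy ecPolicy, byte[] liveIndices,
      DatanodeInfo[] srcNodes, int[] targetIndices) {
    this.blockGroup = blockGroup;
    this.ecPolicy = ecPolicy;
    this.liveIndices = liveIndices;
    this.srcNodes = srcNodes;
    this.targetIndices = targetIndices;
  }
  // getters omitted
}

// The constructor quoted above would then shrink to:
public StripedBlockChecksumReconstructor(ErasureCodingWorker worker,
    StripedReconstructionInfo stripedReconInfo,
    DataOutputBuffer checksumWriter) throws IOException {
  super(worker, stripedReconInfo);
  this.checksumWriter = checksumWriter;
}
{code}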

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes

2016-05-30 Thread gurmukh singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gurmukh singh updated HDFS-7134:

Priority: Critical  (was: Major)

> Replication count for a block should not update till the blocks have settled 
> on Datanodes
> -
>
> Key: HDFS-7134
> URL: https://issues.apache.org/jira/browse/HDFS-7134
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.6.0
> Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP 
> Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> [hadoop@nn1 conf]$ cat /etc/redhat-release
> CentOS release 6.5 (Final)
>Reporter: gurmukh singh
>Priority: Critical
>
> The count of the number of replicas for a block should not change till the 
> blocks have settled on the datanodes.
> Test Case:
> Hadoop Cluster with 1 namenode and 3 datanodes.
> nn1.cluster1.com(192.168.1.70)
> dn1.cluster1.com(192.168.1.72)
> dn2.cluster1.com(192.168.1.73)
> dn3.cluster1.com(192.168.1.74)
> Cluster up and running fine with replication set to "1" for the parameter 
> "dfs.replication" on all nodes:
> <property>
> <name>dfs.replication</name>
> <value>1</value>
> </property>
> To reduce the wait time, have reduced the dfs.heartbeat and recheck 
> parameters.
> On datanode2 (192.168.1.73):
> [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 /
> [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2
> On Namenode
> ===
> As expected, the copy was done from datanode2, so one copy went locally.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:53:16 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.73:50010]
> Can see the blocks on the data nodes disks as well under the "current" 
> directory.
> Now, shut down datanode2 (192.168.1.73), and as expected the block moves to 
> another datanode to maintain a replication of 2.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:54:21 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.72:50010]
> But now, if I bring back datanode2, the namenode sees that this block is in 3 
> places and fires an invalidate command for datanode1 (192.168.1.72), yet the 
> replication count on the namenode is bumped to 3 immediately.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:56:12 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> On datanode1 - the invalidate command was fired immediately and the block 
> deleted.
> =
> 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010
> 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010 size 17
> 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Scheduling blk_8132629811771280764_1175 file 
> /space/disk1/current/blk_8132629811771280764 for deletion
> 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Deleted blk_8132629811771280764_1175 at file 
> /space/disk1/current/blk_8132629811771280764
> The namenode still shows 3 replicas even though one has been deleted, even 
> after more than 30 mins.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 14:21:27 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> This could be dangerous if that replica is removed or the other 2 datanodes 
> fail.
> On Datanode 1
> =
> Before, the datanode1 is brought back
> [hadoop@dn1 conf]$ ls -l /space/disk*/current
> /space/disk1/current:
> total 28
> -rw-rw-r-- 1 hadoop hadoop   13 Sep 21 09:09 blk_2278001646987517832
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 21 09:09 blk_2278001646987517832_1171.meta
> -rw-rw-r-- 1 hadoop hadoop   17 Sep 23 13:54 blk_8132629811771280764
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 23 13:54 blk_8132629811771280764_1175.meta
> -rw-rw-r-- 1 hadoop hadoop 5299 Sep 21 10:04 dncp_block_verification.log.curr
> 

[jira] [Updated] (HDFS-10468) HDFS read ends up ignoring an interrupt

2016-05-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10468:
-
Attachment: HDFS-10468.001.patch

Add a unit test. Also check {{InterruptedIOException}}.
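
For context, the behavior the issue asks for might be sketched as follows 
(illustrative only, not the contents of the patch; {{retryWaitMs}} is an 
assumed back-off variable):
{code}
// Illustrative sketch: before retrying a failed read, surface a pending
// interrupt as a java.io.InterruptedIOException instead of swallowing it.
if (Thread.currentThread().isInterrupted()) {
  throw new InterruptedIOException("Read interrupted before retry");
}
try {
  Thread.sleep(retryWaitMs);
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();   // restore the interrupt status
  throw new InterruptedIOException("Interrupted while waiting to retry read");
}
{code}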

> HDFS read ends up ignoring an interrupt
> ---
>
> Key: HDFS-10468
> URL: https://issues.apache.org/jira/browse/HDFS-10468
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Jing Zhao
> Attachments: HDFS-10468.000.patch, HDFS-10468.001.patch, log
>
>
> If an interrupt comes in during an HDFS read, it looks like HDFS ends up 
> ignoring it (handling it internally) and retries the read after an interval.
> An interrupt should result in the read being cancelled, with an 
> InterruptedException being thrown.
> Similarly, if an HDFS op is started with the thread's interrupt status already 
> set, an InterruptedException should be thrown.
> cc [~jingzhao]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10471:
-
Attachment: (was: HDFS-10471.001.patch)

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the value {{quota}} is referred to as {{N}}, but {{N}} does 
> not appear anywhere before.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307068#comment-15307068
 ] 

Yiqun Lin commented on HDFS-10471:
--

The failed unit test looks unrelated; posting the patch again.

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the value {{quota}} is referred to as {{N}}, but {{N}} does 
> not appear anywhere before.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10471:
-
Attachment: HDFS-10471.001.patch

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the value {{quota}} is referred to as {{N}}, but {{N}} does 
> not appear anywhere before.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10370) Allow DataNode to be started with numactl

2016-05-30 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306974#comment-15306974
 ] 

John Zhuge edited comment on HDFS-10370 at 5/30/16 11:27 PM:
-

Thanks [~dlmarion] for uploading the new patches.

Comments for file {{hdfs}} in {{HDFS-10370-branch-2.004.patch}}.
* 151-158: How about replacing the block with
{code}
if [[ ${HADOOP_DN_NUMACTL_ENABLE} == true ]] ; then
  NUMACTL_ARGS=${HADOOP_DN_NUMACTL_ARGS:-"--interleave=all"}
  NUMACTL_CMD="$(command -v numactl)"
fi
{code}
* Line 154: It is discouraged to use {{which}} to detect an executable. See 
the explanation in 
http://stackoverflow.com/questions/592620/check-if-a-program-exists-from-a-bash-script/677212#677212.
* 315 and 317: These 2 lines are almost identical; any way to remove the 
redundancy?
* 317: Where is {{NUMACTL_ARGS}} initialized? Is it supposed to take the value 
from {{HADOOP_DN_NUMACTL_ARGS}}?
* 317: Inconsistent prefix between {{NUMA_CMD}} and {{NUMACTL_ARGS}}

[~aw] Any generic way to run numactl for Hadoop daemons with different numactl 
args? How to support JSVC? Maybe by using JNI {{set_mempolicy}}?



was (Author: jzhuge):
Thanks [~dlmarion] for uploading the new patches.

Comments for file {{hdfs}} in {{HDFS-10370-branch-2.004.patch}}.
* 151-158: How about replacing the block with
{code}
if [[ ${HADOOP_DN_NUMACTL_ENABLE == true ]] ; then
NUMACTL_ARGS=${HADOOP_DN_NUMACTL_ARGS:-"--interleave=all"}
NUMACTL_CMD=$(command -v numactl)
fi
{code}
* Line 154: It is discouraged to use {{which}} to detect an executable. See 
the explanation in 
http://stackoverflow.com/questions/592620/check-if-a-program-exists-from-a-bash-script/677212#677212.
* 315 and 317: These 2 lines are almost identical; any way to remove the 
redundancy?
* 317: Where is {{NUMACTL_ARGS}} initialized? Is it supposed to take the value 
from {{HADOOP_DN_NUMACTL_ARGS}}?
* 317: Inconsistent prefix between {{NUMA_CMD}} and {{NUMACTL_ARGS}}

[~aw] Any generic way to run numactl for Hadoop daemons with different numactl 
args? How to support JSVC? Maybe by using JNI {{set_mempolicy}}?


> Allow DataNode to be started with numactl
> -
>
> Key: HDFS-10370
> URL: https://issues.apache.org/jira/browse/HDFS-10370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Dave Marion
>Assignee: Dave Marion
> Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, 
> HDFS-10370-3.patch, HDFS-10370-branch-2.004.patch, HDFS-10370.004.patch
>
>
> Allow numactl constraints to be applied to the datanode process. The 
> implementation I have in mind involves two environment variables (enable and 
> parameters) in the datanode startup process. Basically, if enabled and 
> numactl exists on the system, then start the java process using it. Provide a 
> default set of parameters, and allow the user to override the default. Wiring 
> this up for the non-jsvc use case seems straightforward. Not sure how this 
> can be supported using jsvc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl

2016-05-30 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306974#comment-15306974
 ] 

John Zhuge commented on HDFS-10370:
---

Thanks [~dlmarion] for uploading the new patches.

Comments for file {{hdfs}} in {{HDFS-10370-branch-2.004.patch}}.
* 151-158: How about replacing the block with
{code}
if [[ ${HADOOP_DN_NUMACTL_ENABLE == true ]] ; then
NUMACTL_ARGS=${HADOOP_DN_NUMACTL_ARGS:-"--interleave=all"}
NUMACTL_CMD=$(command -v numactl)
fi
{code}
* Line 154: It is discouraged to use {{which}} to detect an executable. See 
the explanation in 
http://stackoverflow.com/questions/592620/check-if-a-program-exists-from-a-bash-script/677212#677212.
* 315 and 317: These 2 lines are almost identical; any way to remove the 
redundancy?
* 317: Where is {{NUMACTL_ARGS}} initialized? Is it supposed to take the value 
from {{HADOOP_DN_NUMACTL_ARGS}}?
* 317: Inconsistent prefix between {{NUMA_CMD}} and {{NUMACTL_ARGS}}

[~aw] Any generic way to run numactl for Hadoop daemons with different numactl 
args? How to support JSVC? Maybe by using JNI {{set_mempolicy}}?


> Allow DataNode to be started with numactl
> -
>
> Key: HDFS-10370
> URL: https://issues.apache.org/jira/browse/HDFS-10370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Dave Marion
>Assignee: Dave Marion
> Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, 
> HDFS-10370-3.patch, HDFS-10370-branch-2.004.patch, HDFS-10370.004.patch
>
>
> Allow numactl constraints to be applied to the datanode process. The 
> implementation I have in mind involves two environment variables (enable and 
> parameters) in the datanode startup process. Basically, if enabled and 
> numactl exists on the system, then start the java process using it. Provide a 
> default set of parameters, and allow the user to override the default. Wiring 
> this up for the non-jsvc use case seems straightforward. Not sure how this 
> can be supported using jsvc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-05-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306941#comment-15306941
 ] 

Konstantin Shvachko commented on HDFS-10301:


Vinitha's patch adds one RPC only in the case when block reports are sent in 
multiple RPCs. If you choose to send the entire block report in one RPC, then 
it will be a single RPC call with her patch as well. It seems logical to have 
an extra RPC because you have chosen to split block reports into multiple RPCs. 
We are well aware of the NameNode performance problems.
Could you please review the patch?

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report, so 
> it sends the block report again. The NameNode, while processing these two 
> reports at the same time, can interleave the processing of storages from 
> different reports. This screws up the blockReportId field, which makes the 
> NameNode think that some storages are zombie. Replicas from zombie storages 
> are immediately removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306849#comment-15306849
 ] 

Hadoop QA commented on HDFS-9833:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s 
{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 44s 
{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 103m 52s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12806989/HDFS-9833-07.patch |
| JIRA Issue | HDFS-9833 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux b26d85edcff0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 4e1f56e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15605/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client 
hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15605/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data

[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306791#comment-15306791
 ] 

Rakesh R commented on HDFS-9833:


It seems HADOOP-13010 has changed the {{StripedReconstructor.java}} file. I've 
rebased the patch on top of the latest trunk code. Also, used {{Map}} instead 
of {{HashMap}}.

[~drankye], kindly review latest patch when you get a chance. Thanks!

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9833:
---
Attachment: HDFS-9833-07.patch

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306383#comment-15306383
 ] 

Hadoop QA commented on HDFS-10471:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 
new + 226 unchanged - 2 fixed = 228 total (was 228) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 44s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 107m 19s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12806926/HDFS-10471.001.patch |
| JIRA Issue | HDFS-10471 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5bb4d756ac97 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 4e1f56e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15604/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15604/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HDFS-Build/15604/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15604/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |

[jira] [Commented] (HDFS-10426) TestPendingInvalidateBlock failed in trunk

2016-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306353#comment-15306353
 ] 

Hadoop QA commented on HDFS-10426:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
0s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 14s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 106m 26s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.TestDataTransferKeepalive |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.TestEncryptionZonesWithKMS |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12806921/HDFS-10426.003.patch |
| JIRA Issue | HDFS-10426 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 266517ede364 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 4e1f56e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15603/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HDFS-Build/15603/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15603/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15603/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.

[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-05-30 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306312#comment-15306312
 ] 

He Xiaoqiao commented on HDFS-10453:


Hi [~vinayrpet], thanks for your comments.
Actually this case occurs at Step 2 in 
{{BlockManager#computeReconstructionWorkForBlocks()}}, which is not executed 
with the global lock held, as you mentioned:
bq. 2. For blocks in work list, targets will be chosen.

{{rw.chooseTargets}} can get stuck for a long time in the following scenario:
1. *Blocks* are chosen to be reconstructed at 
{{neededReconstruction#chooseLowRedundancyBlocks(blocksToProcess)}} with the 
global lock held;
2. The BlockReconstructionWork list {{reconWork}} is created in 
{{scheduleReplication()}} with exactly the *blocks* present at that time, also 
with the global lock held. After the list is created, the lock is released.
3. Deletion happens at {{BlockManager#removeBlock(BlockInfo block)}} with the 
global lock held. This is the critical part of this case:
{code}
  public void removeBlock(BlockInfo block) {
    assert namesystem.hasWriteLock();
    // No need to ACK blocks that are being removed entirely
    // from the namespace, since the removal of the associated
    // file already removes them from the block map below.
    block.setNumBytes(BlockCommand.NO_ACK);
    addToInvalidates(block);
    removeBlockFromMap(block);
    // Remove the block from pendingReconstruction and neededReconstruction
    pendingReconstruction.remove(block);
    neededReconstruction.remove(block, LowRedundancyBlocks.LEVEL);
    if (postponedMisreplicatedBlocks.remove(block)) {
      postponedMisreplicatedBlocksCount.decrementAndGet();
    }
  }
{code}
After {{removeBlock(BlockInfo block)}}, *numBytes* of the block is set to 
BlockCommand.NO_ACK (= Long.MAX_VALUE) and the block is deleted from 
{{neededReconstruction}}, {{pendingReconstruction}} and {{blocksMap}}, but it is 
still referenced by {{reconWork}}, which is a local variable of 
{{BlockManager#computeReconstructionWorkForBlocks}}.
4. Choosing targets for each block in {{reconWork}} then fails: no node can be 
selected even after *traversing the whole cluster*, since numBytes of the block 
is Long.MAX_VALUE (see the schematic check sketched after this list). If there 
are multiple such blocks in a large cluster, Step 2 (shown below) of 
{{BlockManager#computeReconstructionWorkForBlocks()}} can take a long time; it 
was above *10 min* in our online cluster.
{code}
    // Step 2: choose target nodes for each reconstruction task
    final Set<Node> excludedNodes = new HashSet<>();
    for (BlockReconstructionWork rw : reconWork) {
      // Exclude all of the containing nodes from being targets.
      // This list includes decommissioning or corrupt nodes.
      excludedNodes.clear();
      for (DatanodeDescriptor dn : rw.getContainingNodes()) {
        excludedNodes.add(dn);
      }

      // choose replication targets: NOT HOLDING THE GLOBAL LOCK
      // It is costly to extract the filename for which chooseTargets is called,
      // so for now we pass in the block collection itself.
      final BlockPlacementPolicy placementPolicy =
          placementPolicies.getPolicy(rw.getBlock().isStriped());
      rw.chooseTargets(placementPolicy, storagePolicySuite, excludedNodes);
    }
{code}
5. The rest proceeds as usual.

FYI.
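
To make step 4 concrete, a schematic of the size check that ends up rejecting 
every node (simplified sketch only; the real logic lives in the block placement 
policy and involves more conditions):
{code}
// Schematic only: after removeBlock(), getNumBytes() returns
// BlockCommand.NO_ACK (Long.MAX_VALUE), so the required size can never fit
// on any storage, and the policy scans the whole cluster before giving up.
boolean hasEnoughSpace(DatanodeStorageInfo storage, BlockInfo block) {
  long requiredSize = block.getNumBytes();        // Long.MAX_VALUE here
  return storage.getRemaining() >= requiredSize;  // false for every node
}
{code}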

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.7.1
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> ReplicationMonitor thread could get stuck for a long time and lose data with 
> low probability. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging to 
> that file for replication.
> If the ReplicationMonitor stuck state occurs, the NameNode will print logs 
> like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> 

[jira] [Updated] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10471:
-
Status: Patch Available  (was: Open)

Attaching a simple patch for this.

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the value {{quota}} is referred to as {{N}}, but {{N}} does 
> not appear anywhere before.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10471:
-
Attachment: HDFS-10471.001.patch

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the value {{quota}} is referred to as {{N}}, but {{N}} does 
> not appear anywhere before.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-30 Thread Yiqun Lin (JIRA)
Yiqun Lin created HDFS-10471:


 Summary: DFSAdmin#SetQuotaCommand's help msg is not correct
 Key: HDFS-10471
 URL: https://issues.apache.org/jira/browse/HDFS-10471
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Yiqun Lin
Assignee: Yiqun Lin
Priority: Minor


The help message of the command related to SetQuota is not shown correctly. In 
the message, the value {{quota}} is referred to as {{N}}, but {{N}} does not 
appear anywhere before.
{noformat}
-setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
directory <dirName>.
The directory quota is a long integer that puts a hard limit
on the number of names in the directory tree
For each directory, attempt to set the quota. An error will be 
reported if
1. N is not a positive integer, or
2. User is not an administrator, or
3. The directory does not exist or is a file.
Note: A quota of 1 would force the directory to remain empty.
{noformat}
The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306265#comment-15306265
 ] 

Rakesh R commented on HDFS-9833:


Test case failures are unrelated to my patch; please ignore them.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org