[jira] [Commented] (HDFS-11163) Mover should move the file blocks to default storage once policy is unset

2017-01-09 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814232#comment-15814232
 ] 

Surendra Singh Lilhore commented on HDFS-11163:
---

Thanks [~cnauroth] and [~szetszwo] for the comments.

bq. Indeed, the additional RPC is not needed since HdfsLocatedFileStatus 
already has the resolved storage policy. We don't need to call getStoragePolicy 
again.
Yes, HdfsLocatedFileStatus has the resolved storage policy. I am calling 
{{getStoragePolicy}} because it returns the default policy when the id is 
{{BLOCK_STORAGE_POLICY_ID_UNSPECIFIED}}.

{code}
  public BlockStoragePolicy getPolicy(byte id) {
    // id == 0 means the policy was not specified.
    return id == 0 ? getDefaultPolicy() : policies[id];
  }
{code}

We could add an API to fetch the default policy from the NameNode once, so 
that we avoid a {{getStoragePolicy}} RPC per file.
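The idea above amounts to resolving an unspecified policy id to the default policy on the client side. A minimal self-contained sketch in plain Java (this is not the Hadoop API; the class name and the default id value 7, mirroring the HOT policy, are assumptions for illustration):

```java
// Illustrative sketch only: map an unspecified storage-policy id from the
// file status to a locally known default, avoiding an extra RPC per file.
class PolicySketch {
    static final byte UNSPECIFIED = 0;       // mirrors BLOCK_STORAGE_POLICY_ID_UNSPECIFIED
    static final byte DEFAULT_POLICY_ID = 7; // assumed default id (HOT), fetched once

    static byte resolve(byte idFromFileStatus) {
        // id == 0 means no policy was ever set on the file; fall back to default
        return idFromFileStatus == UNSPECIFIED ? DEFAULT_POLICY_ID : idFromFileStatus;
    }
}
```

With such a helper, the Mover could keep using the id already present in HdfsLocatedFileStatus and would only need the default id fetched a single time.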


> Mover should move the file blocks to default storage once policy is unset
> -
>
> Key: HDFS-11163
> URL: https://issues.apache.org/jira/browse/HDFS-11163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11163-001.patch, HDFS-11163-002.patch
>
>
> HDFS-9534 added new API in FileSystem to unset the storage policy. Once 
> policy is unset blocks should move back to the default storage policy.
> Currently mover is not moving file blocks which have zero storage ID
> {code}
>   // currently we ignore files with unspecified storage policy
>   if (policyId == HdfsConstants.BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
> return;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11150) [SPS]: Provide persistence when satisfying storage policy.

2017-01-09 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu updated HDFS-11150:
--
Attachment: HDFS-11150-HDFS-10285.006.patch

Uploaded the v6 patch:
1. Rebased the patch.
2. Removed the code change in {{StoragePolicySatisfier}}.

> [SPS]: Provide persistence when satisfying storage policy.
> --
>
> Key: HDFS-11150
> URL: https://issues.apache.org/jira/browse/HDFS-11150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-11150-HDFS-10285.001.patch, 
> HDFS-11150-HDFS-10285.002.patch, HDFS-11150-HDFS-10285.003.patch, 
> HDFS-11150-HDFS-10285.004.patch, HDFS-11150-HDFS-10285.005.patch, 
> HDFS-11150-HDFS-10285.006.patch, editsStored, editsStored.xml
>
>
> Provide persistence for SPS in case that Hadoop cluster crashes by accident. 
> Basically we need to change EditLog and FsImage here.






[jira] [Commented] (HDFS-11163) Mover should move the file blocks to default storage once policy is unset

2017-01-09 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814077#comment-15814077
 ] 

Tsz Wo Nicholas Sze commented on HDFS-11163:


> ... It will require an additional getStoragePolicy RPC per file with the 
> default storage policy, whereas previously there was no RPC for those files. 
> ...

Indeed, the additional RPC is not needed since HdfsLocatedFileStatus already 
has the resolved storage policy.  We don't need to call getStoragePolicy again.

> Mover should move the file blocks to default storage once policy is unset
> -
>
> Key: HDFS-11163
> URL: https://issues.apache.org/jira/browse/HDFS-11163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11163-001.patch, HDFS-11163-002.patch
>
>
> HDFS-9534 added new API in FileSystem to unset the storage policy. Once 
> policy is unset blocks should move back to the default storage policy.
> Currently mover is not moving file blocks which have zero storage ID
> {code}
>   // currently we ignore files with unspecified storage policy
>   if (policyId == HdfsConstants.BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
> return;
>   }
> {code}






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814033#comment-15814033
 ] 

Hadoop QA commented on HDFS-9391:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 294 unchanged - 1 fixed = 294 total (was 295) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| JDK v1.7.0_121 Failed junit tests | 
hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.TestEncryptionZones |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-9391 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846485/HDFS-9391-branch-2.01.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 67c8a669255d 3.13.0-96-generic #143-Ubuntu SMP M

[jira] [Comment Edited] (HDFS-11072) Add ability to unset and change directory EC policy

2017-01-09 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814020#comment-15814020
 ] 

SammiChen edited comment on HDFS-11072 at 1/10/17 6:08 AM:
---

Thanks [~andrew.wang]! I agree with the general comment. Will upload a new 
patch to address points 1 and 2.


was (Author: sammi):
Thanks [~andrew.wang] !  I agree with the general comment. Will upload a new 
update the address point 1&2.

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch, HDFS-11072-v5.patch, 
> HDFS-11072-v6.patch, HDFS-11072-v7.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy

2017-01-09 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814020#comment-15814020
 ] 

SammiChen commented on HDFS-11072:
--

Thanks [~andrew.wang]! I agree with the general comment. Will upload a new 
patch to address points 1 and 2.

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch, HDFS-11072-v5.patch, 
> HDFS-11072-v6.patch, HDFS-11072-v7.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Updated] (HDFS-11072) Add ability to unset and change directory EC policy

2017-01-09 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-11072:
-
Attachment: HDFS-11072-v7.patch

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch, HDFS-11072-v5.patch, 
> HDFS-11072-v6.patch, HDFS-11072-v7.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Commented] (HDFS-11306) Print remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813865#comment-15813865
 ] 

Yongjun Zhang commented on HDFS-11306:
--

Hi [~jojochuang],

Thanks much for working on this issue!

Some comments on the patch:

# I suggest printing a summary WARN message at the beginning of 
{{dumpRemainingEditLogs()}}, stating something like "The edits buffer should 
have been flushed but there are still  unflushed. Below is the list 
of the unflushed transactions:".
# Add a finally block and call
{code}
IOUtils.cleanup(LOG, dis);
IOUtils.cleanup(LOG, bis);
{code}
# Can we add a couple more different kinds of edits in the test?

Thanks.
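The intent of the suggested finally block can be sketched with plain {{java.io}} streams (a self-contained illustration with hypothetical names, not the patch itself; Hadoop's {{IOUtils.cleanup}} additionally handles nulls and swallows close() exceptions):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class DumpSketch {
    // Read one record from the buffer; the finally block guarantees the
    // streams are closed even if parsing throws partway through.
    static int readFirstInt(byte[] buf) throws IOException {
        ByteArrayInputStream bis = new ByteArrayInputStream(buf);
        DataInputStream dis = new DataInputStream(bis);
        try {
            return dis.readInt(); // big-endian, per DataInput
        } finally {
            dis.close(); // closing dis also closes the wrapped bis
        }
    }
}
```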


> Print remaining edit logs from buffer if edit log can't be rolled.
> --
>
> Key: HDFS-11306
> URL: https://issues.apache.org/jira/browse/HDFS-11306
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11306.001.patch
>
>
> In HDFS-10943 [~yzhangal] reported that edit log can not be rolled due to 
> unexpected edit logs lingering in the buffer.
> Unable to root cause the bug, I propose that we dump the remaining edit logs 
> in the buffer into namenode log, before crashing namenode. Use this new 
> capability to find the ops that sneaks into the buffer unexpectedly, and 
> hopefully catch the bug.
> This effort is orthogonal, but related to HDFS-11292, which adds additional 
> informational logs to help debug this issue.






[jira] [Commented] (HDFS-9935) Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD and unused import from HdfsServerConstants

2017-01-09 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813846#comment-15813846
 ] 

Yiqun Lin commented on HDFS-9935:
-

I took a quick look at branch-2 and branch-2.8 and found that these two 
variables are not used there anymore either. Does this mean it is safe to 
commit?

> Remove LEASE_{SOFTLIMIT,HARDLIMIT}_PERIOD and unused import from 
> HdfsServerConstants
> 
>
> Key: HDFS-9935
> URL: https://issues.apache.org/jira/browse/HDFS-9935
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-9935.001.patch, HDFS-9935.002.patch
>
>
> HDFS-9134 moved the {{LEASE_SOFTLIMIT_PERIOD}} and {{LEASE_HARDLIMIT_PERIOD}} 
> constants from {{HdfsServerConstants}} to {{HdfsConstants}}, because they are 
> used by {{DFSClient}}, which was moved to {{hadoop-hdfs-client}}, and 
> constants in {{HdfsConstants}} can be used by both the client and the server 
> side. In addition, I have checked that these two constants in 
> {{HdfsServerConstants}} are no longer used anywhere in the project; all 
> usages were replaced by {{HdfsConstants.LEASE_SOFTLIMIT_PERIOD}} and 
> {{HdfsConstants.LEASE_HARDLIMIT_PERIOD}}. So I think we can remove the unused 
> constants from {{HdfsServerConstants}} completely and use the ones in 
> {{HdfsConstants}} in the future.






[jira] [Commented] (HDFS-11150) [SPS]: Provide persistence when satisfying storage policy.

2017-01-09 Thread Yuanbo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813791#comment-15813791
 ] 

Yuanbo Liu commented on HDFS-11150:
---

[~umamaheswararao] Sure. Thanks for your reminder.

> [SPS]: Provide persistence when satisfying storage policy.
> --
>
> Key: HDFS-11150
> URL: https://issues.apache.org/jira/browse/HDFS-11150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-11150-HDFS-10285.001.patch, 
> HDFS-11150-HDFS-10285.002.patch, HDFS-11150-HDFS-10285.003.patch, 
> HDFS-11150-HDFS-10285.004.patch, HDFS-11150-HDFS-10285.005.patch, 
> editsStored, editsStored.xml
>
>
> Provide persistence for SPS in case that Hadoop cluster crashes by accident. 
> Basically we need to change EditLog and FsImage here.






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813708#comment-15813708
 ] 

Manoj Govindassamy commented on HDFS-9391:
--

Thanks for the review, [~mingma]. Attached branch-2.01.patch.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, 
> HDFS-9391-branch-2.01.patch, HDFS-9391.01.patch, HDFS-9391.02.patch, 
> HDFS-9391.03.patch, HDFS-9391.04.patch, Maintenance webUI.png
>
>







[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Attachment: HDFS-9391-branch-2.01.patch

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, 
> HDFS-9391-branch-2.01.patch, HDFS-9391.01.patch, HDFS-9391.02.patch, 
> HDFS-9391.03.patch, HDFS-9391.04.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813659#comment-15813659
 ] 

Hudson commented on HDFS-11273:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11095 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11095/])
HDFS-11273. Move TransferFsImage#doGetUrl function to a Util class. (jing9: rev 
7ec609b28989303fe0cc36812f225028b0251b32)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Util.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HttpPutFailedException.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HttpGetFailedException.java


> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch, 
> HDFS-11273.002.patch, HDFS-11273.003.patch, HDFS-11273.004.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.






[jira] [Comment Edited] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-01-09 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813646#comment-15813646
 ] 

Weiwei Yang edited comment on HDFS-6874 at 1/10/17 2:52 AM:


I chatted with [~clamb] offline; he mentioned he has not worked on HDFS for 
some time and suggested I find someone else to review. [~andrew.wang], would 
you help review this? This is a continuing effort to expose 
GETFILEBLOCKLOCATIONS consistently across the fs/httpfs/webhdfs interfaces. 
This is an old JIRA that has been open for some time, but I would like to pick 
it up and get it done.

Thanks


was (Author: cheersyang):
Chatted with [~clamb] offline, he mentioned he was not near with HDFS for 
sometime and suggest me find someone else to review. [~andrew.wang], would you 
help to review this? This is a continual effort to make GETFILEBLOCKLOCATIONS 
exposed consistently from fs/httpfs/webhdfs interfaces, this is old JIRA opened 
for sometime but I would like to pick up and get it done.

Thanks 

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>  case GETFILEBLOCKLOCATIONS: {
> response = Response.status(Response.Status.BAD_REQUEST).build();
> break;
>   }
>  






[jira] [Updated] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-01-09 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-6874:
--
Fix Version/s: (was: 2.9.0)

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>  case GETFILEBLOCKLOCATIONS: {
> response = Response.status(Response.Status.BAD_REQUEST).build();
> break;
>   }
>  






[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-01-09 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813646#comment-15813646
 ] 

Weiwei Yang commented on HDFS-6874:
---

I chatted with [~clamb] offline; he mentioned he has not worked on HDFS for 
some time and suggested I find someone else to review. [~andrew.wang], would 
you help review this? This is a continuing effort to expose 
GETFILEBLOCKLOCATIONS consistently across the fs/httpfs/webhdfs interfaces. 
This is an old JIRA that has been open for some time, but I would like to pick 
it up and get it done.

Thanks

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Fix For: 2.9.0
>
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>  case GETFILEBLOCKLOCATIONS: {
> response = Response.status(Response.Status.BAD_REQUEST).build();
> break;
>   }
>  






[jira] [Commented] (HDFS-7343) HDFS smart storage management

2017-01-09 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813615#comment-15813615
 ] 

Anu Engineer commented on HDFS-7343:


bq.  Also, it makes SSM more stable as these data stored in NN. When SSM node 
failure happened, we can simply launch another instance on another node.
I do see now where this thought came from. However, I think SSM should be able 
to stand independently and not rely on the NameNode. Here are some reasons I 
can think of:

1. SSM can be implemented with no changes to the NN, which makes for easier 
and faster development.
2. It adds no complexity to the NameNode.
3. Moving state from SSM to the NameNode makes SSM simpler, but makes the 
NameNode correspondingly more complicated.

So while I have a better understanding of the motivation, making the NN store 
the rules and metrics needed by SSM feels like the wrong choice. As I said 
earlier, if you want to run this in other scenarios, this dependency on the NN 
makes it hard. For example, someone might run SSM in the cloud with a 
cloud-native file system instead of a NameNode.

bq. This brings in uncertainty to users. We can implement automatical 
rule-engine level throttle in Phase 2.
If a rule misbehaves, the NN will become slow, which itself brings 
uncertainty. But I am fine with the choice of postponing this to a later 
stage. Would you be able to count how many times a particular rule was 
triggered in a given time window? That would be useful for debugging this 
issue.
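A per-rule trigger count over a sliding time window could be kept with a simple deque of timestamps; a minimal sketch (class and method names are purely illustrative, not part of any SSM design):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: count how many times a rule fired within the last windowMillis.
// Timestamps are supplied by the caller so the logic stays deterministic.
class WindowedCounter {
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    WindowedCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void record(long nowMillis) {
        timestamps.addLast(nowMillis);
        evict(nowMillis);
    }

    int count(long nowMillis) {
        evict(nowMillis);
        return timestamps.size();
    }

    // Drop timestamps that have fallen out of the window.
    private void evict(long nowMillis) {
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
    }
}
```

Exposing such a counter per rule would make a "rule X fired N times in the last minute" metric cheap to surface in a dashboard.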

bq. Rule is the core part for SSM to function. For convenient and reliable 
consideration, it's better to store it in NN to keep SSM simple and stateless 
as suggested.
Rules are a core part of SSM, so let us store them in SSM instead of in the 
NN, or feel free to store them as a file on HDFS. Modifying the NameNode to 
store another service's configuration would make the NameNode a dumping ground 
for the config of all other services.

bq. Yes, good question. We can support HA by many ways, for example, 
periodically checkpoint the data to HDFS or store the data in the same way as 
edit log.

Sorry, I am unable to understand this response clearly. Are you now saying we 
will support HA?

bq. First, we provide some verification mechanism when adding some rule. For 
example, we can give the user some warning when the candidate files of an 
action (such as move) exceeding some certain value. 

This is a classic time-of-check-to-time-of-use (TOCTOU) problem: when the rule 
is written there may be no issue, but as the file count grows it becomes one.

bq. Second, the execution state and other info related info can also be showed 
in the dashboard or queried. It's convenient for users to track the status and 
take actions accordingly. It's also very good to implement a timeout mechanism.

Agreed, but have we not now reintroduced the uncertainty issue into the 
solution? I thought we did not want to restrict the number of times a rule 
fires, since that would introduce uncertainty.

bq. HDFS client will bypass SSM when the query fails, then the client goes back 
to the original working flow. It has almost no effect on the existing I/O.
So then the SSM rules are violated? How does it deal with that issue? Since 
you have to deal with SSM being down anyway, why have the HDFS client talk to 
SSM in an I/O path at all? Why not just rely on the background SSM logic and 
on the rules doing the right thing?


Thanks for sharing the graph, I appreciate it.

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, move.jpg
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference, etc.
> Modified the title to re-purpose the issue.
> We'd extend this effort a bit and aim to work on a comprehensive solution to 
> provide a smart storage management service for convenient, intelligent and 
> effective utilization of erasure coding or replicas, the HDFS cache facility, 
> HSM offerings, and all kinds of tools (balancer, mover, disk balancer and so 
> on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813607#comment-15813607
 ] 

Brahma Reddy Battula commented on HDFS-11308:
-

bq.nc is alias of ncat which does not have "-z" option in CentOS 7.
The same issue exists on SUSE as well. Please have a look at HDFS-3618.
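The {{SshFenceByTcpPort}} snippet quoted in the description below treats every non-zero exit code as proof that the service is down. A more conservative interpretation (an illustrative sketch, not the committed fix) would treat only a clean rc of 1 as verified-down, and report anything else, such as ncat's 2 for an invalid option, as indeterminate:

```java
/** Sketch: interpret the exit code of "nc -z host port" conservatively.
 *  0 = port open (fence failed), 1 = port closed (fence verified),
 *  anything else (e.g. ncat's 2 for an invalid option) = indeterminate,
 *  so a broken nc invocation is never mistaken for a successful fence. */
class FenceCheck {
  enum Result { STILL_RUNNING, VERIFIED_DOWN, INDETERMINATE }

  static Result interpretNcExit(int rc) {
    if (rc == 0) {
      return Result.STILL_RUNNING;   // something is still listening on the port
    } else if (rc == 1) {
      return Result.VERIFIED_DOWN;   // connection refused: nothing listening
    } else {
      return Result.INDETERMINATE;   // nc itself failed; do not claim success
    }
  }

  public static void main(String[] args) {
    // ncat rejecting "-z" exits with 2: we must not report a verified fence.
    System.out.println(interpretNcExit(2));  // INDETERMINATE
  }
}
```

The caller could then fall back to another check (or fail the fence) on INDETERMINATE instead of returning true.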

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our Cluster, I found some abnormal in ZKFC log
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When I run nc an exception occurs and the return value is 2, so sshfence 
> success cannot be confirmed; this may lead to some problems
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813598#comment-15813598
 ] 

Hadoop QA commented on HDFS-7967:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 7s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
23s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 366 unchanged - 9 fixed = 371 total (was 375) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 
55s{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_121. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-7967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846448/HDFS-7967.branch-2.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux b8546401be71 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision 

[jira] [Updated] (HDFS-11268) Persist erasure coding policy ID in FSImage

2017-01-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11268:
---
   Priority: Critical  (was: Major)
Component/s: erasure-coding
 Issue Type: Bug  (was: Improvement)

Hi [~Sammi], have you made any progress on this issue? I'm changing this to Bug 
and Critical, since it seems important for EC users.

> Persist erasure coding policy ID in FSImage
> ---
>
> Key: HDFS-11268
> URL: https://issues.apache.org/jira/browse/HDFS-11268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: SammiChen
>Assignee: SammiChen
>Priority: Critical
>  Labels: hdfs-ec-3.0-must-do
>
> Currently, FSImage only records whether the file is striped or not; it 
> doesn't save the erasure coding policy ID. Later, when the FSImage is loaded 
> to create the namespace, the default system EC policy is used as the file's 
> EC policy. If the EC policy on a file is not the default, the content of 
> that file cannot be accessed correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10759) Change fsimage bool isStriped from boolean to an enum

2017-01-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813585#comment-15813585
 ] 

Andrew Wang commented on HDFS-10759:


Thanks for the rev, Ewan. Overall this looks really good. Since this is a big 
patch with a lot of mechanical changes, I'll try to review promptly to reduce 
rebase overhead. Some nitty review comments:

* It looks like checkstyle is unhappy; I think your IDE is set to a line width 
wider than 80 chars. There are some other issues too.
* It'd be good to have a test that the default value of the proto enum is 
CONTIGUOUS as expected, per Jing's concern.
* INodeFile#BLOCK_TYPE_MASK_CONTIGUOUS is unused.
* BLOCK_ID_MASK_STRIPED could use a comment; I did a double take initially when 
I saw blockType being compared against a variable named MASK that had the same 
value as BLOCK_ID_MASK.
* BlockType could use some more unit tests with the lower values filled in.
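Jing's default-value concern can be pinned down with a small check. This is a plain-Java sketch (a real test would exercise the generated protobuf class): protobuf deserializes an unset enum field to value 0, so whichever constant maps to 0 is the implicit default.

```java
/** Sketch: protobuf assigns unset enum fields their zero value, so the
 *  constant mapped to 0 is the implicit default. Here CONTIGUOUS takes 0,
 *  matching the first proposal in the description. */
enum BlockType {
  CONTIGUOUS(0), STRIPED(1);

  final int protoValue;
  BlockType(int protoValue) { this.protoValue = protoValue; }

  static BlockType fromProtoValue(int v) {
    for (BlockType t : values()) {
      if (t.protoValue == v) return t;
    }
    throw new IllegalArgumentException("unknown proto value: " + v);
  }

  public static void main(String[] args) {
    // An old fsimage written before this field existed deserializes to 0.
    System.out.println(fromProtoValue(0));  // CONTIGUOUS
  }
}
```

A unit test asserting {{fromProtoValue(0) == CONTIGUOUS}} would catch anyone reordering the enum later.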

> Change fsimage bool isStriped from boolean to an enum
> -
>
> Key: HDFS-10759
> URL: https://issues.apache.org/jira/browse/HDFS-10759
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1, 3.0.0-beta1, 3.0.0-alpha2
>Reporter: Ewan Higgs
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-10759.0001.patch, HDFS-10759.0002.patch, 
> HDFS-10759.0003.patch
>
>
> The new erasure coding project has updated the protocol for fsimage such that 
> the {{INodeFile}} has a boolean '{{isStriped}}'. I think this is better as an 
> enum or integer since a boolean precludes any future block types. 
> For example:
> {code}
> enum BlockType {
>   CONTIGUOUS = 0,
>   STRIPED = 1,
> }
> {code}
> We can also make this more robust to future changes where there are different 
> block types supported in a staged rollout.  Here, we would use 
> {{UNKNOWN_BLOCK_TYPE}} as the first value since this is the default value. 
> See 
> [here|http://androiddevblog.com/protocol-buffers-pitfall-adding-enum-values/] 
> for more discussion.
> {code}
> enum BlockType {
>   UNKNOWN_BLOCK_TYPE = 0,
>   CONTIGUOUS = 1,
>   STRIPED = 2,
> }
> {code}
> But I'm not convinced this is necessary since there are other enums that 
> don't use this approach.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-09 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-11273:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   Status: Resolved  (was: Patch Available)

The latest patch looks good to me. The failed tests should be unrelated; they 
passed on my local machine. +1

I've committed the patch to trunk. Thanks for the contribution, [~hkoneru]!

> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch, 
> HDFS-11273.002.patch, HDFS-11273.003.patch, HDFS-11273.004.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this. 
> So we propose to move the said functions to a Utility class so as to be able 
> to use it for JournalNode syncing as well, without duplication of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11303) Hedged read might hang infinitely if read data from all DN failed

2017-01-09 Thread Chen Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813527#comment-15813527
 ] 

Chen Zhang commented on HDFS-11303:
---

Stack, thanks for your comments. Yes, the test is just to verify the fix: it 
hangs without the fix.
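The hang and its fix can be sketched in miniature (illustrative only; the names are invented and this is not the DFSInputStream code): a drain loop over outstanding reads terminates only if every finished future, including ones that failed, is removed from the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of the hedged-read hang: a loop that waits on a future list but
 *  never removes futures that completed exceptionally can spin forever once
 *  every datanode read has failed. Removing each future as soon as it
 *  finishes (success or failure) guarantees the loop terminates. */
class HedgedDrain {
  static int drain(List<Future<Integer>> futures) {
    int failures = 0;
    while (!futures.isEmpty()) {
      for (int i = futures.size() - 1; i >= 0; i--) {
        Future<Integer> f = futures.get(i);
        if (f.isDone()) {
          futures.remove(i);  // crucial: drop finished futures, failed or not
          try {
            f.get();
          } catch (InterruptedException | ExecutionException e) {
            failures++;       // the read from this DN failed
          }
        }
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    List<Future<Integer>> futures = new ArrayList<>();
    Callable<Integer> failingRead =
        () -> { throw new RuntimeException("DN read failed"); };
    futures.add(pool.submit(failingRead));
    futures.add(pool.submit(failingRead));
    // Loop exits even though every read failed; without the remove() it would spin.
    System.out.println(drain(futures));  // 2
    pool.shutdown();
  }
}
```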

> Hedged read might hang infinitely if read data from all DN failed 
> --
>
> Key: HDFS-11303
> URL: https://issues.apache.org/jira/browse/HDFS-11303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0-alpha1
>Reporter: Chen Zhang
> Attachments: HDFS-11303-001.patch
>
>
> Hedged read will read from one DN first; if that times out, it then reads 
> from other DNs simultaneously.
> If reads from all DNs fail, this bug leaves the future list non-empty (the 
> first timed-out request remains in the list), and the loop hangs infinitely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7343) HDFS smart storage management

2017-01-09 Thread Wei Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813506#comment-15813506
 ] 

Wei Zhou commented on HDFS-7343:


{quote}
though I'd encourage you to think about how far we can get with a stateless 
system (possibly by pushing more work into the NN and DN).
{quote}
From these words, I myself thought it would be better to store these data in 
the NN to approach the stateless system Andrew suggested. I did not mean that 
Andrew said it's better to do it this way. Sorry for my poor English.

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, move.jpg
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference, etc.
> Modified the title to re-purpose the issue.
> We'd extend this effort a bit and aim to work on a comprehensive solution to 
> provide a smart storage management service for convenient, intelligent and 
> effective utilization of erasure coding or replicas, the HDFS cache facility, 
> HSM offerings, and all kinds of tools (balancer, mover, disk balancer and so 
> on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2017-01-09 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813493#comment-15813493
 ] 

Xiaobing Zhou edited comment on HDFS-11194 at 1/10/17 1:38 AM:
---

Thank you [~arpitagarwal] for the patch.
I have a few comments.
# RollingAverages#getStats is using non-rolling mean state; it should use the 
rolling one instead. See RollingAverages#snapshot for the calculation of 
rolling averages.
# DFS_METRICS_ROLLING_AVERAGES_WINDOW_SIZE_DEFAULT should be renamed to match 
DFS_METRICS_ROLLING_AVERAGE_WINDOW_LENGTH_KEY.
# These parameters could be made configurable:
SlowNodeDetector#minOutlierDetectionPeers
DataNodePeerMetrics#LOW_THRESHOLD_MS
SlowPeerTracker#MAX_NODES_TO_REPORT
# In BlockReceiver#receivePacket, after 
trackSendPacketToLastNodeInPipeline(duration), it may need a small change, 
e.g. guard the warning with if (duration > DataNodePeerMetrics#LOW_THRESHOLD_MS), 
or remove the warning message altogether.


was (Author: xiaobingo):
Thank you [~arpitagarwal] for the patch.
I've some comments.
# RollingAverages#getStats is using non-rolling mean states, it should use 
rolling ones instead. See RollingAverages#snapshot for calculation of rolling 
averages.

# DFS_METRICS_ROLLING_AVERAGES_WINDOW_SIZE_DEFAULT should be changed to the 
same naming with DFS_METRICS_ROLLING_AVERAGE_WINDOW_LENGTH_KEY

# These parameters can be changed to be configurable.
SlowNodeDetector#minOutlierDetectionPeers
DataNodePeerMetrics#LOW_THRESHOLD_MS
SlowPeerTracker#MAX_NODES_TO_REPORT

# In BlockReceiver#receivePacket, after 
trackSendPacketToLastNodeInPipeline(duration); It may need to change a bit, 
e.g. if (duration > DataNodePeerMetrics#LOW_THRESHOLD_MS). or remove the 
warning msg at all.

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Arpit Agarwal
> Attachments: HDFS-11194-03-04.delta, HDFS-11194.01.patch, 
> HDFS-11194.02.patch, HDFS-11194.03.patch, HDFS-11194.04.patch
>
>
> The metrics collected in HDFS-10917 should be reported to and aggregated on 
> NameNode as part of heartbeat messages. This will make it easy to expose them 
> through JMX to users who are interested in them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2017-01-09 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813493#comment-15813493
 ] 

Xiaobing Zhou commented on HDFS-11194:
--

Thank you [~arpitagarwal] for the patch.
I have a few comments.
# RollingAverages#getStats is using non-rolling mean state; it should use the 
rolling one instead. See RollingAverages#snapshot for the calculation of 
rolling averages.

# DFS_METRICS_ROLLING_AVERAGES_WINDOW_SIZE_DEFAULT should be renamed to match 
DFS_METRICS_ROLLING_AVERAGE_WINDOW_LENGTH_KEY.

# These parameters could be made configurable:
SlowNodeDetector#minOutlierDetectionPeers
DataNodePeerMetrics#LOW_THRESHOLD_MS
SlowPeerTracker#MAX_NODES_TO_REPORT

# In BlockReceiver#receivePacket, after 
trackSendPacketToLastNodeInPipeline(duration), it may need a small change, 
e.g. guard the warning with if (duration > DataNodePeerMetrics#LOW_THRESHOLD_MS), 
or remove the warning message altogether.
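The rolling-average point can be illustrated with a minimal sliding-window sketch (illustrative only; this is not the Hadoop RollingAverages class): the reported mean is computed over only the last N closed window buckets, so stale samples are evicted rather than folded into a cumulative mean.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of a windowed rolling average: each bucket holds {sum, count};
 *  when a bucket closes, the oldest one beyond the window is evicted, so
 *  the mean never mixes in samples older than the window. */
class RollingAverage {
  private final int maxWindows;
  private final Deque<double[]> windows = new ArrayDeque<>(); // each: {sum, count}
  private double[] current = new double[2];

  RollingAverage(int maxWindows) { this.maxWindows = maxWindows; }

  void addSample(double v) { current[0] += v; current[1]++; }

  /** Close the current bucket and start a new one, evicting the oldest. */
  void rollover() {
    windows.addLast(current);
    if (windows.size() > maxWindows) windows.removeFirst();
    current = new double[2];
  }

  double mean() {
    double sum = 0, count = 0;
    for (double[] w : windows) { sum += w[0]; count += w[1]; }
    return count == 0 ? 0 : sum / count;
  }

  public static void main(String[] args) {
    RollingAverage ra = new RollingAverage(2);
    ra.addSample(10); ra.rollover();
    ra.addSample(20); ra.rollover();
    ra.addSample(90); ra.rollover();  // evicts the first bucket (the 10)
    System.out.println(ra.mean());    // (20 + 90) / 2 = 55.0
  }
}
```

A cumulative (non-rolling) mean over the same samples would report 40.0, which is exactly the getStats-vs-snapshot discrepancy described above.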

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Arpit Agarwal
> Attachments: HDFS-11194-03-04.delta, HDFS-11194.01.patch, 
> HDFS-11194.02.patch, HDFS-11194.03.patch, HDFS-11194.04.patch
>
>
> The metrics collected in HDFS-10917 should be reported to and aggregated on 
> NameNode as part of heartbeat messages. This will make it easy to expose them 
> through JMX to users who are interested in them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813481#comment-15813481
 ] 

Ming Ma commented on HDFS-9391:
---

+1. Manoj, given the patch doesn't apply directly to branch-2, can you please 
provide another patch? Thanks.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, HDFS-9391.04.patch, Maintenance 
> webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy

2017-01-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813468#comment-15813468
 ] 

Andrew Wang commented on HDFS-11072:


I'm almost +1; thanks for the quick rev, Sammi.

* In the new shell command, we get the EC policy before unsetting to check if 
there is a policy already set. {{unsetStoragePolicy}} and the Java 
{{unsetECPolicy}} API don't seem to error in this case, so I think we should 
just call unset without checking. This also requires a fix in the user docs. As 
a general comment, I prefer to surface errors on the NN rather than in the 
client code, for consistency between the Java and shell APIs.
* Nit: "unexist" -> "non-existent" in the comments in testNonExistentDir

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch, HDFS-11072-v5.patch, 
> HDFS-11072-v6.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813396#comment-15813396
 ] 

Hadoop QA commented on HDFS-11305:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
33s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
55s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
26s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
32s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
49s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
31s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m  
3s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:78fc6b6 |
| JIRA Issue | HDFS-11305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846443/HDFS-11305.HDFS-8707.001.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux ec886e0f1381 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / 2ceec2b |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_111 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| JDK v1.7.0_121  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18120/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18120/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>

[jira] [Created] (HDFS-11309) chooseTargetTypeInSameNode should pass accurate block size to chooseStorage4Block while choosing target

2017-01-09 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-11309:
--

 Summary: chooseTargetTypeInSameNode should pass accurate block 
size to chooseStorage4Block while choosing target
 Key: HDFS-11309
 URL: https://issues.apache.org/jira/browse/HDFS-11309
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS-10285
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Currently chooseTargetTypeInSameNode does not pass the accurate block size to 
chooseStorage4Block while choosing a local target. Instead of the accurate 
size we pass 0, which causes the space constraint of the storage to be ignored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11292) log lastWrittenTxId etc info in logSyncAll

2017-01-09 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.8.0
   Status: Resolved  (was: Patch Available)

> log lastWrittenTxId etc info in logSyncAll
> --
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch, 
> HDFS-11292.003.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included 
> the problem still exists, which means there might be some synchronization 
> issue.
> To diagnose that, this jira adds reporting of the lastWrittenTxId info in 
> {{logSyncAll()}}, such that we can compare against the error message reported 
> in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked in 
> between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> This proposed reporting would help confirm whether possibility 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}
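The race in possibility 2 can be sketched as follows (illustrative Java only; TxIdProbe and its fields are made up, not the actual FSEditLog): if another thread appends an edit between statement A ({{logSyncAll()}}) and statement B, the lastWrittenTxId observed at B is newer than the last txid actually flushed at A.

```java
// Hedged sketch of possibility 2: a tx sneaking in between statements A and B.
public class TxIdProbe {
  private long lastWrittenTxId = 100;
  private long lastFlushedTxId = 0;

  void logSyncAll() { lastFlushedTxId = lastWrittenTxId; } // statement A
  void logEdit()    { lastWrittenTxId++; }                 // a concurrent writer

  /** Returns true if a new tx arrived between A and B. */
  boolean txSneakedIn(boolean concurrentWrite) {
    logSyncAll();            // A: flush everything written so far
    if (concurrentWrite) {
      logEdit();             // new tx arrives before B runs
    }
    // B: the proposed logging would print both values, exposing the gap
    return lastWrittenTxId != lastFlushedTxId;
  }
}
```

Logging lastWrittenTxId at both A and B, as proposed, would make this gap directly visible in the namenode log.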






[jira] [Commented] (HDFS-11292) log lastWrittenTxId etc info in logSyncAll

2017-01-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813354#comment-15813354
 ] 

Yongjun Zhang commented on HDFS-11292:
--

Thanks much, [~jojochuang], for the review! I committed this to trunk, branch-2, 
and branch-2.8.


> log lastWrittenTxId etc info in logSyncAll
> --
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch, 
> HDFS-11292.003.patch
>
>
> For the issue reported in HDFS-10943, the problem still exists even after 
> HDFS-7964's fix is included, which suggests there might be some 
> synchronization issue.
> To diagnose that, this jira is created to report the lastWrittenTxId info in 
> the {{logSyncAll()}} call, so that we can compare against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked in 
> between A and B. It's observed that the lastWrittenTxId values in B and C are 
> the same.
> This proposed reporting would help confirm whether possibility 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813330#comment-15813330
 ] 

Manoj Govindassamy commented on HDFS-9391:
--

The test failures are not related to the patch; the tests pass locally for me.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, HDFS-9391.04.patch, Maintenance 
> webUI.png
>
>







[jira] [Commented] (HDFS-11292) log lastWrittenTxId etc info in logSyncAll

2017-01-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813317#comment-15813317
 ] 

Hudson commented on HDFS-11292:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11093 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11093/])
HDFS-11292. log lastWrittenTxId etc info in logSyncAll. Contributed by (yzhang: 
rev 603cbcd513a74c29e0e4ec9dc181ff08887d64a4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java


> log lastWrittenTxId etc info in logSyncAll
> --
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch, 
> HDFS-11292.003.patch
>
>
> For the issue reported in HDFS-10943, the problem still exists even after 
> HDFS-7964's fix is included, which suggests there might be some 
> synchronization issue.
> To diagnose that, this jira is created to report the lastWrittenTxId info in 
> the {{logSyncAll()}} call, so that we can compare against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked in 
> between A and B. It's observed that the lastWrittenTxId values in B and C are 
> the same.
> This proposed reporting would help confirm whether possibility 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813259#comment-15813259
 ] 

Hadoop QA commented on HDFS-9391:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 276 unchanged - 1 fixed = 276 total (was 277) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 43s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}104m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.TestStartup |
| Timed out junit tests | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-9391 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846427/HDFS-9391.04.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1c07ed9ae4f4 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18119/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18119/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18119/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues

[jira] [Commented] (HDFS-11306) Print remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813250#comment-15813250
 ] 

Hadoop QA commented on HDFS-11306:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
| Timed out junit tests | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11306 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846419/HDFS-11306.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 908f880b1514 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18116/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18116/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18116/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18116/console |
| Powered by | Apache 

[jira] [Commented] (HDFS-11150) [SPS]: Provide persistence when satisfying storage policy.

2017-01-09 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813252#comment-15813252
 ] 

Uma Maheswara Rao G commented on HDFS-11150:


[~yuanbo], please proceed with your patch now. HDFS-11293 was committed just a 
few minutes ago.

> [SPS]: Provide persistence when satisfying storage policy.
> --
>
> Key: HDFS-11150
> URL: https://issues.apache.org/jira/browse/HDFS-11150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-11150-HDFS-10285.001.patch, 
> HDFS-11150-HDFS-10285.002.patch, HDFS-11150-HDFS-10285.003.patch, 
> HDFS-11150-HDFS-10285.004.patch, HDFS-11150-HDFS-10285.005.patch, 
> editsStored, editsStored.xml
>
>
> Provide persistence for SPS in case the Hadoop cluster crashes unexpectedly. 
> Basically, we need to change the EditLog and FsImage here.






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: HDFS-7967.branch-2.002.patch
HDFS-7967.branch-2.8.002.patch

Use Collections.sort and add the remove() method for JDK 7 compliance. 
Apologies for so many posts; I'm having difficulty getting my local env to 
actually compile with JDK 7.
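For reference, the two JDK 7 compatibility points can be sketched like this (illustrative, not the patch itself): {{List.sort(Comparator)}} and {{Iterator}}'s default {{remove()}} only exist since Java 8, so JDK 7 code calls {{Collections.sort}} and implements {{remove()}} explicitly.

```java
import java.util.*;

// Hedged sketch of JDK 7-compliant sorting and iterator removal.
public class Jdk7Compat {
  static List<Integer> sortAscending(List<Integer> list) {
    // JDK 7 style: Collections.sort with an anonymous Comparator (no lambdas,
    // no List.sort).
    Collections.sort(list, new Comparator<Integer>() {
      @Override public int compare(Integer a, Integer b) { return a.compareTo(b); }
    });
    return list;
  }

  /** An Iterator that explicitly implements remove(), as JDK 7 requires. */
  static <T> Iterator<T> removableIterator(final List<T> backing) {
    return new Iterator<T>() {
      private int next = 0;
      @Override public boolean hasNext() { return next < backing.size(); }
      @Override public T next() { return backing.get(next++); }
      @Override public void remove() { backing.remove(--next); }
    };
  }
}
```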

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, 
> HDFS-7967.branch-2-1.patch, HDFS-7967.branch-2.001.patch, 
> HDFS-7967.branch-2.002.patch, HDFS-7967.branch-2.8-1.patch, 
> HDFS-7967.branch-2.8.001.patch, HDFS-7967.branch-2.8.002.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.
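The round-robin idea in the last paragraph can be sketched as follows (illustrative Java, not the balancer's actual data structures; here the pending block count stands in for remaining capacity when ordering the storages):

```java
import java.util.*;

// Hedged sketch: instead of randomly skipping through one combined iterator,
// take blocks from the node's storages in turn, visiting fuller storages first.
public class RoundRobinPicker {
  static List<String> pick(Map<String, Deque<String>> blocksByStorage, int count) {
    List<String> storages = new ArrayList<>(blocksByStorage.keySet());
    // Order storages by pending block count, descending (proxy for capacity).
    storages.sort(Comparator.comparingInt((String s) -> -blocksByStorage.get(s).size()));
    List<String> picked = new ArrayList<>();
    int i = 0;
    while (picked.size() < count) {
      Deque<String> queue = blocksByStorage.get(storages.get(i % storages.size()));
      if (!queue.isEmpty()) {
        picked.add(queue.pollFirst());
      }
      i++;
      boolean allDrained = true;
      for (Deque<String> q : blocksByStorage.values()) {
        if (!q.isEmpty()) { allDrained = false; break; }
      }
      if (allDrained) break;  // nothing left to pick
    }
    return picked;
  }
}
```

Each selection is O(1) per block, avoiding the costly triplet scans that random skipping incurs.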






[jira] [Commented] (HDFS-11299) Support multiple Datanode File IO hooks

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813212#comment-15813212
 ] 

Hadoop QA commented on HDFS-11299:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m  
3s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 45s{color} | {color:orange} root: The patch generated 1 new + 578 unchanged 
- 0 fixed = 579 total (was 578) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
55s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11299 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846413/HDFS-11299.002.patch |
| Optional Tests |  asflicense  mvnsite  compile  javac  javadoc  mvninstall  
unit  findbugs  checkstyle  |
| uname | Linux 8cb3224826df 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18115/artifact/patchprocess/diff-checkstyle-root.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18115/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18115/testReport/ |
| modules | C: hadoop-common-project/hadoop-common 
hadoop-hdf

[jira] [Commented] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813207#comment-15813207
 ] 

Hadoop QA commented on HDFS-11209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 26s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 36s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846426/HDFS-11209.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 7f8814d53b92 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18118/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18118/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18118/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
>  

[jira] [Updated] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-11305:
---
Attachment: HDFS-11305.HDFS-8707.001.patch

Cleaned up whitespace.

> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>Priority: Minor
> Attachments: HDFS-11305.HDFS-8707.000.patch, 
> HDFS-11305.HDFS-8707.001.patch
>
>
> The information can be logged at the debug level and should contain the 
> hostname and IP address, along with the file path and offset. With this 
> information, we can check things like rack locality.
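A sketch of the kind of debug message proposed (the real change belongs in the C++ libhdfs++ client; this Java snippet only illustrates the format, and the class and method names are made up):

```java
// Hedged sketch: format a debug-level message carrying the datanode hostname,
// IP address, file path, and read offset, so rack locality can be checked.
public class ReadEventLog {
  static String format(String host, String ip, String path, long offset) {
    return String.format(
        "Reading %s at offset %d from datanode %s (%s)", path, offset, host, ip);
  }
}
```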



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813155#comment-15813155
 ] 

Hadoop QA commented on HDFS-11273:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 7 new + 124 unchanged - 9 fixed = 131 total (was 133) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 46s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
| Timed out junit tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11273 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846411/HDFS-11273.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 545690ddc75b 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18114/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18114/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18114/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18114/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813153#comment-15813153
 ] 

James Clampffer commented on HDFS-11305:


Code looks good to me; would you mind just clearing up whatever whitespace 
issue CI is warning about?

> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>Priority: Minor
> Attachments: HDFS-11305.HDFS-8707.000.patch
>
>
> The information can be logged at debug level and should contain the hostname 
> and IP address, along with the file path and offset. With this information, we 
> can check things like rack locality, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813143#comment-15813143
 ] 

Hadoop QA commented on HDFS-11305:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 
47s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
19s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
49s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
53s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
32s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m  
2s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:78fc6b6 |
| JIRA Issue | HDFS-11305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846421/HDFS-11305.HDFS-8707.000.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux e6193ddc9990 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / 2ceec2b |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_111 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18117/artifact/patchprocess/whitespace-eol.txt
 |
| JDK v1.7.0_121  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18117/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18117/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305

[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813132#comment-15813132
 ] 

Wei-Chiu Chuang commented on HDFS-11308:


(and more portable)

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal entries in the ZKFC log:
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When nc fails with an exception, the return value is 2, and sshfence success 
> cannot be confirmed; this may lead to some problems.
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}
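The misjudgment described above comes from the boolean interpretation of the exit code: any non-zero rc, including a usage error from an incompatible nc, is taken as "service down". A three-way interpretation would avoid that. The sketch below is illustrative only (it is not the Hadoop patch); the rc conventions assumed are 0 for "port open", 1 for "connection refused", and anything else (e.g. 2 from Ncat rejecting the '-z' option) for "the check itself failed".

```java
// Illustrative sketch: interpret the port-check exit code three ways instead
// of the boolean check quoted above.  The rc values are assumptions based on
// common netcat behavior, not taken from the actual Hadoop code.
enum FenceCheckResult { STILL_RUNNING, VERIFIED_DOWN, INDETERMINATE }

final class PortCheckInterpreter {
    static FenceCheckResult interpret(int rc) {
        if (rc == 0) {
            return FenceCheckResult.STILL_RUNNING;   // port still listening
        } else if (rc == 1) {
            return FenceCheckResult.VERIFIED_DOWN;   // connection refused
        } else {
            return FenceCheckResult.INDETERMINATE;   // the checker itself failed
        }
    }
}
```

With this scheme, an INDETERMINATE result would cause the fencer to report failure rather than claim the service is verified down.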



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread tangshangwen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813129#comment-15813129
 ] 

tangshangwen commented on HDFS-11308:
-

ok, thank you for your comments!

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal entries in the ZKFC log:
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When nc fails with an exception, the return value is 2, and sshfence success 
> cannot be confirmed; this may lead to some problems.
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813118#comment-15813118
 ] 

Wei-Chiu Chuang commented on HDFS-11308:


I don't know if there's a standard/specification for command switches in Linux, 
but I am inclined to use a configuration property to replace the hard-coded "nc 
-z" command, so that this kind of issue is easier to work around.
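A minimal sketch of that suggestion: read the check command from configuration as a template and substitute the host and port into it. Everything here is hypothetical, including the default template and the placeholder syntax; no such property exists in Hadoop, and the real fencer would pull the template from its Configuration object.

```java
// Hypothetical sketch of a configurable port-check command, replacing the
// hard-coded "nc -z".  The template string, placeholder names, and default
// value are all invented for illustration.
final class FenceCommandBuilder {
    static final String DEFAULT_TEMPLATE = "nc -z {host} {port}";

    static String build(String template, String host, int port) {
        String t = (template == null || template.isEmpty())
            ? DEFAULT_TEMPLATE : template;
        // Substitute placeholders so admins can plug in any checker,
        // e.g. a bash /dev/tcp probe on CentOS 7 where "nc -z" fails.
        return t.replace("{host}", host)
                .replace("{port}", Integer.toString(port));
    }
}
```

An admin hitting the Ncat incompatibility could then override the template instead of patching the code.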

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal entries in the ZKFC log:
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When nc fails with an exception, the return value is 2, and sshfence success 
> cannot be confirmed; this may lead to some problems.
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813108#comment-15813108
 ] 

Masatake Iwasaki commented on HDFS-11308:
-

On CentOS 7, nc is an alias of ncat, which does not have the "-z" option. There 
seem to be some bash-based replacements:
http://stackoverflow.com/questions/4922943/test-from-shell-script-if-remote-tcp-port-is-open
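Another portable option is to skip external tools entirely and probe the port with a plain TCP connect. The sketch below is only an illustration of that idea, not the Hadoop implementation; note the real fencer runs its check over SSH on the remote host, which a direct connect from the ZKFC machine does not replicate exactly.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: check whether host:port is accepting TCP connections without
// shelling out to nc/ncat.  Class and method names are invented for this
// example.
final class TcpPortProbe {
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;                 // connect succeeded: port is open
        } catch (IOException e) {
            return false;                // refused or timed out: treat as down
        }
    }
}
```

Unlike "nc -z", this cannot fail with an ambiguous usage-error exit code, so the "indeterminate" case in the fencer disappears for this class of problem.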

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal entries in the ZKFC log:
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When nc fails with an exception, the return value is 2, and sshfence success 
> cannot be confirmed; this may lead to some problems.
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813104#comment-15813104
 ] 

Masatake Iwasaki commented on HDFS-11308:
-

Since "nc -z" is used to check whether the process is still alive after "fuser 
-k" failed to kill it, you should check the reason for that failure. The SSH 
user must be the user running the namenode, or root, in order for fuser to work.

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal entries in the ZKFC log:
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When nc fails with an exception, the return value is 2, and sshfence success 
> cannot be confirmed; this may lead to some problems.
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11292) log lastWrittenTxId etc info in logSyncAll

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813078#comment-15813078
 ] 

Wei-Chiu Chuang commented on HDFS-11292:


+1. The failed tests cannot be reproduced in my local tree.

Thanks [~yzhangal]!

> log lastWrittenTxId etc info in logSyncAll
> --
>
> Key: HDFS-11292
> URL: https://issues.apache.org/jira/browse/HDFS-11292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch, 
> HDFS-11292.003.patch
>
>
> For the issue reported in HDFS-10943, even after HDFS-7964's fix is included, 
> the problem still exists, which means there might be some synchronization 
> issue.
> To diagnose that, create this jira to report the lastWrittenTxId info in 
> {{logSyncAll()}} call, such that we can compare against the error message 
> reported in HDFS-7964
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason
> 2.  {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It's observed that the lastWrittenTxId in B and C are the 
> same.
> This proposed reporting would help confirm whether possibility 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
> LOG.info("Ending log segment " + curSegmentTxId);
> Preconditions.checkState(isSegmentOpen(),
> "Bad state: %s", state);
> if (writeEndTxn) {
>   logEdit(LogSegmentOp.getInstance(cache.get(),
>   FSEditLogOpCodes.OP_END_LOG_SEGMENT));
> }
> // always sync to ensure all edits are flushed.
> A.logSyncAll();
> B.printStatistics(true);
> final long lastTxId = getLastWrittenTxId();
> try {
> C.  journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>   editLogStream = null;
> } catch (IOException e) {
>   //All journals have failed, it will be handled in logSync.
> }
> state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11293) [SPS]: Local DN should be given preference as source node, when target available in same node

2017-01-09 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-11293:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-10285
   Status: Resolved  (was: Patch Available)

> [SPS]: Local DN should be given preference as source node, when target 
> available in same node
> -
>
> Key: HDFS-11293
> URL: https://issues.apache.org/jira/browse/HDFS-11293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Yuanbo Liu
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: HDFS-10285
>
> Attachments: HDFS-11293-HDFS-10285-00.patch, 
> HDFS-11293-HDFS-10285-01.patch, HDFS-11293-HDFS-10285-02.patch
>
>
> In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to get replica 
> info by block pool id. But in this situation:
> {code}
> datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
> 1. the same block replica exists in A[DISK] and B[DISK].
> 2. the block pool id of datanode A and datanode B are the same.
> {code}
> Then we start to change the file's storage policy and move the block replica 
> in the cluster. Very likely we have to move the block from B[DISK] to A[SSD]; 
> at that point, datanode A throws ReplicaAlreadyExistsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11293) [SPS]: Local DN should be given preference as source node, when target available in same node

2017-01-09 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813069#comment-15813069
 ] 

Uma Maheswara Rao G commented on HDFS-11293:


I have just committed this to the branch.
Thanks [~yuanbo] and [~rakeshr] for the reviews!
Thanks [~yuanbo] for finding the issue and sharing the test cases.


> [SPS]: Local DN should be given preference as source node, when target 
> available in same node
> -
>
> Key: HDFS-11293
> URL: https://issues.apache.org/jira/browse/HDFS-11293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Yuanbo Liu
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Attachments: HDFS-11293-HDFS-10285-00.patch, 
> HDFS-11293-HDFS-10285-01.patch, HDFS-11293-HDFS-10285-02.patch
>
>
> In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to get replica 
> info by block pool id. But in this situation:
> {code}
> datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
> 1. the same block replica exists in A[DISK] and B[DISK].
> 2. the block pool id of datanode A and datanode B are the same.
> {code}
> Then we start to change the file's storage policy and move the block replica 
> in the cluster. Very likely we have to move the block from B[DISK] to A[SSD]; 
> at that point, datanode A throws ReplicaAlreadyExistsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11028) libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel pending connections

2017-01-09 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813059#comment-15813059
 ] 

James Clampffer commented on HDFS-11028:


In case we needed another example of why "delete this" and tenuous resource 
management in C++ are a recipe for pain: it looks like this can leak memory if 
the FileSystem destructor is called while this waits on the asio DNS resolver.  
The bug existed before this patch, but the cancel test executable in my patch 
provides a simple reproducer.

Situation:
In common/namenode_info.cc, BulkResolve takes a set of NamenodeInfo objects and 
does a DNS lookup on each host to get a vector of endpoints.  In order to be 
fast, the function does an arbitrary number of async lookups in parallel and 
joins at the end to make the API reasonably simple to use.

In order to keep track of multiple pipelines, a vector of 
std::pair<Pipeline<...>*, std::shared_ptr<std::promise<Status>>> is set up.  
Each pair represents a continuation pipeline that's doing the resolve work and 
the std::promise that will eventually contain the result status, assuming the 
continuation runs to completion.  This seemed like a reasonable way to 
encapsulate async work using continuations that needed to be joined, but it 
turns out it's incredibly difficult to clean up if it's been interrupted.

-Can't simply call delete on the Pipeline pointers contained in the vector, 
because the continuation may have already called "delete this"; if it has 
self-destructed, the pointer remains non-null, so double deleting will break 
things.
-Can't loop through the vector and call cancel on all the Pipelines, because 
some may have been destructed via "delete this".  If the malloc implementation 
is being generous, the call might give a __cxa_pure_virtual error, but it's 
more likely to just trash the heap.
-Can't check the status of the Pipeline, because it's wrapped in a promise, so 
that will just block.

Possible fixes:
-Add a pointer-to-a-flag to the continuation state so the pipeline can indicate 
that it self-destructed, and make sure the ResolveContinuation can actually 
handle cancel semantics.
-Rewrite the DNS lookup by allocating memory correctly and calling the asio 
functions directly.

I'll file another jira to rewrite the DNS resolution code, as I don't think an 
issue that's been around for so long should block this.  The temporary fix is 
to avoid deleting the FileSystem object immediately after cancel.  The pipeline 
will clean itself up when the resolver returns, but it risks invalid writes if 
the vector of endpoints disappears, since it's holding a back_inserter; i.e., 
it's a dangling-pointer issue obfuscated by piles of abstraction.

> libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel 
> pending connections
> --
>
> Key: HDFS-11028
> URL: https://issues.apache.org/jira/browse/HDFS-11028
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-11028.HDFS-8707.000.patch, 
> HDFS-11028.HDFS-8707.001.patch, HDFS-11028.HDFS-8707.002.patch
>
>
> Cancel support is now reasonably robust except in the case where a FileHandle 
> operation ends up causing the RpcEngine to try to create a new RpcConnection. 
>  In HA configs it's common to have something like 10-20 failovers and a 20 
> second failover delay (no exponential backoff just yet). This means that all 
> of the functions with synchronous interfaces can still block for many minutes 
> after an operation has been canceled, and often the cause of this is 
> something trivial like a bad config file.
> The current design makes this sort of thing tricky to do because the 
> FileHandles need to be individually cancelable via CancelOperations, but they 
> share the RpcEngine that does the async magic.
> Updated design:
> Original design would end up forcing lots of reconnects.  Not a huge issue on 
> an unauthenticated cluster but on a kerberized cluster this is a recipe for 
> Kerberos thinking we're attempting a replay attack.
> User-visible cancellation and internal resource cleanup are separable 
> issues.  The former can be implemented by atomically swapping the callback of 
> the operation to be canceled with a no-op callback.  The original callback is 
> then posted to the IoService with an OperationCanceled status and the user is 
> no longer blocked.  For RPC cancels this is sufficient: it's not expensive to 
> keep a request around a little bit longer, and when it's eventually invoked 
> or timed out it invokes the no-op callback and is ignored (other than a trace 
> level log notification).  Connect cancels push a flag down into the RPC 
> engine to kill the connection and make sure it doesn't attempt to reconnect.
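The callback-swap scheme described above (atomically replace the pending 
operation's callback with a no-op, then post the original callback with a 
canceled status) can be sketched as follows. The sketch is in Java for brevity; 
libhdfs++ itself is C++, and all names here are illustrative rather than the 
library's actual API.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Sketch of user-visible cancellation via callback swapping.
// Names are hypothetical, not from libhdfs++.
class CancellableOperation {
    // Shared no-op that replaces the real callback once canceled.
    private static final Consumer<String> NOOP = status -> { };
    private final AtomicReference<Consumer<String>> callback;

    CancellableOperation(Consumer<String> userCallback) {
        this.callback = new AtomicReference<>(userCallback);
    }

    // Called by the user: unblocks immediately with a canceled status.
    void cancel() {
        Consumer<String> original = callback.getAndSet(NOOP);
        if (original != NOOP) {
            // Deliver OperationCanceled right away; the in-flight request
            // is left alone and will hit the no-op when it finishes.
            original.accept("OperationCanceled");
        }
    }

    // Called by the I/O layer when the request completes or times out;
    // after a cancel this invokes the no-op and is effectively ignored.
    void complete(String status) {
        callback.get().accept(status);
    }
}
```

The key property is that the swap is atomic, so the user callback runs exactly 
once: either with the real result or with the canceled status, never both.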

[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread tangshangwen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813039#comment-15813039
 ] 

tangshangwen commented on HDFS-11308:
-

Hi [~jojochuang], do you think we should check whether the return value is 1 to 
avoid this problem?
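Checking for a return value of 1 would distinguish nc's "connection refused" 
result from an nc failure such as the invalid -z option seen in the log. A 
minimal sketch of that logic follows; the class, method, and constant names are 
illustrative, not the actual SshFenceByTcpPort patch.

```java
// Sketch: only nc exit code 1 (connection refused) proves the service is
// down; codes >= 2 mean nc itself failed, so fencing stays unconfirmed.
class FenceCheck {
    static final int NC_PORT_OPEN = 0;   // service is still listening
    static final int NC_PORT_CLOSED = 1; // connection refused: fenced

    static boolean verifiedDown(int rc) {
        if (rc == NC_PORT_OPEN) {
            // still listening - we are unable to fence
            return false;
        } else if (rc == NC_PORT_CLOSED) {
            // verified that the service is down
            return true;
        } else {
            // nc error (invalid option, missing binary, ...):
            // indeterminate, so do not report a successful fence
            return false;
        }
    }
}
```

With this check, the Ncat "invalid option -- 'z'" failure (exit code 2) would 
no longer be mistaken for a verified fence.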


> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal messages in the ZKFC log
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When I run nc an exception occurs and the return value is 2, so sshfence 
> success cannot be confirmed; this may lead to some problems
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread tangshangwen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813035#comment-15813035
 ] 

tangshangwen commented on HDFS-11308:
-

Ok, thank you ~ [~jojochuang]

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal messages in the ZKFC log
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When I run nc an exception occurs and the return value is 2, so sshfence 
> success cannot be confirmed; this may lead to some problems
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813031#comment-15813031
 ] 

Hadoop QA commented on HDFS-7967:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 
39s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 26s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 366 unchanged - 9 fixed = 371 total (was 375) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 29s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}113m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.hdfs.tools.TestDFSZKFailoverController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-7967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846404/HDFS-7967.branch-2.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 357bacb0b6a2 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/pe

[jira] [Commented] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813026#comment-15813026
 ] 

Wei-Chiu Chuang commented on HDFS-11308:


I don't have a CentOS 7 box available, but it looks like a platform issue.

The same bug was reported here: 
https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1349384

Looks like the workaround is to use nmap.

> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal messages in the ZKFC log
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When I run nc an exception occurs and the return value is 2, so sshfence 
> success cannot be confirmed; this may lead to some problems
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAddr.getPort());
> if (rc == 0) {
>   // the service is still listening - we are unable to fence
>   LOG.warn("Unable to fence - it is running but we cannot kill it");
>   return false;
> } else {
>   LOG.info("Verified that the service is down.");
>   return true;  
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11299) Support multiple Datanode File IO hooks

2017-01-09 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813013#comment-15813013
 ] 

Xiaoyu Yao commented on HDFS-11299:
---

Thanks [~hanishakoneru] for updating the patch and [~arpitagarwal] for the 
review. 

Looks like LEN_INT can be replaced with Integer.BYTES from the JDK. But 
Integer.BYTES is only available in JDK 1.8, so it would cause more work when 
backporting to branch-2. 
I'm OK with the patch v02. +1 pending Jenkins.
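For the backport concern: a JDK 1.7-compatible equivalent of Integer.BYTES can 
be derived from Integer.SIZE (bits), which has been in the JDK since 1.5. The 
constant name LEN_INT below mirrors the one discussed; the class is just an 
illustration, not the patch itself.

```java
// Integer.BYTES (JDK 1.8+) is the width of an int in bytes.
// On JDK 1.7 the same value can be computed from Integer.SIZE.
class IntWidth {
    // Works on JDK 1.7: 32 bits / 8 bits-per-byte = 4
    static final int LEN_INT = Integer.SIZE / Byte.SIZE;

    // JDK 1.8+ only
    static int jdk8Bytes() {
        return Integer.BYTES;
    }
}
```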

> Support multiple Datanode File IO hooks
> ---
>
> Key: HDFS-11299
> URL: https://issues.apache.org/jira/browse/HDFS-11299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11299.000.patch, HDFS-11299.001.patch, 
> HDFS-11299.002.patch
>
>
> HDFS-10958 introduces instrumentation hooks around DataNode disk IO and 
> HDFS-10959 adds support for profiling hooks to expose latency statistics. 
> Instead of choosing only one hook using Config parameters, we want to add two 
> separate hooks - one for profiling and one for fault injection. The fault 
> injection hook will be useful for testing purposes. 
> This jira only introduces support for fault injection hook. The 
> implementation for that will come later on.
> Also, now Default and Counting FileIOEvents would not be needed as we can 
> control enabling the profiling and fault injection hooks using config 
> parameters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Attachment: HDFS-9391.04.patch

Attached v04 patch to fix the LeavingServiceStatus and use outOfServiceReplica 
block counts in the DN UI pages. [~mingma], [~eddyxu], can you please take a 
look at this patch revision? 

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, HDFS-9391.04.patch, Maintenance 
> webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread tangshangwen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tangshangwen updated HDFS-11308:

Description: 
In our cluster, I found some abnormal messages in the ZKFC log
{noformat}
[2017-01-10T01:42:37.168+08:00] [INFO] 
hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health Monitor 
for NameNode at xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] 
: Indeterminate response from trying to kill service. Verifying whether it is 
running using nc...
[2017-01-10T01:42:37.234+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
[2017-01-10T01:42:37.235+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
for more information, usage options and help. QUITTING.
{noformat}

When I run nc an exception occurs and the return value is 2, so sshfence 
success cannot be confirmed; this may lead to some problems
{code:title=SshFenceByTcpPort.java|borderStyle=solid}
rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
" " + serviceAddr.getPort());
if (rc == 0) {
  // the service is still listening - we are unable to fence
  LOG.warn("Unable to fence - it is running but we cannot kill it");
  return false;
} else {
  LOG.info("Verified that the service is down.");
  return true;  
}
{code}


  was:
In our Cluster, I found some abnormal in ZKFC log
{noformat}
[2017-01-10T01:42:37.168+08:00] [INFO] 
hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health Monitor 
for NameNode at xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] 
: Indeterminate response from trying to kill service. Verifying whether it is 
running using nc...
[2017-01-10T01:42:37.234+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
BJYF-Druid-17224.hadoop.jd.local 8021 via ssh: StreamPumper for STDERR] : nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
[2017-01-10T01:42:37.235+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
BJYF-Druid-17224.hadoop.jd.local 8021 via ssh: Ncat: Try `--help' or man(1) 
ncat for more information, usage options and help. QUITTING.
{noformat}

When I perform nc an exception occurs, the return value is 2, and cannot 
confirm sshfence success,this may lead to some problems
{code:title=SshFenceByTcpPort.java|borderStyle=solid}
rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
" " + serviceAddr.getPort());
if (rc == 0) {
  // the service is still listening - we are unable to fence
  LOG.warn("Unable to fence - it is running but we cannot kill it");
  return false;
} else {
  LOG.info("Verified that the service is down.");
  return true;  
}
{code}



> NameNode doFence state judgment problem
> ---
>
> Key: HDFS-11308
> URL: https://issues.apache.org/jira/browse/HDFS-11308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.7.1
> Environment: CentOS Linux release 7.1.1503 (Core)
>Reporter: tangshangwen
>
> In our cluster, I found some abnormal messages in the ZKFC log
> {noformat}
> [2017-01-10T01:42:37.168+08:00] [INFO] 
> hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health 
> Monitor for NameNode at 
> xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] : 
> Indeterminate response from trying to kill service. Verifying whether it is 
> running using nc...
> [2017-01-10T01:42:37.234+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
> [2017-01-10T01:42:37.235+08:00] [WARN] 
> hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
> xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
> xxx-xxx-17224.hadoop.xxx.com 8021 via ssh: Ncat: Try `--help' or man(1) ncat 
> for more information, usage options and help. QUITTING.
> {noformat}
> When I run nc an exception occurs and the return value is 2, so sshfence 
> success cannot be confirmed; this may lead to some problems
> {code:title=SshFenceByTcpPort.java|borderStyle=solid}
> rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
> " " + serviceAd

[jira] [Created] (HDFS-11308) NameNode doFence state judgment problem

2017-01-09 Thread tangshangwen (JIRA)
tangshangwen created HDFS-11308:
---

 Summary: NameNode doFence state judgment problem
 Key: HDFS-11308
 URL: https://issues.apache.org/jira/browse/HDFS-11308
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover
Affects Versions: 2.7.1
 Environment: CentOS Linux release 7.1.1503 (Core)
Reporter: tangshangwen


In our cluster, I found some abnormal messages in the ZKFC log
{noformat}
[2017-01-10T01:42:37.168+08:00] [INFO] 
hadoop.ha.SshFenceByTcpPort.doFence(SshFenceByTcpPort.java 147) [Health Monitor 
for NameNode at xxx-xxx-172xxx.hadoop.xxx.com/xxx.xxx.172.xxx:8021-EventThread] 
: Indeterminate response from trying to kill service. Verifying whether it is 
running using nc...
[2017-01-10T01:42:37.234+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
BJYF-Druid-17224.hadoop.jd.local 8021 via ssh: StreamPumper for STDERR] : nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: nc: invalid option -- 'z'
[2017-01-10T01:42:37.235+08:00] [WARN] 
hadoop.ha.SshFenceByTcpPort.pump(StreamPumper.java 88) [nc -z 
xxx-xxx-172xx.hadoop.xxx.com 8021 via ssh: StreamPumper for STDERR] : nc -z 
BJYF-Druid-17224.hadoop.jd.local 8021 via ssh: Ncat: Try `--help' or man(1) 
ncat for more information, usage options and help. QUITTING.
{noformat}

When I run nc an exception occurs and the return value is 2, so sshfence 
success cannot be confirmed; this may lead to some problems
{code:title=SshFenceByTcpPort.java|borderStyle=solid}
rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
" " + serviceAddr.getPort());
if (rc == 0) {
  // the service is still listening - we are unable to fence
  LOG.warn("Unable to fence - it is running but we cannot kill it");
  return false;
} else {
  LOG.info("Verified that the service is down.");
  return true;  
}
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11307) The rpc to portmap service for NFS has hardcoded timeout.

2017-01-09 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HDFS-11307:
---

 Summary: The rpc to portmap service for NFS has hardcoded timeout. 
 Key: HDFS-11307
 URL: https://issues.apache.org/jira/browse/HDFS-11307
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Mukul Kumar Singh


The NFS service makes an RPC call to the portmap, but the timeout is 
hardcoded. Tests on slow virtual machines sometimes fail due to the timeout. We 
should make the timeout configurable, with the same default as the current 
value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812983#comment-15812983
 ] 

Xiaoyu Yao commented on HDFS-11209:
---

I've also manually tested on a cluster with a rolling upgrade (non-HA) from NN 
layout version 60 to 63 and verified that the patch fixes the checkpoint 
problem on the SNN. 

> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11209.00.patch, HDFS-11209.01.patch, 
> HDFS-11209.02.patch
>
>
> Similar problem has been fixed with HDFS-7185. Recent change in HDFS-8432 
> brings this back. 
> With HDFS-8432, the primary NN will not update the VERSION file to the new 
> version after running with "rollingUpgrade" option until upgrade is 
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file 
> when the rollingUpgrade is not finalized yet. As a result, the SNN 
> checkpoints successfully but fails to push the image to the primary NN 
> because its version is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode 
> (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  Image uploading failed, status: 403, url: 
> http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., 
> message: This namenode has storage info -60:221856466:1444080250181:clusterX 
> but the secondary expected -63:221856466:1444080250181:clusterX
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-11209:
--
Attachment: HDFS-11209.02.patch

Fixed the checkstyle issues and added a unit test.

> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11209.00.patch, HDFS-11209.01.patch, 
> HDFS-11209.02.patch
>
>
> Similar problem has been fixed with HDFS-7185. Recent change in HDFS-8432 
> brings this back. 
> With HDFS-8432, the primary NN will not update the VERSION file to the new 
> version after running with "rollingUpgrade" option until upgrade is 
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file 
> when the rollingUpgrade is not finalized yet. As a result, the SNN 
> checkpoints successfully but fails to push the image to the primary NN 
> because its version is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode 
> (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  Image uploading failed, status: 403, url: 
> http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., 
> message: This namenode has storage info -60:221856466:1444080250181:clusterX 
> but the secondary expected -63:221856466:1444080250181:clusterX
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11299) Support multiple Datanode File IO hooks

2017-01-09 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812955#comment-15812955
 ] 

Arpit Agarwal commented on HDFS-11299:
--

Thanks [~hanishakoneru].

+1 pending Jenkins. Looks like Xiaoyu's feedback is also addressed but will 
hold off committing until tomorrow in case he has more feedback.

> Support multiple Datanode File IO hooks
> ---
>
> Key: HDFS-11299
> URL: https://issues.apache.org/jira/browse/HDFS-11299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11299.000.patch, HDFS-11299.001.patch, 
> HDFS-11299.002.patch
>
>
> HDFS-10958 introduces instrumentation hooks around DataNode disk IO and 
> HDFS-10959 adds support for profiling hooks to expose latency statistics. 
> Instead of choosing only one hook using Config parameters, we want to add two 
> separate hooks - one for profiling and one for fault injection. The fault 
> injection hook will be useful for testing purposes. 
> This jira only introduces support for fault injection hook. The 
> implementation for that will come later on.
> Also, now Default and Counting FileIOEvents would not be needed as we can 
> control enabling the profiling and fault injection hooks using config 
> parameters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11306) Print remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11306:
---
Summary: Print remaining edit logs from buffer if edit log can't be rolled. 
 (was: Dump remaining edit logs from buffer if edit log can't be rolled.)

> Print remaining edit logs from buffer if edit log can't be rolled.
> --
>
> Key: HDFS-11306
> URL: https://issues.apache.org/jira/browse/HDFS-11306
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11306.001.patch
>
>
> In HDFS-10943, [~yzhangal] reported that the edit log cannot be rolled due to 
> unexpected edit logs lingering in the buffer.
> Unable to root-cause the bug, I propose that we dump the remaining edit logs 
> in the buffer into the namenode log before crashing the namenode. Use this new 
> capability to find the ops that sneak into the buffer unexpectedly, and 
> hopefully catch the bug.
> This effort is orthogonal, but related, to HDFS-11292, which adds additional 
> informational logs to help debug this issue.






[jira] [Updated] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-11305:
---
Attachment: HDFS-11305.HDFS-8707.000.patch

The 000 patch adds a debug log in FileHandleImpl::AsyncPreadSome that displays 
the datanode hostname and IP address, together with the file path and offset of 
the HDFS block being read.

> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>Priority: Minor
> Attachments: HDFS-11305.HDFS-8707.000.patch
>
>
> The information can be logged at debug level and include the hostname and IP 
> address, along with the file path and offset. With this information, we can 
> check things like rack locality, etc.






[jira] [Updated] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-11305:
---
Status: Patch Available  (was: Open)

> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>Priority: Minor
>
> The information can be logged at debug level and include the hostname and IP 
> address, along with the file path and offset. With this information, we can 
> check things like rack locality, etc.






[jira] [Updated] (HDFS-11306) Dump remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11306:
---
Status: Patch Available  (was: Open)

> Dump remaining edit logs from buffer if edit log can't be rolled.
> -
>
> Key: HDFS-11306
> URL: https://issues.apache.org/jira/browse/HDFS-11306
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11306.001.patch
>
>
> In HDFS-10943, [~yzhangal] reported that the edit log cannot be rolled due to 
> unexpected edit logs lingering in the buffer.
> Unable to root-cause the bug, I propose that we dump the remaining edit logs 
> in the buffer into the namenode log before crashing the namenode. Use this new 
> capability to find the ops that sneak into the buffer unexpectedly, and 
> hopefully catch the bug.
> This effort is orthogonal, but related, to HDFS-11292, which adds additional 
> informational logs to help debug this issue.






[jira] [Updated] (HDFS-11306) Dump remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11306:
---
Attachment: HDFS-11306.001.patch

Uploaded patch v001. This patch adds a private method that dumps edit logs in 
human-readable format into the namenode log. A test case is also added.

Any suggestions are greatly appreciated; I honestly do not have much experience 
with edit logs and HA.

> Dump remaining edit logs from buffer if edit log can't be rolled.
> -
>
> Key: HDFS-11306
> URL: https://issues.apache.org/jira/browse/HDFS-11306
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11306.001.patch
>
>
> In HDFS-10943, [~yzhangal] reported that the edit log cannot be rolled due to 
> unexpected edit logs lingering in the buffer.
> Unable to root-cause the bug, I propose that we dump the remaining edit logs 
> in the buffer into the namenode log before crashing the namenode. Use this new 
> capability to find the ops that sneak into the buffer unexpectedly, and 
> hopefully catch the bug.
> This effort is orthogonal, but related, to HDFS-11292, which adds additional 
> informational logs to help debug this issue.






[jira] [Created] (HDFS-11306) Dump remaining edit logs from buffer if edit log can't be rolled.

2017-01-09 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-11306:
--

 Summary: Dump remaining edit logs from buffer if edit log can't be 
rolled.
 Key: HDFS-11306
 URL: https://issues.apache.org/jira/browse/HDFS-11306
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, namenode
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


In HDFS-10943, [~yzhangal] reported that the edit log cannot be rolled due to 
unexpected edit logs lingering in the buffer.

Unable to root-cause the bug, I propose that we dump the remaining edit logs in 
the buffer into the namenode log before crashing the namenode. Use this new 
capability to find the ops that sneak into the buffer unexpectedly, and 
hopefully catch the bug.

This effort is orthogonal, but related, to HDFS-11292, which adds additional 
informational logs to help debug this issue.
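The proposed dump could be sketched minimally as below. Everything here is illustrative: the real buffered entries are serialized FSEditLogOp instances, and the class and method names are invented for this sketch only:

```java
import java.util.List;

// Hypothetical sketch of dumping buffered edit ops in human-readable
// form before aborting; the real FSEditLogOp class is far richer, and
// these names are illustrative only.
class EditBufferDumper {
  // A buffered op reduced to its transaction id and op name.
  static class BufferedOp {
    final long txid;
    final String opName;
    BufferedOp(long txid, String opName) { this.txid = txid; this.opName = opName; }
  }

  // Render one line per remaining op, so the unexpected ops can be
  // identified from the namenode log after the crash.
  static String dump(List<BufferedOp> remaining) {
    StringBuilder sb = new StringBuilder("Remaining ops in edit log buffer:");
    for (BufferedOp op : remaining) {
      sb.append(System.lineSeparator())
        .append("  txid=").append(op.txid).append(" op=").append(op.opName);
    }
    return sb.toString();
  }
}
```

Logging the txid of each lingering op is the key part: it makes a gap like the one in HDFS-10943 attributable to a specific operation.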






[jira] [Comment Edited] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812904#comment-15812904
 ] 

Manoj Govindassamy edited comment on HDFS-9391 at 1/9/17 9:30 PM:
--

Sure. 

It's about whether we should include all blocks in *Maintenance + Decommission* 
states under "Block with No Live Replicas" for each DN on the "Entering 
Maintenance" and "Decommissioning" pages. Previously I was trying to have them 
include only one of these states (as per the initial discussion in this jira). 
But after thinking more about it and after our discussion, I feel including both 
states makes sense. Will upload the new patch soon. Thanks a lot for the 
review.


was (Author: manojg):
Sure. 

It's about whether we should include all blocks in *Maintenance + Decommission* 
states under "Block with No Live Replicas" for each DN on the "Entering 
Maintenance" and "Decommissioning" pages. Previously I was trying to have them 
include one of these states. But after thinking more about it and after our 
discussion, I feel including both states makes sense. Will upload the new patch 
soon. Thanks a lot for the review.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812904#comment-15812904
 ] 

Manoj Govindassamy commented on HDFS-9391:
--

Sure. 

It's about whether we should include all blocks in *Maintenance + Decommission* 
states under "Block with No Live Replicas" for each DN on the "Entering 
Maintenance" and "Decommissioning" pages. Previously I was trying to have them 
include one of these states. But after thinking more about it and after our 
discussion, I feel including both states makes sense. Will upload the new patch 
soon. Thanks a lot for the review.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, Maintenance webUI.png
>
>







[jira] [Commented] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812883#comment-15812883
 ] 

Hadoop QA commented on HDFS-11209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 151 unchanged - 0 fixed = 153 total (was 151) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}103m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846397/HDFS-11209.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux aee66256 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18112/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18112/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Updated] (HDFS-11305) libhdfs++: Log Datanode information when reading an HDFS block

2017-01-09 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-11305:
---
Summary: libhdfs++: Log Datanode information when reading an HDFS block  
(was: Log Datanode information when reading an HDFS block)

> libhdfs++: Log Datanode information when reading an HDFS block
> --
>
> Key: HDFS-11305
> URL: https://issues.apache.org/jira/browse/HDFS-11305
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaowei Zhu
>Assignee: Xiaowei Zhu
>Priority: Minor
>
> The information can be logged at debug level and include the hostname and IP 
> address, along with the file path and offset. With this information, we can 
> check things like rack locality, etc.






[jira] [Created] (HDFS-11305) Log Datanode information when reading an HDFS block

2017-01-09 Thread Xiaowei Zhu (JIRA)
Xiaowei Zhu created HDFS-11305:
--

 Summary: Log Datanode information when reading an HDFS block
 Key: HDFS-11305
 URL: https://issues.apache.org/jira/browse/HDFS-11305
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaowei Zhu
Assignee: Xiaowei Zhu
Priority: Minor


The information can be logged at debug level and include the hostname and IP 
address, along with the file path and offset. With this information, we can 
check things like rack locality, etc.






[jira] [Updated] (HDFS-11299) Support multiple Datanode File IO hooks

2017-01-09 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-11299:
--
Attachment: HDFS-11299.002.patch

Thank you [~xyao] and [~arpitagarwal] for reviewing the patch and for the 
comments. I have addressed them in patch v02.

> Support multiple Datanode File IO hooks
> ---
>
> Key: HDFS-11299
> URL: https://issues.apache.org/jira/browse/HDFS-11299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11299.000.patch, HDFS-11299.001.patch, 
> HDFS-11299.002.patch
>
>
> HDFS-10958 introduces instrumentation hooks around DataNode disk IO, and 
> HDFS-10959 adds support for profiling hooks to expose latency statistics. 
> Instead of choosing only one hook via config parameters, we want to add two 
> separate hooks - one for profiling and one for fault injection. The fault 
> injection hook will be useful for testing purposes. 
> This jira only introduces support for the fault injection hook; its 
> implementation will come later.
> Also, the Default and Counting FileIOEvents would no longer be needed, as we 
> can control enabling the profiling and fault injection hooks using config 
> parameters.






[jira] [Updated] (HDFS-11273) Move TransferFsImage#doGetUrl function to a Util class

2017-01-09 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-11273:
--
Attachment: HDFS-11273.004.patch

> Move TransferFsImage#doGetUrl function to a Util class
> --
>
> Key: HDFS-11273
> URL: https://issues.apache.org/jira/browse/HDFS-11273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11273.000.patch, HDFS-11273.001.patch, 
> HDFS-11273.002.patch, HDFS-11273.003.patch, HDFS-11273.004.patch
>
>
> TransferFsImage#doGetUrl downloads files from the specified url and stores 
> them in the specified storage location. HDFS-4025 plans to synchronize the 
> log segments in JournalNodes. If a log segment is missing from a JN, the JN 
> downloads it from another JN which has the required log segment. We need 
> TransferFsImage#doGetUrl and TransferFsImage#receiveFile to accomplish this, 
> so we propose to move these functions to a utility class so that they can be 
> reused for JournalNode syncing without duplicating code.
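Assuming the extracted helper keeps the simplest possible signature, a sketch of the utility might look like the following (illustrative only; the real doGetUrl additionally deals with timeouts, throttling, and image digest verification):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of a shared download utility; the real
// TransferFsImage#doGetUrl also handles timeouts, bandwidth throttling,
// and MD5 verification of the received file.
final class TransferUtil {
  private TransferUtil() {}

  // Stream the resource at url into dest and return the byte count, so
  // both image transfer and JournalNode log syncing can reuse it.
  static long doGetUrl(URL url, Path dest) throws IOException {
    try (InputStream in = url.openStream()) {
      return Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

Keeping the method free of any NameNode-specific state is what would let a JournalNode call it directly when fetching a missing log segment.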






[jira] [Commented] (HDFS-11293) [SPS]: Local DN should be given preference as source node, when target available in same node

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812794#comment-15812794
 ] 

Hadoop QA commented on HDFS-11293:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 0s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}151m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}172m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestFileChecksum |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11293 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846378/HDFS-11293-HDFS-10285-02.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5f854f997750 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10285 / 5aacf2a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18109/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18109/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hd

[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812791#comment-15812791
 ] 

Allen Wittenauer commented on HDFS-7967:


Sorry. :(

I tried hard to make hyphens-as-separators work (and in some cases they do), 
but other cases were just not reliable enough, even when cross-referencing a 
list of valid branches. (e.g., is HADOOP-6671-HADOOP-6671-2.patch a patch for 
the HADOOP-6671 branch or the HADOOP-6671-2 branch? The YARN-5355-branch-2 
branch brings a whole new level of hurt...)

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, 
> HDFS-7967.branch-2-1.patch, HDFS-7967.branch-2.001.patch, 
> HDFS-7967.branch-2.8-1.patch, HDFS-7967.branch-2.8.001.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: HDFS-7967.branch-2.001.patch
HDFS-7967.branch-2.8.001.patch

Renamed patches to make pre-commit happ(ier).

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, 
> HDFS-7967.branch-2-1.patch, HDFS-7967.branch-2.001.patch, 
> HDFS-7967.branch-2.8-1.patch, HDFS-7967.branch-2.8.001.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.
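The round-robin alternative in the last paragraph of the description can be sketched as follows, with each storage reduced to a queue of block IDs. This is illustrative only: in a real version the storages list would be ordered by remaining capacity, and the elements would be block iterators over DatanodeStorageInfo rather than bare IDs:

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: pick candidate blocks round-robin across a
// node's storages instead of skipping a random number of blocks, so no
// costly scan-and-skip over triplets is needed.
class RoundRobinBlockPicker {
  static List<Long> pick(List<Deque<Long>> storages, int count) {
    List<Long> picked = new ArrayList<>();
    boolean progress = true;
    while (picked.size() < count && progress) {
      progress = false;  // stops the loop once every storage is drained
      for (Deque<Long> storage : storages) {
        if (!storage.isEmpty()) {
          picked.add(storage.poll());  // O(1) per block, no skipping
          progress = true;
          if (picked.size() == count) {
            break;
          }
        }
      }
    }
    return picked;
  }
}
```

Each selection is constant time, and interleaving the storages naturally spreads the moves across them, addressing the intra-node imbalance the description mentions.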






[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2017-01-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812642#comment-15812642
 ] 

Ming Ma commented on HDFS-9391:
---

Thanks Manoj. I just found something related to our discussion. For any 
decommissioning node, given that getDecommissionOnlyReplicas is the same as 
getOutOfServiceOnlyReplicas, can we just use the getOutOfServiceOnlyReplicas 
value for the JSON decommissionOnlyReplicas property? The same applies to any 
entering-maintenance node. In other words, we might not need to add the extra 
decommissionOnlyReplicas and maintenanceOnlyReplicas to LeavingServiceStatus.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> HDFS-9391.02.patch, HDFS-9391.03.patch, Maintenance webUI.png
>
>







[jira] [Comment Edited] (HDFS-11304) Namenode fails to start, even edit log available in the journal node

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812634#comment-15812634
 ] 

Wei-Chiu Chuang edited comment on HDFS-11304 at 1/9/17 7:38 PM:


Thanks for reporting the issue, [~kpalanisamy].
If a NameNode crashes because the edit log has a gap, and the gap is not due to 
some operational error, it can't be just a minor issue. It has to be at least a 
major one. Bumping up the priority.


was (Author: jojochuang):
Thanks for reporting the issue, [~kpalanisamy].
If a NameNode crashes because the edit log has a gap, it can't be just a minor 
issue. It has to be at least a major one. Bumping up the priority.

> Namenode fails to start, even edit log available in the journal node
> 
>
> Key: HDFS-11304
> URL: https://issues.apache.org/jira/browse/HDFS-11304
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, journal-node
>Affects Versions: 2.8.0, 2.7.1
> Environment: *HDP 2.4.2.0-258*
>Reporter: Karthik P
>Assignee: Karthik P
>  Labels: patch
>
> JN => JournalNode
> NN => Namenode local directory (_dfs.namenode.name.dir_)
> Y/N => Is edit file/log present?
> Ex : edits_1627921-1627961
> *Scenario:*
> ||JN 1||JN 2||JN 3||NN local|| Is NN started?
> |N|N|Y|N|Started|   
> |Y|N|N|N|Started|
> |N|Y|N|N|Failed|
> |N|Y|N|Y|Started|
> |Y|Y|N|N|Started| 
> *Note:* Namenode and JN2 installed on the same machine
> *Trace :*
>  ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start 
> namenode.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 1627921, but got txid 1627962.
>   at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)






[jira] [Updated] (HDFS-11304) Namenode fails to start, even edit log available in the journal node

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11304:
---
Priority: Major  (was: Minor)

> Namenode fails to start, even edit log available in the journal node
> 
>
> Key: HDFS-11304
> URL: https://issues.apache.org/jira/browse/HDFS-11304
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, journal-node
>Affects Versions: 2.8.0, 2.7.1
> Environment: *HDP 2.4.2.0-258*
>Reporter: Karthik P
>Assignee: Karthik P
>  Labels: patch
>
> JN => JournalNode
> NN => Namenode local directory (_dfs.namenode.name.dir_)
> Y/N => Is edit file/log present?
> Ex : edits_1627921-1627961
> *Scenario:*
> ||JN 1||JN 2||JN 3||NN local|| Is NN started?
> |N|N|Y|N|Started|   
> |Y|N|N|N|Started|
> |N|Y|N|N|Failed|
> |N|Y|N|Y|Started|
> |Y|Y|N|N|Started| 
> *Note:* Namenode and JN2 installed on the same machine
> *Trace :*
>  ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start 
> namenode.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 1627921, but got txid 1627962.
>   at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)






[jira] [Commented] (HDFS-11304) Namenode fails to start, even edit log available in the journal node

2017-01-09 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812634#comment-15812634
 ] 

Wei-Chiu Chuang commented on HDFS-11304:


Thanks for reporting the issue, [~kpalanisamy].
If a NameNode crashes because the edit log has a gap, it can't be just a minor 
issue. It has to be at least a major one. Bumping up the priority.

> Namenode fails to start, even edit log available in the journal node
> 
>
> Key: HDFS-11304
> URL: https://issues.apache.org/jira/browse/HDFS-11304
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, journal-node
>Affects Versions: 2.8.0, 2.7.1
> Environment: *HDP 2.4.2.0-258*
>Reporter: Karthik P
>Assignee: Karthik P
>  Labels: patch
>
> JN => JournalNode
> NN => Namenode local directory (_dfs.namenode.name.dir_)
> Y/N => Is edit file/log present?
> Ex : edits_1627921-1627961
> *Scenario:*
> ||JN 1||JN 2||JN 3||NN local|| Is NN started?
> |N|N|Y|N|Started|   
> |Y|N|N|N|Started|
> |N|Y|N|N|Failed|
> |N|Y|N|Y|Started|
> |Y|Y|N|N|Started| 
> *Note:* Namenode and JN2 installed on the same machine
> *Trace :*
>  ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start 
> namenode.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 1627921, but got txid 1627962.
>   at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)






[jira] [Updated] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-11209:
--
Attachment: HDFS-11209.01.patch

Fix the build issue.

> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11209.00.patch, HDFS-11209.01.patch
>
>
> A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432 
> brings it back. 
> With HDFS-8432, the primary NN will not update the VERSION file to the new 
> version after running with the "rollingUpgrade" option until the upgrade is 
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file 
> when the rolling upgrade is not finalized yet. As a result, the SNN checkpoints 
> successfully but fails to push the image to the primary NN, because its version 
> is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode 
> (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  Image uploading failed, status: 403, url: 
> http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., 
> message: This namenode has storage info -60:221856466:1444080250181:clusterX 
> but the secondary expected -63:221856466:1444080250181:clusterX
> {code} 






[jira] [Commented] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812624#comment-15812624
 ] 

Hadoop QA commented on HDFS-11209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m 27s{color} | 
{color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 149 unchanged - 0 fixed = 151 total (was 149) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 29s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846388/HDFS-11209.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 4eb08e4769c8 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 91bf504 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18110/artifact/patchprocess/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18110/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| cc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18110/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18110/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| checkstyle | 
https://builds.apache.org/

[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812618#comment-15812618
 ] 

Hadoop QA commented on HDFS-7967:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HDFS-7967 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-7967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846394/HDFS-7967.branch-2-1.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18111/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, 
> HDFS-7967.branch-2-1.patch, HDFS-7967.branch-2.8-1.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.
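
A minimal sketch of the round-robin idea described above, assuming simplified Storage and block types rather than the real DatanodeStorageInfo structures:

```java
import java.util.*;

// Hypothetical sketch: instead of skipping a random number of blocks through a
// merged iterator, take blocks round-robin from each storage's own list,
// visiting storages in order of remaining capacity.
public class RoundRobinPicker {
    static final class Storage {
        final String id;
        final long remaining;          // bytes of remaining capacity
        final Deque<String> blocks;    // blocks hosted on this storage
        Storage(String id, long remaining, List<String> blocks) {
            this.id = id;
            this.remaining = remaining;
            this.blocks = new ArrayDeque<>(blocks);
        }
    }

    // Pick up to 'count' candidate blocks, cycling over the storages sorted by
    // remaining capacity (least remaining first), one block per storage per round.
    static List<String> pick(List<Storage> storages, int count) {
        List<Storage> order = new ArrayList<>(storages);
        order.sort(Comparator.comparingLong(s -> s.remaining)); // least remaining first
        List<String> out = new ArrayList<>();
        while (out.size() < count) {
            boolean any = false;
            for (Storage s : order) {
                if (out.size() >= count) break;
                String b = s.blocks.poll();  // O(1), no costly skipping
                if (b != null) { out.add(b); any = true; }
            }
            if (!any) break; // all storages exhausted
        }
        return out;
    }

    public static void main(String[] args) {
        List<Storage> ss = Arrays.asList(
            new Storage("s1", 10L, Arrays.asList("a1", "a2")),
            new Storage("s2", 99L, Arrays.asList("b1", "b2")));
        System.out.println(pick(ss, 3)); // s1 is visited first (least remaining)
    }
}
```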






[jira] [Commented] (HDFS-11301) Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams

2017-01-09 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812602#comment-15812602
 ] 

Hanisha Koneru commented on HDFS-11301:
---

Thank you [~arpitagarwal].

> Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams
> -
>
> Key: HDFS-11301
> URL: https://issues.apache.org/jira/browse/HDFS-11301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11301.000.patch
>
>
> In LocalReplicaInPipeline#createStreams, there is a WrappedFileOutputStream 
> created over a WrappedRandomAccessFile. This double layer of instrumentation 
> is unnecessary. 
> {quote}
>   blockOut = fileIoProvider.getFileOutputStream(getVolume(),
>   fileIoProvider.getRandomAccessFile(getVolume(), blockFile, "rw")
>   .getFD());
> {quote}
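
The single-wrapping fix can be illustrated with plain java.io types standing in for the FileIoProvider wrappers; this is a sketch of the idea, not the actual patch:

```java
import java.io.*;

// Sketch of the fix using plain java.io types in place of the FileIoProvider
// wrappers: open the RandomAccessFile once and build the output stream directly
// from its descriptor, so only one instrumentation layer is applied.
public class SingleWrapDemo {
    static OutputStream openBlockStream(File blockFile) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
        // One wrapper over the shared file descriptor; no second wrapped
        // layer underneath it. Closing the stream closes the descriptor.
        return new FileOutputStream(raf.getFD());
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("block", ".dat");
        f.deleteOnExit();
        try (OutputStream out = openBlockStream(f)) {
            out.write("data".getBytes("UTF-8"));
        }
        System.out.println(f.length()); // 4
    }
}
```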






[jira] [Commented] (HDFS-11163) Mover should move the file blocks to default storage once policy is unset

2017-01-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812601#comment-15812601
 ] 

Chris Nauroth commented on HDFS-11163:
--

[~surendrasingh], thank you for the patch.  This looks correct to me.  One 
thing I'm unsure about is the potential impact on performance of Mover.  It 
will require an additional {{getStoragePolicy}} RPC per file with the default 
storage policy, whereas previously there was no RPC for those files.  
Unfortunately, I don't see a way to avoid that, at least not with the current 
APIs, because that's how we resolve inheritance of storage policies from parent 
paths.  I would prefer to get an opinion from [~szetszwo].
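
One possible way to reduce the per-file RPC cost (suggested later in the thread) would be to fetch the default policy once and resolve the unspecified id locally. A hypothetical sketch — PolicyResolver and its methods are illustrative, not the actual client API:

```java
import java.util.*;

// Hypothetical sketch of the caching idea: fetch the default policy once and
// resolve id 0 locally, instead of issuing a getStoragePolicy RPC per file.
public class PolicyResolver {
    static final byte UNSPECIFIED = 0;
    private final Map<Byte, String> policies;   // id -> policy name (from NN)
    private final String cachedDefault;         // fetched once, e.g. "HOT"
    private int rpcCount = 0;                   // counts simulated NN lookups

    PolicyResolver(Map<Byte, String> policies, String defaultPolicy) {
        this.policies = policies;
        this.cachedDefault = defaultPolicy;     // one RPC up front, not per file
    }

    String resolve(byte idFromFileStatus) {
        if (idFromFileStatus == UNSPECIFIED) {
            return cachedDefault;               // no extra RPC for unset files
        }
        rpcCount++;                             // would hit the NN (or a cache)
        return policies.get(idFromFileStatus);
    }

    int rpcCount() { return rpcCount; }

    public static void main(String[] args) {
        PolicyResolver r = new PolicyResolver(
            new HashMap<>(Collections.singletonMap((byte) 7, "COLD")), "HOT");
        System.out.println(r.resolve((byte) 0)); // HOT
        System.out.println(r.resolve((byte) 7)); // COLD
        System.out.println(r.rpcCount());        // 1
    }
}
```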

> Mover should move the file blocks to default storage once policy is unset
> -
>
> Key: HDFS-11163
> URL: https://issues.apache.org/jira/browse/HDFS-11163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-11163-001.patch, HDFS-11163-002.patch
>
>
> HDFS-9534 added a new API in FileSystem to unset the storage policy. Once 
> the policy is unset, blocks should move back to the default storage policy.
> Currently the Mover does not move file blocks that have a zero storage policy ID
> {code}
>   // currently we ignore files with unspecified storage policy
>   if (policyId == HdfsConstants.BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
> return;
>   }
> {code}






[jira] [Updated] (HDFS-7967) Reduce the performance impact of the balancer

2017-01-09 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7967:
--
Attachment: HDFS-7967.branch-2-1.patch
HDFS-7967.branch-2.8-1.patch

A simple change from the JDK predicate to the Guava predicate.

> Reduce the performance impact of the balancer
> -
>
> Key: HDFS-7967
> URL: https://issues.apache.org/jira/browse/HDFS-7967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, 
> HDFS-7967.branch-2-1.patch, HDFS-7967.branch-2.8-1.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The 
> block lookup is extremely inefficient.  An iterator of the node's blocks is 
> created from the iterators of its storages' blocks.  A random number is 
> chosen corresponding to how many blocks will be skipped via the iterator.  
> Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring 
> imbalances within the node's storages.  A more efficient and intelligent 
> design may eliminate the costly skipping of blocks via round-robin selection 
> of blocks from the storages based on remaining capacity.






[jira] [Commented] (HDFS-11303) Hedged read might hang infinitely if read data from all DN failed

2017-01-09 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812572#comment-15812572
 ] 

stack commented on HDFS-11303:
--

Patch LGTM. Your patch allows that the primary read might still complete before 
the new hedged reads, whereas what was there previously would discard anything 
that came in after the timeout. Good. The test is just to verify we time out? W/o 
your fix, the test hangs?

> Hedged read might hang infinitely if read data from all DN failed 
> --
>
> Key: HDFS-11303
> URL: https://issues.apache.org/jira/browse/HDFS-11303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0-alpha1
>Reporter: Chen Zhang
> Attachments: HDFS-11303-001.patch
>
>
> A hedged read reads from one DN first; on timeout, it then reads from the 
> other DNs simultaneously.
> If the reads from all DNs fail, this bug leaves the future list non-empty (the 
> first timed-out request is left in the list), and the loop hangs infinitely.
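
The failure mode and the shape of a fix can be sketched as follows; this is a simplified model of hedged reads, not the DFSInputStream code itself:

```java
import java.util.*;
import java.util.concurrent.*;

// Hypothetical sketch: track every submitted hedged read in the future list and
// remove each one as it completes, so that when all reads fail the list drains
// and the loop raises an error instead of spinning forever.
public class HedgedReadSketch {
    static String readHedged(ExecutorService pool, List<Callable<String>> reads)
            throws InterruptedException {
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        List<Future<String>> futures = new ArrayList<>();
        for (Callable<String> r : reads) {
            futures.add(cs.submit(r));
        }
        while (!futures.isEmpty()) {            // drains even if every read fails
            Future<String> done = cs.take();    // blocks for the next completion
            futures.remove(done);               // crucial: drop finished futures
            try {
                return done.get();              // first successful read wins
            } catch (ExecutionException ignore) {
                // failed hedge; keep waiting on the remaining futures
            }
        }
        throw new RuntimeException("all hedged reads failed");
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            readHedged(pool, Arrays.asList(
                () -> { throw new java.io.IOException("dn1 down"); },
                () -> { throw new java.io.IOException("dn2 down"); }));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // all hedged reads failed
        } finally {
            pool.shutdown();
        }
    }
}
```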






[jira] [Updated] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-11209:
--
Attachment: HDFS-11209.00.patch

Attached a patch to check the NN rolling upgrade status before updating the 
VERSION file on the SNN and Backup NN. The original code, which checks the SNN 
namesystem's rollingUpgrade state, won't work because the SNN never starts with 
the RollingUpgrade option. The Backup NN should have a similar issue. 

Will add a unit test later.
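
A sketch of the guard described above; NameNodeStatus and versionToRecord are illustrative names, not the actual patch:

```java
// Hypothetical sketch: the checkpointer asks the active NN for its
// rolling-upgrade status instead of consulting its own namesystem (which never
// starts with the RollingUpgrade option), and keeps the pre-upgrade version in
// the checkpoint's VERSION file while an upgrade is still in progress.
public class CheckpointVersionGuard {
    interface NameNodeStatus {
        boolean isRollingUpgradeInProgress();  // as reported by the active NN
    }

    // Returns the layout version to record in the checkpoint's VERSION file.
    static int versionToRecord(NameNodeStatus activeNn,
                               int oldLayoutVersion, int newLayoutVersion) {
        if (activeNn.isRollingUpgradeInProgress()) {
            // Upgrade not finalized: keep the pre-upgrade version so the
            // uploaded image matches the primary NN's storage info.
            return oldLayoutVersion;
        }
        return newLayoutVersion;
    }

    public static void main(String[] args) {
        // -60/-63 echo the storage-info mismatch in the error message above.
        System.out.println(versionToRecord(() -> true, -60, -63));  // -60
        System.out.println(versionToRecord(() -> false, -60, -63)); // -63
    }
}
```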

> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11209.00.patch
>
>
> A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432 
> brings it back. 
> With HDFS-8432, the primary NN will not update the VERSION file to the new 
> version after running with the "rollingUpgrade" option until the upgrade is 
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file 
> when the rolling upgrade is not finalized yet. As a result, the SNN checkpoints 
> successfully but fails to push the image to the primary NN, because its version 
> is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode 
> (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  Image uploading failed, status: 403, url: 
> http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., 
> message: This namenode has storage info -60:221856466:1444080250181:clusterX 
> but the secondary expected -63:221856466:1444080250181:clusterX
> {code} 






[jira] [Updated] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized

2017-01-09 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-11209:
--
Status: Patch Available  (was: Open)

> SNN can't checkpoint when rolling upgrade is not finalized
> --
>
> Key: HDFS-11209
> URL: https://issues.apache.org/jira/browse/HDFS-11209
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1, 2.8.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11209.00.patch
>
>
> A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432 
> brings it back. 
> With HDFS-8432, the primary NN will not update the VERSION file to the new 
> version after running with the "rollingUpgrade" option until the upgrade is 
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file 
> when the rolling upgrade is not finalized yet. As a result, the SNN checkpoints 
> successfully but fails to push the image to the primary NN, because its version 
> is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode 
> (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  Image uploading failed, status: 403, url: 
> http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., 
> message: This namenode has storage info -60:221856466:1444080250181:clusterX 
> but the secondary expected -63:221856466:1444080250181:clusterX
> {code} 






[jira] [Commented] (HDFS-10733) NameNode terminated after full GC thinking QJM is unresponsive.

2017-01-09 Thread Vinitha Reddy Gankidi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812538#comment-15812538
 ] 

Vinitha Reddy Gankidi commented on HDFS-10733:
--

[~kihwal] Thanks for the great suggestion. 

I have attached a patch that extends the end time/timeout when there is a long 
pause, such as a full GC, in the NN. The included unit test asserts that a 
timeout exception is thrown, rather than the timeout being extended as in the 
full-GC case, when there really are no responses from the journal nodes. Please 
take a look. 
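
The approach can be sketched as a poll loop that distinguishes a local stall from silent journals; the names and the 4x-poll-interval threshold are illustrative assumptions, not the actual patch:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: poll for quorum in small intervals; if a single
// iteration stalls far longer than the poll interval (e.g. a full GC), extend
// the deadline by the pause instead of declaring the journals dead. Truly
// silent journals still produce a TimeoutException.
public class GcAwareWait {
    static void waitForQuorum(BooleanSupplier quorumReached, long timeoutMs,
                              long pollMs)
            throws InterruptedException, TimeoutException {
        long end = System.nanoTime() / 1_000_000 + timeoutMs;
        long lastWake = System.nanoTime() / 1_000_000;
        while (!quorumReached.getAsBoolean()) {
            long now = System.nanoTime() / 1_000_000;
            long pause = now - lastWake;
            if (pause > pollMs * 4) {
                end += pause;  // we were stalled, not the journals: push deadline out
            }
            if (now >= end) {
                throw new TimeoutException("no quorum within " + timeoutMs + " ms");
            }
            Thread.sleep(pollMs);
            lastWake = now;
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            waitForQuorum(() -> false, 50, 10);  // journals truly silent
        } catch (TimeoutException e) {
            System.out.println("timed out as expected");
        }
    }
}
```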

> NameNode terminated after full GC thinking QJM is unresponsive.
> ---
>
> Key: HDFS-10733
> URL: https://issues.apache.org/jira/browse/HDFS-10733
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, qjm
>Affects Versions: 2.6.4
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
> Attachments: HDFS-10733.001.patch
>
>
> NameNode went into full GC while in {{AsyncLoggerSet.waitForWriteQuorum()}}. 
> After completing GC, it checks whether the timeout for the quorum is reached. 
> If the GC was long enough, the timeout can expire, and {{QuorumCall.waitFor()}} 
> will throw {{TimeoutException}}. Finally, {{FSEditLog.logSync()}} catches the 
> exception and terminates the NameNode.






[jira] [Updated] (HDFS-10733) NameNode terminated after full GC thinking QJM is unresponsive.

2017-01-09 Thread Vinitha Reddy Gankidi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinitha Reddy Gankidi updated HDFS-10733:
-
Attachment: HDFS-10733.001.patch

> NameNode terminated after full GC thinking QJM is unresponsive.
> ---
>
> Key: HDFS-10733
> URL: https://issues.apache.org/jira/browse/HDFS-10733
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, qjm
>Affects Versions: 2.6.4
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
> Attachments: HDFS-10733.001.patch
>
>
> NameNode went into full GC while in {{AsyncLoggerSet.waitForWriteQuorum()}}. 
> After completing GC, it checks whether the timeout for the quorum is reached. 
> If the GC was long enough, the timeout can expire, and {{QuorumCall.waitFor()}} 
> will throw {{TimeoutException}}. Finally, {{FSEditLog.logSync()}} catches the 
> exception and terminates the NameNode.






[jira] [Commented] (HDFS-11301) Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams

2017-01-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812524#comment-15812524
 ] 

Hudson commented on HDFS-11301:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11091 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11091/])
HDFS-11301. Double wrapping over RandomAccessFile in (arp: rev 
91bf504440967ccdff1cb1cbe7801a5ce2ba88ab)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/LocalReplicaInPipeline.java


> Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams
> -
>
> Key: HDFS-11301
> URL: https://issues.apache.org/jira/browse/HDFS-11301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11301.000.patch
>
>
> In LocalReplicaInPipeline#createStreams, there is a WrappedFileOutputStream 
> created over a WrappedRandomAccessFile. This double layer of instrumentation 
> is unnecessary. 
> {code}
>   blockOut = fileIoProvider.getFileOutputStream(getVolume(),
>   fileIoProvider.getRandomAccessFile(getVolume(), blockFile, "rw")
>   .getFD());
> {code}
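
The redundancy is that both the RandomAccessFile and the FileOutputStream built over its descriptor each get their own instrumentation wrapper, so the same IO is accounted twice. A hedged sketch of the single-wrapping shape (the {{instrument}} helper is an illustrative stand-in for a FileIoProvider-style wrapper, not the committed diff):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

/**
 * Sketch: instrument only the outermost stream; the RandomAccessFile
 * is used solely to obtain a writable FileDescriptor.
 */
public class SingleWrapExample {
  /** Illustrative stand-in for provider instrumentation. */
  static OutputStream instrument(OutputStream out) {
    return new FilterOutputStream(out);  // real code would time/count IO here
  }

  public static OutputStream openBlockStream(File blockFile) throws IOException {
    // Plain (unwrapped) RandomAccessFile: no double accounting.
    RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
    // Instrument exactly once, at the stream the writer actually uses.
    return instrument(new FileOutputStream(raf.getFD()));
  }
}
```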



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11299) Support multiple Datanode File IO hooks

2017-01-09 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812501#comment-15812501
 ] 

Arpit Agarwal commented on HDFS-11299:
--

Thank you for the updated patch [~hanishakoneru]. In addition to Xiaoyu's 
feedback, one typo in the setting name: 
dfs.datanode.enable.fileio.fault.injectio. _injectio --> injection_.

The length field is a pre-existing bug, but it's good to fix it here.

> Support multiple Datanode File IO hooks
> ---
>
> Key: HDFS-11299
> URL: https://issues.apache.org/jira/browse/HDFS-11299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11299.000.patch, HDFS-11299.001.patch
>
>
> HDFS-10958 introduces instrumentation hooks around DataNode disk IO and 
> HDFS-10959 adds support for profiling hooks to expose latency statistics. 
> Instead of choosing only one hook using Config parameters, we want to add two 
> separate hooks - one for profiling and one for fault injection. The fault 
> injection hook will be useful for testing purposes. 
> This jira only introduces support for the fault injection hook; the 
> implementation for that will come later on.
> Also, now Default and Counting FileIOEvents would not be needed as we can 
> control enabling the profiling and fault injection hooks using config 
> parameters.
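
The two-hook design described above amounts to fanning each IO event out to independent listeners. A minimal sketch of that composite shape (the {{FileIoHook}} interface and method name are illustrative, not the Hadoop API):

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch: fan one DataNode IO event out to several independent hooks,
 * e.g. a profiling hook plus a fault-injection hook, each enabled by
 * its own config flag. Names here are illustrative.
 */
public class CompositeHookExample {
  interface FileIoHook {
    void onIoCompleted(String op, long durationNanos);
  }

  static class CompositeHook implements FileIoHook {
    private final List<FileIoHook> hooks;

    CompositeHook(FileIoHook... hooks) {
      this.hooks = Arrays.asList(hooks);
    }

    @Override
    public void onIoCompleted(String op, long durationNanos) {
      for (FileIoHook h : hooks) {  // each hook sees every event
        h.onIoCompleted(op, durationNanos);
      }
    }
  }
}
```

Because the composite delegates unconditionally, enabling or disabling either hook is just a matter of which hooks the composite is constructed with.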



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11301) Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams

2017-01-09 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-11301:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   Status: Resolved  (was: Patch Available)

Committed to trunk. The test failures look unrelated.

Thanks for the contribution [~hanishakoneru].

> Double wrapping over RandomAccessFile in LocalReplicaInPipeline#createStreams
> -
>
> Key: HDFS-11301
> URL: https://issues.apache.org/jira/browse/HDFS-11301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11301.000.patch
>
>
> In LocalReplicaInPipeline#createStreams, there is a WrappedFileOutputStream 
> created over a WrappedRandomAccessFile. This double layer of instrumentation 
> is unnecessary. 
> {code}
>   blockOut = fileIoProvider.getFileOutputStream(getVolume(),
>   fileIoProvider.getRandomAccessFile(getVolume(), blockFile, "rw")
>   .getFD());
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11293) [SPS]: Local DN should be given preference as source node, when target available in same node

2017-01-09 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812385#comment-15812385
 ] 

Rakesh R commented on HDFS-11293:
-

+1 on the latest patch. Pending Jenkins.

> [SPS]: Local DN should be given preference as source node, when target 
> available in same node
> -
>
> Key: HDFS-11293
> URL: https://issues.apache.org/jira/browse/HDFS-11293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Yuanbo Liu
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Attachments: HDFS-11293-HDFS-10285-00.patch, 
> HDFS-11293-HDFS-10285-01.patch, HDFS-11293-HDFS-10285-02.patch
>
>
> In {{FsDatasetImpl#createTemporary}}, we use {{volumeMap}} to get replica 
> info by block pool id. But in this situation:
> {code}
> datanode A => {DISK, SSD}, datanode B => {DISK, ARCHIVE}.
> 1. the same block replica exists in A[DISK] and B[DISK].
> 2. the block pool id of datanode A and datanode B are the same.
> {code}
> Then we start to change the file's storage policy and move the block replica 
> within the cluster. Very likely we have to move the block from B[DISK] to 
> A[SSD]; at this time, datanode A throws ReplicaAlreadyExistsException.
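
The preference the title asks for can be sketched as a source-selection rule: among the nodes holding a replica, pick one that also has the target storage type, so the move stays node-local and never collides with an existing replica on another node. This is only an illustration of the idea (all names are hypothetical, not the SPS code):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/**
 * Sketch: when a block must move to a new storage type, prefer a
 * source datanode that already holds the replica AND has the target
 * storage, making the move node-local. Illustrative names only.
 */
public class SourceChooserExample {
  /**
   * replicasByNode: node -> storage types currently holding the replica.
   * storagesByNode: node -> all storage types available on that node.
   */
  public static String chooseSource(Map<String, Set<String>> replicasByNode,
                                    Map<String, Set<String>> storagesByNode,
                                    String targetStorage) {
    String fallback = null;
    for (String node : replicasByNode.keySet()) {
      if (storagesByNode.getOrDefault(node, Set.of()).contains(targetStorage)) {
        return node;  // node-local move avoids ReplicaAlreadyExistsException
      }
      if (fallback == null) {
        fallback = node;
      }
    }
    return fallback;  // otherwise any replica holder will do
  }
}
```

In the scenario above (A => {DISK, SSD}, B => {DISK, ARCHIVE}, replica on both DISKs, target SSD), this rule selects A as the source, so the SSD copy is created on the node that already holds the replica.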



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


