[jira] [Commented] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756475#comment-15756475
 ] 

Hadoop QA commented on HDFS-10959:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 97 unchanged - 0 fixed = 99 total (was 97) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 52s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}127m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10959 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12843700/HDFS-10959.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8c1a921fc9a8 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fcbe152 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17883/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17883/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17883/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17883/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Adding per disk IO 

[jira] [Commented] (HDFS-10860) Switch HttpFS from Tomcat to Jetty

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756442#comment-15756442
 ] 

Hadoop QA commented on HDFS-10860:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
0s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-assemblies {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
27s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  9m 19s{color} 
| {color:red} root generated 4 new + 690 unchanged - 0 fixed = 694 total (was 
690) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} root: The patch generated 0 new + 97 unchanged - 4 
fixed = 97 total (was 101) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
14s{color} | {color:green} The patch generated 0 new + 566 unchanged - 10 fixed 
= 566 total (was 576) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-assemblies {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
15s{color} | {color:green} hadoop-assemblies in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
32s{color} | {color:green} hadoop-auth in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
20s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 63m 
32s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
2s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The 

[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756391#comment-15756391
 ] 

Hadoop QA commented on HDFS-9391:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 276 unchanged - 1 fixed = 276 total (was 277) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}101m 
21s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-9391 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12843698/HDFS-9391.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c552f1d8f8d4 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fcbe152 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17881/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17881/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756345#comment-15756345
 ] 

Ming Ma commented on HDFS-9391:
---

Thanks [~manojg]. Some minor questions:

* {{.put("inMaintenance", node.isInMaintenance())}} might not be necessary 
given it also outputs {{.put("adminState", node.getAdminState().toString())}}.
* Should {{liveDecommissioningReplicas}} be {{OnlyDecommissioningReplicas}}, 
which was the old behavior before maintenance? There are two differences: one is 
"Only", the other is "live".

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756346#comment-15756346
 ] 

Ming Ma commented on HDFS-9391:
---

Thanks [~manojg]. Some minor questions:

* {{.put("inMaintenance", node.isInMaintenance())}} might not be necessary 
given it also outputs {{.put("adminState", node.getAdminState().toString())}}.
* Should {{liveDecommissioningReplicas}} be {{OnlyDecommissioningReplicas}}, 
which was the old behavior before maintenance? There are two differences: one is 
"Only", the other is "live".

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9391:
--
Comment: was deleted

(was: Thanks [~manojg]. Some minor questions:

* {{.put("inMaintenance", node.isInMaintenance())}} might not be necessary 
given it also outputs {{.put("adminState", node.getAdminState().toString())}}.
* Should {{liveDecommissioningReplicas}} be {{OnlyDecommissioningReplicas}}, 
which was the old behavior before maintenance? There are two differences: one is 
"Only", the other is "live".)

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10959:
--
Attachment: HDFS-10959.01.patch

Thanks [~arpitagarwal] for the review. Updated the patch to address the feedback.

bq. Why do we have volumeName.replace(':', '-') in 
DataNodeVolumeMetrics#create? Is it because the : character is not accepted as 
part of the metric name?

Yes, {{:}} is not allowed in metric names. The transformation is just to be safe. 
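
A hedged illustration of that sanitization (a standalone sketch with assumed 
names, not the patch code):

{code}
// Sketch only: derive a metrics record name from a volume path,
// replacing ':' since it is not accepted in metric names.
public class VolumeMetricNameSketch {
  static String metricName(String volumeName) {
    return "DataNodeVolume-" + volumeName.replace(':', '-');
  }

  public static void main(String[] args) {
    // e.g. a Windows-style volume path containing ':'
    System.out.println(metricName("C:/hadoop/hdfs/data"));
  }
}
{code}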

I'm also looking into one of the test failures which seems related. 

> Adding per disk IO statistics in DataNode.
> --
>
> Key: HDFS-10959
> URL: https://issues.apache.org/jira/browse/HDFS-10959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-10959.00.patch, HDFS-10959.01.patch
>
>
> This ticket is opened to support per disk IO statistics in DataNode based on 
> HDFS-10930. The statistics added will make it easier to implement HDFS-4169 
> "Add per-disk latency metrics to DataNode".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org




[jira] [Comment Edited] (HDFS-11222) Document application/octet-stream as required content type for data upload requests in HTTPFS

2016-12-16 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756199#comment-15756199
 ] 

Yiqun Lin edited comment on HDFS-11222 at 12/17/16 4:13 AM:


Thanks [~jojochuang] for the review and comments.
{quote}
The content type check seems only performed when I added extra parameters 
data=true ?
{quote}
I think this is not absolutely right. I debugged the test locally and found 
something interesting. In httpfs, if we don't add the parameter {{data=true}}, the 
data stream of the uploaded file and the relevant attribute parameters 
(replication, blocksize, etc.) will be ignored. Instead, it will create a 
new url in {{HttpFSServer#createUploadRedirectionURL}} with the {{data=true}} 
parameter added, then redirect to the new url. Since {{data}} was added in the 
new url, it will do the content type check as well. I just tested this in the 
unit tests; correct me if I am wrong, thanks. 
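
For readers following along, a hedged sketch of that two-step flow using only 
JDK classes (host, port, and path are assumptions; this is not test code from 
the patch):

{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch only: step 1 sends CREATE without data and captures the redirect,
// whose Location already carries data=true; step 2 uploads the bytes there,
// which is where the content-type check applies.
public class HttpFsCreateSketch {
  public static void main(String[] args) throws Exception {
    URL u = new URL("http://localhost:14000/webhdfs/v1/tmp/f?op=CREATE&user.name=hdfs");
    HttpURLConnection c = (HttpURLConnection) u.openConnection();
    c.setRequestMethod("PUT");
    c.setInstanceFollowRedirects(false);
    String redirect = c.getHeaderField("Location"); // ...&data=true appended
    c.disconnect();

    HttpURLConnection d = (HttpURLConnection) new URL(redirect).openConnection();
    d.setRequestMethod("PUT");
    d.setRequestProperty("Content-Type", "application/octet-stream");
    d.setDoOutput(true);
    try (OutputStream os = d.getOutputStream()) {
      os.write("hello".getBytes("UTF-8"));
    }
    System.out.println(d.getResponseCode()); // expect 201 Created
  }
}
{code}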


was (Author: linyiqun):
Thanks [~jojochuang] for the review and comments.
{quote}
The content type check seems only performed when I added extra parameters 
data=true ?
{quote}
I think this is not absolutely right. I debugged the test locally and found 
something interesting. In httpfs, if we don't add the parameter {{data=true}}, the 
data stream of the uploaded file and the relevant attribute parameters 
(replication, blocksize, etc.) will be ignored. Instead, it will create a 
new url in {{HttpFSServer#createUploadRedirectionURL}} with the added 
{{data=true}} parameter, then redirect to the new url. Since {{data}} was added 
in the new url, it will do the content type check as well. I just tested this 
in the unit tests; correct me if I am wrong, thanks. 

> Document application/octet-stream as required content type for data upload 
> requests in HTTPFS
> -
>
> Key: HDFS-11222
> URL: https://issues.apache.org/jira/browse/HDFS-11222
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, httpfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Sailesh Patel
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11222.001.patch, HDFS-11222.002.patch
>
>
> Documentation at [1] should indicate that the PUT and POST requests require a 
> header, like ( --header ):
> curl -i -X PUT -T <FILE> 
> "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..." --header 
> "content-type: application/octet-stream"
> [1]  
> https://hadoop.apache.org/docs/stable2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11222) Document application/octet-stream as required content type for data upload requests in HTTPFS

2016-12-16 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756199#comment-15756199
 ] 

Yiqun Lin edited comment on HDFS-11222 at 12/17/16 4:12 AM:


Thanks [~jojochuang] for the review and comments.
{quote}
The content type check seems only performed when I added extra parameters 
data=true ?
{quote}
I think this is not absolutely right. I debugged the test locally and found 
something interesting. In httpfs, if we don't add the parameter {{data=true}}, the 
data stream of the uploaded file and the relevant attribute parameters 
(replication, blocksize, etc.) will be ignored. Instead, it will create a 
new url in {{HttpFSServer#createUploadRedirectionURL}} with the added 
{{data=true}} parameter, then redirect to the new url. Since {{data}} was added 
in the new url, it will do the content type check as well. I just tested this 
in the unit tests; correct me if I am wrong, thanks. 


was (Author: linyiqun):
Thanks [~jojochuang] for the review and comments.
{quote}
The content type check seems only performed when I added extra parameters 
data=true ?
{quote}
I can absolutely confirm this is right. I debugged the test locally and found 
something interesting. In httpfs, if we don't add the parameter {{data=true}}, the 
data stream of the uploaded file and the relevant attribute parameters 
(replication, blocksize, etc.) will be ignored. Instead, it will create a 
new url in {{HttpFSServer#createUploadRedirectionURL}} with the {{data=true}} 
parameter added. So the content type check only happens in data upload 
requests that send file data. 

> Document application/octet-stream as required content type for data upload 
> requests in HTTPFS
> -
>
> Key: HDFS-11222
> URL: https://issues.apache.org/jira/browse/HDFS-11222
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, httpfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Sailesh Patel
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11222.001.patch, HDFS-11222.002.patch
>
>
> Documentation at [1] should indicate that the PUT and POST requests require a 
> header, like ( --header ):
> curl -i -X PUT -T <FILE> 
> "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..." --header 
> "content-type: application/octet-stream"
> [1]  
> https://hadoop.apache.org/docs/stable2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10860) Switch HttpFS from Tomcat to Jetty

2016-12-16 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10860:
--
Status: Patch Available  (was: In Progress)

> Switch HttpFS from Tomcat to Jetty
> --
>
> Key: HDFS-10860
> URL: https://issues.apache.org/jira/browse/HDFS-10860
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: httpfs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
> Attachments: HDFS-10860.001.patch, HDFS-10860.002.patch
>
>
> The Tomcat 6 we are using will reach EOL at the end of 2017. While there are 
> other good options, I would propose switching to {{Jetty 9}} for the 
> following reasons:
> * Easier migration. Both Tomcat and Jetty are based on {{Servlet 
> Containers}}, so we don't have to change client code that much. It would 
> require more work to switch to {{JAX-RS}}.
> * Well established.
> * Good performance and scalability.
> Other alternatives:
> * Jersey + Grizzly
> * Tomcat 8
> Your opinions will be greatly appreciated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10860) Switch HttpFS from Tomcat to Jetty

2016-12-16 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10860:
--
Attachment: HDFS-10860.002.patch

Patch 002
- Update doc index.md, ServerSetup.md.vm, and HDFSCommands.md
- Set {{HttpServer2.Builder#authFilterConfigurationPrefix}} to integrate with 
HttpServer2’s secret provider

TESTING DONE
- hdfs dfs -ls webhdfs://localhost:14000/
- hdfs dfs -ls swebhdfs://localhost:14000/ in SSL mode
- hdfs httpfs, hdfs --daemon start|status|stop httpfs
- httpfs.sh run|start|status|stop
- hadoop daemonlog
- HttpFS unit tests
- dist-test hadoop-common and hadoop-hdfs: 
http://dist-test.cloudera.org/job?job_id=hadoop.jzhuge.1481879379.20252, 10 
unrelated test failures.
- /jmx, /logLevel, /conf, /stack, /logs, and /static/index.html

> Switch HttpFS from Tomcat to Jetty
> --
>
> Key: HDFS-10860
> URL: https://issues.apache.org/jira/browse/HDFS-10860
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: httpfs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
> Attachments: HDFS-10860.001.patch, HDFS-10860.002.patch
>
>
> The Tomcat 6 we are using will reach EOL at the end of 2017. While there are 
> other good options, I would propose switching to {{Jetty 9}} for the 
> following reasons:
> * Easier migration. Both Tomcat and Jetty are based on {{Servlet 
> Containers}}, so we don't have to change client code that much. It would 
> require more work to switch to {{JAX-RS}}.
> * Well established.
> * Good performance and scalability.
> Other alternatives:
> * Jersey + Grizzly
> * Tomcat 8
> Your opinions will be greatly appreciated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11222) Document application/octet-stream as required content type for data upload requests in HTTPFS

2016-12-16 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756199#comment-15756199
 ] 

Yiqun Lin commented on HDFS-11222:
--

Thanks [~jojochuang] for the review and comments.
{quote}
The content type check seems only performed when I added extra parameters 
data=true ?
{quote}
I can absolutely confirm this is right. I debugged the test locally and found 
something interesting. In httpfs, if we don't add the parameter {{data=true}}, the 
data stream of the uploaded file and the relevant attribute parameters 
(replication, blocksize, etc.) will be ignored. Instead, it will create a 
new url in {{HttpFSServer#createUploadRedirectionURL}} with the {{data=true}} 
parameter added. So the content type check only happens in data upload 
requests that send file data. 

> Document application/octet-stream as required content type for data upload 
> requests in HTTPFS
> -
>
> Key: HDFS-11222
> URL: https://issues.apache.org/jira/browse/HDFS-11222
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, httpfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Sailesh Patel
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11222.001.patch, HDFS-11222.002.patch
>
>
> Documentation at [1] should indicate that the PUT and POST requests require a 
> header, like ( --header ):
> curl -i -X PUT -T <FILE> 
> "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..." --header 
> "content-type: application/octet-stream"
> [1]  
> https://hadoop.apache.org/docs/stable2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756191#comment-15756191
 ] 

Manoj Govindassamy edited comment on HDFS-9391 at 12/17/16 3:19 AM:


Attaching v01 patch to address the following:
1. Introduced NameNodeMXBean#getEnteringMaintenanceNodes, which is implemented 
in FSNamesystem
2. Updated LeavingServiceStatus to include details on Decommissioning, 
Maintenance and OutOfService replicas
3. DecommissionManager to update LeavingServiceStatus
4. Updated dfshealth.html to have details on Maintenance nodes in Summary and 
DataNode Information pages. (Entering Maintenance, Live Maintenance, Dead 
Maintenance)
5. Unit test for the Maintenance mode JMX.

[~eddyxu], [~mingma], could you please take a look at the patch and comment on 
what can be improved?
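
For reference, a hedged sketch of the item-1 surface (the signature is assumed 
from the naming of existing NameNodeMXBean getters, not copied from the patch):

{code}
// Sketch only: MXBean getters conventionally return JSON-encoded maps
// keyed by datanode hostname.
public interface NameNodeMXBeanSketch {
  /** Nodes currently entering maintenance, as a JSON map. */
  String getEnteringMaintenanceNodes();
}
{code}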


was (Author: manojg):
Attaching v01 patch to address the following:
1. Introduced NameNodeMXBean#getEnteringMaintenanceNodes, which is implemented 
in FSNamesystem
2. Updated LeavingServiceStatus to include details on Decommissioning, 
Maintenance and OutOfService replicas
3. DecommissionManager to update LeavingServiceStatus
4. Updated dfshealth.html to have details on Maintenance nodes in Summary and 
DataNode Information pages. (Entering Maintenance, Live Maintenance, Dead 
Maintenance)
5. Unit test for the Maintenance mode JMX.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Affects Version/s: 3.0.0-alpha1
 Target Version/s: 3.0.0-alpha2
   Status: Patch Available  (was: Open)

Attaching v01 patch to address the following:
1. Introduced NameNodeMXBean#getEnteringMaintenanceNodes, which is implemented 
in FSNamesystem
2. Updated LeavingServiceStatus to include details on Decommissioning, 
Maintenance and OutOfService replicas
3. DecommissionManager to update LeavingServiceStatus
4. Updated dfshealth.html to have details on Maintenance nodes in Summary and 
DataNode Information pages. (Entering Maintenance, Live Maintenance, Dead 
Maintenance)
5. Unit test for the Maintenance mode JMX.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Attachment: HDFS-9391.01.patch

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, HDFS-9391.01.patch, 
> Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11259) Update fsck to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-11259:
-

 Summary: Update fsck to display maintenance state info
 Key: HDFS-11259
 URL: https://issues.apache.org/jira/browse/HDFS-11259
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Attachment: HDFS-9391-MaintenanceMode-WebUI.pdf

Attaching the WebUI proposal for Maintenance Mode details.

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: HDFS-9391-MaintenanceMode-WebUI.pdf, Maintenance 
> webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9391) Update webUI/JMX to display maintenance state info

2016-12-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-9391:
-
Summary: Update webUI/JMX to display maintenance state info  (was: Update 
webUI/JMX/fsck to display maintenance state info)

> Update webUI/JMX to display maintenance state info
> --
>
> Key: HDFS-9391
> URL: https://issues.apache.org/jira/browse/HDFS-9391
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Manoj Govindassamy
> Attachments: Maintenance webUI.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756151#comment-15756151
 ] 

Yiqun Lin commented on HDFS-9911:
-

Thanks [~anu] for the commit and thanks everyone!

> TestDataNodeLifeline  Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In HDFS-1312 branch, we have a failure for this test.
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756117#comment-15756117
 ] 

Arpit Agarwal commented on HDFS-10959:
--

Thanks Xiaoyu. The patch looks good. 

Minor comments:
# FsVolumeImpl#metrics should be final.
# We can replace MetadataFileIo with MetadataOperation everywhere, since one 
call may result in multiple IOs.
# Why do we have {{volumeName.replace(':', '-')}} in 
{{DataNodeVolumeMetrics#create}}? Is it because the {{:}} character is not 
accepted as part of the metric name?
# beforeFileIo and afterFileIo should use sampling, as sketched below. Updating 
global state for every packet looks inefficient. But we can add the sampling 
later as the profiling is off by default.
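
A hedged sketch of that sampling idea (the hook names come from the discussion 
above; everything else is an assumption, not the patch code):

{code}
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: time a small fraction of file IO operations so the hot
// path usually skips the shared-state update.
public class SampledFileIoHookSketch {
  private final double sampleFraction;           // e.g. 0.01 samples 1%
  private final AtomicLong totalLatencyNanos = new AtomicLong();
  private final AtomicLong sampleCount = new AtomicLong();

  public SampledFileIoHookSketch(double sampleFraction) {
    this.sampleFraction = sampleFraction;
  }

  /** Returns a start timestamp, or -1 when this operation is not sampled. */
  public long beforeFileIo() {
    return ThreadLocalRandom.current().nextDouble() < sampleFraction
        ? System.nanoTime() : -1L;
  }

  /** Updates shared counters only for sampled operations. */
  public void afterFileIo(long begin) {
    if (begin >= 0) {
      totalLatencyNanos.addAndGet(System.nanoTime() - begin);
      sampleCount.incrementAndGet();
    }
  }
}
{code}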

> Adding per disk IO statistics in DataNode.
> --
>
> Key: HDFS-10959
> URL: https://issues.apache.org/jira/browse/HDFS-10959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-10959.00.patch
>
>
> This ticket is opened to support per disk IO statistics in DataNode based on 
> HDFS-10930. The statistics added will make it easier to implement HDFS-4169 
> "Add per-disk latency metrics to DataNode".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11258) File mtime change could not save to editlog

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756055#comment-15756055
 ] 

Hadoop QA commented on HDFS-11258:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11258 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12843670/hdfs-11258.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1007037699dd 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fcbe152 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17880/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17880/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17880/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> File mtime change could not save to editlog
> ---
>
> Key: HDFS-11258
> URL: https://issues.apache.org/jira/browse/HDFS-11258
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: 

[jira] [Comment Edited] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755526#comment-15755526
 ] 

Xiaoyu Yao edited comment on HDFS-10959 at 12/17/16 12:07 AM:
--

Attached a patch that adds per-volume IO stats and metrics based on 
ProfilingFileIoEvents.java. It is off by default to minimize the impact on 
normal datanode IO. When needed, it can be enabled with the following to 
profile datanode IO performance. 
{code}
<property>
  <name>dfs.datanode.fileio.events.class</name>
  <value>org.apache.hadoop.hdfs.server.datanode.ProfilingFileIoEvents</value>
  <description>Profiling File IO Events on datanode volumes and expose
  metrics.</description>
</property>
{code}


was (Author: xyao):
Attached a patch that adds per-volume IO stats and metrics based on 
ProfilingFileIoEvents.java. It is off by default to minimize the impact on 
normal datanode IO. When needed, it can be enabled with the following to 
profile datanode IO performance. 
{code}
<property>
  <name>dfs.datanode.fileio.events.class</name>  <value>org.apache.hadoop.hdfs.server.datanode.ProfilingFileIoEvents</value>
  <description>Profiling File IO Events on datanode volumes and expose
  metrics.</description>
</property>
{code}

> Adding per disk IO statistics in DataNode.
> --
>
> Key: HDFS-10959
> URL: https://issues.apache.org/jira/browse/HDFS-10959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-10959.00.patch
>
>
> This ticket is opened to support per disk IO statistics in DataNode based on 
> HDFS-10930. The statistics added will make it easier to implement HDFS-4169 
> "Add per-disk latency metrics to DataNode".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11258) File mtime change could not save to editlog

2016-12-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755818#comment-15755818
 ] 

Haohui Mai commented on HDFS-11258:
---

Thanks [~jxiang] for reporting the issue! Can you please add a unit test for 
it? Thanks.

> File mtime change could not save to editlog
> ---
>
> Key: HDFS-11258
> URL: https://issues.apache.org/jira/browse/HDFS-11258
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: hdfs-11258.1.patch
>
>
> When both mtime and atime are changed, and atime is not beyond the precision 
> limit, the mtime change is not saved to edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11258) File mtime change could not save to editlog

2016-12-16 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HDFS-11258:
---
Status: Patch Available  (was: Open)

> File mtime change could not save to editlog
> ---
>
> Key: HDFS-11258
> URL: https://issues.apache.org/jira/browse/HDFS-11258
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: hdfs-11258.1.patch
>
>
> When both mtime and atime are changed, and atime is not beyond the precision 
> limit, the mtime change is not saved to edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11258) File mtime change could not save to editlog

2016-12-16 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HDFS-11258:
---
Attachment: hdfs-11258.1.patch

> File mtime change could not save to editlog
> ---
>
> Key: HDFS-11258
> URL: https://issues.apache.org/jira/browse/HDFS-11258
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: hdfs-11258.1.patch
>
>
> When both mtime and atime are changed, and atime is not beyond the precision 
> limit, the mtime change is not saved to edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11258) File mtime change could not save to editlog

2016-12-16 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HDFS-11258:
--

 Summary: File mtime change could not save to editlog
 Key: HDFS-11258
 URL: https://issues.apache.org/jira/browse/HDFS-11258
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor


When both mtime and atime are changed, and atime is not beyond the precision 
limit, the mtime change is not saved to edit logs.
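
A hedged reduction of the reported control flow (illustrative shape only; names 
and structure are assumptions, not the actual FSDirAttrOp code):

{code}
// Sketch only: the atime-precision early-exit also decides whether the
// whole setTimes call is logged, so an mtime change can be dropped.
public class MtimeEditlogBugSketch {
  static final long PRECISION_MS = 3_600_000L; // dfs.namenode.accesstime.precision

  /** times[0] = mtime, times[1] = atime; returns "write to edit log?". */
  static boolean setTimes(long[] times, long mtime, long atime, boolean force) {
    boolean status = false;
    if (mtime != -1) {
      times[0] = mtime;
      status = true;               // mtime changed, should be logged
    }
    if (atime != -1) {
      if (!force && atime <= times[1] + PRECISION_MS) {
        status = false;            // bug shape: overrides the mtime decision
      } else {
        times[1] = atime;
        status = true;
      }
    }
    return status;
  }

  public static void main(String[] args) {
    long[] times = {0L, 0L};
    // mtime changes, atime within precision: change is not logged.
    System.out.println(setTimes(times, 42L, 1L, false)); // false
  }
}
{code}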



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755768#comment-15755768
 ] 

Hadoop QA commented on HDFS-9901:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HDFS-9901 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12795434/0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17879/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch, 
> 0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hangs on the checkBlock 
> method, which checks the existence and length of a block before spinning off a 
> thread to do the actual transferring. In extreme cases, the heartbeat thread 
> hung for more than 10 minutes, so the namenode marked the datanode as dead and 
> started replicating its blocks, which caused more disk IO on other nodes and 
> could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates a storage report, it reads disk usage information from 
> memory so that it won't get blocked during heavy disk IO. 
> 2. Moves the checks (which require disk accesses) in transferBlock() in 
> DataNode into a separate thread so the heartbeat does not have to wait for 
> them when heartbeating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-11254.

Resolution: Duplicate

I'm going to close this as a duplicate, since HDFS-8865 helps eliminate this issue.

> Standby NameNode may crash during failover if loading edits takes too long
> --
>
> Key: HDFS-11254
> URL: https://issues.apache.org/jira/browse/HDFS-11254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Wei-Chiu Chuang
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.9.0, 3.0.0-beta1
>
>
> We found the Standby NameNode crashed when it tried to transition from standby 
> to active. This issue is similar to HDFS-11225 in nature. 
> The root cause is that all IPC threads were blocked, so the ZKFC connection to 
> the NN timed out. In particular, when it crashed, we saw a few threads blocked 
> on this thread:
> {noformat}
> Thread 188 (IPC Server handler 25 on 8022):
>   State: RUNNABLE
>   Blocked count: 278
>   Waited count: 17419
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:886)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:875)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:860)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:827)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
> java.security.AccessController.doPrivileged(Native Method)
> javax.security.auth.Subject.doAs(Subject.java:415)
> 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477)
> 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458)
> {noformat}
> This thread is part of {{FsImage#loadEdits}}, run when the NameNode failed 
> over. We also found that the following edit log was rejected after the 
> journal node advanced its epoch, which implies a failed transitionToActive 
> request.
> {noformat}
> 10.10.17.1:8485: IPC's epoch 11 is less than the last promised epoch 12
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.startLogSegment(Journal.java:513)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.startLogSegment(JournalNodeRpcServer.java:162)
> at 
> org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.startLogSegment(QJournalProtocolServerSideTranslatorPB.java:198)
> at 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25425)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
> at 
> 

[jira] [Comment Edited] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755754#comment-15755754
 ] 

Wei-Chiu Chuang edited comment on HDFS-9901 at 12/16/16 11:12 PM:
--

FsDatasetImpl#checkBlock does not perform any disk I/O at all; it looks up an 
in-memory structure. I don't understand why there's I/O involved. Please 
correct me if I am wrong.

Also, FsDatasetImpl#checkBlock is called without a lock, which is unusual. 
(This is existing code.)

I think making transferBlock run on an asynchronous thread is fine, though I 
still don't know why that is needed. Making DF async is fine too.


was (Author: jojochuang):
FsDatasetImpl#checkBlock does not perform any disk I/O at all; it looks up an 
in-memory structure. I don't understand why there's I/O involved. Please 
correct me if I am wrong.

Also, FsDatasetImpl#checkBlock is called without a lock, which is unusual. 
(This is existing code.)

I think making transferBlock run on an asynchronous thread is fine, though I 
still don't know why that is needed.

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch, 
> 0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hang on the checkBlock 
> method, which checks the existence and length of a block before spinning off 
> a thread to do the actual transfer. In extreme cases, the heartbeat thread 
> hung for more than 10 minutes, so the namenode marked the datanode as dead 
> and started replicating its blocks, which caused more disk IO on other nodes 
> and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates the storage report, it reads the disk usage information 
> from memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks in transferBlock() in DataNode (which require disk 
> accesses) into a separate thread, so the heartbeat thread does not have to 
> wait for them while heartbeating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755754#comment-15755754
 ] 

Wei-Chiu Chuang commented on HDFS-9901:
---

FsDatasetImpl#checkBlock does not perform any disk I/O at all; it looks up an 
in-memory structure. I don't understand why there's I/O involved. Please 
correct me if I am wrong.

Also, FsDatasetImpl#checkBlock is called without a lock, which is unusual. 
(This is existing code.)

I think making transferBlock run on an asynchronous thread is fine, though I 
still don't know why that is needed.
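
As a hedged sketch of the "asynchronous transferBlock" idea being discussed 
(illustrative names, not the actual DataNode code):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The heartbeat handler hands the possibly disk-touching checks plus the
// actual transfer to a worker thread and returns immediately.
class AsyncTransferSketch {
  private final ExecutorService transferExecutor =
      Executors.newCachedThreadPool();

  void onTransferCommand(String blockId) {
    transferExecutor.submit(() -> {
      if (checkBlock(blockId)) { // existence/length validation
        doTransfer(blockId);     // actual data transfer
      }
    });
    // The heartbeat thread continues here without waiting for the checks.
  }

  private boolean checkBlock(String blockId) { return true; } // placeholder
  private void doTransfer(String blockId) { /* placeholder */ }
}
{code}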

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch, 
> 0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hang on the checkBlock 
> method, which checks the existence and length of a block before spinning off 
> a thread to do the actual transfer. In extreme cases, the heartbeat thread 
> hung for more than 10 minutes, so the namenode marked the datanode as dead 
> and started replicating its blocks, which caused more disk IO on other nodes 
> and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates the storage report, it reads the disk usage information 
> from memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks in transferBlock() in DataNode (which require disk 
> accesses) into a separate thread, so the heartbeat thread does not have to 
> wait for them while heartbeating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755740#comment-15755740
 ] 

Hadoop QA commented on HDFS-10959:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 27 new + 96 unchanged - 0 fixed = 123 total (was 96) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Switch statement found in 
org.apache.hadoop.hdfs.server.datanode.ProfilingFileIoEvents.afterFileIo(FsVolumeSpi,
 FileIoProvider$OPERATION, long, long) where default case is missing  At 
ProfilingFileIoEvents.java:FileIoProvider$OPERATION, long, long) where default 
case is missing  At ProfilingFileIoEvents.java:[lines 70-81] |
|  |  Unused field:DataNodeVolumeMetrics.java |
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeMXBean |
|   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10959 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12843650/HDFS-10959.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8bb11a1b60e9 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f121645 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17878/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| findbugs | 

[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11160:
---
Release Note: Fixed a race condition that caused VolumeScanner to recognize 
a good replica as a bad one if the replica is also being written concurrently.

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch, 
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch, 
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption 
> rate in a relatively small cluster. Initially, we thought HDFS-11056 was to 
> blame. However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended to while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.
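
A minimal, self-contained sketch of the proposed fix, assuming a shared 
dataset lock guards both the append path and the scanner's snapshot (field 
and class names are illustrative, not the actual VolumeScanner code):

{code}
// Take the dataset lock before snapshotting length and checksum together,
// so an in-progress append cannot interleave between the two reads.
class ReplicaScanSketch {
  private final Object datasetLock = new Object(); // stands in for the FsDatasetImpl lock
  private long finalizedLength;                    // bytes whose checksum is stable
  private byte[] checksumOnDisk;                   // hypothetical cached checksum bytes

  // Writer side (append path): updates length and checksum atomically.
  void append(byte[] data, byte[] newChecksum) {
    synchronized (datasetLock) {
      finalizedLength += data.length;
      checksumOnDisk = newChecksum;
    }
  }

  // Scanner side: snapshot both values under the same lock, then verify
  // outside the lock against exactly that snapshot.
  boolean scan() {
    final long lengthSnapshot;
    final byte[] checksumSnapshot;
    synchronized (datasetLock) {
      lengthSnapshot = finalizedLength;
      checksumSnapshot = checksumOnDisk;
    }
    return verify(lengthSnapshot, checksumSnapshot);
  }

  private boolean verify(long length, byte[] checksum) {
    // Placeholder for verifying the first 'length' bytes against 'checksum'.
    return checksum != null && length >= 0;
  }
}
{code}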



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11160:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.7.4
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed the patch to branch-2.7, branch-2.8, branch-2, and trunk.
Many thanks to [~kihwal], [~yzhangal], and [~xiaochen] for multiple rounds of 
reviews!

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch, 
> HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch, 
> HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption 
> rate in a relatively small cluster. Initially, we thought HDFS-11056 was to 
> blame. However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended to while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-12-16 Thread Yishan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755580#comment-15755580
 ] 

Yishan Yang commented on HDFS-10453:


Any update? Is the community willing to accept this fix? Thanks!

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the file's replication to 10;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> If the stuck ReplicationMonitor reappears, the NameNode will print logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and 
> #ReplicationMonitor) process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the 
> references in the blocksmap, neededReplications, etc. The block's NumBytes 
> is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing 
> the whole cluster, because no node can satisfy the goodness criteria (its 
> remaining space would have to reach the required size, Long.MAX_VALUE). 
> During stage (3), ReplicationMonitor is stuck for a long time, especially in 
> a large cluster. invalidateBlocks and neededReplications keep growing with 
> nothing consuming them; at worst, this can lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.
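
A minimal sketch of the proposed guard (field and class names are 
illustrative, not the actual BlockManager code): blocks whose size was set to 
NO_ACK by the delete path are dequeued instead of being handed to 
chooseTargets, where no node could ever be selected.

{code}
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class ReplicationSweepSketch {
  static final long NO_ACK = Long.MAX_VALUE; // mirrors BlockCommand.NO_ACK

  static class Block {
    final long numBytes;
    Block(long numBytes) { this.numBytes = numBytes; }
  }

  final Set<Block> neededReplications = new LinkedHashSet<Block>();

  void computeReplicationWork() {
    for (Iterator<Block> it = neededReplications.iterator(); it.hasNext();) {
      Block b = it.next();
      if (b.numBytes == NO_ACK) {
        it.remove(); // block was deleted concurrently; nothing to replicate
        continue;
      }
      chooseTargets(b); // normal replication path
    }
  }

  void chooseTargets(Block b) { /* placeholder for target selection */ }
}
{code}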



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10959:
--
Status: Patch Available  (was: Open)

> Adding per disk IO statistics in DataNode.
> --
>
> Key: HDFS-10959
> URL: https://issues.apache.org/jira/browse/HDFS-10959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-10959.00.patch
>
>
> This ticket is opened to support per disk IO statistics in DataNode based on 
> HDFS-10930. The statistics added will make it easier to implement HDFS-4169 
> "Add per-disk latency metrics to DataNode".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10959) Adding per disk IO statistics in DataNode.

2016-12-16 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10959:
--
Attachment: HDFS-10959.00.patch

Attaching a patch that adds per-volume IO stats and metrics based on 
ProfilingFileIoEvents.java. It is off by default to minimize the impact on 
normal datanode IO. When needed, it can be enabled with the following 
configuration to profile datanode IO performance:
{code}
<property>
  <name>dfs.datanode.fileio.events.class</name>
  <value>org.apache.hadoop.hdfs.server.datanode.ProfilingFileIoEvents</value>
  <description>Profiling File IO Events on datanode volumes and expose
  metrics.</description>
</property>
{code}
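
The profiling hook itself boils down to timing each file IO call and feeding 
the elapsed time into per-volume metrics. A hedged, self-contained sketch of 
that idea (not the actual FileIoProvider/ProfilingFileIoEvents API):

{code}
import java.io.IOException;

class IoProfilerSketch {
  static volatile boolean enabled = false; // off by default, per the comment above

  interface Io<T> {
    T run() throws IOException;
  }

  // Wraps a file IO operation with a timer when profiling is enabled.
  static <T> T profile(String volume, Io<T> op) throws IOException {
    if (!enabled) {
      return op.run();
    }
    long begin = System.nanoTime();
    try {
      return op.run();
    } finally {
      recordLatency(volume, System.nanoTime() - begin);
    }
  }

  static void recordLatency(String volume, long nanos) {
    // Placeholder: a real implementation would update per-volume metrics.
    System.out.printf("volume=%s latencyNs=%d%n", volume, nanos);
  }
}
{code}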

> Adding per disk IO statistics in DataNode.
> --
>
> Key: HDFS-10959
> URL: https://issues.apache.org/jira/browse/HDFS-10959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-10959.00.patch
>
>
> This ticket is opened to support per disk IO statistics in DataNode based on 
> HDFS-10930. The statistics added will make it easier to implement HDFS-4169 
> "Add per-disk latency metrics to DataNode".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755493#comment-15755493
 ] 

Wei-Chiu Chuang commented on HDFS-11192:


Ping. Any progress? 

> OOM during Quota Initialization lead to Namenode hang
> -
>
> Key: HDFS-11192
> URL: https://issues.apache.org/jira/browse/HDFS-11192
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or 
> cannot be created, it does not notify the parent. The parent keeps waiting 
> for the notify call, and that wait is not timed, either.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
> "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
> native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}
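
As a minimal, self-contained illustration of that failure mode (illustrative 
code, not the NameNode's quota initialization): {{ForkJoinTask#join}} waits 
untimed, so a dead worker can hang the caller forever, whereas a timed 
{{get}} at least surfaces the hang.

{code}
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.TimeUnit;

class TimedJoinSketch {
  public static void main(String[] args) throws Exception {
    ForkJoinPool pool = new ForkJoinPool();
    ForkJoinTask<Long> task = pool.submit(() -> 42L); // stands in for quota init work
    // task.join() would block indefinitely if the worker died from OOM;
    // a timed get() turns the hang into a visible TimeoutException instead.
    System.out.println(task.get(10, TimeUnit.SECONDS));
    pool.shutdown();
  }
}
{code}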



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11257) Evacuate DN when the remaining is negative

2016-12-16 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755365#comment-15755365
 ] 

Inigo Goiri commented on HDFS-11257:


The proposal would be for the {{BlockManager}} to check for this situation and 
leverage the code in {{blockHasEnoughRacks()}} to mark blocks as needing 
replicas on other nodes. Once that's done, the block placement policy would 
mark the blocks on machines with {{getRemaining() < 0}} for deletion.

> Evacuate DN when the remaining is negative
> --
>
> Key: HDFS-11257
> URL: https://issues.apache.org/jira/browse/HDFS-11257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.3
>Reporter: Inigo Goiri
>
> Datanodes have a maximum amount of disk they can use. This is set using 
> {{dfs.datanode.du.reserved}}. For example, if we have a 1TB disk and we set 
> the reserved space to 100GB, the DN can only use ~900GB. However, if we fill 
> the DN and later other processes (e.g., logs or co-located services) start 
> to use the disk space, the remaining space will go negative and the used 
> storage will exceed 100%.
> The Rebalancer or decommissioning could cover this situation. However, both 
> approaches require administrator intervention, even though this situation 
> violates the settings. Note that decommissioning would be too extreme, as it 
> would evacuate all the data.
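
The arithmetic from the description, as a tiny sketch (assuming remaining is 
computed as capacity minus reserved, DFS-used, and non-DFS-used space; the 
exact accounting in the DN may differ):

{code}
class RemainingSketch {
  public static void main(String[] args) {
    long gb = 1024L * 1024 * 1024;
    long capacity   = 1000 * gb; // ~1 TB disk
    long reserved   = 100 * gb;  // dfs.datanode.du.reserved
    long dfsUsed    = 900 * gb;  // the DN filled up to its cap
    long nonDfsUsed = 50 * gb;   // logs / co-located services arrive later
    long remaining  = capacity - reserved - dfsUsed - nonDfsUsed;
    System.out.println(remaining / gb + " GB remaining"); // prints -50
  }
}
{code}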



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11257) Evacuate DN when the remaining is negative

2016-12-16 Thread Inigo Goiri (JIRA)
Inigo Goiri created HDFS-11257:
--

 Summary: Evacuate DN when the remaining is negative
 Key: HDFS-11257
 URL: https://issues.apache.org/jira/browse/HDFS-11257
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.7.3
Reporter: Inigo Goiri


Datanodes have a maximum amount of disk they can use. This is set using 
{{dfs.datanode.du.reserved}}. For example, if we have a 1TB disk and we set 
the reserved space to 100GB, the DN can only use ~900GB. However, if we fill 
the DN and later other processes (e.g., logs or co-located services) start to 
use the disk space, the remaining space will go negative and the used storage 
will exceed 100%.

The Rebalancer or decommissioning could cover this situation. However, both 
approaches require administrator intervention, even though this situation 
violates the settings. Note that decommissioning would be too extreme, as it 
would evacuate all the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf

2016-12-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755358#comment-15755358
 ] 

Chris Douglas commented on HDFS-6984:
-

bq. So for the Hive usecase, they would have to pass around a full serialized 
FileStatus even though open() only needs the PathHandle field? This API seems 
fine for same-process usage (HDFS-9806?) but inefficient for cross-process. I 
think UNIX users are also used to the idea of an inode id separate from a file 
status.
Sure, but this wasn't what I was trying to get at. On its efficiency:
# The overhead at the NN is significant, since it's often the cluster 
bottleneck. If we're returning information that's immediately discarded, the 
client-side inefficiency of not-immediately-discarding is not irrelevant, but 
it's not significant. For a few hundred {{FileStatus}} records, this overhead 
is less than 1MB, and it's often available for GC (most {{FileStatus}} objects 
are short-lived).
# For cases where we want to manage thousands or millions of FileStatus 
instances, this overhead may become significant. But we can compress repeated 
data, and most collections of {{FileStatus}} objects are mostly repeated data. 
Further, most of these fields are optional, and if an application wants to 
transfer only necessary fields (e.g., the Path and PathHandle, omitting 
everything else), that's fine.
# I share your caution w.r.t. the API surface, which is why HDFS-7878 avoids 
adding lots of new calls accepting {{PathHandle}} objects. Users of 
{{FileSystem}} almost always intend to refer to the entity returned by the last 
query, not whatever happens to exist at that path now.

In exchange for non-optimal record size, the API gains some convenient symmetry 
(i.e., get a FileStatus, use a FileStatus) that at least makes it possible to 
avoid the TOCTOU races that are uncommon, but annoying. The serialization can 
omit fields. We can use generic container formats that understand PB to 
compress collections of {{FileStatus}} objects. We can even use a more generic 
type ({{FileStatus}}) if the specific type follows some PB conventions. 
Considering the cost of these, the distance from "optimal" seems well within 
the ambient noise.

bq. I still lean toward removing Writable altogether, since it reduces our API 
surface. Similarly, I'd rather not open up HdfsFileStatus as a public API 
(without a concrete usecase) since it expands our API surface.
Again, I'm mostly ambivalent about {{Writable}}. [~ste...@apache.org]? 
Preserving the automatic serialization that some user programs may rely on... 
we can try removing it in 3.x-alpha/beta, and see if anyone complains. If I 
moved this to a library, would that move this forward?

I didn't mean to suggest that {{HdfsFileStatus}} should be a public API (with 
all the restrictions on evolving it).

bq. The cross-serialization is also fragile since we need to be careful not to 
reuse field numbers across two structures, and I've seen numbering mistakes 
made before even for normal PB changes.
Granted, but a hole in the PB toolchain doesn't mean we shouldn't use the 
feature. We can add more comments to the .proto. Reading through HDFS-6326, 
perhaps making {{FsPermission}} a protobuf record would help.

> In Hadoop 3, make FileStatus serialize itself via protobuf
> --
>
> Key: HDFS-6984
> URL: https://issues.apache.org/jira/browse/HDFS-6984
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, 
> HDFS-6984.003.patch, HDFS-6984.nowritable.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
> to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
> have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
> information.  The protobuf form is preferable, since it allows us to add new 
> fields in a backwards-compatible way.  Another issue is that already a lot of 
> subclasses of FileStatus don't override the Writable methods of the 
> superclass, breaking the interface contract that read(status.write) should be 
> equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so 
> that we don't have to deal with these issues.  It's probably too late to do 
> this in Hadoop 2, since user code may be relying on the existing FileStatus 
> serialization there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


[jira] [Commented] (HDFS-11114) Support for running async disk checks in DataNode

2016-12-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755260#comment-15755260
 ] 

ASF GitHub Bot commented on HDFS-4:
---

Github user arp7 closed the pull request at:

https://github.com/apache/hadoop/pull/154


> Support for running async disk checks in DataNode
> -
>
> Key: HDFS-4
> URL: https://issues.apache.org/jira/browse/HDFS-4
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.9.0, 3.0.0-alpha2
>
>
> Introduce support for running async checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755252#comment-15755252
 ] 

Chris Nauroth commented on HDFS-9911:
-

[~linyiqun], thank you for the patch!

> TestDataNodeLifeline  Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In the HDFS-1312 branch, we have a failure in this test.
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755179#comment-15755179
 ] 

Hudson commented on HDFS-9911:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11008 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11008/])
HDFS-9911. TestDataNodeLifeline Fails intermittently. Contributed by 
(aengineer: rev a95639068c99ebcaefe8b6c4268449d12a6577d6)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java


> TestDataNodeLifeline  Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In the HDFS-1312 branch, we have a failure in this test.
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9911:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: (was: 2.8.0)
  3.0.0-alpha2
Target Version/s: 3.0.0-alpha2  (was: 2.8.0)
  Status: Resolved  (was: Patch Available)

 [~cnauroth] Thanks for the code review comments. [~linyiqun] Thank you for the 
contribution. I have committed this to trunk.

> TestDataNodeLifeline  Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In the HDFS-1312 branch, we have a failure in this test.
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755091#comment-15755091
 ] 

Anu Engineer commented on HDFS-9911:


+1 on patch 2.

> TestDataNodeLifeline  Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In the HDFS-1312 branch, we have a failure in this test.
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11215) Log Namenode thread dump on unexpected exits

2016-12-16 Thread Stephen O'Donnell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754869#comment-15754869
 ] 

Stephen O'Donnell commented on HDFS-11215:
--

Hey [~apurtell] - I wasn't particularly thinking of front line operators for 
this 'feature'. My thinking was more that if you are getting repeated crashes 
and the cause is not clear, this could be something to give your support vendor 
for them to dig into.

> Log Namenode thread dump on unexpected exits
> 
>
> Key: HDFS-11215
> URL: https://issues.apache.org/jira/browse/HDFS-11215
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>
> With HA namenodes, we can reasonably frequently see a namenode exit because 
> a quorum of journals did not respond within the 20-second timeout, for 
> example:
> {code}
> 2016-11-29 01:43:22,969  WARN 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19016 ms 
> (timeout=2 ms) for a response for sendEdits. No responses yet.
> 2016-11-29 01:43:23,954  FATAL 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for 
> required journal (JournalAndStream(mgr=QJM to [10.x.x.x:8485, 10.x.x.x:8485, 
> 10.x.x.x:8485], stream=QuorumOutputStream starting at txid 762756576))
> 2016-11-29 01:43:23,954  FATAL 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for 
> required journal (JournalAndStream(mgr=QJM to [10.x.x.x:8485, 10.x.x.x:8485, 
> 10.x.x.x:8485], stream=QuorumOutputStream starting at txid 762756576))
> java.io.IOException: Timed out waiting 2ms for a quorum of nodes to 
> respond.
>   at 
> org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
>   at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
>   at 
> org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2687)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2559)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:592)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> 2016-11-29 01:43:23,955  WARN 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting 
> QuorumOutputStream starting at txid 762756576
> 2016-11-29 01:43:23,987  INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1
> 2016-11-29 01:43:24,003  INFO 
> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
> {code}
> When this happens, it can often be a mystery what caused it. E.g., was it a 
> network issue, was the thread blocked waiting on a response from a KDC (due 
> to no kdc_timeout set), was it a slow disk (on the Journal, or even log4j on 
> the Namenode), etc.?
> I wonder if it would make sense to log a thread dump to the Namenode's log 
> just before exiting, as this may give some clues about what caused the lack 
> of response from the Journals?
> There 
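
A minimal sketch of the suggestion, using the standard {{ThreadMXBean}} 
(illustrative code, not the actual NameNode exit path, where this would run 
just before {{ExitUtil}} terminates the process):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpOnExit {
  static String buildThreadDump() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    StringBuilder dump = new StringBuilder("Thread dump before exit:\n");
    for (ThreadInfo info : bean.dumpAllThreads(true, true)) {
      dump.append(info.toString()); // includes stack and lock information
    }
    return dump.toString();
  }

  public static void main(String[] args) {
    // In the NameNode this string would go to the log, not stderr.
    System.err.println(buildThreadDump());
  }
}
{code}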

[jira] [Commented] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754689#comment-15754689
 ] 

Wei-Chiu Chuang commented on HDFS-11254:


On the other hand, maybe a dedicated NameNode IPC thread for ZKFC, or something 
similar to the DataNode lifeline protocol, is the ultimate solution, because I 
am sure there are other incidents like this that could trigger a NameNode crash.

> Standby NameNode may crash during failover if loading edits takes too long
> --
>
> Key: HDFS-11254
> URL: https://issues.apache.org/jira/browse/HDFS-11254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Wei-Chiu Chuang
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.9.0, 3.0.0-beta1
>
>
> We found the Standby NameNode crashed when it tried to transition from 
> standby to active. This issue is similar in nature to HDFS-11225. 
> The root cause is that all IPC threads were blocked, so the ZKFC connection 
> to the NN timed out. In particular, when it crashed, we saw a few threads 
> blocked waiting on this thread:
> {noformat}
> Thread 188 (IPC Server handler 25 on 8022):
>   State: RUNNABLE
>   Blocked count: 278
>   Waited count: 17419
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:886)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:875)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:860)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:827)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
> java.security.AccessController.doPrivileged(Native Method)
> javax.security.auth.Subject.doAs(Subject.java:415)
> 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477)
> 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458)
> {noformat}
> This thread is part of {{FsImage#loadEdits}}, run when the NameNode failed 
> over. We also found that the following edit log was rejected after the 
> journal node advanced its epoch, which implies a failed transitionToActive 
> request.
> {noformat}
> 10.10.17.1:8485: IPC's epoch 11 is less than the last promised epoch 12
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.startLogSegment(Journal.java:513)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.startLogSegment(JournalNodeRpcServer.java:162)
> at 
> org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.startLogSegment(QJournalProtocolServerSideTranslatorPB.java:198)
> at 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25425)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> at 
> 

[jira] [Commented] (HDFS-11222) Document application/octet-stream as required content type for data upload requests in HTTPFS

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754657#comment-15754657
 ] 

Wei-Chiu Chuang commented on HDFS-11222:


[~linyiqun] thanks for the updated patch, but I still have questions.
The content type check seems to be performed only when I add the extra 
parameter data=true ({{?op=CREATE&data=true}}). [~saileshpatel], would you 
mind taking a look to see if this matches your observation?

> Document application/octet-stream as required content type for data upload 
> requests in HTTPFS
> -
>
> Key: HDFS-11222
> URL: https://issues.apache.org/jira/browse/HDFS-11222
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, httpfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Sailesh Patel
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11222.001.patch, HDFS-11222.002.patch
>
>
> Documentation at [1] should indicate that PUT and POST requests require a 
> content-type header ({{--header}}), e.g.:
> curl -i -X PUT -T  
> "http://:/webhdfs/v1/?op=CREATE..." --header 
> "content-type: application/octet-stream"
> [1]  
> https://hadoop.apache.org/docs/stable2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11254:
---
Comment: was deleted

(was: Looking at the code again, the more relevant one is HDFS-8865, which 
uses a thread pool to improve quota initialization performance. HDFS-6763 
moved the operation out of FSImage#loadEdits to 
FSNameSystem#startActiveServices, but it still holds the NameNodeRPCServer 
lock, so that's not going to help. But HDFS-11192 mentioned an issue with the 
thread pool used in HDFS-8865.)

> Standby NameNode may crash during failover if loading edits takes too long
> --
>
> Key: HDFS-11254
> URL: https://issues.apache.org/jira/browse/HDFS-11254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Wei-Chiu Chuang
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.9.0, 3.0.0-beta1
>
>
> We found the Standby NameNode crashed when it tried to transition from 
> standby to active. This issue is similar in nature to HDFS-11225. 
> The root cause is that all IPC threads were blocked, so the ZKFC connection 
> to the NN timed out. In particular, when it crashed, we saw a few threads 
> blocked waiting on this thread:
> {noformat}
> Thread 188 (IPC Server handler 25 on 8022):
>   State: RUNNABLE
>   Blocked count: 278
>   Waited count: 17419
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:886)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
> 
> org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:875)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:860)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:827)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
> 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
> java.security.AccessController.doPrivileged(Native Method)
> javax.security.auth.Subject.doAs(Subject.java:415)
> 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477)
> 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458)
> {noformat}
> This thread is part of {{FSImage#loadEdits}} when the NameNode failed over. 
> We also found that the following edit log request was rejected after the 
> journal node advanced its epoch, which implies a failed transitionToActive 
> request.
> {noformat}
> 10.10.17.1:8485: IPC's epoch 11 is less than the last promised epoch 12
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.startLogSegment(Journal.java:513)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.startLogSegment(JournalNodeRpcServer.java:162)
> at 
> org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.startLogSegment(QJournalProtocolServerSideTranslatorPB.java:198)
> at 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25425)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
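For readers unfamiliar with the quorum journal fencing behind that message, 
here is a simplified, hypothetical sketch of the rule {{Journal#checkRequest}} 
enforces; the real implementation differs, but the effect is that a writer 
holding a stale epoch is rejected:

{code:java}
// Simplified sketch, not the actual Journal code: each JournalNode
// promises an epoch to the newest writer and rejects requests stamped
// with anything older, fencing off the NameNode that lost the failover.
class EpochFence {
  private long lastPromisedEpoch;

  synchronized void promiseEpoch(long epoch) {
    if (epoch <= lastPromisedEpoch) {
      throw new IllegalStateException("Epoch " + epoch + " already promised");
    }
    lastPromisedEpoch = epoch; // never accept an older writer again
  }

  synchronized void checkRequest(long requestEpoch) {
    if (requestEpoch < lastPromisedEpoch) {
      throw new IllegalStateException("IPC's epoch " + requestEpoch
          + " is less than the last promised epoch " + lastPromisedEpoch);
    }
  }
}
{code}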

[jira] [Commented] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754366#comment-15754366
 ] 

Wei-Chiu Chuang commented on HDFS-11254:


Looking at the code again, the more relevant one is HDFS-8865, which uses a 
thread pool to improve quota initialization performance. HDFS-6763 moved the 
operation outside FSImage#loadEdits to FSNamesystem#startActiveServices, but 
it still holds the NameNodeRPCServer lock, so that's not going to help. 
However, HDFS-11192 mentioned an issue with using the thread pool from 
HDFS-8865.
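For illustration, a hedged sketch of the thread-pool idea: fan the recursive 
quota walk out over a {{ForkJoinPool}} instead of descending in a single 
handler thread. {{java.io.File}} and a plain object count stand in for the 
real INode tree and quota counts:

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch only: parallelizes a recursive tree walk the way HDFS-8865
// parallelizes quota initialization. File stands in for INodeDirectory.
class QuotaWalk extends RecursiveTask<Long> {
  private final File dir;

  QuotaWalk(File dir) { this.dir = dir; }

  @Override
  protected Long compute() {
    long count = 1; // count this directory itself
    File[] children = dir.listFiles();
    if (children == null) return count;
    List<QuotaWalk> forked = new ArrayList<>();
    for (File child : children) {
      if (child.isDirectory()) {
        QuotaWalk task = new QuotaWalk(child);
        task.fork();        // walk subtrees on other pool threads
        forked.add(task);
      } else {
        count++;            // plain file: just count it
      }
    }
    for (QuotaWalk task : forked) {
      count += task.join(); // gather subtree counts
    }
    return count;
  }

  public static void main(String[] args) {
    long total = new ForkJoinPool().invoke(new QuotaWalk(new File(".")));
    System.out.println("namespace objects: " + total);
  }
}
{code}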


[jira] [Commented] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long

2016-12-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754339#comment-15754339
 ] 

Wei-Chiu Chuang commented on HDFS-11254:


[~linyiqun] yeah, code-wise it looks that way. Thanks a lot.
We are based on Hadoop 2.6/2.7, so we missed that one.

A few other related JIRAs: HDFS-7811, HDFS-7728, HDFS-8865, HDFS-9003 and 
HDFS-6763.


[jira] [Commented] (HDFS-9911) TestDataNodeLifeline Fails intermittently

2016-12-16 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754110#comment-15754110
 ] 

Yiqun Lin commented on HDFS-9911:
-

Hi [~anu], [~cnauroth], could you please help with the final commit of the 
patch? I see there are no further comments from others now. Thanks.

> TestDataNodeLifeline Fails intermittently
> --
>
> Key: HDFS-9911
> URL: https://issues.apache.org/jira/browse/HDFS-9911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-9911.001.patch, HDFS-9911.002.patch
>
>
> In the HDFS-1312 branch, we have a failure for this test:
> {{org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime}}
> {noformat}
> Error Message
> Expect metrics to count no lifeline calls. expected:<0> but was:<1>
> Stacktrace
> java.lang.AssertionError: Expect metrics to count no lifeline calls. 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeLifeline.testNoLifelineSentIfHeartbeatsOnTime(TestDataNodeLifeline.java:256)
> {noformat}
> Details can be found here.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14726/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeLifeline/testNoLifelineSentIfHeartbeatsOnTime/
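For background, a hedged sketch of the timing rule this test exercises: the 
DataNode falls back to the lightweight lifeline RPC only when a regular 
heartbeat is overdue, so any pause that delays one heartbeat inside the test 
window can produce the unexpected count of 1. The names below are 
illustrative, not the actual DataNode code:

{code:java}
// Illustrative only; not the real BPServiceActor logic.
class LifelineDecision {
  private final long heartbeatIntervalMs;
  private long lastHeartbeatMs;

  LifelineDecision(long heartbeatIntervalMs, long startMs) {
    this.heartbeatIntervalMs = heartbeatIntervalMs;
    this.lastHeartbeatMs = startMs;
  }

  // A lifeline is sent only when the scheduled heartbeat is overdue,
  // which is why a GC or scheduler pause can flip the test's count to 1.
  boolean shouldSendLifeline(long nowMs) {
    return nowMs - lastHeartbeatMs > heartbeatIntervalMs;
  }

  void heartbeatSent(long nowMs) {
    lastHeartbeatMs = nowMs;
  }
}
{code}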



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11256) Rebalance specific directory

2016-12-16 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11256:

Target Version/s:   (was: 3.0.0-alpha1)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11256) Rebalance specific directory

2016-12-16 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11256:

Fix Version/s: (was: 3.0.0-alpha1)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11256) Rebalance specific directory

2016-12-16 Thread Amos Bird (JIRA)
Amos Bird created HDFS-11256:


 Summary: Rebalance specific directory
 Key: HDFS-11256
 URL: https://issues.apache.org/jira/browse/HDFS-11256
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: balancer & mover
Affects Versions: 3.0.0-alpha1
Reporter: Amos Bird
 Fix For: 3.0.0-alpha1


Currently HDFS only supports rebalancing over the entire cluster. This cannot 
easily be leveraged by data processing systems such as Hive, Spark, Impala, 
etc. 

In Hive, we may need to maximize the IO performance of some fact tables by 
carefully sharding their blocks evenly over all disks. Normally an INSERT ... 
SELECT is done to achieve such a redistribution. 

Given that a table is backed by one directory on HDFS, rebalancing a specific 
directory may be very useful. 
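To make the requested outcome concrete (this is not any existing balancer 
API), here is a toy Java sketch that spreads one directory's blocks 
round-robin across disks; the block names and disk paths are hypothetical:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the desired per-directory layout; a real balancer would
// move replicas between DataNode storages rather than assigning names.
public class EvenSpread {
  public static void main(String[] args) {
    List<String> blocks = Arrays.asList("blk_1", "blk_2", "blk_3", "blk_4", "blk_5");
    List<String> disks = Arrays.asList("/data/1", "/data/2", "/data/3");

    Map<String, List<String>> placement = new TreeMap<>();
    for (String disk : disks) {
      placement.put(disk, new ArrayList<>());
    }
    // Round-robin yields the even per-directory spread that a
    // cluster-wide balance cannot guarantee for a single table.
    for (int i = 0; i < blocks.size(); i++) {
      placement.get(disks.get(i % disks.size())).add(blocks.get(i));
    }
    placement.forEach((disk, assigned) -> System.out.println(disk + " -> " + assigned));
  }
}
{code}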



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org