[jira] [Updated] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails

2017-10-12 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12415:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7240
   Status: Resolved  (was: Patch Available)

Thanks [~vagarychen] and [~cheersyang] for the reviews. I have committed this 
to HDFS-7240 branch.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> 
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, 
> HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]
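
The trace shows {{hasEnoughSpace}} failing inside {{getNodeStat}}. The committed patch is not reproduced in this message, so the following is only a minimal sketch of one defensive pattern, assuming the NPE comes from looking up a node that has no stat entry. All class and method bodies below are hypothetical stand-ins, not the real Ozone SCM code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins; these are not the real Ozone SCM classes.
final class NodeStat {
  final long remainingBytes;

  NodeStat(long remainingBytes) {
    this.remainingBytes = remainingBytes;
  }
}

final class NodeStatRegistry {
  private final Map<String, NodeStat> stats = new ConcurrentHashMap<>();

  void register(String nodeId, long remainingBytes) {
    stats.put(nodeId, new NodeStat(remainingBytes));
  }

  void remove(String nodeId) {
    stats.remove(nodeId);
  }

  // Returns null when the node is unknown (never registered, or removed).
  NodeStat getNodeStat(String nodeId) {
    return stats.get(nodeId);
  }

  // Treats an unknown node as having no space instead of dereferencing null.
  boolean hasEnoughSpace(String nodeId, long sizeRequired) {
    NodeStat stat = getNodeStat(nodeId);
    return stat != null && stat.remainingBytes >= sizeRequired;
  }
}
```

With this guard, a placement filter that calls {{hasEnoughSpace}} for a node that disappeared between listing and lookup simply skips that node.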



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12556) [SPS] : Block movement analysis should be done in read lock.

2017-10-12 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-12556:
--
Attachment: HDFS-12556-HDFS-10285-03.patch

Rebased patch, please review...

> [SPS] : Block movement analysis should be done in read lock.
> 
>
> Key: HDFS-12556
> URL: https://issues.apache.org/jira/browse/HDFS-12556
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-12556-HDFS-10285-01.patch, 
> HDFS-12556-HDFS-10285-02.patch, HDFS-12556-HDFS-10285-03.patch
>
>
> {noformat}
> 2017-09-27 15:58:32,852 [StoragePolicySatisfier] ERROR 
> namenode.StoragePolicySatisfier 
> (StoragePolicySatisfier.java:handleException(308)) - StoragePolicySatisfier 
> thread received runtime exception. Stopping Storage policy satisfier work
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getStorages(BlockManager.java:4130)
>   at 
> org.apache.hadoop.hdfs.server.namenode.StoragePolicySatisfier.analyseBlocksStorageMovementsAndAssignToDN(StoragePolicySatisfier.java:362)
>   at 
> org.apache.hadoop.hdfs.server.namenode.StoragePolicySatisfier.run(StoragePolicySatisfier.java:236)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
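
The issue title asks for the block movement analysis to run under the read lock. The trace alone does not establish the cause, so this is only a sketch under the assumption that the storages were modified while the analysis was indexing into them. {{GuardedStorages}} and its methods are hypothetical names, not part of the HDFS code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for state guarded by a read/write lock.
final class GuardedStorages {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> storages = new ArrayList<>();

  // Writers mutate the list only while holding the write lock.
  void setStorages(List<String> updated) {
    lock.writeLock().lock();
    try {
      storages.clear();
      storages.addAll(updated);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Analysis copies the list under the read lock, so it cannot change size
  // between reading its length and indexing into it.
  List<String> snapshotForAnalysis() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(storages);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Multiple readers can hold the read lock at once, so analysis does not serialize against other readers, only against writers.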






[jira] [Commented] (HDFS-12613) Native EC coder should implement release() as idempotent function.

2017-10-12 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203099#comment-16203099
 ] 

Kai Zheng commented on HDFS-12613:
--

Thanks [~eddyxu] for the update! +1 pending Jenkins and the minor checkstyle 
fix.

> Native EC coder should implement release() as idempotent function.
> --
>
> Key: HDFS-12613
> URL: https://issues.apache.org/jira/browse/HDFS-12613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12613.00.patch, HDFS-12613.01.patch, 
> HDFS-12613.02.patch, HDFS-12613.03.patch
>
>
> Recently, we found that the native EC coder crashes the JVM because 
> {{NativeRSDecoder#release()}} is called multiple times (HDFS-12612 and 
> HDFS-12606). 
> We should strengthen the native code implementation to make {{release()}} 
> idempotent as well.
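
As an illustration of the idempotency property being requested, here is a minimal Java-side sketch. The issue itself targets the native code, which is not shown in this message; {{IdempotentCoder}} and {{freeNativeResources}} are hypothetical names, and the counter merely stands in for a JNI call that frees native memory.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical coder wrapper; not the actual NativeRSDecoder implementation.
final class IdempotentCoder {
  private final AtomicBoolean released = new AtomicBoolean(false);
  private int nativeFreeCalls = 0;

  // Only the first caller wins the compareAndSet and frees the resource;
  // every later call is a no-op.
  void release() {
    if (released.compareAndSet(false, true)) {
      freeNativeResources();
    }
  }

  boolean isReleased() {
    return released.get();
  }

  int nativeFreeCalls() {
    return nativeFreeCalls;
  }

  // Stand-in for the native call that would free the coder's memory.
  private void freeNativeResources() {
    nativeFreeCalls++;
  }
}
```

Calling {{release()}} any number of times then frees the underlying resource exactly once, which avoids a double free.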






[jira] [Commented] (HDFS-12504) Ozone: Improve SQLCLI performance

2017-10-12 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203089#comment-16203089
 ] 

Weiwei Yang commented on HDFS-12504:


bq.  if we can have some simple benchmark results to see the performance 
improvement,

Agreed. I suggest adding some logging to record the time 
spent on critical paths, e.g., inserting a single record into the target DB and 
inserting a batch of records into the target DB, so that we can estimate the 
performance improvement given by this patch. [~yuanbo], does that make sense to you?
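
A minimal sketch of the kind of timing instrumentation suggested above, assuming a small helper that accumulates elapsed time per stage. {{StageTimer}} is a hypothetical name and is not part of the SQLCLI tool; the wrapped work would be the single-record or batch insert.

```java
import java.util.function.Supplier;

// Hypothetical helper that accumulates wall-clock time for one stage.
final class StageTimer {
  private long totalNanos = 0;
  private long calls = 0;

  // Runs the work, adds its elapsed time to the total, and returns its result.
  <T> T time(Supplier<T> work) {
    long start = System.nanoTime();
    try {
      return work.get();
    } finally {
      totalNanos += System.nanoTime() - start;
      calls++;
    }
  }

  long calls() {
    return calls;
  }

  long totalNanos() {
    return totalNanos;
  }

  // One-line summary suitable for a log statement.
  String summary(String stage) {
    return stage + ": " + calls + " calls, " + (totalNanos / 1_000_000) + " ms total";
  }
}
```

Using one timer for per-record inserts and another for batch inserts would give the two totals to compare before and after the patch.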

> Ozone: Improve SQLCLI performance
> -
>
> Key: HDFS-12504
> URL: https://issues.apache.org/jira/browse/HDFS-12504
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Yuanbo Liu
>  Labels: performance
> Attachments: HDFS-12504-HDFS-7240.001.patch
>
>
> In my test, my {{ksm.db}} has *3017660* entries with a total size of *128 MB*. 
> The SQLCLI tool ran for over *2 hours* and still had not finished exporting 
> the DB. This is because it iterates over each entry and inserts it into 
> another SQLite DB file, which is not efficient. We need to make this run more 
> efficiently on large DB files.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203065#comment-16203065
 ] 

Hadoop QA commented on HDFS-11467:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 47 unchanged - 2 fixed = 47 total (was 49) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 44s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-11467 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12891964/HDFS-11467.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 96aed1c2fd5d 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21678/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21678/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job

[jira] [Commented] (HDFS-12637) Extend TestDistributedFileSystemWithECFile with a random EC policy

2017-10-12 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203063#comment-16203063
 ] 

Xiao Chen commented on HDFS-12637:
--

Thanks [~tasanuma0829] for filing the jira and providing a patch.

I'm not familiar with the EC details, but what is the difference between using a 
random non-default EC policy and parameterizing the base test 
({{TestDistributedFileSystemWithECFile}}) over all policies? IIUC, parameterizing 
will deterministically cover all policies, right?
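
To make the two approaches in the question concrete, here is a minimal sketch in plain Java. It does not use the real EC policy classes or JUnit parameterization; {{PolicySelection}} and the policy names passed to it are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper contrasting the two ways of choosing policies to test.
final class PolicySelection {
  // Deterministic: every non-default policy is exercised on each run.
  static List<String> allNonDefault(List<String> policies, String defaultPolicy) {
    List<String> result = new ArrayList<>();
    for (String p : policies) {
      if (!p.equals(defaultPolicy)) {
        result.add(p);
      }
    }
    return result;
  }

  // Randomized: one non-default policy per run, so coverage only accumulates
  // across many runs. Assumes at least one non-default policy exists.
  static String randomNonDefault(List<String> policies, String defaultPolicy, Random rng) {
    List<String> candidates = allNonDefault(policies, defaultPolicy);
    return candidates.get(rng.nextInt(candidates.size()));
  }
}
```

The trade-off is run time versus coverage per run: the deterministic list multiplies test time by the number of policies, while the random pick keeps each run short but may miss a failing policy on any given run.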

> Extend TestDistributedFileSystemWithECFile with a random EC policy
> --
>
> Key: HDFS-12637
> URL: https://issues.apache.org/jira/browse/HDFS-12637
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
> Attachments: HDFS-12637.1.patch
>
>







[jira] [Commented] (HDFS-12642) Log block and datanode details in BlockRecoveryWorker

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203062#comment-16203062
 ] 

Hadoop QA commented on HDFS-12642:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 43s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m  2s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12642 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12891634/HDFS-12642.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fd594502c92b 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21677/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21677/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21677/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http:

[jira] [Comment Edited] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails

2017-10-12 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203055#comment-16203055
 ] 

Weiwei Yang edited comment on HDFS-12415 at 10/13/17 5:03 AM:
--

I am also +1 to [~msingh]'s patch, let's get this committed and see if this 
resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your 
attention. [~msingh], feel free to commit the latest patch since you got 2 +1s :).


was (Author: cheersyang):
I am also +1 to [~msingh]'s patch, let's get this committed and see if this 
resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your 
attention.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> 
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, 
> HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]






[jira] [Assigned] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails

2017-10-12 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned HDFS-12415:
--

Assignee: Mukul Kumar Singh  (was: Weiwei Yang)

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> 
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, 
> HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]






[jira] [Commented] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails

2017-10-12 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203055#comment-16203055
 ] 

Weiwei Yang commented on HDFS-12415:


I am also +1 to [~msingh]'s patch, let's get this committed and see if this 
resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your 
attention.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> 
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, 
> HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]






[jira] [Commented] (HDFS-12613) Native EC coder should implement release() as idempotent function.

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203032#comment-16203032
 ] 

Hadoop QA commented on HDFS-12613:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 34m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  7m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
26m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 28m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 59s{color} | {color:orange} root: The patch generated 1 new + 82 unchanged - 
0 fixed = 83 total (was 82) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  7m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 13m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 18s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
7s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}295m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestFixKerberosTicketOrder |
|   | hadoop.ha.TestZKFailoverController |
|   | hadoop.security.token.delegation.TestZKDelegationTokenSecretManager |
|   | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.tracing.TestTracing |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   |

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Jiandan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203022#comment-16203022
 ] 

Jiandan Yang  commented on HDFS-12638:
--

[~daryn] Part of the NN audit log was lost, and we did not find a truncate 
operation on this file, but judging from the DN logs I think the file was 
truncated. In the DN log, the block was first finalized and then recovered.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>
> The active NameNode exited due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do not 
> know why BlockCollection is null. Looking at the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> for whether BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}
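
The description mentions that an earlier null check on BlockCollection was removed. As a rough illustration of what such a guard does, and not a proposal for the actual fix, the sketch below filters out work items whose owner is null before they are scheduled. {{WorkItem}} and {{ReplicationScheduler}} are hypothetical stand-ins, not the real BlockManager or ReplicationWork types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not the real HDFS block management classes.
final class WorkItem {
  final String blockId;
  final String blockCollection; // null when the block has no owning file

  WorkItem(String blockId, String blockCollection) {
    this.blockId = blockId;
    this.blockCollection = blockCollection;
  }
}

final class ReplicationScheduler {
  // Keeps only items that still have an owner; the rest are skipped rather
  // than dereferenced later when targets are chosen.
  static List<WorkItem> schedulable(List<WorkItem> candidates) {
    List<WorkItem> result = new ArrayList<>();
    for (WorkItem item : candidates) {
      if (item.blockCollection != null) {
        result.add(item);
      }
    }
    return result;
  }
}
```

A guard like this only hides the symptom; why the BlockCollection is null in the first place would still need to be understood.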






[jira] [Commented] (HDFS-12621) Inconsistency/confusion around ViewFileSystem.getDelagation

2017-10-12 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202997#comment-16202997
 ] 

Mohammad Kamrul Islam commented on HDFS-12621:
--

Thanks [~sureshms] for re-adding me.

[~xkrogen]: thanks for your comments. Follow-up comments:
_addDelegationTokens_ for ViewFileSystem works fine and collects the 
appropriate tokens from the child filesystem(s). But the confusion is that 
*getDelegationToken*() works for most FSs but not for ViewFileSystem. 
 
Which option do you think is a good idea? I think option #1 could be less 
risky, and it would at least give the caller a message to call 
_addDelegationTokens_ instead. 



> Inconsistency/confusion around ViewFileSystem.getDelagation 
> 
>
> Key: HDFS-12621
> URL: https://issues.apache.org/jira/browse/HDFS-12621
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>
> *Symptom*: 
> When a user invokes ViewFileSystem.getDelegationToken(String renewer), she 
> gets "null". However, for any other file system, it returns a valid 
> delegation token. For a normal user, this is very confusing, and it takes 
> substantial time to debug or find an alternative.
> *Root Cause:*
>  ViewFileSystem inherits the basic implementation from 
> FileSystem.getDelegationToken(), which returns "_null_". The comments in the 
> source code say not to use it and to use addDelegationTokens() instead. 
> However, it works fine for DistributedFileSystem. 
> In short, the same client call works for hdfs:// but not for viewfs://, 
> and there is no way for an end user to identify the root cause. This also 
> creates a lot of confusion for any service that is supposed to work with both 
> viewfs and hdfs.
> *Possible Solution*:
> _Option 1:_ Add a LOG.warn() with reasons/alternatives before returning 
> "null" in the base class.
> _Option 2:_ As done for other FSs, ViewFileSystem can override the method with 
> an implementation that returns the token related to fs.defaultFS. In this 
> case, the defaultFS is something like "viewfs://..". We need to find out the 
> actual namenode and use that to retrieve the delegation token.
> _Option 3:_ Open for suggestions?
> *Last note:* My hunch is that there are very few users who use 
> viewfs:// with Kerberos. Therefore, this was not exposed earlier.
> I'm working on a good solution. Please add your suggestions.
>  






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Attachment: HDFS-11467.001.patch

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch
>
>
> As discussed in HDFS-7859, after the ErasureCoding section is added to the fsimage, 
> we would like to also support converting this section to XML and back 
> using the OIV tool.






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Status: Patch Available  (was: Open)

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch
>
>
> As discussed in HDFS-7859, after the ErasureCoding section is added to the fsimage, 
> we would like to also support converting this section to XML and back 
> using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202993#comment-16202993
 ] 

Huafeng Wang commented on HDFS-11467:
-

As discussed with Wei offline, I'll take this one.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei Zhou
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-7859, after the ErasureCoding section is added to the fsimage, 
> we would like to also support converting this section to XML and back 
> using the OIV tool.






[jira] [Assigned] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-11467:
---

Assignee: Huafeng Wang  (was: Wei Zhou)

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-7859, after the ErasureCoding section is added to the fsimage, 
> we would like to also support converting this section to XML and back 
> using the OIV tool.






[jira] [Updated] (HDFS-12642) Log block and datanode details in BlockRecoveryWorker

2017-10-12 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12642:
-
Status: Patch Available  (was: Open)

> Log block and datanode details in BlockRecoveryWorker
> -
>
> Key: HDFS-12642
> URL: https://issues.apache.org/jira/browse/HDFS-12642
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12642.01.patch
>
>
> In a recent investigation, we have seen a weird block recovery issue, on which 
> it is difficult to reach a conclusion because of insufficient logs.
> For the most critical part of the events, we see block recovery failed to 
> {{commitBlockSynchronization}} on the NN, due to the block not being closed. This 
> leaves the file open forever (for 1+ months).
> The reason the block was not closed on the NN was that it is configured with 
> {{dfs.namenode.replication.min}} =2, and only 1 replica had the latest 
> genstamp.
> We were not able to tell why only 1 replica is on the latest genstamp.
> From the primary node of the recovery (ps2204), {{initReplicaRecoveryImpl}} 
> was called on each of the 7 DNs the block was ever placed on. All DNs but 
> ps2204 and ps3765 failed because of genstamp comparison - that's expected. 
> ps2204 and ps3765 got past the comparison (since there are no exceptions in 
> their logs), but {{updateReplicaUnderRecovery}} only appeared to be called on 
> ps3765.
> This jira proposes that we log more details when {{BlockRecoveryWorker}} is 
> about to call {{updateReplicaUnderRecovery}} on the DataNodes, so this could 
> be figured out in the future.
> {noformat}
> $ grep "updateReplica:" ps2204.dn.log 
> $ grep "updateReplica:" ps3765.dn.log 
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.933Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"updateReplicaUnderRecovery","level":"INFO","line_number":"2512","thread_name":"IPC
>  Server handler 6 on 
> 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"updateReplica:
>  BP-550436645-17.142.147.13-1438988035284:blk_2172795728_1106150312, 
> recoveryId=1107074793, length=65024, replica=ReplicaUnderRecovery, 
> blk_2172795728_1106150312, RUR
> $ grep "initReplicaRecovery:" ps2204.dn.log 
> hadoop-hdfs-datanode-ps2204.log.1:{"@timestamp":"2017-09-13T00:56:20.691Z","source_host":"ps2204.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@5ae3cb26","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery:
>  blk_2172795728_1106150312, recoveryId=1107074793, 
> replica=ReplicaWaitingToBeRecovered, blk_2172795728_1106150312, RWR
> hadoop-hdfs-datanode-ps2204.log.1:{"@timestamp":"2017-09-13T00:56:20.691Z","source_host":"ps2204.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2497","thread_name":"org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@5ae3cb26","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery:
>  changing replica state for blk_2172795728_1106150312 from RWR to 
> RUR","class":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","mdc":{}}
> $ grep "initReplicaRecovery:" ps3765.dn.log 
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"IPC
>  Server handler 5 on 
> 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery:
>  blk_2172795728_1106150312, recoveryId=1107074793, 
> replica=ReplicaBeingWritten, blk_2172795728_1106150312, RBW
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"IPC
>  Server handler 5 on 
> 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery:
>  blk_2172795728_1106150312, recoveryId=1107074793, 
> replica=ReplicaBeingWritten, blk_2172795728_1106150312, RBW
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2497","thread_name":"IPC
>  Serve

[jira] [Commented] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202981#comment-16202981
 ] 

Hadoop QA commented on HDFS-12650:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 13s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 385 unchanged - 9 fixed = 386 total (was 394) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 102 unchanged - 2 fixed = 105 total (was 104) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 97m 43s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12650 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891850/HDFS-12650.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux dff4c7f281f4 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://build

[jira] [Resolved] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware

2017-10-12 Thread Gruust (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gruust resolved HDFS-12649.
---
Resolution: Invalid

> handling of corrupt blocks not suitable for commodity hardware
> --
>
> Key: HDFS-12649
> URL: https://issues.apache.org/jira/browse/HDFS-12649
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.1
>Reporter: Gruust
>Priority: Minor
>
> Hadoop's documentation tells me it's suitable for commodity hardware in the 
> sense that hardware failures are expected to happen frequently. However, 
> there is currently no automatic handling of corrupted blocks, which seems a 
> bit contradictory to me.
> See: 
> https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
> This is even problematic for data integrity, as the redundancy is not kept at 
> the desired level without manual intervention, and therefore not in a timely 
> manner. If there is a corrupted block, I would at least expect that the 
> namenode forces the creation of an additional good replica to keep up the 
> redundancy level, i.e. the redundancy level should never include corrupted 
> data... which it currently does:
> "UnderReplicatedBlocks" : 0,
> "CorruptBlocks" : 2,
> (namenode /jmx http dump)






[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202878#comment-16202878
 ] 

Hadoop QA commented on HDFS-12626:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 42s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 47s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 37s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  1s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m  6s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 44s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.ozone.scm.TestXceiverClientManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891822/HDFS-12626-HDFS-7240.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 74707213b235 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ma

[jira] [Updated] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements

2017-10-12 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-12570:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-10285
   Status: Resolved  (was: Patch Available)

I have just pushed it to the branch.

> [SPS]: Refactor Co-ordinator datanode logic to track the block storage 
> movements
> 
>
> Key: HDFS-12570
> URL: https://issues.apache.org/jira/browse/HDFS-12570
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: HDFS-10285
>
> Attachments: HDFS-12570-HDFS-10285-00.patch, 
> HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, 
> HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch
>
>
> This task is to refactor the C-DN block storage movements. Basically, the 
> idea is to move the scheduling and tracking logic to the Namenode rather than 
> the special C-DN. Please refer to the discussion with [~andrew.wang] to 
> understand the [background and the necessity of 
> refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].






[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements

2017-10-12 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202847#comment-16202847
 ] 

Uma Maheswara Rao G commented on HDFS-12570:


+1 on the latest patch. Thanks Rakesh for updating the patch. Thanks 
[~surendrasingh] for the reviews.

> [SPS]: Refactor Co-ordinator datanode logic to track the block storage 
> movements
> 
>
> Key: HDFS-12570
> URL: https://issues.apache.org/jira/browse/HDFS-12570
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-12570-HDFS-10285-00.patch, 
> HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, 
> HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch
>
>
> This task is to refactor the C-DN block storage movements. Basically, the 
> idea is to move the scheduling and tracking logic to the Namenode rather than 
> the special C-DN. Please refer to the discussion with [~andrew.wang] to 
> understand the [background and the necessity of 
> refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].






[jira] [Created] (HDFS-12653) Implement toArray() and toSubArray() for ReadOnlyList

2017-10-12 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12653:
-

 Summary: Implement toArray() and toSubArray() for ReadOnlyList
 Key: HDFS-12653
 URL: https://issues.apache.org/jira/browse/HDFS-12653
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


{{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
list supports the following Util methods for easy construction of read-only views 
of any given list. 

{noformat}
public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list) 

public static <E> List<E> asList(final ReadOnlyList<E> list)
{noformat}

{{asList}} above additionally overrides {{Object[] toArray()}} of the 
{{java.util.List}} interface. Unlike the {{java.util.List}}, the above one 
returns an array of Objects referring to the backing list and avoids any copying 
of objects. Given that we have many usages of read-only lists,

1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
{{ReadOnlyList}} as well. 
2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}
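
As a rough illustration of the two proposed methods, here is a self-contained sketch. ReadOnlyListSketch is a hypothetical, simplified stand-in and not the actual Hadoop interface. It reads "shared-view" as "the returned array holds the same element references as the backing list, with no element cloned", which is an assumption, and the bounds handling simply mirrors {{subList}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for ReadOnlyList; names follow the proposal only.
interface ReadOnlyListSketch<E> {
  int size();

  E get(int i);

  // Proposal 1: an array holding the same element references as the list.
  default Object[] toArray() {
    return subArray(0, size());
  }

  // Proposal 2: elements in [fromIndex, toIndex), by analogy with List#subList.
  default Object[] subArray(int fromIndex, int toIndex) {
    if (fromIndex < 0 || toIndex > size() || fromIndex > toIndex) {
      throw new IndexOutOfBoundsException(
          "fromIndex=" + fromIndex + ", toIndex=" + toIndex + ", size=" + size());
    }
    Object[] out = new Object[toIndex - fromIndex];
    for (int i = fromIndex; i < toIndex; i++) {
      out[i - fromIndex] = get(i); // copies the reference, not the object
    }
    return out;
  }

  // Read-only view over an existing java.util.List.
  static <E> ReadOnlyListSketch<E> asReadOnlyList(final List<E> list) {
    return new ReadOnlyListSketch<E>() {
      @Override
      public int size() {
        return list.size();
      }

      @Override
      public E get(int i) {
        return list.get(i);
      }
    };
  }
}

class ReadOnlyListDemo {
  public static void main(String[] args) {
    List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
    ReadOnlyListSketch<String> ro = ReadOnlyListSketch.asReadOnlyList(backing);
    System.out.println(Arrays.toString(ro.toArray()));
    System.out.println(Arrays.toString(ro.subArray(1, 3)));
  }
}
```

Whether the real implementation should allocate a fresh array per call, as this sketch does, or hand out something cheaper is exactly the kind of detail the patch would need to settle.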






[jira] [Updated] (HDFS-12653) Implement toArray() and subArray() for ReadOnlyList

2017-10-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12653:
--
Summary: Implement toArray() and subArray() for ReadOnlyList  (was: 
Implement toArray() and toSubArray() for ReadOnlyList)

> Implement toArray() and subArray() for ReadOnlyList
> ---
>
> Key: HDFS-12653
> URL: https://issues.apache.org/jira/browse/HDFS-12653
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
> list supports the following Util methods for easy construction of read-only views 
> of any given list. 
> {noformat}
> public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list) 
> public static <E> List<E> asList(final ReadOnlyList<E> list)
> {noformat}
> {{asList}} above additionally overrides {{Object[] toArray()}} of the 
> {{java.util.List}} interface. Unlike the {{java.util.List}}, the above one 
> returns an array of Objects referring to the backing list and avoids any 
> copying of objects. Given that we have many usages of read-only lists,
> 1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
> {{ReadOnlyList}} as well. 
> 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
> let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}






[jira] [Assigned] (HDFS-10685) libhdfs++: return explicit error when non-secured client connects to secured server

2017-10-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer reassigned HDFS-10685:
---

Assignee: Kai Jiang

> libhdfs++: return explicit error when non-secured client connects to secured 
> server
> ---
>
> Key: HDFS-10685
> URL: https://issues.apache.org/jira/browse/HDFS-10685
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Kai Jiang
> Attachments: HDFS-10685.HDFS-8707.000.patch, 
> HDFS-10685.HDFS-8707.001.patch
>
>
> When a non-secured client tries to connect to a secured server, the first 
> indication is an error from RpcConnection::HandleRpcResponse complaining about 
> "RPC response with Unknown call id -33".
> We should insert code in HandleRpcResponse to detect whether the unknown call id 
> == RpcEngine::kCallIdSasl and return an informative error saying that an 
> unsecured client is connecting to a secured server.
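
The check described above can be sketched as follows. This is an illustration only: the real change belongs in the C++ libhdfs++ code, whereas this sketch is written in Java. The class, method and constant names are hypothetical, the value -33 for the SASL call id is taken from the error message quoted in the description, and the wording of the new message is an assumption.

```java
// Illustration only; the actual fix would live in libhdfs++ (C++). Names are hypothetical.
class UnknownCallIdSketch {
  // Assumed to match RpcEngine::kCallIdSasl; -33 is the id seen in the quoted error.
  static final int CALL_ID_SASL = -33;

  // Builds the error text for a response whose call id matches no pending request.
  static String describeUnknownCallId(int callId) {
    if (callId == CALL_ID_SASL) {
      return "RPC response carries the SASL call id (" + callId + "): the server appears "
          + "to require authentication, but this client is not configured for security.";
    }
    return "RPC response with Unknown call id " + callId;
  }

  public static void main(String[] args) {
    System.out.println(describeUnknownCallId(CALL_ID_SASL));
    System.out.println(describeUnknownCallId(7));
  }
}
```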






[jira] [Commented] (HDFS-12578) TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7

2017-10-12 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202827#comment-16202827
 ] 

Xiao Chen commented on HDFS-12578:
--

Thank you for the investigation [~ajayydv]. And sorry for my delayed response.

I think the reason there is a 1+ second delay in branch-2.7, is in 
{{BlockManagerTestUtil.checkHeartbeat}}:
{code:title=HDFS-11224's change, which is in 2.8+}
   public static void checkHeartbeat(BlockManager bm) {
-    bm.getDatanodeManager().getHeartbeatManager().heartbeatCheck();
+    HeartbeatManager hbm = bm.getDatanodeManager().getHeartbeatManager();
+    hbm.restartHeartbeatStopWatch();
+    hbm.heartbeatCheck();
   }
{code}

The jira HDFS-11224 was mainly to fix a bug in feature HDFS-9239, which is only 
in branch-2.8+.

So here is what I propose:
- for branch-2.7, in addition to what you have found, we should also restart 
the stopwatch.
- for branch-2.8+, let's do what you did in the patch, to give a wider range 
than 1 millisecond. (Please upload a patch based on trunk, so pre-commit can be 
triggered)

Does this make sense?

> TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7
> 
>
> Key: HDFS-12578
> URL: https://issues.apache.org/jira/browse/HDFS-12578
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiao Chen
>Assignee: Ajay Kumar
>Priority: Blocker
> Attachments: HDFS-12578-branch-2.7.001.patch
>
>
> It appears {{TestDeadDatanode#testNonDFSUsedONDeadNodeReReg}} is consistently 
> failing in branch-2.7. We should investigate and fix it.






[jira] [Commented] (HDFS-12628) libhdfs crashes on thread exit for JNI+libhdfs applications

2017-10-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202824#comment-16202824
 ] 

Allen Wittenauer commented on HDFS-12628:
-

Honestly, it's probably time HDFS-8707 got merged into a release and then we 
can kill off libhdfs.  cc: [~James C], [~bobhansen], [~anatoli.shein], ...




> libhdfs crashes on thread exit for JNI+libhdfs applications
> ---
>
> Key: HDFS-12628
> URL: https://issues.apache.org/jira/browse/HDFS-12628
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.0.0-alpha3
>Reporter: Joe McDonnell
>Priority: Critical
> Attachments: jni-util-test2.cc
>
>
> Impala uses libhdfs to access HDFS while also using JNI to run other Java 
> code. Impala currently relies on HDFS's getJNIEnv to get a JNIEnv to interact 
> with the process JVM (which is created by HDFS code). It uses this JNIEnv 
> even for code that is not related to HDFS.
> In recent versions of HDFS, getJNIEnv is no longer visible in libhdfs due to 
> HDFS-7879. In HDFS-8474, the proposed solution was for Impala to write its 
> own equivalent (tracked by IMPALA-2029). After implementing an equivalent of 
> getJNIEnv (heavily based on HDFS code, but with distinct names), we are 
> seeing crashes in hdfsThreadDestructor() in threads that use both HDFS and 
> JNI codepaths. The crash shows up under concurrency and does not reproduce in 
> serial execution.
> I have distilled it down to a simple testcase that reproduces the issue. It 
> creates a JVM in the main thread (which Impala does at startup), then spawns 
> multiple threads that do basic HDFS and JNI work. I have removed all but the 
> essential steps. 
> This blocks running Impala on any hadoop version past 2.7 (when HDFS-7879 was 
> merged). Note that exposing getJNIEnv should unblock Impala development if a 
> fix is not forthcoming.






[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12650:
--
Status: Patch Available  (was: Open)

> Use slf4j instead of log4j in LeaseManager
> --
>
> Key: HDFS-12650
> URL: https://issues.apache.org/jira/browse/HDFS-12650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: 3.1.0
>
> Attachments: HDFS-12650.01.patch
>
>
> LeaseManager is still using log4j dependencies. We should move those to  
> slf4j.






[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12650:
--
Attachment: HDFS-12650.01.patch

> Use slf4j instead of log4j in LeaseManager
> --
>
> Key: HDFS-12650
> URL: https://issues.apache.org/jira/browse/HDFS-12650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: 3.1.0
>
> Attachments: HDFS-12650.01.patch
>
>
> LeaseManager is still using log4j dependencies. We should move those to  
> slf4j.






[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12650:
--
Attachment: (was: HDFS-12650.01.patch)

> Use slf4j instead of log4j in LeaseManager
> --
>
> Key: HDFS-12650
> URL: https://issues.apache.org/jira/browse/HDFS-12650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: 3.1.0
>
> Attachments: HDFS-12650.01.patch
>
>
> LeaseManager is still using log4j dependencies. We should move those to  
> slf4j.






[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12650:
--
Attachment: HDFS-12650.01.patch

> Use slf4j instead of log4j in LeaseManager
> --
>
> Key: HDFS-12650
> URL: https://issues.apache.org/jira/browse/HDFS-12650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: 3.1.0
>
> Attachments: HDFS-12650.01.patch
>
>
> LeaseManager is still using log4j dependencies. We should move those to  
> slf4j.






[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202794#comment-16202794
 ] 

Manoj Govindassamy commented on HDFS-12614:
---

Filed HDFS-12652 to track {{INodeAttributeProvider#getAttributes()}} 
performance improvement task detailed by [~daryn] in the previous comments. I 
am assuming that the request is not for changing the 
INodeAttributesProvider#getAttributes() interface.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving path (like 
> "/") and checking for permission, the following code when working on 
> {{pathByNameArr}} throws NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>       INode inode, int snapshotId) {
>     INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>     if (getAttributesProvider() != null) {
>       String[] elements = new String[pathIdx + 1];
>       for (int i = 0; i < elements.length; i++) {
>         elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>       }
>       inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>     }
>     return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/" where the split components based on delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can throw 
> NPE.
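
To illustrate the failure mode described above, here is a minimal,
self-contained sketch. It is not the HDFS code: bytes2String below only stands
in for DFSUtil.bytes2String, PathComponentsSketch is a hypothetical name, and
mapping a null component to "" is one assumed way to guard the loop, not
necessarily what the attached patches do.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the conversion loop in getINodeAttrs().
final class PathComponentsSketch {
  private PathComponentsSketch() {}

  // Stand-in for DFSUtil.bytes2String; like a plain String constructor it
  // throws NullPointerException when given a null array.
  static String bytes2String(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Unguarded conversion, mirroring the loop in the issue description.
  static String[] convertUnsafe(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i < elements.length; i++) {
      elements[i] = bytes2String(pathByNameArr[i]);
    }
    return elements;
  }

  // Guarded conversion: a null component is mapped to "" (an assumption).
  static String[] convertSafe(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i <= pathIdx; i++) {
      byte[] component = pathByNameArr[i];
      elements[i] = (component == null) ? "" : bytes2String(component);
    }
    return elements;
  }
}
```

With a component array holding a single null element, convertUnsafe throws a
NullPointerException while convertSafe returns a one-element array.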






[jira] [Created] (HDFS-12652) INodeAttributesProvider#getAttributes(): Avoid multiple conversions of path components byte[][] to String[] when requesting INode attributes

2017-10-12 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12652:
-

 Summary: INodeAttributesProvider#getAttributes(): Avoid multiple 
conversions of path components byte[][] to String[] when requesting INode 
attributes
 Key: HDFS-12652
 URL: https://issues.apache.org/jira/browse/HDFS-12652
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.0.0-beta1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


{{INodeAttributesProvider#getAttributes}} needs the path components passed in
to be an array of Strings, whereas the INode and related layers maintain path
components as an array of byte[]. So these layers have to convert each byte[]
component of the path back into a String, multiple times, when requesting
INode attributes from the Provider.

That is, the path "/a/b/c" requires calling the attribute provider with: (1)
"", (2) "", "a", (3) "", "a", "b", (4) "", "a", "b", "c". Every single one of
those strings is freshly (re)converted from a byte[]. If, say, a file listing
is done on a huge directory containing hundreds of millions of files, these
redundant conversions of byte[][] to String[] create lots of tiny garbage
objects, occupying memory and affecting performance. It would be better to
avoid creating redundant copies of path component strings.
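
One way to picture the saving is to decode each component once and hand out
prefixes built from the already-decoded strings. The sketch below is only an
illustration under assumptions: PathStringCache and its methods are
hypothetical names, not HDFS classes, and it still copies the array for each
prefix, so it only removes the repeated byte[]-to-String decoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: decodes each path component at most once.
final class PathStringCache {
  private final byte[][] components;
  private final String[] decoded;   // filled lazily, index by index
  private int conversions = 0;      // counts actual byte[] -> String decodes

  PathStringCache(byte[][] components) {
    this.components = components;
    this.decoded = new String[components.length];
  }

  // Returns components [0..pathIdx] as Strings, decoding only the ones
  // that an earlier call has not decoded yet.
  String[] prefix(int pathIdx) {
    for (int i = 0; i <= pathIdx; i++) {
      if (decoded[i] == null) {
        decoded[i] = new String(components[i], StandardCharsets.UTF_8);
        conversions++;
      }
    }
    return Arrays.copyOf(decoded, pathIdx + 1);
  }

  int conversions() {
    return conversions;
  }
}
```

For "/a/b/c", calling prefix(0) through prefix(3) performs 4 decodes in total,
instead of the 1 + 2 + 3 + 4 = 10 that converting on every call would need.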
  






[jira] [Commented] (HDFS-12411) Ozone: Add container usage information to DN container report

2017-10-12 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202783#comment-16202783
 ] 

Xiaoyu Yao commented on HDFS-12411:
---

Opened HDFS-12651 for [~anu]'s comment #4.

> Ozone: Add container usage information to DN container report
> -
>
> Key: HDFS-12411
> URL: https://issues.apache.org/jira/browse/HDFS-12411
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone, scm
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>  Labels: ozoneMerge
> Attachments: HDFS-12411-HDFS-7240.001.patch, 
> HDFS-12411-HDFS-7240.002.patch, HDFS-12411-HDFS-7240.003.patch, 
> HDFS-12411-HDFS-7240.004.patch, HDFS-12411-HDFS-7240.005.patch, 
> HDFS-12411-HDFS-7240.006.patch, HDFS-12411-HDFS-7240.007.patch, 
> HDFS-12411-HDFS-7240.008.patch
>
>
> The current DN ReportState for containers only has a counter; we will need to
> include individual container usage information so that SCM can
> * close containers when they are full
> * assign containers for block service with different policies.
> * etc.






[jira] [Created] (HDFS-12651) Ozone: SCM: avoid synchronously loading all the keys from containers upon SCM datanode start

2017-10-12 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-12651:
-

 Summary: Ozone: SCM: avoid synchronously loading all the keys from 
containers upon SCM datanode start
 Key: HDFS-12651
 URL: https://issues.apache.org/jira/browse/HDFS-12651
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7240
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


This is based on code review feedback from HDFS-12411, to avoid a slow SCM
datanode restart when there is a large number of keys and containers.

E.g., 5 GB per container / 4 KB per key = 1.25 million keys per container.

The proposed solution is to load the container/key size info asynchronously
and update the containerStatus once done.
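
The key count above can be checked directly, and the proposal can be sketched
with a CompletableFuture. This is only an illustration: ContainerStatusSketch,
loadAsync and countKeys are hypothetical names rather than Ozone classes, the
in-memory key list stands in for a container's key database, and decimal units
(5 * 10^9 and 4 * 10^3 bytes) are assumed because they reproduce the 1.25
million figure.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a per-container status object.
final class ContainerStatusSketch {
  // -1 means "key count not loaded yet".
  private final AtomicLong keyCount = new AtomicLong(-1);

  // Keys per container for a given container size and key size, in bytes.
  static long keysPerContainer(long containerBytes, long keyBytes) {
    return containerBytes / keyBytes;
  }

  // Starts the scan in the background and returns immediately; the status
  // is updated when the (possibly slow) scan finishes.
  CompletableFuture<Void> loadAsync(List<String> keys) {
    return CompletableFuture
        .supplyAsync(() -> countKeys(keys))
        .thenAccept(keyCount::set);
  }

  // Placeholder for scanning the container's key database.
  private static long countKeys(List<String> keys) {
    return keys.size();
  }

  long keyCount() {
    return keyCount.get();
  }
}
```

A caller that must not block on startup can ignore the returned future, while
a test can call join() on it before reading keyCount().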






[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors

2017-10-12 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202766#comment-16202766
 ] 

Hanisha Koneru commented on HDFS-12639:
---

Hi [~daryn],

This is my understanding of the problem. Please correct me if I am wrong.

BPServiceActor obtains the writeLock before processing each command and 
releases it after. During the processing of a single command, the other actor 
would not be able to register or process heartbeats.

If we remove the write lock held during command processing, then command 
processing would no longer be asynchronous. Not sure if this opens up the 
possibility of creating anomalies in the datanode. 

bq. The worst case scenario for processing commands while holding the lock is
re-registration. The actor will loop, catching and logging exceptions, leaving
the other actor blocked for a non-deterministic (possibly infinite) amount of
time.

The re-registration process itself does not acquire the write lock to register
the Datanode, right? (It needs the write lock to check that the new
registration info is consistent with the storage.)
Can you please elaborate on how the actor would go into a loop, catching and
logging exceptions?

> BPOfferService lock may stall all service actors
> 
>
> Key: HDFS-12639
> URL: https://issues.apache.org/jira/browse/HDFS-12639
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Hanisha Koneru
>
> {{BPOfferService}} manages {{BPServiceActor}} instances for the active and 
> standby.  It uses a RW lock to primarily protect registration information 
> while determining the active/standby from heartbeats.
> Unfortunately the write lock is held during command processing.  If an actor 
> is experiencing high latency processing commands, the other actor will 
> neither be able to register (blocked in createRegistration, setNamespaceInfo, 
> verifyAndSetNamespaceInfo) nor process heartbeats (blocked in 
> updateActorStatesFromHeartbeat).
> The worst case scenario for processing commands while holding the lock is
> re-registration.  The actor will loop, catching and logging exceptions,
> leaving the other actor blocked for a non-deterministic (possibly infinite)
> amount of time.
> The lock must not be held during command processing.
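
The last sentence of the description corresponds to a common pattern: hold the
lock only long enough to read the state a command needs, release it, then do
the slow work. The sketch below is a generic illustration with hypothetical
names (OfferServiceSketch, processCommand, the "nn1" actor id); it is not the
BPOfferService code and not necessarily how this issue will be fixed.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a service shared by two actors.
final class OfferServiceSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String activeActor = "nn1";   // state protected by the lock

  // Short critical section: only the state change happens under the lock.
  void updateActive(String actor) {
    lock.writeLock().lock();
    try {
      activeActor = actor;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Snapshot the protected state under the read lock, then run the
  // potentially slow command without holding any lock.
  String processCommand(String actor, Runnable slowWork) {
    final boolean fromActive;
    lock.readLock().lock();
    try {
      fromActive = actor.equals(activeActor);
    } finally {
      lock.readLock().unlock();
    }
    if (!fromActive) {
      return "ignored";
    }
    slowWork.run();   // no lock is held here
    return "processed";
  }

  // Number of read-lock holds by the calling thread (for inspection).
  int readHolds() {
    return lock.getReadHoldCount();
  }
}
```

The trade-off is that the snapshot can be stale by the time the work runs, so
a real change would have to decide which commands tolerate that.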






[jira] [Updated] (HDFS-12613) Native EC coder should implement release() as idempotent function.

2017-10-12 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12613:
-
Attachment: HDFS-12613.03.patch

Thanks for the suggestion, [~Sammi]. Added checks to the Java code as well;
this patch also propagates {{IOException}} for {{decode()}} / {{encode()}}.

Would you mind giving it another review, [~Sammi] and [~drankye]?

> Native EC coder should implement release() as idempotent function.
> --
>
> Key: HDFS-12613
> URL: https://issues.apache.org/jira/browse/HDFS-12613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12613.00.patch, HDFS-12613.01.patch, 
> HDFS-12613.02.patch, HDFS-12613.03.patch
>
>
> Recently, we found that the native EC coder crashes the JVM because
> {{NativeRSDecoder#release()}} is called multiple times (HDFS-12612 and
> HDFS-12606).
> We should strengthen the native code to make {{release()}} idempotent as
> well.
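
An idempotent release() is usually built on a flag that flips exactly once. A
minimal Java-side sketch follows, assuming a hypothetical NativeCoderSketch
class whose freeNative method stands in for the JNI call. The issue also asks
for the native side to be hardened, which is not shown here.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical coder wrapper; freeNative() stands in for the JNI release.
final class NativeCoderSketch {
  private final AtomicBoolean released = new AtomicBoolean(false);
  private int nativeFrees = 0;   // how many times the native free ran

  // Safe to call any number of times: only the first call frees.
  void release() {
    if (released.compareAndSet(false, true)) {
      freeNative();
    }
  }

  // Using the coder after release fails with an exception, not a crash.
  void decode() throws IOException {
    if (released.get()) {
      throw new IOException("coder already released");
    }
    // ... the native decode would be invoked here ...
  }

  private void freeNative() {
    nativeFrees++;
  }

  int nativeFrees() {
    return nativeFrees;
  }
}
```

Calling release() twice frees the native resource once, and a later decode()
surfaces as an IOException that callers can handle.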






[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12626:
--
Attachment: HDFS-12626-HDFS-7240.004.patch

Somehow Jenkins ran on the v002 patch again. Resubmitting the v003 patch as
v004 to trigger another run.

> Ozone : delete open key entries that will no longer be closed
> -
>
> Key: HDFS-12626
> URL: https://issues.apache.org/jira/browse/HDFS-12626
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12626-HDFS-7240.001.patch, 
> HDFS-12626-HDFS-7240.002.patch, HDFS-12626-HDFS-7240.003.patch, 
> HDFS-12626-HDFS-7240.004.patch
>
>
> HDFS-12543 introduced the notion of an "open key": when a key is opened, an
> open key entry gets persisted, and only after the client calls close will
> this entry be made visible. One issue is that if the client does not call
> close (e.g. it failed), then that open key entry will never be deleted from
> the metadata. This JIRA tracks this issue.
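
A rough sketch of the kind of cleanup the description implies: record when
each open key entry was created and periodically drop entries older than some
threshold. All names here (OpenKeyTableSketch, expire) and the expiry-by-age
policy are assumptions for illustration; the attached patches may use a
different mechanism.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory table of open key entries.
final class OpenKeyTableSketch {
  // key name -> creation time in milliseconds
  private final Map<String, Long> openKeys = new ConcurrentHashMap<>();

  void open(String key, long nowMs) {
    openKeys.put(key, nowMs);
  }

  // A successful close removes the open entry (the key becomes visible).
  void close(String key) {
    openKeys.remove(key);
  }

  // Removes entries that have been open longer than maxAgeMs and returns
  // their names, so the caller can also clean up any associated state.
  List<String> expire(long nowMs, long maxAgeMs) {
    List<String> expired = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = openKeys.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMs - entry.getValue() > maxAgeMs) {
        expired.add(entry.getKey());
        it.remove();
      }
    }
    return expired;
  }

  int size() {
    return openKeys.size();
  }
}
```

A background task could call expire() on a fixed interval; choosing the age
threshold is a policy decision this sketch leaves open.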






[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202730#comment-16202730
 ] 

Hadoop QA commented on HDFS-12626:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 3m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 38s | HDFS-7240 passed |
| +1 | compile | 1m 48s | HDFS-7240 passed |
| +1 | checkstyle | 0m 46s | HDFS-7240 passed |
| +1 | mvnsite | 1m 51s | HDFS-7240 passed |
| +1 | shadedclient | 12m 57s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 18s | HDFS-7240 passed |
| +1 | javadoc | 2m 2s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 53s | the patch passed |
| +1 | javac | 1m 53s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvnsite | 1m 49s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 28s | the patch passed |
| +1 | javadoc | 2m 6s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 38s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 141m 16s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 29s | The patch generated 1 ASF License warnings. |
| | | 211m 40s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
| | hadoop.cblock.TestCBlockCLI |
| | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
| | hadoop.cblock.TestCBlockReadWrite |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.ozone.scm.TestAllocateContainer |
| | hadoop.ozone.container.common.impl.TestContainerPersistence |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| Timed out junit tests | org.apache.hadoop.cblock.TestLocalBlockCache |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
| 

[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12650:
--
Description: LeaseManager is still using log4j dependencies. We should move 
those to  slf4j.  (was: FileNamesystem is still using log4j dependencies. We 
should move those to  slf4j, as most of the methods using log4j are deprecated.)

> Use slf4j instead of log4j in LeaseManager
> --
>
> Key: HDFS-12650
> URL: https://issues.apache.org/jira/browse/HDFS-12650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: 3.1.0
>
>
> LeaseManager is still using log4j dependencies. We should move those to  
> slf4j.






[jira] [Created] (HDFS-12650) Use slf4j instead of log4j in LeaseManager

2017-10-12 Thread Ajay Kumar (JIRA)
Ajay Kumar created HDFS-12650:
-

 Summary: Use slf4j instead of log4j in LeaseManager
 Key: HDFS-12650
 URL: https://issues.apache.org/jira/browse/HDFS-12650
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ajay Kumar
Assignee: Ajay Kumar
 Fix For: 3.1.0


FileNamesystem is still using log4j dependencies. We should move those to  
slf4j, as most of the methods using log4j are deprecated.






[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202696#comment-16202696
 ] 

Hadoop QA commented on HDFS-12614:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 4m 42s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 6s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 40s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | shadedclient | 10m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 0m 54s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| +1 | checkstyle | 0m 39s | the patch passed |
| +1 | mvnsite | 1m 4s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 26s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 8s | the patch passed |
| +1 | javadoc | 0m 48s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 115m 46s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 167m 48s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
| | org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs |
| | org.apache.hadoop.fs.TestSymlinkHdfsFileContext |
| | org.apache.hadoop.fs.TestEnhancedByteBufferAccess |
| | org.apache.hadoop.fs.TestSymlinkHdfsFileSystem |
| | org.apache.hadoop.fs.permission.TestStickyBit |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12614 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891787/HDFS-12614.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2d953a9b0100 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21672/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21672/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/h

[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12626:
--
Attachment: HDFS-12626-HDFS-7240.003.patch

Fixed the missing ASF license header in the v003 patch. The failed tests are
unrelated.

> Ozone : delete open key entries that will no longer be closed
> -
>
> Key: HDFS-12626
> URL: https://issues.apache.org/jira/browse/HDFS-12626
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12626-HDFS-7240.001.patch, 
> HDFS-12626-HDFS-7240.002.patch, HDFS-12626-HDFS-7240.003.patch
>
>
> HDFS-12543 introduced the notion of an "open key": when a key is opened, an
> open key entry gets persisted, and only after the client calls close will
> this entry be made visible. One issue is that if the client does not call
> close (e.g. it failed), then that open key entry will never be deleted from
> the metadata. This JIRA tracks this issue.






[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202654#comment-16202654
 ] 

Hadoop QA commented on HDFS-12626:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 29s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 50s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 47s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 55s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 54s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m  9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 34s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 19s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891777/HDFS-12626-HDFS-7240.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux f5de9eae4ed0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/perso

[jira] [Assigned] (HDFS-12648) DN should provide feedback to NN for throttling commands

2017-10-12 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru reassigned HDFS-12648:
-

Assignee: Hanisha Koneru

> DN should provide feedback to NN for throttling commands
> 
>
> Key: HDFS-12648
> URL: https://issues.apache.org/jira/browse/HDFS-12648
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Hanisha Koneru
>
> The NN should avoid sending commands to a DN with a high number of 
> outstanding commands.  The heartbeat could provide this feedback via perhaps 
> a simple count of the commands or rate of processing.
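
The feedback loop proposed above could look roughly like the following. This is a hedged sketch only: CommandThrottle, the per-DN count and the fixed threshold are hypothetical names for illustration and do not correspond to existing NameNode or heartbeat code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model (not HDFS code): the NN side remembers the number of
// outstanding commands each DN reported in its last heartbeat and holds back
// new commands for DNs that are above a threshold.
final class CommandThrottle {
  private final Map<String, Integer> outstandingByDatanode = new ConcurrentHashMap<>();
  private final int maxOutstanding;

  CommandThrottle(int maxOutstanding) {
    this.maxOutstanding = maxOutstanding;
  }

  // Called when a heartbeat carrying the DN's backlog count arrives.
  void onHeartbeat(String datanodeId, int outstandingCommands) {
    outstandingByDatanode.put(datanodeId, outstandingCommands);
  }

  // A DN that has not reported a backlog yet is treated as idle.
  boolean maySendCommands(String datanodeId) {
    return outstandingByDatanode.getOrDefault(datanodeId, 0) < maxOutstanding;
  }
}
```

A rate of processing, as the description also suggests, could replace the plain count without changing the shape of this check.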






[jira] [Commented] (HDFS-12585) Add description for config in Ozone config UI

2017-10-12 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202608#comment-16202608
 ] 

Chen Liang commented on HDFS-12585:
---

Thanks [~ajayydv] for the clarification. Then I think maybe it's better to 
either rename loadDescriptionFromXml to something such as descriptionLoaded or 
remove this flag and just check if {{propertyMap}} is null.
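
The second option (dropping the flag and checking the map for null) might be sketched as follows. This is an illustration under assumptions, not the patch: LazyDescriptions and its loader argument are hypothetical, and only the {{propertyMap}} name is taken from the comment above.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: the map itself records whether descriptions have been
// loaded, so no separate boolean flag is needed.
final class LazyDescriptions {
  private final Supplier<Map<String, String>> loader;
  private Map<String, String> propertyMap; // null until the first lookup

  LazyDescriptions(Supplier<Map<String, String>> loader) {
    this.loader = loader;
  }

  boolean isLoaded() {
    return propertyMap != null;
  }

  // Loads the descriptions on first use, then serves lookups from the map.
  synchronized String describe(String propertyName) {
    if (propertyMap == null) {
      propertyMap = loader.get();
    }
    return propertyMap.get(propertyName);
  }
}
```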

> Add description for config in Ozone config UI
> -
>
> Key: HDFS-12585
> URL: https://issues.apache.org/jira/browse/HDFS-12585
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: HDFS-7240
>
> Attachments: HDFS-12585-HDFS-7240.01.patch, 
> HDFS-12585-HDFS-7240.02.patch, HDFS-12585-HDFS-7240.03.patch
>
>
> Add description for each config in Ozone config UI






[jira] [Commented] (HDFS-12249) dfsadmin -metaSave to output maintenance mode blocks

2017-10-12 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202604#comment-16202604
 ] 

Wellington Chevreuil commented on HDFS-12249:
-

I believe the test failures are unrelated; those tests pass locally.

> dfsadmin -metaSave to output maintenance mode blocks
> 
>
> Key: HDFS-12249
> URL: https://issues.apache.org/jira/browse/HDFS-12249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-12249.001.patch
>
>
> Found while reviewing for HDFS-12182.
> {quote}
> After the patch, the output of metaSave is:
> Live Datanodes: 0
> Dead Datanodes: 0
> Metasave: Blocks waiting for reconstruction: 0
> Metasave: Blocks currently missing: 1
> file16387: blk_0_1 MISSING (replicas: l: 0 d: 0 c: 2 e: 0)  
> 1.1.1.1:9866(corrupt) (block deletions maybe out of date) :  
> 2.2.2.2:9866(corrupt) (block deletions maybe out of date) : 
> Mis-replicated blocks that have been postponed:
> Metasave: Blocks being reconstructed: 0
> Metasave: Blocks 0 waiting deletion from 0 datanodes.
> Corrupt Blocks:
> Block=0   Node=1.1.1.1:9866   StorageID=s1StorageState=NORMAL 
> TotalReplicas=2 Reason=GENSTAMP_MISMATCH
> Block=0   Node=2.2.2.2:9866   StorageID=s2StorageState=NORMAL 
> TotalReplicas=2 Reason=GENSTAMP_MISMATCH
> Metasave: Number of datanodes: 0
> {quote}
> {quote}
> Looking at the output
> The output is not user friendly — The meaning of "(replicas: l: 0 d: 0 c: 2 
> e: 0)" is not obvious without looking at the code.
> Also, it should print maintenance mode replicas.
> {quote}
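
A more readable replica summary, including the maintenance count asked for above, could be formatted along these lines. This is a sketch only: it assumes that l, d, c and e stand for live, decommissioned, corrupt and excess replicas, and ReplicaSummary is a hypothetical helper, not the metaSave implementation.

```java
// Hypothetical formatter: spells out the replica counters instead of using
// single-letter labels, and adds a maintenance-mode counter.
final class ReplicaSummary {
  private ReplicaSummary() {
  }

  static String format(int live, int decommissioned, int corrupt,
      int excess, int maintenance) {
    return "(replicas: live: " + live
        + " decommissioned: " + decommissioned
        + " corrupt: " + corrupt
        + " excess: " + excess
        + " maintenance: " + maintenance + ")";
  }
}
```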






[jira] [Commented] (HDFS-11821) BlockManager.getMissingReplOneBlocksCount() does not report correct value if corrupt file with replication factor of 1 gets deleted

2017-10-12 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202602#comment-16202602
 ] 

Wellington Chevreuil commented on HDFS-11821:
-

Any insights from anyone?

> BlockManager.getMissingReplOneBlocksCount() does not report correct value if 
> corrupt file with replication factor of 1 gets deleted
> ---
>
> Key: HDFS-11821
> URL: https://issues.apache.org/jira/browse/HDFS-11821
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.6.0, 3.0.0-alpha2
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-11821-1.patch, HDFS-11821-2.patch
>
>
> *BlockManager* keeps a separate metric for the number of missing blocks with 
> a replication factor of 1. This is currently returned by the 
> *BlockManager.getMissingReplOneBlocksCount()* method, and it is what 
> *dfsadmin -report* displays in the attribute below (in this example, there 
> is one corrupt block that belongs to a file with a replication factor of 
> 1):
> {noformat}
> ...
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> However, if the related file gets deleted, (for instance, using hdfs fsck 
> -delete option), this metric never gets updated, and *dfsadmin -report* will 
> keep reporting a missing block, even though the file does not exist anymore. 
> The only workaround available is to restart the NN, so that this metric will 
> be cleared.
> This can be easily reproduced by forcing a replication factor 1 file 
> corruption such as follows:
> 1) Put a file into hdfs with replication factor 1:
> {noformat}
> $ hdfs dfs -Ddfs.replication=1 -put test_corrupt /
> $ hdfs dfs -ls /
> -rw-r--r--   1 hdfs supergroup 19 2017-05-10 09:21 /test_corrupt
> {noformat}
> 2) Find related block for the file and delete it from DN:
> {noformat}
> $ hdfs fsck /test_corrupt -files -blocks -locations
> ...
> /test_corrupt 19 bytes, 1 block(s):  OK
> 0. BP-782213640-172.31.113.82-1494420317936:blk_1073742742_1918 len=19 
> Live_repl=1 
> [DatanodeInfoWithStorage[172.31.112.178:20002,DS-a0dc0b30-a323-4087-8c36-26ffdfe44f46,DISK]]
> Status: HEALTHY
> ...
> $ find /dfs/dn/ -name blk_1073742742*
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta
> $ rm -rf 
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742
> $ rm -rf 
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta
> {noformat}
> 3) Running fsck will report the corruption as expected:
> {noformat}
> $ hdfs fsck /test_corrupt -files -blocks -locations
> ...
> /test_corrupt 19 bytes, 1 block(s): 
> /test_corrupt: CORRUPT blockpool BP-782213640-172.31.113.82-1494420317936 
> block blk_1073742742
>  MISSING 1 blocks of total size 19 B
> ...
> Total blocks (validated): 1 (avg. block size 19 B)
>   
>   UNDER MIN REPL'D BLOCKS:1 (100.0 %)
>   dfs.namenode.replication.min:   1
>   CORRUPT FILES:  1
>   MISSING BLOCKS: 1
>   MISSING SIZE:   19 B
>   CORRUPT BLOCKS: 1
> ...
> {noformat}
> 4) Same for *dfsadmin -report*
> {noformat}
> $ hdfs dfsadmin -report
> ...
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 1
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> 5) Running fsck with the *-delete* option does cause fsck to report correct 
> information about the corrupt block, but dfsadmin still shows the corrupt block:
> {noformat}
> $ hdfs fsck /test_corrupt -delete
> ...
> $ hdfs fsck /
> ...
> The filesystem under path '/' is HEALTHY
> ...
> $ hdfs dfsadmin -report
> ...
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> The problem seems to be in the *BlockManager.removeBlock()* method, which in 
> turn uses the utility class *LowRedundancyBlocks*, which classifies blocks 
> according to their current replication level, including blocks currently 
> marked as corrupt. The related metric shown in *dfsadmin -report* for corrupt 
> blocks with replication factor 1 is tracked in *LowRedundancyBlocks*. Whenever a 
> block is marked as corrupt and it has replication factor of 1, the related 
> metric is updated. When removing the block, though, 
> *BlockManager.removeBlock()* is calling *LowRedundancyBlocks.remove(BlockInfo 
> block, int priLevel)*, which does not check if the given bl
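
The kind of stale counter this report describes can be modelled in a few lines. This is a simplified, hypothetical model for illustration only: MissingReplOneCounter and its methods are invented names and do not reproduce LowRedundancyBlocks or the attached patches.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model: a counter that is bumped when a replication-factor-1
// block is marked corrupt, with one removal path that forgets to update it.
final class MissingReplOneCounter {
  private final Set<String> corruptReplOneBlocks = new HashSet<>();
  private int missingReplOneCount;

  void markCorrupt(String blockId) {
    if (corruptReplOneBlocks.add(blockId)) {
      missingReplOneCount++;
    }
  }

  // Models the reported behaviour: the block is dropped but the metric
  // keeps its old value until the process restarts.
  void removeWithoutUpdate(String blockId) {
    corruptReplOneBlocks.remove(blockId);
  }

  // Models the expected behaviour: the metric follows the removal.
  void removeAndUpdate(String blockId) {
    if (corruptReplOneBlocks.remove(blockId)) {
      missingReplOneCount--;
    }
  }

  int count() {
    return missingReplOneCount;
  }
}
```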

[jira] [Updated] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware

2017-10-12 Thread Gruust (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gruust updated HDFS-12649:
--
Description: 
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity, as the redundancy is not kept at 
the desired level without manual intervention, and therefore not in a timely 
manner. If there is a corrupted block, I would at least expect that the 
namenode forces the creation of an additional good replica to keep up the 
redundancy level, i.e. the redundancy level should never include corrupted 
data... which it currently does:

"UnderReplicatedBlocks" : 0,
"CorruptBlocks" : 2,

(namenode /jmx http dump)

  was:
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity, as the redundancy is not kept at 
the desired level without manual intervention, and therefore not in a timely manner. 
If there is a corrupted block, I would at least expect that the namenode forces 
the creation of an additional good replica to keep up the redundancy level. 


> handling of corrupt blocks not suitable for commodity hardware
> --
>
> Key: HDFS-12649
> URL: https://issues.apache.org/jira/browse/HDFS-12649
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.1
>Reporter: Gruust
>Priority: Minor
>
> Hadoop's documentation tells me it's suitable for commodity hardware in the 
> sense that hardware failures are expected to happen frequently. However, 
> there is currently no automatic handling of corrupted blocks, which seems a 
> bit contradictory to me.
> See: 
> https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
> This is even problematic for data integrity, as the redundancy is not kept at 
> the desired level without manual intervention, and therefore not in a timely 
> manner. If there is a corrupted block, I would at least expect that the 
> namenode forces the creation of an additional good replica to keep up the 
> redundancy level, i.e. the redundancy level should never include corrupted 
> data... which it currently does:
> "UnderReplicatedBlocks" : 0,
> "CorruptBlocks" : 2,
> (namenode /jmx http dump)
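
The expectation stated in this report, that corrupt replicas should not count towards the redundancy level, can be expressed as a small calculation. This is only a sketch of the reporter's expectation: RedundancyCheck and its methods are hypothetical and say nothing about how the NameNode actually computes UnderReplicatedBlocks.

```java
// Hypothetical helper: compares a count that treats every replica as good
// with one that only counts healthy (non-corrupt) replicas.
final class RedundancyCheck {
  private RedundancyCheck() {
  }

  // Replicas still needed if corrupt replicas are (wrongly) counted as good.
  static int missingCountingAll(int expected, int totalReplicas) {
    return Math.max(0, expected - totalReplicas);
  }

  // Replicas still needed if only healthy replicas count.
  static int missingCountingHealthy(int expected, int totalReplicas,
      int corruptReplicas) {
    int healthy = Math.max(0, totalReplicas - corruptReplicas);
    return Math.max(0, expected - healthy);
  }
}
```

With an expected replication of 3 and 3 replicas of which 2 are corrupt, the first method reports nothing missing while the second reports 2 replicas missing.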






[jira] [Updated] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware

2017-10-12 Thread Gruust (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gruust updated HDFS-12649:
--
Description: 
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity, as the redundancy is not kept at 
the desired level without manual intervention, and therefore not in a timely manner. 
If there is a corrupted block, I would at least expect that the namenode forces 
the creation of an additional good replica to keep up the redundancy level. 

  was:
Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity as the redundancy is not kept at 
the desired level without manual intervention. If there is a corrupted block, I 
would at least expect that the namenode forces the creation of an additional 
good replica to keep up the redundancy level. 


> handling of corrupt blocks not suitable for commodity hardware
> --
>
> Key: HDFS-12649
> URL: https://issues.apache.org/jira/browse/HDFS-12649
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.1
>Reporter: Gruust
>Priority: Minor
>
> Hadoop's documentation tells me it's suitable for commodity hardware in the 
> sense that hardware failures are expected to happen frequently. However, 
> there is currently no automatic handling of corrupted blocks, which seems a 
> bit contradictory to me.
> See: 
> https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
> This is even problematic for data integrity, as the redundancy is not kept at 
> the desired level without manual intervention, and therefore not in a timely 
> manner. If there is a corrupted block, I would at least expect that the 
> namenode forces the creation of an additional good replica to keep up the 
> redundancy level. 






[jira] [Created] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware

2017-10-12 Thread Gruust (JIRA)
Gruust created HDFS-12649:
-

 Summary: handling of corrupt blocks not suitable for commodity 
hardware
 Key: HDFS-12649
 URL: https://issues.apache.org/jira/browse/HDFS-12649
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.8.1
Reporter: Gruust
Priority: Minor


Hadoop's documentation tells me it's suitable for commodity hardware in the 
sense that hardware failures are expected to happen frequently. However, there 
is currently no automatic handling of corrupted blocks, which seems a bit 
contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even problematic for data integrity as the redundancy is not kept at 
the desired level without manual intervention. If there is a corrupted block, I 
would at least expect that the namenode forces the creation of an additional 
good replica to keep up the redundancy level. 






[jira] [Updated] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem

2017-10-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12632:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

The failed tests are unrelated. I've committed this to the feature branch, 
thanks [~msingh] for the contribution and [~anu] for the review!

> Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
> -
>
> Key: HDFS-12632
> URL: https://issues.apache.org/jira/browse/HDFS-12632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>  Labels: ozoneMerge
> Fix For: HDFS-7240
>
> Attachments: HDFS-12632-HDFS-7240.001.patch
>
>
> HDFS-11704 adds OzoneFileSystem (aka o3) to Ozone. This JIRA will be used to 
> add ContractTest for the filesystem to Ozone.






[jira] [Commented] (HDFS-10743) MiniDFSCluster test runtimes can be drastically reduce

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202482#comment-16202482
 ] 

Daryn Sharp commented on HDFS-10743:


Triggering a block report immediately after the heartbeat isn't addressing the 
main issue of delayed reconnects after a cluster restart.  Eliminating that 
delay will save a lot of time.

The DNs are stuck waiting for the next heartbeat or stuck in 
{{sleepAfterException}}.  The mini cluster has some "triggerBlah" methods but 
they are synchronous and wait for the operation to complete which we can't do 
because sometimes DNs are expected to fail to connect.  An async wakeup can be 
done with {{DataNode#scheduleAllBlockReport(0)}} – if it also then triggered a 
heartbeat.  Maybe add a flag to that method for sending a heartbeat.

Something needs to be done to wake the thread from {{sleepAfterException}} 
because tests will likely encounter that delay during restarts.
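
The general idea of an asynchronous wakeup for a sleeping thread can be sketched with a plain monitor. This is a minimal illustration, not DataNode code: WakeableSleeper is a hypothetical class, and how it would be wired into {{BPServiceActor}} or {{sleepAfterException}} is left open.

```java
// Hypothetical sketch: a sleep that another thread can cut short without
// waiting for the sleeper to finish whatever it does next.
final class WakeableSleeper {
  private final Object lock = new Object();
  private boolean wakeRequested;

  // Sleeps for at most maxMillis. Returns true if wakeUp() ended the sleep,
  // false if it timed out or the thread was interrupted.
  boolean sleep(long maxMillis) {
    long deadline = System.nanoTime() + maxMillis * 1_000_000L;
    synchronized (lock) {
      try {
        while (!wakeRequested) {
          long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
          if (remainingMillis <= 0) {
            break;
          }
          lock.wait(remainingMillis);
        }
      } catch (InterruptedException e) {
        // Preserve the interrupt status for the caller.
        Thread.currentThread().interrupt();
      }
      boolean woken = wakeRequested;
      wakeRequested = false;
      return woken;
    }
  }

  // Asynchronous: sets the flag, notifies the sleeper and returns at once.
  void wakeUp() {
    synchronized (lock) {
      wakeRequested = true;
      lock.notifyAll();
    }
  }
}
```

Because wakeUp() does not wait for the woken thread to succeed at anything, it would also suit restarts where some DNs are expected to fail to connect.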

> MiniDFSCluster test runtimes can be drastically reduce
> --
>
> Key: HDFS-10743
> URL: https://issues.apache.org/jira/browse/HDFS-10743
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
> Attachments: HDFS-10743.001.patch, HDFS-10743.002.patch, 
> HDFS-10743.003.patch
>
>
> {{MiniDFSCluster}} tests have excessive runtimes.  The main problem appears 
> to be the heartbeat interval.  The NN may have to wait up to 3s (default 
> value) for all DNs to heartbeat, triggering registration, so NN can go 
> active.  Tests that repeatedly restart the NN are severely affected.
> Example for varying heartbeat intervals for {{TestFSImageWithAcl}}:
> * 3s = ~70s -- (disgusting, why I investigated)
> * 1s = ~27s
> * 500ms = ~17s -- (had to hack DNConf for millisecond precision)
> That's a 4x improvement in runtime.
> 17s is still excessively long for what the test does.  Further areas to 
> explore when running tests:
> * Reduce the numerous sleep intervals in DN's {{BPServiceActor}}.
> * Ensure heartbeats and initial BR are sent immediately upon (re)registration.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.03.patch

Attached v03 patch with more comments. [~yzhangal], [~daryn], can you please 
take a look at the latest patch revision? Thanks.


> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving path (like 
> "/") and checking for permission, the following code when working on 
> {{pathByNameArr}} throws NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/" where the split components based on delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can throw 
> NPE.
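
A null-tolerant version of the conversion loop quoted above might look like this. It is a sketch under assumptions, not the attached patch: PathComponents is a hypothetical stand-alone helper, it uses the JDK instead of DFSUtil, and mapping a null component to the empty string is an assumed choice.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: converts path components to strings and tolerates
// null components such as the one produced for the root path "/".
final class PathComponents {
  private PathComponents() {
  }

  static String[] toStrings(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i < elements.length; i++) {
      byte[] component = pathByNameArr[i];
      // Assumption: a null component is represented as the empty string.
      elements[i] = (component == null)
          ? ""
          : new String(component, StandardCharsets.UTF_8);
    }
    return elements;
  }
}
```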






[jira] [Commented] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem

2017-10-12 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202460#comment-16202460
 ] 

Chen Liang commented on HDFS-12632:
---

Thanks [~msingh] for adding the tests. +1

> Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
> -
>
> Key: HDFS-12632
> URL: https://issues.apache.org/jira/browse/HDFS-12632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>  Labels: ozoneMerge
> Fix For: HDFS-7240
>
> Attachments: HDFS-12632-HDFS-7240.001.patch
>
>
> HDFS-11704 adds OzoneFileSystem (aka o3) to Ozone. This JIRA will be used to 
> add ContractTest for the filesystem to Ozone.






[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12626:
--
Attachment: HDFS-12626-HDFS-7240.002.patch

Attached the 002 patch, which is a rebase.

> Ozone : delete open key entries that will no longer be closed
> -
>
> Key: HDFS-12626
> URL: https://issues.apache.org/jira/browse/HDFS-12626
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12626-HDFS-7240.001.patch, 
> HDFS-12626-HDFS-7240.002.patch
>
>
> HDFS-12543 introduced the notion of an "open key": when a key is opened, an 
> open key entry gets persisted, and this entry is made visible only after the 
> client calls close. One issue is that if the client never calls close (e.g. 
> because it failed), the open key entry will never be deleted from the 
> metadata. This JIRA tracks this issue.






[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202361#comment-16202361
 ] 

Hadoop QA commented on HDFS-12626:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HDFS-12626 does not apply to HDFS-7240. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12626 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891548/HDFS-12626-HDFS-7240.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21669/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone : delete open key entries that will no longer be closed
> -
>
> Key: HDFS-12626
> URL: https://issues.apache.org/jira/browse/HDFS-12626
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12626-HDFS-7240.001.patch
>
>
> HDFS-12543 introduced the notion of an "open key": when a key is opened, an 
> open key entry gets persisted, and this entry is made visible only after the 
> client calls close. One issue is that if the client never calls close (e.g. 
> because it failed), the open key entry will never be deleted from the 
> metadata. This JIRA tracks this issue.






[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202357#comment-16202357
 ] 

Daryn Sharp commented on HDFS-12638:


The issues actually appear unrelated.  In our case, there was only 1 replica, 
that node died, the block was deleted.  The block got stuck in the replication 
monitor's missing block queue.  Kihwal is filing a jira with additional details.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>
> The active NameNode exited due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do not 
> know why it is null. Looking at the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> for whether BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed

2017-10-12 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12626:
--
Status: Patch Available  (was: Open)

> Ozone : delete open key entries that will no longer be closed
> -
>
> Key: HDFS-12626
> URL: https://issues.apache.org/jira/browse/HDFS-12626
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12626-HDFS-7240.001.patch
>
>
> HDFS-12543 introduced the notion of an "open key": when a key is opened, an 
> open key entry gets persisted, and this entry is made visible only after the 
> client calls close. One issue is that if the client never calls close 
> (e.g. because it failed), that open key entry will never be deleted from the 
> metadata. This JIRA tracks this issue.
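
One possible cleanup approach, sketched under assumptions: the issue text does not say how stale entries should be detected, so the age threshold below is an assumption, and the class and method names (OpenKeyTableSketch, purgeExpired) are hypothetical and not taken from the Ozone code base.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical open-key table; not the actual Ozone metadata store.
class OpenKeyTableSketch {
    // key name -> time the key was opened, in milliseconds
    private final Map<String, Long> openKeys = new ConcurrentHashMap<>();

    void open(String key, long nowMillis) {
        openKeys.put(key, nowMillis);
    }

    // A successful close removes the open entry.
    void close(String key) {
        openKeys.remove(key);
    }

    // Assumption: an entry open for longer than maxOpenMillis will never be closed.
    // Removes such entries and returns their names.
    List<String> purgeExpired(long nowMillis, long maxOpenMillis) {
        List<String> purged = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = openKeys.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxOpenMillis) {
                purged.add(entry.getKey());
                it.remove();
            }
        }
        return purged;
    }
}
```

A background task could call purgeExpired periodically; choosing the threshold is the hard part, since a slow but live client must not lose its entry.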






[jira] [Commented] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails

2017-10-12 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202341#comment-16202341
 ] 

Chen Liang commented on HDFS-12415:
---

I looked into this a little bit too. What seems to be happening is that 
{{SCMCommonPolicy#chooseDatanodes}} calls 
{{nodeManager.getNodes(OzoneProtos.NodeState.HEALTHY);}}, but the returned list 
contains a {{null}} datanode id entry, so the {{hasEnoughSpace(d, 
sizeRequired)}} call on the null {{d}} fails with an NPE. The list with the 
null entry comes from {{SCMNodeManager#getNodes}}, where it seems that some 
datanode id is in {{healthyNodes}} but not present in the {{nodes}} map.

I don't see how a datanode id could be present in {{healthyNodes}} but not in 
{{nodes}}, because the first thing register does is add that datanode to 
{{nodes}}, before {{healthyNodes}}. The only explanation I can think of is the 
one [~msingh] mentioned: an unexpected race condition when two register calls 
happen and modify the HashMap {{nodes}} at the same time. So I would +1 
Mukul's change. Additionally, I ran {{TestXceiverClientManager}} several tens 
of times with the v005 patch applied, and the test did not fail.
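
To illustrate the suspected race, here is a minimal sketch of one way to rule out this class of problem. It is not necessarily what the v005 patch does, and NodeRegistrySketch with its methods is a hypothetical stand-in, not the real SCMNodeManager. It assumes that a thread-safe map plus a defensive lookup is acceptable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node manager; not the real SCMNodeManager.
class NodeRegistrySketch {
    // Thread-safe map: concurrent register() calls cannot corrupt it,
    // unlike a plain HashMap mutated from two threads at once.
    private final Map<String, String> nodes = new ConcurrentHashMap<>();
    private final Set<String> healthyNodes = ConcurrentHashMap.newKeySet();

    void register(String datanodeId, String address) {
        nodes.put(datanodeId, address);  // always add to nodes first
        healthyNodes.add(datanodeId);    // then mark the node healthy
    }

    // Defensive lookup: ids with no entry in nodes are skipped,
    // so callers never see a null element.
    List<String> getHealthyNodes() {
        List<String> result = new ArrayList<>();
        for (String id : healthyNodes) {
            String address = nodes.get(id);
            if (address != null) {
                result.add(address);
            }
        }
        return result;
    }
}
```

The null check alone would only hide the symptom; the thread-safe map is what addresses concurrent modification.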

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> 
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-12415-HDFS-7240.001.patch, 
> HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, 
> HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some jenkins 
> jobs,
> {noformat}
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at 
> org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this 
> report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]






[jira] [Commented] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202333#comment-16202333
 ] 

Hadoop QA commented on HDFS-5926:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 29m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}131m 32s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}181m 27s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-5926 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891730/HDFS-5926-1.patch |
| Optional Tests | asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml |
| uname | Linux 461d2b393e7a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 075358e |
| Default Java | 1.8.0_144 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
> -
>
> 

[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors

2017-10-12 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202321#comment-16202321
 ] 

Hanisha Koneru commented on HDFS-12639:
---

Sure. Thanks, Daryn.

> BPOfferService lock may stall all service actors
> 
>
> Key: HDFS-12639
> URL: https://issues.apache.org/jira/browse/HDFS-12639
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Hanisha Koneru
>
> {{BPOfferService}} manages {{BPServiceActor}} instances for the active and 
> standby.  It uses a RW lock to primarily protect registration information 
> while determining the active/standby from heartbeats.
> Unfortunately the write lock is held during command processing.  If an actor 
> is experiencing high latency processing commands, the other actor will 
> neither be able to register (blocked in createRegistration, setNamespaceInfo, 
> verifyAndSetNamespaceInfo) nor process heartbeats (blocked in 
> updateActorStatesFromHeartbeat).
> The worst case scenario for processing commands while holding the lock is 
> re-registration.  The actor will loop, catching and logging exceptions, 
> leaving the other actor blocked for a non-deterministic (possibly infinite) 
> amount of time.
> The lock must not be held during command processing.
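
The requirement above can be sketched as follows. This is only an illustration under assumptions, not the actual BPOfferService code or a proposed patch: OfferServiceSketch and its methods are hypothetical names, and the rule that only the active's commands are run is a simplification. The point is the shape: read shared state in a short critical section, release the lock, and only then run the slow command.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an offer service; not the real BPOfferService.
class OfferServiceSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String activeActor;  // guarded by lock

    // Short critical section: update shared state, then release.
    void updateActiveFromHeartbeat(String actor) {
        lock.writeLock().lock();
        try {
            activeActor = actor;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Decide under the read lock, then run the slow command with no lock held.
    boolean processCommand(String actor, Runnable slowCommand) {
        boolean fromActive;
        lock.readLock().lock();
        try {
            fromActive = actor.equals(activeActor);
        } finally {
            lock.readLock().unlock();
        }
        if (!fromActive) {
            return false;  // simplification: skip commands not from the active
        }
        slowCommand.run();  // the lock is NOT held here
        return true;
    }

    // Helper for checking that the calling thread holds neither lock.
    boolean isLockHeldByCurrentThread() {
        return lock.isWriteLockedByCurrentThread() || lock.getReadHoldCount() > 0;
    }
}
```

With this shape, a slow or looping command can no longer block the other actor's registration or heartbeat handling, although the command itself must then tolerate the active changing while it runs.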






[jira] [Assigned] (HDFS-12639) BPOfferService lock may stall all service actors

2017-10-12 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru reassigned HDFS-12639:
-

Assignee: Hanisha Koneru

> BPOfferService lock may stall all service actors
> 
>
> Key: HDFS-12639
> URL: https://issues.apache.org/jira/browse/HDFS-12639
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Hanisha Koneru
>
> {{BPOfferService}} manages {{BPServiceActor}} instances for the active and 
> standby.  It uses a RW lock to primarily protect registration information 
> while determining the active/standby from heartbeats.
> Unfortunately the write lock is held during command processing.  If an actor 
> is experiencing high latency processing commands, the other actor will 
> neither be able to register (blocked in createRegistration, setNamespaceInfo, 
> verifyAndSetNamespaceInfo) nor process heartbeats (blocked in 
> updateActorStatesFromHeartbeat).
> The worst case scenario for processing commands while holding the lock is 
> re-registration.  The actor will loop, catching and logging exceptions, 
> leaving the other actor blocked for a non-deterministic (possibly infinite) 
> amount of time.
> The lock must not be held during command processing.






[jira] [Commented] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem

2017-10-12 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202302#comment-16202302
 ] 

Anu Engineer commented on HDFS-12632:
-

+1, LGTM.

> Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
> -
>
> Key: HDFS-12632
> URL: https://issues.apache.org/jira/browse/HDFS-12632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>  Labels: ozoneMerge
> Fix For: HDFS-7240
>
> Attachments: HDFS-12632-HDFS-7240.001.patch
>
>
> HDFS-11704 adds OzoneFileSystem (aka o3) to Ozone. This jira will be used to 
> add contract tests for the filesystem to Ozone.






[jira] [Updated] (HDFS-12599) Remove Mockito dependency from DataNodeTestUtils

2017-10-12 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12599:
-
Fix Version/s: (was: 3.1.0)

> Remove Mockito dependency from DataNodeTestUtils
> 
>
> Key: HDFS-12599
> URL: https://issues.apache.org/jira/browse/HDFS-12599
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-12599.v1.patch, HDFS-12599.v1.patch, 
> HDFS-12599.v1.patch
>
>
> HDFS-11164 introduced {{DataNodeTestUtils.mockDatanodeBlkPinning}}, which 
> brought the dependency on Mockito back into DataNodeTestUtils.
> Downstream, this resulted in:
> {code}
> java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953)
> {code}






[jira] [Commented] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified

2017-10-12 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202294#comment-16202294
 ] 

Wei-Chiu Chuang commented on HDFS-12212:


Hi [~vinayrpet], thanks for the patch.
The fix itself looks good to me. Would you also like to contribute a test?


> Options.Rename.To_TRASH is considered even when Options.Rename.NONE is 
> specified
> 
>
> Key: HDFS-12212
> URL: https://issues.apache.org/jira/browse/HDFS-12212
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-12212-01.patch
>
>
> HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the 
> movement to trash and other renames for permission checks.
> When {{Options.Rename.NONE}} is passed, TO_TRASH is still considered for the 
> rename, so the wrong permissions are checked.
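
A minimal sketch of the intended behaviour, assuming the check should depend only on the options the caller actually passed. The enum below is a local mirror of the option names for illustration, and RenameOptionsSketch.isToTrash is a hypothetical helper, not the NameNode code or the attached patch.

```java
import java.util.Arrays;

// Hypothetical mirror of the rename options; not the actual NameNode code.
class RenameOptionsSketch {
    enum Rename { NONE, OVERWRITE, TO_TRASH }

    // True only when TO_TRASH was explicitly passed by the caller.
    static boolean isToTrash(Rename... options) {
        if (options == null) {
            return false;
        }
        return Arrays.asList(options).contains(Rename.TO_TRASH);
    }
}
```

A permission check that branches on isToTrash(options) would then apply the trash-specific rules only when TO_TRASH was requested, and the regular rename rules for NONE or OVERWRITE.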






[jira] [Commented] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified

2017-10-12 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202257#comment-16202257
 ] 

Wei-Chiu Chuang commented on HDFS-12212:


Updated Affects versions based on HDFS-8312's affects versions.

> Options.Rename.To_TRASH is considered even when Options.Rename.NONE is 
> specified
> 
>
> Key: HDFS-12212
> URL: https://issues.apache.org/jira/browse/HDFS-12212
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-12212-01.patch
>
>
> HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the 
> movement to trash and other renames for permission checks.
> When {{Options.Rename.NONE}} is passed, TO_TRASH is still considered for the 
> rename, so the wrong permissions are checked.






[jira] [Updated] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified

2017-10-12 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-12212:
---
Affects Version/s: 2.8.2
   2.9.0
   2.7.4
   3.0.0-alpha1

> Options.Rename.To_TRASH is considered even when Options.Rename.NONE is 
> specified
> 
>
> Key: HDFS-12212
> URL: https://issues.apache.org/jira/browse/HDFS-12212
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-12212-01.patch
>
>
> HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the 
> movement to trash and other renames for permission checks.
> When {{Options.Rename.NONE}} is passed, TO_TRASH is still considered for the 
> rename, so the wrong permissions are checked.






[jira] [Updated] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey

2017-10-12 Thread Nanda kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDFS-12490:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Ozone: OzoneClient: Add creation/modification time information in 
> OzoneVolume/OzoneBucket/OzoneKey
> --
>
> Key: HDFS-12490
> URL: https://issues.apache.org/jira/browse/HDFS-12490
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12490-HDFS-7240.001.patch, 
> HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch
>
>
> OzoneBucket should have information about the bucket creation time.
> OzoneFileSystem needs creation time to display the file status information 
> for the root of the filesystem.






[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey

2017-10-12 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202211#comment-16202211
 ] 

Nanda kumar commented on HDFS-12490:


I have committed this to the feature branch. Thanks [~msingh] for the 
contribution and [~anu] for the review.

> Ozone: OzoneClient: Add creation/modification time information in 
> OzoneVolume/OzoneBucket/OzoneKey
> --
>
> Key: HDFS-12490
> URL: https://issues.apache.org/jira/browse/HDFS-12490
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12490-HDFS-7240.001.patch, 
> HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch
>
>
> OzoneBucket should have information about the bucket creation time.
> OzoneFileSystem needs creation time to display the file status information 
> for the root of the filesystem.






[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey

2017-10-12 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202203#comment-16202203
 ] 

Nanda kumar commented on HDFS-12490:


Since we don't want to expose setter methods for the parameters passed to the 
constructor, the checkstyle issue is left as is for now.
The test failures are not related; I will commit it shortly.

> Ozone: OzoneClient: Add creation/modification time information in 
> OzoneVolume/OzoneBucket/OzoneKey
> --
>
> Key: HDFS-12490
> URL: https://issues.apache.org/jira/browse/HDFS-12490
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12490-HDFS-7240.001.patch, 
> HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch
>
>
> OzoneBucket should have information about the bucket creation time.
> OzoneFileSystem needs creation time to display the file status information 
> for the root of the filesystem.






[jira] [Commented] (HDFS-12620) Backporting HDFS-10467 to branch-2

2017-10-12 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202148#comment-16202148
 ] 

Íñigo Goiri commented on HDFS-12620:


Even with the most standard naming convention, Jenkins is not giving me the 
report, so I'm not sure what's wrong with javac.

> Backporting HDFS-10467 to branch-2
> --
>
> Key: HDFS-12620
> URL: https://issues.apache.org/jira/browse/HDFS-12620
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Attachments: HDFS-10467-branch-2.001.patch, 
> HDFS-10467-branch-2.002.patch, HDFS-10467-branch-2.003.patch, 
> HDFS-10467-branch-2.patch, HDFS-12620-branch-2.000.patch, 
> HDFS-12620-branch-2.004.patch
>
>
> When backporting HDFS-10467, there are a few things that changed:
> * {{bin\hdfs}}
> * {{ClientProtocol}}
> * Java 7 not supporting method references
> * {{org.eclipse.jetty.util.ajax.JSON}} in branch-2 is 
> {{org.mortbay.util.ajax.JSON}}
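
As an illustration of the Java 7 item in the list above, a backport typically replaces Java 8 function references with an explicit callback type and an anonymous class. The sketch below is generic: Java7BackportSketch and Mapper are hypothetical names, not code from the HDFS-10467 patches.

```java
import java.util.ArrayList;
import java.util.List;

// Java 7 compatible: a small callback interface plus an anonymous class
// in place of a Java 8 method reference such as String::length.
class Java7BackportSketch {
    interface Mapper<T, R> {
        R apply(T input);
    }

    static <T, R> List<R> map(List<T> items, Mapper<T, R> mapper) {
        List<R> out = new ArrayList<R>();
        for (T item : items) {
            out.add(mapper.apply(item));
        }
        return out;
    }

    static List<Integer> lengths(List<String> names) {
        // On Java 8 this could be written as: map(names, String::length)
        return map(names, new Mapper<String, Integer>() {
            @Override
            public Integer apply(String input) {
                return input.length();
            }
        });
    }
}
```

The anonymous class is more verbose but compiles on both Java 7 and Java 8, which keeps the branch-2 code close to trunk.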






[jira] [Commented] (HDFS-12585) Add description for config in Ozone config UI

2017-10-12 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202128#comment-16202128
 ] 

Ajay Kumar commented on HDFS-12585:
---

[~vagarychen], thanks for the review. Yes, {{propertyMap}} is only initialized in 
{{loadDescriptions()}}. Since the default value of {{loadDescriptionFromXml}} is 
true, {{loadDescriptions()}} will be called initially, and thereafter we will 
use the initialized map. The main intention behind {{loadDescriptionFromXml}} 
is to load the property descriptions from the xml files, which are not going 
to change frequently, so we don't want to unmarshal the xml files every time 
there is a call to the servlet.
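
The load-once behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: DescriptionCacheSketch and describe are hypothetical names, and the Supplier stands in for whatever unmarshals the xml files. Only the flag and map names follow the comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical cache of property descriptions; not the actual servlet code.
class DescriptionCacheSketch {
    private final Supplier<Map<String, String>> xmlLoader;  // stands in for XML unmarshalling
    private boolean loadDescriptionFromXml = true;           // true until the first load
    private Map<String, String> propertyMap = new HashMap<>();

    DescriptionCacheSketch(Supplier<Map<String, String>> xmlLoader) {
        this.xmlLoader = xmlLoader;
    }

    // synchronized so that concurrent requests trigger at most one load.
    synchronized String describe(String property) {
        if (loadDescriptionFromXml) {
            propertyMap = new HashMap<>(xmlLoader.get());
            loadDescriptionFromXml = false;
        }
        return propertyMap.get(property);
    }
}
```

Resetting the flag to true would force a reload on the next call, which is one way to pick up changed xml files without reloading on every request.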

> Add description for config in Ozone config UI
> -
>
> Key: HDFS-12585
> URL: https://issues.apache.org/jira/browse/HDFS-12585
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
> Fix For: HDFS-7240
>
> Attachments: HDFS-12585-HDFS-7240.01.patch, 
> HDFS-12585-HDFS-7240.02.patch, HDFS-12585-HDFS-7240.03.patch
>
>
> Add description for each config in Ozone config UI






[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202092#comment-16202092
 ] 

Daryn Sharp commented on HDFS-12638:


It does differ from our case but likely has the same root cause.  The block was 
in your blocks map, but not in ours.  The block existed on your DN, but not 
ours.  The commonality is the block referenced a non-existent inode/collection.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>
> The active NameNode exited due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do 
> not know why it is null. Looking through the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the 
> check for whether BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202079#comment-16202079
 ] 

Daryn Sharp commented on HDFS-12638:


This is bad.  The fsck NPE means the block _is_ in the blocks map but the 
inode/collection it references _is not_ in the blocks map.  Last I knew, a size 
of Long.MAX_VALUE means the block is scheduled for a "fire and forget" 
invalidation because the file was deleted.

bq. and there are logs of truncate cmd in auditlog

To double check, did you mean "are" or "are not"?



> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>
> The active NameNode exited due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do 
> not know why it is null. Looking through the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the 
> check for whether BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202067#comment-16202067
 ] 

Daryn Sharp commented on HDFS-12639:


Sure, go ahead and assign it to yourself.  I don't assign issues to myself 
unless I have free cycles.  Even though my responses are often delayed, please 
let me review your patch.

> BPOfferService lock may stall all service actors
> 
>
> Key: HDFS-12639
> URL: https://issues.apache.org/jira/browse/HDFS-12639
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>
> {{BPOfferService}} manages {{BPServiceActor}} instances for the active and 
> standby.  It uses a RW lock to primarily protect registration information 
> while determining the active/standby from heartbeats.
> Unfortunately the write lock is held during command processing.  If an actor 
> is experiencing high latency processing commands, the other actor will 
> neither be able to register (blocked in createRegistration, setNamespaceInfo, 
> verifyAndSetNamespaceInfo) nor process heartbeats (blocked in 
> updateActorStatesFromHeartbeat).
> The worst case scenario for processing commands while holding the lock is 
> re-registration.  The actor will loop, catching and logging exceptions, 
> leaving the other actor blocked for a non-deterministic (possibly infinite) 
> amount of time.
> The lock must not be held during command processing.
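
A minimal sketch of the direction stated in the description, assuming a much simplified offer service: the read lock is held only long enough to check which actor is active, and the command itself runs with no lock held. The class OfferServiceSketch and all of its methods are hypothetical; this is not the actual BPOfferService code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an offer service; not the real BPOfferService.
final class OfferServiceSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String activeActor;  // state guarded by the lock

  // Registration-style update: short critical section under the write lock.
  void setActive(String actor) {
    lock.writeLock().lock();
    try {
      activeActor = actor;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Decide under the read lock, then run the (possibly slow) command unlocked.
  boolean processCommand(String actor, Runnable command) {
    boolean fromActive;
    lock.readLock().lock();
    try {
      fromActive = actor.equals(activeActor);
    } finally {
      lock.readLock().unlock();
    }
    if (!fromActive) {
      return false;  // this sketch ignores commands from a non-active actor
    }
    command.run();   // no lock is held here
    return true;
  }

  // Introspection helpers used only to demonstrate that no lock is held.
  boolean isWriteLocked() { return lock.isWriteLocked(); }
  int readHolds() { return lock.getReadHoldCount(); }
}
```

With this shape, a slow command delays only the actor that runs it; the other actor can still take the lock to register or to update state from a heartbeat.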






[jira] [Commented] (HDFS-12645) FSDatasetImpl lock will stall BP service actors and may cause missing blocks

2017-10-12 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202063#comment-16202063
 ] 

Daryn Sharp commented on HDFS-12645:


I understand the lifeline protocol was designed to avoid the node being 
declared dead, but it's just hiding the consequences of a poor locking design.  
Preventing a dead node via a lifeline is of dubious value when the node is 
effectively dead due to blocked IO in the dataset lock.  The node can't process 
replications which may lead to data loss when another node could have serviced 
the replication request.  For example, the lifeline will keep a node "alive" even 
though it is having severe hw issues and ultimately crashes.

> FSDatasetImpl lock will stall BP service actors and may cause missing blocks
> 
>
> Key: HDFS-12645
> URL: https://issues.apache.org/jira/browse/HDFS-12645
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>
> The DN is extremely susceptible to a slow volume due to bad locking practices.  
> DN operations require a fs dataset lock.  IO in the dataset lock should not 
> be permissible as it leads to severe performance degradation and possibly 
> (temporarily) missing blocks.
> A slow disk will cause pipelines to experience significant latency and 
> timeouts, increasing lock/io contention while cleaning up, leading to more 
> timeouts, etc.  Meanwhile, the actor service thread is interleaving multiple 
> lock acquire/releases with xceivers.  If many commands are issued, the node 
> may be incorrectly declared as dead.
> HDFS-12639 documents that both actors synchronize on the offer service lock 
> while processing commands.  A backlogged active actor will block the standby 
> actor and cause it to go dead too.






[jira] [Assigned] (HDFS-12647) DN commands processing should be async

2017-10-12 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar reassigned HDFS-12647:
-

Assignee: Nandakumar

> DN commands processing should be async
> --
>
> Key: HDFS-12647
> URL: https://issues.apache.org/jira/browse/HDFS-12647
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Nandakumar
>
> Due to dataset lock contention, service actors may encounter significant 
> latency while processing DN commands.  Even the queuing of async deletions 
> requires multiple lock acquisitions.  A slow disk will cause a backlog of 
> xceivers instantiating block sender/receivers which starves the actor and 
> leads to the NN falsely declaring the node dead.
> Async processing of all commands will free the actor to perform its primary 
> purpose of heartbeating and block reporting.  Note that FBRs will be 
> dependent on queued block invalidations not being included in the report.






[jira] [Created] (HDFS-12648) DN should provide feedback to NN for throttling commands

2017-10-12 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12648:
--

 Summary: DN should provide feedback to NN for throttling commands
 Key: HDFS-12648
 URL: https://issues.apache.org/jira/browse/HDFS-12648
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp


The NN should avoid sending commands to a DN with a high number of outstanding 
commands.  The heartbeat could provide this feedback via perhaps a simple count 
of the commands or rate of processing.
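
One way the "simple count" idea could look on the NN side, as a sketch only: the heartbeat carries the number of commands the DN has not yet processed, and the NN derives a budget for new commands from it. The class CommandThrottleSketch, the maxOutstanding limit, and the method names are all assumptions made for illustration; nothing here is an existing Hadoop API.

```java
// Hypothetical NameNode-side view of one DataNode; names are illustrative only.
final class CommandThrottleSketch {
  private final int maxOutstanding;  // assumed per-DN limit
  private int reportedOutstanding;   // last count carried by a heartbeat

  CommandThrottleSketch(int maxOutstanding) {
    this.maxOutstanding = maxOutstanding;
  }

  // Called when a heartbeat arrives with the DN's pending-command count.
  void onHeartbeat(int outstandingCommands) {
    reportedOutstanding = Math.max(0, outstandingCommands);
  }

  // How many new commands may be sent in response to this heartbeat.
  int commandBudget() {
    return Math.max(0, maxOutstanding - reportedOutstanding);
  }
}
```

A backlogged DN then receives fewer commands, or none, until its reported count drops again. A rate of processing could replace the raw count without changing the overall shape.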






[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity

2017-10-12 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-5926:
-
Status: Patch Available  (was: Open)

> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
> -
>
> Key: HDFS-5926
> URL: https://issues.apache.org/jira/browse/HDFS-5926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.2
>Reporter: Alexander Fahlke
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-5926-1.patch
>
>
> I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as 
> many others with the parameter dfs.datanode.du.reserved. One day some 
> data nodes got out-of-disk errors although there was space left on the disks.
> The following values are rounded to make the problem clearer:
> - the disk for the DFS data has 1000GB and only one partition (ext3) for DFS 
> data
> - you plan to set dfs.datanode.du.reserved to 20GB
> - the reserved-blocks-percentage set by tune2fs is 5% (the default)
> That gives all users except root 5% less capacity that they can use, 
> although the system reports the total of 1000GB as usable for all users via 
> df. The Hadoop daemons are not running as root.
> If I read it right, then Hadoop gets the free capacity via df.
>  
> Starting in 
> {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line 
> 350: {{return usage.getCapacity()-reserved;}}
> going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says:
> {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}}
> When you have 5% reserved by tune2fs (in our case 50GB) and you give 
> dfs.datanode.du.reserved only 20GB, then you can run into out-of-disk 
> errors that Hadoop can't handle.
> In this case you must add the planned 20GB du reserved to the capacity 
> reserved by tune2fs. This results in (at least) 70GB for 
> dfs.datanode.du.reserved in my case.
> Two ideas:
> # The documentation must be clear on this point to avoid this problem.
> # Hadoop could check for space reserved by tune2fs (or other tools) and add 
> this value to the dfs.datanode.du.reserved parameter.
> This ticket is a follow-up from the mailing list: 
> https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E
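
The arithmetic in the description can be written out as a small sketch. It assumes the rounded figures from the report (1000GB volume, 5% reserved blocks, 20GB planned headroom); the class and method names are hypothetical and are not part of Hadoop.

```java
// Hypothetical helper that reproduces the calculation from the report.
final class ReservedSpaceSketch {
  static final long GB = 1024L * 1024L * 1024L;

  // Bytes the filesystem holds back for root, given its reserved-blocks
  // percentage. Integer division first keeps the result exact for capacities
  // that are a multiple of 100 bytes and avoids overflow on large volumes.
  static long filesystemReserved(long capacityBytes, int reservedPercent) {
    return capacityBytes / 100 * reservedPercent;
  }

  // Value to configure so the planned headroom survives the fs reservation.
  static long duReserved(long capacityBytes, int reservedPercent, long plannedBytes) {
    return filesystemReserved(capacityBytes, reservedPercent) + plannedBytes;
  }
}
```

For the numbers above, filesystemReserved yields 50GB and duReserved yields 70GB, matching the figure the reporter arrived at.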






[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity

2017-10-12 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-5926:
-
Attachment: HDFS-5926-1.patch

> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
> -
>
> Key: HDFS-5926
> URL: https://issues.apache.org/jira/browse/HDFS-5926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.2
>Reporter: Alexander Fahlke
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-5926-1.patch
>
>






[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity

2017-10-12 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-5926:
-
Attachment: (was: HDFS-5926-1.patch)

> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
> -
>
> Key: HDFS-5926
> URL: https://issues.apache.org/jira/browse/HDFS-5926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.2
>Reporter: Alexander Fahlke
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-5926-1.patch
>
>






[jira] [Created] (HDFS-12647) DN commands processing should be async

2017-10-12 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12647:
--

 Summary: DN commands processing should be async
 Key: HDFS-12647
 URL: https://issues.apache.org/jira/browse/HDFS-12647
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp


Due to dataset lock contention, service actors may encounter significant 
latency while processing DN commands.  Even the queuing of async deletions 
requires multiple lock acquisitions.  A slow disk will cause a backlog of 
xceivers instantiating block sender/receivers which starves the actor and leads 
to the NN falsely declaring the node dead.

Async processing of all commands will free the actor to perform its primary 
purpose of heartbeating and block reporting.  Note that FBRs will be dependent 
on queued block invalidations not being included in the report.
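
A rough sketch of what async processing could look like, assuming a single worker thread that drains commands in arrival order so the actor thread only enqueues. AsyncCommandSketch and its methods are hypothetical names and do not reflect any committed patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical command queue; the actor thread calls submit() and returns.
final class AsyncCommandSketch {
  private final ExecutorService worker = Executors.newSingleThreadExecutor();
  private final AtomicInteger pending = new AtomicInteger();

  // Enqueue the command and return immediately to heartbeating.
  void submit(Runnable command) {
    pending.incrementAndGet();
    worker.execute(() -> {
      try {
        command.run();
      } finally {
        pending.decrementAndGet();
      }
    });
  }

  // Number of commands queued or running; could be reported to the NN.
  int pendingCommands() {
    return pending.get();
  }

  // Stop accepting work and wait for queued commands to finish.
  boolean shutdown(long timeoutMs) {
    worker.shutdown();
    try {
      return worker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

The single worker preserves command ordering. The note about FBRs still applies: a report generated while invalidations are queued would have to exclude those blocks.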






[jira] [Created] (HDFS-12646) Avoid IO while holding the FsDataset lock

2017-10-12 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12646:
--

 Summary: Avoid IO while holding the FsDataset lock
 Key: HDFS-12646
 URL: https://issues.apache.org/jira/browse/HDFS-12646
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp


IO operations should not be allowed while holding the dataset lock.  Notable 
offenders include but are not limited to the instantiation of a block 
sender/receiver, constructing the path to a block, unfinalizing a block.
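
To illustrate the pattern being asked for, here is a small sketch under simplifying assumptions: block locations live in an in-memory map guarded by the dataset lock, and the disk read happens only after the lock is released. DatasetLockSketch is a hypothetical class, not FsDatasetImpl.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical dataset; only in-memory state is touched under the lock.
final class DatasetLockSketch {
  private final Object datasetLock = new Object();
  private final Map<Long, Path> blockFiles = new HashMap<>();

  void addBlock(long blockId, Path file) {
    synchronized (datasetLock) {
      blockFiles.put(blockId, file);
    }
  }

  // The map lookup happens under the lock; the disk read does not.
  byte[] readBlock(long blockId) throws IOException {
    Path file;
    synchronized (datasetLock) {
      file = blockFiles.get(blockId);
    }
    if (file == null) {
      throw new IOException("unknown block " + blockId);
    }
    return Files.readAllBytes(file);  // a slow disk delays only this caller
  }
}
```

The trade-off is that the file may change or disappear between the lookup and the read, so callers have to tolerate an IOException at that point.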






[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202006#comment-16202006
 ] 

Hadoop QA commented on HDFS-12570:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 15 new or modified test files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 19s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  7s{color} | {color:green} HDFS-10285 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m  4s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 46s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} HDFS-10285 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 51s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 1142 unchanged - 2 fixed = 1152 total (was 1144) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 10m 47s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}119m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m  3s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12570 |
| JIRA Patch URL | 
https://issues.ap

[jira] [Created] (HDFS-12645) FSDatasetImpl lock will stall BP service actors and may cause missing blocks

2017-10-12 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12645:
--

 Summary: FSDatasetImpl lock will stall BP service actors and may 
cause missing blocks
 Key: HDFS-12645
 URL: https://issues.apache.org/jira/browse/HDFS-12645
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp


The DN is extremely susceptible to a slow volume due to bad locking practices.  DN 
operations require a fs dataset lock.  IO in the dataset lock should not be 
permissible as it leads to severe performance degradation and possibly 
(temporarily) missing blocks.

A slow disk will cause pipelines to experience significant latency and 
timeouts, increasing lock/io contention while cleaning up, leading to more 
timeouts, etc.  Meanwhile, the actor service thread is interleaving multiple 
lock acquire/releases with xceivers.  If many commands are issued, the node may 
be incorrectly declared as dead.

HDFS-12639 documents that both actors synchronize on the offer service lock 
while processing commands.  A backlogged active actor will block the standby 
actor and cause it to go dead too.






[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.

2017-10-12 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201997#comment-16201997
 ] 

Rushabh S Shah commented on HDFS-11754:
---

[~subru]: I feel the patch is very close to done.
[~erofeev]: Do you have some bandwidth to address my review comments?
If not, we can push to next release.

> Make FsServerDefaults cache configurable.
> -
>
> Key: HDFS-11754
> URL: https://issues.apache.org/jira/browse/HDFS-11754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Mikhail Erofeev
>Priority: Minor
>  Labels: newbie
> Fix For: 2.9.0
>
> Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, 
> HDFS-11754.003.patch, HDFS-11754.004.patch
>
>
> DFSClient caches the result of FsServerDefaults for 60 minutes.
> But the 60 minutes time is not configurable.
> Continuing the discussion from HDFS-11702, it would be nice if we can make 
> this configurable and make the default as 60 minutes.
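
A minimal sketch of a cache whose validity period is a parameter rather than a hard-coded 60 minutes. ExpiringValueSketch, the loader, and the injectable clock are illustrative assumptions; the sketch does not show DFSClient code or name the configuration key chosen by the patch.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache for a single server-provided value; not DFSClient code.
final class ExpiringValueSketch<T> {
  private final Supplier<T> loader;    // stands in for the call that fetches defaults
  private final long validityMs;       // configurable validity period
  private final LongSupplier clockMs;  // injectable clock, e.g. System::currentTimeMillis
  private T value;
  private long loadedAtMs;
  private boolean loaded;

  ExpiringValueSketch(Supplier<T> loader, long validityMs, LongSupplier clockMs) {
    this.loader = loader;
    this.validityMs = validityMs;
    this.clockMs = clockMs;
  }

  // Returns the cached value, reloading it once the validity period has elapsed.
  synchronized T get() {
    long now = clockMs.getAsLong();
    if (!loaded || now - loadedAtMs > validityMs) {
      value = loader.get();
      loadedAtMs = now;
      loaded = true;
    }
    return value;
  }
}
```

Passing 60 * 60 * 1000 as validityMs would reproduce the current behaviour, which is what keeping 60 minutes as the default amounts to.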






[jira] [Updated] (HDFS-12519) Ozone: Lease Manager framework

2017-10-12 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-12519:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Ozone: Lease Manager framework
> --
>
> Key: HDFS-12519
> URL: https://issues.apache.org/jira/browse/HDFS-12519
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Anu Engineer
>Assignee: Nandakumar
>  Labels: ozoneMerge
> Attachments: HDFS-12519-HDFS-7240.000.patch, 
> HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, 
> HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch
>
>
> Many objects, including containers and pipelines, can time out during the creation 
> process. We need a way to track these timeouts. This Lease Manager allows SCM 
> to hold a lease on these objects and helps SCM time out while waiting for the creation 
> of these objects.
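
The idea of holding a lease on an object that is still being created could be sketched roughly as below. LeaseSketch and its methods are hypothetical and say nothing about the API of the committed framework; time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lease table keyed by the resource being created.
final class LeaseSketch<K> {
  private final long timeoutMs;
  private final Map<K, Long> expiryByResource = new HashMap<>();

  LeaseSketch(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  // Start (or restart) the lease when creation of the resource begins.
  synchronized void acquire(K resource, long nowMs) {
    expiryByResource.put(resource, nowMs + timeoutMs);
  }

  // Drop the lease once creation has completed.
  synchronized void release(K resource) {
    expiryByResource.remove(resource);
  }

  // True if a lease is still held and its timeout has passed.
  synchronized boolean hasExpired(K resource, long nowMs) {
    Long expiry = expiryByResource.get(resource);
    return expiry != null && nowMs >= expiry;
  }
}
```

A caller such as SCM would acquire a lease when it starts creating a container or pipeline, release it on success, and treat an expired lease as a creation timeout.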






[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity

2017-10-12 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-5926:
-
Attachment: HDFS-5926-1.patch

> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
> -
>
> Key: HDFS-5926
> URL: https://issues.apache.org/jira/browse/HDFS-5926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.2
>Reporter: Alexander Fahlke
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-5926-1.patch
>
>






[jira] [Commented] (HDFS-12519) Ozone: Lease Manager framework

2017-10-12 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201953#comment-16201953
 ] 

Nandakumar commented on HDFS-12519:
---

Committed this to HDFS-7240. Thanks [~linyiqun], [~vagarychen] & [~anu] for the 
review.

> Ozone: Lease Manager framework
> --
>
> Key: HDFS-12519
> URL: https://issues.apache.org/jira/browse/HDFS-12519
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Anu Engineer
>Assignee: Nandakumar
>  Labels: ozoneMerge
> Attachments: HDFS-12519-HDFS-7240.000.patch, 
> HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, 
> HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch
>
>
> Many objects, including containers and pipelines, can time out during the creation 
> process. We need a way to track these timeouts. This Lease Manager allows SCM 
> to hold a lease on these objects and helps SCM time out while waiting for the creation 
> of these objects.






[jira] [Updated] (HDFS-12519) Ozone: Lease Manager framework

2017-10-12 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-12519:
--
Summary: Ozone: Lease Manager framework  (was: Ozone: Add a Lease Manager 
to SCM)

> Ozone: Lease Manager framework
> --
>
> Key: HDFS-12519
> URL: https://issues.apache.org/jira/browse/HDFS-12519
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Anu Engineer
>Assignee: Nandakumar
>  Labels: ozoneMerge
> Attachments: HDFS-12519-HDFS-7240.000.patch, 
> HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, 
> HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch
>
>
> Many objects, including containers and pipelines, can time out during the creation 
> process. We need a way to track these timeouts. This Lease Manager allows SCM 
> to hold a lease on these objects and helps SCM time out while waiting for the creation 
> of these objects.






[jira] [Commented] (HDFS-12519) Ozone: Add a Lease Manager to SCM

2017-10-12 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201930#comment-16201930
 ] 

Nandakumar commented on HDFS-12519:
---

Test failures are not related, will commit it shortly.

> Ozone: Add a Lease Manager to SCM
> -
>
> Key: HDFS-12519
> URL: https://issues.apache.org/jira/browse/HDFS-12519
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Anu Engineer
>Assignee: Nandakumar
>  Labels: ozoneMerge
> Attachments: HDFS-12519-HDFS-7240.000.patch, 
> HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, 
> HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch
>
>
> Many objects, including containers and pipelines, can time out during the creation 
> process. We need a way to track these timeouts. This Lease Manager allows SCM 
> to hold a lease on these objects and helps SCM time out while waiting for the creation 
> of these objects.






[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201903#comment-16201903
 ] 

Hadoop QA commented on HDFS-12490:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 40s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 24s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 17s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 56s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 19s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 19s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 54s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 
0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}134m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}210m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.TestDatanodeReport |
|   | hadoop.ozone.web.client.TestKeys |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
\\
\\
|| Subsystem || Repo

[jira] [Commented] (HDFS-12519) Ozone: Add a Lease Manager to SCM

2017-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201848#comment-16201848
 ] 

Hadoop QA commented on HDFS-12519:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  5s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m  1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 36s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 56s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12519 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891674/HDFS-12519-HDFS-7240.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 06569d1c7fbc 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 034f01a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/console |
| Powered by | Apache Yetus 0.6

[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements

2017-10-12 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201817#comment-16201817
 ] 

Rakesh R commented on HDFS-12570:
-

Attached another patch fixing the {{TestStoragePolicySatisfierWithStripedFile}} 
and {{TestHdfsConfigFields}} test failures. As far as I can see, the 
{{TestPersistentStoragePolicySatisfier}} failures will be fixed as part of 
HDFS-12556.

> [SPS]: Refactor Co-ordinator datanode logic to track the block storage 
> movements
> 
>
> Key: HDFS-12570
> URL: https://issues.apache.org/jira/browse/HDFS-12570
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-12570-HDFS-10285-00.patch, 
> HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, 
> HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch
>
>
> This task is to refactor the C-DN block storage movements. Basically, the 
> idea is to move the scheduling and tracking logic to the Namenode rather than 
> keeping it at the special C-DN. Please refer to the discussion with 
> [~andrew.wang] to understand the [background and the necessity of 
> refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].






[jira] [Updated] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements

2017-10-12 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-12570:

Attachment: HDFS-12570-HDFS-10285-04.patch

> [SPS]: Refactor Co-ordinator datanode logic to track the block storage 
> movements
> 
>
> Key: HDFS-12570
> URL: https://issues.apache.org/jira/browse/HDFS-12570
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-12570-HDFS-10285-00.patch, 
> HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, 
> HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch
>
>
> This task is to refactor the C-DN block storage movements. Basically, the 
> idea is to move the scheduling and tracking logic to the Namenode rather than 
> keeping it at the special C-DN. Please refer to the discussion with 
> [~andrew.wang] to understand the [background and the necessity of 
> refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].






[jira] [Created] (HDFS-12644) Offer a non-privileged listEncryptionZone operation

2017-10-12 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-12644:
--

 Summary: Offer a non-privileged listEncryptionZone operation
 Key: HDFS-12644
 URL: https://issues.apache.org/jira/browse/HDFS-12644
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: encryption, namenode
Affects Versions: 3.0.0-alpha1, 2.8.0
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


As discussed in HDFS-12484, we can consider adding a non-privileged 
listEncryptionZone operation for a better user experience.






[jira] [Commented] (HDFS-12484) Undefined -expunge behavior after 2.8

2017-10-12 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201776#comment-16201776
 ] 

Wei-Chiu Chuang commented on HDFS-12484:


Thanks [~xyao], and sorry for the late update. I was on vacation.
I am a little hesitant to take the latter route, since I am not sure about 
the purpose of listEncryptionZone; specifically, why is it a privileged 
operation? Let's file a new jira to initiate the discussion.

> Undefined -expunge behavior after 2.8
> -
>
> Key: HDFS-12484
> URL: https://issues.apache.org/jira/browse/HDFS-12484
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-12484.001.patch, HDFS-12484.002.patch
>
>
> (Rewrote the description to reflect the actual behavior)
> Hadoop 2.8 added a feature to support trash inside encryption zones, which is 
> a great feature to have.
> However, when it comes to -expunge, the behavior is not well defined. A 
> superuser invoking -expunge removes files under all encryption zone trash 
> directories belonging to the user. On the other hand, because 
> listEncryptionZones requires superuser permission, a non-privileged user 
> invoking -expunge can remove files under the home directory, but not under 
> encryption zones.
> Moreover, the command prints a scary and annoying warning message.
> {noformat}
> 2017-09-21 01:22:44,744 [main] WARN  hdfs.DFSClient 
> (DistributedFileSystem.java:getTrashRoots(2795)) - Cannot get all encrypted 
> trash roots
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
>  Access denied for user user. Superuser privilege is required
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:130)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:4556)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listEncryptionZones(FSNamesystem.java:7048)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.listEncryptionZones(NameNodeRpcServer.java:2053)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listEncryptionZones(ClientNamenodeProtocolServerSideTranslatorPB.java:1477)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1490)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1436)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1346)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy25.listEncryptionZones(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listEncryptionZones(ClientNamenodeProtocolTranslatorPB.java:1510)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy29.listEncryptionZones(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocol.EncryptionZoneIterator.makeRequest(Encrypti

[jira] [Resolved] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException when corrupt replicas are inconsistent

2017-10-12 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-11797.

Resolution: Duplicate

I'm going to close it as a dup of HDFS-11445. Feel free to reopen if this is 
not the case. Thanks [~kshukla]!

> BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException 
> when corrupt replicas are inconsistent
> --
>
> Key: HDFS-11797
> URL: https://issues.apache.org/jira/browse/HDFS-11797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Critical
> Attachments: HDFS-11797.001.patch
>
>
> The calculation for {{numMachines}} can be too small (causing 
> ArrayIndexOutOfBoundsException) or too large (causing an NPE (HDFS-9958)) if 
> the data structures report an inconsistent number of corrupt replicas. This 
> was earlier found to be related to failed storages. This JIRA tracks a change 
> that works for all possible cases of inconsistencies.
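
The failure mode described above can be illustrated with a simplified model. This is a hypothetical sketch, not the BlockManager code and not the attached patch: the class LocatedBlockSketch, the "dn" + i node names, and both methods are assumptions made for the example. It sizes an array from a separately reported corrupt-replica count and then fills it by iterating the replicas, so a count that disagrees with the per-replica flags either overflows the array or leaves null entries behind.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the sizing problem; not BlockManager code. */
class LocatedBlockSketch {

  /**
   * Sizes the result from reportedCorrupt, then fills it from the flags.
   * An overstated count throws ArrayIndexOutOfBoundsException; an
   * understated count leaves trailing nulls (a later NPE for callers).
   */
  static String[] fixedArray(boolean[] corruptFlags, int reportedCorrupt) {
    int numNodes = corruptFlags.length;
    boolean allCorrupt = reportedCorrupt == numNodes;
    int numMachines = allCorrupt ? numNodes : numNodes - reportedCorrupt;
    String[] machines = new String[numMachines];
    int j = 0;
    for (int i = 0; i < numNodes; i++) {
      if (allCorrupt || !corruptFlags[i]) {
        machines[j++] = "dn" + i; // may run past the end of the array
      }
    }
    return machines;
  }

  /** One tolerant variant: collect into a list so the size cannot disagree. */
  static List<String> tolerant(boolean[] corruptFlags, int reportedCorrupt) {
    boolean allCorrupt = reportedCorrupt == corruptFlags.length;
    List<String> machines = new ArrayList<>();
    for (int i = 0; i < corruptFlags.length; i++) {
      if (allCorrupt || !corruptFlags[i]) {
        machines.add("dn" + i);
      }
    }
    return machines;
  }
}
```

Collecting into a list is only one possible way to tolerate the inconsistency; the thread does not say how HDFS-11445 resolves it.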





