[jira] [Updated] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11493:

Attachment: HDFS-11493-HDFS-7240.005.patch

bq. Should we reuse the existing one in 
ShutdownThreadsHelper.shutdownExecutorService?
There are a couple of issues with that helper, including that it does not take 
a logger, so we cannot log a failed shutdown.
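
As an illustration of the point about logging, a shutdown helper that accepts a 
logger might look roughly like the sketch below. The class name, method name, 
and timeout handling are hypothetical, and java.util.logging is used only to 
keep the example self-contained; this is not the code in the attached patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not the code in the patch. */
final class ExecutorShutdownSketch {
  private ExecutorShutdownSketch() { }

  /**
   * Shuts the executor down, waiting up to timeoutMs for running tasks,
   * and reports any failure through the caller-supplied logger.
   *
   * @return true if the executor terminated, false otherwise.
   */
  static boolean shutdown(ExecutorService executor, Logger log,
      long timeoutMs) {
    executor.shutdown(); // stop accepting new tasks
    try {
      if (executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
        return true;
      }
      executor.shutdownNow(); // interrupt tasks that are still running
      if (executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
        return true;
      }
      log.log(Level.SEVERE, "Executor did not terminate after forced shutdown");
      return false;
    } catch (InterruptedException e) {
      log.log(Level.SEVERE, "Interrupted while shutting down executor", e);
      executor.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }
}
```

Passing the logger in lets each caller report shutdown failures under its own 
log category.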

bq. ScmConfigKeys.java
Fixed.

bq. OzoneConfigKeys.java
Fixed.

bq. maxContainerReportThreads can be a local variable.
Fixed.

bq.  this can be pulled into a generic CollectionUtil if it does not exist in 
hadoop-common/hadoop-hdfs

Equivalent functions exist, but they seem to predate the Java 7/8 time frame. 
This function just calls into the standard Java collections API.


bq.  conf parameter can be removed as it is not used
Fixed.

bq. adding a counter as you mentioned in the TODO is a good idea
Added the counter.

bq. Should we rename it to InProgressPool if this is for a single pool being 
processed?
Fixed.

bq. should we change “<” to “>” to indicate that we have done waiting for the 
maxWaitTime?
Thanks for catching this. Fixed.

bq. one UNKNOWN node in the pool can cost 100s. Should we reduce the maxTry 
from 1000 to 100 here?
Fixed.

bq. do we miss the size/keycount for the ContainerInfo of the ContainerReport?
Right now, we don't need that since all we are going to track is the container 
names and state.
We will get to this sometime later.


> Ozone: SCM:  Add the ability to handle container reports 
> -
>
> Key: HDFS-11493
> URL: https://issues.apache.org/jira/browse/HDFS-11493
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: container-replication-storage.pdf, 
> exploring-scalability-scm.pdf, HDFS-11493-HDFS-7240.001.patch, 
> HDFS-11493-HDFS-7240.002.patch, HDFS-11493-HDFS-7240.003.patch, 
> HDFS-11493-HDFS-7240.004.patch, HDFS-11493-HDFS-7240.005.patch
>
>
> Once a datanode sends the container report it is SCM's responsibility to 
> determine if the replication levels are acceptable. If it is not, SCM should 
> initiate a replication request to another datanode. This JIRA tracks how SCM  
> handles a container report.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087131#comment-16087131
 ] 

Hadoop QA commented on HDFS-11493:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 54s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 40s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 57s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 54s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 24s{color} | {color:red} hadoop-common-project/hadoop-common in HDFS-7240 has 19 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 39s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 18s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 24s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m 12s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.ozone.container.replication.TestContainerReplicationManager |
|   | hadoop.ozone.TestOzoneConfigurationFields |
| Timed out junit tests | org.apache.hadoop.ozone.container.ozoneimpl.TestRatisManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11493 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877250/HDFS-11493-HDFS-7240.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  cc  |
| uname | Linux 30dbdee852e8 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Li

[jira] [Commented] (HDFS-11596) hadoop-hdfs-client jar is in the wrong directory in release tarball

2017-07-14 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087162#comment-16087162
 ] 

Lars Francke commented on HDFS-11596:
-

This broke the ability to run Hadoop from the source tree as documented in the 
Wiki (https://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment) as 
the client dependency is missing from 
{{hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-3.0.0-*/share/hadoop/hdfs/lib}}.

I'm not questioning that this is a good change but I wonder how all of you guys 
are running Hadoop from source these days? We need to update the documentation 
and/or fix this situation somehow.

> hadoop-hdfs-client jar is in the wrong directory in release tarball
> ---
>
> Key: HDFS-11596
> URL: https://issues.apache.org/jira/browse/HDFS-11596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Andrew Wang
>Assignee: Yuanbo Liu
>Priority: Critical
> Fix For: 3.0.0-alpha4
>
> Attachments: HDFS-11596.001.patch, HDFS-11596.002.patch
>
>
> Mentioned by [~aw] on HDFS-11356. The hdfs-client jar is in the lib directory 
> rather than with the other hadoop jars:
> From the alpha2 artifacts:
> {noformat}
> -> % find . -name "*hdfs-client*.jar"
> ./share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-hdfs-client-3.0.0-alpha2.jar
> ./share/hadoop/hdfs/sources/hadoop-hdfs-client-3.0.0-alpha2-sources.jar
> ./share/hadoop/hdfs/sources/hadoop-hdfs-client-3.0.0-alpha2-test-sources.jar
> ./share/hadoop/hdfs/lib/hadoop-hdfs-client-3.0.0-alpha2.jar
> ./share/hadoop/hdfs/hadoop-hdfs-client-3.0.0-alpha2-tests.jar
> {noformat}
> Strangely enough, the tests jar is in the right place.






[jira] [Updated] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-07-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-10285:

Attachment: HDFS-10285-consolidated-merge-patch-00.patch

Thanks to all the contributors who helped build this feature. I have finished a 
pass rebasing all the changes made in the HDFS-10285 sub-tasks. I am uploading 
the consolidated patch to the umbrella JIRA to get the QA report.

Thanks [~umamaheswararao] for the offline discussions.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policy. These 
> policies can be set on a directory or file to specify the user's preference 
> for where to store the physical blocks. When the user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preference and are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> the blocks will already have been written with the default storage policy 
> (that is, DISK). The user then has to run the 'Mover tool' explicitly, 
> specifying all such file names as a list. In some distributed system 
> scenarios (e.g. HBase) it would be difficult to collect all the files and run 
> the tool, because different nodes can write files separately and the files 
> can have different paths.
> Another scenario is when the user renames a file from a directory with one 
> effective storage policy (inherited from the parent directory) to a directory 
> with a different effective storage policy. The inherited storage policy is 
> not copied from the source, so the file takes the policy of the destination 
> file/dir parent. The rename operation is just a metadata change in the 
> Namenode; the physical blocks still remain placed according to the source 
> storage policy.
> So tracking all such files, based on business logic, across distributed nodes 
> (e.g. region servers) and running the Mover tool could be difficult for 
> admins. 
> The proposal here is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send them to the DNs as movement commands. 
> Will post the detailed design thoughts document soon. 






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087317#comment-16087317
 ] 

Hadoop QA commented on HDFS-10285:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 23 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 21s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 41s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 4s{color} | {color:orange} hadoop-hdfs-project: The patch generated 25 new + 1967 unchanged - 1 fixed = 1992 total (was 1968) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 12s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 4s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-10285 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877278/HDFS-10285-consolidated-merge-patch-00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  cc  xml  |
| uname | Linux 65179995d79e 3.13.0-

[jira] [Commented] (HDFS-12120) Use new block for pre-RollingUpgrade files' append requests

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087393#comment-16087393
 ] 

Kihwal Lee commented on HDFS-12120:
---

It sounds fine conceptually, with a few concerns:
- The variable-length block feature is new and has not been fully field tested. 
Making an existing popular feature depend on it carries risk.  We need to think 
about how the risk can be mitigated, e.g. by providing a way to opt out in case 
it creates issues.
- We need to think about compatibility issues (interoperability between new/old 
servers and clients). Any limitations should be mentioned in the release note.
- Using the blockID for the cutoff may not be reliable. Old clusters can still 
have old blocks with old-style IDs.  The gen stamp might work better.


> Use new block for pre-RollingUpgrade files' append requests
> ---
>
> Key: HDFS-12120
> URL: https://issues.apache.org/jira/browse/HDFS-12120
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-12120-01.patch
>
>
> After the RollingUpgrade prepare, an append on a pre-RU file will re-open the 
> same last block and make changes to it (appending extra data, changing the 
> genstamp, etc.).
> These changes to the block will not be tracked in Datanodes (either in trash 
> or via hardlinks).
> This creates a problem if RollingUpgrade.Rollback is called.
> Since both the block state and size have changed, after rollback the block 
> will be marked corrupted.
> To avoid this, the first append on a pre-RU file can be forced to write to a 
> new block.






[jira] [Updated] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12137:
---
Attachment: HDFS-12137.trunk.patch

Reposting trunk to kick precommit just to be thorough.

> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  Its unfair nature can be 
> especially harmful to heartbeat handling.  Under high load, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or is falsely declared dead.  The unfair lock is not 
> helping and appears to be causing frequent starvation under load.
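
For readers unfamiliar with the distinction: in java.util.concurrent, fairness 
is a constructor flag on the lock. The sketch below is a generic illustration 
with hypothetical names, not the DataNode's dataset lock or the attached 
patch.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration; this is not the DataNode's dataset lock. */
final class FairLockSketch {
  // 'true' selects the fair ordering policy: under contention the lock
  // favors the longest-waiting thread instead of allowing barging.
  private final ReentrantLock lock = new ReentrantLock(true);
  private long heartbeatsHandled;

  /** Stand-in for work that must run under the lock. */
  void handleHeartbeat() {
    lock.lock();
    try {
      heartbeatsHandled++;
    } finally {
      lock.unlock();
    }
  }

  long heartbeatsHandled() {
    lock.lock();
    try {
      return heartbeatsHandled;
    } finally {
      lock.unlock();
    }
  }

  boolean isFair() {
    return lock.isFair();
  }
}
```

The usual trade-off is that a fair lock tends to have lower overall throughput 
than an unfair one, in exchange for bounding how long any one waiter is starved.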






[jira] [Created] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12140:
--

 Summary: Remove BPOfferService lock contention to get block pool id
 Key: HDFS-12140
 URL: https://issues.apache.org/jira/browse/HDFS-12140
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical


The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
excessive contention especially for xceivers threads attempting to queue IBRs 
and heartbeat processing.  When the latter is delayed due to excessive 
FSDataset lock contention, it causes pipelines to collapse.

Accessing the block pool id should be lockless after registration.






[jira] [Updated] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12140:
---
Attachment: HDFS-12140.trunk.patch

Simply sets/clears a volatile for the bp id when the ns info is updated.  
Retains the blocking behavior prior to setting the ns info, but lockless access 
thereafter.

Will post branch-2 patch shortly.  Minor change needed for the absent "quiet" 
boolean.
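
The approach described in this comment (a volatile that is set or cleared when 
the ns info is updated, blocking before it is set, lockless afterwards) could 
be sketched as follows. All names are hypothetical, and throwing on a missing 
id is an assumption made for the example; the real BPOfferService differs.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for illustration; not the real BPOfferService. */
final class BlockPoolIdHolderSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Written under the write lock; read without any lock once non-null.
  private volatile String bpId;

  /** Called when namespace info arrives; pass null to clear the id. */
  void setNamespaceInfo(String newBpId) {
    lock.writeLock().lock();
    try {
      bpId = newBpId;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Lockless fast path once the id is set; takes the lock before that. */
  String getBlockPoolId() {
    String id = bpId; // single volatile read, reused for check and return
    if (id != null) {
      return id;
    }
    lock.readLock().lock(); // slow path: may block behind a writer
    try {
      id = bpId;
      if (id == null) {
        // Assumption for this sketch: fail loudly when not yet registered.
        throw new IllegalStateException("Block pool ID is not yet known");
      }
      return id;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Reading the volatile once into a local means the hot path costs one memory 
read and never touches the lock after registration.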

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Commented] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087437#comment-16087437
 ] 

Daryn Sharp commented on HDFS-12140:


Also avoids a silly double fetch/penalty for a precondition's error message!

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Updated] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12140:
---
Status: Patch Available  (was: Open)

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Updated] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12140:
---
Attachment: HDFS-12140.branch-2.8.patch

Trunk applies to branch-2 as well.  Just need 2.8 change.

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.branch-2.8.patch, HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Created] (HDFS-12141) [SPS]: Fix checkstyle warnings

2017-07-14 Thread Rakesh R (JIRA)
Rakesh R created HDFS-12141:
---

 Summary: [SPS]: Fix checkstyle warnings
 Key: HDFS-12141
 URL: https://issues.apache.org/jira/browse/HDFS-12141
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R


This sub-task is to fix the applicable checkstyle warnings in HDFS-10285 
branch. Attached the checkstyle report.






[jira] [Updated] (HDFS-12141) [SPS]: Fix checkstyle warnings

2017-07-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-12141:

Attachment: diff-checkstyle-hadoop-hdfs-project.txt

> [SPS]: Fix checkstyle warnings
> --
>
> Key: HDFS-12141
> URL: https://issues.apache.org/jira/browse/HDFS-12141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: diff-checkstyle-hadoop-hdfs-project.txt
>
>
> This sub-task is to fix the applicable checkstyle warnings in HDFS-10285 
> branch. Attached the checkstyle report.






[jira] [Created] (HDFS-12142) Files may be closed before streamer is done

2017-07-14 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12142:
--

 Summary: Files may be closed before streamer is done
 Key: HDFS-12142
 URL: https://issues.apache.org/jira/browse/HDFS-12142
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 2.8.0
Reporter: Daryn Sharp


We're encountering multiple cases of clients calling updateBlockForPipeline on 
completed blocks.  Initial analysis is the client closes a file, completeFile 
succeeds, then it immediately attempts recovery.  The exception is swallowed on 
the client, only logged on the NN by checkUCBlock.

The problem "appears" to be benign (no data loss) but it's unproven if the 
issue always occurs for successfully closed files.  There appears to be very 
poor coordination between the dfs output stream's threads which leads to races 
that confuse the streamer thread – which probably should have been joined 
before returning from close.
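
To make the last point concrete, here is a minimal sketch of a close() that 
does not return until its background thread has finished. The class and its 
fields are hypothetical and deliberately simplified; this is not 
DFSOutputStream or DataStreamer code.

```java
/** Hypothetical, simplified stream; not DFSOutputStream. */
final class JoiningStreamSketch {
  private final Thread streamer;
  private volatile boolean closed;
  private volatile boolean streamerFinished;

  JoiningStreamSketch() {
    streamer = new Thread(() -> {
      // Stand-in for the streamer loop: run until close is requested.
      while (!closed) {
        try {
          Thread.sleep(5);
        } catch (InterruptedException e) {
          break; // close() interrupts us to wake us up promptly
        }
      }
      streamerFinished = true;
    }, "streamer-sketch");
  }

  void start() {
    streamer.start();
  }

  /** Signals the streamer to stop and waits for it before returning. */
  void close() throws InterruptedException {
    closed = true;
    streamer.interrupt();
    streamer.join(); // no streamer activity can happen after this returns
  }

  boolean isStreamerFinished() {
    return streamerFinished;
  }
}
```

With the join in place, nothing the streamer does can race with whatever the 
caller does after close() returns.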






[jira] [Updated] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-11874:
---
Attachment: ArchivalStorage.html
HDFS-11874-HDFS-10285-004.patch

A minor fix: added the 'start' option to the reconfig command. Thanks 
[~rakeshr] for pointing it out.
Also corrected the quotes for the config parameter.  

> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Created] (HDFS-12143) Improve performance of getting and removing inode features

2017-07-14 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12143:
--

 Summary: Improve performance of getting and removing inode features
 Key: HDFS-12143
 URL: https://issues.apache.org/jira/browse/HDFS-12143
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.8.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp


Getting a feature uses an iterator, which is less performant than an indexed for 
loop.  Feature lookups are becoming more frequent, so cycles count.

Removing a feature requires building a string for up to 3 precondition checks.  
A successful removal pays the worst case of all 3.
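
A minimal sketch of the two ideas above, under stated assumptions: Feature, 
QuotaFeature, AclFeature and FeatureHolder are hypothetical stand-ins written for 
this illustration, not the NameNode classes touched by the patch. The lookup walks 
the backing array by index, and removal builds its error message only on the 
failure path.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an inode feature; not the real Hadoop interface. */
interface Feature {}

final class QuotaFeature implements Feature {}

final class AclFeature implements Feature {}

/** Illustrative holder; the name and layout are assumptions for this sketch. */
final class FeatureHolder {
  private Feature[] features = new Feature[0];

  void add(Feature f) {
    features = Arrays.copyOf(features, features.length + 1);
    features[features.length - 1] = f;
  }

  /** Indexed loop over the backing array: no Iterator object is created. */
  <T extends Feature> T get(Class<T> type) {
    for (int i = 0; i < features.length; i++) {
      Feature f = features[i];
      if (type.isInstance(f)) {
        return type.cast(f);
      }
    }
    return null;
  }

  /** Removes f by identity; a successful removal builds no strings at all. */
  void remove(Feature f) {
    for (int i = 0; i < features.length; i++) {
      if (features[i] == f) {
        Feature[] rest = new Feature[features.length - 1];
        System.arraycopy(features, 0, rest, 0, i);
        System.arraycopy(features, i + 1, rest, i, rest.length - i);
        features = rest;
        return;
      }
    }
    // Failure path only: the message string is concatenated here and nowhere else.
    throw new IllegalStateException(
        "Feature " + f.getClass().getSimpleName() + " not found.");
  }

  int size() {
    return features.length;
  }
}
```

The design point is that the common (successful) path does no string work; only 
a failed precondition pays for the message.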






[jira] [Comment Edited] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087505#comment-16087505
 ] 

Uma Maheswara Rao G edited comment on HDFS-11874 at 7/14/17 3:49 PM:
-

A minor fix: added the 'start' option to the reconfig command. Thanks [~rakeshr] for 
pointing this out.
Also corrected the quotes around the config parameter in the command section, from 
{quote}configuration item 'dfs.storage.policy.satisfier.activate' in 
configuration file{quote} to {quote}configuration item 
`dfs.storage.policy.satisfier.activate` in configuration file{quote}


was (Author: umamaheswararao):
A minor fix: added 'start' option in reconfig command. Thanks [~rakeshr] for 
pointing me.
Also corrected quotes for config parameter.  

> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Updated] (HDFS-12143) Improve performance of getting and removing inode features

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12143:
---
Status: Patch Available  (was: Open)

> Improve performance of getting and removing inode features
> --
>
> Key: HDFS-12143
> URL: https://issues.apache.org/jira/browse/HDFS-12143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-12143.patch
>
>
> Getting a feature uses an iterator, which is less performant than an indexed 
> for loop.  Feature lookups are becoming more frequent, so cycles count.
> Removing a feature requires building a string for up to 3 precondition 
> checks.  A successful removal pays the worst case of all 3.






[jira] [Updated] (HDFS-12143) Improve performance of getting and removing inode features

2017-07-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12143:
---
Attachment: HDFS-12143.patch

> Improve performance of getting and removing inode features
> --
>
> Key: HDFS-12143
> URL: https://issues.apache.org/jira/browse/HDFS-12143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-12143.patch
>
>
> Getting a feature uses an iterator, which is less performant than an indexed 
> for loop.  Feature lookups are becoming more frequent, so cycles count.
> Removing a feature requires building a string for up to 3 precondition 
> checks.  A successful removal pays the worst case of all 3.






[jira] [Updated] (HDFS-12083) Ozone: KSM: previous key has to be excluded from result in listVolumes, listBuckets and listKeys

2017-07-14 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-12083:
-
Affects Version/s: HDFS-7240
 Hadoop Flags: Reviewed
 Target Version/s: HDFS-7240
Fix Version/s: HDFS-7240

Committed this to feature branch. Thanks [~nandakumar131] for the contribution!

> Ozone: KSM: previous key has to be excluded from result in listVolumes, 
> listBuckets and listKeys
> 
>
> Key: HDFS-12083
> URL: https://issues.apache.org/jira/browse/HDFS-12083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Nandakumar
>Assignee: Nandakumar
>Priority: Critical
> Fix For: HDFS-7240
>
> Attachments: HDFS-12083-HDFS-7240.000.patch, 
> HDFS-12083-HDFS-7240.001.patch
>
>
> When the previous key is set in a list call (listVolumes, listBuckets or 
> listKeys), the result includes the previous key. There is no need to have it in 
> the result. 
> Since the previous key is part of the result, we never receive an 
> empty list in subsequent list calls. This makes it difficult to define an 
> exit criterion when we want to get all the values using multiple list calls 
> (with previous-key set).
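
The exit criterion described above can be sketched as follows. This is only an 
illustration under stated assumptions: ListPagingSketch, listKeys and listAll are 
hypothetical names, and a TreeSet stands in for the KSM metadata store. Because 
the previous key is excluded from each page, the page after the last key is 
empty, and that empty page ends the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Illustrative paging helper; not the KSM implementation. */
final class ListPagingSketch {

  /**
   * Returns up to maxKeys keys strictly greater than prevKey,
   * or from the start of the store when prevKey is null.
   */
  static List<String> listKeys(TreeSet<String> store, String prevKey, int maxKeys) {
    // tailSet(prevKey, false) excludes prevKey itself from the view.
    NavigableSet<String> source =
        (prevKey == null) ? store : store.tailSet(prevKey, false);
    List<String> page = new ArrayList<>();
    for (String key : source) {
      if (page.size() >= maxKeys) {
        break;
      }
      page.add(key);
    }
    return page;
  }

  /** Collects every key by paging until an empty page comes back. */
  static List<String> listAll(TreeSet<String> store, int pageSize) {
    List<String> all = new ArrayList<>();
    String prev = null;
    while (true) {
      List<String> page = listKeys(store, prev, pageSize);
      if (page.isEmpty()) {
        break; // exit criterion: nothing lies after the previous key
      }
      all.addAll(page);
      prev = page.get(page.size() - 1);
    }
    return all;
  }
}
```

If the previous key were included instead, every page would contain at least one 
element and the loop above would never terminate.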






[jira] [Updated] (HDFS-12083) Ozone: KSM: previous key has to be excluded from result in listVolumes, listBuckets and listKeys

2017-07-14 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-12083:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Ozone: KSM: previous key has to be excluded from result in listVolumes, 
> listBuckets and listKeys
> 
>
> Key: HDFS-12083
> URL: https://issues.apache.org/jira/browse/HDFS-12083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Nandakumar
>Assignee: Nandakumar
>Priority: Critical
> Fix For: HDFS-7240
>
> Attachments: HDFS-12083-HDFS-7240.000.patch, 
> HDFS-12083-HDFS-7240.001.patch
>
>
> When the previous key is set in a list call (listVolumes, listBuckets or 
> listKeys), the result includes the previous key. There is no need to have it in 
> the result. 
> Since the previous key is part of the result, we never receive an 
> empty list in subsequent list calls. This makes it difficult to define an 
> exit criterion when we want to get all the values using multiple list calls 
> (with previous-key set).






[jira] [Commented] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087525#comment-16087525
 ] 

Hadoop QA commented on HDFS-11874:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 26s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  4s{color} | {color:green} HDFS-10285 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 20s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11874 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877326/HDFS-11874-HDFS-10285-004.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 6252122b8ba8 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / d685df4 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20274/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Commented] (HDFS-12142) Files may be closed before streamer is done

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087565#comment-16087565
 ] 

Kihwal Lee commented on HDFS-12142:
---

The following appears after the file is successfully closed. It seems 
DataStreamer is sometimes left running and the regular pipeline shutdown is 
somehow recognized as a failure.

{noformat}
2017-07-10 20:19:11,870 [IPC Server handler 72 on 8020] INFO ipc.Server: IPC Server handler 72 on 8020, call Call#99 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline from x.x.x.x:50972
java.io.IOException: Unexpected BlockUCState: BP-yyy:blk_1230_1 is COMPLETE but not UNDER_CONSTRUCTION
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5509)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:5576)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:918)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:971)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:448)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:881)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:810)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1936)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2523)
{noformat}

The blocks are all finalized normally and had no data loss, but until we know 
the actual cause of this, I can't be sure whether it will cause any data loss.

> Files may be closed before streamer is done
> ---
>
> Key: HDFS-12142
> URL: https://issues.apache.org/jira/browse/HDFS-12142
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>
> We're encountering multiple cases of clients calling updateBlockForPipeline 
> on completed blocks.  Initial analysis is the client closes a file, 
> completeFile succeeds, then it immediately attempts recovery.  The exception 
> is swallowed on the client, only logged on the NN by checkUCBlock.
> The problem "appears" to be benign (no data loss) but it's unproven if the 
> issue always occurs for successfully closed files.  There appears to be very 
> poor coordination between the dfs output stream's threads which leads to 
> races that confuse the streamer thread – which probably should have been 
> joined before returning from close.
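
As a sketch of what "joined before returning from close" could look like, and 
not the actual DFSOutputStream/DataStreamer code: JoinOnCloseStream and all of 
its members are hypothetical names for this illustration. The writer hands 
packets to a background streamer thread through a queue, and close() enqueues an 
end marker and then joins the thread, so no streamer activity can happen after 
close() returns.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative output stream with a background streamer thread. */
final class JoinOnCloseStream {
  /** Sentinel compared by identity; tells the streamer to stop. */
  private static final byte[] END = new byte[0];

  private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
  private final AtomicInteger bytesSent = new AtomicInteger();
  private final Thread streamer;

  JoinOnCloseStream() {
    streamer = new Thread(() -> {
      try {
        while (true) {
          byte[] packet = queue.take();
          if (packet == END) {
            return; // normal shutdown requested by close()
          }
          bytesSent.addAndGet(packet.length); // stands in for sending the packet
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "streamer-sketch");
    streamer.start();
  }

  void write(byte[] data) {
    queue.add(data.clone()); // unbounded queue, so add() does not block
  }

  /** Drains the queue and waits for the streamer to exit before returning. */
  void close() {
    queue.add(END);
    try {
      streamer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for streamer", e);
    }
  }

  boolean streamerAlive() {
    return streamer.isAlive();
  }

  int bytesSent() {
    return bytesSent.get();
  }
}
```

With the join in place, a caller that sees close() return knows every queued 
packet was processed and the streamer is no longer running.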






[jira] [Updated] (HDFS-12141) [SPS]: Fix checkstyle warnings

2017-07-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-12141:

Attachment: HDFS-12141-HDFS-10285-01.patch

Attached simple patch fixing applicable checkstyle warnings. 
[~umamaheswararao], please review, thanks!

> [SPS]: Fix checkstyle warnings
> --
>
> Key: HDFS-12141
> URL: https://issues.apache.org/jira/browse/HDFS-12141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: diff-checkstyle-hadoop-hdfs-project.txt, 
> HDFS-12141-HDFS-10285-01.patch
>
>
> This sub-task is to fix the applicable checkstyle warnings in HDFS-10285 
> branch. Attached the checkstyle report.






[jira] [Updated] (HDFS-12141) [SPS]: Fix checkstyle warnings

2017-07-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-12141:

Status: Patch Available  (was: Open)

> [SPS]: Fix checkstyle warnings
> --
>
> Key: HDFS-12141
> URL: https://issues.apache.org/jira/browse/HDFS-12141
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: diff-checkstyle-hadoop-hdfs-project.txt, 
> HDFS-12141-HDFS-10285-01.patch
>
>
> This sub-task is to fix the applicable checkstyle warnings in HDFS-10285 
> branch. Attached the checkstyle report.






[jira] [Commented] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087583#comment-16087583
 ] 

Rakesh R commented on HDFS-11874:
-

Thanks [~umamaheswararao] for the good docs. 
+1 LGTM, I will commit it shortly.

> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Updated] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-11874:

  Resolution: Fixed
   Fix Version/s: HDFS-10285
Target Version/s:   (was: HDFS-10285)
  Status: Resolved  (was: Patch Available)

> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: HDFS-10285
>
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Commented] (HDFS-11874) [SPS]: Document the SPS feature

2017-07-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087604#comment-16087604
 ] 

Rakesh R commented on HDFS-11874:
-

Committed to the branch!

> [SPS]: Document the SPS feature
> ---
>
> Key: HDFS-11874
> URL: https://issues.apache.org/jira/browse/HDFS-11874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: HDFS-10285
>
> Attachments: ArchivalStorage.html, ArchivalStorage.html, 
> ArchivalStorage.html, HDFS-11874-HDFS-10285-001.patch, 
> HDFS-11874-HDFS-10285-002.patch, HDFS-11874-HDFS-10285-003.patch, 
> HDFS-11874-HDFS-10285-004.patch
>
>
> This JIRA is for tracking the documentation about the feature






[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2017-07-14 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087608#comment-16087608
 ] 

Daryn Sharp commented on HDFS-12136:


The failed tests either aren't failing for me, or fail with or without this 
patch.  The volume failure tests are very flaky.  Apparently races let volume 
references leak, so the cluster can't shut down and timeouts occur.  Will kick 
the build again to cross-compare test failures.

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan 
> by reading the last checksum of finalized blocks within the {{BlockSender}} 
> ctor.  Unfortunately it holds the exclusive dataset lock to open and read 
> the metafile multiple times, so block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk i/o utilization or high 
> xceiver activity, e.g. lost node replication, balancing, or decommissioning.  
> The xceiver threads congest creating block senders and impair the heartbeat 
> processing that is contending for the same lock.  Combined with other lock 
> contention issues, pipelines break and nodes sporadically go dead.
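
The general remedy for this kind of serialization is to keep slow I/O out of the 
critical section. The sketch below shows only that general technique and makes 
no claim about what the attached patches do; NarrowLockSketch and its members are 
hypothetical names. The lock guards just the in-memory lookup, and the slow read 
runs after the lock is released.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Illustrative dataset with a single exclusive lock. */
final class NarrowLockSketch {
  private final ReentrantLock datasetLock = new ReentrantLock();
  private final Map<Long, String> metaPaths = new HashMap<>();

  void register(long blockId, String metaPath) {
    datasetLock.lock();
    try {
      metaPaths.put(blockId, metaPath);
    } finally {
      datasetLock.unlock();
    }
  }

  /**
   * Looks up the meta file path under the lock, then performs the slow
   * read (supplied by the caller) without holding the lock.
   */
  <T> T readLastChecksum(long blockId, Function<String, T> slowRead) {
    String path;
    datasetLock.lock();
    try {
      path = metaPaths.get(blockId); // cheap in-memory work only
    } finally {
      datasetLock.unlock();
    }
    return slowRead.apply(path); // disk I/O happens outside the lock
  }

  boolean lockHeldByCurrentThread() {
    return datasetLock.isHeldByCurrentThread();
  }
}
```

The trade-off is that the path may be stale by the time the read runs, so real 
code using this pattern has to tolerate or re-validate a block that changed in 
between.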






[jira] [Commented] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-14 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087652#comment-16087652
 ] 

Chen Liang commented on HDFS-12130:
---

Thanks [~szetszwo] for the review! The failed tests are unrelated, and passed 
locally.

> Optimizing permission check for getContentSummary
> -
>
> Key: HDFS-12130
> URL: https://issues.apache.org/jira/browse/HDFS-12130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch, 
> HDFS-12130.003.patch
>
>
> Currently, {{getContentSummary}} takes two phases to complete:
> - phase1. check the permission of the entire subtree. If any subdirectory 
> does not have {{READ_EXECUTE}}, an access control exception is thrown and 
> {{getContentSummary}} terminates here (unless it's super user).
> - phase2. If phase1 passed, it will then traverse the entire tree recursively 
> to get the actual content summary.
> One issue is that both phases currently hold the fs lock.
> Phase 2 is already written so that it yields the fs lock over time, 
> so it does not block other operations for too long. However, phase 1 
> does not yield, so the permission check phase can still 
> block other operations for a long time.
> One fix is to add lock yield to phase 1. But a simpler fix is to merge phase 
> 1 into phase 2. Namely, instead of doing a full traversal for permission 
> check first, we start with phase 2 directly, but for each directory, before 
> obtaining its summary, check its permission first. This way we take advantage 
> of the existing lock yield in the phase 2 code and are still able to check 
> permissions and terminate on an access exception.
> Thanks [~szetszwo] for the offline discussions!
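
The merged traversal can be sketched roughly as below. This is a simplified 
illustration under stated assumptions, not the NameNode code: SummarySketch, Dir 
and AccessDenied are hypothetical names, a boolean stands in for the 
READ_EXECUTE check, and the lock yield is omitted. Each directory is checked 
immediately before it is summarized, so a single pass does both jobs.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative single-pass content summary with inline permission checks. */
final class SummarySketch {

  /** Hypothetical directory node; 'readable' stands in for READ_EXECUTE. */
  static final class Dir {
    final String name;
    final boolean readable;
    final long fileBytes;
    final List<Dir> children = new ArrayList<>();

    Dir(String name, boolean readable, long fileBytes) {
      this.name = name;
      this.readable = readable;
      this.fileBytes = fileBytes;
    }

    Dir add(Dir child) {
      children.add(child);
      return this;
    }
  }

  /** Hypothetical unchecked stand-in for an access control exception. */
  static final class AccessDenied extends RuntimeException {
    AccessDenied(String dirName) {
      super("Permission denied: " + dirName);
    }
  }

  /** Checks each directory right before summarizing it; stops on first denial. */
  static long contentSummary(Dir dir) {
    if (!dir.readable) {
      throw new AccessDenied(dir.name);
    }
    long total = dir.fileBytes;
    for (Dir child : dir.children) {
      total += contentSummary(child);
    }
    return total;
  }
}
```

One consequence of merging the phases is that a denial deep in the tree is only 
discovered after part of the summary work has been done, whereas the two-phase 
version fails before any summary work starts.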






[jira] [Commented] (HDFS-12143) Improve performance of getting and removing inode features

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087650#comment-16087650
 ] 

Hadoop QA commented on HDFS-12143:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 22s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12143 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877330/HDFS-12143.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 7a22519594eb 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 75c0220 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20275/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20275/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20275/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Improve performance of getting and removing inode features
> --
>
> Key: HDFS-12143
> URL: https://issues.apache.org/jira/browse/HDFS-12143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versi

[jira] [Commented] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087678#comment-16087678
 ] 

Hadoop QA commented on HDFS-12140:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 41s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 40s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 42s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 22s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 59s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  2s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_131 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 38s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 42s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 18s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 57s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_131. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_131 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:d946387 |
| JIRA Issue | HDFS-12140 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877322/HDFS-12140.branch-2.8.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 57d424cc22d2 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.8 / d6228fb |
| Default Java |

[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

2017-07-14 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087703#comment-16087703
 ] 

Xiaoyu Yao commented on HDFS-11493:
---

Thanks [~anu] for the update. +1 for v5 patch. 

> Ozone: SCM:  Add the ability to handle container reports 
> -
>
> Key: HDFS-11493
> URL: https://issues.apache.org/jira/browse/HDFS-11493
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: container-replication-storage.pdf, 
> exploring-scalability-scm.pdf, HDFS-11493-HDFS-7240.001.patch, 
> HDFS-11493-HDFS-7240.002.patch, HDFS-11493-HDFS-7240.003.patch, 
> HDFS-11493-HDFS-7240.004.patch, HDFS-11493-HDFS-7240.005.patch
>
>
> Once a datanode sends the container report it is SCM's responsibility to 
> determine if the replication levels are acceptable. If it is not, SCM should 
> initiate a replication request to another datanode. This JIRA tracks how SCM  
> handles a container report.
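
To make the replication check described above concrete, here is a minimal sketch. It is not taken from the HDFS-11493 patch: the class name, the method name, and the idea of a single fixed required replica count are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the HDFS-11493 patch.
final class ReplicationCheckSketch {
  /**
   * Counts how many datanodes reported each container and returns the
   * sorted names of containers that have fewer replicas than required.
   *
   * @param reports  datanode id mapped to the container names it reported
   * @param required desired replica count (assumed to be one fixed value)
   */
  static List<String> underReplicated(Map<String, List<String>> reports,
      int required) {
    Map<String, Integer> replicaCount = new HashMap<>();
    for (List<String> containers : reports.values()) {
      for (String name : containers) {
        replicaCount.merge(name, 1, Integer::sum);
      }
    }
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Integer> e : replicaCount.entrySet()) {
      if (e.getValue() < required) {
        result.add(e.getKey());
      }
    }
    Collections.sort(result);
    return result;
  }
}
```

A container that no datanode reports at all never shows up in this count, so a real implementation would also need the set of containers SCM expects to exist.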



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087734#comment-16087734
 ] 

Kihwal Lee commented on HDFS-12137:
---

The trunk precommit build failed because it could not download the patch.  I 
just kicked off the build again, and it looks like it is going through this time.
{noformat}
Modes:  Sentinel  MultiJDK  Jenkins  Robot  Docker  ResetRepo  UnitTests 
Processing: HDFS-12137
ERROR: Unsure how to process HDFS-12137.
{noformat}

> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  The unfair nature can be 
> especially harmful to the heartbeat handling.  Under high loads, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or is falsely declared dead.  The unfair lock is not helping 
> and appears to be causing frequent starvation under load.
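
As an illustration of what a fair lock means here, the sketch below uses the fairness flag of java.util.concurrent.locks.ReentrantLock. It assumes the dataset lock is backed by a java.util.concurrent lock; the class, field, and method names are hypothetical and are not from the attached patches.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; the class and member names are not from the patch.
final class FairDatasetLockSketch {
  // Passing true requests the fair ordering policy: under contention the
  // lock favors granting access to the longest-waiting thread.
  private final ReentrantLock lock = new ReentrantLock(true);
  private long value;

  boolean isFair() {
    return lock.isFair();
  }

  /** Adds delta to the guarded value while holding the fair lock. */
  long add(long delta) {
    lock.lock();
    try {
      value += delta;
      return value;
    } finally {
      lock.unlock();
    }
  }
}
```

Fair locks generally trade some throughput for bounded waiting, and the untimed tryLock() does not honor the fairness setting, so whether this helps depends on how the lock is actually acquired.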






[jira] [Updated] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11493:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

[~cheersyang], [~xyao] Thanks for the reviews. I have committed this to the 
feature branch.







[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087778#comment-16087778
 ] 

Hadoop QA commented on HDFS-12136:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 46s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12136 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877142/HDFS-12136.trunk.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2376c0df5944 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 75c0220 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20277/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20277/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20277/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20277/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> BlockSender performance regression due to vo

[jira] [Commented] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087804#comment-16087804
 ] 

Anu Engineer commented on HDFS-12123:
-

*OzoneClientImpl.java*

*  Line 100: Please add a Preconditions.checkNotNull(conf);
*  Line 229: Please add Preconditions.checkState(quota >= 0);
*  Lines 273, 281, ...: The compiler warns that a bare null is confusing; please 
rewrite it as (OzoneAcl) null.
*  Line 317: No change is needed, just a comment. We don't have a good user 
model yet, nor an ACL model. At some point you might want to define them in 
  detail. Let us get this code in for now.
*  Line 435: // TODO: the following createContainer and key writes may fail, in 
which
// case we should revert the above allocateKey to KSM.
  I think this is a JIRA that we should file against KSM. Clients might fail 
before they can do this, so it is really KSM that needs to keep track of this 
and clean up. I know I wrote this comment a long time ago, so I am taking this 
opportunity to fix my mistake.
*  Line 442: Nit: That statement can fit on a single line.
*  Line 445: Nit: 81 chars?
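
For readers unfamiliar with the null warning mentioned for lines 273 and 281, one plausible reading is that null is being passed to a varargs parameter. The sketch below shows why the cast matters and also mirrors the two requested argument checks using only the JDK (java.util.Objects instead of Guava's Preconditions). All names are hypothetical, and String stands in for OzoneAcl.

```java
import java.util.Objects;

// Hypothetical sketch: VolumeArgsSketch and its methods are illustrative only.
final class VolumeArgsSketch {
  /** Returns the number of ACL arguments received, or -1 for a null array. */
  static int aclCount(String... acls) {
    return acls == null ? -1 : acls.length;
  }

  /** Validates arguments up front, in the spirit of the requested checks. */
  static long checkedQuota(Object conf, long quota) {
    // Throws NullPointerException, like Preconditions.checkNotNull.
    Objects.requireNonNull(conf, "conf");
    // Throws IllegalStateException, like Preconditions.checkState.
    if (quota < 0) {
      throw new IllegalStateException("quota must be >= 0, got " + quota);
    }
    return quota;
  }
}
```

Calling aclCount((String) null) passes a one-element array containing null, while aclCount((String[]) null) passes a null array; the explicit cast tells both the compiler and the reader which of the two is intended.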


I don't like the fact that we cannot assert in *TestOzoneClientImpl*, but I 
understand why. I hope your next patch will enable full-scale testing.

I am +1 with these minor nits addressed.


> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will provide the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide lots of flexibility to Ozone applications, when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.
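
The design described above, one client interface with interchangeable RPC and REST implementations, can be sketched roughly as follows. This is only an illustration: the interface is reduced to a single made-up operation, the implementations return strings instead of making network calls, and every name is hypothetical rather than taken from the patch.

```java
// Hypothetical sketch; the real OzoneClient interface is not reproduced here.
interface ClientSketch {
  String createVolume(String name);
}

// Stand-in for an implementation that would talk RPC to KSM, SCM and DataNodes.
final class RpcClientSketch implements ClientSketch {
  @Override
  public String createVolume(String name) {
    return "rpc:createVolume:" + name;
  }
}

// Stand-in for an implementation that would issue REST calls instead.
final class RestClientSketch implements ClientSketch {
  @Override
  public String createVolume(String name) {
    return "rest:createVolume:" + name;
  }
}

final class ClientFactorySketch {
  /** Picks the transport based on where the application runs. */
  static ClientSketch forLocation(boolean insideCluster) {
    return insideCluster ? new RpcClientSketch() : new RestClientSketch();
  }
}
```

Application code written against ClientSketch stays the same in both cases; only the factory decision changes.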






[jira] [Commented] (HDFS-12141) [SPS]: Fix checkstyle warnings

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087811#comment-16087811
 ] 

Hadoop QA commented on HDFS-12141:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
25s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-10285 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
37s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-10285 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-10285 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 184 unchanged - 15 fixed = 186 total (was 199) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.TestDFSClientExcludedNodes |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12141 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877340/HDFS-12141-HDFS-10285-01.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1c5927b17a11 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-10285 / d685df4 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20276/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20276/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20276/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Result

[jira] [Commented] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087839#comment-16087839
 ] 

Anu Engineer commented on HDFS-12115:
-

Other than an expected failure in TestKeys, all test failures look like they 
are unrelated to this patch.

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.
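
A rough sketch of the kind of query described above is shown below. The enum values, the method signature, and the in-memory node map are assumptions for illustration; the actual RPC and its protobuf types are defined in the patch and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the state names and signature are assumptions.
final class QueryNodeSketch {
  enum NodeState { HEALTHY, STALE, DEAD }

  /** Returns the sorted ids of nodes whose state is in the wanted set. */
  static List<String> queryNode(Map<String, NodeState> nodes,
      EnumSet<NodeState> wanted) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, NodeState> e : nodes.entrySet()) {
      if (wanted.contains(e.getValue())) {
        result.add(e.getKey());
      }
    }
    Collections.sort(result);
    return result;
  }
}
```

A CLI could call this once per state, or pass several states at once to list, for example, every node that is not healthy.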






[jira] [Commented] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087917#comment-16087917
 ] 

Hadoop QA commented on HDFS-12137:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
42s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12137 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877315/HDFS-12137.trunk.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bf376120b301 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 75c0220 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20278/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20278/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20278/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20278/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
>  

[jira] [Commented] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087922#comment-16087922
 ] 

Rushabh S Shah commented on HDFS-12140:
---

[~daryn]: Thanks for the patch.
Precommit ran only on the branch-2.8 version of the patch and not on trunk.
Please re-attach the trunk patch.

I also did a code review.
Overall the patch looks good.
I have just one concern, in the test case.
You have used Mockito.reset a few times.
The Mockito documentation advises against using the reset method and suggests 
writing simple, small tests instead.
But you are testing only a small portion of code, so I am +1 on the patch.

+1 (non-binding)

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.branch-2.8.patch, HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.
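
One common way to make such a read lockless after registration is to publish the id through a volatile field that is written once. The sketch below illustrates that pattern only; it is an assumption about the general approach, not a description of the attached patch, and all names are hypothetical.

```java
// Hypothetical sketch of a write-once, lock-free-read id holder.
final class BlockPoolIdSketch {
  private final Object lock = new Object();
  // volatile so that readers see the id as soon as registration sets it.
  private volatile String bpId;

  /** Called during registration; only the first id is kept. */
  void register(String id) {
    synchronized (lock) {
      if (bpId == null) {
        bpId = id;
      }
    }
  }

  /** Lock-free read: returns null before registration, the id afterwards. */
  String getBlockPoolId() {
    return bpId;
  }
}
```

Readers such as xceiver threads then never contend on the lock; only the rare registration path synchronizes.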






[jira] [Assigned] (HDFS-5896) DataXceiver and DFSOutputStream call Thread.setName(), which is not thread safe

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDFS-5896:


Assignee: Bharat Viswanadham

> DataXceiver and DFSOutputStream call Thread.setName(), which is not thread 
> safe
> ---
>
> Key: HDFS-5896
> URL: https://issues.apache.org/jira/browse/HDFS-5896
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hiroshi Ikeda
>Assignee: Bharat Viswanadham
>Priority: Trivial
>
> org.apache.hadoop.hdfs.server.datanode.DataXceiver and 
> org.apache.hadoop.hdfs.DFSOutputStream rename running threads, but 
> Thread.setName() is not thread safe. Thread.setName() is currently intended 
> to be called before running the thread.






[jira] [Commented] (HDFS-5896) DataXceiver and DFSOutputStream call Thread.setName(), which is not thread safe

2017-07-14 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087936#comment-16087936
 ] 

Bharat Viswanadham commented on HDFS-5896:
--

Hi [~ikeda],
See this JDK bug for the setName/getName thread-safety fix: 
https://bugs.openjdk.java.net/browse/JDK-8010182
The methods setName and getName are synchronized as of JDK 8, so I think this 
will not be an issue with JDK 8.
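
For context, the pattern under discussion is a running thread renaming itself while it works. A minimal sketch of that pattern, with a hypothetical helper that restores the old name afterwards, might look like this; it assumes, as the comment above argues, that renaming a live thread is safe on the JDK in use.

```java
import java.util.function.Supplier;

// Hypothetical sketch; this helper is not part of Hadoop.
final class ThreadRenameSketch {
  /** Runs work with the current thread temporarily renamed to tempName. */
  static <T> T runRenamed(String tempName, Supplier<T> work) {
    Thread current = Thread.currentThread();
    String oldName = current.getName();
    current.setName(tempName);
    try {
      return work.get();
    } finally {
      // Restore the original name even if work throws.
      current.setName(oldName);
    }
  }
}
```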







[jira] [Comment Edited] (HDFS-5896) DataXceiver and DFSOutputStream call Thread.setName(), which is not thread safe

2017-07-14 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087936#comment-16087936
 ] 

Bharat Viswanadham edited comment on HDFS-5896 at 7/14/17 8:03 PM:
---

Hi [~ikeda],
See this JDK bug for the setName/getName thread-safety fix: 
https://bugs.openjdk.java.net/browse/JDK-8010182
The methods setName and getName are synchronized and name is declared as 
volatile in the Thread class, so I think this will not be an issue.


was (Author: bharatviswa):
Hi [~ikeda],
See this JDK bug for the setName/getName thread-safety fix: 
https://bugs.openjdk.java.net/browse/JDK-8010182
The methods setName and getName are synchronized as of JDK 8, so I think this 
will not be an issue with JDK 8.







[jira] [Updated] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-12123:
--
Attachment: HDFS-12123-HDFS-7240.001.patch

> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch, 
> HDFS-12123-HDFS-7240.001.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will provide the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide lots of flexibility to Ozone applications, when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.
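
The abstraction described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions: the interface, its three methods, both implementation classes, and the factory method are hypothetical stand-ins, and the "transport" is just an in-memory list. It is not the actual OzoneClient API from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for the OzoneClient interface.
interface SketchOzoneClient {
    void createVolume(String volumeName);
    List<String> listVolumes();
    String protocol();
}

// Shared in-memory state that stands in for the real KSM/SCM/DataNode calls.
abstract class InMemorySketchClient implements SketchOzoneClient {
    private final List<String> volumes = new ArrayList<>();

    @Override
    public void createVolume(String volumeName) {
        volumes.add(volumeName);
    }

    @Override
    public List<String> listVolumes() {
        return new ArrayList<>(volumes); // defensive copy
    }
}

// Would talk RPC to the cluster in a real implementation.
class RpcSketchClient extends InMemorySketchClient {
    @Override
    public String protocol() { return "RPC"; }
}

// Would talk REST to the cluster in a real implementation.
class RestSketchClient extends InMemorySketchClient {
    @Override
    public String protocol() { return "REST"; }
}

class OzoneClientSketch {
    // Application code depends only on the interface; the transport is chosen here.
    static SketchOzoneClient clientFor(boolean insideCluster) {
        return insideCluster ? new RpcSketchClient() : new RestSketchClient();
    }

    public static void main(String[] args) {
        SketchOzoneClient client = clientFor(false);
        client.createVolume("vol1");
        System.out.println(client.protocol() + " " + client.listVolumes());
    }
}
```

The point of the design is that the calling application is identical in both cases; only the factory decision changes.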






[jira] [Commented] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087992#comment-16087992
 ] 

Nandakumar commented on HDFS-12123:
---

Thanks [~anu] for the review.

>> Line: 100, Please add a Preconditions.checkNotNull(conf);
Added
>> Line 229, Please add Preconditions.checkState(quota >= 0);
Added
>> Line 273, 281.. , Compiler warns that null is confusing, please re-write as 
>> (OzoneAcl) null.
Done
>> Line 442, Nit: That line can be in a single line.
Done
>> Line 445, Nit: 81 chars ??
Corrected
>> I think this is a JIRA that we should file against KSM.
I will create a JIRA to track the same

Uploaded patch v1 incorporating review comments.

> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch, 
> HDFS-12123-HDFS-7240.001.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will have the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide lots of flexibility to Ozone applications, when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.






[jira] [Commented] (HDFS-11948) Ozone: change TestRatisManager to check cluster with data

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087993#comment-16087993
 ] 

Anu Engineer commented on HDFS-11948:
-

[~szetszwo] Thanks for uploading the patch. From a code-change perspective this 
looks good. I think the unit test failure is related to this patch.
{code}
2017-07-14 13:21:13,357 [Thread-624] INFO  impl.LeaderElection 
(LeaderElection.java:askForVotes(127)) - 127.0.0.1:53437: begin an election in 
Term 592
2017-07-14 13:21:13,367 [nioEventLoopGroup-7-4] WARN  
channel.DefaultChannelPipeline (Slf4JLogger.java:warn(151)) - An 
exceptionCaught() event was fired, and it reached at the tail of the pipeline. 
It usually means the last handler in the pipeline did not handle the exception.
java.lang.IllegalStateException: STATE MISMATCHED: In 127.0.0.1:53431, current 
state STARTING is not one of the expected states [RUNNING]
{code}

I think on the server side we are not catching some exception and it is caught 
by netty default handler on the server side, hence that error is not propagated 
to the client side.

Also in the profiler, I see around 2025 threads being launched for this single 
test. Thought you might be interested in that.

> Ozone: change TestRatisManager to check cluster with data
> -
>
> Key: HDFS-11948
> URL: https://issues.apache.org/jira/browse/HDFS-11948
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-11948-HDFS-7240.20170614.patch
>
>
> TestRatisManager first creates multiple Ratis clusters.  Then it changes the 
> membership and closes some clusters.  However, it does not test the clusters 
> with data.






[jira] [Commented] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087997#comment-16087997
 ] 

Anu Engineer commented on HDFS-12123:
-

Thank you very much for the quick turnaround, I appreciate it. +1, pending 
Jenkins. I will commit this as soon as we have a Jenkins run.


> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch, 
> HDFS-12123-HDFS-7240.001.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will have the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide lots of flexibility to Ozone applications, when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.






[jira] [Commented] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088005#comment-16088005
 ] 

Xiaoyu Yao commented on HDFS-12115:
---

Patch v2 looks good to me. +1. 

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.






[jira] [Commented] (HDFS-12131) Add some of the FSNamesystem JMX values as metrics

2017-07-14 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088011#comment-16088011
 ] 

Rushabh S Shah commented on HDFS-12131:
---

Thanks [~xkrogen] for the patch.
Adding a test case in {{TestNameNodeMetrics.java}} would be good.

> Add some of the FSNamesystem JMX values as metrics
> --
>
> Key: HDFS-12131
> URL: https://issues.apache.org/jira/browse/HDFS-12131
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Attachments: HDFS-12131.000.patch
>
>
> A number of useful numbers are emitted via the FSNamesystem JMX, but not 
> through the metrics system. These would be useful to be able to track over 
> time, e.g. to alert on via standard metrics systems or to view trends and 
> rate changes:
> * NumLiveDataNodes
> * NumDeadDataNodes
> * NumDecomLiveDataNodes
> * NumDecomDeadDataNodes
> * NumDecommissioningDataNodes
> * NumStaleStorages
> This is a simple change that just requires annotating the JMX methods with 
> {{@Metric}}.
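
To illustrate the idea of exposing existing getters by annotating them, here is a self-contained sketch. It deliberately does not use Hadoop's metrics library: the `SketchMetric` annotation, the `SketchNamesystem` class, its hard-coded values, and the reflective collector are all hypothetical stand-ins for the `@Metric` mechanism the description refers to.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a metrics annotation; kept at runtime so it can be discovered.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchMetric {}

// Hypothetical bean with a few getters; the values are made up for the example.
class SketchNamesystem {
    @SketchMetric public int getNumLiveDataNodes() { return 3; }
    @SketchMetric public int getNumDeadDataNodes() { return 1; }
    public int getInternalCounter() { return 42; } // not annotated, so not exported
}

class MetricCollectorSketch {
    // Collects the current value of every zero-argument method carrying the annotation.
    static Map<String, Object> snapshot(Object source) {
        Map<String, Object> out = new TreeMap<>();
        for (Method m : source.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(SketchMetric.class) && m.getParameterCount() == 0) {
                try {
                    out.put(m.getName().replaceFirst("^get", ""), m.invoke(source));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("cannot read " + m.getName(), e);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(snapshot(new SketchNamesystem()));
    }
}
```

The appeal of the annotation approach is that the getter bodies stay untouched; only the annotation decides what is published.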






[jira] [Created] (HDFS-12144) Ozone: KSM: Cleanup of keys in KSM for failed clients.

2017-07-14 Thread Nandakumar (JIRA)
Nandakumar created HDFS-12144:
-

 Summary: Ozone: KSM: Cleanup of keys in KSM for failed clients.
 Key: HDFS-12144
 URL: https://issues.apache.org/jira/browse/HDFS-12144
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Reporter: Nandakumar


While writing data into Ozone, client can fail after the key is allocated in 
KSM. Cleanup of these keys has to be done.
Clients can fail after key allocation in KSM while writing data to the 
container, in such cases the key in KSM has to be deleted.
more context: [HDFS-12123 - comment | 
https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]
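
A rough sketch of the failure window, assuming a hypothetical in-memory key manager (`SketchKeyManager`) and a hypothetical `putKey` helper. It only covers the case where the client survives its own failed write and can roll back; a client that crashes outright would need some server-side cleanup, which this issue does not yet specify.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the key namespace kept by KSM.
class SketchKeyManager {
    private final Set<String> keys = new HashSet<>();

    void allocateKey(String key) { keys.add(key); }
    void deleteKey(String key) { keys.remove(key); }
    boolean hasKey(String key) { return keys.contains(key); }
}

class KeyCleanupSketch {
    // Allocates the key first, then writes the data; if the write fails,
    // the already-allocated key is deleted so it does not dangle.
    static boolean putKey(SketchKeyManager ksm, String key, Runnable containerWrite) {
        ksm.allocateKey(key);
        try {
            containerWrite.run();
            return true;
        } catch (RuntimeException e) {
            ksm.deleteKey(key); // roll back the allocation
            return false;
        }
    }

    public static void main(String[] args) {
        SketchKeyManager ksm = new SketchKeyManager();
        boolean ok = putKey(ksm, "key1", () -> { throw new IllegalStateException("write failed"); });
        System.out.println(ok + " " + ksm.hasKey("key1"));
    }
}
```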






[jira] [Updated] (HDFS-12144) Ozone: KSM: Cleanup of keys in KSM for failed clients

2017-07-14 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-12144:
--
Summary: Ozone: KSM: Cleanup of keys in KSM for failed clients  (was: 
Ozone: KSM: Cleanup of keys in KSM for failed clients.)

> Ozone: KSM: Cleanup of keys in KSM for failed clients
> -
>
> Key: HDFS-12144
> URL: https://issues.apache.org/jira/browse/HDFS-12144
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>
> While writing data into Ozone, client can fail after the key is allocated in 
> KSM. Cleanup of these keys has to be done.
> Clients can fail after key allocation in KSM while writing data to the 
> container, in such cases the key in KSM has to be deleted.
> more context: [HDFS-12123 - comment | 
> https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]






[jira] [Updated] (HDFS-12144) Ozone: KSM: Cleanup of keys in KSM for failed clients

2017-07-14 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-12144:
--
Description: 
While writing data into Ozone, client can fail after the key is allocated in 
KSM. Cleanup of these keys has to be done.
Clients can fail after key allocation in KSM while writing data to the 
container, in such cases the key in KSM has to be deleted.
more context: [HDFS-12123 - comment - TODO | 
https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]

  was:
While writing data into Ozone, client can fail after the key is allocated in 
KSM. Cleanup of these keys has to be done.
Clients can fail after key allocation in KSM while writing data to the 
container, in such cases the key in KSM has to be deleted.
more context: [HDFS-12123 - comment | 
https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]


> Ozone: KSM: Cleanup of keys in KSM for failed clients
> -
>
> Key: HDFS-12144
> URL: https://issues.apache.org/jira/browse/HDFS-12144
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>
> While writing data into Ozone, client can fail after the key is allocated in 
> KSM. Cleanup of these keys has to be done.
> Clients can fail after key allocation in KSM while writing data to the 
> container, in such cases the key in KSM has to be deleted.
> more context: [HDFS-12123 - comment - TODO | 
> https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]






[jira] [Commented] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088022#comment-16088022
 ] 

Kihwal Lee commented on HDFS-12137:
---

+1 the patch looks good.  The test failure is not related.

> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  The unfair nature can be 
> especially harmful to the heartbeat handling.  Under high loads, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or falsely declared dead.  The unfair lock is not helping 
> and appears to be causing frequent starvation under load.
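
For readers unfamiliar with lock fairness in Java: `java.util.concurrent.locks.ReentrantLock` takes a boolean fairness flag, and a fair lock favors the longest-waiting thread instead of letting newly arriving threads barge ahead. The sketch below assumes, without confirmation from the patch, that the dataset lock is backed by such a lock; the class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of a shared lock with a configurable fairness policy.
class FairLockSketch {
    private final ReentrantLock datasetLock;
    private long processed;

    FairLockSketch(boolean fair) {
        // 'true' requests the fair ordering policy (longest-waiting thread first).
        this.datasetLock = new ReentrantLock(fair);
    }

    boolean isFair() {
        return datasetLock.isFair();
    }

    // Stands in for any operation that must hold the dataset lock.
    void processHeartbeat() {
        datasetLock.lock();
        try {
            processed++;
        } finally {
            datasetLock.unlock();
        }
    }

    long processed() {
        datasetLock.lock();
        try {
            return processed;
        } finally {
            datasetLock.unlock();
        }
    }

    // Runs several contending threads and returns the total number of operations done.
    static long runWorkers(boolean fair, int threads, int perThread) {
        FairLockSketch dataset = new FairLockSketch(fair);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    dataset.processHeartbeat();
                }
            }, "worker-" + i);
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return dataset.processed();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(true, 4, 1000));
    }
}
```

Fair locks generally trade some raw throughput for bounded waiting, which matches the starvation concern raised in the description.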






[jira] [Commented] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088023#comment-16088023
 ] 

Nandakumar commented on HDFS-12123:
---

{quote}
>> I think this is a JIRA that we should file against KSM.
I will create a JIRA to track the same
{quote}
HDFS-12144 is for tracking the issue

> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch, 
> HDFS-12123-HDFS-7240.001.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will have the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide lots of flexibility to Ozone applications, when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.






[jira] [Updated] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-14 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-12130:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   2.9.0
   Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Chen!

> Optimizing permission check for getContentSummary
> -
>
> Key: HDFS-12130
> URL: https://issues.apache.org/jira/browse/HDFS-12130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch, 
> HDFS-12130.003.patch
>
>
> Currently, {{getContentSummary}} takes two phases to complete:
> - phase1. check the permission of the entire subtree. If any subdirectory 
> does not have {{READ_EXECUTE}}, an access control exception is thrown and 
> {{getContentSummary}} terminates here (unless it's super user).
> - phase2. If phase1 passed, it will then traverse the entire tree recursively 
> to get the actual content summary.
> An issue is, both phases currently hold the fs lock.
> Phase 2 is already written so that it yields the fs lock over time, 
> such that it does not block other operations for too long. However, phase 1 
> does not yield, meaning the permission check phase can still 
> block things for a long time.
> One fix is to add lock yield to phase 1. But a simpler fix is to merge phase 
> 1 into phase 2. Namely, instead of doing a full traversal for permission 
> check first, we start with phase 2 directly, but for each directory, before 
> obtaining its summary, check its permission first. This way we take advantage 
> of the existing lock yield in phase 2 code and are still able to check 
> permission and terminate on access exception.
> Thanks [~szetszwo] for the offline discussions!
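
The merged traversal can be sketched as a single recursive walk that checks each directory immediately before summarizing it. Everything here is a simplified assumption: `SketchDir`, the boolean `readExecute` flag, and `countFiles` are hypothetical, the summary is reduced to a file count, and the lock yielding that motivates the change is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical directory node: a name, a permission flag, and a count of direct files.
class SketchDir {
    final String name;
    final boolean readExecute;
    final long fileCount;
    final List<SketchDir> children = new ArrayList<>();

    SketchDir(String name, boolean readExecute, long fileCount) {
        this.name = name;
        this.readExecute = readExecute;
        this.fileCount = fileCount;
    }

    SketchDir add(SketchDir child) {
        children.add(child);
        return this;
    }
}

class ContentSummarySketch {
    // One pass instead of two: the permission check happens per directory,
    // right before that directory contributes to the summary.
    static long countFiles(SketchDir dir, boolean superUser) {
        if (!superUser && !dir.readExecute) {
            throw new SecurityException("Permission denied: " + dir.name);
        }
        long total = dir.fileCount;
        for (SketchDir child : dir.children) {
            total += countFiles(child, superUser);
        }
        return total;
    }

    public static void main(String[] args) {
        SketchDir root = new SketchDir("/", true, 2)
            .add(new SketchDir("/a", true, 3))
            .add(new SketchDir("/b", true, 5));
        System.out.println(countFiles(root, false));
    }
}
```

One behavioral consequence worth noting: with the merged walk, an access failure deep in the tree is only discovered after earlier directories have already been traversed, rather than up front.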






[jira] [Commented] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088034#comment-16088034
 ] 

Anu Engineer commented on HDFS-12115:
-

Thank you, but before I commit I will post one more patch, since I need to 
rebase this patch on the top of the tree. Another of my commits conflicts with 
this one. 

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.






[jira] [Commented] (HDFS-11948) Ozone: change TestRatisManager to check cluster with data

2017-07-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088042#comment-16088042
 ] 

Tsz Wo Nicholas Sze commented on HDFS-11948:


Thanks, Anu.  I will see why the tests use so many threads.


> Ozone: change TestRatisManager to check cluster with data
> -
>
> Key: HDFS-11948
> URL: https://issues.apache.org/jira/browse/HDFS-11948
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-11948-HDFS-7240.20170614.patch
>
>
> TestRatisManager first creates multiple Ratis clusters.  Then it changes the 
> membership and closes some clusters.  However, it does not test the clusters 
> with data.






[jira] [Updated] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-12137:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
   2.9.0
   2.8.3
   Status: Resolved  (was: Patch Available)

Thanks for the review, [~xiaochen] and the patch, Daryn.

> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.3, 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  The unfair nature can be 
> especially harmful to the heartbeat handling.  Under high loads, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or falsely declared dead.  The unfair lock is not helping 
> and appears to be causing frequent starvation under load.






[jira] [Comment Edited] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088043#comment-16088043
 ] 

Kihwal Lee edited comment on HDFS-12137 at 7/14/17 8:56 PM:


Thanks for the review, [~xiaochen] and the patch, Daryn.
Just committed to trunk, branch-2 and branch-2.8.


was (Author: kihwal):
Thanks for the review, [~xiaochen] and the patch, Daryn.

> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  The unfair nature can be 
> especially harmful to the heartbeat handling.  Under high loads, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or falsely declared dead.  The unfair lock is not helping 
> and appears to be causing frequent starvation under load.






[jira] [Commented] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088051#comment-16088051
 ] 

Hudson commented on HDFS-12130:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12009 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12009/])
HDFS-12130. Optimizing permission check for getContentSummary. (szetszwo: rev 
a29fe100b3c671954b759add5923a2b44af9e6a4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetContentSummaryWithPermission.java


> Optimizing permission check for getContentSummary
> -
>
> Key: HDFS-12130
> URL: https://issues.apache.org/jira/browse/HDFS-12130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch, 
> HDFS-12130.003.patch
>
>
> Currently, {{getContentSummary}} takes two phases to complete:
> - phase1. check the permission of the entire subtree. If any subdirectory 
> does not have {{READ_EXECUTE}}, an access control exception is thrown and 
> {{getContentSummary}} terminates here (unless it's super user).
> - phase2. If phase1 passed, it will then traverse the entire tree recursively 
> to get the actual content summary.
> An issue is, both phases currently hold the fs lock.
> Phase 2 is already written so that it yields the fs lock over time, 
> such that it does not block other operations for too long. However, phase 1 
> does not yield, meaning the permission check phase can still 
> block things for a long time.
> One fix is to add lock yield to phase 1. But a simpler fix is to merge phase 
> 1 into phase 2. Namely, instead of doing a full traversal for permission 
> check first, we start with phase 2 directly, but for each directory, before 
> obtaining its summary, check its permission first. This way we take advantage 
> of the existing lock yield in phase 2 code and are still able to check 
> permission and terminate on access exception.
> Thanks [~szetszwo] for the offline discussions!






[jira] [Commented] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088053#comment-16088053
 ] 

Kihwal Lee commented on HDFS-12140:
---

+1. For a BPOS, the blockpool ID won't change once it registers. 

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12140.branch-2.8.patch, HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.
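
A minimal sketch of a lock-free read path, assuming the id is written once at registration and never changes afterwards (as the review comment on this issue states). The class, field, and method names are hypothetical and do not mirror the actual {{BPOfferService}} code.

```java
// Hypothetical illustration of a write-once id with a lock-free fast path.
class BlockPoolIdSketch {
    private final Object lock = new Object();
    // volatile: once registration publishes the id, every thread can read it without locking.
    private volatile String bpId;

    // Registration still runs under the lock and rejects a conflicting id.
    void register(String id) {
        synchronized (lock) {
            if (bpId == null) {
                bpId = id;
            } else if (!bpId.equals(id)) {
                throw new IllegalStateException("block pool id already set to " + bpId);
            }
        }
    }

    // Fast path: a single volatile read, no lock. Falls back to the lock only
    // before registration, where it may legitimately return null.
    String getBlockPoolId() {
        String id = bpId;
        if (id != null) {
            return id;
        }
        synchronized (lock) {
            return bpId;
        }
    }

    public static void main(String[] args) {
        BlockPoolIdSketch bpos = new BlockPoolIdSketch();
        bpos.register("BP-demo");
        System.out.println(bpos.getBlockPoolId());
    }
}
```

After registration, callers such as heartbeat processing never touch the lock on this path, which is the contention the issue sets out to remove.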



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDFS-2319:


Assignee: Bharat Viswanadham

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Work started] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-2319 started by Bharat Viswanadham.

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Updated] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-12140:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.3
   3.0.0-beta1
   2.9.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Daryn. And thanks for reviewing it, Rushabh.  I've 
committed this to trunk, branch-2 and branch-2.8.

> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: HDFS-12140.branch-2.8.patch, HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Comment Edited] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086282#comment-16086282
 ] 

Kihwal Lee edited comment on HDFS-12136 at 7/14/17 9:21 PM:


Normal serving has been greatly impaired.  Appending to a block while it's 
scanned is exceedingly rare compared to the normal block sending rate, yet the 
fix impacted all serving.  There's a better way, accomplished via:
* Entirely remove (revert) fetching of checksums for finalized blocks in the 
BlockSender ctor.  Reduces lock hold time by eliminating i/o in the dataset 
lock.
* If a checksum exception occurs during the scan, and the genstamp changed, 
mark the block as suspect for rescan.  This is the edge case.
* Recent suspect block tracking considers genstamps.  Suspect blocks with a newer 
genstamp than last recorded are not skipped.
* Recent suspects expire 10 min after being added to the cache.  Prior behavior 
was 10 mins after last access - which could lead to indefinite postponement.

No test changes needed.  {{TestBlockScanner#testAppendWhileScanning}} proves 
this approach continues to work.

Only difference in trunk/branch-2 is context and a few log lines in code copied 
into a getStoredBlock method.
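
The last two bullets above (genstamp-aware suspects, expiry measured from insertion) can be sketched as follows. This is an illustration under assumptions, not the code in the patch: RecentSuspects, markSuspect and the explicit nowMs parameter are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache of recently reported suspect blocks. */
class RecentSuspects {
  private static final long EXPIRY_MS = 10 * 60 * 1000L; // 10 minutes after being added

  private static final class Entry {
    final long genStamp;
    final long addedAtMs;
    Entry(long genStamp, long addedAtMs) {
      this.genStamp = genStamp;
      this.addedAtMs = addedAtMs;
    }
  }

  private final Map<Long, Entry> entries = new HashMap<>();

  /**
   * Returns true if the block should be queued for rescan, false if it is
   * skipped because the same or a newer genstamp was recorded within the
   * expiry window.
   */
  synchronized boolean markSuspect(long blockId, long genStamp, long nowMs) {
    Entry e = entries.get(blockId);
    boolean expired = e != null && nowMs - e.addedAtMs >= EXPIRY_MS;
    if (e != null && !expired && genStamp <= e.genStamp) {
      return false; // a skipped lookup does not refresh addedAtMs
    }
    entries.put(blockId, new Entry(genStamp, nowMs));
    return true;
  }
}
```

Because a skipped lookup never touches addedAtMs, repeated reports of the same block cannot postpone its expiry indefinitely.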


was (Author: daryn):
Normal serving has been greatly impaired.  Appending to a block while it's 
scanned is exceeding rare compared to the normal block sending rate, yet the 
fix impacted all serving.  There's a bettery way accomplished via:
* Entirely remove (revert) fetching of checksums for finalized blocks in the 
BlockSender ctor.  Reduces lock hold time by eliminating i/o in the dataset 
lock.
* If a checksum exception occurs during the scan, and the genstamp changed, 
mark the block as suspect for rescan.  This is the edge case.
* Recent suspect blocks considers genstamps.  Suspect blocks with a newer 
genstamp than last recorded are not skipped.
* Recent suspects expire 10 min after being added to the cache.  Prior behavior 
was 10 mins after last access - which could lead to indefinite postponement.

No test changes needed.  {{TestBlockScanner#testAppendWhileScanning}} proves 
this approach continues to work.

Only difference in trunk/branch-2 is context and a few log lines in code copied 
into a getStoredBlock method.

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan 
> by reading the last checksum of finalized blocks within the {{BlockSender}} 
> ctor.  Unfortunately it's holding the exclusive dataset lock to open and read 
> the metafile multiple times.  Block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk i/o utilization or high 
> xceiver activity.  Ex. lost node replication, balancing, or decommissioning.  
> The xceiver threads congest creating block senders and impair the heartbeat 
> processing that is contending for the same lock.  Combined with other lock 
> contention issues, pipelines break and nodes sporadically go dead.






[jira] [Commented] (HDFS-12137) DN dataset lock should be fair

2017-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088100#comment-16088100
 ] 

Hudson commented on HDFS-12137:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12010 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12010/])
HDFS-12137. DN dataset lock should be fair. Contributed by Daryn Sharp. 
(kihwal: rev 8d86a93915ee00318289535d9c78e48b75c8359d)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java


> DN dataset lock should be fair
> --
>
> Key: HDFS-12137
> URL: https://issues.apache.org/jira/browse/HDFS-12137
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: HDFS-12137.branch-2.patch, HDFS-12137.trunk.patch, 
> HDFS-12137.trunk.patch
>
>
> The dataset lock is very highly contended.  The unfair nature can be 
> especially harmful to the heartbeat handling.  Under high loads, partially 
> exposed by HDFS-12136 introducing disk i/o within the lock, the heartbeat 
> handling thread may process commands so slowly due to the contention that the 
> node becomes stale or falsely declared dead.  The unfair lock is not helping 
> and appears to be causing frequent starvation under load.
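
For reference, a small sketch of the difference between the two lock modes using java.util.concurrent.locks.ReentrantLock. The class name DatasetLockDemo and the guarded helper are hypothetical; how FsDatasetImpl actually constructs its lock is not shown in this message.

```java
import java.util.concurrent.locks.ReentrantLock;

class DatasetLockDemo {
  // Default constructor: non-fair, so an arriving thread may barge ahead of queued waiters.
  static final ReentrantLock UNFAIR = new ReentrantLock();
  // Passing true requests hand-off to the longest-waiting thread.
  static final ReentrantLock FAIR = new ReentrantLock(true);

  /** Runs a trivial critical section under the given lock. */
  static int guarded(ReentrantLock lock, int value) {
    lock.lock();
    try {
      return value + 1;
    } finally {
      lock.unlock();
    }
  }
}
```

A fair lock generally trades some throughput for bounded waiting, which matters here because the starved thread is the one handling heartbeats.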






[jira] [Commented] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2017-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088124#comment-16088124
 ] 

Kihwal Lee commented on HDFS-12136:
---

[~jojochuang], we started seeing significant performance regression after 
increased I/O activities. Jstacking has revealed that DataXceiver threads are 
all waiting for the dataset impl lock. When the I/O load is reasonable, this 
might not be visible.

{noformat}
"org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@61a9d939" #351184 daemon prio=5 os_prio=0 tid=0x7f94ddf0a000 nid=0xafef waiting on condition [0x7f94c1d4f000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xd55efd28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
        at org.apache.hadoop.hdfs.InstrumentedLock.lock(InstrumentedLock.java:102)
        at org.apache.hadoop.util.AutoCloseableLock.acquire(AutoCloseableLock.java:67)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.acquireDatasetLock(FsDatasetImpl.java:3274)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:252)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2348)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None
{noformat}
{noformat}
"DataXceiver for client DFSClient_xxx [Sending block xxx]" #351183 daemon prio=5 os_prio=0 tid=0x0409b000 nid=0xafee waiting on condition [0x7f94c9f49000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xd55efd28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
        at org.apache.hadoop.hdfs.InstrumentedLock.lock(InstrumentedLock.java:102)
        at org.apache.hadoop.util.AutoCloseableLock.acquire(AutoCloseableLock.java:67)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.acquireDatasetLock(FsDatasetImpl.java:3274)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:252)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:580)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:145)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:100)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None
{noformat}

> BlockSender performance regression due to volume scanner edge case
> --
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan 
> by reading the last checksum of finalized blocks within the {{BlockSender}} 
> ctor.  Unfortunately it's holding the exclusive dataset lock to open and read 
> the metafile multiple times.  Block sender instantiation becomes serialized.
> Performance completely collapses under heavy disk i/o utilization or high 
> xceiver activity.  Ex. lost node replication, balancing, or decommissioning.  
> The xceiver thread

[jira] [Updated] (HDFS-12138) Remove redundant 'public' modifiers from BlockCollection

2017-07-14 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12138:
--
Attachment: HDFS-12138.001.patch

Post initial patch.

> Remove redundant 'public' modifiers from BlockCollection
> 
>
> Key: HDFS-12138
> URL: https://issues.apache.org/jira/browse/HDFS-12138
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Trivial
> Fix For: 3.0.0-alpha4
>
> Attachments: HDFS-12138.001.patch
>
>
> The 'public' modifiers on the methods in {{BlockCollection}} are redundant, 
> since this is a public interface. Running checkstyle also complains about 
> this.
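
A short illustration of why the modifier is redundant: methods declared in a Java interface without a body are implicitly public and abstract. ExampleCollection and ModifierCheck are hypothetical names used only for this sketch; the methods of the real BlockCollection are not reproduced here.

```java
import java.lang.reflect.Modifier;

/** Hypothetical interface standing in for the one discussed above. */
interface ExampleCollection {
  long getId();          // implicitly public and abstract
  public long getSize(); // legal, but 'public' adds nothing
}

class ModifierCheck {
  /** Reports whether the named no-argument method of ExampleCollection is public. */
  static boolean isPublic(String methodName) {
    try {
      return Modifier.isPublic(
          ExampleCollection.class.getMethod(methodName).getModifiers());
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Both methods report the same modifiers through reflection, so dropping the keyword changes nothing for callers or implementors.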






[jira] [Updated] (HDFS-12138) Remove redundant 'public' modifiers from BlockCollection

2017-07-14 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-12138:
--
Status: Patch Available  (was: Open)

> Remove redundant 'public' modifiers from BlockCollection
> 
>
> Key: HDFS-12138
> URL: https://issues.apache.org/jira/browse/HDFS-12138
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Trivial
> Fix For: 3.0.0-alpha4
>
> Attachments: HDFS-12138.001.patch
>
>
> The 'public' modifiers on the methods in {{BlockCollection}} are redundant, 
> since this is a public interface. Running checkstyle also complains about 
> this.






[jira] [Commented] (HDFS-12140) Remove BPOfferService lock contention to get block pool id

2017-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088144#comment-16088144
 ] 

Hudson commented on HDFS-12140:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12011 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12011/])
HDFS-12140. Remove BPOfferService lock contention to get block pool id. 
(kihwal: rev e7d187a1b6a826edd5bd0f708184d48f3674d489)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java


> Remove BPOfferService lock contention to get block pool id
> --
>
> Key: HDFS-12140
> URL: https://issues.apache.org/jira/browse/HDFS-12140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3
>
> Attachments: HDFS-12140.branch-2.8.patch, HDFS-12140.trunk.patch
>
>
> The block pool id is protected by a lock in {{BPOfferService}}.  This creates 
> excessive contention especially for xceivers threads attempting to queue IBRs 
> and heartbeat processing.  When the latter is delayed due to excessive 
> FSDataset lock contention, it causes pipelines to collapse.
> Accessing the block pool id should be lockless after registration.






[jira] [Commented] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088185#comment-16088185
 ] 

Hadoop QA commented on HDFS-12123:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 7s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 56s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.TestOzoneConfigurationFields |
|   | hadoop.ozone.container.replication.TestContainerReplicationManager |
|   | hadoop.ozone.container.common.TestDatanodeStateMachine |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| Timed out junit tests | 
org.apache.hadoop.ozone.container.ozoneimpl.TestRatisManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12123 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877388/HDFS-12123-HDFS-7240.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux aeef627a93c0 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-7240 / 8d37ef3 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20279/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20279/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20279/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone: OzoneClient: Abstraction of OzoneClient and default imple

[jira] [Commented] (HDFS-12130) Optimizing permission check for getContentSummary

2017-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088187#comment-16088187
 ] 

Hudson commented on HDFS-12130:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12012 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12012/])
Revert "HDFS-12130. Optimizing permission check for getContentSummary." 
(szetszwo: rev a1f12bb543778ddc243205eaa962e99da4d8f135)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* (delete) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetContentSummaryWithPermission.java
HDFS-12130. Optimizing permission check for getContentSummary.  (szetszwo: rev 
f413ee33df301659c4ca9024380c2354983dcc84)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetContentSummaryWithPermission.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java


> Optimizing permission check for getContentSummary
> -
>
> Key: HDFS-12130
> URL: https://issues.apache.org/jira/browse/HDFS-12130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch, 
> HDFS-12130.003.patch
>
>
> Currently, {{getContentSummary}} takes two phases to complete:
> - phase1. check the permission of the entire subtree. If any subdirectory 
> does not have {{READ_EXECUTE}}, an access control exception is thrown and 
> {{getContentSummary}} terminates here (unless it's super user).
> - phase2. If phase1 passed, it will then traverse the entire tree recursively 
> to get the actual content summary.
> An issue is that both phases currently hold the fs lock.
> Phase 2 is already written so that it yields the fs lock periodically, 
> so it does not block other operations for too long. However, phase 1 
> does not yield, meaning the permission check phase can still 
> block other operations for a long time.
> One fix is to add lock yield to phase 1. But a simpler fix is to merge phase 
> 1 into phase 2. Namely, instead of doing a full traversal for pe

[jira] [Updated] (HDFS-12123) Ozone: OzoneClient: Abstraction of OzoneClient and default implementation

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-12123:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

[~nandakumar131] Thank you for the contribution. I have committed this to the 
feature branch.

> Ozone: OzoneClient: Abstraction of OzoneClient and default implementation
> -
>
> Key: HDFS-12123
> URL: https://issues.apache.org/jira/browse/HDFS-12123
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>Assignee: Nandakumar
> Attachments: HDFS-12123-HDFS-7240.000.patch, 
> HDFS-12123-HDFS-7240.001.patch
>
>
> {{OzoneClient}} interface defines all the client operations supported by 
> Ozone. 
> {{OzoneClientImpl}} will provide the default implementation; it should connect 
> to KSM, SCM and DataNode through RPC protocol to execute client calls.
> Similarly we should have a client implementation which implements 
> {{OzoneClient}} and uses REST protocol to execute client calls.
> This will provide a lot of flexibility to Ozone applications: when 
> applications are running inside the cluster, they can use RPC protocol, but 
> when running from outside the cluster, the same applications can speak REST 
> protocol to communicate with Ozone.
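
The design described above, one client interface with interchangeable transports, can be sketched as below. All names (ExampleClient, RpcExampleClient, RestExampleClient, ExampleClientFactory, createVolume) are hypothetical stand-ins, not the API added by this patch, and the method bodies only label the call instead of contacting KSM, SCM or a REST endpoint.

```java
/** Hypothetical stand-in for the client interface; not the real OzoneClient API. */
interface ExampleClient {
  String createVolume(String name);
}

/** Would talk to the cluster services over RPC; here it only labels the call. */
class RpcExampleClient implements ExampleClient {
  public String createVolume(String name) {
    return "rpc:createVolume:" + name;
  }
}

/** Would issue REST requests; here it only labels the call. */
class RestExampleClient implements ExampleClient {
  public String createVolume(String name) {
    return "rest:createVolume:" + name;
  }
}

class ExampleClientFactory {
  /** Picks a transport; application code depends only on ExampleClient. */
  static ExampleClient create(boolean insideCluster) {
    return insideCluster ? new RpcExampleClient() : new RestExampleClient();
  }
}
```

An application written against the interface can then switch transports by changing only how the client is created.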






[jira] [Updated] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-12115:

Attachment: HDFS-12115-HDFS-7240.003.patch

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch, HDFS-12115-HDFS-7240.003.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.
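
As a rough illustration of what such a query returns, the sketch below filters a node table by state. It is an in-memory toy under assumptions: NodeQueryDemo, NodeState, report and this queryNode signature are hypothetical, and the actual RPC message types in the Storage container location protocol are not shown in this message.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NodeQueryDemo {
  /** Hypothetical subset of states; the description names Healthy, live and Dead. */
  enum NodeState { HEALTHY, DEAD }

  private final Map<String, NodeState> nodes = new LinkedHashMap<>();

  /** Records the latest known state of a node. */
  void report(String hostname, NodeState state) {
    nodes.put(hostname, state);
  }

  /** Returns hostnames whose current state is in the requested set, in report order. */
  List<String> queryNode(EnumSet<NodeState> wanted) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, NodeState> e : nodes.entrySet()) {
      if (wanted.contains(e.getValue())) {
        result.add(e.getKey());
      }
    }
    return result;
  }
}
```

A CLI could call such a method once per state of interest and print the resulting host lists.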






[jira] [Commented] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088221#comment-16088221
 ] 

Anu Engineer commented on HDFS-12115:
-

v3 rebased on top of the tree.

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch, HDFS-12115-HDFS-7240.003.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.






[jira] [Commented] (HDFS-11989) Ozone: add TestKeysRatis, TestBucketsRatis and TestVolumeRatis

2017-07-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088256#comment-16088256
 ] 

Anu Engineer commented on HDFS-11989:
-

+1, I will commit this shortly. Also, FYI, ListKeys is fixed now in distributed 
handler.


> Ozone: add TestKeysRatis, TestBucketsRatis and TestVolumeRatis
> --
>
> Key: HDFS-11989
> URL: https://issues.apache.org/jira/browse/HDFS-11989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone, test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-11989-HDFS-7240.20170618.patch, 
> HDFS-11989-HDFS-7240.20170620b.patch, HDFS-11989-HDFS-7240.20170620c.patch, 
> HDFS-11989-HDFS-7240.20170620.patch, HDFS-11989-HDFS-7240.20170621b.patch, 
> HDFS-11989-HDFS-7240.20170621c.patch, HDFS-11989-HDFS-7240.20170621.patch, 
> HDFS-11989-HDFS-7240.20170710.patch, HDFS-11989-HDFS-7240.20170712.patch
>
>
> Add Ratis tests similar to TestKeys, TestBuckets and TestVolume.






[jira] [Commented] (HDFS-12138) Remove redundant 'public' modifiers from BlockCollection

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088269#comment-16088269
 ] 

Hadoop QA commented on HDFS-12138:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  5s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  3s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 1 unchanged - 11 fixed = 1 total (was 12) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 55s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m  6s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12138 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877401/HDFS-12138.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 19bf6651cb83 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f413ee3 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20280/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20280/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20280/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20280/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automa

[jira] [Commented] (HDFS-12144) Ozone: KSM: Cleanup of keys in KSM for failed clients

2017-07-14 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088267#comment-16088267
 ] 

Chen Liang commented on HDFS-12144:
---

I read through the discussion in HDFS-12123. I believe I wrote the original 
TODO comment in the code, and HDFS-11886 was filed as the follow-up. So this 
JIRA seems to be a duplicate of HDFS-11886? (Please feel free to comment on or 
edit HDFS-11886 if there is anything you want to add.)

> Ozone: KSM: Cleanup of keys in KSM for failed clients
> -
>
> Key: HDFS-12144
> URL: https://issues.apache.org/jira/browse/HDFS-12144
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>
> While writing data into Ozone, a client can fail after the key is allocated 
> in KSM. Such keys have to be cleaned up.
> If a client fails after key allocation in KSM while writing data to the 
> container, the key in KSM has to be deleted.
> more context: [HDFS-12123 - comment - TODO | 
> https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]
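
To make the requirement concrete, here is a minimal sketch of one possible cleanup scheme: track keys that were allocated but not yet committed, and delete the ones that stay open longer than a timeout. This is only an illustration under stated assumptions, not what KSM does or what HDFS-11886 proposes. The class `OpenKeyCleanupSketch`, its methods, and the timeout-based policy are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical bookkeeping for allocated-but-uncommitted keys; not KSM code. */
class OpenKeyCleanupSketch {
  // key name -> time (ms) at which the key was allocated
  private final Map<String, Long> openKeys = new HashMap<>();

  /** Records that a client was handed a key and has started writing. */
  void allocate(String key, long nowMs) {
    openKeys.put(key, nowMs);
  }

  /** Called when the client finishes writing; the key is no longer open. */
  void commit(String key) {
    openKeys.remove(key);
  }

  /** Removes and returns keys that stayed open for at least timeoutMs. */
  List<String> cleanup(long nowMs, long timeoutMs) {
    List<String> removed = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = openKeys.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> e = it.next();
      if (nowMs - e.getValue() >= timeoutMs) {
        removed.add(e.getKey());
        it.remove();
      }
    }
    return removed;
  }

  /** Number of keys still waiting for a commit. */
  int openCount() {
    return openKeys.size();
  }
}
```

A key whose client died never reaches `commit`, so a periodic `cleanup` call eventually removes it. A key that is still being written is kept as long as it is younger than the timeout.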



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2017-07-14 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088273#comment-16088273
 ] 

Lei (Eddy) Xu commented on HDFS-12116:
--

Hi, Xiao

Thanks for addressing the test failure. It looks good overall.

A few small questions:

* Could you add comments to the newly added interface, {{int 
updateIntervalForTesting}}? Maybe split the function into 
{{updateHeartbeatItervalForTesting}} and {{updateIBRItervalForTesting}} for 
readability?
* In {code}
void updateIntervalForTesting(final long heartbeat, final long ibr) {
  heartbeatIntervalMs = heartbeat;
  scheduleNextHeartbeat();
  blockReportIntervalMs = ibr;
}
{code}

do you also need to schedule the block report?
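
For illustration, the split and the extra scheduling step asked about above could look roughly like the sketch below. It is a self-contained toy, not the patch. The class `SchedulerSketch`, the fields `nextHeartbeatTime` and `nextBlockReportTime`, the explicit `nowMs` parameter, the starting interval values, and the two method names (with "Interval" spelled out) are assumptions made for the example. Only `heartbeatIntervalMs` and `blockReportIntervalMs` come from the snippet above.

```java
/** Hypothetical stand-in for the scheduler discussed above; not Hadoop code. */
class SchedulerSketch {
  // Arbitrary starting values, chosen only for the example.
  private long heartbeatIntervalMs = 3_000;
  private long blockReportIntervalMs = 60_000;

  long nextHeartbeatTime;
  long nextBlockReportTime;

  /** Test hook: change only the heartbeat interval and reschedule the heartbeat. */
  void updateHeartbeatIntervalForTesting(long heartbeatMs, long nowMs) {
    heartbeatIntervalMs = heartbeatMs;
    nextHeartbeatTime = nowMs + heartbeatIntervalMs;
  }

  /** Test hook: change only the block report interval and reschedule the report. */
  void updateIBRIntervalForTesting(long ibrMs, long nowMs) {
    blockReportIntervalMs = ibrMs;
    nextBlockReportTime = nowMs + blockReportIntervalMs;
  }
}
```

With two hooks, a test that only cares about heartbeats does not have to pass a block report interval, and each hook reschedules exactly the event whose interval it changes.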

> BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.org/jira/browse/HDFS-12116
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, 
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly 
> higher in dist-test runs using CDH.
> In both the _08 and _09 tests:
> # An attempt is made to put a replica in the {{TEMPORARY}} state, by 
> {{waitForTempReplica}}.
> # Once that has returned, the test goes on to verify that the block reports 
> show the correct pending replication blocks.
> But there is a race condition. If the replica is replicated between steps #1 
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how 
> many replicas have been replicated, hence failing the test.
> Failures are seen in both {{TestNNHandlesBlockReportPerStorage}} and 
> {{TestNNHandlesCombinedBlockReport}}.
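
The race described above can be modelled in a few lines. The sketch below is a toy and does not use any Hadoop class: `PendingReplicationRaceSketch`, `observePending`, and the latch-based ordering are assumptions made for the example. It forces the background "replication" to run either before or after the check, which shows why the value the test reads depends on timing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of the race; none of these names are Hadoop APIs. */
class PendingReplicationRaceSketch {

  /**
   * Returns the pending count seen by the "test" when the background
   * "replication" is forced to finish before or after the check.
   */
  static int observePending(boolean replicateBeforeCheck) {
    AtomicInteger pending = new AtomicInteger(1);      // step 1: one replica is pending
    CountDownLatch mayReplicate = new CountDownLatch(1);
    Thread replicator = new Thread(() -> {
      try {
        mayReplicate.await();                          // wait until allowed to run
        pending.decrementAndGet();                     // replication completes
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    replicator.start();
    try {
      if (replicateBeforeCheck) {
        mayReplicate.countDown();
        replicator.join();                             // replication wins the race
        return pending.get();                          // step 2 observes 0
      }
      int observed = pending.get();                    // step 2 observes 1
      mayReplicate.countDown();
      replicator.join();
      return observed;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

In the real test nothing pins the ordering, so either outcome can occur. A fix has to either control when replication may proceed or make the assertion tolerate both orderings.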






[jira] [Commented] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2017-07-14 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088315#comment-16088315
 ] 

Xiao Chen commented on HDFS-12116:
--

Thanks a lot for the review, Eddy!

Updated patch 3 to address both comments.
bq. do you need to schedule block report?
Not for this test, because in {{BPServiceActor#offerService}}, 
{{isBlockReportDue}} is checked iff {{isHeartbeatDue}} is true. But since the 
new method updates the block report interval and is named accordingly, your 
proposal sounds like the right thing to do in case it is used by future tests.
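
The gating described above can be sketched as follows. This is a simplified illustration of the statement in this comment, not a copy of {{BPServiceActor#offerService}}. The class `OfferServiceSketch`, the method `tick`, and its parameters are hypothetical.

```java
/** Hypothetical model of the check order described above; not Hadoop code. */
class OfferServiceSketch {

  /** Returns which actions fire at time nowMs, given the two next-due times. */
  static String tick(long nowMs, long nextHeartbeatMs, long nextBlockReportMs) {
    StringBuilder fired = new StringBuilder();
    if (nowMs >= nextHeartbeatMs) {            // plays the role of isHeartbeatDue
      fired.append("heartbeat");
      if (nowMs >= nextBlockReportMs) {        // isBlockReportDue, reached only here
        fired.append("+blockReport");
      }
    }
    return fired.toString();
  }
}
```

In this model `tick(10, 20, 5)` returns an empty string even though the block report is overdue, because the heartbeat is not due yet. That is why rescheduling only the heartbeat is enough for this particular test.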

> BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.org/jira/browse/HDFS-12116
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, 
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly 
> higher in dist-test runs using CDH.
> In both the _08 and _09 tests:
> # An attempt is made to put a replica in the {{TEMPORARY}} state, by 
> {{waitForTempReplica}}.
> # Once that has returned, the test goes on to verify that the block reports 
> show the correct pending replication blocks.
> But there is a race condition. If the replica is replicated between steps #1 
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how 
> many replicas have been replicated, hence failing the test.
> Failures are seen in both {{TestNNHandlesBlockReportPerStorage}} and 
> {{TestNNHandlesCombinedBlockReport}}.






[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2017-07-14 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12116:
-
Attachment: HDFS-12116.03.patch

Will run dist-test again with patch 3 just in case.

> BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.org/jira/browse/HDFS-12116
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, 
> HDFS-12116.03.patch, 
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly 
> higher in dist-test runs using CDH.
> In both the _08 and _09 tests:
> # An attempt is made to put a replica in the {{TEMPORARY}} state, by 
> {{waitForTempReplica}}.
> # Once that has returned, the test goes on to verify that the block reports 
> show the correct pending replication blocks.
> But there is a race condition. If the replica is replicated between steps #1 
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how 
> many replicas have been replicated, hence failing the test.
> Failures are seen in both {{TestNNHandlesBlockReportPerStorage}} and 
> {{TestNNHandlesCombinedBlockReport}}.






[jira] [Commented] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088329#comment-16088329
 ] 

Hadoop QA commented on HDFS-12115:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 10 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 32s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 31s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 37s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 42s{color} | {color:orange} hadoop-hdfs-project: The patch generated 7 new + 153 unchanged - 0 fixed = 160 total (was 153) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 14s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m 20s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.ozone.container.replication.TestContainerReplicationManager |
|   | hadoop.ozone.web.client.TestKeys |
|   | hadoop.ozone.container.common.impl.TestContainerPersistence |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.ozone.TestOzoneConfigurationFields |
|   | hadoop.ozone.container.placement.TestContainerPlacement |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainerRatis |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12115 |
| JIRA Patch URL | 
https://is

[jira] [Updated] (HDFS-11989) Ozone: add TestKeysRatis, TestBucketsRatis and TestVolumeRatis

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11989:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

[~szetszwo] Thanks for the contribution. I have committed this patch to the 
feature branch.

> Ozone: add TestKeysRatis, TestBucketsRatis and TestVolumeRatis
> --
>
> Key: HDFS-11989
> URL: https://issues.apache.org/jira/browse/HDFS-11989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone, test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-11989-HDFS-7240.20170618.patch, 
> HDFS-11989-HDFS-7240.20170620b.patch, HDFS-11989-HDFS-7240.20170620c.patch, 
> HDFS-11989-HDFS-7240.20170620.patch, HDFS-11989-HDFS-7240.20170621b.patch, 
> HDFS-11989-HDFS-7240.20170621c.patch, HDFS-11989-HDFS-7240.20170621.patch, 
> HDFS-11989-HDFS-7240.20170710.patch, HDFS-11989-HDFS-7240.20170712.patch
>
>
> Add Ratis tests similar to TestKeys, TestBuckets and TestVolume.






[jira] [Updated] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-2319:
-
Attachment: (was: HDFS-2319.patch)

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Updated] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-2319:
-
Attachment: HDFS-2319.patch

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Updated] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-2319:
-
Attachment: HDFS-2319.02.patch

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Updated] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-2319:
-
Status: Patch Available  (was: In Progress)

> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Comment Edited] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088362#comment-16088362
 ] 

Bharat Viswanadham edited comment on HDFS-2319 at 7/15/17 1:34 AM:
---

Hi [~ajisakaa]
I have updated the code.
Fixed the test failure and added more test cases.
Could you please help review the changes?


was (Author: bharatviswa):
Hi [~ajisakaa]
I have updated the code.
Fixed the test failure and added more test cases.


> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Commented] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088362#comment-16088362
 ] 

Bharat Viswanadham commented on HDFS-2319:
--

Hi [~ajisakaa]
I have updated the code.
Fixed the test failure and added more test cases.


> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Commented] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088368#comment-16088368
 ] 

Hadoop QA commented on HDFS-12116:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 40s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 57s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 22s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12116 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877421/HDFS-12116.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 91f9f5581f10 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f413ee3 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/20282/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20282/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20282/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20282/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.or

[jira] [Commented] (HDFS-2319) Add test cases for FSshell -stat

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088370#comment-16088370
 ] 

Hadoop QA commented on HDFS-2319:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 50s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m  1s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-2319 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877431/HDFS-2319.02.patch |
| Optional Tests |  asflicense  unit  xml  |
| uname | Linux cdc6fa837b32 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f413ee3 |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20283/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20283/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add test cases for FSshell -stat
> 
>
> Key: HDFS-2319
> URL: https://issues.apache.org/jira/browse/HDFS-2319
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: XieXianshan
>Assignee: Bharat Viswanadham
>Priority: Trivial
> Attachments: HDFS-2319.02.patch, HDFS-2319.patch
>
>
> Add test cases for HADOOP-7574.






[jira] [Updated] (HDFS-12098) Ozone: Datanode is unable to register with scm if scm starts later

2017-07-14 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12098:
---
Attachment: HDFS-12098-HDFS-7240.testcase.patch

> Ozone: Datanode is unable to register with scm if scm starts later
> --
>
> Key: HDFS-12098
> URL: https://issues.apache.org/jira/browse/HDFS-12098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, ozone, scm
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
> Attachments: disabled-scm-test.patch, HDFS-12098-HDFS-7240.001.patch, 
> HDFS-12098-HDFS-7240.002.patch, HDFS-12098-HDFS-7240.testcase.patch, Screen 
> Shot 2017-07-11 at 4.58.08 PM.png, thread_dump.log
>
>
> Steps to reproduce:
> 1. Start namenode
> {{./bin/hdfs --daemon start namenode}}
> 2. Start datanode
> {{./bin/hdfs datanode}}
> The following connection issues will be seen:
> {noformat}
> 17/07/13 21:16:48 INFO ipc.Client: Retrying connect to server: 
> ozone1.fyre.ibm.com/172.16.165.133:9861. Already tried 0 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
> SECONDS)
> 17/07/13 21:16:49 INFO ipc.Client: Retrying connect to server: 
> ozone1.fyre.ibm.com/172.16.165.133:9861. Already tried 1 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
> SECONDS)
> 17/07/13 21:16:50 INFO ipc.Client: Retrying connect to server: 
> ozone1.fyre.ibm.com/172.16.165.133:9861. Already tried 2 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
> SECONDS)
> 17/07/13 21:16:51 INFO ipc.Client: Retrying connect to server: 
> ozone1.fyre.ibm.com/172.16.165.133:9861. Already tried 3 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
> SECONDS)
> {noformat}
> This is expected because scm has not been started yet.
> 3. Start scm
> {{./bin/hdfs scm}}
> The datanode is expected to register with this scm, and the scm log should show
> {noformat}
> 17/07/13 21:22:30 INFO node.SCMNodeManager: Data node with ID: 
> af22862d-aafa-4941-9073-53224ae43e2c Registered.
> {noformat}
> but this log did *NOT* appear. (_I debugged the code and found that the datanode 
> state had transitioned to SHUTDOWN unexpectedly because of a thread leak: each of 
> the leaked threads tried to move the state machine to its next state, and together 
> they pushed it to the SHUTDOWN state_)
> 4. Create a container from the scm CLI
> {{./bin/hdfs scm -container -create -c 20170714c0}}
> This fails with the following exception:
> {noformat}
> Creating container : 20170714c0.
> Error executing 
> command:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.scm.exceptions.SCMException):
>  Unable to create container while in chill mode
>   at 
> org.apache.hadoop.ozone.scm.container.ContainerMapping.allocateContainer(ContainerMapping.java:241)
>   at 
> org.apache.hadoop.ozone.scm.StorageContainerManager.allocateContainer(StorageContainerManager.java:392)
>   at 
> org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.allocateContainer(StorageContainerLocationProtocolServerSideTranslatorPB.java:73)
> {noformat}
> The datanode was never registered with scm, so scm is still in chill mode.
> *Note*: if scm is started first, the issue does not occur and I can create a container 
> from the CLI without any problem.
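
The thread-leak explanation in step 3 can be illustrated with a minimal sketch. It is not the Ozone state machine: the class name, the three state names and both methods are assumptions made for illustration, and the "leaked threads" are simulated sequentially so the outcome is deterministic. The unguarded variant lets every caller advance the state, so a few stray callers are enough to reach SHUTDOWN; the guarded variant only advances if the state is still the one the caller observed.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative sketch only; names are hypothetical and not taken from the patch. */
final class StateRaceSketch {

  /** Hypothetical three-step lifecycle. */
  enum NodeState {
    INIT, RUNNING, SHUTDOWN;

    /** Returns the following state; SHUTDOWN is terminal. */
    NodeState next() {
      return this == SHUTDOWN ? SHUTDOWN : values()[ordinal() + 1];
    }
  }

  private final AtomicReference<NodeState> state =
      new AtomicReference<>(NodeState.INIT);

  /** Unguarded: every caller advances the state, whatever it currently is. */
  void advanceUnguarded() {
    state.updateAndGet(NodeState::next);
  }

  /** Guarded: advance only if the state is still the one the caller observed. */
  boolean advanceFrom(NodeState observed) {
    return state.compareAndSet(observed, observed.next());
  }

  NodeState current() {
    return state.get();
  }

  /** Simulates leaked tasks that each advance the state once, without a guard. */
  static NodeState runUnguarded(int leakedTasks) {
    StateRaceSketch s = new StateRaceSketch();
    for (int i = 0; i < leakedTasks; i++) {
      s.advanceUnguarded();
    }
    return s.current();
  }

  /** Simulates the same tasks, all of which observed INIT before acting. */
  static NodeState runGuarded(int leakedTasks) {
    StateRaceSketch s = new StateRaceSketch();
    NodeState observed = s.current();
    for (int i = 0; i < leakedTasks; i++) {
      s.advanceFrom(observed); // only the first call succeeds
    }
    return s.current();
  }
}
```

With three simulated tasks, `runUnguarded(3)` ends in SHUTDOWN while `runGuarded(3)` stops at RUNNING. Whether the actual fix uses a compare-and-set guard or removes the leaked threads is not stated in this thread.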






[jira] [Commented] (HDFS-12098) Ozone: Datanode is unable to register with scm if scm starts later

2017-07-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088382#comment-16088382
 ] 

Weiwei Yang commented on HDFS-12098:


Hi [~anu]

I just uploaded a test case patch that reproduces this problem from a unit test. I 
revised how scm is started in MiniOzoneCluster so that the scm constructor is only 
called when scm is actually started. With that change, I could reproduce the same 
issue I was seeing on a real setup. Please take a look, and if you agree with the 
problem I described, we can then look at the fix.

Thank you. 







[jira] [Updated] (HDFS-12115) Ozone: SCM: Add queryNode RPC Call

2017-07-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-12115:

Attachment: HDFS-12115-HDFS-7240.004.patch

Addressing a test failure

> Ozone: SCM: Add queryNode RPC Call
> --
>
> Key: HDFS-12115
> URL: https://issues.apache.org/jira/browse/HDFS-12115
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-12115-HDFS-7240.001.patch, 
> HDFS-12115-HDFS-7240.002.patch, HDFS-12115-HDFS-7240.003.patch, 
> HDFS-12115-HDFS-7240.004.patch
>
>
> Add queryNode RPC to Storage container location protocol. This allows 
> applications like SCM CLI to get the list of nodes in various states, like 
> Healthy, live or Dead.
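
As a rough illustration of what such a query could look like from the caller's side, here is a minimal in-memory sketch. It is not the Storage container location protocol: the class, the `report` and `queryNode` signatures and the STALE state are assumptions made for the example (only Healthy and Dead are named in the description above).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustrative sketch only; not the actual SCM protocol or its types. */
final class QueryNodeSketch {

  /** HEALTHY and DEAD come from the issue text; STALE is a hypothetical extra. */
  enum NodeState { HEALTHY, STALE, DEAD }

  // Insertion-ordered so query results are stable across runs.
  private final Map<String, NodeState> nodes = new LinkedHashMap<>();

  /** Records the latest known state of a node (hypothetical helper). */
  void report(String nodeId, NodeState state) {
    nodes.put(nodeId, state);
  }

  /** Returns the IDs of all nodes whose state is in the requested set. */
  List<String> queryNode(Set<NodeState> wanted) {
    return nodes.entrySet().stream()
        .filter(e -> wanted.contains(e.getValue()))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
```

A CLI-style caller would pass something like `EnumSet.of(NodeState.HEALTHY)` to list healthy nodes, or several states at once to get their union.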






[jira] [Commented] (HDFS-12098) Ozone: Datanode is unable to register with scm if scm starts later

2017-07-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088411#comment-16088411
 ] 

Weiwei Yang commented on HDFS-12098:


Please hold off on reviewing the test patch; it still has some problems. I am 
working on a new one :P







[jira] [Updated] (HDFS-12098) Ozone: Datanode is unable to register with scm if scm starts later

2017-07-14 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12098:
---
Status: In Progress  (was: Patch Available)







[jira] [Resolved] (HDFS-12144) Ozone: KSM: Cleanup of keys in KSM for failed clients

2017-07-14 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar resolved HDFS-12144.
---
Resolution: Duplicate

> Ozone: KSM: Cleanup of keys in KSM for failed clients
> -
>
> Key: HDFS-12144
> URL: https://issues.apache.org/jira/browse/HDFS-12144
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nandakumar
>
> While writing data into Ozone, a client can fail after the key has been allocated 
> in KSM but before the data is fully written to the container. In such cases the 
> allocated key in KSM has to be cleaned up (deleted).
> more context: [HDFS-12123 - comment - TODO | 
> https://issues.apache.org/jira/browse/HDFS-12123?focusedCommentId=16087804&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16087804]
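
One possible shape for such a cleanup, sketched under stated assumptions: the server tracks keys that were allocated but not yet committed, and a periodic sweep deletes those older than a timeout. This is not the KSM implementation and not necessarily the approach taken in the duplicate issue; the class, the method names and the timeout mechanism are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; names and the timeout policy are assumptions. */
final class PendingKeySweeper {

  // Key name -> allocation time in milliseconds, for keys not yet committed.
  private final Map<String, Long> pending = new ConcurrentHashMap<>();
  private final long timeoutMillis;

  PendingKeySweeper(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Called when a key is allocated for a client that is about to write data. */
  void allocate(String key, long nowMillis) {
    pending.put(key, nowMillis);
  }

  /** Called when the client finishes writing; the key is no longer at risk. */
  void commit(String key) {
    pending.remove(key);
  }

  boolean isPending(String key) {
    return pending.containsKey(key);
  }

  /** Removes and returns keys whose allocation is at least timeoutMillis old. */
  List<String> sweep(long nowMillis) {
    List<String> removed = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> e = it.next();
      if (nowMillis - e.getValue() >= timeoutMillis) {
        removed.add(e.getKey());
        it.remove();
      }
    }
    return removed;
  }
}
```

The timestamps are passed in explicitly so the behaviour is easy to test; a real service would more likely read a clock and run the sweep on a scheduled executor.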






[jira] [Commented] (HDFS-12098) Ozone: Datanode is unable to register with scm if scm starts later

2017-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088417#comment-16088417
 ] 

Hadoop QA commented on HDFS-12098:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 154 unchanged - 0 fixed = 159 total (was 154) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
58s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hdfs.server.datanode.DataNode.datanodeStateMachine; locked 
42% of time  Unsynchronized access at DataNode.java:[line 3228] |
| Failed junit tests | hadoop.ozone.TestMiniOzoneCluster |
|   | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.ozone.container.replication.TestContainerReplicationManager |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.ozone.TestOzoneConfigurationFields |
| Timed out junit tests | 
org.apache.hadoop.ozone.container.ozoneimpl.TestRatisManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12098 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877433/HDFS-12098-HDFS-7240.testcase.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 719ca50388a4 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-7240 / 90f1d58 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20284/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|
