[jira] [Commented] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923978#comment-16923978
 ] 

Hadoop QA commented on HDFS-14609:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
3s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 24s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 26s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14609 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979613/HDFS-14609.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 61d8ede84b05 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 494d75e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27798/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27798/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27798/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
| checkstyle | 
https://bui

[jira] [Updated] (HDFS-14795) Add Throttler for writing block

2019-09-06 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-14795:
---
Attachment: HDFS-14795.002.patch

> Add Throttler for writing block
> ---
>
> Key: HDFS-14795
> URL: https://issues.apache.org/jira/browse/HDFS-14795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-14795.001.patch, HDFS-14795.002.patch
>
>
> DataXceiver#writeBlock
> {code:java}
> blockReceiver.receiveBlock(mirrorOut, mirrorIn, replyOut,
> mirrorAddr, null, targets, false);
> {code}
> As the code above shows, DataXceiver#writeBlock does not throttle (the 
> throttler argument is null).
>  I think it is necessary to throttle block writes by adding a throttler in 
> the PIPELINE_SETUP_APPEND_RECOVERY and PIPELINE_SETUP_STREAMING_RECOVERY 
> stages.
> The default throttler value remains null.
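A minimal sketch of the idea (not the attached patch), reusing Hadoop's existing org.apache.hadoop.hdfs.util.DataTransferThrottler and passing it where the quoted call currently passes null; the configuration key below is illustrative, not an existing property:
{code:java}
import org.apache.hadoop.hdfs.util.DataTransferThrottler;

// Illustrative sketch: gate block writes with a DataTransferThrottler.
// "dfs.datanode.data.write.bandwidthPerSec" is a hypothetical key.
long bytesPerSec = conf.getLong("dfs.datanode.data.write.bandwidthPerSec", 0);
DataTransferThrottler writeThrottler =
    bytesPerSec > 0 ? new DataTransferThrottler(bytesPerSec)
                    : null; // null preserves today's unthrottled behavior

blockReceiver.receiveBlock(mirrorOut, mirrorIn, replyOut,
    mirrorAddr, writeThrottler, targets, false);
{code}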



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2094) TestOzoneManagerRatisServer is failing

2019-09-06 Thread Nanda kumar (Jira)
Nanda kumar created HDDS-2094:
-

 Summary: TestOzoneManagerRatisServer is failing
 Key: HDDS-2094
 URL: https://issues.apache.org/jira/browse/HDDS-2094
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Reporter: Nanda kumar


{{TestOzoneManagerRatisServer}} is failing on trunk with the following error:
{noformat}
[ERROR] 
verifyRaftGroupIdGenerationWithCustomOmServiceId(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerRatisServer)
  Time elapsed: 0.418 s  <<< ERROR!
org.apache.hadoop.metrics2.MetricsException: Metrics source 
OzoneManagerDoubleBufferMetrics already exists!
at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
at 
org.apache.hadoop.ozone.om.ratis.metrics.OzoneManagerDoubleBufferMetrics.create(OzoneManagerDoubleBufferMetrics.java:50)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.<init>(OzoneManagerDoubleBuffer.java:110)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.<init>(OzoneManagerDoubleBuffer.java:88)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.<init>(OzoneManagerStateMachine.java:87)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.getStateMachine(OzoneManagerRatisServer.java:314)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.<init>(OzoneManagerRatisServer.java:244)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.newOMRatisServer(OzoneManagerRatisServer.java:302)
at 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerRatisServer.verifyRaftGroupIdGenerationWithCustomOmServiceId(TestOzoneManagerRatisServer.java:209)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
{noformat}
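The failure pattern (a per-JVM metrics source created once per test but registered twice across tests) suggests a test-side workaround; a hedged sketch, assuming the source is registered under the name "OzoneManagerDoubleBufferMetrics" and that unregisterSource also clears the cached source name in this Hadoop version:
{code:java}
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.junit.After;

// Hypothetical tearDown: unregister the metrics source between tests so the
// next OzoneManagerDoubleBuffer instance can register it again without the
// "already exists" MetricsException.
@After
public void cleanupMetrics() {
  DefaultMetricsSystem.instance()
      .unregisterSource("OzoneManagerDoubleBufferMetrics");
}
{code}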



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-14826:
-

 Summary: dfs.ha.zkfc.port property duplicated in hdfs-default.xml
 Key: HDFS-14826
 URL: https://issues.apache.org/jira/browse/HDFS-14826
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Renukaprasad C


"dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
file with common value (port number - 8019) & different description.

This redundant entry to be removed.
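For reference, a sketch of the single entry that would remain after de-duplication; the description text is a placeholder, and whichever of the two current descriptions is accurate should be kept:
{code:xml}
<property>
  <name>dfs.ha.zkfc.port</name>
  <value>8019</value>
  <!-- placeholder: retain the accurate one of the two existing descriptions -->
  <description>RPC port used by the ZKFailoverController.</description>
</property>
{code}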



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924003#comment-16924003
 ] 

Renukaprasad C commented on HDFS-14826:
---

I would like to work on it. Could someone please assign it to me?

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Priority: Major
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1561) Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1561?focusedWorklogId=307681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307681
 ]

ASF GitHub Bot logged work on HDDS-1561:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:45
Start Date: 06/Sep/19 07:45
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on issue #1401: HDDS-1561: Mark 
OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
URL: https://github.com/apache/hadoop/pull/1401#issuecomment-528749936
 
 
   Test failures are not related.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307681)
Time Spent: 1.5h  (was: 1h 20m)

> Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
> -
>
> Key: HDDS-1561
> URL: https://issues.apache.org/jira/browse/HDDS-1561
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: HDDS-1561.001.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Right now, if a pipeline is destroyed by SCM, all the containers on the 
> pipeline are marked as quasi-closed when the datanode receives the close 
> container command. While processing these container reports, SCM marks the 
> containers as closed once a majority of the nodes are available.
> This is, however, not a sufficient condition in cases where the Raft log 
> directory is missing or corrupted, as the containers will not have all the 
> applied transactions.
> To solve this problem, we should QUASI_CLOSE the containers in the datanode 
> as part of Ratis groupRemove. If a container is in the OPEN state on a 
> datanode without any active pipeline, it will be marked as Unhealthy while 
> processing the close container command.
> cc [~jnp], [~shashikant], [~sdeka], [~nandakumar131]
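A rough sketch of the proposed hook (not the committed patch), assuming Ratis' StateMachine#notifyGroupRemove callback; the container accessor and transition method are hypothetical names:
{code:java}
// Illustrative only: quasi-close every still-OPEN container when the Ratis
// group (pipeline) is removed, so it cannot later be treated as cleanly closed.
@Override
public void notifyGroupRemove() {
  for (Container<?> container : containerSet.getContainers()) { // hypothetical accessor
    if (container.getContainerState() == ContainerDataProto.State.OPEN) {
      container.quasiClose(); // hypothetical transition; the real API may differ
    }
  }
}
{code}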



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1561) Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1561?focusedWorklogId=307682&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307682
 ]

ASF GitHub Bot logged work on HDDS-1561:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:45
Start Date: 06/Sep/19 07:45
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1401: 
HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
URL: https://github.com/apache/hadoop/pull/1401
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307682)
Time Spent: 1h 40m  (was: 1.5h)

> Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
> -
>
> Key: HDDS-1561
> URL: https://issues.apache.org/jira/browse/HDDS-1561
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: HDDS-1561.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Right now, if a pipeline is destroyed by SCM, all the containers on the 
> pipeline are marked as quasi-closed when the datanode receives the close 
> container command. While processing these container reports, SCM marks the 
> containers as closed once a majority of the nodes are available.
> This is, however, not a sufficient condition in cases where the Raft log 
> directory is missing or corrupted, as the containers will not have all the 
> applied transactions.
> To solve this problem, we should QUASI_CLOSE the containers in the datanode 
> as part of Ratis groupRemove. If a container is in the OPEN state on a 
> datanode without any active pipeline, it will be marked as Unhealthy while 
> processing the close container command.
> cc [~jnp], [~shashikant], [~sdeka], [~nandakumar131]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HDFS-14826:
---

Assignee: Renukaprasad C

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1561) Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

2019-09-06 Thread Nanda kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924005#comment-16924005
 ] 

Nanda kumar commented on HDDS-1561:
---

Thanks [~ljain] for the contribution. Merged to trunk.

> Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
> -
>
> Key: HDDS-1561
> URL: https://issues.apache.org/jira/browse/HDDS-1561
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: HDDS-1561.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Right now, if a pipeline is destroyed by SCM, all the containers on the 
> pipeline are marked as quasi-closed when the datanode receives the close 
> container command. While processing these container reports, SCM marks the 
> containers as closed once a majority of the nodes are available.
> This is, however, not a sufficient condition in cases where the Raft log 
> directory is missing or corrupted, as the containers will not have all the 
> applied transactions.
> To solve this problem, we should QUASI_CLOSE the containers in the datanode 
> as part of Ratis groupRemove. If a container is in the OPEN state on a 
> datanode without any active pipeline, it will be marked as Unhealthy while 
> processing the close container command.
> cc [~jnp], [~shashikant], [~sdeka], [~nandakumar131]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924006#comment-16924006
 ] 

Ayush Saxena commented on HDFS-14826:
-

Thanks [~prasad-acit] for the interest. I have added you to the contributors 
list and assigned the JIRA to you.
Welcome to Hadoop. :)

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1561) Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924007#comment-16924007
 ] 

Hudson commented on HDDS-1561:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17237 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17237/])
HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis (nanda: rev 
6e4cdf89effb11c5ec36578da83a46d3d3c48c11)
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/TestCSMMetrics.java
* (edit) 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestCloseContainerCommandHandler.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisServer.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
* (edit) hadoop-hdds/pom.xml
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/server/TestSecureContainerServer.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
* (edit) hadoop-ozone/pom.xml
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestCloseContainerByPipeline.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerData.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/server/TestContainerServer.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/CloseContainerCommandHandler.java


> Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
> -
>
> Key: HDDS-1561
> URL: https://issues.apache.org/jira/browse/HDDS-1561
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Blocker
>  Labels: pull-request-available
> Attachments: HDDS-1561.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Right now, if a pipeline is destroyed by SCM, all the containers on the 
> pipeline are marked as quasi-closed when the datanode receives the close 
> container command. While processing these container reports, SCM marks the 
> containers as closed once a majority of the nodes are available.
> This is, however, not a sufficient condition in cases where the Raft log 
> directory is missing or corrupted, as the containers will not have all the 
> applied transactions.
> To solve this problem, we should QUASI_CLOSE the containers in the datanode 
> as part of Ratis groupRemove. If a container is in the OPEN state on a 
> datanode without any active pipeline, it will be marked as Unhealthy while 
> processing the close container command.
> cc [~jnp], [~shashikant], [~sdeka], [~nandakumar131]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1561) Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

2019-09-06 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-1561:
--
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove
> -
>
> Key: HDDS-1561
> URL: https://issues.apache.org/jira/browse/HDDS-1561
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1561.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Right now, if a pipeline is destroyed by SCM, all the containers on the 
> pipeline are marked as quasi-closed when the datanode receives the close 
> container command. While processing these container reports, SCM marks the 
> containers as closed once a majority of the nodes are available.
> This is, however, not a sufficient condition in cases where the Raft log 
> directory is missing or corrupted, as the containers will not have all the 
> applied transactions.
> To solve this problem, we should QUASI_CLOSE the containers in the datanode 
> as part of Ratis groupRemove. If a container is in the OPEN state on a 
> datanode without any active pipeline, it will be marked as Unhealthy while 
> processing the close container command.
> cc [~jnp], [~shashikant], [~sdeka], [~nandakumar131]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2095) Submit MR job to YARN failed, error message is "Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"

2019-09-06 Thread luhuachao (Jira)
luhuachao created HDDS-2095:
---

 Summary: Submit MR job to YARN failed, error message is 
"Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"
 Key: HDDS-2095
 URL: https://issues.apache.org/jira/browse/HDDS-2095
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Filesystem
Affects Versions: 0.4.1
Reporter: luhuachao


Below is the submit command:
{code:java}
hadoop jar hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  nnbench 
-Dfs.defaultFS=o3fs://buc.volume-test  -maps 3   -bytesToWrite 1 -numberOfFiles 
1000  -blockSize 16  -operation create_write
{code}
The client fails with this message:
{code:java}
19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:345)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
    at org.apache.hadoop.hdfs.NNBench.runTests(NNBench.java:487)
    at org.apache.hadoop.hdfs.NNBench.run(NNBench.java:604)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.hdfs.NNBench.main(NNBench.java:579)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:144)
    at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:152)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:308)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:222)
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:304)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:299)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:330)
    ... 34 more
{code}
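The failure pattern points at ServiceLoader resolution: delegation-token renewal resolves TokenRenewer implementations through java.util.ServiceLoader, so a single META-INF/services entry naming a class that is missing from the JVM's classpath throws ServiceConfigurationError for the whole walk. A hedged diagnostic sketch:
{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.security.token.TokenRenewer;

// Illustrative probe: iterating the providers reproduces the
// ServiceConfigurationError if any registered renewer class is unloadable
// on this classpath (as happens in the ResourceManager here).
public class TokenRenewerProbe {
  public static void main(String[] args) {
    for (TokenRenewer renewer : ServiceLoader.load(TokenRenewer.class)) {
      System.out.println(renewer.getClass().getName());
    }
  }
}
{code}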
The corresponding log in the ResourceManager:
{code:java}
2019-09-06 15:26:51,836 WARN  security.DelegationTokenRenewer 
(DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(923)) - Unable to 
add the application to the delegation token renewer.
java.util.ServiceConfigurationError: 
org.apache.hadoop.security.token.TokenRenewer: Provider 
org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
at java.util.ServiceLoader.fail(ServiceLoader.java:239)
at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.security.token.

[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307688&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307688
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:55
Start Date: 06/Sep/19 07:55
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321616777
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   Do you think it makes sense to have a field inside DatanodeInfo of type 
NodeStatus, so we can always pass the states around as a pair, or should we add 
two individual fields to DatanodeInfo - nodeHealth and nodeOperationalState?
   
   Also, one other thing to consider: nodeStateMap originally kept lists of 
healthy, stale and dead nodes, so it was possible to quickly return all nodes 
in a given state. Now, however, we need to iterate over the whole list to find 
those nodes. One reason for this is that we have 15 different states now 
instead of 3. If we move nodeStatus into datanodeInfo, it would be more 
difficult to optimise this later if needed. However, it would simplify things 
if we simply remove this stateMap.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307688)
Time Spent: 3h 40m  (was: 3.5h)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states; this is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall number of states for a node is the cross-product 
> of the two lists above; however, it probably makes sense to keep the two 
> states separate internally.
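A compact sketch of the two-axis model described above; the class and enum names are illustrative and may not match the final patch:
{code:java}
// Liveness axis (what exists today) and administrative axis (new), kept as a pair.
enum NodeHealth { HEALTHY, STALE, DEAD }

enum NodeOperationalState {
  IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
  ENTERING_MAINTENANCE, IN_MAINTENANCE
}

final class NodeStatus {
  private final NodeHealth health;
  private final NodeOperationalState operationalState;

  NodeStatus(NodeHealth health, NodeOperationalState operationalState) {
    this.health = health;
    this.operationalState = operationalState;
  }

  NodeHealth getHealth() { return health; }
  NodeOperationalState getOperationalState() { return operationalState; }
}
{code}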



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2095) Submit MR job to YARN failed, error message is "Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"

2019-09-06 Thread luhuachao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luhuachao updated HDDS-2095:

Attachment: HDDS-2095.001.patch

> Submit MR job to YARN failed, error message is "Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"
> ---
>
> Key: HDDS-2095
> URL: https://issues.apache.org/jira/browse/HDDS-2095
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.4.1
>Reporter: luhuachao
>Priority: Major
> Attachments: HDDS-2095.001.patch
>
>
> Below is the submit command:
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  nnbench 
> -Dfs.defaultFS=o3fs://buc.volume-test  -maps 3   -bytesToWrite 1 
> -numberOfFiles 1000  -blockSize 16  -operation create_write
> {code}
> The client fails with this message:
> {code:java}
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
>     at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:345)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
>     at org.apache.hadoop.hdfs.NNBench.runTests(NNBench.java:487)
>     at org.apache.hadoop.hdfs.NNBench.run(NNBench.java:604)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.hdfs.NNBench.main(NNBench.java:579)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>     at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>     at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:144)
>     at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:152)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:308)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:222)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
>     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:304)
>     at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:299)
>     at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:330)
>     ... 34 more
> {code}
> The corresponding log in the ResourceManager:
> {code:java}
> 2019-09-06 15:26:51,836 WARN  security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(923)) - Unable to 
> add the application to the delegation token renewer.
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.To

[jira] [Updated] (HDDS-2095) Submit MR job to YARN failed, error message is "Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"

2019-09-06 Thread luhuachao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luhuachao updated HDDS-2095:

Status: Patch Available  (was: Open)

> Submit MR job to YARN failed, error message is "Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"
> ---
>
> Key: HDDS-2095
> URL: https://issues.apache.org/jira/browse/HDDS-2095
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.4.1
>Reporter: luhuachao
>Priority: Major
> Attachments: HDDS-2095.001.patch
>
>
> Below is the submit command:
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  nnbench 
> -Dfs.defaultFS=o3fs://buc.volume-test  -maps 3   -bytesToWrite 1 
> -numberOfFiles 1000  -blockSize 16  -operation create_write
> {code}
> The client fails with this message:
> {code:java}
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hdfs/.staging/job_1567754782562_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
>     at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:345)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
>     at org.apache.hadoop.hdfs.NNBench.runTests(NNBench.java:487)
>     at org.apache.hadoop.hdfs.NNBench.run(NNBench.java:604)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.hdfs.NNBench.main(NNBench.java:579)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>     at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>     at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:144)
>     at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:152)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:308)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:222)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1567754782562_0001 to YARN : org.apache.hadoop.security.token.TokenRenewer: Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found
>     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:304)
>     at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:299)
>     at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:330)
>     ... 34 more
> {code}
> The corresponding log in the ResourceManager:
> {code:java}
> 2019-09-06 15:26:51,836 WARN  security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(923)) - Unable to 
> add the application to the delegation token renewer.
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.tok

[jira] [Updated] (HDFS-14802) The feature of protected directories should be used in RenameOp

2019-09-06 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-14802:
---
Status: Patch Available  (was: Open)

> The feature of protected directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, 
> HDFS-14802.003.patch
>
>
> Now we can set fs.protected.directories to prevent users from deleting 
> important directories, but users can still work around the limitation:
> 1. Rename the directories and then delete them.
> 2. Move the directories to the trash, where the namenode will later delete 
> them.
> So I think we should apply the protected-directories feature in RenameOp as 
> well, as sketched below.
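A hedged sketch of the idea, reusing the descendant check that the delete path already performs for fs.protected.directories; the wrapper and the exact helper signature are illustrative and may differ across branches:
{code:java}
// Illustrative: reject a rename whose source subtree contains a protected
// directory, mirroring the existing delete-side check.
static void validateRenameSource(FSDirectory fsd, INodesInPath srcIIP)
    throws AccessControlException {
  // Same helper the delete path uses for fs.protected.directories.
  DFSUtil.checkProtectedDescendants(fsd, srcIIP);
}
{code}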



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14802) The feature of protected directories should be used in RenameOp

2019-09-06 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-14802:
---
Status: Open  (was: Patch Available)

> The feature of protected directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, 
> HDFS-14802.003.patch
>
>
> Now we can set fs.protected.directories to prevent users from deleting 
> important directories, but users can still work around the limitation:
> 1. Rename the directories and then delete them.
> 2. Move the directories to the trash, where the namenode will later delete 
> them.
> So I think we should apply the protected-directories feature in RenameOp as 
> well.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1899) DeleteBlocksCommandHandler is unable to find the container in SCM

2019-09-06 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned HDDS-1899:
---

Assignee: (was: Nanda kumar)

> DeleteBlocksCommandHandler is unable to find the container in SCM
> -
>
> Key: HDDS-1899
> URL: https://issues.apache.org/jira/browse/HDDS-1899
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> DeleteBlocksCommandHandler is unable to find a container in SCM.
> {code}
> 2019-08-02 14:04:56,735 WARN  commandhandler.DeleteBlocksCommandHandler 
> (DeleteBlocksCommandHandler.java:lambda$handle$0(140)) - Failed to delete 
> blocks for container=33, TXID=184
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the container 33
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.lambda$handle$0(DeleteBlocksCommandHandler.java:122)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at 
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.handle(DeleteBlocksCommandHandler.java:114)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307697
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 08:23
Start Date: 06/Sep/19 08:23
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1344: 
HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321626983
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   It is better to have `NodeStatus` inside `DatanodeInfo` rather than having 
two separate fields.
   
Yes, `stateMap` helped us to easily get the list/count of nodes in a 
specific state, but with the current changes it is not straightforward to 
maintain `state -> list of nodes`. In any case we will be iterating over all 
the available nodes to get the list of nodes in a given state. 
   The number of nodes in a cluster should not go beyond 3-4 orders of 
magnitude. We can revisit and optimize this if we run into any performance 
issue.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307697)
Time Spent: 3h 50m  (was: 3h 40m)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states; this is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall number of states for a node is the cross-product 
> of the two lists above; however, it probably makes sense to keep the two 
> states separate internally.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307698&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307698
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 08:24
Start Date: 06/Sep/19 08:24
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1344: 
HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321626983
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   It is better to have `NodeStatus` inside `DatanodeInfo` rather than having 
two separate fields.
   
Yes, `stateMap` helped us to easily get the list/count of nodes in a 
specific state, but with the current changes it is not straightforward to 
maintain `state -> list of nodes`. In any case we will be iterating over all 
the available nodes to get the list of nodes in a given state. 
   The number of nodes in a cluster should not go beyond 3-4 orders of 
magnitude. We can revisit and optimize this if we run into any performance 
issue.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307698)
Time Spent: 4h  (was: 3h 50m)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states; this is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall number of states for a node is the cross-product 
> of the two lists above; however, it probably makes sense to keep the two 
> states separate internally.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1899) DeleteBlocksCommandHandler is unable to find the container in SCM

2019-09-06 Thread Nanda kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924057#comment-16924057
 ] 

Nanda kumar commented on HDDS-1899:
---

There are valid scenarios in which this could happen.

When we get a delete-block request on an over-replicated container, the block 
deleting service will try to send a delete-block command to all the datanodes 
that have a replica.
At the same time, the replication manager may send a delete-container 
(replica) command to one (or more) of those datanodes.

There is a race condition here: if the delete-container command is added to 
the {{SCMNodeManager#commandQueue}} before the delete-block command, the 
container will be deleted on the datanode before the delete-block request is 
processed, which results in the {{StorageContainerException}}. One way to 
tolerate this is sketched below.
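A hedged sketch of that tolerance (the handler entry point is a hypothetical name): treat CONTAINER_NOT_FOUND on a delete-block transaction as benign, since the replica may already have been removed by a delete-container command.
{code:java}
try {
  handleDeleteBlocks(containerId, txId); // hypothetical entry point
} catch (StorageContainerException e) {
  if (e.getResult() == ContainerProtos.Result.CONTAINER_NOT_FOUND) {
    // Replica already gone: the delete-container command won the race.
    LOG.debug("Container {} already deleted; skipping delete-block TXID={}",
        containerId, txId);
  } else {
    throw e;
  }
}
{code}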

> DeleteBlocksCommandHandler is unable to find the container in SCM
> -
>
> Key: HDDS-1899
> URL: https://issues.apache.org/jira/browse/HDDS-1899
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> DeleteBlocksCommandHandler is unable to find a container in SCM.
> {code}
> 2019-08-02 14:04:56,735 WARN  commandhandler.DeleteBlocksCommandHandler 
> (DeleteBlocksCommandHandler.java:lambda$handle$0(140)) - Failed to delete 
> blocks for container=33, TXID=184
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the container 33
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.lambda$handle$0(DeleteBlocksCommandHandler.java:122)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at 
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.handle(DeleteBlocksCommandHandler.java:114)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-14826:
--
Attachment: HDFS-14826.001.patch

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.
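
For illustration only, the duplication looks roughly like this (both 
descriptions below are placeholders; see the actual file for the exact 
wording):

{code:xml}
<property>
  <name>dfs.ha.zkfc.port</name>
  <value>8019</value>
  <description>RPC port for Zookeeper Failover Controller.</description>
</property>
<!-- ... elsewhere in hdfs-default.xml: the redundant second entry with the
     same value but a different description, which the patch removes. -->
<property>
  <name>dfs.ha.zkfc.port</name>
  <value>8019</value>
  <description>ZKFC RPC port.</description>
</property>
{code}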



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-14826:
--
Status: Patch Available  (was: Open)

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1843) Undetectable corruption after restart of a datanode

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1843?focusedWorklogId=307712&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307712
 ]

ASF GitHub Bot logged work on HDDS-1843:


Author: ASF GitHub Bot
Created on: 06/Sep/19 09:14
Start Date: 06/Sep/19 09:14
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1364: 
HDDS-1843. Undetectable corruption after restart of a datanode.
URL: https://github.com/apache/hadoop/pull/1364#discussion_r321645502
 
 

 ##
 File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
 ##
 @@ -240,14 +240,46 @@ public ContainerReportsProto getContainerReport() throws 
IOException {
   }
 
   /**
-   * Builds the missing container set by taking a diff total no containers
-   * actually found and number of containers which actually got created.
+   * Builds the missing container set by taking a diff between the total
+   * number of containers actually found and the number of containers which
+   * actually got created. It also validates the BCSID stored in the snapshot
+   * file for each container against what is reported in the container scan.
    * This will only be called during the initialization of the Datanode
    * Service, when it is still not a part of any write Pipeline.
-   * @param createdContainerSet ContainerId set persisted in the Ratis snapshot
+   * @param container2BCSIDMap Map of containerId to BCSID persisted in the
+   *                           Ratis snapshot
    */
-  public void buildMissingContainerSet(Set<Long> createdContainerSet) {
-    missingContainerSet.addAll(createdContainerSet);
-    missingContainerSet.removeAll(containerMap.keySet());
+  public void buildMissingContainerSetAndValidate(
+      Map<Long, Long> container2BCSIDMap) {
+    container2BCSIDMap.entrySet().parallelStream().forEach((mapEntry) -> {
+      long id = mapEntry.getKey();
+      if (!containerMap.containsKey(id)) {
+        LOG.warn("Adding container {} to missing container set.", id);
+        missingContainerSet.add(id);
+      } else {
+        Container container = containerMap.get(id);
+        long containerBCSID = container.getBlockCommitSequenceId();
+        long snapshotBCSID = mapEntry.getValue();
+        if (containerBCSID < snapshotBCSID) {
+          LOG.warn(
+              "Marking container {} unhealthy as reported BCSID {} is smaller"
+                  + " than the Ratis snapshot recorded value {}", id,
+              containerBCSID, snapshotBCSID);
+          // Just mark the container unhealthy. Once the DatanodeStateMachine
+          // thread starts, it will send a container report to SCM, where
+          // these unhealthy containers will be detected.
+          try {
+            container.markContainerUnhealthy();
+          } catch (StorageContainerException sce) {
+            // The container will still be marked unhealthy in memory even if
+            // an exception occurs. It won't accept any new transactions and
+            // will be handled by SCM. Even if the DN restarts, it will still
+            // be detected as unhealthy as its BCSID won't change.
+            LOG.info("Unable to persist unhealthy state for container {}", id);
 
 Review comment:
   This should be `LOG.error`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307712)
Time Spent: 8h 50m  (was: 8h 40m)

> Undetectable corruption after restart of a datanode
> ---
>
> Key: HDDS-1843
> URL: https://issues.apache.org/jira/browse/HDDS-1843
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1843.000.patch
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Right now, all chunk writes use buffered IO, i.e. the sync flag is disabled by 
> default. Also, RocksDB metadata updates are done in the RocksDB cache first at 
> the Datanode. If the buffered chunk data as well as the corresponding metadata 
> update is lost as part of a datanode restart, it may not be possible to detect 
> corruption of this nature in a reasonable time frame (not even with the 
> container scanner), until and unless there is a client IO failure or the Recon 
> server detects it over time. To at least detect the problem, the Ratis 
> snapshot on the datanode should sync the RocksDB file; that way, the 
> ContainerScanner will be able to detect it. We can also add a metric around 
> the sync to measure how much throughput loss it incurs.
> Thanks [~msingh] for suggesting this.

[jira] [Work logged] (HDDS-1843) Undetectable corruption after restart of a datanode

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1843?focusedWorklogId=307714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307714
 ]

ASF GitHub Bot logged work on HDDS-1843:


Author: ASF GitHub Bot
Created on: 06/Sep/19 09:21
Start Date: 06/Sep/19 09:21
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on issue #1364: HDDS-1843. 
Undetectable corruption after restart of a datanode.
URL: https://github.com/apache/hadoop/pull/1364#issuecomment-528781001
 
 
   @bshashikant you might need to rebase the changes on top of HDDS-1561. Even 
though there is no conflict, the compilation fails.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307714)
Time Spent: 9h  (was: 8h 50m)

> Undetectable corruption after restart of a datanode
> ---
>
> Key: HDDS-1843
> URL: https://issues.apache.org/jira/browse/HDDS-1843
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1843.000.patch
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Right now, all chunk writes use buffered IO, i.e. the sync flag is disabled by 
> default. Also, RocksDB metadata updates are done in the RocksDB cache first at 
> the Datanode. If the buffered chunk data as well as the corresponding metadata 
> update is lost as part of a datanode restart, it may not be possible to detect 
> corruption of this nature in a reasonable time frame (not even with the 
> container scanner), until and unless there is a client IO failure or the Recon 
> server detects it over time. To at least detect the problem, the Ratis 
> snapshot on the datanode should sync the RocksDB file; that way, the 
> ContainerScanner will be able to detect it. We can also add a metric around 
> the sync to measure how much throughput loss it incurs.
> Thanks [~msingh] for suggesting this.
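
For illustration, a minimal sketch of the suggested sync, assuming a hook 
invoked when Ratis takes a snapshot (the class, fields, and method here are 
illustrative, not the actual Ozone API; only RocksDB's {{flushWal}} and 
Hadoop's {{Time.monotonicNow}} are real library calls):

{code:java}
import java.io.IOException;
import org.apache.hadoop.util.Time;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

// Hypothetical snapshot hook: flush and fsync the RocksDB write-ahead log
// when Ratis takes a snapshot, so buffered metadata survives a datanode
// restart and the ContainerScanner can trust the on-disk BCSID.
class SnapshotSyncSketch {
  private final RocksDB db;
  private long totalSyncTimeMs; // proposed metric: throughput cost of syncing

  SnapshotSyncSketch(RocksDB db) {
    this.db = db;
  }

  void syncOnSnapshot() throws IOException {
    long start = Time.monotonicNow();
    try {
      db.flushWal(true); // true => fsync the WAL to disk
    } catch (RocksDBException e) {
      throw new IOException("Failed to sync RocksDB during Ratis snapshot", e);
    }
    totalSyncTimeMs += Time.monotonicNow() - start;
  }
}
{code}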



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924081#comment-16924081
 ] 

Surendra Singh Lilhore commented on HDFS-14754:
---

Thanks [~hemanthboyina] for the patch.

Some comments on the test.
 # This test code is currently not able to reproduce the issue; the logic for 
creating the redundant block is wrong. It should be something like this: 
{code:java}
// one missing block
for (; i < groupSize - 1; i++) {
  blk.setBlockId(groupId + i);
  cluster.injectBlocks(i, Arrays.asList(blk), bpid);
}
cluster.triggerBlockReports();
// one redundant block
blk.setBlockId(groupId + 2);
cluster.injectBlocks(i, Arrays.asList(blk), bpid); {code}

 # DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY is configured with 0, which will 
not allow the NN to reconstruct the block. This configuration is not required.
 # 
{code:java}
+final LocatedBlock[] blocks =
+    StripedBlockUtil.parseStripedBlockGroup(bg, cellSize, 5, 1);{code}
Pass the correct data and parity block counts here (see the sketch below).
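
For the RS-3-2 policy described in this issue, that would presumably be 3 data 
and 2 parity blocks (a one-line correction mirroring the quoted test snippet; 
{{bg}} and {{cellSize}} come from the surrounding test):

{code:java}
// Assumed counts for RS-3-2: dataBlkNum = 3, parityBlkNum = 2.
final LocatedBlock[] blocks =
    StripedBlockUtil.parseStripedBlockGroup(bg, cellSize, 3, 2);
{code}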

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch
>
>
> Using EC RS-3-2, 6 DNs.
> We came across a scenario where, out of the EC group's 5 internal blocks, the 
> same block was replicated thrice and two blocks went missing.
> The replicated block was not being deleted, and the missing block could not 
> be reconstructed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2095) Submit mr job to yarn failed, Error message is "Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"

2019-09-06 Thread luhuachao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luhuachao updated HDDS-2095:

Labels: kerberos  (was: )

> Submit mr job to yarn failed, Error message is "Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"
> ---
>
> Key: HDDS-2095
> URL: https://issues.apache.org/jira/browse/HDDS-2095
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.4.1
>Reporter: luhuachao
>Priority: Major
>  Labels: kerberos
> Attachments: HDDS-2095.001.patch
>
>
> Below is the submit command: 
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  nnbench 
> -Dfs.defaultFS=o3fs://buc.volume-test  -maps 3   -bytesToWrite 1 
> -numberOfFiles 1000  -blockSize 16  -operation create_write
> {code}
> The client fails with the following message: 
> {code:java}
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/hdfs/.staging/job_1567754782562_0001
> 19/09/06 15:26:52 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/hdfs/.staging/job_1567754782562_0001
> java.io.IOException: 
> org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1567754782562_0001 to YARN : 
> org.apache.hadoop.security.token.TokenRenewer: Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found at 
> org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:345) at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
>  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570) at 
> org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567) at 
> org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576) at 
> org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571) 
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562) at 
> org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873) at 
> org.apache.hadoop.hdfs.NNBench.runTests(NNBench.java:487) at 
> org.apache.hadoop.hdfs.NNBench.run(NNBench.java:604) at 
> org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at 
> org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at 
> org.apache.hadoop.hdfs.NNBench.main(NNBench.java:579) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>  at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:144) at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:152) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.util.RunJar.run(RunJar.java:308) at 
> org.apache.hadoop.util.RunJar.main(RunJar.java:222)Caused by: 
> org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1567754782562_0001 to YARN : 
> org.apache.hadoop.security.token.TokenRenewer: Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:304)
>  at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:299)
>  at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:330) ... 34 
> more
> {code}
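
As an aside on the mechanics (a hedged sketch of standard 
{{java.util.ServiceLoader}} behavior, not Ozone-specific code): the lookup 
fails while iterating the providers, before a matching renewer is even 
selected, if any class named in a META-INF/services file is missing from the 
classpath.

{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.TokenRenewer;

// Sketch of the renewer lookup that Token.renew() performs: iterating the
// TokenRenewer providers throws ServiceConfigurationError("Provider ... not
// found") as soon as a class listed in META-INF/services cannot be loaded.
final class RenewerLookupSketch {
  static TokenRenewer findRenewer(Text kind) {
    for (TokenRenewer renewer : ServiceLoader.load(TokenRenewer.class)) {
      if (renewer.handleKind(kind)) { // never reached if loading fails first
        return renewer;
      }
    }
    return null; // no renewer registered for this token kind
  }
}
{code}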
> The log in the ResourceManager: 
> {code:java}
> 2019-09-06 15:26:51,836 WARN  security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(923)) - Unable to 
> add the application to the delegation token renewer.
> java.util.ServiceConfigurationError: 
> org.apach

[jira] [Commented] (HDDS-2095) Submit mr job to yarn failed, Error message is "Provider org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924092#comment-16924092
 ] 

Hadoop QA commented on HDDS-2095:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
38m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
44s{color} | {color:green} hadoop-hdds in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m  4s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2773/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-2095 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979624/HDDS-2095.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient |
| uname | Linux ffdeeb615c32 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 6e4cdf8 |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2773/artifact/out/patch-unit-hadoop-ozone.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2773/testReport/ |
| Max. process+thread count | 1311 (vs. ulimit of 5500) |
| modules | C: hadoop-ozone/ozonefs U: hadoop-ozone/ozonefs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2773/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |


This message was automatically generated.



> Submit mr job to yarn failed, Error message is "Provider 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl$Renewer not found"
> --

[jira] [Commented] (HDFS-14795) Add Throttler for writing block

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924098#comment-16924098
 ] 

Hadoop QA commented on HDFS-14795:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 42s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 14s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14795 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979619/HDFS-14795.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux dce61be81da8 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 494d75e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27799/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27799/testReport/ |

[jira] [Commented] (HDFS-14699) Erasure Coding: Can NOT trigger the reconstruction when have the dup internal blocks and missing one internal block

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924128#comment-16924128
 ] 

Surendra Singh Lilhore commented on HDFS-14699:
---

Thanks [~zhaoyim] for the patch and [~ayushtkn] for the review...
 # First, the Jira description does not match the fix. It should be something 
like "Storage not considered in live replica when replication streams hard 
limit reached to threshold".
 # 
{quote}if we move liveBlockIndices.add(blockIndex) before following block. In 
this way it will introduce the DN high resource usage (CPU and Memory).
{quote}
How will it cause high resource usage? This DN is not getting added to the 
source nodes, and only source nodes are used for reconstruction work.

> Erasure Coding: Can NOT trigger the reconstruction when have the dup internal 
> blocks and missing one internal block
> ---
>
> Key: HDFS-14699
> URL: https://issues.apache.org/jira/browse/HDFS-14699
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.2.0, 3.1.1, 3.3.0
>Reporter: Zhao Yi Ming
>Assignee: Zhao Yi Ming
>Priority: Critical
>  Labels: patch
> Attachments: HDFS-14699.00.patch, HDFS-14699.01.patch, 
> HDFS-14699.02.patch, HDFS-14699.03.patch, HDFS-14699.04.patch, 
> HDFS-14699.05.patch, image-2019-08-20-19-58-51-872.png, 
> image-2019-09-02-17-51-46-742.png
>
>
> We tried the EC function on an 80-node cluster with Hadoop 3.1.1 and hit the 
> same scenario as described in https://issues.apache.org/jira/browse/HDFS-8881. 
> Following are our testing steps; hope they are helpful. (The following DNs 
> have the testing internal blocks.)
>  # We customized a new 10-2-1024k policy and used it on a path; now we have 
> 12 internal blocks (12 live blocks).
>  # We decommissioned one DN; after the decommission completed, we had 13 
> internal blocks (12 live blocks and 1 decommissioned block).
>  # We then shut down one DN which did not have the same block id as the 
> decommissioned block; now we have 12 internal blocks (11 live blocks and 1 
> decommissioned block).
>  # After waiting for about 600s (before the heartbeat came), we 
> recommissioned the decommissioned DN; now we have 12 internal blocks (11 live 
> blocks and 1 duplicate block).
>  # EC then does not reconstruct the missing block.
> We think this is a critical issue for using the EC function in a production 
> env. Could you help? Thanks a lot!



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1843) Undetectable corruption after restart of a datanode

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1843?focusedWorklogId=307755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307755
 ]

ASF GitHub Bot logged work on HDDS-1843:


Author: ASF GitHub Bot
Created on: 06/Sep/19 11:08
Start Date: 06/Sep/19 11:08
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1364: HDDS-1843. 
Undetectable corruption after restart of a datanode.
URL: https://github.com/apache/hadoop/pull/1364
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307755)
Time Spent: 9h 10m  (was: 9h)

> Undetectable corruption after restart of a datanode
> ---
>
> Key: HDDS-1843
> URL: https://issues.apache.org/jira/browse/HDDS-1843
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1843.000.patch
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Right now, all chunk writes use buffered IO, i.e. the sync flag is disabled by 
> default. Also, RocksDB metadata updates are done in the RocksDB cache first at 
> the Datanode. If the buffered chunk data as well as the corresponding metadata 
> update is lost as part of a datanode restart, it may not be possible to detect 
> corruption of this nature in a reasonable time frame (not even with the 
> container scanner), until and unless there is a client IO failure or the Recon 
> server detects it over time. To at least detect the problem, the Ratis 
> snapshot on the datanode should sync the RocksDB file; that way, the 
> ContainerScanner will be able to detect it. We can also add a metric around 
> the sync to measure how much throughput loss it incurs.
> Thanks [~msingh] for suggesting this.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1843) Undetectable corruption after restart of a datanode

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1843?focusedWorklogId=307767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307767
 ]

ASF GitHub Bot logged work on HDDS-1843:


Author: ASF GitHub Bot
Created on: 06/Sep/19 11:25
Start Date: 06/Sep/19 11:25
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1364: HDDS-1843. 
Undetectable corruption after restart of a datanode.
URL: https://github.com/apache/hadoop/pull/1364
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307767)
Time Spent: 9h 20m  (was: 9h 10m)

> Undetectable corruption after restart of a datanode
> ---
>
> Key: HDDS-1843
> URL: https://issues.apache.org/jira/browse/HDDS-1843
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1843.000.patch
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Right now, all chunk writes use buffered IO, i.e. the sync flag is disabled by 
> default. Also, RocksDB metadata updates are done in the RocksDB cache first at 
> the Datanode. If the buffered chunk data as well as the corresponding metadata 
> update is lost as part of a datanode restart, it may not be possible to detect 
> corruption of this nature in a reasonable time frame (not even with the 
> container scanner), until and unless there is a client IO failure or the Recon 
> server detects it over time. To at least detect the problem, the Ratis 
> snapshot on the datanode should sync the RocksDB file; that way, the 
> ContainerScanner will be able to detect it. We can also add a metric around 
> the sync to measure how much throughput loss it incurs.
> Thanks [~msingh] for suggesting this.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924176#comment-16924176
 ] 

Hadoop QA commented on HDFS-14826:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
39m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m  2s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}157m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14826 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979633/HDFS-14826.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  xml  |
| uname | Linux d9ccb31a56ce 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6e4cdf8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27800/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27800/testReport/ |
| Max. process+thread count | 2805 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27800/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826

[jira] [Commented] (HDFS-14777) RBF: Set ReadOnly is failing for mount Table but actually readonly succeed to set

2019-09-06 Thread Ranith Sardar (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924189#comment-16924189
 ] 

Ranith Sardar commented on HDFS-14777:
--

 Thanks [~surendrasingh] [~elgoiri].

> RBF: Set ReadOnly is failing for mount Table but actually readonly succeed to 
> set
> -
>
> Key: HDFS-14777
> URL: https://issues.apache.org/jira/browse/HDFS-14777
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14777.001.patch, HDFS-14777.002.patch, 
> HDFS-14777.003.patch, HDFS-14777.004.patch
>
>
> # hdfs dfsrouteradmin -update /test hacluster /test -readonly
> /opt/client # hdfs dfsrouteradmin -update /test hacluster /test -readonly
> update: /test is in a read only mount point
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): /test is in a 
> read only mount point
>  at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getLocationsForPath(RouterRpcServer.java:1419)
>  at org.apache.hadoop.hdfs.server.federation.router.Quota.getQuotaRemoteLocations(Quota.java:217)
>  at org.apache.hadoop.hdfs.server.federation.router.Quota.setQuota(Quota.java:75)
>  at org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.synchronizeQuota(RouterAdminServer.java:288)
>  at org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer.updateMountTableEntry(RouterAdminServer.java:267)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12571) Ozone: remove spaces from the beginning of the hdfs script

2019-09-06 Thread Inderpreet Kaur Jhajj (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924191#comment-16924191
 ] 

Inderpreet Kaur Jhajj commented on HDFS-12571:
--

I am a macOS user and am still getting the mentioned error.

Can you please point me to any currently open issue specific to macOS?

Or, if any fix has been suggested, please advise.

As per this log, there was an extra space in the hdfs script, but I am not sure 
which file I should be editing (if it is possible to fix locally on one's 
system).

Currently, I am stuck and cannot run Hadoop in pseudo-distributed mode because 
of this problem.

> Ozone: remove spaces from the beginning of the hdfs script  
> 
>
> Key: HDFS-12571
> URL: https://issues.apache.org/jira/browse/HDFS-12571
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Critical
>  Labels: ozoneMerge
> Fix For: HDFS-7240
>
> Attachments: HDFS-12571-HDFS-7240.001.patch
>
>
> It seems that during one of the previous merges some unnecessary spaces were 
> added to the hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs file.
> After a dist build I cannot start the server with the hdfs command:
> {code}
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-functions.sh: line 398: 
> syntax error near unexpected token `<'
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-functions.sh: line 398: `  
> done < <(for text in "${input[@]}"; do'
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 70: 
> hadoop_deprecate_envvar: command not found
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 87: 
> hadoop_bootstrap: command not found
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 104: 
> hadoop_parse_args: command not found
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 105: shift: 
> : numeric argument required
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 110: 
> hadoop_find_confdir: command not found
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 111: 
> hadoop_exec_hadoopenv: command not found
> /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 112: 
> hadoop_import_shellprofiles: command not found
> {code}
> See the space here:
> https://github.com/apache/hadoop/blob/d0bd0f623338dbb558d0dee5e747001d825d92c5/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
> Or see the latest version at:
> https://github.com/apache/hadoop/blob/HDFS-7240/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
> To be honest, I don't understand how it could work for others, as it seems to 
> be an older change. Maybe some git magic removed it on OSX (I use Linux). 
> Anyway, I uploaded a patch to fix it.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924196#comment-16924196
 ] 

Surendra Singh Lilhore commented on HDFS-14826:
---

+1, LGTM

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14827) RBF: Shared DN should display all info's in Router DtaNode UI

2019-09-06 Thread Ranith Sardar (Jira)
Ranith Sardar created HDFS-14827:


 Summary: RBF: Shared DN should display all info's in Router 
DtaNode UI
 Key: HDFS-14827
 URL: https://issues.apache.org/jira/browse/HDFS-14827
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ranith Sardar
Assignee: Ranith Sardar






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14827) RBF: Shared DN should display all info's in Router DataNode UI

2019-09-06 Thread Ranith Sardar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranith Sardar updated HDFS-14827:
-
Summary: RBF: Shared DN should display all info's in Router DataNode UI  
(was: RBF: Shared DN should display all info's in Router DtaNode UI)

> RBF: Shared DN should display all info's in Router DataNode UI
> --
>
> Key: HDFS-14827
> URL: https://issues.apache.org/jira/browse/HDFS-14827
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2076) Read fails because the block cannot be located in the container

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2076?focusedWorklogId=307796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307796
 ]

ASF GitHub Bot logged work on HDDS-2076:


Author: ASF GitHub Bot
Created on: 06/Sep/19 12:26
Start Date: 06/Sep/19 12:26
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #1410: HDDS-2076. 
Read fails because the block cannot be located in the container
URL: https://github.com/apache/hadoop/pull/1410
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307796)
Remaining Estimate: 0h
Time Spent: 10m

> Read fails because the block cannot be located in the container
> ---
>
> Key: HDDS-2076
> URL: https://issues.apache.org/jira/browse/HDDS-2076
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: log.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Read fails as the client is not able to read the block from the container.
> {code}
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the block with bcsID 2515 .Container 7 bcsId is 0.
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:536)
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$getValidatorList$0(ContainerProtocolCalls.java:569)
> (interleaved SCMAudit log line: 2019-08-30 12:51:20,081 | INFO  | SCMAudit | 
> user=msingh | ip=192.168.0.103 | ...)
> {code}
> The client eventually exits here
> {code}
> 2019-08-30 12:51:20,081 [pool-224-thread-6] ERROR 
> ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:readData(176)) - 
> LOADGEN: Read key:pool-224-thread-6_330651 failed with exception
> ERROR ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:load(121)) - 
> LOADGEN: Exiting due to exception
> {code}
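
A hedged reconstruction of the datanode-side check implied by the log above 
(simplified and with assumed names; not the exact Ozone code, though 
{{StorageContainerException}} and {{ContainerProtos.Result}} are real types):

{code:java}
import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos;
import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;

// Sketch: a read carries the block-commit sequence id (BCSID) it expects;
// the block cannot be served if the container's persisted bcsId lags behind.
final class BcsIdCheckSketch {
  static void validateBcsId(long requestedBcsId, long containerBcsId,
      long containerId) throws StorageContainerException {
    if (containerBcsId < requestedBcsId) {
      throw new StorageContainerException(
          "Unable to find the block with bcsID " + requestedBcsId
              + " .Container " + containerId + " bcsId is "
              + containerBcsId + ".",
          ContainerProtos.Result.UNKNOWN_BCSID);
    }
  }
}
{code}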



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2076) Read fails because the block cannot be located in the container

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2076:
-
Labels: MiniOzoneChaosCluster pull-request-available  (was: 
MiniOzoneChaosCluster)

> Read fails because the block cannot be located in the container
> ---
>
> Key: HDDS-2076
> URL: https://issues.apache.org/jira/browse/HDDS-2076
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: log.zip
>
>
> Read fails as the client is not able to read the block from the container.
> {code}
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the block with bcsID 2515 .Container 7 bcsId is 0.
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:536)
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$getValidatorList$0(ContainerProtocolCalls.java:569)
> (interleaved SCMAudit log line: 2019-08-30 12:51:20,081 | INFO  | SCMAudit | 
> user=msingh | ip=192.168.0.103 | ...)
> {code}
> The client eventually exits here
> {code}
> 2019-08-30 12:51:20,081 [pool-224-thread-6] ERROR 
> ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:readData(176)) - 
> LOADGEN: Read key:pool-224-thread-6_330651 failed with exception
> ERROR ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:load(121)) - 
> LOADGEN: Exiting due to exception
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924210#comment-16924210
 ] 

Hudson commented on HDFS-14826:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17238 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17238/])
HDFS-14826. dfs.ha.zkfc.port property duplicated in hdfs-default.xml. 
(surendralilhore: rev fa7f03fc560e5c19383a54db76583c065e9df352)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2076) Read fails because the block cannot be located in the container

2019-09-06 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-2076:
--
Status: Patch Available  (was: Open)

> Read fails because the block cannot be located in the container
> ---
>
> Key: HDDS-2076
> URL: https://issues.apache.org/jira/browse/HDDS-2076
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: MiniOzoneChaosCluster, pull-request-available
> Attachments: log.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Read fails as the client is not able to read the block from the container.
> {code}
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the block with bcsID 2515 .Container 7 bcsId is 0.
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:536)
> at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$getValidatorList$0(ContainerProtocolCalls.java:569)
> (interleaved SCMAudit log line: 2019-08-30 12:51:20,081 | INFO  | SCMAudit | 
> user=msingh | ip=192.168.0.103 | ...)
> {code}
> The client eventually exits here
> {code}
> 2019-08-30 12:51:20,081 [pool-224-thread-6] ERROR 
> ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:readData(176)) - 
> LOADGEN: Read key:pool-224-thread-6_330651 failed with exception
> ERROR ozone.MiniOzoneLoadGenerator (MiniOzoneLoadGenerator.java:load(121)) - 
> LOADGEN: Exiting due to exception
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924211#comment-16924211
 ] 

Surendra Singh Lilhore commented on HDFS-14826:
---

Committed to trunk.

Thanks [~prasad-acit] for the contribution.

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.






[jira] [Updated] (HDFS-14826) dfs.ha.zkfc.port property duplicated in hdfs-default.xml

2019-09-06 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-14826:
--
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> dfs.ha.zkfc.port property duplicated in hdfs-default.xml
> 
>
> Key: HDFS-14826
> URL: https://issues.apache.org/jira/browse/HDFS-14826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14826.001.patch
>
>
> "dfs.ha.zkfc.port" property configuration is duplicated in hdfs-default.xml 
> file with common value (port number - 8019) & different description.
> This redundant entry to be removed.






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924215#comment-16924215
 ] 

Wei-Chiu Chuang commented on HDFS-14810:


[~ayushtkn] I'm really sorry for coming to this late. I am reviewing this and 
promise to give an update ASAP. Please hold off for a bit.

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch, HDFS-14810.002.patch, 
> HDFS-14810.003.patch, HDFS-14810.004.patch
>
>
> Refactor and unify the edit log sync logic in FSNamesystem, as mentioned in 
> HDFS-11246.






[jira] [Updated] (HDFS-13913) LazyPersistFileScrubber.run() should log meaningful warn message

2019-09-06 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13913:
--
Summary: LazyPersistFileScrubber.run() should log meaningful warn message  
(was: LazyPersistFileScrubber.run() error handling is poor)

> LazyPersistFileScrubber.run() should log meaningful warn message
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.
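
A minimal sketch of the handling the description argues for, assuming {{clearCorruptLazyPersistFiles()}} declares {{IOException}} (the exact exception types should match what the method really throws):

{code:java}
try {
  clearCorruptLazyPersistFiles();
} catch (IOException e) {
  // Tolerated failure: log at WARN with enough context for an admin,
  // and rely on the next scrubber run to retry.
  FSNamesystem.LOG.warn("LazyPersistFileScrubber failed to clear corrupt"
      + " lazy persist files; will retry on the next run.", e);
}
{code}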






[jira] [Commented] (HDFS-13913) LazyPersistFileScrubber.run() should log meaningful warn message

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924219#comment-16924219
 ] 

Surendra Singh Lilhore commented on HDFS-13913:
---

+1

> LazyPersistFileScrubber.run() should log meaningful warn message
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.






[jira] [Updated] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14754:
-
Attachment: HDFS-14754.004.patch

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch
>
>
> Using EC RS-3-2 with 6 DNs, we came across a scenario where, of the 5 blocks 
> in an EC block group, the same block was replicated three times and two 
> blocks went missing.
> The over-replicated block was not being deleted, and the missing blocks could 
> not be reconstructed.






[jira] [Commented] (HDFS-13913) LazyPersistFileScrubber.run() should log meaningful warn message

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924222#comment-16924222
 ] 

Hudson commented on HDFS-13913:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17239 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17239/])
HDFS-13913. LazyPersistFileScrubber.run() should log meaningful warn 
(surendralilhore: rev d98c54816d21d59c4d877ae4b1917b22268ffcef)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> LazyPersistFileScrubber.run() should log meaningful warn message
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.






[jira] [Updated] (HDFS-13913) LazyPersistFileScrubber.run() should log meaningful warn message

2019-09-06 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13913:
--
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> LazyPersistFileScrubber.run() should log meaningful warn message
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.






[jira] [Commented] (HDFS-13913) LazyPersistFileScrubber.run() should log meaningful warn message

2019-09-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924223#comment-16924223
 ] 

Surendra Singh Lilhore commented on HDFS-13913:
---

Thanks [~d...@cloudera.com] for the contribution. Thanks [~templedf] for the review.

Committed to trunk.

> LazyPersistFileScrubber.run() should log meaningful warn message
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.






[jira] [Commented] (HDFS-14758) Decrease lease hard limit

2019-09-06 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924226#comment-16924226
 ] 

hemanthboyina commented on HDFS-14758:
--

Attached the patch; please review, [~kihwal] [~jojochuang].

> Decrease lease hard limit
> -
>
> Key: HDFS-14758
> URL: https://issues.apache.org/jira/browse/HDFS-14758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14758.001.patch
>
>
> The hard limit is currently hard-coded to 1 hour. This also determines the 
> NN's automatic lease recovery interval. Something like 20 minutes would make 
> more sense.
> After the 5 min soft limit, other clients can recover the lease. If no one 
> else takes the lease away, the original client can still renew the lease 
> within the hard limit. So even after an NN full GC of 8 minutes, leases can 
> still be valid.
> However, there is one risk in reducing the hard limit, e.g. to 20 minutes: if 
> the NN crashes and the manual failover takes more than 20 minutes, clients 
> will abort.
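
As a rough sketch, the proposal amounts to changing one hard-coded constant (name and location are assumptions, e.g. HdfsConstants; not the actual patch):

{code:java}
// was: 60 * 60 * 1000 (1 hour)
public static final long LEASE_HARDLIMIT_PERIOD = 20 * 60 * 1000; // 20 min
{code}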






[jira] [Commented] (HDFS-14452) Make Op#valueOf() Public

2019-09-06 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924227#comment-16924227
 ] 

hemanthboyina commented on HDFS-14452:
--

_other custom implementations that want to store the Op code a different way._

Can you specify these, [~belugabehr], so that we can push the patch forward?

Thanks.

> Make Op#valueOf() Public
> 
>
> Key: HDFS-14452
> URL: https://issues.apache.org/jira/browse/HDFS-14452
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ipc
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: hemanthboyina
>Priority: Minor
>  Labels: noob
> Attachments: HDFS-14452.patch
>
>
> Change the signature of {{private static Op valueOf(byte code)}} to be 
> public. Right now, the only easy way to look up an Op is to pass in a 
> {{DataInput}} object, which is not all that flexible or efficient for other 
> custom implementations that want to store the Op code a different way.
> https://github.com/apache/hadoop/blob/8c95cb9d6bef369fef6a8364f0c0764eba90e44a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/Op.java#L53
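
The proposal is just a visibility change; roughly (a sketch mirroring the existing lookup in Op.java, not a new implementation):

{code:java}
// Before: private static Op valueOf(byte code)
public static Op valueOf(byte code) {
  for (Op op : values()) {
    if (op.code == code) { // 'code' is the opcode byte carried by the enum
      return op;
    }
  }
  return null;
}
{code}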






[jira] [Commented] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924229#comment-16924229
 ] 

Chen Zhang commented on HDFS-14609:
---

Sorry, I uploaded the wrong file for patch v3; re-submitting again.

> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
> URL: https://issues.apache.org/jira/browse/HDFS-14609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: CR Hota
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14609.001.patch, HDFS-14609.002.patch, 
> HDFS-14609.003.patch, HDFS-14609.003.patch
>
>
> We worked on router-based federation security as part of HDFS-13532. We kept 
> it compatible with the way the namenode works. However, with HADOOP-16314 and 
> HADOOP-16354 in trunk, the auth filters seem to have changed, causing tests 
> to fail.
> Changes are needed appropriately in RBF, mainly fixing the broken tests.






[jira] [Updated] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Chen Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-14609:
--
Attachment: HDFS-14609.003.patch

> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
> URL: https://issues.apache.org/jira/browse/HDFS-14609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: CR Hota
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14609.001.patch, HDFS-14609.002.patch, 
> HDFS-14609.003.patch, HDFS-14609.003.patch
>
>
> We worked on router-based federation security as part of HDFS-13532. We kept 
> it compatible with the way the namenode works. However, with HADOOP-16314 and 
> HADOOP-16354 in trunk, the auth filters seem to have changed, causing tests 
> to fail.
> Changes are needed appropriately in RBF, mainly fixing the broken tests.






[jira] [Work logged] (HDDS-1843) Undetectable corruption after restart of a datanode

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1843?focusedWorklogId=307831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307831
 ]

ASF GitHub Bot logged work on HDDS-1843:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:09
Start Date: 06/Sep/19 13:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1364: HDDS-1843. 
Undetectable corruption after restart of a datanode.
URL: https://github.com/apache/hadoop/pull/1364#issuecomment-528848045
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 39 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 4 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 66 | Maven dependency ordering for branch |
   | +1 | mvninstall | 588 | trunk passed |
   | +1 | compile | 382 | trunk passed |
   | +1 | checkstyle | 81 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 872 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 179 | trunk passed |
   | 0 | spotbugs | 421 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 616 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 41 | Maven dependency ordering for patch |
   | +1 | mvninstall | 575 | the patch passed |
   | +1 | compile | 389 | the patch passed |
   | +1 | cc | 389 | the patch passed |
   | +1 | javac | 389 | the patch passed |
   | +1 | checkstyle | 83 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 700 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 177 | the patch passed |
   | +1 | findbugs | 687 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 298 | hadoop-hdds in the patch passed. |
   | -1 | unit | 200 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 48 | The patch does not generate ASF License warnings. |
   | | | 6200 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1364/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1364 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc 
mvninstall shadedclient findbugs checkstyle |
   | uname | Linux 806a3bdb7024 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 6e4cdf8 |
   | Default Java | 1.8.0_222 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1364/9/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1364/9/testReport/ |
   | Max. process+thread count | 1298 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/container-service 
hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1364/9/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 307831)
Time Spent: 9.5h  (was: 9h 20m)

> Undetectable corruption after restart of a datanode
> ---
>
> Key: HDDS-1843
> URL: https://issues.apache.org/jira/browse/HDDS-1843
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-1843.000.patch
>
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> 

[jira] [Work logged] (HDDS-2076) Read fails because the block cannot be located in the container

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2076?focusedWorklogId=307883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307883
 ]

ASF GitHub Bot logged work on HDDS-2076:


Author: ASF GitHub Bot
Created on: 06/Sep/19 14:17
Start Date: 06/Sep/19 14:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1410: HDDS-2076. Read 
fails because the block cannot be located in the container
URL: https://github.com/apache/hadoop/pull/1410#issuecomment-528872753
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 73 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 73 | Maven dependency ordering for branch |
   | +1 | mvninstall | 630 | trunk passed |
   | +1 | compile | 374 | trunk passed |
   | +1 | checkstyle | 74 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 966 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 180 | trunk passed |
   | 0 | spotbugs | 482 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 713 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 37 | Maven dependency ordering for patch |
   | +1 | mvninstall | 588 | the patch passed |
   | +1 | compile | 433 | the patch passed |
   | +1 | javac | 433 | the patch passed |
   | -0 | checkstyle | 46 | hadoop-ozone: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 755 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 191 | the patch passed |
   | +1 | findbugs | 704 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 218 | hadoop-hdds in the patch failed. |
   | -1 | unit | 236 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 39 | The patch does not generate ASF License warnings. |
   | | | 6566 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.container.ozoneimpl.TestOzoneContainer |
   |   | hadoop.ozone.container.keyvalue.TestKeyValueContainer |
   |   | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1410 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 6744f31fdc7d 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / fa7f03f |
   | Default Java | 1.8.0_222 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/artifact/out/diff-checkstyle-hadoop-ozone.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/testReport/ |
   | Max. process+thread count | 1270 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/container-service hadoop-ozone/integration-test 
U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1410/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 307883)
Time Spent: 20m  (was: 10m)

> Read fails because the block cannot be located in the container
> ---
>
> Key: HDDS-2076
> URL: https://issues.apache.org/jira/browse/HDDS-2076
> Project: Hadoop Di

[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-09-06 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Summary: Failover from Active to Standby Failed  (was: [SBN Read]Failover 
from Active to Standby Failed)

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14528.003.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
> *Started an HA cluster with three nodes [ _Active, Standby, Observer_ ]*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in two cases: when any other standby namenode is down or 
> when any other zkfc is down.






[jira] [Commented] (HDDS-1899) DeleteBlocksCommandHandler is unable to find the container in SCM

2019-09-06 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924282#comment-16924282
 ] 

Lokesh Jain commented on HDDS-1899:
---

[~nandakumar131] I think the exception is harmless. It is thrown when the 
container cannot be found before processing a DeleteBlocks command. As you 
mentioned, this can happen because the replication manager deleted the 
container before the block deletion was processed.

There is another issue, however. Currently all the synchronization is done by 
locking the container object itself. In the delete-container case, the 
container is removed from the containerSet, but the container object may still 
be alive and can be used to acquire a lock on the container. Also, in 
deleteContainer we delete the container outside the lock, which could race 
with other operations.

With the current locking semantics we need to check whether the container 
still exists after acquiring its lock, and container deletion should be done 
inside the lock itself.
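
A minimal sketch of that pattern (method names are illustrative, not the exact Container/ContainerSet API):

{code:java}
container.writeLock();
try {
  // Re-check existence after acquiring the lock: the container may have
  // been concurrently deleted by the replication manager.
  if (containerSet.getContainer(containerId) == null) {
    throw new StorageContainerException("Container " + containerId
        + " not found", CONTAINER_NOT_FOUND);
  }
  // Perform the deletion while still holding the lock, so it cannot race
  // with other operations on the same container.
  containerSet.removeContainer(containerId);
} finally {
  container.writeUnlock();
}
{code}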

> DeleteBlocksCommandHandler is unable to find the container in SCM
> -
>
> Key: HDDS-1899
> URL: https://issues.apache.org/jira/browse/HDDS-1899
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> DeleteBlocksCommandHandler is unable to find a container in SCM.
> {code}
> 2019-08-02 14:04:56,735 WARN  commandhandler.DeleteBlocksCommandHandler 
> (DeleteBlocksCommandHandler.java:lambda$handle$0(140)) - Failed to delete 
> blocks for container=33, TXID=184
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Unable to find the container 33
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.lambda$handle$0(DeleteBlocksCommandHandler.java:122)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at 
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.handle(DeleteBlocksCommandHandler.java:114)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432)
> at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-09-06 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Description: 
 *In a cluster with more than one Standby namenode, manual failover throws 
exception for some cases*

*When trying to execute the failover command from active to standby,* 

*_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*

  Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
connection exception: java.net.ConnectException: Connection refused

This is encountered in the following cases:

 Scenario 1 : 

Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

When trying to manually failover from NN1 TO NN2 if NN3 is down, Exception is 
thrown

Scenario 2 :

 Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3

When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
down, Exception is thrown

  was:
*Started an HA Cluster with three nodes [ _Active ,Standby ,Observer_ ]*

*When trying to execute the failover command from active to standby,* 

*_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*

  Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
connection exception: java.net.ConnectException: Connection refused

This is encountered in two cases: when any other standby namenode is down or 
when any other zkfc is down.


> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14528.003.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, manual failover throws 
> exception for some cases*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases:
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 TO NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown






[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-09-06 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Description: 
 *In a cluster with more than one Standby namenode, manual failover throws 
exception for some cases*

*When trying to execute the failover command from active to standby,* 

*_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*

  Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
connection exception: java.net.ConnectException: Connection refused

This is encountered in the following cases:

 Scenario 1 : 

Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
thrown

Scenario 2 :

 Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3

When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
down, Exception is thrown

  was:
 *In a cluster with more than one Standby namenode, manual failover throws 
exception for some cases*

*When trying to execute the failover command from active to standby,* 

*_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*

  Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
connection exception: java.net.ConnectException: Connection refused

This is encountered in the following cases:

 Scenario 1 : 

Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

When trying to manually failover from NN1 TO NN2 if NN3 is down, Exception is 
thrown

Scenario 2 :

 Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)

ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3

When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
down, Exception is thrown


> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14528.003.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, manual failover throws 
> exception for some cases*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases:
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown






[jira] [Commented] (HDFS-14528) Failover from Active to Standby Failed

2019-09-06 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924289#comment-16924289
 ] 

Ravuri Sushma sree commented on HDFS-14528:
---

Thanks [~csun],

No, adding the remote host twice wasn't intended. I will upload a patch 
correcting the same.

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14528.003.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, manual failover throws 
> exception for some cases*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases:
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover using NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown






[jira] [Commented] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924298#comment-16924298
 ] 

Hadoop QA commented on HDFS-14609:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch 
generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 52s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 22m 
41s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14609 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979666/HDFS-14609.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cb997fd56e39 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d98c548 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27802/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27802/testReport/ |
| Max. process+thread count | 1599 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27802/console |
| Powered by | Apache Yetus

[jira] [Commented] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924330#comment-16924330
 ] 

Hadoop QA commented on HDFS-14754:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m  0s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 475 unchanged - 
0 fixed = 476 total (was 475) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 110 unchanged - 0 fixed = 111 total (was 110) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 22s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14754 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979665/HDFS-14754.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 38e047321941 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d98c548 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27801/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27801/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-

[jira] [Updated] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Chen Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-14609:
--
Attachment: HDFS-14609.004.patch

> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
> URL: https://issues.apache.org/jira/browse/HDFS-14609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: CR Hota
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14609.001.patch, HDFS-14609.002.patch, 
> HDFS-14609.003.patch, HDFS-14609.004.patch
>
>
> We worked on router-based federation security as part of HDFS-13532. We kept 
> it compatible with the way the namenode works. However, with HADOOP-16314 and 
> HADOOP-16354 in trunk, the auth filters seem to have changed, causing tests 
> to fail.
> Changes are needed appropriately in RBF, mainly fixing the broken tests.






[jira] [Updated] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Chen Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-14609:
--
Attachment: (was: HDFS-14609.003.patch)

> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
> URL: https://issues.apache.org/jira/browse/HDFS-14609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: CR Hota
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14609.001.patch, HDFS-14609.002.patch, 
> HDFS-14609.003.patch, HDFS-14609.004.patch
>
>
> We worked on router-based federation security as part of HDFS-13532. We kept 
> it compatible with the way the namenode works. However, with HADOOP-16314 and 
> HADOOP-16354 in trunk, the auth filters seem to have changed, causing tests 
> to fail.
> Changes are needed appropriately in RBF, mainly fixing the broken tests.






[jira] [Commented] (HDFS-14827) RBF: Shared DN should display all info's in Router DataNode UI

2019-09-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924385#comment-16924385
 ] 

Íñigo Goiri commented on HDFS-14827:


To be honest, the current approach of prepending the subcluster id is not the 
most intuitive.
I think we should create a new method that would give the DNs per subcluster 
and then aggregate from there.
Then a node that is in two subclusters could show twice, and we just need to 
make it clear that it is the same node.
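
A sketch of the idea (names are hypothetical, not the actual RBF API):

{code:java}
// Collect DN reports per subcluster; the UI can then aggregate them and
// flag nodes that appear in more than one subcluster as the same node.
Map<String, List<DatanodeInfo>> getDataNodesPerSubcluster() {
  Map<String, List<DatanodeInfo>> dns = new HashMap<>();
  for (String nsId : getNamespaceIds()) {     // hypothetical helper
    dns.put(nsId, getDataNodeReport(nsId));   // hypothetical helper
  }
  return dns;
}
{code}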

> RBF: Shared DN should display all info's in Router DataNode UI
> --
>
> Key: HDFS-14827
> URL: https://issues.apache.org/jira/browse/HDFS-14827
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ranith Sardar
>Assignee: Ranith Sardar
>Priority: Major
>







[jira] [Commented] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924386#comment-16924386
 ] 

Chen Zhang commented on HDFS-14609:
---

Submitted patch v4 to fix the checkstyle error.

> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
> URL: https://issues.apache.org/jira/browse/HDFS-14609
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: CR Hota
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14609.001.patch, HDFS-14609.002.patch, 
> HDFS-14609.003.patch, HDFS-14609.004.patch
>
>
> We worked on router-based federation security as part of HDFS-13532. We kept 
> it compatible with the way the namenode works. However, with HADOOP-16314 and 
> HADOOP-16354 in trunk, the auth filters seem to have changed, causing tests 
> to fail.
> Changes are needed appropriately in RBF, mainly fixing the broken tests.






[jira] [Created] (HDDS-2096) Ozone ACL document missing AddAcl API

2019-09-06 Thread Xiaoyu Yao (Jira)
Xiaoyu Yao created HDDS-2096:


 Summary: Ozone ACL document missing AddAcl API
 Key: HDDS-2096
 URL: https://issues.apache.org/jira/browse/HDDS-2096
 Project: Hadoop Distributed Data Store
  Issue Type: Test
Reporter: Xiaoyu Yao


The current Ozone Native ACL APIs document reads as below; the AddAcl API is missing.

 
h3. Ozone Native ACL APIs

The ACLs can be manipulated by a set of APIs supported by Ozone. The APIs 
supported are:
 # *SetAcl* – This API will take user principal, the name, type of the ozone 
object and a list of ACLs.
 # *GetAcl* – This API will take the name and type of the ozone object and will 
return a list of ACLs.
 # *RemoveAcl* - This API will take the name, type of the ozone object and the 
ACL that has to be removed.
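
Presumably the missing entry would read along these lines (suggested wording only, mirroring the existing items):
 # *AddAcl* - This API will take the name, type of the ozone object and the ACL that has to be added.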



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14795) Add Throttler for writing block

2019-09-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924393#comment-16924393
 ] 

Íñigo Goiri commented on HDFS-14795:


Thanks [~leosun08] for the patch.
Functionally, it looks good; I would improve readability.
Right now, one has to be really aware of how the BlockConstructionStage stages 
go, the clientName, etc.
I would extract the two ifs into functions with a javadoc explaining 
why one is a transfer and why the other is a write:
{code}
if (isTransfer(stage, clientName)) {
  this.throttler = xserver.getTransferThrottler();
} else if (isWrite(stage)) {
  this.throttler = xserver.getWriteThrottler();
}
{code}
Actually, the whole snippet could be extracted into a {{getThrottler()}} function.
As the snippet shows, I would also change the name to {{transferThrottler}}.

Can we also add some tests?
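A sketch of the suggested extraction, to sit inside DataXceiver; {{isTransfer}}/{{isWrite}} are the helpers proposed above and the getter names mirror the snippet, so treat all of them as assumptions about the final patch:
{code:java}
// Sketch of the extracted helper inside DataXceiver (not the committed code).
/**
 * Picks the throttler for this operation: transfers (replication/recovery
 * pipelines) are throttled separately from regular client writes.
 * Returns null when no throttling applies, preserving the default behavior.
 */
private DataTransferThrottler getThrottler(BlockConstructionStage stage,
    String clientName) {
  if (isTransfer(stage, clientName)) {
    return xserver.getTransferThrottler();
  } else if (isWrite(stage)) {
    return xserver.getWriteThrottler();
  }
  return null;
}
{code}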


> Add Throttler for writing block
> ---
>
> Key: HDFS-14795
> URL: https://issues.apache.org/jira/browse/HDFS-14795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-14795.001.patch, HDFS-14795.002.patch
>
>
> DataXceiver#writeBlock
> {code:java}
> blockReceiver.receiveBlock(mirrorOut, mirrorIn, replyOut,
> mirrorAddr, null, targets, false);
> {code}
> As the above code shows, DataXceiver#writeBlock doesn't use a throttler.
>  I think it is necessary to throttle block writes, adding a throttler 
> in the PIPELINE_SETUP_APPEND_RECOVERY or 
> PIPELINE_SETUP_STREAMING_RECOVERY stage.
> The default throttler value is still null.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=308006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308006
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 16:43
Start Date: 06/Sep/19 16:43
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321819701
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
Even if you have 15x states, the number of nodes is small: if you have 100 
nodes, there are only 1500 states, and if you have 1000 nodes, it is 15000 
states. It is still trivial to keep these in memory. Here is the real kicker: 
just like we decided not to write all cross products for the NodeState static 
functions, we will end up needing lists for only the frequently accessed 
patterns (in my mind, that would be (in_service, healthy)). All other node 
queries can be retrieved by iterating the lists as needed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308006)
Time Spent: 4h 10m  (was: 4h)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node's state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states. This is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node's operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the total number of states for a node is the cross-product 
> of the two lists above; however, it probably makes sense to keep the two 
> states separate internally.
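A self-contained sketch of the proposed two-axis model (the type names here are invented for illustration):
{code:java}
// Sketch only: liveness and operational state kept as two separate axes.
public class NodeStatusSketch {
  /** Liveness, largely what is in place today. */
  public enum NodeHealth { HEALTHY, STALE, DEAD }

  /** Operational state, orthogonal to liveness. */
  public enum NodeOperationalState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  private final NodeHealth health;
  private final NodeOperationalState opState;

  public NodeStatusSketch(NodeHealth health, NodeOperationalState opState) {
    this.health = health;
    this.opState = opState;
  }

  /** The 3 x 5 cross-product is derived on demand, never stored. */
  public boolean isHealthyAndInService() {
    return health == NodeHealth.HEALTHY
        && opState == NodeOperationalState.IN_SERVICE;
  }
}
{code}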



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=308007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308007
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 16:44
Start Date: 06/Sep/19 16:44
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-528927862
 
 
   Just a note: originally DatanodeInfo was based on the HDFS code. Then I 
think we copied it and created our own structure. At this point, I think 
diverging should not be a big deal.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308007)
Time Spent: 4h 20m  (was: 4h 10m)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node's state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states. This is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node's operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the total number of states for a node is the cross-product 
> of the two lists above; however, it probably makes sense to keep the two 
> states separate internally.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2019-09-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924413#comment-16924413
 ] 

Íñigo Goiri commented on HDFS-14090:


BTW, should we also add per-user fairness to the Router RPC server?
It would go in a separate JIRA though.

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, RBF_ 
> Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact of clients connecting to healthy clusters vs unhealthy 
> clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded with calls spiking rpc queue times, due to back pressure the 
> same will start reflecting on the router. As a result of this, clients 
> connecting to healthy/faster name nodes will also slow down, as the same rpc 
> queue is maintained for all calls at the router layer. Essentially the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single rpc queue for all calls. Let's discuss how 
> we can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify the 
> downstream name node, and maintain a separate queue for each underlying name 
> node. Another, simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after a 
> certain threshold. 
> This won’t be a simple change as the router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as the name node’s.
> Opening this ticket to discuss, design, and implement this feature.
>  
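A minimal sketch of the second, simpler option (a permit pool per downstream nameservice); the class and the rejection path below are assumptions for illustration, not the design from the attached PDF:
{code:java}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Sketch only: cap concurrent downstream calls per nameservice and fail fast
// once a slow/overloaded nameservice has exhausted its permits.
public class PerNameservicePermitLimiter {
  private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();
  private final int permitsPerNameservice;

  public PerNameservicePermitLimiter(int permitsPerNameservice) {
    this.permitsPerNameservice = permitsPerNameservice;
  }

  /** Acquire a permit before proxying a call; reject if the NS is saturated. */
  public void acquire(String nsId) throws IOException {
    Semaphore s = permits.computeIfAbsent(
        nsId, k -> new Semaphore(permitsPerNameservice));
    if (!s.tryAcquire()) {
      throw new IOException("No permits available for nameservice " + nsId);
    }
  }

  /** Release the permit once the downstream call completes or fails. */
  public void release(String nsId) {
    Semaphore s = permits.get(nsId);
    if (s != null) {
      s.release();
    }
  }
}
{code}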



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14754:
-
Attachment: HDFS-14754.005.patch

> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch
>
>
> Using EC RS-3-2, 6 DN.
> We came across a scenario where, among the EC group's 5 blocks, the same 
> block was replicated thrice and two blocks went missing.
> The replicated block was not being deleted, and the missing blocks could not 
> be reconstructed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14825) [Dynamometer] Workload doesn't start unless an absolute path of Mapper class given

2019-09-06 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924425#comment-16924425
 ] 

Erik Krogen commented on HDFS-14825:


Thanks for filing this [~soyamiyoshi]! I agree that the PR you mentioned should 
fix this.

> [Dynamometer] Workload doesn't start unless an absolute path of Mapper class 
> given
> --
>
> Key: HDFS-14825
> URL: https://issues.apache.org/jira/browse/HDFS-14825
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Soya Miyoshi
>Priority: Major
>
> When starting a workload via start-workload.sh, the workload doesn't start 
> unless an absolute path of the Mapper class is given.
>  
> {code:java}
> $ hadoop/tools/dynamometer/dynamometer-workload/bin/start-workload.sh \
> -Dauditreplay.input-path=hdfs:///user/souya/input/audit  \
> -Dauditreplay.output-path=hdfs:///user/souya/results/ \
> -Dauditreplay.num-threads=50 -Dauditreplay.log-start-time.ms=5 \
> -nn_uri hdfs://namenode_address:port/ \
> -mapper_class_name AuditReplayMapper
> {code}
> results in
> {code:java}
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Exception in thread "main" java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.AuditReplayMapper not 
> found
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2572)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.getMapperClass(WorkloadDriver.java:183)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.run(WorkloadDriver.java:127)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.main(WorkloadDriver.java:172)
> {code}
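One plausible fix direction, sketched below: resolve a bare class name against the packages where the bundled mappers live before giving up. The package list is inferred from the exception above and the helper is illustrative, not the committed change:
{code:java}
// Sketch only, inside WorkloadDriver: resolve "AuditReplayMapper" without
// requiring the fully qualified class name on the command line.
private Class<?> resolveMapperClass(Configuration conf, String name)
    throws ClassNotFoundException {
  if (name.contains(".")) {
    return conf.getClassByName(name); // already fully qualified
  }
  String[] candidatePackages = {
      "org.apache.hadoop.tools.dynamometer.workloadgenerator",
      "org.apache.hadoop.tools.dynamometer.workloadgenerator.audit"
  };
  for (String pkg : candidatePackages) {
    try {
      return conf.getClassByName(pkg + "." + name);
    } catch (ClassNotFoundException ignored) {
      // fall through to the next candidate package
    }
  }
  throw new ClassNotFoundException(name);
}
{code}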



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2019-09-06 Thread CR Hota (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924427#comment-16924427
 ] 

CR Hota commented on HDFS-14090:


[~elgoiri] Thanks for the reviews. Some thoughts below.
{quote}My main issue is that PermitAllocationException is too generic.
 As you mention, it currently covers both (1) not enough handlers and (2) 
misconfigured nameservices.
 I think they should be two separate exceptions.
 The #1 case makes sense but the other one seems more like an 
IllegalArgumentException
{quote}
Both are theoretically misconfigurations, and hence I wanted to keep them under 
the same umbrella of PermitAllocationException, which all implementations should 
throw if allocation fails; this failure will happen due to misconfigurations.
{quote} 
 BTW, should we also add the fairness per user to the Router RPC server?
 It would go to a separate JIRA though.
{quote}
Fairness at the user level can still be enabled via FairCallQueue. We don't need 
to add anything separate from the Router's perspective. With HADOOP-16268 already 
checked in, fairness along with balancing across routers is taken care of to a 
large extent.
  

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, RBF_ 
> Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact of clients connecting to healthy clusters vs unhealthy 
> clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded with calls spiking rpc queue times, due to back pressure the 
> same will start reflecting on the router. As a result of this, clients 
> connecting to healthy/faster name nodes will also slow down, as the same rpc 
> queue is maintained for all calls at the router layer. Essentially the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single rpc queue for all calls. Let's discuss how 
> we can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify the 
> downstream name node, and maintain a separate queue for each underlying name 
> node. Another, simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after a 
> certain threshold. 
> This won’t be a simple change as the router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as the name node’s.
> Opening this ticket to discuss, design, and implement this feature.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-4819) Update Snapshot doc for HDFS-4758

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924430#comment-16924430
 ] 

Hudson commented on HDFS-4819:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17240 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17240/])
HDFS-4819. [Dynamometer] Fix parsing of audit logs which contain = in (xkrogen: 
rev ae42c8cb61edcf69d0d6a9cf20ee9f936b0722fb)
* (edit) 
hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-workload/src/test/java/org/apache/hadoop/tools/dynamometer/workloadgenerator/audit/TestAuditLogDirectParser.java
* (edit) 
hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-workload/src/main/java/org/apache/hadoop/tools/dynamometer/workloadgenerator/audit/AuditLogDirectParser.java


> Update Snapshot doc for HDFS-4758
> -
>
> Key: HDFS-4819
> URL: https://issues.apache.org/jira/browse/HDFS-4819
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.1.0-beta
>
> Attachments: h4819_20130611.patch
>
>
> Update Snapshot doc to clarify that nested snapshots are not allowed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14819) [Dynamometer] Cannot parse audit logs with ‘=‘ in unexpected places when starting a workload.

2019-09-06 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14819:
---
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

LGTM, thanks a lot [~soyamiyoshi]! I just committed this to trunk.

> [Dynamometer] Cannot parse audit logs with ‘=‘ in unexpected places when 
> starting a workload. 
> --
>
> Key: HDFS-14819
> URL: https://issues.apache.org/jira/browse/HDFS-14819
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Soya Miyoshi
>Assignee: Soya Miyoshi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14819.001.patch, HDFS-14819.002.patch, 
> HDFS-14819.003.patch
>
>
> When trying to launch a workload job, if any of the given audit log values 
> contain `=` anywhere other than right after the log's keys (such as `ugi`, 
> `src`), the audit log will not be parsed and an exception is thrown.
> For example, this audit log will result in an exception, as it contains `=` in 
> the `src` value (“/projects/date=0822”).
>  {code:|borderStyle=solid}
> 2019-08-22 01:00:00,186 INFO FSNamesystem.audit: allowed=true   ugi=feed 
> (auth:a) ip=/119.472.323.333  cmd=getfileinfo
> src=/projects/date=0822 dst=null
> perm=null   proto=rpc
> {code}
> If the second `=` in `src=/projects/date=0822` is removed, it works fine. 
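The gist of the fix is to split each field only on the first `=`. A minimal sketch, assuming the audit fields are tab-separated as in standard HDFS audit logs (this is illustrative, not the committed AuditLogDirectParser change):
{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch only: split fields on tabs, then split each field on the FIRST '='
// so values like "src=/projects/date=0822" keep their embedded '='.
public class AuditFieldParserSketch {
  public static Map<String, String> parse(String auditFields) {
    Map<String, String> kv = new HashMap<>();
    for (String field : auditFields.split("\t")) {
      int eq = field.indexOf('=');
      if (eq > 0) {
        kv.put(field.substring(0, eq), field.substring(eq + 1));
      }
    }
    return kv;
  }
}
{code}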



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14817) [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct arguments are given.

2019-09-06 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924448#comment-16924448
 ] 

Erik Krogen commented on HDFS-14817:


The v3 patch LGTM, thanks [~soyamiyoshi]! I just committed this to trunk.

> [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct 
> arguments are given.
> ---
>
> Key: HDFS-14817
> URL: https://issues.apache.org/jira/browse/HDFS-14817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Soya Miyoshi
>Assignee: Soya Miyoshi
>Priority: Major
> Attachments: HDFS-14817.001.patch, HDFS-14817.002.patch, 
> HDFS-14817.003.patch
>
>
> When trying to launch the infrastructure application to begin the startup of 
> the internal HDFS cluster, as shown in the Manual Workload Launch section 
> [here|https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-dynamometer/Dynamometer.html],
>  {code:|borderStyle=solid}
> $ ./dynamometer-infra/bin/start-dynamometer-cluster.sh \
>  -hadoop_binary_path hadoop-3.0.2.tar.gz \
>  -conf_path my-hadoop-conf \
>  -fs_image_dir hdfs:///fsimage \
>  -block_list_path hdfs:///dyno/blocks
> {code}
>  its usage is always shown, even if correct arguments are given, when 
> `-hadoop_binary_path` is placed as the first argument to the script.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14817) [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct arguments are given.

2019-09-06 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14817:
---
Fix Version/s: 3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct 
> arguments are given.
> ---
>
> Key: HDFS-14817
> URL: https://issues.apache.org/jira/browse/HDFS-14817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Soya Miyoshi
>Assignee: Soya Miyoshi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14817.001.patch, HDFS-14817.002.patch, 
> HDFS-14817.003.patch
>
>
> When trying to launch the infrastructure application to begin the startup of 
> the internal HDFS cluster, as shown in the Manual Workload Launch section 
> [here|https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-dynamometer/Dynamometer.html],
>  {code:|borderStyle=solid}
> $ ./dynamometer-infra/bin/start-dynamometer-cluster.sh \
>  -hadoop_binary_path hadoop-3.0.2.tar.gz \
>  -conf_path my-hadoop-conf \
>  -fs_image_dir hdfs:///fsimage \
>  -block_list_path hdfs:///dyno/blocks
> {code}
>  its usage is always shown, even if correct arguments are given, when 
> `-hadoop_binary_path` is placed as the first argument to the script.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14817) [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct arguments are given.

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924452#comment-16924452
 ] 

Hudson commented on HDFS-14817:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17243 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17243/])
HDFS-14817. [Dynamometer] Fix start script options parsing which (xkrogen: rev 
9637097ef9b213fcbeffa2538ccb7e0aaabde9c4)
* (edit) 
hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/src/main/java/org/apache/hadoop/tools/dynamometer/Client.java


> [Dynamometer] start-dynamometer-cluster.sh shows its usage even if correct 
> arguments are given.
> ---
>
> Key: HDFS-14817
> URL: https://issues.apache.org/jira/browse/HDFS-14817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Soya Miyoshi
>Assignee: Soya Miyoshi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14817.001.patch, HDFS-14817.002.patch, 
> HDFS-14817.003.patch
>
>
> When trying to launch the infrastructure application to begin the startup of 
> the internal HDFS cluster, as shown in the Manual Workload Launch section 
> [here|https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-dynamometer/Dynamometer.html],
>  {code:|borderStyle=solid}
> $ ./dynamometer-infra/bin/start-dynamometer-cluster.sh \
>  -hadoop_binary_path hadoop-3.0.2.tar.gz \
>  -conf_path my-hadoop-conf \
>  -fs_image_dir hdfs:///fsimage \
>  -block_list_path hdfs:///dyno/blocks
> {code}
>  its usage is always shown, even if correct arguments are given, when 
> `-hadoop_binary_path` is placed as the first argument to the script.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12831) HDFS throws FileNotFoundException on getFileBlockLocations(path-to-directory)

2019-09-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina reassigned HDFS-12831:


Assignee: hemanthboyina  (was: Hanisha Koneru)

> HDFS throws FileNotFoundException on getFileBlockLocations(path-to-directory)
> -
>
> Key: HDFS-12831
> URL: https://issues.apache.org/jira/browse/HDFS-12831
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.1
>Reporter: Steve Loughran
>Assignee: hemanthboyina
>Priority: Major
>
> The HDFS implementation of {{getFileBlockLocations(path, offset, len)}} 
> throws an exception if the path references a directory. 
> The base implementation (and all other filesystems) just returns an empty 
> array, something implemented in {{getFileBlockLocations(filestatus, offset, 
> len)}} and written up in filesystem.md as the correct behaviour. 
> # This has been shown to break things: SPARK-14959
> # There are no contract tests for these APIs; shows up in HADOOP-15044. 
> # Even if this is considered a wontfix, it should raise something like 
> {{PathIsDirectoryException}} rather than FNFE
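The behaviour filesystem.md calls for would look roughly like the sketch below; {{doGetFileBlockLocations}} is a placeholder for the existing located-blocks lookup, not a real method name:
{code:java}
// Sketch only: match the base FileSystem contract for directories instead of
// throwing FileNotFoundException.
public BlockLocation[] getFileBlockLocations(FileStatus status, long offset,
    long len) throws IOException {
  if (status.isDirectory()) {
    return new BlockLocation[0]; // empty array, per filesystem.md
  }
  return doGetFileBlockLocations(status.getPath(), offset, len); // placeholder
}
{code}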



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14828) Add TeraSort to acceptance test

2019-09-06 Thread Xiaoyu Yao (Jira)
Xiaoyu Yao created HDFS-14828:
-

 Summary: Add TeraSort to acceptance test
 Key: HDFS-14828
 URL: https://issues.apache.org/jira/browse/HDFS-14828
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Xiaoyu Yao


We may begin with 1GB teragen/terasort/teravalidate.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Moved] (HDDS-2097) Add TeraSort to acceptance test

2019-09-06 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao moved HDFS-14828 to HDDS-2097:
-

 Key: HDDS-2097  (was: HDFS-14828)
Workflow: patch-available, re-open possible  (was: no-reopen-closed, 
patch-avail)
 Project: Hadoop Distributed Data Store  (was: Hadoop HDFS)

> Add TeraSort to acceptance test
> ---
>
> Key: HDDS-2097
> URL: https://issues.apache.org/jira/browse/HDDS-2097
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Xiaoyu Yao
>Priority: Major
>
> We may begin with 1GB teragen/terasort/teravalidate.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2019-09-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924468#comment-16924468
 ] 

Íñigo Goiri commented on HDFS-14090:


{quote}
Both are theoretically misconfigurations, and hence I wanted to keep them under 
the same umbrella of PermitAllocationException, which all implementations should 
throw if allocation fails; this failure will happen due to misconfigurations.
{quote}
Right, both are configuration issues.
Should we make it IllegalArgumentException then?

Regarding the FairCallQueue, should we add it to HDFS-14558?

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, RBF_ 
> Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact of clients connecting to healthy clusters vs unhealthy 
> clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded with calls spiking rpc queue times, due to back pressure the 
> same will start reflecting on the router. As a result of this, clients 
> connecting to healthy/faster name nodes will also slow down, as the same rpc 
> queue is maintained for all calls at the router layer. Essentially the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single rpc queue for all calls. Let's discuss how 
> we can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify the 
> downstream name node, and maintain a separate queue for each underlying name 
> node. Another, simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after a 
> certain threshold. 
> This won’t be a simple change as the router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as the name node’s.
> Opening this ticket to discuss, design, and implement this feature.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14609) RBF: Security should use common AuthenticationFilter

2019-09-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924471#comment-16924471
 ] 

Hadoop QA commented on HDFS-14609:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 
33s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14609 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979690/HDFS-14609.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 24791e3c31a5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d98c548 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27804/testReport/ |
| Max. process+thread count | 1610 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27804/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Security should use common AuthenticationFilter
> 
>
> Key: HDFS-14609
>  

[jira] [Created] (HDFS-14829) [Dynamometer] Update TestDynamometerInfra to be Hadoop 3.2+ compatible

2019-09-06 Thread Erik Krogen (Jira)
Erik Krogen created HDFS-14829:
--

 Summary: [Dynamometer] Update TestDynamometerInfra to be Hadoop 
3.2+ compatible
 Key: HDFS-14829
 URL: https://issues.apache.org/jira/browse/HDFS-14829
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Erik Krogen


Currently the integration test included with Dynamometer, 
{{TestDynamometerInfra}}, executes against version 3.1.2 of Hadoop. We 
should update it to run against a more recent version by default (3.2.x) and 
add support for 3.3 in anticipation of HDFS-14412.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14412) Enable Dynamometer to use the local build of Hadoop by default

2019-09-06 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924490#comment-16924490
 ] 

Erik Krogen commented on HDFS-14412:


[~pingsutw] you're right, it looks like currently it's set up for Hadoop 3.1. 
We should definitely update it to be 3.2 and 3.3 compatible. I filed HDFS-14829 
for this.

> Enable Dynamometer to use the local build of Hadoop by default
> --
>
> Key: HDFS-14412
> URL: https://issues.apache.org/jira/browse/HDFS-14412
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Erik Krogen
>Assignee: kevin su
>Priority: Major
>
> Currently, by default, Dynamometer will download a Hadoop tarball from the 
> internet to use as the Hadoop version-under-test. Since it is bundled inside 
> of Hadoop now, it would make more sense for it to use the current version of 
> Hadoop by default.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12831) HDFS throws FileNotFoundException on getFileBlockLocations(path-to-directory)

2019-09-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-12831:
-
   Attachment: HDFS-12831.001.patch
Affects Version/s: (was: 2.8.1)
   3.1.2
   Status: Patch Available  (was: Open)

> HDFS throws FileNotFoundException on getFileBlockLocations(path-to-directory)
> -
>
> Key: HDFS-12831
> URL: https://issues.apache.org/jira/browse/HDFS-12831
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.2
>Reporter: Steve Loughran
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-12831.001.patch
>
>
> The HDFS implementation of {{getFileBlockLocations(path, offset, len)}} 
> throws an exception if the path references a directory. 
> The base implementation (and all other filesystems) just returns an empty 
> array, something implemented in {{getFileBlockLocations(filestatus, offset, 
> len)}} and written up in filesystem.md as the correct behaviour. 
> # This has been shown to break things: SPARK-14959
> # There are no contract tests for these APIs; shows up in HADOOP-15044. 
> # Even if this is considered a wontfix, it should raise something like 
> {{PathIsDirectoryException}} rather than FNFE



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2015) Encrypt/decrypt key using symmetric key while writing/reading

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2015?focusedWorklogId=308085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308085
 ]

ASF GitHub Bot logged work on HDDS-2015:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:43
Start Date: 06/Sep/19 18:43
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1386: HDDS-2015. 
Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386#issuecomment-528969668
 
 
   @ajayydv @bharatviswa504  Thanks for the comments. @dineshchitlangia  Thanks for 
the contribution. I have committed this patch to the trunk branch.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308085)
Time Spent: 5h 40m  (was: 5.5h)

> Encrypt/decrypt key using symmetric key while writing/reading
> -
>
> Key: HDDS-2015
> URL: https://issues.apache.org/jira/browse/HDDS-2015
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> *Key Write Path (Encryption)*
> When a bucket's metadata has gdprEnabled=true, we generate the GDPRSymmetricKey 
> and add it to the key metadata before we create the key.
> This ensures that the key is encrypted before writing.
> *Key Read Path (Decryption)*
> While reading the key, we check for gdprEnabled=true and then get the 
> GDPRSymmetricKey based on the secret/algorithm fetched from the key metadata.
> We create a stream to decrypt the key and pass it on to the client.
> *Test*
> Create a key in a GDPR-enabled bucket -> Read the key -> Verify the content is 
> as expected -> Update the key metadata to remove the gdprEnabled flag -> Read 
> the key -> Confirm the content is not as expected.
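A minimal sketch of the read path described above, assuming the secret and algorithm strings recovered from the key metadata form a valid JCE key (e.g., a 16-byte secret for AES); this is illustrative, not the committed GDPRSymmetricKey code:
{code:java}
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: rebuild the symmetric key from metadata and wrap the raw
// key-content stream in a decrypting stream before handing it to the client.
public class GdprReadPathSketch {
  public static InputStream decryptingStream(InputStream raw, String secret,
      String algorithm) throws Exception {
    SecretKeySpec keySpec =
        new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), algorithm);
    Cipher cipher = Cipher.getInstance(algorithm);
    cipher.init(Cipher.DECRYPT_MODE, keySpec);
    return new CipherInputStream(raw, cipher);
  }
}
{code}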



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2015) Encrypt/decrypt key using symmetric key while writing/reading

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2015?focusedWorklogId=308086&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308086
 ]

ASF GitHub Bot logged work on HDDS-2015:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:43
Start Date: 06/Sep/19 18:43
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1386: HDDS-2015. 
Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308086)
Time Spent: 5h 50m  (was: 5h 40m)

> Encrypt/decrypt key using symmetric key while writing/reading
> -
>
> Key: HDDS-2015
> URL: https://issues.apache.org/jira/browse/HDDS-2015
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> *Key Write Path (Encryption)*
> When a bucket's metadata has gdprEnabled=true, we generate the GDPRSymmetricKey 
> and add it to the key metadata before we create the key.
> This ensures that the key is encrypted before writing.
> *Key Read Path (Decryption)*
> While reading the key, we check for gdprEnabled=true and then get the 
> GDPRSymmetricKey based on the secret/algorithm fetched from the key metadata.
> We create a stream to decrypt the key and pass it on to the client.
> *Test*
> Create a key in a GDPR-enabled bucket -> Read the key -> Verify the content is 
> as expected -> Update the key metadata to remove the gdprEnabled flag -> Read 
> the key -> Confirm the content is not as expected.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2015) Encrypt/decrypt key using symmetric key while writing/reading

2019-09-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924509#comment-16924509
 ] 

Hudson commented on HDDS-2015:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17246 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17246/])
HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading 
(aengineer: rev b15c116c1edaa71a3de86dbbab822ced9df37dbd)
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
* (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/security/GDPRSymmetricKey.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRequest.java
* (edit) 
hadoop-ozone/common/src/test/java/org/apache/hadoop/ozone/security/TestGDPRSymmetricKey.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/keys/PutKeyHandler.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
* (edit) 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java


> Encrypt/decrypt key using symmetric key while writing/reading
> -
>
> Key: HDDS-2015
> URL: https://issues.apache.org/jira/browse/HDDS-2015
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> *Key Write Path (Encryption)*
> When a bucket's metadata has gdprEnabled=true, we generate the GDPRSymmetricKey 
> and add it to the key metadata before we create the key.
> This ensures that the key is encrypted before writing.
> *Key Read Path (Decryption)*
> While reading the key, we check for gdprEnabled=true and then get the 
> GDPRSymmetricKey based on the secret/algorithm fetched from the key metadata.
> We create a stream to decrypt the key and pass it on to the client.
> *Test*
> Create a key in a GDPR-enabled bucket -> Read the key -> Verify the content is 
> as expected -> Update the key metadata to remove the gdprEnabled flag -> Read 
> the key -> Confirm the content is not as expected.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2015) Encrypt/decrypt key using symmetric key while writing/reading

2019-09-06 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-2015:
---
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to the trunk branch.

> Encrypt/decrypt key using symmetric key while writing/reading
> -
>
> Key: HDDS-2015
> URL: https://issues.apache.org/jira/browse/HDDS-2015
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> *Key Write Path (Encryption)*
> When a bucket's metadata has gdprEnabled=true, we generate the GDPRSymmetricKey 
> and add it to the key metadata before we create the key.
> This ensures that the key is encrypted before writing.
> *Key Read Path (Decryption)*
> While reading the key, we check for gdprEnabled=true and then get the 
> GDPRSymmetricKey based on the secret/algorithm fetched from the key metadata.
> We create a stream to decrypt the key and pass it on to the client.
> *Test*
> Create a key in a GDPR-enabled bucket -> Read the key -> Verify the content is 
> as expected -> Update the key metadata to remove the gdprEnabled flag -> Read 
> the key -> Confirm the content is not as expected.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2098) Ozone shell command prints out ERROR when the log4j file is not present.

2019-09-06 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2098:
---

 Summary: Ozone shell command prints out ERROR when the log4j file 
is not present.
 Key: HDDS-2098
 URL: https://issues.apache.org/jira/browse/HDDS-2098
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone CLI
Affects Versions: 0.5.0
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 0.5.0


When a log4j file is not present, the default should be console.
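A minimal sketch of that fallback; the path mirrors the trace attached later in this issue, and {{BasicConfigurator.configure()}} installs a plain console appender (illustrative, not the committed fix):
{code:java}
import java.io.File;

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.PropertyConfigurator;

// Sketch only: configure log4j from the shell properties file when present,
// otherwise fall back to the console instead of printing log4j:ERROR.
public class OzoneShellLogInitSketch {
  public static void initLogging() {
    File conf = new File("/etc/ozone/conf/ozone-shell-log4j.properties");
    if (conf.exists()) {
      PropertyConfigurator.configure(conf.getAbsolutePath());
    } else {
      BasicConfigurator.configure(); // default ConsoleAppender
    }
  }
}
{code}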



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2098) Ozone shell command prints out ERROR when the log4j file is not present.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2098?focusedWorklogId=308090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308090
 ]

ASF GitHub Bot logged work on HDDS-2098:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:56
Start Date: 06/Sep/19 18:56
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #1411: HDDS-2098 
: Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411
 
 
   …is not present.
   
   
   Manually tested change on cluster.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308090)
Remaining Estimate: 0h
Time Spent: 10m

> Ozone shell command prints out ERROR when the log4j file is not present.
> 
>
> Key: HDDS-2098
> URL: https://issues.apache.org/jira/browse/HDDS-2098
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a log4j file is not present, the default should be console.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2019-09-06 Thread CR Hota (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota updated HDFS-14090:
---
Attachment: HDFS-14090.012.patch

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, 
> HDFS-14090.012.patch, RBF_ Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact of clients connecting to healthy clusters vs unhealthy 
> clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded with calls spiking rpc queue times, due to back pressure the 
> same will start reflecting on the router. As a result of this, clients 
> connecting to healthy/faster name nodes will also slow down, as the same rpc 
> queue is maintained for all calls at the router layer. Essentially the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single rpc queue for all calls. Let's discuss how 
> we can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify the 
> downstream name node, and maintain a separate queue for each underlying name 
> node. Another, simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after a 
> certain threshold. 
> This won’t be a simple change as the router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as the name node’s.
> Opening this ticket to discuss, design, and implement this feature.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2098) Ozone shell command prints out ERROR when the log4j file is not present.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2098:
-
Labels: pull-request-available  (was: )

> Ozone shell command prints out ERROR when the log4j file is not present.
> 
>
> Key: HDDS-2098
> URL: https://issues.apache.org/jira/browse/HDDS-2098
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>
> When a log4j file is not present, the default should be console.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2098) Ozone shell command prints out ERROR when the log4j file is not present.

2019-09-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2098:

Description: 
*Exception Trace*
{code}
log4j:ERROR Could not read configuration file from URL [file:/etc/ozone/conf/ozone-shell-log4j.properties].
java.io.FileNotFoundException: /etc/ozone/conf/ozone-shell-log4j.properties (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.slf4j.impl.Log4jLoggerFactory.<init>(Log4jLoggerFactory.java:66)
at org.slf4j.impl.StaticLoggerBinder.<init>(StaticLoggerBinder.java:72)
at org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:45)
at org.slf4j.LoggerFactory.bind(LoggerFactory.java:150)
at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:124)
at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:412)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:357)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383)
at org.apache.hadoop.ozone.web.ozShell.Shell.<clinit>(Shell.java:35)
log4j:ERROR Ignoring configuration file [file:/etc/ozone/conf/ozone-shell-log4j.properties].
log4j:WARN No appenders could be found for logger (io.jaegertracing.thrift.internal.senders.ThriftSenderFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{
  "metadata" : { },
  "name" : "vol-test-putfile-1567740142",
  "admin" : "root",
  "owner" : "root",
  "creationTime" : 1567740146501,
  "acls" : [ {
"type" : "USER",
"name" : "root",
"aclScope" : "ACCESS",
"aclList" : [ "ALL" ]
  }, {
"type" : "GROUP",
"name" : "root",
"aclScope" : "ACCESS",
"aclList" : [ "ALL" ]
  } ],
  "quota" : 1152921504606846976
}
{code}


*Fix*
When a log4j file is not present, the default should be console.

  was:When a log4j file is not present, the default should be console.
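The actual change is tracked in the pull request linked from this issue; as 
a rough illustration of the proposed fix, a shell entry point could guard 
the log4j bootstrap along these lines. The ShellLogInit class name and the 
explicit file-existence check are assumptions for the sketch, not the real 
patch.

{code}
import java.io.File;

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.PropertyConfigurator;

/**
 * Sketch: configure log4j from the given properties file if it exists,
 * otherwise fall back to a plain console appender so the shell does not
 * emit log4j:ERROR/WARN noise before printing its output.
 */
public final class ShellLogInit {

  private ShellLogInit() {
  }

  public static void configure(String log4jPath) {
    if (log4jPath != null && new File(log4jPath).exists()) {
      // Config file present: honor it.
      PropertyConfigurator.configure(log4jPath);
    } else {
      // No config file: default to a ConsoleAppender on the root logger.
      BasicConfigurator.configure();
    }
  }
}
{code}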


> Ozone shell command prints out ERROR when the log4j file is not present.
> 
>
> Key: HDDS-2098
> URL: https://issues.apache.org/jira/browse/HDDS-2098
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Exception Trace*
> {code}
> log4j:ERROR Could not read configuration file from URL [file:/etc/ozone/conf/ozone-shell-log4j.properties].
> java.io.FileNotFoundException: /etc/ozone/conf/ozone-shell-log4j.properties (No such file or directory)
> 	at java.io.FileInputStream.open0(Native Method)
> 	at java.io.FileInputStream.open(FileInputStream.java:195)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:138)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:93)
> 	at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
> 	at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
> 	at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
> 	at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
> 	at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
> 	at org.slf4j.impl.Log4jLoggerFactory.<init>(Log4jLoggerFactory.java:66)
> 	at org.slf4j.impl.StaticLoggerBinder.<init>(StaticLoggerBinder.java:72)
> 	at org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:45)
> 	at org.slf4j.LoggerFactory.bind(LoggerFactory.java:150)
> 	at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:124)
> 	at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:412)
> 	at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:357)
> 	at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383)
> 	at org.apache.hadoop.ozone.web.ozShell.Shell.<clinit>(Shell.java:35)
> log4j:ERROR Ignoring configuration file [file:/etc/ozone/co

[jira] [Work logged] (HDDS-2098) Ozone shell command prints out ERROR when the log4j file is not present.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2098?focusedWorklogId=308091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308091
 ]

ASF GitHub Bot logged work on HDDS-2098:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:57
Start Date: 06/Sep/19 18:57
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on issue #1411: HDDS-2098 : Ozone 
shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411#issuecomment-528973963
 
 
   /label ozone
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308091)
Time Spent: 20m  (was: 10m)

> Ozone shell command prints out ERROR when the log4j file is not present.
> 
>
> Key: HDDS-2098
> URL: https://issues.apache.org/jira/browse/HDDS-2098
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a log4j file is not present, the default should be console.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


