[jira] [Updated] (HDFS-14636) SBN : If you configure the default proxy provider still read Request going to Observer namenode only.
[ https://issues.apache.org/jira/browse/HDFS-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshakiran Reddy updated HDFS-14636: - Labels: SBN (was: ) > SBN : If you configure the default proxy provider still read Request going to > Observer namenode only. > - > > Key: HDFS-14636 > URL: https://issues.apache.org/jira/browse/HDFS-14636 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ranith Sardar >Priority: Major > Labels: SBN > > {noformat} > In an Observer cluster, even when the default proxy provider is configured instead of > "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider", read > requests still go to the Observer namenode only.{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
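For context, a minimal hdfs-site.xml sketch of the two client proxy-provider configurations being compared in this report; the nameservice name `mycluster` is a placeholder. With the default-style provider, reads should go through the active NN only; the bug report says reads still land on the Observer:

```
<!-- Default-style provider: clients fail over between NNs, no observer reads -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Observer-read provider: read requests may be served by Observer NNs -->
<!--
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider</value>
</property>
-->
```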
[jira] [Updated] (HDDS-1728) Add metrics for leader's latency in ContainerStateMachine
[ https://issues.apache.org/jira/browse/HDDS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1728: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the review [~ljain]. I have committed this to trunk. > Add metrics for leader's latency in ContainerStateMachine > - > > Key: HDDS-1728 > URL: https://issues.apache.org/jira/browse/HDDS-1728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > This jira proposes to add metrics around the leader's round-trip reply to the Ratis > client. This will be done via the startTransaction API.
[jira] [Assigned] (HDFS-14636) SBN : If you configure the default proxy provider still read Request going to Observer namenode only.
[ https://issues.apache.org/jira/browse/HDFS-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ranith Sardar reassigned HDFS-14636: Assignee: Ranith Sardar > SBN : If you configure the default proxy provider still read Request going to > Observer namenode only. > - > > Key: HDFS-14636 > URL: https://issues.apache.org/jira/browse/HDFS-14636 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ranith Sardar >Priority: Major > > {noformat} > In an Observer cluster, even when the default proxy provider is configured instead of > "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider", read > requests still go to the Observer namenode only.{noformat}
[jira] [Created] (HDFS-14636) SBN : If you configure the default proxy provider still read Request going to Observer namenode only.
Harshakiran Reddy created HDFS-14636: Summary: SBN : If you configure the default proxy provider still read Request going to Observer namenode only. Key: HDFS-14636 URL: https://issues.apache.org/jira/browse/HDFS-14636 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.1.1 Reporter: Harshakiran Reddy {noformat} In an Observer cluster, even when the default proxy provider is configured instead of "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider", read requests still go to the Observer namenode only.{noformat}
[jira] [Work logged] (HDDS-1728) Add metrics for leader's latency in ContainerStateMachine
[ https://issues.apache.org/jira/browse/HDDS-1728?focusedWorklogId=273092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273092 ] ASF GitHub Bot logged work on HDDS-1728: Author: ASF GitHub Bot Created on: 08/Jul/19 06:49 Start Date: 08/Jul/19 06:49 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1022: HDDS-1728. Add metrics for leader's latency in ContainerStateMachine. Contributed by Mukul Kumar Singh. URL: https://github.com/apache/hadoop/pull/1022 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273092) Time Spent: 1h 10m (was: 1h) > Add metrics for leader's latency in ContainerStateMachine > - > > Key: HDDS-1728 > URL: https://issues.apache.org/jira/browse/HDDS-1728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > This jira proposes to add metrics around the leader's round-trip reply to the Ratis > client. This will be done via the startTransaction API.
[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880062#comment-16880062 ] Wei-Chiu Chuang commented on HDFS-14034: Barring the trivial checkstyle warnings, the patch looks good to me. Of course, we will now need to implement the corresponding HttpFs handlers too. We should file a new jira for that. Unrelated: ContentSummary has a field erasureCodingPolicy which was added in HDFS-11647, but webhdfs GETCONTENTSUMMARY doesn't include it. > Support getQuotaUsage API in WebHDFS > > > Key: HDFS-14034 > URL: https://issues.apache.org/jira/browse/HDFS-14034 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, webhdfs >Reporter: Erik Krogen >Assignee: Chao Sun >Priority: Major > Attachments: HDFS-14034.000.patch > > > HDFS-8898 added support for a new API, {{getQuotaUsage}}, which can fetch > quota usage on a directory with significantly lower impact than the similar > {{getContentSummary}}. This JIRA is to track adding support for this API to > WebHDFS.
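As a rough illustration of what the WebHDFS call might look like once this lands: the `GETQUOTAUSAGE` op name and the response field names (mirroring the QuotaUsage class) are assumptions inferred from the existing `GETCONTENTSUMMARY` convention, not the committed API:

```
# Hypothetical request shape, following the WebHDFS REST convention:
GET http://<namenode>:9870/webhdfs/v1/<path>?op=GETQUOTAUSAGE

# Hypothetical JSON payload (field names assumed from the QuotaUsage class):
# {"QuotaUsage": {"fileAndDirectoryCount": ..., "quota": ...,
#                 "spaceConsumed": ..., "spaceQuota": ...}}
```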
[jira] [Work logged] (HDDS-1728) Add metrics for leader's latency in ContainerStateMachine
[ https://issues.apache.org/jira/browse/HDDS-1728?focusedWorklogId=273077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273077 ] ASF GitHub Bot logged work on HDDS-1728: Author: ASF GitHub Bot Created on: 08/Jul/19 06:31 Start Date: 08/Jul/19 06:31 Worklog Time Spent: 10m Work Description: lokeshj1703 commented on issue #1022: HDDS-1728. Add metrics for leader's latency in ContainerStateMachine. Contributed by Mukul Kumar Singh. URL: https://github.com/apache/hadoop/pull/1022#issuecomment-509096885 @mukul1987 Thanks for updating the PR. The changes look good to me. +1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273077) Time Spent: 1h (was: 50m) > Add metrics for leader's latency in ContainerStateMachine > - > > Key: HDDS-1728 > URL: https://issues.apache.org/jira/browse/HDDS-1728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > This jira proposes to add metrics around the leader's round-trip reply to the Ratis > client. This will be done via the startTransaction API.
[jira] [Commented] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880060#comment-16880060 ] Hadoop QA commented on HDFS-14483: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} branch-2.9 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 31s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 51s{color} | {color:green} branch-2.9 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 10s{color} | {color:green} branch-2.9 passed with JDK v1.8.0_212 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 23s{color} | {color:green} branch-2.9 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 37s{color} | 
{color:red} hadoop-common-project/hadoop-common in branch-2.9 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 21s{color} | {color:green} branch-2.9 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s{color} | {color:green} branch-2.9 passed with JDK v1.8.0_212 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 47s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 59s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 27s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 46s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 44s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 33s{color} | {color:green} hadoop-hdfs-cli
[jira] [Created] (HDFS-14635) Support to refresh the rack awareness dynamically
liying created HDFS-14635: - Summary: Support to refresh the rack awareness dynamically Key: HDFS-14635 URL: https://issues.apache.org/jira/browse/HDFS-14635 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Affects Versions: 2.7.2 Reporter: liying At present, there are two ways to load the rack script in the Hadoop code: the class ScriptBasedMapping caches the results, while ScriptBasedMapping#RawScriptBasedMapping runs the script every time (on every request). Caching is the better way to implement this feature, because loading the script on every request costs CPU. But the problem is that we can't refresh the cache, so it is important to support refreshing the rack awareness dynamically.
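For reference, a minimal core-site.xml sketch of the script-based rack mapping being discussed; the script path is a placeholder. `net.topology.node.switch.mapping.impl` defaults to ScriptBasedMapping, whose results are cached, which is exactly the cache this issue proposes to make refreshable:

```
<property>
  <name>net.topology.node.switch.mapping.impl</name>
  <value>org.apache.hadoop.net.ScriptBasedMapping</value>
</property>
<property>
  <!-- Script invoked to map a host to a rack; results are cached -->
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf/topology.sh</value>
</property>
```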
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880052#comment-16880052 ] Wei-Chiu Chuang commented on HDFS-14313: Thank you [~leosun08]. I was out for a few days. I think overall the patch is almost ready. Please take care of a few nits that I spotted. ReplicaCachingGetSpaceUsed#run() would throw an NPE if ExternalDatasetImpl is used, since ExternalDatasetImpl#deepCopyReplica() returns null. IMO, it should throw an exception to indicate it is not supported, or return an empty Collection. For the new configuration keys, please update them with a prefix. For example, deep.copy.replica.threshold.ms --> fs.getspaceused.deep.copy.replica.threshold.ms Please add descriptions of the new configurations to hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch > > > The two existing ways of getting used space, DU and DF, are insufficient. > # Running DU across lots of disks is very expensive, and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk is shared by multiple datanodes or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > is very cheap and accurate.
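The NPE nit above can be sketched as follows. This is a hedged illustration of the "return an empty Collection" option, not the actual HDFS code: `deepCopyReplica()` here is a stand-in for ExternalDatasetImpl#deepCopyReplica(), and the caller mimics what ReplicaCachingGetSpaceUsed#run() would do when iterating the result.

```java
import java.util.Collection;
import java.util.Collections;

// Sketch: normalize a null deepCopyReplica() result to an empty
// collection so callers cannot hit a NullPointerException.
public class ReplicaCopySketch {
    // Stand-in for ExternalDatasetImpl#deepCopyReplica(), which returns null.
    static Collection<String> deepCopyReplica() {
        return null;
    }

    // Normalize null to an empty, immutable collection.
    static Collection<String> safeDeepCopyReplica() {
        Collection<String> replicas = deepCopyReplica();
        return replicas == null ? Collections.emptyList() : replicas;
    }

    public static void main(String[] args) {
        long used = 0;
        // Iterating the normalized result is safe even though the
        // underlying implementation returned null.
        for (String replica : safeDeepCopyReplica()) {
            used += replica.length();
        }
        System.out.println("used=" + used); // prints used=0
    }
}
```

The alternative the review mentions, throwing an explicit UnsupportedOperationException, trades silent zero usage for a loud failure; either is preferable to the NPE.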
[jira] [Work logged] (HDDS-1603) Handle Ratis Append Failure in Container State Machine
[ https://issues.apache.org/jira/browse/HDDS-1603?focusedWorklogId=273067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273067 ] ASF GitHub Bot logged work on HDDS-1603: Author: ASF GitHub Bot Created on: 08/Jul/19 06:04 Start Date: 08/Jul/19 06:04 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1019: HDDS-1603. Handle Ratis Append Failure in Container State Machine. Contributed by Supratim Deka URL: https://github.com/apache/hadoop/pull/1019#discussion_r300932058 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestPipelineClose.java ## @@ -180,4 +188,77 @@ public void testPipelineCloseWithPipelineAction() throws Exception { } catch (PipelineNotFoundException e) { } } + + @Test + public void testPipelineCloseWithLogFailure() throws IOException { + +EventQueue eventQ = (EventQueue) scm.getEventQueue(); +PipelineActionHandler pipelineActionTest = +Mockito.mock(PipelineActionHandler.class); +eventQ.addHandler(SCMEvents.PIPELINE_ACTIONS, pipelineActionTest); +ArgumentCaptor actionCaptor = +ArgumentCaptor.forClass(PipelineActionsFromDatanode.class); + +ContainerInfo containerInfo = containerManager +.allocateContainer(RATIS, THREE, "testOwner"); +ContainerWithPipeline containerWithPipeline = +new ContainerWithPipeline(containerInfo, +pipelineManager.getPipeline(containerInfo.getPipelineID())); +Pipeline openPipeline = containerWithPipeline.getPipeline(); +RaftGroupId groupId = RaftGroupId.valueOf(openPipeline.getId().getId()); + +try { + pipelineManager.getPipeline(openPipeline.getId()); +} catch (PipelineNotFoundException e) { + Assert.assertTrue("pipeline should exist", false); Review comment: In Junit, the test will exit if an uncaught exception is thrown, so this might not be needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273067) Time Spent: 1h (was: 50m) > Handle Ratis Append Failure in Container State Machine > -- > > Key: HDDS-1603 > URL: https://issues.apache.org/jira/browse/HDDS-1603 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > RATIS-573 would add notification to the State Machine on encountering failure > during Log append. > The scope of this jira is to build on RATIS-573 and define the handling for > log append failure in Container State Machine. > 1. Enqueue pipeline unhealthy action to SCM, add a reason code to the message. > 2. Trigger heartbeat to SCM > 3. Notify Ratis volume unhealthy to the Datanode, so that DN can trigger > async volume checker > Changes in the SCM to leverage the additional failure reason code, is outside > the scope of this jira.
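The review point about `Assert.assertTrue("pipeline should exist", false)` can be sketched as follows. In JUnit, an exception that escapes a @Test method already fails the test, so the try/catch wrapper is redundant. The names below (`getPipeline`, `PipelineNotFoundException`) are stand-ins for the HDDS classes, not the real API:

```java
// Sketch: two equivalent styles of asserting that a lookup succeeds.
public class JUnitStyleSketch {
    static class PipelineNotFoundException extends RuntimeException {
        PipelineNotFoundException(String id) { super(id); }
    }

    // Stand-in for pipelineManager.getPipeline(id).
    static String getPipeline(String id) {
        if (!"open".equals(id)) {
            throw new PipelineNotFoundException(id);
        }
        return id;
    }

    // Before: catch the exception and fail explicitly with a message.
    static String verboseLookup(String id) {
        try {
            return getPipeline(id);
        } catch (PipelineNotFoundException e) {
            throw new AssertionError("pipeline should exist", e);
        }
    }

    // After: just call it; an escaping exception fails the test on its own.
    static String conciseLookup(String id) {
        return getPipeline(id);
    }

    public static void main(String[] args) {
        // Both styles behave identically on success.
        System.out.println(verboseLookup("open"));
        System.out.println(conciseLookup("open"));
    }
}
```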
[jira] [Work logged] (HDDS-1603) Handle Ratis Append Failure in Container State Machine
[ https://issues.apache.org/jira/browse/HDDS-1603?focusedWorklogId=273066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273066 ] ASF GitHub Bot logged work on HDDS-1603: Author: ASF GitHub Bot Created on: 08/Jul/19 06:04 Start Date: 08/Jul/19 06:04 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1019: HDDS-1603. Handle Ratis Append Failure in Container State Machine. Contributed by Supratim Deka URL: https://github.com/apache/hadoop/pull/1019#discussion_r300931037 ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java ## @@ -545,18 +545,28 @@ private void handlePipelineFailure(RaftGroupId groupId, + roleInfoProto.getRole()); } +triggerPipelineClose(groupId, msg, Review comment: Lets have 2 Reasons, a) candidate failed, b) leader failed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273066) Time Spent: 50m (was: 40m) > Handle Ratis Append Failure in Container State Machine > -- > > Key: HDDS-1603 > URL: https://issues.apache.org/jira/browse/HDDS-1603 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > RATIS-573 would add notification to the State Machine on encountering failure > during Log append. > The scope of this jira is to build on RATIS-573 and define the handling for > log append failure in Container State Machine. > 1. Enqueue pipeline unhealthy action to SCM, add a reason code to the message. > 2. Trigger heartbeat to SCM > 3. 
Notify Ratis volume unhealthy to the Datanode, so that DN can trigger > async volume checker > Changes in the SCM to leverage the additional failure reason code, is outside > the scope of this jira.
[jira] [Updated] (HDDS-1603) Handle Ratis Append Failure in Container State Machine
[ https://issues.apache.org/jira/browse/HDDS-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1603: Status: Patch Available (was: Open) > Handle Ratis Append Failure in Container State Machine > -- > > Key: HDDS-1603 > URL: https://issues.apache.org/jira/browse/HDDS-1603 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > RATIS-573 would add notification to the State Machine on encountering failure > during Log append. > The scope of this jira is to build on RATIS-573 and define the handling for > log append failure in Container State Machine. > 1. Enqueue pipeline unhealthy action to SCM, add a reason code to the message. > 2. Trigger heartbeat to SCM > 3. Notify Ratis volume unhealthy to the Datanode, so that DN can trigger > async volume checker > Changes in the SCM to leverage the additional failure reason code, is outside > the scope of this jira.
[jira] [Work logged] (HDDS-1748) Error message for 3 way commit failure is not verbose
[ https://issues.apache.org/jira/browse/HDDS-1748?focusedWorklogId=273057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273057 ] ASF GitHub Bot logged work on HDDS-1748: Author: ASF GitHub Bot Created on: 08/Jul/19 05:52 Start Date: 08/Jul/19 05:52 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1051: HDDS-1748. Error message for 3 way commit failure is not verbose. Contributed by Supratim Deka URL: https://github.com/apache/hadoop/pull/1051#discussion_r300929755 ## File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientRatis.java ## @@ -258,8 +258,15 @@ public XceiverClientReply watchForCommit(long index, long timeout) .sendWatchAsync(index, RaftProtos.ReplicationLevel.ALL_COMMITTED); replyFuture.get(timeout, TimeUnit.MILLISECONDS); } catch (Exception e) { + String nodes = " with Datanodes : "; Throwable t = HddsClientUtils.checkForException(e); - LOG.warn("3 way commit failed ", e); + for (DatanodeDetails datanodeDetails : pipeline.getNodes()) { +nodes += datanodeDetails.getHostName() + "[" Review comment: This line will thrown findbugs, lets use StringBuilder here in place of concatenation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273057) Time Spent: 0.5h (was: 20m) > Error message for 3 way commit failure is not verbose > - > > Key: HDDS-1748 > URL: https://issues.apache.org/jira/browse/HDDS-1748 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The error message for 3 way client commit is not verbose, it should include > blockID and pipeline ID along with node details for debugging. > {code} > 2019-07-02 09:58:12,025 WARN scm.XceiverClientRatis > (XceiverClientRatis.java:watchForCommit(262)) - 3 way commit failed > java.util.concurrent.ExecutionException: > org.apache.ratis.protocol.NotReplicatedException: Request with call Id 39482 > and log index 11562 is not yet replicated to ALL_COMMITTED > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) > at > org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnFirstIndex(CommitWatcher.java:135) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:355) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFullBuffer(BlockOutputStream.java:332) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.write(BlockOutputStream.java:259) > at > org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.write(BlockOutputStreamEntry.java:129) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:211) > at > 
org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:193) > at > org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49) > at java.io.OutputStream.write(OutputStream.java:75) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:103) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:147) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.ratis.protocol.NotReplicatedException: Request with > call Id 39482 and log index 11562 is not yet replicated to ALL_COMMITTED > at > org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245) > at > org
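The StringBuilder suggestion in the review above can be sketched as follows. The message prefix comes from the patch; the bracketed content after each hostname is elided in the quoted diff, so `[...]` is kept as a placeholder rather than guessing what the patch puts there:

```java
// Sketch: build the datanode list with StringBuilder instead of
// += String concatenation inside the loop, which findbugs flags.
public class NodeListSketch {
    static String describe(String[] hostNames) {
        StringBuilder nodes = new StringBuilder(" with Datanodes : ");
        for (String host : hostNames) {
            // Placeholder for the per-node detail elided in the quoted diff.
            nodes.append(host).append("[...]");
        }
        return nodes.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[] {"dn1", "dn2"}));
        // prints: " with Datanodes : dn1[...]dn2[...]"
    }
}
```

String concatenation in a loop allocates a fresh String per iteration; StringBuilder appends into one buffer, which is why the reviewer asks for it.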
[jira] [Updated] (HDDS-1748) Error message for 3 way commit failure is not verbose
[ https://issues.apache.org/jira/browse/HDDS-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1748: Status: Patch Available (was: Open) > Error message for 3 way commit failure is not verbose > - > > Key: HDDS-1748 > URL: https://issues.apache.org/jira/browse/HDDS-1748 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The error message for 3 way client commit is not verbose, it should include > blockID and pipeline ID along with node details for debugging. > {code} > 2019-07-02 09:58:12,025 WARN scm.XceiverClientRatis > (XceiverClientRatis.java:watchForCommit(262)) - 3 way commit failed > java.util.concurrent.ExecutionException: > org.apache.ratis.protocol.NotReplicatedException: Request with call Id 39482 > and log index 11562 is not yet replicated to ALL_COMMITTED > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) > at > org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnFirstIndex(CommitWatcher.java:135) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:355) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFullBuffer(BlockOutputStream.java:332) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.write(BlockOutputStream.java:259) > at > org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.write(BlockOutputStreamEntry.java:129) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:211) > 
at > org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:193) > at > org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49) > at java.io.OutputStream.write(OutputStream.java:75) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:103) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:147) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.ratis.protocol.NotReplicatedException: Request with > call Id 39482 and log index 11562 is not yet replicated to ALL_COMMITTED > at > org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245) > at > org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254) > at > org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249) > at > org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > ... 
3 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
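The richer diagnostic HDDS-1748 asks for (blockID, pipeline ID, node details) could look like the sketch below. The helper class, method, and parameter names are illustrative assumptions for this digest, not the actual patch.

```java
import java.util.List;

class CommitFailureMessage {
    // Hypothetical helper: formats a 3-way-commit failure with the block,
    // pipeline, and datanode context the issue says is missing today.
    static String build(long blockId, String pipelineId, List<String> nodes,
                        Throwable cause) {
        return String.format(
            "3 way commit failed for blockID %d on pipeline %s (nodes: %s): %s",
            blockId, pipelineId, String.join(", ", nodes), cause.getMessage());
    }
}
```

With the extra fields, a log line like the one in the stack trace above would immediately identify which block and pipeline hit the NotReplicatedException.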
[jira] [Work logged] (HDDS-1750) Add block allocation metric for pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-1750?focusedWorklogId=273054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273054 ] ASF GitHub Bot logged work on HDDS-1750: Author: ASF GitHub Bot Created on: 08/Jul/19 05:46 Start Date: 08/Jul/19 05:46 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1047: HDDS-1750. Add block allocation metrics for pipelines in SCM URL: https://github.com/apache/hadoop/pull/1047#discussion_r300928815 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/SCMPipelineManager.java ## @@ -152,6 +152,7 @@ public synchronized Pipeline createPipeline( stateManager.addPipeline(pipeline); nodeManager.addPipeline(pipeline); metrics.incNumPipelineCreated(); + metrics.createNumBlocksAllocatedMetric(pipeline); Review comment: This should be named as createPerPipelineMetrics or something like that :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273054) Time Spent: 0.5h (was: 20m) > Add block allocation metric for pipelines in SCM > > > Key: HDDS-1750 > URL: https://issues.apache.org/jira/browse/HDDS-1750 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This Jira aims to add block allocation metrics for pipelines in SCM. This > would help in determining the distribution of block allocations among various > pipelines in SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1750) Add block allocation metric for pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-1750?focusedWorklogId=273055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273055 ] ASF GitHub Bot logged work on HDDS-1750: Author: ASF GitHub Bot Created on: 08/Jul/19 05:46 Start Date: 08/Jul/19 05:46 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1047: HDDS-1750. Add block allocation metrics for pipelines in SCM URL: https://github.com/apache/hadoop/pull/1047#discussion_r300928623 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/SCMPipelineManager.java ## @@ -152,6 +152,7 @@ public synchronized Pipeline createPipeline( stateManager.addPipeline(pipeline); nodeManager.addPipeline(pipeline); metrics.incNumPipelineCreated(); + metrics.createNumBlocksAllocatedMetric(pipeline); Review comment: Can lines 154 and 155 be done in one statement to pipelineMetrics? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273055) Time Spent: 0.5h (was: 20m) > Add block allocation metric for pipelines in SCM > > > Key: HDDS-1750 > URL: https://issues.apache.org/jira/browse/HDDS-1750 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This Jira aims to add block allocation metrics for pipelines in SCM. This > would help in determining the distribution of block allocations among various > pipelines in SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1750) Add block allocation metric for pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-1750?focusedWorklogId=273056&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273056 ] ASF GitHub Bot logged work on HDDS-1750: Author: ASF GitHub Bot Created on: 08/Jul/19 05:46 Start Date: 08/Jul/19 05:46 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1047: HDDS-1750. Add block allocation metrics for pipelines in SCM URL: https://github.com/apache/hadoop/pull/1047#discussion_r300928722 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/SCMPipelineManager.java ## @@ -362,6 +364,7 @@ private void finalizePipeline(PipelineID pipelineId) throws IOException { for (ContainerID containerID : containerIDs) { eventPublisher.fireEvent(SCMEvents.CLOSE_CONTAINER, containerID); } + metrics.clearMetrics(pipelineId); Review comment: Lets rename this to removePipelineMetrics This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273056) Time Spent: 0.5h (was: 20m) > Add block allocation metric for pipelines in SCM > > > Key: HDDS-1750 > URL: https://issues.apache.org/jira/browse/HDDS-1750 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This Jira aims to add block allocation metrics for pipelines in SCM. This > would help in determining the distribution of block allocations among various > pipelines in SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
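The HDDS-1750 review comments above ask for per-pipeline counters that are created on createPipeline and removed on finalizePipeline. A minimal sketch, using the names the reviewer suggests (createPerPipelineMetrics, removePipelineMetrics) as assumptions rather than the committed code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class PipelineMetrics {
    // One block-allocation counter per live pipeline, keyed by pipeline id.
    private final Map<String, LongAdder> blocksAllocated = new ConcurrentHashMap<>();

    // Called when a pipeline is created (SCMPipelineManager#createPipeline).
    void createPerPipelineMetrics(String pipelineId) {
        blocksAllocated.putIfAbsent(pipelineId, new LongAdder());
    }

    // Called on each block allocation routed to this pipeline.
    void incNumBlocksAllocated(String pipelineId) {
        LongAdder counter = blocksAllocated.get(pipelineId);
        if (counter != null) {
            counter.increment();
        }
    }

    // Called when a pipeline is finalized, so stale gauges are not reported.
    void removePipelineMetrics(String pipelineId) {
        blocksAllocated.remove(pipelineId);
    }

    long getNumBlocksAllocated(String pipelineId) {
        LongAdder counter = blocksAllocated.get(pipelineId);
        return counter == null ? 0 : counter.sum();
    }
}
```

Comparing these counters across pipelines gives the allocation distribution the issue description mentions.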
[jira] [Work logged] (HDDS-1735) Create separate unit and integration test executor dev-support script
[ https://issues.apache.org/jira/browse/HDDS-1735?focusedWorklogId=273048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-273048 ] ASF GitHub Bot logged work on HDDS-1735: Author: ASF GitHub Bot Created on: 08/Jul/19 05:25 Start Date: 08/Jul/19 05:25 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1035: HDDS-1735. Create separate unit and integration test executor dev-support script URL: https://github.com/apache/hadoop/pull/1035#issuecomment-509083277 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 32 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | 0 | shelldocs | 0 | Shelldocs was not available. | | 0 | @author | 0 | Skipping @author checks as author.sh has been patched. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 32 | Maven dependency ordering for branch | | +1 | mvninstall | 474 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | -1 | pylint | 1 | Error running pylint. Please check pylint stderr files. | | +1 | shadedclient | 774 | branch has no errors when building and testing our client artifacts. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 30 | Maven dependency ordering for patch | | +1 | mvninstall | 453 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | -1 | pylint | 2 | Error running pylint. Please check pylint stderr files. | | +1 | pylint | 2 | There were no new pylint issues. | | +1 | shellcheck | 2 | The patch generated 0 new + 0 unchanged - 7 fixed = 0 total (was 7) | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 702 | patch has no errors when building and testing our client artifacts. | ||| _ Other Tests _ | | +1 | unit | 104 | hadoop-hdds in the patch passed. | | +1 | unit | 179 | hadoop-ozone in the patch passed. 
| | +1 | asflicense | 49 | The patch does not generate ASF License warnings. | | | | 3029 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1035 | | Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs pylint | | uname | Linux cc2fdb9998fb 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 9c90729 | | pylint | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/4/artifact/out/branch-pylint-stderr.txt | | pylint | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/4/artifact/out/patch-pylint-stderr.txt | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/4/testReport/ | | Max. process+thread count | 411 (vs. ulimit of 5500) | | modules | C: hadoop-ozone hadoop-ozone/fault-injection-test/network-tests U: hadoop-ozone | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/4/console | | versions | git=2.7.4 maven=3.3.9 shellcheck=0.4.6 pylint=1.9.2 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 273048) Time Spent: 1h 40m (was: 1.5h) > Create separate unit and integration test executor dev-support script > - > > Key: HDDS-1735 > URL: https://issues.apache.org/jira/browse/HDDS-1735 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2019-07-02 at 3.25.33 PM.png > > Time Spent: 1h 40m > Remaining Estimate: 0h > > hadoop-ozone/dev-support/checks directory contains multiple helper script to > execute different type of testing (findbugs, rat, unit, build). > They easily define how tests should be executed, with the following contract: > * The problems should be pri
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880028#comment-16880028 ] He Xiaoqiao commented on HDFS-14313: [~leosun08] Thanks for your report and patch for this issue. And sorry for missing this information. I would like to review it this week. Thanks again. > Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory > instead of df/du > > > Key: HDFS-14313 > URL: https://issues.apache.org/jira/browse/HDFS-14313 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, performance >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0 >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, > HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, > HDFS-14313.005.patch > > > The two existing ways of getting used space, DU and DF, are insufficient. > # Running DU across lots of disks is very expensive and running all of the > processes at the same time creates a noticeable IO spike. > # Running DF is inaccurate when the disk is shared by multiple datanodes or > other servers. > Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory > has very low overhead and is accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
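The HDFS-14313 idea above amounts to summing the replica lengths already held in memory instead of forking df/du. A sketch of that computation; the Replica class below is a simplified stand-in for HDFS's ReplicaInfo, not the attached patch:

```java
import java.util.Collection;

class InMemoryUsedSpace {
    // Stand-in for org.apache.hadoop.hdfs ReplicaInfo; field names here are
    // illustrative assumptions.
    static class Replica {
        final long blockBytes;  // on-disk length of the block file
        final long metaBytes;   // on-disk length of the checksum/meta file
        Replica(long blockBytes, long metaBytes) {
            this.blockBytes = blockBytes;
            this.metaBytes = metaBytes;
        }
    }

    // Used space is a sum over the volume's in-memory replica map: no forked
    // du processes, no IO spike, and unaffected by other tenants on the disk.
    static long usedSpace(Collection<Replica> volumeMapReplicas) {
        long used = 0;
        for (Replica r : volumeMapReplicas) {
            used += r.blockBytes + r.metaBytes;
        }
        return used;
    }
}
```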
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880022#comment-16880022 ] He Xiaoqiao commented on HDFS-12703: [~elgoiri], thanks for your reviews. [^HDFS-12703.006.patch] is updated with the above review comments. {quote}Should we be extra careful and catch also in the run() just in case? {quote} Correct, we should also catch the exception in run() since {{Preconditions.checkState}} is at the last part of Monitor#check. Thanks [~elgoiri] again. Please let me know if there are other corrections. > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
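The {{scheduleAtFixedRate}} behavior quoted in the HDFS-12703 description above is why the fix catches inside the task itself: one uncaught exception silently suppresses every subsequent execution. A minimal sketch of the defensive catch, with illustrative names rather than the actual patch:

```java
import java.util.concurrent.atomic.AtomicInteger;

class ResilientMonitor implements Runnable {
    final AtomicInteger ticks = new AtomicInteger();

    // Stand-in for DecommissionManager.Monitor#check, which can throw (e.g.
    // a failed Preconditions.checkState at its end).
    private void check() {
        ticks.incrementAndGet();
        throw new IllegalStateException("simulated check() failure");
    }

    @Override
    public void run() {
        try {
            check();
        } catch (Exception e) {
            // Log and swallow: if this escaped, scheduleAtFixedRate would
            // suppress all later executions and decommissioning would stall
            // until a namenode restart.
        }
    }
}
```

Scheduled via {{executor.scheduleAtFixedRate(monitor, ...)}}, this task keeps firing even when an individual check() fails.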
[jira] [Updated] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-12703: --- Attachment: HDFS-12703.006.patch > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch, > HDFS-12703.006.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880012#comment-16880012 ] stack commented on HDFS-14483: -- [~leosun08] Thanks. Looking at history of hdfs builds, I see that it failed in the build just before this one for the HDFS-13694 patch. Unrelated then. Let me push. > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch, HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880012#comment-16880012 ] stack edited comment on HDFS-14483 at 7/8/19 3:52 AM: -- [~leosun08] Thanks. Looking at history of hdfs builds, I see that it failed in the build just before this one for the HDFS-13694 patch. Unrelated then. Let me push. Will do tomorrow in case someone else wants to comment in the meantime. was (Author: stack): [~leosun08] Thanks. Looking at history of hdfs builds, I see that it failed in the build just before this one for the HDFS-13694 patch. Unrelated then. Let me push. > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch, HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880009#comment-16880009 ] Lisheng Sun commented on HDFS-14483: Hi [~stack] I confirm again that the UT TestJournalNodeRespectsBindHostKeys passes with this patch on my local machine. Thank you. {code:java} TestJournalNodeRespectsBindHostKeys [INFO] Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.781 s - in org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys {code} > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch, HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880008#comment-16880008 ] stack commented on HDFS-14483: -- ...and +1 on patch. Lets just figure the story on this last flakey...and then I'll commit (unless objection). Thanks [~leosun08] > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch, HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HDFS-14483: - Attachment: HDFS-14585.branch-2.9.v3.patch > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch, HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14483) Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9
[ https://issues.apache.org/jira/browse/HDFS-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880004#comment-16880004 ] stack commented on HDFS-14483: -- Thanks for fixing the short circuit unit test [~leosun08]. Seems to have worked. As said above.. hadoop.hdfs.web.TestWebHdfsTimeouts hadoop.hdfs.server.datanode.TestDirectoryScanner ... are for sure flakey. TestJournalNodeRespectsBindHostKeys I'm not so sure. Will do a survey of recent test history... Meantime let me get another run in. > Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 > -- > > Key: HDFS-14483 > URL: https://issues.apache.org/jira/browse/HDFS-14483 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Zheng Hu >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-14483.branch-2.8.v1.patch, > HDFS-14483.branch-2.9.v1.patch, HDFS-14483.branch-2.9.v1.patch, > HDFS-14483.branch-2.9.v2 (2).patch, HDFS-14483.branch-2.9.v2.patch, > HDFS-14483.branch-2.9.v2.patch, HDFS-14585.branch-2.9.v3.patch, > HDFS-14585.branch-2.9.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14634) the original active namenode should have priority to participate in the election when the zookeeper recovery
[ https://issues.apache.org/jira/browse/HDFS-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1687#comment-1687 ] Hadoop QA commented on HDFS-14634: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-14634 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14634 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973869/HDFS-14634-v1.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27161/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > the original active namenode should have priority to participate in the > election when the zookeeper recovery > - > > Key: HDFS-14634 > URL: https://issues.apache.org/jira/browse/HDFS-14634 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 2.7.2 >Reporter: liying >Priority: Major > Attachments: HDFS-14634-v1.patch > > > Dynamically generates the namenode's election priorities in the zkfc module. > For example, when the zookeeper crashes, all of the namenodes remain in their > original state. Then, when the zookeeper service recovers, the original active > namenode should have priority to participate in the election. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14634) the original active namenode should have priority to participate in the election when the zookeeper recovery
[ https://issues.apache.org/jira/browse/HDFS-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liying updated HDFS-14634: -- Status: Open (was: Patch Available) > the original active namenode should have priority to participate in the > election when the zookeeper recovery > - > > Key: HDFS-14634 > URL: https://issues.apache.org/jira/browse/HDFS-14634 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 2.7.2 >Reporter: liying >Priority: Major > Attachments: HDFS-14634-v1.patch > > > Dynamically generates the namenode's election priorities in the zkfc module. > For example, when the zookeeper crashes, all of the namenodes remain in their > original state. Then, when the zookeeper service recovers, the original active > namenode should have priority to participate in the election. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14634) the original active namenode should have priority to participate in the election when the zookeeper recovery
[ https://issues.apache.org/jira/browse/HDFS-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liying updated HDFS-14634: -- Attachment: HDFS-14634-v1.patch Status: Patch Available (was: Open) Wait for some time when the standby namenode joins the elector, so that the original active namenode has priority to participate in the election when the zookeeper service recovers. > the original active namenode should have priority to participate in the > election when the zookeeper recovery > - > > Key: HDFS-14634 > URL: https://issues.apache.org/jira/browse/HDFS-14634 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 2.7.2 >Reporter: liying >Priority: Major > Attachments: HDFS-14634-v1.patch > > > Dynamically generates the namenode's election priorities in the zkfc module. > For example, when the zookeeper crashes, all of the namenodes remain in their > original state. Then, when the zookeeper service recovers, the original active > namenode should have priority to participate in the election. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14634) the original active namenode should have priority to participate in the election when the zookeeper recovery
[ https://issues.apache.org/jira/browse/HDFS-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liying updated HDFS-14634: -- Release Note: Wait for some time when the standby namenode joins the elector, so that the original active namenode has priority to participate in the election when the zookeeper service recovers. Status: Patch Available (was: Open) > the original active namenode should have priority to participate in the > election when the zookeeper recovery > - > > Key: HDFS-14634 > URL: https://issues.apache.org/jira/browse/HDFS-14634 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 2.7.2 >Reporter: liying >Priority: Major > > Dynamically generates the namenode's election priorities in the zkfc module. > For example, when the zookeeper crashes, all of the namenodes remain in their > original state. Then, when the zookeeper service recovers, the original active > namenode should have priority to participate in the election. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
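The HDFS-14634 approach described above (a standby waits before joining the election so the previously active namenode can win first) can be sketched as below. The class, method, and grace-period parameter are assumptions for illustration only, not the attached patch.

```java
class ElectionJoinDelay {
    // How long this ZKFC should wait before calling joinElection() after the
    // ZooKeeper session is re-established: the previously active node joins
    // immediately, while standbys yield a grace period so the old active
    // retakes the lock if it is still healthy.
    static long joinDelayMillis(boolean wasActive, long standbyGraceMillis) {
        return wasActive ? 0L : standbyGraceMillis;
    }
}
```

If the old active is actually down, the standby still joins after the grace period, so failover is delayed but not prevented.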
[jira] [Updated] (HDFS-13694) Making md5 computing being in parallel with image loading
[ https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13694: --- Fix Version/s: 3.1.3 2.9.3 3.2.1 3.0.4 2.10.0 > Making md5 computing being in parallel with image loading > - > > Key: HDFS-13694 > URL: https://issues.apache.org/jira/browse/HDFS-13694 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3 > > Attachments: HDFS-13694-001.patch, HDFS-13694-002.patch, > HDFS-13694-003.patch, HDFS-13694-004.patch, HDFS-13694-005.patch, > HDFS-13694-006.patch, HDFS-13694-007.patch > > > During namenode image loading, it first computes the md5 and then loads the > image. Actually, these two steps can run in parallel. > Testing this patch against an fsimage of a 70PB 2.4 cluster (200 million files > and 300 million blocks), the image loading time was reduced from 1210 seconds > to 1105 seconds, so it can save up to about 10% of the time. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13694) Making md5 computing being in parallel with image loading
[ https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879994#comment-16879994 ] Íñigo Goiri commented on HDFS-13694: Cherry-picked to branch-3.2, branch-3.1, branch-3.0, branch-2, and branch-2.9. > Making md5 computing being in parallel with image loading > - > > Key: HDFS-13694 > URL: https://issues.apache.org/jira/browse/HDFS-13694 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-13694-001.patch, HDFS-13694-002.patch, > HDFS-13694-003.patch, HDFS-13694-004.patch, HDFS-13694-005.patch, > HDFS-13694-006.patch, HDFS-13694-007.patch > > > During namenode image loading, it first computes the md5 and then loads the > image. Actually, these two steps can run in parallel. > Testing this patch against an fsimage of a 70PB 2.4 cluster (200 million files > and 300 million blocks), the image loading time was reduced from 1210 seconds > to 1105 seconds, so it can save up to about 10% of the time. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
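The HDFS-13694 idea of overlapping md5 computation with image loading can be sketched with a background digest task. Names and structure here are illustrative, not the committed patch (which pipelines a single read rather than reading the file twice):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.concurrent.CompletableFuture;

class ParallelMd5Load {
    // Stream the file through an MD5 digest.
    static byte[] md5(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) > 0) {
                md.update(buf, 0, n);
            }
        }
        return md.digest();
    }

    // Kick off the digest asynchronously, load the image on the caller's
    // thread so the two overlap, then join to get the digest for verification.
    static byte[] loadWithDigest(Path file, Runnable loadImage) throws Exception {
        CompletableFuture<byte[]> digest = CompletableFuture.supplyAsync(() -> {
            try {
                return md5(file);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        loadImage.run();     // runs concurrently with the digest computation
        return digest.get(); // the NN compares this against the stored MD5
    }
}
```

Because the digest and the parse are IO/CPU work over the same bytes, the savings are bounded by the cheaper of the two passes, which matches the roughly 10% reported above.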
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879992#comment-16879992 ] Íñigo Goiri commented on HDFS-12703: Thanks [~hexiaoqiao] for checking. It also looked to me like some multi-threading issue with the state of the datanodes. Based on that, I think it is OK to catch it in the {{check()}} and not in the {{run()}}. Should we be extra careful and catch also in the {{run()}} just in case? Comments on [^HDFS-12703.005.patch]: * In the log, I would report the DN that is failing too, as we are playing with it in the catch. * In the unit test, I would explain what we are trying to catch in the main javadoc and mention that the executor swallows exceptions by default. * I think we should extend the unit test and make sure this is not happening. Should we check the value before triggering? Checking that the thread is alive (and that it dies without the fix)? Checking for the exception message? > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. 
> Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
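The failure mode quoted in the description, and the catch-all defense discussed in the review, can be illustrated with a minimal sketch (illustrative names only, not HDFS code):

```java
// Once a task submitted with ScheduledExecutorService#scheduleAtFixedRate
// throws, all subsequent executions are suppressed. Catching everything
// inside the task body, as the patch does in check(), keeps the monitor
// ticking on the next scheduled run.
class MonitorSketch implements Runnable {
    int runs = 0;
    final boolean catchAll;  // true = defensive (patched) behavior
    MonitorSketch(boolean catchAll) { this.catchAll = catchAll; }

    @Override
    public void run() {
        runs++;
        try {
            // Simulate the multi-threading race on DN state described above.
            throw new IllegalStateException("simulated DN state race");
        } catch (RuntimeException e) {
            if (!catchAll) {
                throw e;  // original behavior: this kills the whole schedule
            }
            // patched behavior: log and retry on the next tick
        }
    }
}
```

With `catchAll` set, the task survives the exception and the scheduler keeps invoking it; without it, the first throw silently ends all future runs.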
[jira] [Commented] (HDFS-14361) SNN will always upload fsimage
[ https://issues.apache.org/jira/browse/HDFS-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879989#comment-16879989 ] hunshenshi commented on HDFS-14361: --- [~starphin] Thanks. So it should be moved out of the if block. [~brahmareddy] could you help review the patch again? Thanks > SNN will always upload fsimage > -- > > Key: HDFS-14361 > URL: https://issues.apache.org/jira/browse/HDFS-14361 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Affects Versions: 3.2.0 >Reporter: hunshenshi >Priority: Major > Fix For: 3.2.0 > > > Related to -HDFS-12248.- > {code:java} > boolean sendRequest = isPrimaryCheckPointer > || secsSinceLastUpload >= checkpointConf.getQuietPeriod(); > doCheckpoint(sendRequest); > {code} > If sendRequest is true, SNN will upload the fsimage. But isPrimaryCheckPointer > is always true: > {code:java} > if (ie == null && ioe == null) { > //Update only when response from remote about success or > lastUploadTime = monotonicNow(); > // we are primary if we successfully updated the ANN > this.isPrimaryCheckPointer = success; > } > {code} > The isPrimaryCheckPointer assignment should be outside the if condition. > If the ANN update was not successful, then isPrimaryCheckPointer should be > set to false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14361) SNN will always upload fsimage
[ https://issues.apache.org/jira/browse/HDFS-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879988#comment-16879988 ] star commented on HDFS-14361: - Right. isPrimaryCheckPointer will not be changed when any error/exception occurs. > SNN will always upload fsimage > -- > > Key: HDFS-14361 > URL: https://issues.apache.org/jira/browse/HDFS-14361 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode >Affects Versions: 3.2.0 >Reporter: hunshenshi >Priority: Major > Fix For: 3.2.0 > > > Related to -HDFS-12248.- > {code:java} > boolean sendRequest = isPrimaryCheckPointer > || secsSinceLastUpload >= checkpointConf.getQuietPeriod(); > doCheckpoint(sendRequest); > {code} > If sendRequest is true, SNN will upload the fsimage. But isPrimaryCheckPointer > is always true: > {code:java} > if (ie == null && ioe == null) { > //Update only when response from remote about success or > lastUploadTime = monotonicNow(); > // we are primary if we successfully updated the ANN > this.isPrimaryCheckPointer = success; > } > {code} > The isPrimaryCheckPointer assignment should be outside the if condition. > If the ANN update was not successful, then isPrimaryCheckPointer should be > set to false. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
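The fix discussed above, moving the flag assignment out of the if-block, can be sketched as follows (simplified, hypothetical names; not the actual HDFS-14361 patch):

```java
// Hypothetical simplification of the Standby NameNode checkpoint flag logic.
class CheckpointFlagSketch {
    boolean isPrimaryCheckPointer = true;
    long lastUploadTime;

    // success = whether the ANN acknowledged the fsimage upload;
    // ie/ioe  = exceptions raised during the upload attempt, if any.
    void afterUpload(boolean success, Exception ie, Exception ioe, long now) {
        if (ie == null && ioe == null) {
            // Only record an upload time when the attempt completed cleanly.
            lastUploadTime = now;
        }
        // Moved outside the if-block: a failed upload must demote this SNN,
        // otherwise sendRequest stays true forever and the SNN re-uploads
        // the fsimage on every checkpoint.
        this.isPrimaryCheckPointer = success;
    }
}
```

With the assignment outside the conditional, an unsuccessful ANN update clears the flag, so the next `sendRequest` decision falls back to the quiet-period check.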
[jira] [Updated] (HDFS-14634) the original active namenode should have priority to participate in the election when the zookeeper recovery
[ https://issues.apache.org/jira/browse/HDFS-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liying updated HDFS-14634: -- Summary: the original active namenode should have priority to participate in the election when the zookeeper recovery (was: Dynamically generates the namenode's election priorities) > the original active namenode should have priority to participate in the > election when the zookeeper recovery > - > > Key: HDFS-14634 > URL: https://issues.apache.org/jira/browse/HDFS-14634 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 2.7.2 >Reporter: liying >Priority: Major > > Dynamically generate the namenode's election priorities in the ZKFC module. > For example, when ZooKeeper crashes, all of the namenodes remain in their > original state. Then, when the ZooKeeper service recovers, the original active > namenode should have priority to participate in the election. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14634) Dynamically generates the namenode's election priorities
liying created HDFS-14634: - Summary: Dynamically generates the namenode's election priorities Key: HDFS-14634 URL: https://issues.apache.org/jira/browse/HDFS-14634 Project: Hadoop HDFS Issue Type: Improvement Components: auto-failover Affects Versions: 2.7.2 Reporter: liying Dynamically generate the namenode's election priorities in the ZKFC module. For example, when ZooKeeper crashes, all of the namenodes remain in their original state. Then, when the ZooKeeper service recovers, the original active namenode should have priority to participate in the election. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1735) Create separate unit and integration test executor dev-support script
[ https://issues.apache.org/jira/browse/HDDS-1735?focusedWorklogId=272986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-272986 ] ASF GitHub Bot logged work on HDDS-1735: Author: ASF GitHub Bot Created on: 07/Jul/19 22:56 Start Date: 07/Jul/19 22:56 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1035: HDDS-1735. Create separate unit and integration test executor dev-support script URL: https://github.com/apache/hadoop/pull/1035#issuecomment-509037232 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 32 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 1 | No case conflicting files found. | | 0 | shelldocs | 1 | Shelldocs was not available. | | 0 | @author | 0 | Skipping @author checks as author.sh has been patched. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 44 | Maven dependency ordering for branch | | +1 | mvninstall | 504 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | -1 | pylint | 1 | Error running pylint. Please check pylint stderr files. | | +1 | shadedclient | 772 | branch has no errors when building and testing our client artifacts. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 32 | Maven dependency ordering for patch | | +1 | mvninstall | 454 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | -1 | pylint | 2 | Error running pylint. Please check pylint stderr files. | | +1 | pylint | 2 | There were no new pylint issues. | | +1 | shellcheck | 1 | The patch generated 0 new + 0 unchanged - 7 fixed = 0 total (was 7) | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 708 | patch has no errors when building and testing our client artifacts. | ||| _ Other Tests _ | | +1 | unit | 103 | hadoop-hdds in the patch passed. | | +1 | unit | 179 | hadoop-ozone in the patch passed. 
| | +1 | asflicense | 48 | The patch does not generate ASF License warnings. | | | | 3076 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1035 | | Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs pylint | | uname | Linux f16dcd073b52 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 9c90729 | | pylint | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/3/artifact/out/branch-pylint-stderr.txt | | pylint | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/3/artifact/out/patch-pylint-stderr.txt | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/3/testReport/ | | Max. process+thread count | 446 (vs. ulimit of 5500) | | modules | C: hadoop-ozone hadoop-ozone/fault-injection-test/network-tests U: hadoop-ozone | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1035/3/console | | versions | git=2.7.4 maven=3.3.9 shellcheck=0.4.6 pylint=1.9.2 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 272986) Time Spent: 1.5h (was: 1h 20m) > Create separate unit and integration test executor dev-support script > - > > Key: HDDS-1735 > URL: https://issues.apache.org/jira/browse/HDDS-1735 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2019-07-02 at 3.25.33 PM.png > > Time Spent: 1.5h > Remaining Estimate: 0h > > hadoop-ozone/dev-support/checks directory contains multiple helper script to > execute different type of testing (findbugs, rat, unit, build). > They easily define how tests should be executed, with the following contract: > * The problems should be print
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879944#comment-16879944 ] Hadoop QA commented on HDFS-12703: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 2s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 52s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}138m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-12703 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973858/HDFS-12703.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 26b692590bd0 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9c90729 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27160/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27160/testReport/ | | Max. process+thread count | 3921 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apa
[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879932#comment-16879932 ] Hadoop QA commented on HDFS-14034: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 46s{color} | {color:orange} hadoop-hdfs-project: The patch generated 5 new + 261 unchanged - 0 fixed = 266 total (was 261) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}150m 2s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-14034 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973855/HDFS-14034.000.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4a2fca299251 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9c90729 | | maven | version: Apache Ma
[jira] [Comment Edited] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879917#comment-16879917 ] He Xiaoqiao edited comment on HDFS-12703 at 7/7/19 6:10 PM: Uploaded patch [^HDFS-12703.005.patch] with a unit test to try to fix this issue. After digging into the decommission logic, I think the root cause is that the DatanodeDescriptor interface is not thread-safe. Consider that {{DatanodeAdminManager#monitor}} is running while another thread sets the {{adminState}} of the corresponding DataNode to {{Decommissioned}}; then this issue will reproduce. [^HDFS-12703.005.patch] just catches the exception, removes the datanode from {{outOfServiceNodeBlocks}}, and pushes it back to {{pendingNodes}} so that it is processed in the next loop. {quote}Does it need a restart or another refreshNodes to take it out of the invalid state? {quote} Since the check is postponed and the node will reach the proper state in the next loop, we do not need to operate on the DataNode or run refreshNodes again. To [~xuel1], I have just assigned the JIRA to myself; please feel free to assign it back to you if you would like to continue working on this issue before we resolve it. was (Author: hexiaoqiao): Uploaded patch [^HDFS-12703.005.patch] with a unit test to try to fix this issue. After digging into the decommission logic, I think the root cause is that the DatanodeDescriptor interface is not thread-safe. Consider that {{DatanodeAdminManager#monitor}} is running while another thread sets the {{adminState}} of the corresponding DataNode to {{Decommissioned}}; then this issue will reproduce. [^HDFS-12703.005.patch] just catches the exception, removes the datanode from {{outOfServiceNodeBlocks}}, and pushes it back to {{pendingNodes}} so that it is processed in the next loop. {code:java} Does it need a restart or another refreshNodes to take it out of the invalid state? {code} Since the check is postponed and the node will reach the proper state in the next loop, we do not need to operate on the DataNode or run refreshNodes again. 
To [~xuel1], I have just assigned the JIRA to myself; please feel free to assign it back to you if you would like to continue working on this issue before we resolve it. > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879917#comment-16879917 ] He Xiaoqiao commented on HDFS-12703: Uploaded patch [^HDFS-12703.005.patch] with a unit test to try to fix this issue. After digging into the decommission logic, I think the root cause is that the DatanodeDescriptor interface is not thread-safe. Consider that {{DatanodeAdminManager#monitor}} is running while another thread sets the {{adminState}} of the corresponding DataNode to {{Decommissioned}}; then this issue will reproduce. [^HDFS-12703.005.patch] just catches the exception, removes the datanode from {{outOfServiceNodeBlocks}}, and pushes it back to {{pendingNodes}} so that it is processed in the next loop. {quote}Does it need a restart or another refreshNodes to take it out of the invalid state? {quote} Since the check is postponed and the node will reach the proper state in the next loop, we do not need to operate on the DataNode or run refreshNodes again. To [~xuel1], I have just assigned the JIRA to myself; please feel free to assign it back to you if you would like to continue working on this issue before we resolve it. > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. 
The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
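The recovery path described in the comment (catch the exception, remove the node from {{outOfServiceNodeBlocks}}, and requeue it into {{pendingNodes}} for the next loop) can be sketched as follows; this is a hypothetical simplification, not the actual HDFS-12703 patch:

```java
import java.util.*;

// Sketch: when processing a node throws (e.g. because its admin state was
// changed concurrently), drop it from the in-progress map and requeue it so
// the next monitor tick retries it from a consistent state.
class RequeueSketch {
    final Map<String, List<String>> outOfServiceNodeBlocks = new HashMap<>();
    final Deque<String> pendingNodes = new ArrayDeque<>();

    void check() {
        Iterator<Map.Entry<String, List<String>>> it =
            outOfServiceNodeBlocks.entrySet().iterator();
        while (it.hasNext()) {
            String dn = it.next().getKey();
            try {
                process(dn);
            } catch (RuntimeException e) {
                // Non-thread-safe DN state raced with an admin-state change:
                // requeue instead of letting the exception kill the monitor.
                it.remove();
                pendingNodes.add(dn);
            }
        }
    }

    // Stand-in for the per-node decommission bookkeeping; throws to
    // simulate the inconsistent-state failure.
    void process(String dn) {
        if (dn.startsWith("bad")) {
            throw new IllegalStateException("inconsistent admin state: " + dn);
        }
    }
}
```

Because the node goes back onto the pending queue, no restart or extra refreshNodes is needed; the retry happens naturally on the next scheduled check.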
[jira] [Updated] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-12703: --- Attachment: HDFS-12703.005.patch > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Xue Liu >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-12703) Exceptions are fatal to decommissioning monitor
[ https://issues.apache.org/jira/browse/HDFS-12703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao reassigned HDFS-12703: -- Assignee: He Xiaoqiao (was: Xue Liu) > Exceptions are fatal to decommissioning monitor > --- > > Key: HDFS-12703 > URL: https://issues.apache.org/jira/browse/HDFS-12703 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: He Xiaoqiao >Priority: Critical > Attachments: HDFS-12703.001.patch, HDFS-12703.002.patch, > HDFS-12703.003.patch, HDFS-12703.004.patch, HDFS-12703.005.patch > > > The {{DecommissionManager.Monitor}} runs as an executor scheduled task. If > an exception occurs, all decommissioning ceases until the NN is restarted. > Per javadoc for {{executor#scheduleAtFixedRate}}: *If any execution of the > task encounters an exception, subsequent executions are suppressed*. The > monitor thread is alive but blocked waiting for an executor task that will > never come. The code currently disposes of the future so the actual > exception that aborted the task is gone. > Failover is insufficient since the task is also likely dead on the standby. > Replication queue init after the transition to active will fix the under > replication of blocks on currently decommissioning nodes but future nodes > never decommission. The standby must be bounced prior to failover – and > hopefully the error condition does not reoccur. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-14034: Status: Patch Available (was: Open) > Support getQuotaUsage API in WebHDFS > > > Key: HDFS-14034 > URL: https://issues.apache.org/jira/browse/HDFS-14034 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, webhdfs >Reporter: Erik Krogen >Assignee: Chao Sun >Priority: Major > Attachments: HDFS-14034.000.patch > > > HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch > quota usage on a directory with significantly lower impact than the similar > {{getContentSummary}}. This JIRA is to track adding support for this API to > WebHDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14034) Support getQuotaUsage API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879892#comment-16879892 ] Chao Sun commented on HDFS-14034:
---------------------------------

Sorry for the delay. Submitted patch v0.

> Support getQuotaUsage API in WebHDFS
[jira] [Updated] (HDFS-14034) Support getQuotaUsage API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-14034:
----------------------------

    Attachment: HDFS-14034.000.patch

> Support getQuotaUsage API in WebHDFS
[jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879882#comment-16879882 ] Hadoop QA commented on HDFS-14313:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
|  0 | mvndep | 1m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 1s | trunk passed |
| +1 | compile | 16m 2s | trunk passed |
| +1 | checkstyle | 2m 13s | trunk passed |
| +1 | mvnsite | 2m 24s | trunk passed |
| +1 | shadedclient | 16m 15s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 32s | trunk passed |
| +1 | javadoc | 1m 57s | trunk passed |
|| || || || Patch Compile Tests ||
|  0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 43s | the patch passed |
| +1 | compile | 15m 28s | the patch passed |
| +1 | javac | 15m 28s | the patch passed |
| -0 | checkstyle | 2m 16s | root: The patch generated 2 new + 245 unchanged - 1 fixed = 247 total (was 246) |
| +1 | mvnsite | 2m 31s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 14s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 12s | the patch passed |
| +1 | javadoc | 2m 8s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 22s | hadoop-common in the patch passed. |
| -1 | unit | 81m 59s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 44s | The patch does not generate ASF License warnings. |
| | | 192m 35s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14313 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973852/HDFS-14313.005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cd4c7e158ee8 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven
[jira] [Updated] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-14313:
-------------------------------

    Attachment: HDFS-14313.005.patch

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, HDFS-14313.005.patch
>
> The two existing ways of getting used space, DU and DF, are insufficient:
> # Running DU across lots of disks is very expensive, and running all of the processes at the same time creates a noticeable IO spike.
> # Running DF is inaccurate when the disk is shared by multiple datanodes or other servers.
> Getting the HDFS used space from the FsDatasetImpl#volumeMap#ReplicaInfos in memory is very cheap and accurate.
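The idea above can be sketched in a few lines. This is a simplified illustration with hypothetical names, not the actual {{FsDatasetImpl}} code: since the datanode already tracks every replica's on-disk length in its volume map, used space can be computed by summing those lengths in memory rather than forking {{du}} over each volume:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Main {
    // Sum replica lengths already held in memory: an O(#replicas) walk
    // with zero disk I/O, unlike periodically forking `du` per volume.
    static long usedSpace(Map<Long, Long> replicaMap) {
        long used = 0;
        for (long len : replicaMap.values()) {
            used += len;
        }
        return used;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for FsDatasetImpl#volumeMap:
        // blockId -> on-disk length of the replica.
        Map<Long, Long> replicaMap = new ConcurrentHashMap<>();
        replicaMap.put(1L, 134_217_728L); // a full 128 MB block
        replicaMap.put(2L, 1_048_576L);   // a 1 MB block
        System.out.println(usedSpace(replicaMap)); // prints 135266304
    }
}
```

Unlike {{df}}, this counts only this datanode's replicas even on a shared disk, and unlike {{du}} it costs no I/O; the trade-off is that it misses non-replica files (tmp data, metadata) that an on-disk scan would see.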