[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569022#comment-16569022 ]

Ajay Kumar commented on HDDS-298:
---------------------------------

[~nandakumar131], thanks for the review. Patch v4 throws an SCM IO exception in place of the Precondition check.

> Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-298
>                 URL: https://issues.apache.org/jira/browse/HDDS-298
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: SCM
>            Reporter: Elek, Marton
>            Assignee: Ajay Kumar
>            Priority: Critical
>             Fix For: 0.2.1
>
>         Attachments: HDDS-298.00.patch, HDDS-298.01.patch, HDDS-298.02.patch, HDDS-298.03.patch, HDDS-298.04.patch
>
> As [~ljain] mentioned during the review of HDDS-245, SCMClientProtocolServer.getContainerWithPipeline doesn't return good data for closed containers. For closed containers we maintain the datanodes for a containerId in ContainerStateMap.contReplicaMap. We need to create a fake Pipeline object on request and return it so the client can locate the right datanodes to download the data.
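To make the intended behavior concrete, here is a minimal, self-contained sketch of the idea: for a CLOSED container SCM has no live pipeline, so it synthesizes one on request from the replica locations it tracks. The types below are simplified stand-ins, not the real HDDS Pipeline/DatanodeDetails classes, and the error handling mirrors the Precondition-to-exception change described in the comment above.

{code:java}
import java.util.List;
import java.util.UUID;

// Illustrative sketch only; all types are simplified stand-ins for the
// real HDDS classes.
final class ClosedContainerPipelineSketch {

  record Datanode(UUID id, String host) {}

  record Pipeline(String id, String state, List<Datanode> members) {}

  static Pipeline pipelineForClosedContainer(long containerId,
      List<Datanode> replicas) {
    if (replicas.isEmpty()) {
      // Patch v4: surface a proper SCM-level error instead of tripping a
      // Guava Precondition inside the server.
      throw new IllegalStateException(
          "No replica found for closed container " + containerId);
    }
    // A "fake" standalone pipeline: it only carries replica locations so the
    // client can pick a datanode to read from; no Raft ring backs it.
    return new Pipeline("closed-" + containerId, "CLOSED",
        List.copyOf(replicas));
  }
}
{code}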
[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570441#comment-16570441 ]

Ajay Kumar commented on HDDS-298:
---------------------------------

Patch v5 fixes the failed test case.
[jira] [Updated] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-298:
----------------------------
    Attachment: HDDS-298.05.patch
[jira] [Created] (HDDS-335) Fix logging for scm events
Ajay Kumar created HDDS-335:
----------------------------

             Summary: Fix logging for scm events
                 Key: HDDS-335
                 URL: https://issues.apache.org/jira/browse/HDDS-335
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Ajay Kumar
            Assignee: Ajay Kumar

Logs should print the event type.

{code}
java.lang.IllegalArgumentException: No event handler registered for event org.apache.hadoop.hdds.server.events.TypedEvent@69464649
	at org.apache.hadoop.hdds.server.events.EventQueue.fireEvent(EventQueue.java:116)
	at org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher.dispatch(SCMDatanodeHeartbeatDispatcher.java:66)
	at org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.sendHeartbeat(SCMDatanodeProtocolServer.java:219)
	at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolServerSideTranslatorPB.sendHeartbeat(StorageContainerDatanodeProtocolServerSideTranslatorPB.java:90)
	at org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos$StorageContainerDatanodeProtocolService$2.callBlockingMethod(StorageContainerDatanodeProtocolProtos.java:19310)
{code}
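The `TypedEvent@69464649` in the message above is the default Object.toString() output. A minimal sketch of the fix this ticket asks for, assuming TypedEvent carries a payload class and a name (a simplified stand-in, not the exact HDDS class):

{code:java}
// Give the event type a readable toString() so "No event handler registered
// for event ..." names the event instead of printing an identity hash.
final class TypedEvent<PAYLOAD> {
  private final Class<PAYLOAD> payloadType;
  private final String name;

  TypedEvent(Class<PAYLOAD> payloadType, String name) {
    this.payloadType = payloadType;
    this.name = name;
  }

  @Override
  public String toString() {
    // e.g. "TypedEvent{payload=NodeReport, name=Node_Report}"
    return "TypedEvent{payload=" + payloadType.getSimpleName()
        + ", name=" + name + "}";
  }
}
{code}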
[jira] [Updated] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-335:
----------------------------
    Description: fixed the {code} markup around the stack trace (text otherwise unchanged; see above).
[jira] [Updated] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-335:
----------------------------
    Attachment: HDDS-335.00.patch
[jira] [Updated] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-335:
----------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-335:
----------------------------
    Description: Logs should print the event type; the classname and object hash currently logged are not very useful. (Stack trace unchanged, as above.)
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573454#comment-16573454 ]

Ajay Kumar commented on HDFS-13532:
-----------------------------------

[~crh], sure, I work out of PST.

> RBF: Adding security
> --------------------
>
>                 Key: HDFS-13532
>                 URL: https://issues.apache.org/jira/browse/HDFS-13532
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Íñigo Goiri
>            Assignee: Sherwood Zheng
>            Priority: Major
>         Attachments: RBF _ Security delegation token thoughts.pdf, RBF-DelegationToken-Approach1b.pdf, Security_for_Router-based Federation_design_doc.pdf
>
> HDFS Router-based federation should support security. This includes authentication and delegation tokens.
[jira] [Commented] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573673#comment-16573673 ]

Ajay Kumar commented on HDDS-335:
---------------------------------

[~nandakumar131] thanks for checking this. Yes, that should handle it. Resolving the ticket.
[jira] [Updated] (HDDS-335) Fix logging for scm events
[ https://issues.apache.org/jira/browse/HDDS-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-335:
----------------------------
    Resolution: Not A Problem
        Status: Resolved  (was: Patch Available)
[jira] [Commented] (HDFS-13823) NameNode UI : "Utilities -> Browse the file system -> open a file -> Head the file" is not working
[ https://issues.apache.org/jira/browse/HDFS-13823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578761#comment-16578761 ]

Ajay Kumar commented on HDFS-13823:
-----------------------------------

[~nandakumar131] thanks for fixing this. Tested it locally. +1 (non-binding)

> NameNode UI : "Utilities -> Browse the file system -> open a file -> Head the file" is not working
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13823
>                 URL: https://issues.apache.org/jira/browse/HDFS-13823
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ui
>    Affects Versions: 3.1.1
>            Reporter: Nanda kumar
>            Assignee: Nanda kumar
>            Priority: Major
>         Attachments: HDFS-13823.000.patch
>
> In the NameNode UI, the 'Head the file' and 'Tail the file' links under 'Utilities -> Browse the file system -> open a file' are not working. The file contents box comes up empty.
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578797#comment-16578797 ]

Ajay Kumar commented on HDDS-268:
---------------------------------

[~xyao], CommandStatusReportHandler already publishes {{SCMEvents.CLOSE_CONTAINER_STATUS}}. The new watcher will listen to this event. On FAILED status it sends an event to CloseContainerCommandHandler, which may resend the command to the datanodes. For all other cases the watcher removes the entry from its internal queue and considers the event completed. I think we need to handle the PENDING status separately as well.

{code}
@Override
protected synchronized void handleCompletion(CloseContainerStatus status,
    EventPublisher publisher) throws LeaseNotFoundException {
  // Look up the tracked request before the parent clears it.
  CloseContainerRetryableReq closeCont = getTrackedEventbyId(status.getId());
  super.handleCompletion(status, publisher);
  if (status.getCmdStatus().getStatus().equals(Status.FAILED)
      && closeCont != null) {
    // On failure, hand the request back to the handler for a retry.
    this.resendEventToHandler(closeCont.getId(), publisher);
  }
}
{code}

Had a discussion regarding this with [~nandakumar131]. If we don't consider a container to be closed until we receive acks from all related DNs, then we need to add the DN id to CloseContainerRetryableReq to track the command to every datanode. If you agree, I will submit a new patch with both changes.

> Add SCM close container watcher
> -------------------------------
>
>                 Key: HDDS-268
>                 URL: https://issues.apache.org/jira/browse/HDDS-268
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Xiaoyu Yao
>            Assignee: Ajay Kumar
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, HDDS-268.03.patch
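To illustrate the per-datanode tracking proposed above, here is a minimal sketch: a close command completes only once every replica's datanode has acked it. Class and method names are hypothetical, not the HDDS-268 patch itself.

{code:java}
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// A close command is complete only when every replica's datanode has acked.
final class CloseContainerAckTracker {
  // containerId -> datanode UUIDs we are still waiting on.
  private final Map<Long, Set<UUID>> pendingAcks = new ConcurrentHashMap<>();

  void track(long containerId, Set<UUID> replicaDatanodes) {
    Set<UUID> waiting = ConcurrentHashMap.newKeySet();
    waiting.addAll(replicaDatanodes);
    pendingAcks.put(containerId, waiting);
  }

  /** Returns true when the last outstanding datanode ack has arrived. */
  boolean ack(long containerId, UUID datanode) {
    Set<UUID> waiting = pendingAcks.get(containerId);
    if (waiting == null) {
      return false; // not tracked, or already completed
    }
    waiting.remove(datanode);
    if (waiting.isEmpty()) {
      pendingAcks.remove(containerId);
      return true; // all replicas acked: the container can be considered closed
    }
    return false;
  }
}
{code}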
[jira] [Updated] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-298:
----------------------------
    Attachment: HDDS-298.06.patch
[jira] [Updated] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-298:
----------------------------
    Attachment: HDDS-298.06.patch
[jira] [Updated] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-298:
----------------------------
    Attachment: (was: HDDS-298.06.patch)
[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580084#comment-16580084 ]

Ajay Kumar commented on HDDS-298:
---------------------------------

[~xyao] thanks for the review.

{code}ContainerMapping.java Line 79: NIT: CLOSE->CLOSED, Close-pipeline- => Closed-pipeline-{code}
With HDDS-324 we replaced the string-based pipeline name with a UUID-based pipelineId, so this name field was removed.

{code}Line 206-208: NIT: unrelated change{code}
Removed.

{code}Line 214: getContainerReplica() returns an immutable set, we can avoid allocating a new ArrayList for the datanodes.{code}
Done!

{code}Line 215-217: Can we fold this with a new API containerStateManager#getContainerReplica() to avoid exposing the complete containerStateMap here?{code}
Done. We already have such an API; I think I missed it initially.

{code}Line 221: can we define a more specific error code here?{code}
Added NO_REPLICA_FOUND in ResultCodes, but removed these lines since ContainerStateMap#getContainerReplicas already throws an SCM exception if no replicas are found. Updated the test case to check for this exception.

{code}Line 224: should we use a different replication type here for closed containers?{code}
I was thinking a closed container might already have the STANDALONE type. Changed it to STANDALONE explicitly.
[jira] [Commented] (HDDS-298) Implement SCMClientProtocolServer.getContainerWithPipeline for closed containers
[ https://issues.apache.org/jira/browse/HDDS-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580528#comment-16580528 ]

Ajay Kumar commented on HDDS-298:
---------------------------------

[~xyao] thanks for the review and commit. [~msingh], [~ljain] thanks for the reviews.
[jira] [Created] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
Ajay Kumar created HDDS-350:
----------------------------

             Summary: ContainerMapping#flushContainerInfo doesn't set containerId
                 Key: HDDS-350
                 URL: https://issues.apache.org/jira/browse/HDDS-350
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Ajay Kumar

ContainerMapping#flushContainerInfo doesn't set the containerId, which results in containerId being null in flushed containers.
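A minimal sketch of the likely bug pattern, assuming ContainerInfo is rebuilt through a builder during the flush (the types below are simplified stand-ins, not the real HDDS classes):

{code:java}
// If flushContainerInfo rebuilds each cached ContainerInfo via a builder and
// forgets to copy the id, the flushed record's containerId stays null.
final class FlushSketch {

  record ContainerInfo(Long containerID, long usedBytes) {

    static final class Builder {
      private Long containerID; // null unless explicitly set
      private long usedBytes;

      Builder setContainerID(Long id) { this.containerID = id; return this; }
      Builder setUsedBytes(long b)    { this.usedBytes = b;    return this; }
      ContainerInfo build() { return new ContainerInfo(containerID, usedBytes); }
    }
  }

  static ContainerInfo rebuildForFlush(ContainerInfo cached) {
    return new ContainerInfo.Builder()
        .setUsedBytes(cached.usedBytes())
        .setContainerID(cached.containerID()) // the fix: carry the id over
        .build();
  }
}
{code}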
[jira] [Assigned] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar reassigned HDDS-350:
-------------------------------
    Assignee: Ajay Kumar
[jira] [Assigned] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar reassigned HDDS-351:
-------------------------------
    Assignee: Ajay Kumar
[jira] [Created] (HDDS-351) Add chill mode state to SCM
Ajay Kumar created HDDS-351:
----------------------------

             Summary: Add chill mode state to SCM
                 Key: HDDS-351
                 URL: https://issues.apache.org/jira/browse/HDDS-351
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Ajay Kumar

Add chill mode state to SCM.
[jira] [Updated] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-350:
----------------------------
    Attachment: HDDS-350.00.patch
[jira] [Updated] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-350:
----------------------------
    Attachment: (was: HDDS-350.00.patch)
[jira] [Updated] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-350:
----------------------------
    Attachment: HDDS-350.00.patch
[jira] [Updated] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-350:
----------------------------
    Status: Patch Available  (was: Open)
[jira] [Created] (HDDS-354) VolumeInfo.getScmUsed throws NPE
Ajay Kumar created HDDS-354:
----------------------------

             Summary: VolumeInfo.getScmUsed throws NPE
                 Key: HDDS-354
                 URL: https://issues.apache.org/jira/browse/HDDS-354
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Ajay Kumar

{code}
java.lang.NullPointerException
	at org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
	at org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:366)
	at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:264)
	at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
	at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
	at org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
	at org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Updated] (HDDS-354) VolumeInfo.getScmUsed throws NPE
[ https://issues.apache.org/jira/browse/HDDS-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-354:
----------------------------
    Description: wrapped the stack trace in a {code} block (trace unchanged, as above).
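A defensive sketch of the kind of guard this NPE calls for; the field and method names are assumptions modeled on the stack trace, not the actual HDDS-354 fix:

{code:java}
import java.io.IOException;

// The trace points at VolumeInfo.getScmUsed, which suggests the volume's
// usage estimator can be null (e.g. before initialization or after shutdown).
final class VolumeInfoSketch {

  interface SpaceUsage { long getUsed(); }

  // null until the volume is initialized, and null again after shutdown.
  private volatile SpaceUsage usage;

  long getScmUsed() throws IOException {
    SpaceUsage current = usage;
    if (current == null) {
      // Failing fast with a descriptive error beats the bare NPE thrown
      // inside the node-report publisher thread.
      throw new IOException("Volume usage information is unavailable; "
          + "the volume may not be initialized or has been shut down.");
    }
    return current.getUsed();
  }
}
{code}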
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-351:
----------------------------
    Attachment: HDDS-351.00.patch
[jira] [Created] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
Ajay Kumar created HDDS-362:
----------------------------

             Summary: Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
                 Key: HDDS-362
                 URL: https://issues.apache.org/jira/browse/HDDS-362
             Project: Hadoop Distributed Data Store
          Issue Type: Sub-task
            Reporter: Ajay Kumar

Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol.
[jira] [Commented] (HDDS-119) Skip Apache license header check for some ozone doc scripts
[ https://issues.apache.org/jira/browse/HDDS-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583358#comment-16583358 ]

Ajay Kumar commented on HDDS-119:
---------------------------------

[~xyao] thanks for the updated patch. Tested locally. +1

> Skip Apache license header check for some ozone doc scripts
> -----------------------------------------------------------
>
>                 Key: HDDS-119
>                 URL: https://issues.apache.org/jira/browse/HDDS-119
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: document
>            Reporter: Xiaoyu Yao
>            Assignee: Ajay Kumar
>            Priority: Major
>             Fix For: 0.2.1
>
>         Attachments: HDDS-119.00.patch, HDDS-119.01.patch, HDDS-119.02.patch, HDDS-119.03.patch
>
> {code}
> Lines that start with ? in the ASF License report indicate files that do not have an Apache license header:
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/theme.toml
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/fonts/glyphicons-halflings-regular.svg
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/js/bootstrap.min.js
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/js/jquery.min.js
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/css/bootstrap-theme.min.css
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/css/bootstrap.min.css.map
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/css/bootstrap.min.css
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/static/css/bootstrap-theme.min.css.map
>  !? /testptch/hadoop/hadoop-ozone/docs/themes/ozonedoc/layouts/index.html
>  !? /testptch/hadoop/hadoop-ozone/docs/static/OzoneOverview.svg
> {code}
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583411#comment-16583411 ]

Ajay Kumar commented on HDDS-268:
---------------------------------

[~xyao] thanks for the information on the REPLICATE_ALL semantics. Will update the patch to handle the PENDING state.
[jira] [Commented] (HDDS-363) Faster datanode registration during the first startup
[ https://issues.apache.org/jira/browse/HDDS-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584078#comment-16584078 ]

Ajay Kumar commented on HDDS-363:
---------------------------------

Agreed, registration can be a little more aggressive. Once registered, the datanode can fall back to the configured heartbeat interval.

> Faster datanode registration during the first startup
> ------------------------------------------------------
>
>                 Key: HDDS-363
>                 URL: https://issues.apache.org/jira/browse/HDDS-363
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Minor
>             Fix For: 0.2.1
>
> During the first startup we usually need to wait about 30 s for the SCM to become usable. Datanode registration is a multi-step process (request/response + request/response) and we need to wait for the next HB to finish the registration.
> I propose using a higher HB frequency at startup (let's say 2 seconds) and setting the configured HB only at the end of the registration.
> It also helps first-time users, as it is less confusing (the datanode can be seen almost immediately on the UI).
> It would also help a lot during testing (yes, I can decrease the HB frequency, but in that case it's harder to follow the later HBs).
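A minimal sketch of that behavior, with the names and the 2-second startup interval taken from the issue text as assumptions rather than from the eventual patch:

{code:java}
import java.util.concurrent.TimeUnit;

// Heartbeat aggressively until the multi-step registration completes, then
// fall back to the configured cadence.
final class HeartbeatScheduler {
  private static final long STARTUP_INTERVAL_MS = TimeUnit.SECONDS.toMillis(2);

  private final long configuredIntervalMs;
  private volatile boolean registered;

  HeartbeatScheduler(long configuredIntervalMs) {
    this.configuredIntervalMs = configuredIntervalMs;
  }

  void onRegistrationComplete() {
    registered = true; // from here on, use the normal cadence
  }

  long nextHeartbeatDelayMs() {
    return registered ? configuredIntervalMs : STARTUP_INTERVAL_MS;
  }
}
{code}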
[jira] [Created] (HDDS-364) Update open container replica information in SCM during DN register
Ajay Kumar created HDDS-364:
----------------------------

             Summary: Update open container replica information in SCM during DN register
                 Key: HDDS-364
                 URL: https://issues.apache.org/jira/browse/HDDS-364
             Project: Hadoop Distributed Data Store
          Issue Type: New Feature
            Reporter: Ajay Kumar

Update open container replica information in SCM during DN register.
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-364:
----------------------------
    Attachment: HDDS-364.00.patch
[jira] [Assigned] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar reassigned HDDS-364:
-------------------------------
    Assignee: Ajay Kumar
[jira] [Commented] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584296#comment-16584296 ]

Ajay Kumar commented on HDDS-364:
---------------------------------

The unit test for verifying replica information after registration needs further changes in some related classes. It will be added as part of [HDDS-351].
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-364:
----------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584428#comment-16584428 ]

Ajay Kumar commented on HDDS-268:
---------------------------------

Patch v4 with changes in CloseContainerWatcher to handle the PENDING status.
[jira] [Updated] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Kumar updated HDDS-268:
----------------------------
    Attachment: HDDS-268.04.patch
[jira] [Commented] (HDDS-222) Remove hdfs command line from ozone distribution.
[ https://issues.apache.org/jira/browse/HDDS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584446#comment-16584446 ]

Ajay Kumar commented on HDDS-222:
---------------------------------

[~elek] thanks for updating the patch. Tested locally; the build was successful along with the acceptance tests. Alternatively, could we keep the dependencies as provided and export them in the Ozone classpath inside ozone-config.sh?

> Remove hdfs command line from ozone distribution.
> -------------------------------------------------
>
>                 Key: HDDS-222
>                 URL: https://issues.apache.org/jira/browse/HDDS-222
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>              Labels: newbie
>             Fix For: 0.2.1
>
>         Attachments: HDDS-222.001.patch, HDDS-222.002.patch
>
> As the ozone release artifact doesn't contain stable namenode/datanode code, the hdfs command should be removed from the ozone artifact.
> ozone-dist-layout-stitching could also be simplified to copy only the required jar files (we don't need to copy the namenode/datanode server-side jars, just the common artifacts).
[jira] [Commented] (HDDS-297) Add pipeline actions in Ozone
[ https://issues.apache.org/jira/browse/HDDS-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584534#comment-16584534 ]

Ajay Kumar commented on HDDS-297:
---------------------------------

[~msingh] thanks for working on this. Could you please rebase it?

> Add pipeline actions in Ozone
> -----------------------------
>
>                 Key: HDDS-297
>                 URL: https://issues.apache.org/jira/browse/HDDS-297
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>             Fix For: 0.2.1
>
>         Attachments: HDDS-297.001.patch, HDDS-297.002.patch
>
> Pipelines in Ozone are created out of a group of nodes depending upon the replication factor and type. These pipelines provide a transport protocol for data transfer.
> In order to detect any pipeline failure, SCM should receive pipeline reports from datanodes and process them to identify the various Raft rings.
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.00.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: (was: HDDS-351.00.patch) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586146#comment-16586146 ] Ajay Kumar commented on HDDS-351: - Patch needs to be applied on top of [HDDS-350] and [HDDS-364]. > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-268: Attachment: HDDS-268.05.patch > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, > HDDS-268.03.patch, HDDS-268.04.patch, HDDS-268.05.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586346#comment-16586346 ] Ajay Kumar commented on HDDS-268: - Patch v5 to fix checkstyle issues in TestCloseContainerWatcher. The test failure in TestEventWatcher looks unrelated; it passes locally. > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, > HDDS-268.03.patch, HDDS-268.04.patch, HDDS-268.05.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-366) Update functions impacted by SCM chill mode in StorageContainerLocationProtocol
[ https://issues.apache.org/jira/browse/HDDS-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-366: Description: Modify functions impacted by SCM chill mode in StorageContainerLocationProtocol. (was: Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol) > Update functions impacted by SCM chill mode in > StorageContainerLocationProtocol > --- > > Key: HDDS-366 > URL: https://issues.apache.org/jira/browse/HDDS-366 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Priority: Major > > Modify functions impacted by SCM chill mode in > StorageContainerLocationProtocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-366) Update functions impacted by SCM chill mode in StorageContainerLocationProtocol
Ajay Kumar created HDDS-366: --- Summary: Update functions impacted by SCM chill mode in StorageContainerLocationProtocol Key: HDDS-366 URL: https://issues.apache.org/jira/browse/HDDS-366 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Ajay Kumar Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587561#comment-16587561 ] Ajay Kumar edited comment on HDDS-350 at 8/21/18 3:07 PM: -- [~xyao] thanks for looking into this. You are right, currently we just update allocatedBytes. I might be missing something here, but this creates confusion about which source takes precedence over the other (memory or db). flushContainerInfo is called when we close (usually during shutdown), essentially indicating we want to flush all our in-memory state to the db. If memory has the most recent changes then we should flush it. Let me know your thoughts. was (Author: ajayydv): [~xyao] thanks for looking into this. You are right, currently we just update allocatedBytes. I might be missing something here, but this creates confusion about which source takes precedence over the other (memory or db). flushContainerInfo is called when we close (usually during shutdown), essentially indicating we want to flush all our in-memory state to the db. If memory has the most recent changes then we should flush it. > ContainerMapping#flushContainerInfo doesn't set containerId > --- > > Key: HDDS-350 > URL: https://issues.apache.org/jira/browse/HDDS-350 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-350.00.patch > > > ContainerMapping#flushContainerInfo doesn't set containerId which results in > containerId being null in flushed containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587561#comment-16587561 ] Ajay Kumar commented on HDDS-350: - [~xyao] thanks for looking into this. You are right, currently we just update allocatedBytes. I might be missing something here, but this creates confusion about which source takes precedence over the other (memory or db). flushContainerInfo is called when we close (usually during shutdown), essentially indicating we want to flush all our in-memory state to the db. If memory has the most recent changes then we should flush it. > ContainerMapping#flushContainerInfo doesn't set containerId > --- > > Key: HDDS-350 > URL: https://issues.apache.org/jira/browse/HDDS-350 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-350.00.patch > > > ContainerMapping#flushContainerInfo doesn't set containerId which results in > containerId being null in flushed containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-297) Add pipeline actions in Ozone
[ https://issues.apache.org/jira/browse/HDDS-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587908#comment-16587908 ] Ajay Kumar commented on HDDS-297: - [~msingh] thanks for updating the patch. LGTM. A few minor comments and questions: * ContainerMapping#handlePipelineClose: log the pipeline id if it's not found? * ContainerStateMachine L119: rename server to ratisServer. * HDDSConfigKeys: just curious about the 20 limit. For a medium-to-big cluster this may be too small? * RatisHelper L58: is this "_" in peerId generated by the Ratis API? If yes, then maybe we should replace it with an internal config constant (to avoid any potential breakage by ratis in the future). * StateContext L275: shall we add unordered() before limit()? This is what the javadoc says about limit() (see the sketch after this message): {code}Using an unordered stream source (such as {@link #generate(Supplier)}) or removing the ordering constraint with {@link #unordered()} may result in significant speedups of {@code limit()} in parallel pipelines, if the semantics of your situation permit. If consistency with encounter order is required, and you are experiencing poor performance or memory utilization with {@code limit()} in parallel pipelines, switching to sequential execution with {@link #sequential()} may improve performance.{code} * StorageContainerDatanodeProtocol.proto: could you please share why ClosePipelineInfo in PipelineAction is optional? In PipelineEventHandler it seems to be a required field. * PipelineEventHandler: rename the class to PipelineActionEventHandler? * TestNodeFailure#testPipelineFail: shall we assert that the ratisContainer1 pipeline is Open before we shut down the datanode in that pipeline? Also, shall we test failure for the leader and follower separately? > Add pipeline actions in Ozone > - > > Key: HDDS-297 > URL: https://issues.apache.org/jira/browse/HDDS-297 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-297.001.patch, HDDS-297.002.patch, > HDDS-297.003.patch > > > Pipelines in Ozone are created out of a group of nodes depending upon the > replication factor and type. These pipelines provide a transport protocol for > data transfer. > In order to detect any pipeline failure, SCM should receive pipeline > reports from Datanodes and process them to identify the various raft rings. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
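A quick illustration of the unordered()-before-limit() suggestion above. This is a minimal, hypothetical sketch (not code from the patch; the stream source and sizes are made up) showing how dropping the encounter-order constraint lets limit() short-circuit more freely in a parallel pipeline:
{code}
import java.util.stream.IntStream;

public class UnorderedLimitDemo {
  public static void main(String[] args) {
    // Ordered parallel limit(): the pipeline must return the *first* 100
    // elements in encounter order, which constrains parallel execution.
    long ordered = IntStream.range(0, 1_000_000).parallel()
        .limit(100).count();

    // unordered() removes the ordering constraint, so limit() may keep *any*
    // 100 elements; per the javadoc this can significantly speed up limit()
    // in parallel pipelines when order does not matter.
    long unordered = IntStream.range(0, 1_000_000).parallel()
        .unordered().limit(100).count();

    System.out.println(ordered + " " + unordered); // both print 100
  }
}
{code}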
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-364: Attachment: HDDS-364.00.patch > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.00.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-364: Attachment: HDDS-364.01.patch > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-364: Attachment: (was: HDDS-364.00.patch) > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589114#comment-16589114 ] Ajay Kumar commented on HDDS-364: - [~elek] thanks for review. Please see my response inline: {quote}What do you think about Node2ContainerMap? Do we need to update it as well?{quote} I think we can get rid of processReport by sending missing and new containers (deltas) directly from the DN. This will reduce the SCM-DN communication and will make the DN responsible for sending only changed state. But this can be done in a separate jira. {quote}2. As I see all the containers will be persisted twice. (First, they will be imported, after that they will be reconciled.). Don't think it's a big problem. IMHO later we need to cleanup all the processing path anyway. One option may be just saving all the initial data to the state map without processing the reports (checking the required closing state, etc.). The downside here is some action would be delayed until the first real container report.{quote} Not sure which codepath you are referring to here. This patch adds containers reported in the register call to replicaMap, which is in memory. Everything else remains the same. {quote} The import part (in case of isRegisterCall=true) is the first part of processContainerReport method. I think it would be very easy to move to a separated method and call it independently from SCMDatanodeProtocolServer.register method. Could be more simple, and maybe it could be easier to test. Currently (as I understood) there is no specific test to test the isRegisterCall=true path. But this is not a blocking problem. Depends from your consideration{quote} The approach you are suggesting is close to the first attached patch. Had a discussion regarding this with [~xyao]. Moving it to processContainerReport is a small optimization to not iterate through the whole list. (For a DN with 24 disks of 12 TB each we can have roughly 57600 containers of 5Gb.) Iterating through it and adding it to replicaMap should be quick, but a large cluster with a large number of DNs may overwhelm the SCM during the initial registration process. Moving this inside processContainerReport results in only one iteration of that list. At some point we can refactor this along with the logic in ContainerCommandHandler and Node2ContainerMap#processReport. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
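For reference, the container estimate quoted above works out as follows; this is just back-of-the-envelope arithmetic from the figures in the comment (24 disks of 12 TB per DN, 5 GB containers, decimal TB assumed), not code from the patch:
{code}
public class ContainerEstimate {
  public static void main(String[] args) {
    long capacityGb = 24L * 12 * 1000; // 24 disks x 12 TB = 288,000 GB per datanode
    long containers = capacityGb / 5;  // 5 GB per container
    System.out.println(containers);    // 57600, matching the rough estimate
  }
}
{code}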
[jira] [Comment Edited] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589114#comment-16589114 ] Ajay Kumar edited comment on HDDS-364 at 8/22/18 4:56 PM: -- [~elek] thanks for review. Please see my response inline: {quote}What do you think about Node2ContainerMap? Do we need to update it as well?{quote} I think we can get rid of processReport by sending missing and new containers (deltas) directly from the DN. This will reduce the SCM-DN communication and will make the DN responsible for sending only changed state. But this can be done in a separate jira. {quote}2. As I see all the containers will be persisted twice. (First, they will be imported, after that they will be reconciled.). Don't think it's a big problem. IMHO later we need to cleanup all the processing path anyway. One option may be just saving all the initial data to the state map without processing the reports (checking the required closing state, etc.). The downside here is some action would be delayed until the first real container report.{quote} Not sure which codepath you are referring to here. This patch adds containers reported in the register call to replicaMap, which is in memory. Everything else remains the same. {quote} The import part (in case of isRegisterCall=true) is the first part of processContainerReport method. I think it would be very easy to move to a separated method and call it independently from SCMDatanodeProtocolServer.register method. Could be more simple, and maybe it could be easier to test. Currently (as I understood) there is no specific test to test the isRegisterCall=true path. But this is not a blocking problem. Depends from your consideration{quote} The approach you are suggesting is close to the first attached patch. Had a discussion regarding this with [~xyao]. Moving it to processContainerReport is a small optimization to not iterate through the whole list. (For a DN with 24 disks of 12 TB each we can have roughly 57600 containers of 5Gb.) Iterating through it and adding it to replicaMap should be quick, but a large cluster with a large number of DNs may overwhelm the SCM during the initial registration process. Moving this inside processContainerReport results in only one iteration of that list. At some point we can refactor this along with the logic in ContainerCommandHandler and Node2ContainerMap#processReport. was (Author: ajayydv): [~elek] thanks for review. Please see my response inline: {quote}What do you think about Node2ContainerMap? Do we need to update it as well?{quote} I think we can get rid of processReport by sending missing and new containers (deltas) directly from the DN. This will reduce the SCM-DN communication and will make the DN responsible for sending only changed state. But this can be done in a separate jira. {quote}2. As I see all the containers will be persisted twice. (First, they will be imported, after that they will be reconciled.). Don't think it's a big problem. IMHO later we need to cleanup all the processing path anyway. One option may be just saving all the initial data to the state map without processing the reports (checking the required closing state, etc.). The downside here is some action would be delayed until the first real container report.{quote} Not sure which codepath you are referring to here. This patch adds containers reported in the register call to replicaMap, which is in memory. Everything else remains the same. {quote} The import part (in case of isRegisterCall=true) is the first part of processContainerReport method. I think it would be very easy to move to a separated method and call it independently from SCMDatanodeProtocolServer.register method. Could be more simple, and maybe it could be easier to test. Currently (as I understood) there is no specific test to test the isRegisterCall=true path. But this is not a blocking problem. Depends from your consideration{quote} The approach you are suggesting is close to the first attached patch. Had a discussion regarding this with [~xyao]. Moving it to processContainerReport is a small optimization to not iterate through the whole list. (For a DN with 24 disks of 12 TB each we can have roughly 57600 containers of 5Gb.) Iterating through it and adding it to replicaMap should be quick, but a large cluster with a large number of DNs may overwhelm the SCM during the initial registration process. Moving this inside processContainerReport results in only one iteration of that list. At some point we can refactor this along with the logic in ContainerCommandHandler and Node2ContainerMap#processReport. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Featur
[jira] [Comment Edited] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589114#comment-16589114 ] Ajay Kumar edited comment on HDDS-364 at 8/22/18 5:03 PM: -- [~elek] thanks for review. Please see my response inline: {quote}What do you think about Node2ContainerMap? Do we need to update it as well?{quote} I think we can get rid of processReport by sending missing and new containers (deltas) directly from the DN. This will reduce the SCM-DN communication and will make the DN responsible for sending only changed state. But this can be done in a separate jira. {quote}2. As I see all the containers will be persisted twice. (First, they will be imported, after that they will be reconciled.). Don't think it's a big problem. IMHO later we need to cleanup all the processing path anyway. One option may be just saving all the initial data to the state map without processing the reports (checking the required closing state, etc.). The downside here is some action would be delayed until the first real container report.{quote} Not sure which codepath you are referring to here. This patch adds containers reported in the register call to replicaMap, which is in memory. Everything else remains the same. {quote} The import part (in case of isRegisterCall=true) is the first part of processContainerReport method. I think it would be very easy to move to a separated method and call it independently from SCMDatanodeProtocolServer.register method. Could be more simple, and maybe it could be easier to test. Currently (as I understood) there is no specific test to test the isRegisterCall=true path. But this is not a blocking problem. Depends from your consideration{quote} The approach you are suggesting is close to the first attached patch. Had a discussion regarding this with [~xyao]. Moving it to processContainerReport is a small optimization to not iterate through the whole list. (For a DN with 24 disks of 12 TB each we can have roughly 57600 containers of 5Gb.) Iterating through it and adding it to replicaMap should be quick, but a large cluster with a large number of DNs may overwhelm the SCM during the initial registration process. Moving this inside processContainerReport results in only one iteration of that list. At some point we can refactor this along with the logic in ContainerCommandHandler and Node2ContainerMap#processReport. The updated test in TestContainerMapping checks the call to processContainerReport with isRegisterCall=true. was (Author: ajayydv): [~elek] thanks for review. Please see my response inline: {quote}What do you think about Node2ContainerMap? Do we need to update it as well?{quote} I think we can get rid of processReport by sending missing and new containers (deltas) directly from the DN. This will reduce the SCM-DN communication and will make the DN responsible for sending only changed state. But this can be done in a separate jira. {quote}2. As I see all the containers will be persisted twice. (First, they will be imported, after that they will be reconciled.). Don't think it's a big problem. IMHO later we need to cleanup all the processing path anyway. One option may be just saving all the initial data to the state map without processing the reports (checking the required closing state, etc.). The downside here is some action would be delayed until the first real container report.{quote} Not sure which codepath you are referring to here. This patch adds containers reported in the register call to replicaMap, which is in memory. Everything else remains the same. {quote} The import part (in case of isRegisterCall=true) is the first part of processContainerReport method. I think it would be very easy to move to a separated method and call it independently from SCMDatanodeProtocolServer.register method. Could be more simple, and maybe it could be easier to test. Currently (as I understood) there is no specific test to test the isRegisterCall=true path. But this is not a blocking problem. Depends from your consideration{quote} The approach you are suggesting is close to the first attached patch. Had a discussion regarding this with [~xyao]. Moving it to processContainerReport is a small optimization to not iterate through the whole list. (For a DN with 24 disks of 12 TB each we can have roughly 57600 containers of 5Gb.) Iterating through it and adding it to replicaMap should be quick, but a large cluster with a large number of DNs may overwhelm the SCM during the initial registration process. Moving this inside processContainerReport results in only one iteration of that list. At some point we can refactor this along with the logic in ContainerCommandHandler and Node2ContainerMap#processReport. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apach
[jira] [Created] (HDDS-370) Add and implement following functions in SCMClientProtocolServer
Ajay Kumar created HDDS-370: --- Summary: Add and implement following functions in SCMClientProtocolServer Key: HDDS-370 URL: https://issues.apache.org/jira/browse/HDDS-370 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Ajay Kumar Modify functions impacted by SCM chill mode in StorageContainerLocationProtocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-370) Add and implement following functions in SCMClientProtocolServer
[ https://issues.apache.org/jira/browse/HDDS-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-370: Description: Add and implement following functions in SCMClientProtocolServer # isScmInChillMode # forceScmEnterChillMode # forceScmExitChillMode was:Modify functions impacted by SCM chill mode in StorageContainerLocationProtocol. > Add and implement following functions in SCMClientProtocolServer > > > Key: HDDS-370 > URL: https://issues.apache.org/jira/browse/HDDS-370 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Priority: Major > > Add and implement following functions in SCMClientProtocolServer > # isScmInChillMode > # forceScmEnterChillMode > # forceScmExitChillMode -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-370) Add and implement following functions in SCMClientProtocolServer
[ https://issues.apache.org/jira/browse/HDDS-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar reassigned HDDS-370: --- Assignee: Ajay Kumar > Add and implement following functions in SCMClientProtocolServer > > > Key: HDDS-370 > URL: https://issues.apache.org/jira/browse/HDDS-370 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > > Add and implement following functions in SCMClientProtocolServer > # isScmInChillMode > # forceScmEnterChillMode > # forceScmExitChillMode -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-350) ContainerMapping#flushContainerInfo doesn't set containerId
[ https://issues.apache.org/jira/browse/HDDS-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590424#comment-16590424 ] Ajay Kumar commented on HDDS-350: - [~xyao] thanks for review and commit. > ContainerMapping#flushContainerInfo doesn't set containerId > --- > > Key: HDDS-350 > URL: https://issues.apache.org/jira/browse/HDDS-350 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-350.00.patch > > > ContainerMapping#flushContainerInfo doesn't set containerId which results in > containerId being null in flushed containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-364: Attachment: HDDS-364.02.patch > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch, HDDS-364.02.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590755#comment-16590755 ] Ajay Kumar commented on HDDS-364: - Patch v2 to fix a test failure and a findbugs warning. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch, HDDS-364.02.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-375) Update ContainerReportHandler to not send events for open containers
Ajay Kumar created HDDS-375: --- Summary: Update ContainerReportHandler to not send events for open containers Key: HDDS-375 URL: https://issues.apache.org/jira/browse/HDDS-375 Project: Hadoop Distributed Data Store Issue Type: New Feature Reporter: Ajay Kumar -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-364: Attachment: HDDS-364.03.patch > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch, HDDS-364.02.patch, > HDDS-364.03.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Description: Update ContainerReportHandler to not send events for open containers. > Update ContainerReportHandler to not send events for open containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Priority: Major > > Update ContainerReportHandler to not send events for open containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-375) Update ContainerReportHandler to not send events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar reassigned HDDS-375: --- Assignee: Ajay Kumar > Update ContainerReportHandler to not send events for open containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > > Update ContainerReportHandler to not send events for open containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591921#comment-16591921 ] Ajay Kumar commented on HDDS-364: - Patch v3 to fix yetus issues. Created [HDDS-375] to avoid sending open container replication events from ContainerReportHandler. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch, HDDS-364.02.patch, > HDDS-364.03.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Attachment: HDDS-375.00.patch > Update ContainerReportHandler to not send events for open containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-375.00.patch > > > Update ContainerReportHandler to not send events for open containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Status: Patch Available (was: Open) > Update ContainerReportHandler to not send events for open containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-375.00.patch > > > Update ContainerReportHandler to not send events for open containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: (was: HDDS-351.00.patch) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.00.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Status: Patch Available (was: Open) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-351.00.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send replication events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Summary: Update ContainerReportHandler to not send replication events for open containers (was: Update ContainerReportHandler to not send events for open containers) > Update ContainerReportHandler to not send replication events for open > containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-375.00.patch > > > Update ContainerReportHandler to skip sending replication events for open > containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Description: Update ContainerReportHandler to skip sending replication events for open containers. (was: Update ContainerReportHandler to not send events for open containers.) > Update ContainerReportHandler to not send events for open containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-375.00.patch > > > Update ContainerReportHandler to skip sending replication events for open > containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-375) Update ContainerReportHandler to not send replication events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-375: Attachment: HDDS-375.01.patch > Update ContainerReportHandler to not send replication events for open > containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-375.00.patch, HDDS-375.01.patch > > > Update ContainerReportHandler to skip sending replication events for open > containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-364) Update open container replica information in SCM during DN register
[ https://issues.apache.org/jira/browse/HDDS-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593860#comment-16593860 ] Ajay Kumar commented on HDDS-364: - [~elek] thanks for review and commit. > Update open container replica information in SCM during DN register > --- > > Key: HDDS-364 > URL: https://issues.apache.org/jira/browse/HDDS-364 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-364.00.patch, HDDS-364.01.patch, HDDS-364.02.patch, > HDDS-364.03.patch > > > Update open container replica information in SCM during DN register. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-375) Update ContainerReportHandler to not send replication events for open containers
[ https://issues.apache.org/jira/browse/HDDS-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593859#comment-16593859 ] Ajay Kumar commented on HDDS-375: - [~xyao] thanks for reviewing this. Updated the summary section. Patch v1 to address checkstyle issues. > Update ContainerReportHandler to not send replication events for open > containers > > > Key: HDDS-375 > URL: https://issues.apache.org/jira/browse/HDDS-375 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-375.00.patch, HDDS-375.01.patch > > > Update ContainerReportHandler to skip sending replication events for open > containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-369) Remove the containers of a dead node from the container state map
[ https://issues.apache.org/jira/browse/HDDS-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594120#comment-16594120 ] Ajay Kumar commented on HDDS-369: - [~elek] Patch LGTM. Minor nit: change the log statements to parametrized logging, which avoids building the message string when the log level is disabled, i.e. {code}LOG.info("Datanode " + datanodeDetails.getUuid() + " is dead. Removing replications from the in-memory state.");{code} to {code}LOG.info("Datanode {} is dead. Removing replications from the in-memory state.", datanodeDetails.getUuid());{code} and {code}LOG.error("Can't remove container from containerStateMap " + container.getId(), e);{code} to {code}LOG.error("Can't remove container from containerStateMap {}", container.getId(), e);{code} Also, apart from updating replica info, shouldn't this handler also update information related to storage (space left/used) for the cluster? > Remove the containers of a dead node from the container state map > - > > Key: HDDS-369 > URL: https://issues.apache.org/jira/browse/HDDS-369 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-369.001.patch, HDDS-369.002.patch > > > In case of a node is dead we need to update the container replicas > information of the containerStateMap for all the containers from that > specific node. > With removing the replica information we can detect the under replicated > state and start the replication. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
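As a side note on the nit above, a minimal SLF4J-style sketch of why the parametrized form is preferred (illustrative only; the class name and message are hypothetical, not from the patch):
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingStyleDemo {
  private static final Logger LOG =
      LoggerFactory.getLogger(LoggingStyleDemo.class);

  public static void main(String[] args) {
    String uuid = "dn-1234";
    // Concatenation: the full message string is built even if INFO is disabled.
    LOG.info("Datanode " + uuid + " is dead.");
    // Parametrized: formatting is deferred until the level check passes.
    LOG.info("Datanode {} is dead.", uuid);
  }
}
{code}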
[jira] [Comment Edited] (HDDS-369) Remove the containers of a dead node from the container state map
[ https://issues.apache.org/jira/browse/HDDS-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594120#comment-16594120 ] Ajay Kumar edited comment on HDDS-369 at 8/27/18 7:13 PM: -- [~elek] Patch LGTM. Minor nit: change the log statements to parametrized logging, which avoids building the message string when the log level is disabled, i.e. {code}LOG.info("Datanode " + datanodeDetails.getUuid() + " is dead. Removing replications from the in-memory state.");{code} to {code}LOG.info("Datanode {} is dead. Removing replications from the in-memory state.", datanodeDetails.getUuid());{code} and {code}LOG.error("Can't remove container from containerStateMap " + container.getId(), e);{code} to {code}LOG.error("Can't remove container from containerStateMap {}", container.getId(), e);{code} Also, apart from updating replica info, shouldn't this handler also update information related to storage (space left/used) for the cluster? was (Author: ajayydv): [~elek] Patch LGTM. Minor nit: change the log statements to parametrized logging, i.e. {code}LOG.info("Datanode " + datanodeDetails.getUuid() + " is dead. Removing replications from the in-memory state.");{code} to {code}LOG.info("Datanode {} is dead. Removing replications from the in-memory state.", datanodeDetails.getUuid());{code} and {code}LOG.error("Can't remove container from containerStateMap " + container.getId(), e);{code} to {code}LOG.error("Can't remove container from containerStateMap {}", container.getId(), e);{code} Also, apart from updating replica info, shouldn't this handler also update information related to storage (space left/used) for the cluster? > Remove the containers of a dead node from the container state map > - > > Key: HDDS-369 > URL: https://issues.apache.org/jira/browse/HDDS-369 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-369.001.patch, HDDS-369.002.patch > > > In case of a node is dead we need to update the container replicas > information of the containerStateMap for all the containers from that > specific node. > With removing the replica information we can detect the under replicated > state and start the replication. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-376) Create custom message structure for use in AuditLogging
[ https://issues.apache.org/jira/browse/HDDS-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594170#comment-16594170 ] Ajay Kumar commented on HDDS-376: - [~dineshchitlangia] thanks for working on this. Patch v4 LGTM. A few questions and suggestions: * AuditMessage#getParameters: could this return the parameter values of the params map? * Why not wrap Level, AuditEventStatus and Throwable within AuditMessage? This will make the audit message self-sufficient for logging. With this we can provide a single API for users to log all their messages. > Create custom message structure for use in AuditLogging > --- > > Key: HDDS-376 > URL: https://issues.apache.org/jira/browse/HDDS-376 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Major > Labels: audit, logging > Fix For: 0.2.1 > > Attachments: HDDS-376.001.patch, HDDS-376.002.patch, > HDDS-376.003.patch, HDDS-376.004.patch > > > In HDDS-198 we introduced a framework for AuditLogging in Ozone. > We had used StructuredDataMessage for formatting the messages to be logged. > > Based on discussion with [~jnp] and [~anu], this Jira proposes to create a > custom message structure to generate audit messages in the following format: > user=xxx ip=xxx op=_ \{key=val, key1=val1..} ret=XX -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
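To make the target format concrete, here is a purely illustrative sketch of rendering the audit line described in the issue (the class and method names are hypothetical; this is not the AuditMessage API under review):
{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class AuditLineDemo {
  // Hypothetical helper: formats an entry as "user=... ip=... op=... {k=v, ...} ret=...".
  static String format(String user, String ip, String op,
      Map<String, String> params, String ret) {
    String kvs = params.entrySet().stream()
        .map(e -> e.getKey() + "=" + e.getValue())
        .collect(Collectors.joining(", ", "{", "}"));
    return String.format("user=%s ip=%s op=%s %s ret=%s", user, ip, op, kvs, ret);
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("volume", "vol1");
    params.put("bucket", "bucket1");
    // Prints: user=scott ip=10.0.0.1 op=CREATE_BUCKET {volume=vol1, bucket=bucket1} ret=SUCCESS
    System.out.println(format("scott", "10.0.0.1", "CREATE_BUCKET", params, "SUCCESS"));
  }
}
{code}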
[jira] [Commented] (HDDS-222) Remove hdfs command line from ozone distribution.
[ https://issues.apache.org/jira/browse/HDDS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594285#comment-16594285 ] Ajay Kumar commented on HDDS-222: - Since ozone can sit in parallel with hdfs, I was thinking of avoiding the hdfs/hadoop jars inside ozone (easier to maintain, patching etc.). With both scopes we end up with hadoop jars in the ozone target dir, so I guess compile scope is ok. > Remove hdfs command line from ozone distribution. > - > > Key: HDDS-222 > URL: https://issues.apache.org/jira/browse/HDDS-222 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: newbie > Fix For: 0.2.1 > > Attachments: HDDS-222.001.patch, HDDS-222.002.patch > > > As the ozone release artifact doesn't contain stable namenode/datanode code, > the hdfs command should be removed from the ozone artifact. > ozone-dist-layout-stitching could also be simplified to copy only the > required jar files (we don't need to copy the namenode/datanode server-side > jars, just the common artifacts). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595436#comment-16595436 ] Ajay Kumar commented on HDDS-351: - [~anu] thanks for the detailed review. Please see my response inline. {quote}EventQueue.java - I am afraid adding removeHandler would cause race conditions to the code since the Data structures do not seem to be protected. {quote} Changed the corresponding data structure to cover this. {quote}nit: MiniOzoneClusterImpl.java - "Stopping down" ==> stopping {quote} Done. {quote}SCMChillModeManager.java: Line 72#run() - Why do we have a run function in this code path? That thread starts up and immediately dies. I don't understand why we need to run this in the context of a thread. From a quick reading, it looks like we could have done this in a constructor. {quote} The basic idea was to handle rules which don't listen for specific events but just monitor state. Removed for the time being. {quote}SCMChillModeManager.java: Line 75 - This implies that getScmContainerManager is finished initializing. If that is the contract that we want to support, we must make it explicit. We need to create a flag in ScmContainerManager that indicates that is done with init. We need to assert that info here and take a dependency on that; otherwise we can avoid depending on ContainerManager. {quote} This explicit contract is no better than checking for the instance being not null, as for all practical purposes we can't check that flag without the instance being initialized. If the instance is not null then it implicitly means the container data structure is already initialized. Added a not-null check in the new patch. {quote}exitRules[0] - This seems like a poor coding practice. If we want to add rules that need to evaluate, I suggest that we create a list, and invoke the add function. The problem is not in this line. The problem is in line 118, where if someone changes the Array index the code will break. That kind of dependency on an index of an array and function pointer is very brittle. Already commented this seems like a bad pattern of code. exitRules[0].process(nodeRegistrationContainerReport); {quote} Initially it was a list, but I changed it to an array as the number of rules is static. Yes, it looks a little brittle, but the issue remains the same in a list as well since you have to access members by index. I agree that from a readability perspective this can be improved, so I changed it to a map. {quote}private double maxContainer; why is this double? we are assigning an "int" to this later. {quote} To avoid casting during cutoff calculations. {quote}Line 82: validateChillModeExitRules – if we have rule framework, then hard coding if (maxContainer == 0) exitChillMode seems a violation of the framework that we are building up. {quote} Good find; this should move inside the inner class. {quote}maxContainer – why is this variable being accessed from an inner class? Why are we not passing this as a value into the inner class? Why bind to the outer class? {quote} Moved to the inner class. {quote}((ContainerChillModeRule) exitRules[0]).clear(); – why? {quote} Renamed it to cleanup and moved it to the interface to allow optional cleanup of resources on chill mode exit. {quote}The model of exit criteria seems little off in my mind. I think we should have an object that takes the current state of the cluster and the expected state of the cluster – Nodes, Closed Containers, Open Containers, and Pipelines. The expected and current state will allow these *ExitRule classes to implement code that decides if we should exit from chill mode. {quote} Individual inner rule classes will maintain their own state; we should just query them on whether they have met the exit criteria or not. That's what the validate function does. {quote}SCMDatanodeProtocolServer.java: Shouldn't the {{eventPublisher.fireEvent}} be inside the if? {quote} Agreed, moved it inside. {quote}What is the difference between NODE_REGISTRATION_CONT_REPORT and just CONTAINER_REPORT? {quote} In short, NODE_REGISTRATION_CONT_REPORT is generated during node registration while CONTAINER_REPORT is emitted continuously on heartbeats. Since we are interested in the initial container reports, we get them from register instead of heartbeat. This also avoids any special handling of redundant containerReports in case we choose to use CONTAINER_REPORT. {quote}StorageContainerManager.java – why remove the START_REPLICATION event? {quote} This is now transmitted on chill mode exit. > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.pa
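As a purely illustrative sketch of the map-based rule registry mentioned in the response above (all names are hypothetical; this is not the SCMChillModeManager code from the patch), rules can maintain their own state and be validated uniformly:
{code}
import java.util.HashMap;
import java.util.Map;

public class ChillModeRulesDemo {
  // Hypothetical rule contract: each rule tracks its own state and reports
  // whether its exit criterion has been met; cleanup is optional.
  interface ExitRule {
    boolean validate();
    default void cleanup() { }
  }

  public static void main(String[] args) {
    Map<String, ExitRule> exitRules = new HashMap<>();
    // Example rule with an assumed 99% container-report threshold.
    exitRules.put("ContainerChillModeRule",
        () -> reportedContainers() >= 0.99 * expectedContainers());

    // Exit chill mode only once every registered rule is satisfied,
    // then let each rule clean up its resources.
    if (exitRules.values().stream().allMatch(ExitRule::validate)) {
      exitRules.values().forEach(ExitRule::cleanup);
      System.out.println("exiting chill mode");
    }
  }

  static int reportedContainers() { return 100; } // stand-in for cluster state
  static int expectedContainers() { return 100; }
}
{code}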
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.01.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595436#comment-16595436 ] Ajay Kumar edited comment on HDDS-351 at 8/28/18 7:05 PM: -- [~anu] thanks for the detailed review. Please see my responses inline. {quote}EventQueue.java - I am afraid adding removeHandler would cause race conditions to the code since the Data structures do not seem to be protected. {quote} Changed the corresponding data structure to cover this. {quote}nit: MiniOzoneClusterImpl.java - "Stopping down" ==> stopping {quote} Done. {quote}SCMChillModeManager.java: Line 72#run() - Why do we have a run function in this code path? That thread starts up and immediately dies. I don't understand why we need to run this in the context of a thread. From a quick reading, it looks like we could have done this in a constructor. {quote} The basic idea was to handle rules that don't listen for specific events but just monitor state. Removed for the time being. {quote}SCMChillModeManager.java: Line 75 - This implies that getScmContainerManager is finished initializing. If that is the contract that we want to support, we must make it explicit. We need to create a flag in ScmContainerManager that indicates that is done with init. We need to assert that info here and take a dependency on that; otherwise we can avoid depending on ContainerManager. {quote} This explicit contract is no better than checking for the instance being not null, as for all practical purposes we can't check that flag without the instance being initialized. If the instance is not null then it implicitly means the container data structure is already initialized. Added a not-null check in the new patch. {quote}exitRules[0] - This seems like a poor coding practice. If we want to add rules that need to evaluate, I suggest that we create a list, and invoke the add function. The problem is not in this line. The problem is in line 118, where if someone changes the Array index the code will break. That kind of dependency on an index of an array and function pointer is very brittle. Already commented this seems like a bad pattern of code. exitRules[0].process(nodeRegistrationContainerReport); {quote} Initially it was a list, but I changed it to an array as the set of rules is fixed/static. Yes, it looks a little brittle, but the issue remains the same with a list, since you still have to access members by index. I agree that from a readability perspective this can be improved, so I changed it to a map. {quote}private double maxContainer; why is this double? we are assigning an "int" to this later. {quote} To avoid casting during cutoff calculations. {quote}Line 82: validateChillModeExitRules – if we have rule framework, then hard coding if (maxContainer == 0) exitChillMode seems a violation of the framework that we are building up. {quote} Good find; this should move inside the inner class. {quote}maxContainer – why is this variable being accessed from an inner class? Why are we not passing this as a value into the inner class? Why bind to the outer class? {quote} Moved to the inner class. {quote}((ContainerChillModeRule) exitRules[0]).clear(); – why? {quote} Renamed it to cleanup and moved it to the interface to allow optional cleanup of resources on chill mode exit. {quote}The model of exit criteria seems little off in my mind. I think we should have an object that takes the current state of the cluster and the expected state of the cluster – Nodes, Closed Containers, Open Containers, and Pipelines. 
The expected and current state will allow these *ExitRule classes to implement code that decides if we should exit from chill mode. {quote} Individual inner rule classes will maintain their own state; we should just query them on whether they have met the exit criteria or not. That's what the validate function does. {quote}SCMDatanodeProtocolServer.java: Shouldn't the {{eventPublisher.fireEvent}} be inside the if? {quote} Agreed, moved it inside. {quote}What is the difference between NODE_REGISTRATION_CONT_REPORT and just CONTAINER_REPORT? {quote} In short, NODE_REGISTRATION_CONT_REPORT is generated during node registration, while CONTAINER_REPORT is emitted continuously on heartbeats. Since we are interested in the initial container reports, we get them from register instead of from heartbeats. This also avoids any special handling of redundant container reports that would be needed if we chose to use CONTAINER_REPORT. {quote}StorageContainerManager.java – why remove the START_REPLICATION event? {quote} This is now transmitted on chill mode exit.
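To make the NODE_REGISTRATION_CONT_REPORT versus CONTAINER_REPORT distinction concrete, here is a sketch of how the two event types could be declared with the TypedEvent helper from org.apache.hadoop.hdds.server.events. The payload classes are illustrative stubs; the actual patch uses the protobuf container-report message and a registration-report wrapper, so its declarations may differ.
{code:java}
import org.apache.hadoop.hdds.server.events.TypedEvent;

// Illustrative payload stubs; the patch itself uses the protobuf
// container-report message and a wrapper class around it.
class ContainerReportPayload { }
class NodeRegistrationContainerReport extends ContainerReportPayload { }

final class ChillModeEventsSketch {
  // Recurring: fired for every heartbeat-driven container report.
  static final TypedEvent<ContainerReportPayload> CONTAINER_REPORT =
      new TypedEvent<>(ContainerReportPayload.class, "Container_Report");

  // One-shot per datanode: fired from register(), so chill mode sees each
  // node's initial report exactly once and needs no deduplication logic.
  static final TypedEvent<NodeRegistrationContainerReport>
      NODE_REGISTRATION_CONT_REPORT = new TypedEvent<>(
          NodeRegistrationContainerReport.class,
          "Node_Registration_Container_Report");

  private ChillModeEventsSketch() { }
}
{code}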
[jira] [Comment Edited] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595436#comment-16595436 ] Ajay Kumar edited comment on HDDS-351 at 8/28/18 7:18 PM: -- [~anu] thanks for the detailed review. Please see my responses inline. {quote}EventQueue.java - I am afraid adding removeHandler would cause race conditions to the code since the Data structures do not seem to be protected. {quote} Changed the corresponding data structure to cover this. The nested list and map are still unsynchronized; since removeHandler is limited to chill mode, I think this should be OK. Alternatively, we can get rid of these new APIs and let SCMChillModeManager handle register events (this will need a minor change in SCMChillModeManager). {quote}nit: MiniOzoneClusterImpl.java - "Stopping down" ==> stopping {quote} Done. {quote}SCMChillModeManager.java: Line 72#run() - Why do we have a run function in this code path? That thread starts up and immediately dies. I don't understand why we need to run this in the context of a thread. From a quick reading, it looks like we could have done this in a constructor. {quote} The basic idea was to handle rules that don't listen for specific events but just monitor state. Removed for the time being. {quote}SCMChillModeManager.java: Line 75 - This implies that getScmContainerManager is finished initializing. If that is the contract that we want to support, we must make it explicit. We need to create a flag in ScmContainerManager that indicates that is done with init. We need to assert that info here and take a dependency on that; otherwise we can avoid depending on ContainerManager. {quote} This explicit contract is no better than checking for the instance being not null, as for all practical purposes we can't check that flag without the instance being initialized. If the instance is not null then it implicitly means the container data structure is already initialized. Added a not-null check in the new patch. {quote}exitRules[0] - This seems like a poor coding practice. If we want to add rules that need to evaluate, I suggest that we create a list, and invoke the add function. The problem is not in this line. The problem is in line 118, where if someone changes the Array index the code will break. That kind of dependency on an index of an array and function pointer is very brittle. Already commented this seems like a bad pattern of code. exitRules[0].process(nodeRegistrationContainerReport); {quote} Initially it was a list, but I changed it to an array as the set of rules is fixed/static. Yes, it looks a little brittle, but the issue remains the same with a list, since you still have to access members by index. I agree that from a readability perspective this can be improved, so I changed it to a map. {quote}private double maxContainer; why is this double? we are assigning an "int" to this later. {quote} To avoid casting during cutoff calculations. {quote}Line 82: validateChillModeExitRules – if we have rule framework, then hard coding if (maxContainer == 0) exitChillMode seems a violation of the framework that we are building up. {quote} Good find; this should move inside the inner class. {quote}maxContainer – why is this variable being accessed from an inner class? Why are we not passing this as a value into the inner class? Why bind to the outer class? {quote} Moved to the inner class. {quote}((ContainerChillModeRule) exitRules[0]).clear(); – why? {quote} Renamed it to cleanup and moved it to the interface to allow optional cleanup of resources on chill mode exit. 
{quote}The model of exit criteria seems little off in my mind. I think we should have an object that takes the current state of the cluster and the expected state of the cluster – Nodes, Closed Containers, Open Containers, and Pipelines. The expected and current state will allow these *ExitRule classes to implement code that decides if we should exit from chill mode. {quote} Individual inner rule classes will maintain their own state; we should just query them on whether they have met the exit criteria or not. That's what the validate function does. {quote}SCMDatanodeProtocolServer.java: Shouldn't the {{eventPublisher.fireEvent}} be inside the if? {quote} Agreed, moved it inside. {quote}What is the difference between NODE_REGISTRATION_CONT_REPORT and just CONTAINER_REPORT? {quote} In short, NODE_REGISTRATION_CONT_REPORT is generated during node registration, while CONTAINER_REPORT is emitted continuously on heartbeats. Since we are interested in the initial container reports, we get them from register instead of from heartbeats. This also avoids any special handling of redundant container reports that would be needed if we chose to use CONTAINER_REPORT. {quote}StorageContainerManager.java – why remove the START_REPLICATION event? {quote} This is now transmitted on chill mode exit.
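The synchronization concern around removeHandler can be illustrated with a small self-contained registry. This is not EventQueue's actual implementation; it is a sketch of the concurrent-collection pattern under discussion, with hypothetical addHandler/removeHandler/fireEvent signatures.
{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Not EventQueue's real internals; a sketch of the concurrency pattern. */
class HandlerRegistry<EVENT> {
  // ConcurrentHashMap plus CopyOnWriteArrayList make add, remove and fire
  // safe without an explicit lock; firing iterates over a snapshot.
  private final Map<EVENT, List<Consumer<Object>>> handlers =
      new ConcurrentHashMap<>();

  void addHandler(EVENT event, Consumer<Object> handler) {
    handlers.computeIfAbsent(event, e -> new CopyOnWriteArrayList<>())
        .add(handler);
  }

  // The API under discussion: lets a one-shot consumer such as
  // SCMChillModeManager deregister itself once chill mode is exited.
  boolean removeHandler(EVENT event, Consumer<Object> handler) {
    List<Consumer<Object>> list = handlers.get(event);
    return list != null && list.remove(handler);
  }

  void fireEvent(EVENT event, Object payload) {
    handlers.getOrDefault(event, Collections.emptyList())
        .forEach(handler -> handler.accept(payload));
  }
}
{code}
CopyOnWriteArrayList trades write cost for lock-free iteration, which suits this case: handlers are added and removed rarely (at registration and at chill mode exit) while events fire often.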
[jira] [Created] (HDDS-384) Add api to remove handler in EventQueue
Ajay Kumar created HDDS-384: --- Summary: Add api to remove handler in EventQueue Key: HDDS-384 URL: https://issues.apache.org/jira/browse/HDDS-384 Project: Hadoop Distributed Data Store Issue Type: New Feature Reporter: Ajay Kumar Assignee: Ajay Kumar Add api to remove handler in EventQueue -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-384) Add api to remove event handler in EventQueue
[ https://issues.apache.org/jira/browse/HDDS-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-384: Summary: Add api to remove event handler in EventQueue (was: Add api to remove handler in EventQueue) > Add api to remove event handler in EventQueue > - > > Key: HDDS-384 > URL: https://issues.apache.org/jira/browse/HDDS-384 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > > Add api to remove handler in EventQueue -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-384) Add api to remove event handler in EventQueue
[ https://issues.apache.org/jira/browse/HDDS-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-384: Description: Add api to remove event handler in EventQueue. (was: Add api to remove handler in EventQueue) > Add api to remove event handler in EventQueue > - > > Key: HDDS-384 > URL: https://issues.apache.org/jira/browse/HDDS-384 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > > Add api to remove event handler in EventQueue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.02.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-268: Description: Add an event watcher for CLOSE_CONTAINER_STATUS events. > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch, HDDS-268.01.patch, HDDS-268.02.patch, > HDDS-268.03.patch, HDDS-268.04.patch, HDDS-268.05.patch > > > Add an event watcher for CLOSE_CONTAINER_STATUS events. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
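A rough sketch of what such a watcher has to track follows; all names and structure here are illustrative, since the real implementation builds on the generic event-watcher support in hadoop-hdds.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only; the real watcher builds on hadoop-hdds event support. */
class CloseContainerWatcherSketch {
  // containerId -> time the close command was fired
  private final Map<Long, Long> pendingCloses = new ConcurrentHashMap<>();

  /** A close was requested for this container; start watching it. */
  void onCloseContainerSent(long containerId) {
    pendingCloses.put(containerId, System.currentTimeMillis());
  }

  /** A CLOSE_CONTAINER_STATUS report completed the close. */
  void onCloseContainerStatus(long containerId) {
    pendingCloses.remove(containerId);
  }

  /** Containers whose close has been pending longer than the timeout. */
  List<Long> timedOut(long timeoutMillis) {
    long now = System.currentTimeMillis();
    List<Long> late = new ArrayList<>();
    pendingCloses.forEach((id, sentAt) -> {
      if (now - sentAt > timeoutMillis) {
        late.add(id);  // candidates for re-firing the close command
      }
    });
    return late;
  }
}
{code}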
[jira] [Commented] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595624#comment-16595624 ] Ajay Kumar commented on HDDS-351: - [~anu] thanks for the quick review. Patch v2 has the following changes: * Removed the EventQueue changes for removing an event handler; opened HDDS-384 to discuss that separately. * Updated the test case for the zero-container case. {quote}nit: Start function is commented out. StorageContainerManager.java: Line 607 : is commented out. nit: testSCMChillMode - Line 474: OZONE_METADATA_STORE_IMPL_LEVELDB ==> Replace with RocksDB.{quote} Done > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
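The test-config change mentioned in the nit can be sketched as below, assuming the OzoneConfigKeys constants referenced in the review and the standard MiniOzoneCluster builder; the surrounding scaffolding is illustrative only.
{code:java}
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.ozone.MiniOzoneCluster;
import org.apache.hadoop.ozone.OzoneConfigKeys;

public class ChillModeTestSetup {
  /** Mini-cluster setup using a RocksDB metadata store, per the nit above. */
  public static MiniOzoneCluster startCluster() throws Exception {
    OzoneConfiguration conf = new OzoneConfiguration();
    conf.set(OzoneConfigKeys.OZONE_METADATA_STORE_IMPL,
        OzoneConfigKeys.OZONE_METADATA_STORE_IMPL_ROCKSDB);
    MiniOzoneCluster cluster = MiniOzoneCluster.newBuilder(conf)
        .setNumDatanodes(3)  // datanode count is arbitrary here
        .build();
    cluster.waitForClusterToBeReady();
    return cluster;
  }
}
{code}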
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: (was: HDDS-351.02.patch) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.02.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-362) Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol
[ https://issues.apache.org/jira/browse/HDDS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-362: Attachment: HDDS-362.00.patch > Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol > --- > > Key: HDDS-362 > URL: https://issues.apache.org/jira/browse/HDDS-362 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDDS-362.00.patch > > > Modify functions impacted by SCM chill mode in ScmBlockLocationProtocol -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
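For context, a sketch of the kind of guard this sub-task implies: ScmBlockLocationProtocol operations such as allocateBlock should fail fast while SCM is still in chill mode. The class and message below are illustrative, not the committed API.
{code:java}
import java.io.IOException;

/** Illustrative guard; the precheck class name is not the committed API. */
class ChillModeGuardSketch {
  interface ChillModeStatus {
    boolean isInChillMode();
  }

  private final ChillModeStatus status;

  ChillModeGuardSketch(ChillModeStatus status) {
    this.status = status;
  }

  /** Call at the top of allocateBlock and similar protocol methods. */
  void preCheck(String op) throws IOException {
    if (status.isInChillMode()) {
      throw new IOException(
          "ChillModePrecheck failed for " + op + ": SCM is in chill mode");
    }
  }
}
{code}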
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Attachment: HDDS-351.03.patch > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch, > HDDS-351.03.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595752#comment-16595752 ] Ajay Kumar commented on HDDS-351: - [~anu] yes, the failure in {{TestContainerMapping}} is related; fixed it in patch v3. The TestOzoneConfigurationFields failure is unrelated, and both checkstyle warnings may be ignored. > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch, > HDDS-351.03.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Status: Open (was: Patch Available) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch, > HDDS-351.03.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-351) Add chill mode state to SCM
[ https://issues.apache.org/jira/browse/HDDS-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-351: Status: Patch Available (was: Open) > Add chill mode state to SCM > --- > > Key: HDDS-351 > URL: https://issues.apache.org/jira/browse/HDDS-351 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-351.00.patch, HDDS-351.01.patch, HDDS-351.02.patch, > HDDS-351.03.patch > > > Add chill mode state to SCM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org