[GitHub] [hadoop-ozone] sodonnel commented on pull request #1338: HDDS-4023. Delete closed container after all blocks have been deleted.
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-700582338

> There is the following logic in ReplicationManager, which will handle the replicas reported while the container state is DELETING.

Sorry, I missed that. You are correct. I am +1 on this change as it is now, so feel free to commit it.

@linyiqun I do agree that this could be handled more cleanly and efficiently in the container report handler. However, it is probably not much of an overhead for the replication manager, so I am happy for us to commit the change as it is and see how it performs in practice. Worst case, we have to refactor the change out of RM into the report handler. What do you think?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-700016623

This change looks almost good now. I wonder about two final things:

1. In `updateContainerStats(...)`, do you think we should return without making any updates if the container is DELETING or DELETED? If this is a stale replica, it may not be empty and hence could update the container stats incorrectly. If the container is already in the DELETING or DELETED state, then we can ignore any changes to it, as we know the stale replica will get removed anyway.

2. I am wondering if we could receive a stale replica while the container is DELETING. The replica would then get added in `updateContainerReplica(...)`. Later, the container will go to DELETED and the stale replica will get reported again - at that point we will send a delete command, but I think the replica will never get removed from memory now. Would it make sense to send the delete command for any replicas received while the container is DELETING or DELETED?
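A minimal sketch of the early-return idea in point 1, using a hypothetical simplified lifecycle enum rather than the real SCM classes (the actual guard would sit at the top of `updateContainerStats(...)`):

```java
// Hypothetical simplification of SCM's container lifecycle states.
enum LifeCycleState { OPEN, CLOSING, CLOSED, DELETING, DELETED }

class ContainerStatsGuard {
  // A stale replica of a DELETING/DELETED container may still report
  // non-zero stats, so skip stat updates entirely for those states:
  // the replica will be removed anyway.
  static boolean shouldUpdateStats(LifeCycleState state) {
    return state != LifeCycleState.DELETING
        && state != LifeCycleState.DELETED;
  }
}
```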
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-698259315

Sorry for the slow reply on this. I have been caught up on some other things.

> After a second thought, deleting the container record in the SCM DB immediately while keeping it in memory may be a better and cleaner choice. Then, if there is a stale container replica, it can be deleted based on the in-memory information.

I think this is a good enough idea for now. If SCM is up for a very long time, perhaps in the future we will want to add a thread to clear all the in-memory DELETED containers.

One small concern: if a container goes DELETED and then SCM is restarted soon after, and a DN is then restarted and reports a stale replica, it will just be seen as an unknown container. The default position there is to log a warning; the config hdds.scm.unknown-container.action controls this. This is all an edge case - most of the time, all DNs should be up anyway.

I left just one comment on a suggested refactor in the container report handler, when dealing with replicas from a DELETED container. Could you also add a test in TestContainerReportHandler to check the logic around deleting a replica from a DELETED container?
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-688978976

Thinking some more about what to do with containers in the DELETED state in SCM. If we are not confident enough to delete them immediately and automatically, maybe we should provide a command "purge empty containers" which can be run by an admin, or via Recon. We should show in Recon how many empty containers are present in the cluster and then provide an option to remove them.

Alternatively, we could have some threshold in SCM, so that when the number of DELETED containers crosses the threshold, they are all removed. We could do either of these in a separate jira if this idea makes sense.
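A rough sketch of the threshold idea, with entirely hypothetical names (nothing here matches the real SCM classes; the real implementation would work against SCM's container manager):

```java
import java.util.*;

// Hypothetical sketch: once the number of DELETED containers crosses a
// threshold, select all of them for removal from SCM's in-memory map.
class DeletedContainerPurger {
  private final int threshold;

  DeletedContainerPurger(int threshold) {
    this.threshold = threshold;
  }

  // containerStates maps container id -> lifecycle state name.
  List<Long> selectForPurge(Map<Long, String> containerStates) {
    List<Long> deleted = new ArrayList<>();
    for (Map.Entry<Long, String> e : containerStates.entrySet()) {
      if ("DELETED".equals(e.getValue())) {
        deleted.add(e.getKey());
      }
    }
    // Below the threshold, keep everything; at or above it, purge all
    // DELETED container ids in one pass.
    return deleted.size() >= threshold ? deleted : Collections.emptyList();
  }
}
```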
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-688774840

As I said before, I have a feeling that it *might* be better to have a separate service for deleting empty containers, and it would perhaps naturally be part of a "merge small containers" service, which we will need to create eventually. However, the changes to the replication manager here are not too big, and you could argue the current solution fits here OK. Therefore I am OK with keeping this logic in the replication manager. However, when it comes to merging small containers in the future, I believe we should create a separate service and not clutter RM with that new feature too.

The patch looks mostly good to me; there are just a couple of areas we need to consider:

1. What do we do with containers in the DELETED state with no replicas? I don't think we should keep them in SCM forever, but we are also worried about removing them. That said, if the replicas have all been deleted, what use is there in keeping the container in SCM?

2. Consider a datanode that was dead for some time and is then restarted. It will start reporting a replica for the DELETED container, and with the current logic I don't think we will clean it up, as `isContainerEmpty(...)` will always return false due to it checking only for CLOSED containers. I feel this should be handled in the ContainerReportHandler code: if a replica is reported for a container which is DELETED in SCM, then we should instruct the datanode to delete the replica. What do you think? The parameter hdds.scm.unknown-container.action currently controls what to do if a DN reports a container which SCM does not know about. The default is to log a warning, but for a DELETED container I think we should just delete the replica.

3. There is another corner case, where the dead node may have a container which is out of sync with the others, as it was dead when the blocks should have been removed. Therefore we had 3 empty container replicas, plus one non-empty dead one. The 3 empty replicas are removed, and the container moves to DELETED. Then the dead node starts reporting a non-empty replica for the DELETED container. This is actually a wider problem, in that I believe there are scenarios where container replicas can get out of sync with one another and there is nothing to reconcile them and fix it.

4. In ContainerSafeModeRule, I think we need to consider DELETING and DELETED. If there are a lot of DELETED containers in the cluster, they will never get replicas and the safemode rule will never be met. I think this could be as simple as excluding DELETING and DELETED when loading the containerMap (right now OPEN and CLOSING are excluded), so we don't wait for replicas on these containers.

5. In `ReplicationManager.isContainerUnderReplicated(...)`, should we immediately return true if the containerState is DELETING or DELETED?

6. Are there any other places outside of ReplicationManager and ContainerSafeModeRule where we monitor under / over replicated containers that I may have missed?
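A minimal sketch of the exclusion suggested in point 4, with hypothetical names (the real rule loads its containerMap from SCM's container manager, not from a plain list):

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical simplification of SCM's container lifecycle states.
enum State { OPEN, CLOSING, CLOSED, DELETING, DELETED }

class SafeModeContainerFilter {
  // OPEN and CLOSING are already excluded today; DELETING and DELETED
  // should be too, since those containers will never regain replicas
  // and would block the safemode threshold from ever being met.
  private static final EnumSet<State> EXCLUDED =
      EnumSet.of(State.OPEN, State.CLOSING, State.DELETING, State.DELETED);

  static List<State> trackedStates(List<State> allContainers) {
    return allContainers.stream()
        .filter(s -> !EXCLUDED.contains(s))
        .collect(Collectors.toList());
  }
}
```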
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-676523668

I raised HDDS-4131 / https://github.com/apache/hadoop-ozone/pull/1339 to discuss and address the bytesUsed and keyCount metric problem I mentioned above.
sodonnel commented on pull request #1338:
URL: https://github.com/apache/hadoop-ozone/pull/1338#issuecomment-676472849

Thanks for working on this @ChenSammi.

I wonder if it would be simpler to remove empty containers as part of Container Report processing? In `AbstractContainerReportHandler#updateContainerState`, we could check the size and number of keys of the reported containers in the CLOSED branch of the switch statement, and then take action to delete an empty container there. I have a feeling it might be simpler, but I am not sure. The disadvantage of doing it in Container Report processing is that we are dealing with only a single replica at that stage. However, if the container is CLOSED in SCM and a report says it is empty, then we should be good to simply remove the container from SCM and issue the delete container command when processing the container report.

In looking at how this all works, I noticed there is a bug in the existing method `AbstractContainerReportHandler#updateContainerStats`, which might stop your change working:

```
private void updateContainerStats(final ContainerID containerId,
    final ContainerReplicaProto replicaProto)
    throws ContainerNotFoundException {
  if (isHealthy(replicaProto::getState)) {
    final ContainerInfo containerInfo = containerManager
        .getContainer(containerId);

    if (containerInfo.getSequenceId() <
        replicaProto.getBlockCommitSequenceId()) {
      containerInfo.updateSequenceId(
          replicaProto.getBlockCommitSequenceId());
    }
    if (containerInfo.getUsedBytes() < replicaProto.getUsed()) {
      containerInfo.setUsedBytes(replicaProto.getUsed());
    }
    if (containerInfo.getNumberOfKeys() < replicaProto.getKeyCount()) {
      containerInfo.setNumberOfKeys(replicaProto.getKeyCount());
    }
  }
}
```

This logic assumes the bytesUsed and keyCount are always increasing on the container. However in this case, where blocks are deleted, the keyCount and bytesUsed can decrease. Therefore I don't think the stats will ever get updated in SCM via the container reports when blocks are removed.

I then noticed you recently made a change in ReplicationManager to update these counts here: https://github.com/apache/hadoop-ozone/pull/1295/files#diff-218132dfe72e6e39b9eaaf59737adcf2R780

I think the ReplicationManager part of that change should be reverted and this handled via the container reports by fixing the bug I mentioned above. The only reason we need the ReplicationManager change is to work around that bug; if it was fixed, we would not need that logic in RM. If we decide to keep your changes inside ReplicationManager, then we would need a few more tests added to TestReplicationManager to cover the new scenarios.

I also have a question - when the replicas are all deleted and we call:

```
containerManager.updateContainerState(container.containerID(),
    HddsProtos.LifeCycleEvent.CLEANUP);
```

Does this somehow cause the container to be removed from SCM memory and the persistent store?
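To illustrate the bug, here is a minimal sketch using a hypothetical simplified class (not the real ContainerInfo) that contrasts the grow-only comparison with an update that also lets the stats shrink after block deletion. The real fix would need to account for multiple replicas reporting different values; this only shows the single-replica case:

```java
// Hypothetical simplified stand-in for SCM's per-container stats.
class ContainerStats {
  long usedBytes;
  long keyCount;

  ContainerStats(long usedBytes, long keyCount) {
    this.usedBytes = usedBytes;
    this.keyCount = keyCount;
  }

  // Buggy behaviour described above: stats only ever grow, so a replica
  // reporting smaller values after block deletion is silently ignored.
  void updateGrowOnly(long replicaUsed, long replicaKeys) {
    if (usedBytes < replicaUsed) {
      usedBytes = replicaUsed;
    }
    if (keyCount < replicaKeys) {
      keyCount = replicaKeys;
    }
  }

  // One possible fix for a single replica: take the reported values in
  // both directions, so deletions are reflected too.
  void updateBothWays(long replicaUsed, long replicaKeys) {
    usedBytes = replicaUsed;
    keyCount = replicaKeys;
  }
}
```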