ashishkumar50 commented on PR #10440: URL: https://github.com/apache/ozone/pull/10440#issuecomment-4742742310
> I scanned over this and my immediate thought it that there are a lot of places where allocation, deallocation and rollback have to happen - it would be easy for bugs to creep in. > > ReplicationManager tracks all its pending container creations and deletes in ContainerReplicaPendingOps - would it be cleaner and easier to hook into there to adjust the space allocated on a datanode? Once a container is added to the PendingOps tracking its definitely going to be sent - so we don't need to worry about the overloaded exceptions etc at that point or rollbacks. > > Pending ops is also cleared by the ICRs or a timeout if the new container does not appear in time. It feels like it would be a much more central place to add the accounting. @sodonnel Thanks for the review. Main idea was to first account space in SCM before sending any datanode command to avoid any over-allocation to DN. But yes with the current approach to make more accurate it has more complexity and chances of errors are high. Agree with your suggestion, we can just account space in ContainerReplicaPendingOps(ADD). And this will simplify everything. Only downside is, there may be some over-allocation. But with other [PR change](https://github.com/apache/ozone/pull/10054), SCM has extra buffer space than DN. I will go with your suggestion to simplify. > Surely when RM asks the placement policy for a node, the node should already have been checked for space? It shouldn't be RMs job to validate free space in a node it just asked for, and then ask for another. The placement policy should do the check and only return "good" nodes (I thought it did that already?) or thrown an error "no possible nodes found". Yes placement policy already checks for space, but problem is placement policy is not aware of container. So we cannot account space there. And it will make non-atomic while accounting space in RM. But as per first comment we can keep now RM just to record without checking space again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
