prashantpogde commented on pull request #1533: URL: https://github.com/apache/ozone/pull/1533#issuecomment-724181049
@GlenGeng thank you for the patch. This can also be addressed in other ways:
- Let the datanode die and restart.
- Handle this at the Recon end: identify the problem at Recon and get it restarted. Why are other datanodes not exhibiting the same behavior? It should eventually be the same for other datanodes as well.
- Do not cache reports if the queue exceeds some limit.

Overall, this is an unhealthy situation for the whole cluster. How would keeping the datanode state machine thread alive help this situation? Instead, it would keep reporting in heartbeats that the datanode is healthy.
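To illustrate the "do not cache reports if the queue exceeds some limit" option, here is a minimal sketch. The class `BoundedReportCache`, its methods, and the drop-newest policy are all hypothetical and only show the idea; none of this is existing Ozone code, and the right limit and drop policy would need to be decided in the patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical bounded buffer for reports awaiting delivery; not an Ozone API. */
final class BoundedReportCache<T> {
    private final Deque<T> pending = new ArrayDeque<>();
    private final int limit;
    private long dropped;

    BoundedReportCache(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
    }

    /** Caches the report unless the queue is already at its limit; returns true if cached. */
    synchronized boolean offer(T report) {
        if (pending.size() >= limit) {
            dropped++; // Assumed policy: reject the newest report instead of growing without bound.
            return false;
        }
        pending.addLast(report);
        return true;
    }

    /** Removes and returns the oldest cached report, or null if none is pending. */
    synchronized T poll() {
        return pending.pollFirst();
    }

    synchronized int size() {
        return pending.size();
    }

    /** Number of reports rejected because the queue was full. */
    synchronized long droppedCount() {
        return dropped;
    }
}
```

With a cap like this, a slow or unreachable endpoint cannot make the cached reports grow without bound, and the dropped count could be surfaced as a metric so the unhealthy state stays visible.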
