[ https://issues.apache.org/jira/browse/HDFS-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rakesh R updated HDFS-13166:
----------------------------
Description:

Presently, {{#getLiveDatanodeStorageReport()}} is invoked for every file, repeating the same expensive computation each time. This Jira sub-task is to discuss and implement a caching mechanism that reduces the number of these calls. We could also define a configurable refresh interval and periodically refresh the DN cache by fetching the latest {{#getLiveDatanodeStorageReport}} on that interval.

The following comment is taken from HDFS-10285 (Comment-7):

{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling – does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice.{quote}

was:
Presently {{#getLiveDatanodeStorageReport}} is fetched for every file and does the computation. This task is to discuss and implement a cache mechanism to minimize the number of function calls. Probably, we could define a configurable refresh interval and periodically refresh the DN cache by fetching latest {{#getLiveDatanodeStorageReport}}.
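The refresh-interval idea above can be sketched as a small wrapper that holds the last fetched report and renews it on a timer, so per-file processing reads a possibly-stale ("fuzzy") snapshot instead of calling into the NameNode. This is a minimal illustration, not the eventual SPS implementation; the class name, the `Supplier`-based fetcher, and the interval parameter are all hypothetical stand-ins for the real Hadoop types:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: caches the result of an expensive "live datanode
 * report" fetch and refreshes it on a configurable interval, so callers
 * read the cached snapshot instead of issuing an RPC per file.
 */
class CachedDatanodeReport<T> {
    private final AtomicReference<T> snapshot = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "dn-report-refresher");
                t.setDaemon(true);  // do not keep the JVM alive
                return t;
            });

    CachedDatanodeReport(Supplier<T> fetcher, long refreshIntervalMs) {
        snapshot.set(fetcher.get());  // initial synchronous fill
        scheduler.scheduleWithFixedDelay(
                () -> snapshot.set(fetcher.get()),  // periodic refresh
                refreshIntervalMs, refreshIntervalMs, TimeUnit.MILLISECONDS);
    }

    /** Possibly-stale view of the world; never triggers a fetch. */
    T get() {
        return snapshot.get();
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

A caller would construct this once with the refresh interval read from configuration, then call `get()` for every file; staleness is bounded by the interval, which matches the "fuzzy/cached state of the world" the review comment asks for.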
> [SPS]: Implement caching mechanism to keep LIVE datanodes to minimize costly
> getLiveDatanodeStorageReport() calls
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13166
>                 URL: https://issues.apache.org/jira/browse/HDFS-13166
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>            Priority: Major
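The lock-narrowing suggestion in the quoted review comment ("Can't DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap?") can be illustrated with a toy class that guards only its own storage map, so a space check never needs a global lock. This is a sketch of the idea only; the class and method names are illustrative and do not reflect the real DatanodeDescriptor API:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of fine-grained locking: the per-datanode storage
 * map is guarded by its own monitor, so checking for free space does not
 * require holding any wider (namesystem-style) lock.
 */
class DatanodeInfoSketch {
    // storageId -> remaining bytes; guarded by its own monitor below
    private final Map<String, Long> storageRemaining = new HashMap<>();

    void updateStorage(String storageId, long remainingBytes) {
        synchronized (storageRemaining) {  // lock only this map
            storageRemaining.put(storageId, remainingBytes);
        }
    }

    /** True if any storage can hold blockSize bytes; only the map is locked. */
    boolean hasSpaceFor(long blockSize) {
        synchronized (storageRemaining) {
            for (long remaining : storageRemaining.values()) {
                if (remaining >= blockSize) {
                    return true;
                }
            }
            return false;
        }
    }
}
```

The design point is that heartbeat updates and scheduling checks contend only on this one map's monitor, so a slow check (or a DNS hiccup elsewhere) cannot stall unrelated namesystem operations.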