[ 
https://issues.apache.org/jira/browse/HDFS-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081250#comment-14081250
 ] 

Colin Patrick McCabe commented on HDFS-6784:
--------------------------------------------

Ah.  You're right, both cases are going FSN -> CRM, not the other way around.  
That's what I get for commenting so late at night.

Anyway, this is not the right approach.  If modifyDirective is calling 
{{setNeedsRescan}} multiple times, each call could trigger a rescan that 
completes before the next call.  There's no reason to assume that the thread 
will run only after all the calls have been made; in fact, given the way 
condition variables work, it's more likely that it will run immediately.  In 
that case, this patch is not useful, even leaving aside any other issues with 
it.  You need to remove the duplicate calls to {{setNeedsRescan}} and call it 
only once.
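
To make the scheduling argument concrete, here is a rough single-threaded 
sketch.  It is a toy model, not the actual CacheReplicationMonitor code: 
{{ToyMonitor}}, {{monitorThreadRuns}} and the two {{scansWith...}} helpers are 
names made up for illustration, and the "monitor thread" is simulated by an 
explicit method call so the interleaving is deterministic.  It assumes the 
unlucky (but likely) case where the monitor is woken right after every call.

```java
/** Toy model of the scheduling argument; not real HDFS code. */
final class RescanSketch {

    /** Hypothetical stand-in for the monitor: a flag plus a scan counter. */
    static final class ToyMonitor {
        private boolean needsRescan = false;
        private int completedScans = 0;

        /** What an FS operation calls to request a rescan. */
        void setNeedsRescan() {
            needsRescan = true;
        }

        /** Simulates the monitor thread being scheduled once. */
        void monitorThreadRuns() {
            if (needsRescan) {
                needsRescan = false; // flag is cleared before the scan
                completedScans++;    // the (expensive) scan happens here
            }
        }

        int completedScans() {
            return completedScans;
        }
    }

    /** One FS op that calls setNeedsRescan after each of its changes. */
    static int scansWithRepeatedCalls(int changes) {
        ToyMonitor monitor = new ToyMonitor();
        for (int i = 0; i < changes; i++) {
            monitor.setNeedsRescan();
            // Assumed interleaving: the monitor wakes up immediately.
            monitor.monitorThreadRuns();
        }
        return monitor.completedScans();
    }

    /** Same FS op, but it requests a rescan once, after all its changes. */
    static int scansWithSingleCall(int changes) {
        ToyMonitor monitor = new ToyMonitor();
        boolean changed = false;
        for (int i = 0; i < changes; i++) {
            changed = true; // apply change i, remember that something changed
        }
        if (changed) {
            monitor.setNeedsRescan();
        }
        monitor.monitorThreadRuns();
        return monitor.completedScans();
    }

    public static void main(String[] args) {
        System.out.println("repeated calls: " + scansWithRepeatedCalls(3));
        System.out.println("single call:    " + scansWithSingleCall(3));
    }
}
```

In this model, three calls cost three scans whenever the monitor wins the race 
after each one, while a single call at the end of the operation costs one scan 
regardless of when the monitor runs.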

> Avoid rescan twice in HDFS CacheReplicationMonitor for one FS Op if it calls 
> setNeedsRescan multiple times.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6784
>                 URL: https://issues.apache.org/jira/browse/HDFS-6784
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-6784.001.patch
>
>
> In HDFS CacheReplicationMonitor, a rescan is expensive. Sometimes 
> {{setNeedsRescan}} is called multiple times; for example, 
> FSNamesystem#modifyCacheDirective calls it 3 times. When the monitor thread 
> of CacheReplicationMonitor sees that {{needsRescan}} is true, it rescans, 
> but it sets {{needsRescan}} to false before the actual scan. Meanwhile, the 
> 2nd or 3rd call to {{setNeedsRescan}} may set {{needsRescan}} back to true. 
> So after the scan finishes, the next loop iteration triggers a new rescan, 
> which is unnecessary and makes the operation pay for two rescans. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
