[ 
https://issues.apache.org/jira/browse/HDFS-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5203:
--------------------------------

    Attachment: HDFS-5203.1.patch

I'm attaching a preliminary patch.  I've tested it manually, but it isn't 
ready to commit yet because it doesn't include tests.  If anyone wants to 
provide feedback while I'm writing the tests, I'd appreciate it.

Summary:
* Remove {{CacheManager}} enforcement of uniqueness of descriptors for path + 
pool.
* Add a {{DistributedFileSystem#removePathBasedCacheDescriptors}} API that 
takes a {{Path}}.  Implement this all the way down through {{DFSClient}}, 
protobuf, {{FSNamesystem}}, and {{CacheManager}}.
* Add a new -removeDescriptors command to {{CacheAdmin}}, so that cluster 
admins still have an easy way to stop all caching on a path.
* Add a new edit log op for remove-all.  I prefer this over possibly sitting in 
a tight loop of multiple remove-one ops with the write lock held.  Note that I 
renumbered some of the ops in this patch, so it is incompatible with any 
existing edit log from the HDFS-4949 branch.
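
To make the intended semantics concrete, here is a minimal sketch of the 
behavior described above.  It is not the code in the patch: 
{{ToyCacheManager}} and its method names are hypothetical, and it models only 
the in-memory bookkeeping (no protobuf, no edit log).  Each add call gets its 
own ID even for the same path + pool, and remove-by-path drops every matching 
descriptor in a single pass under one write lock acquisition.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical toy model for illustration only; this is NOT the HDFS CacheManager.
class ToyCacheManager {
    // One entry per add call, keyed by descriptor ID. Value is {path, pool}.
    private final Map<Long, String[]> descriptorsById = new LinkedHashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long nextId = 1;

    // Always allocates a fresh ID; no uniqueness check on path + pool.
    long addDescriptor(String path, String pool) {
        lock.writeLock().lock();
        try {
            long id = nextId++;
            descriptorsById.put(id, new String[] {path, pool});
            return id;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Removes only the caller's own descriptor.
    boolean removeDescriptor(long id) {
        lock.writeLock().lock();
        try {
            return descriptorsById.remove(id) != null;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Remove-all for a path: one pass, one write lock acquisition.
    List<Long> removeDescriptors(String path) {
        lock.writeLock().lock();
        try {
            List<Long> removed = new ArrayList<>();
            Iterator<Map.Entry<Long, String[]>> it =
                descriptorsById.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, String[]> entry = it.next();
                if (entry.getValue()[0].equals(path)) {
                    removed.add(entry.getKey());
                    it.remove();
                }
            }
            return removed;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A path stays cached while at least one descriptor refers to it.
    boolean isCached(String path) {
        lock.readLock().lock();
        try {
            for (String[] descriptor : descriptorsById.values()) {
                if (descriptor[0].equals(path)) {
                    return true;
                }
            }
            return false;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this model, if two clients add a descriptor for the same path and one of 
them removes its own ID, the path stays cached until the other client removes 
its descriptor or an admin calls the remove-by-path method.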


> Concurrent clients that add a cache directive on the same path may 
> prematurely uncache from each other.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5203
>                 URL: https://issues.apache.org/jira/browse/HDFS-5203
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: HDFS-4949
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HDFS-5203.1.patch
>
>
> When a client adds a cache directive, we assign it a unique ID and return 
> that ID to the client.  If multiple clients add a cache directive for the 
> same path, then we return the same ID.  If one client then removes the cache 
> entry for that ID, then it is removed for all clients.  Then, when this 
> change becomes visible in subsequent cache reports, the datanodes may 
> {{munlock}} the block before the other clients are done with it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
