[ 
https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803550#comment-13803550
 ] 

Andrew Wang commented on HDFS-5394:
-----------------------------------

Hi Colin, good catch on these races. There's definitely some sloppiness we can 
prevent here, and some of your other code cleanups look good too.

High-level comments:

* This state machine logic seems like overkill. It's cool how we can do all 
this with atomic swaps, but synchronization isn't a big overhead as this isn't 
a hot path, there aren't that many states, and this the transitions aren't that 
complicated. I think it'd be simpler to just use a couple different maps: 
{{replicasMap}} for cached replicas and {{pendingCache}} / {{pendingUncache}} 
for in-progress replicas.
** {{CACHING_CANCELLED}} could be indicated by just moving it from 
{{pendingCache}} to {{pendingUncache}}
** Removes the need to worry about "visibility" via {{shouldAdvertise}}
* I like the idea of {{UncachingTask}}, it's good future-proofing. However, I 
don't think it should be using the same executor as {{CachingTask}}, since the 
executor is currently capped at 4 threads per volume in the interest of disk 
contention. Uncaching tasks do no I/O, and we wouldn't want stuck uncaching 
tasks to stop other work. It might be better to instead have a background 
sweeper thread that goes through {{pendingUncache}} (kicked on-demand as 
necessary). This works well with a separate {{pendingUncache}} map too.

Lower-level:

* NativeIO: should move the comment up to be javadoc instead
* Nit: FsDatasetCache comment should read "was cancelled" not "got cancelled"
* A remark, this only sets {{nextValue}} to null, which is okay here, but 
seeing it did make me check that it's not an error:
{code}
    Value prevValue, nextValue = null;
{code}
* I don't see a method named "startUncaching", need to update the exception text
* trailing whitespace:
{code}
    if (nextValue.state != State.CACHING_CANCELLED) { 
{code}
* Would be good to refer to the JIRA # in the {{UncachingTask}} TODO
* Javadoc for MappableBlock#getVisibleLength and MappableBlock constructor
* Javadoc for MappableBlock#load refers to "visibleLeng" instead of 
"visibleLength"
* Would be nice to drop a comment in {{uncacheBlock}} about the purpose of the 
do while for atomic swaps
* Why is it called "visibleLength" instead of just just "length"? We don't 
support partial caching, reading of blocks open for append, or cached reads of 
blocks that are being cached, so mention of "visibility" threw me off. That's 
why I wanted javadoc above.
* Could we keep {{FsDatasetImpl#validToCache}} method separate rather than 
inlining it in {{cacheBlock}}?
* Don't we need to remove a {{CACHING_CANCELLED}} {{MappableBlock}} from the 
replicaMap when the swap check in {{CachingTask}} fails?

> fix race conditions in DN caching and uncaching
> -----------------------------------------------
>
>                 Key: HDFS-5394
>                 URL: https://issues.apache.org/jira/browse/HDFS-5394
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5394-caching.001.patch, 
> HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch
>
>
> The DN needs to handle situations where it is asked to cache the same replica 
> more than once.  (Currently, it can actually do two mmaps and mlocks.)  It 
> also needs to handle the situation where caching a replica is cancelled 
> before said caching completes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to