[ https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846682#comment-13846682 ]
Colin Patrick McCabe commented on HDFS-5431:
--------------------------------------------

API:
* I realize this might be an annoying request -- I know redoing that API code is a pain -- but shouldn't we use an {{EnumSet}} rather than a boolean in {{addDirective}}? We've had so many problems with proliferating function overloads in the past that I lean towards starting with an {{EnumSet}}. I also think it's more readable: {{addDirective(foo, FORCE)}} is a lot clearer than {{addDirective(foo, true)}}.

CacheReplicationMonitor:
* It's good that we are now using {{INodeFile#computeFileSizeNotIncludingLastUcBlock}} in {{CacheReplicationMonitor}}.
* I think you have the right general idea here, but it would be nice to do this in a more object-oriented way. How about something like this:
{code}
private static final class ScanTime {
  private long timeMs;

  synchronized void waitForTime(long time) throws InterruptedException {
    // Block until a scan has completed at or after the requested time.
    while (timeMs < time) {
      this.wait();
    }
  }

  synchronized void updateTime(long time) {
    this.timeMs = time;
    this.notifyAll();
  }
}
{code}
Then the CacheReplicationMonitor calls {{updateTime}} when a scan finishes, and {{kickAndWaitForRescan}} calls {{waitForTime(Time.now())}} or something similar. That way, you don't need volatile variables or bare {{Object}} monitors floating around -- it's all encapsulated in {{ScanTime}}.

Stats:
* Rather than putting {{bytesOverlimit}} in {{CachePoolStats}}, why not add a method to {{CachePoolEntry}} that just subtracts {{info.limit}} from {{stats.bytesNeeded}}? Transmitting essentially the same information twice bloats the RPCs, and it also opens up the possibility of divergence if someone updates one value and not the other.
* I don't think we should change {{dfs.namenode.path.based.cache.refresh.interval.ms}}. We selected that value earlier because we felt the cached items would not change that often.
It seems like the motivation behind changing it is to make adding a directive with force=true a better interface, but that seems like a hack. Why not just have {{listCacheDirectives}} trigger a cache replication monitor run and then wait for it to complete, in basically the same way {{addDirective}} without force does now? That way, we could support the use case of adding a bunch of directives with {{addDirective(force)}} and then checking whether we went over the limit with {{listDirectives}}.
{code}
throw new InvalidRequestException("Caching path " + path + " of size " +
    stats.getBytesNeeded() + " bytes at replication " + replication +
    " would exceed pool " + pool.getPoolName() + "'s remaining " +
    "capacity of " + (pool.getLimit() - pool.getBytesNeeded()) + " bytes.");
{code}
This is just a small nit, but doesn't {{CacheDirectiveStats#getBytesNeeded}} include the replication factor? It might be confusing to say the path has size 30 when it's really a size-10 path with 3x replication. I'm not sure what the remedy is here. You could stop mentioning the replication (other than to suggest turning it down). You could also divide by replication, but that feels a little dirty... or is it?
* Moving the serialization stuff to {{FSImageSerialization}} seems like a good idea -- thanks for that.

Tests:
* I see you refactored TestCacheDirectives to avoid creating two DFSClusters. Does this accomplish everything we wanted to do in HDFS-5564? That would be good if so.
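For concreteness, the derived-stat suggestion above (computing bytes-over-limit in {{CachePoolEntry}} rather than shipping {{bytesOverlimit}} in {{CachePoolStats}}) could look something like this. This is a minimal sketch: the class name and accessors here are illustrative assumptions, not the actual HDFS types.
{code}
// Hypothetical sketch: derive bytesOverlimit from values the caller already
// has (the pool limit from CachePoolInfo and bytesNeeded from CachePoolStats)
// instead of transmitting it over RPC as a separate field.
public class CachePoolEntrySketch {
    private final long limit;        // pool limit, in bytes (from the pool info)
    private final long bytesNeeded;  // total bytes needed (from the pool stats)

    public CachePoolEntrySketch(long limit, long bytesNeeded) {
        this.limit = limit;
        this.bytesNeeded = bytesNeeded;
    }

    /** Bytes over the pool limit, derived on demand; never negative. */
    public long getBytesOverlimit() {
        return Math.max(bytesNeeded - limit, 0);
    }
}
{code}
Because the value is computed from the two fields it depends on, it cannot drift out of sync with them the way a separately transmitted {{bytesOverlimit}} could.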
> support cachepool-based limit management in path-based caching
> --------------------------------------------------------------
>
>                 Key: HDFS-5431
>                 URL: https://issues.apache.org/jira/browse/HDFS-5431
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: hdfs-5431-1.patch, hdfs-5431-2.patch, hdfs-5431-3.patch
>
>
> We should support cachepool-based quota management in path-based caching.

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)