[ https://issues.apache.org/jira/browse/HDFS-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841754#comment-13841754 ]
Andrew Wang commented on HDFS-5636:
-----------------------------------

The issue is that if a pool is hanging on to forgotten cached data, that prevents another pool from using that cache capacity. This is true even with quotas/limits. It's somewhat fixed if we also add minimum reservations for pools, but even then you might want max TTLs so that forgotten cache is returned to be used by fair share.

Basically, right now TTLs are opt-in rather than opt-out, and admins might sometimes want them to be opt-out instead. There's an impedance mismatch here between admins and the users of a pool: just because a user has access to a cache pool doesn't necessarily mean they should be able to do whatever they want with it, because it might be bad for overall system performance. I see max TTL as an admin-friendly feature, since it'll help them avoid manually cleaning up cache pools.

A couple of imagined use cases:

* A scratch / temp cache pool with a low max TTL (say, 1 hour) and 0777 permissions, so all users can do some ad hoc data exploration. The admin doesn't need to worry about constantly cleaning up forgotten directives.
* When caching time-series data, you might only care about caching the last day of data. Thus, the admin could set a max TTL of 24 hours to enforce this.

Sort of related, we might also want a command like {{hdfs cacheadmin -removeExpiredDirectives [-pool <pool>]}} to help people clean up their expired directives. Maybe even a trash-like functionality where directives that have been expired for long enough are automatically removed.

> Enforce a max TTL per cache pool
> --------------------------------
>
>                 Key: HDFS-5636
>                 URL: https://issues.apache.org/jira/browse/HDFS-5636
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: caching, namenode
>    Affects Versions: 3.0.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>
> It'd be nice for administrators to be able to specify a maximum TTL for
> directives in a cache pool. This forces all directives to eventually age out.
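
To make the two ideas in the comment concrete (a per-pool max TTL, and cleanup of expired directives), here is a minimal sketch in Python. It is not HDFS code: the names {{CachePool}}, {{Directive}}, {{effective_expiry}} and {{remove_expired}} are hypothetical, and clamping an over-long TTL to the pool maximum is only one possible policy (rejecting the request would be another).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CachePool:
    # Hypothetical model of a cache pool; max_ttl_secs=None means "no limit".
    name: str
    max_ttl_secs: Optional[float] = None


@dataclass
class Directive:
    # Hypothetical model of a cache directive with an absolute expiry time.
    path: str
    pool: str
    expiry: float  # seconds since epoch; float("inf") means "never expires"


def effective_expiry(pool: CachePool, now: float,
                     requested_ttl_secs: Optional[float]) -> float:
    """Return the absolute expiry for a new directive in `pool`.

    Assumed policy: a missing TTL defaults to the pool maximum, and a TTL
    longer than the pool maximum is clamped to it.
    """
    if pool.max_ttl_secs is None:
        # Pool has no limit: TTLs stay opt-in for the user.
        if requested_ttl_secs is None:
            return float("inf")
        return now + requested_ttl_secs
    if requested_ttl_secs is None or requested_ttl_secs > pool.max_ttl_secs:
        return now + pool.max_ttl_secs
    return now + requested_ttl_secs


def remove_expired(directives: List[Directive], now: float,
                   pool: Optional[str] = None) -> List[Directive]:
    """Return the directives that survive cleanup.

    A directive is dropped if it has expired and, when `pool` is given,
    belongs to that pool. Directives in other pools are left alone.
    """
    kept = []
    for d in directives:
        in_scope = pool is None or d.pool == pool
        if in_scope and d.expiry <= now:
            continue  # expired and in scope: remove it
        kept.append(d)
    return kept
```

With a scratch pool whose max TTL is 3600 seconds, a directive added without a TTL, or with a 2-hour TTL, would expire one hour after creation under this policy, while a 60-second TTL would be honored as requested.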