[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455742#comment-16455742
 ] 

Aaron Fabbri commented on HADOOP-13649:
---------------------------------------

A couple of comments on the patch.  Looks pretty good overall.
{noformat}
+  public static final String CONF_EXPIRY_AFTER_WRITE_SECONDS =
+      "fs.metadatastore.local.expiry_after_write";
 {noformat}
Can we call this {{fs.metadatastore.local.ttl}}? TTL (time to live) is a pretty 
common term and it leaves us latitude to evolve the semantics since it is less 
specific. (Also I don't think we use underscores in config key names.)
{noformat}
-    // Start w/ less than max capacity.  Space / time trade off.
-    fileHash = new LruHashMap<>(maxRecords/2, maxRecords);
-    dirHash = new LruHashMap<>(maxRecords/4, maxRecords);
+    int expiryAfterWrite = conf.getInt(CONF_EXPIRY_AFTER_WRITE_SECONDS,
+        DEFAULT_EXPIRY_AFTER_WRITE_SECONDS);
+
+    fileCache = CacheBuilder.newBuilder()
+        .expireAfterWrite(expiryAfterWrite, TimeUnit.SECONDS)
+        .maximumSize(maxRecords)
+        .build();
+
+    dirCache = CacheBuilder.newBuilder()
+        .expireAfterWrite(expiryAfterWrite, TimeUnit.SECONDS)
+        .maximumSize(maxRecords)
+        .build();
{noformat}
That is cool that the Cache is a basically drop-in replacement for the HashMap. 
I think it would be an improvement to use a single {{Cache}} that references a 
tuple of {{(PathMetadata, DirListingMetadata)}}, but we need to think of how 
TTL and eviction works in either case. Seems like TTL works better with a 
single Cache (whether you do a {{get()}} or a {{listChildren()}}, you always 
"refresh" or "MRU" the metadata for the path.)

The "tuple" could be a trivial static inner class that just has a reference to 
each object.

Also, should we use {{.expireAfterWrite()}} or {{.expireAfterAccess()}}. Access 
makes more sense if this is a cache for performance.  Write makes more sense as 
a S3 Eventual Consistency guard.  I feel like this thing is mostly useful for 
testing and maybe for a cache, so I'm leaning towards Access.  What do you 
think?

Another thought; Cache has internal locking (docs say it is similar to the 
ConcurrentMap). LocalMetadataStore does its own locking (because there are two 
hashes to update, not sure if there are other reasons). We should figure out if 
we can eliminate our {{synchronized}} locks by using a single {{Cache}}, or if 
we still need them.

FYI If you want to measure performance of the implementation, the 
{{ITestLocalMetadataStoreScale}} test is a good place to start. It is more of a 
microbenchmark so you might actually see statistically-relevant differences 
more easily. You may have to tweak the logging but I believe it already logs 
some performance metrics.

The built-in stats that {{Cache}} expose are pretty cool. Would be fun to do a 
large distcp from s3 (or some other read-only operation that is safe for local 
metadata caching) and dump the Cache stats.  Maybe we can expose them via 
{{MetadataStore#getDiagnostics()}} later on.

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13649
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13649
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Minor
>         Attachments: HADOOP-13649.001.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It 
> may be useful in narrow circumstances where the workload can tolerate 
> short-term lack of inter-node consistency:  Being in-memory, one JVM/node's 
> LocalMetadataStore will not see another node's changes to the underlying 
> filesystem.
> To put a bound on the time during which this inconsistency may occur, we 
> should implement time-based (a.k.a. Time To Live / TTL)  expiration for 
> LocalMetadataStore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to