[ https://issues.apache.org/jira/browse/OAK-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622214#comment-14622214 ]

Chetan Mehrotra commented on OAK-3090:
--------------------------------------

Given that we already use Guava in Oak, it might be better to just make use of 
its caches (or LIRSCache, if it supports a removal listener) and have a simple 
CachingDataStore implementation. Have a look at 
{{DataStoreBlobStore#getInputStream}}, where we already do some on-heap 
caching for small binaries.
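The pattern there is roughly the following (a sketch of the idea, not the 
actual Oak code; imports from {{com.google.common.cache}} omitted):

{code:java}
// small binaries kept on heap in a size-bounded Guava cache, keyed by blobId
Cache<String, byte[]> smallBlobCache = CacheBuilder.newBuilder()
        .maximumWeight(16 * 1024 * 1024)   // illustrative 16 MB cap
        .weigher(new Weigher<String, byte[]>() {
            @Override
            public int weigh(String blobId, byte[] data) {
                return data.length;        // weight = bytes held on heap
            }
        })
        .build();
{code}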

Extrapolating that design in the following way would allow us to implement a 
simple filesystem-based caching layer (a sketch follows the list):
# Have a new cache where the cached value is a File (or some instance which keeps a reference to a File)
# Provide support for weight via a simple [Weigher|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/Weigher.html] based on the file size
# Register a [RemovalListener|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalListener.html] which removes the file from the file system upon eviction
# Provide a [CacheLoader|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/CacheLoader.html] which spools the remote binary to the local filesystem
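
Putting the pieces together, a minimal sketch ({{openRemoteStream}} is a 
hypothetical stand-in for the actual DataStore read, and the class/method 
names are illustrative, not a proposed API):

{code:java}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import com.google.common.cache.Weigher;
import com.google.common.io.ByteStreams;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.ExecutionException;

public class FileSystemBlobCache {
    private final LoadingCache<String, File> cache;

    public FileSystemBlobCache(final File cacheDir, long maxCacheSizeBytes) {
        this.cache = CacheBuilder.newBuilder()
                // cap the total bytes on disk; entries are weighed by file size
                .maximumWeight(maxCacheSizeBytes)
                .weigher(new Weigher<String, File>() {
                    @Override
                    public int weigh(String blobId, File file) {
                        // Weigher returns int, so this assumes files < 2 GB
                        return (int) file.length();
                    }
                })
                // delete the spooled file when its entry is evicted
                .removalListener(new RemovalListener<String, File>() {
                    @Override
                    public void onRemoval(RemovalNotification<String, File> n) {
                        n.getValue().delete();
                    }
                })
                // on a miss, spool the remote binary to the local filesystem
                .build(new CacheLoader<String, File>() {
                    @Override
                    public File load(String blobId) throws IOException {
                        // real code would sanitize blobId for use as a file name
                        File spooled = new File(cacheDir, blobId);
                        InputStream in = openRemoteStream(blobId);
                        FileOutputStream out = new FileOutputStream(spooled);
                        try {
                            ByteStreams.copy(in, out);
                        } finally {
                            out.close();
                            in.close();
                        }
                        return spooled;
                    }
                });
    }

    public File get(String blobId) throws ExecutionException {
        return cache.get(blobId);
    }

    // hypothetical: would delegate to the underlying DataStore/BlobStore
    private InputStream openRemoteStream(String blobId) throws IOException {
        throw new UnsupportedOperationException("stand-in for the remote read");
    }
}
{code}

Note that both requirements from the issue map onto this naturally: the total 
cache size is enforced by {{maximumWeight}}, and the max cacheable binary size 
would just be a length check before loading.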

This should require only a small amount of logic and would provide all the 
benefits of the Guava cache [1], including cache stats. And it would work 
transparently for any DataStore. 
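For instance, enabling {{recordStats()}} on the builder makes hit/miss 
counters available (fragment, assuming the {{cache}} from the sketch above):

{code:java}
// requires .recordStats() on the CacheBuilder when the cache is built
CacheStats stats = cache.stats();
System.out.printf("hit rate: %.2f, evictions: %d%n",
        stats.hitRate(), stats.evictionCount());
{code}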

Maybe we should implement it at the {{BlobStore}} level itself; then it would 
be useful for other BlobStore implementations as well. Doing it at the 
BlobStore level would require some support from {{BlobStore}} to determine the 
blob length from the blobId itself (see the sketch below). 
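A possible shape for that support, assuming the store encodes the length in 
the id (as {{DataStoreBlobStore}} does with its {{<contentHash>#<length>}} 
format; a store whose ids carry no length would need a metadata lookup 
instead):

{code:java}
// Sketch: extract the length suffix from a blobId of the form "<hash>#<length>"
public long getBlobLength(String blobId) {
    int idx = blobId.lastIndexOf('#');
    if (idx == -1) {
        throw new IllegalArgumentException("blobId carries no length: " + blobId);
    }
    return Long.parseLong(blobId.substring(idx + 1));
}
{code}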

[1] https://code.google.com/p/guava-libraries/wiki/CachesExplained

> Caching BlobStore implementation 
> ---------------------------------
>
>                 Key: OAK-3090
>                 URL: https://issues.apache.org/jira/browse/OAK-3090
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob
>            Reporter: Chetan Mehrotra
>             Fix For: 1.3.4
>
>
> Storing binaries in Mongo puts a lot of read pressure on MongoDB. To reduce 
> the read load it would be useful to have a filesystem-based cache of 
> frequently used binaries. 
> This would be similar to CachingFDS (OAK-3005) but would be implemented on 
> top of the BlobStore API. 
> Requirements
> * Specify the max binary size which can be cached on the file system
> * Limit the total size of all binary content present in the cache


