[ 
https://issues.apache.org/jira/browse/OAK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028715#comment-16028715
 ] 

Vikas Saurabh edited comment on OAK-2808 at 5/30/17 2:56 AM:
-------------------------------------------------------------

[~chetanm],
{quote}
Another approach which can be used here is
* Let DefaultDirectoryFactory implement IndexCommitCallback
* Have a single instance of factory which also refers to 
activeDeletedBlobCollector
* Register the factory with IndexingContext if it implements callback interface
{quote}
Maybe I'm not following correctly, but even if IndexingContext tells a single 
DirectoryFactory instance about commit progress, the factory won't be able to 
notify the correct BlobDeletionCallback instance.
We also have the option to go back to the earlier implementation of doing the 
wiring all the way down via Editor etc. (skipping DirectoryFactory altogether).
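To make the concern concrete, here is a minimal sketch of one reading of it. All types below are hypothetical stand-ins written for illustration, not the actual Oak classes or their signatures. The assumption is that the shared factory hands out one callback per index and that the commit notification does not identify which callback it is meant for.

```java
import java.util.ArrayList;
import java.util.List;

public class CallbackRoutingSketch {
    // Hypothetical stand-in: collects blobs deleted while one index is updated.
    static class BlobDeletionCallback {
        final String indexPath;
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        BlobDeletionCallback(String indexPath) { this.indexPath = indexPath; }

        void deleted(String blobId) { pending.add(blobId); }

        // Marks everything collected so far as safe to delete.
        void commitSucceeded() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    // Hypothetical stand-in: a single shared factory that is told about commit progress.
    static class SharedDirectoryFactory {
        final List<BlobDeletionCallback> handedOut = new ArrayList<>();

        BlobDeletionCallback newCallback(String indexPath) {
            BlobDeletionCallback cb = new BlobDeletionCallback(indexPath);
            handedOut.add(cb);
            return cb;
        }

        // Assumption: the notification does not say which callback it belongs to,
        // so the factory can only notify every callback it has handed out.
        void commitProgress() {
            for (BlobDeletionCallback cb : handedOut) {
                cb.commitSucceeded();
            }
        }
    }

    public static void main(String[] args) {
        SharedDirectoryFactory factory = new SharedDirectoryFactory();
        BlobDeletionCallback a = factory.newCallback("/oak:index/a");
        BlobDeletionCallback b = factory.newCallback("/oak:index/b");
        a.deleted("blob-1");
        b.deleted("blob-2");
        // Intended for the commit that touched index "a" only.
        factory.commitProgress();
        System.out.println(a.committed + " " + b.committed);
    }
}
```

In this sketch the blob collected for index "b" is also marked as committed, which is the mis-routing I'm worried about. Wiring the callback down through the Editor would avoid the lookup entirely.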
bq. No need for that. You can add a reference to CheckpointMBean in 
LuceneIndexProviderService. This can then be passed to 
ActiveDeletedBlobCollectorFactory
Ack. Btw, I'm interpreting this as adding a new method to CheckpointMBean which 
returns oldestTimestampForCheckpoints (maybe 2 properties: the timestamp as 
well as Date().toString()).
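Roughly what I have in mind, as a sketch only. The interface and method names below are hypothetical and are not the existing CheckpointMBean API; the in-memory list of creation times stands in for however checkpoints are actually stored.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class OldestCheckpointSketch {
    // Hypothetical addition: exposes the oldest checkpoint creation time in two forms.
    interface OldestCheckpointInfo {
        long getOldestCheckpointCreationTimestamp();
        String getOldestCheckpointCreationDate();
    }

    // Hypothetical implementation backed by a plain list of creation times (epoch millis).
    static class InMemoryCheckpoints implements OldestCheckpointInfo {
        private final List<Long> creationTimes;

        InMemoryCheckpoints(List<Long> creationTimes) {
            this.creationTimes = creationTimes;
        }

        @Override
        public long getOldestCheckpointCreationTimestamp() {
            // Assumption: with no checkpoints, Long.MAX_VALUE means "nothing is held back".
            long oldest = Long.MAX_VALUE;
            for (long t : creationTimes) {
                oldest = Math.min(oldest, t);
            }
            return oldest;
        }

        @Override
        public String getOldestCheckpointCreationDate() {
            // Human-readable form of the same value, for display in a JMX console.
            return new Date(getOldestCheckpointCreationTimestamp()).toString();
        }
    }

    public static void main(String[] args) {
        OldestCheckpointInfo info = new InMemoryCheckpoints(Arrays.asList(3000L, 1000L, 2000L));
        System.out.println(info.getOldestCheckpointCreationTimestamp());
        System.out.println(info.getOldestCheckpointCreationDate());
    }
}
```

The numeric value is what ActiveDeletedBlobCollectorFactory would compare against; the string is only there for people reading the MBean.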


was (Author: catholicon):
[~chetanm],
{quote}
Another approach which can be used here is
* Let DefaultDirectoryFactory implement IndexCommitCallback
* Have a single instance of factory which also refers to 
activeDeletedBlobCollector
* Register the factory with IndexingContext if it implements callback interface
{quote}
Maybe I'm not following correctly, but even if IndexingContext tells a single 
DirectoryFactory instance about commit progress, the factory won't be able to 
notify the correct BlobDeletionCallback instance.
We also have the earlier implementation of doing the wiring all the way down 
via Editor etc. (skipping DirectoryFactory altogether).
bq. No need for that. You can add a reference to CheckpointMBean in 
LuceneIndexProviderService. This can then be passed to 
ActiveDeletedBlobCollectorFactory
Ack. Btw, I'm interpreting this as adding a new method to CheckpointMBean which 
returns oldestTimestampForCheckpoints (maybe 2 properties: the timestamp as 
well as Date().toString()).

> Active deletion of 'deleted' Lucene index files from DataStore without 
> relying on full scale Blob GC
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OAK-2808
>                 URL: https://issues.apache.org/jira/browse/OAK-2808
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Vikas Saurabh
>              Labels: datastore, performance
>             Fix For: 1.8
>
>         Attachments: copyonread-stats.png, OAK-2808-1.patch
>
>
> With the storing of Lucene index files within DataStore, our usage pattern
> of DataStore has changed between JR2 and Oak.
> With JR2 the writes were mostly application based, i.e. if the application
> stores a pdf/image file then that would be stored in DataStore. JR2 by
> default would not write stuff to DataStore. Further, in deployments
> where a large amount of binary content is present, systems tend to
> share the DataStore to avoid duplication of storage. In such cases
> running Blob GC is a non-trivial task as it involves a manual step and
> coordination across multiple deployments. Due to this, systems tend to
> run GC less frequently.
> Now with Oak, apart from the application, the Oak system itself *actively*
> uses the DataStore to store the index files for Lucene, and there the
> churn might be much higher, i.e. the frequency of creation and deletion of
> index files is a lot higher. This would accelerate the rate of garbage
> generation and thus put a lot more pressure on the DataStore storage
> requirements.
> Discussion thread http://markmail.org/thread/iybd3eq2bh372zrl



