[ 
https://issues.apache.org/jira/browse/OAK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346271#comment-16346271
 ] 

Chetan Mehrotra edited comment on OAK-7065 at 1/31/18 5:24 AM:
---------------------------------------------------------------

IndexUpdate should support a close operation for IndexEditor, which should be 
invoked at the end of the indexing cycle on both success and failure. 
Currently, index editors like LuceneIndexEditor implement this close logic by 
relying on the {{leave}} call and checking whether the parent is null. However, if 
some other editor throws an exception, the editors do not get a chance to 
close.
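
A minimal sketch of such a close contract, assuming a simplified editor interface. {{ClosableEditor}} and {{IndexingCycle}} are hypothetical names used only for illustration and are not existing Oak APIs. It shows every editor being closed at the end of the cycle even when one of them fails partway through:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

/** Hypothetical stand-in for an index editor that holds resources such as an IndexWriter. */
interface ClosableEditor extends Closeable {
    /** Performs this editor's part of the indexing cycle; may fail. */
    void index() throws IOException;

    /** Releases resources; expected to run on success and on failure. */
    @Override
    void close() throws IOException;
}

/** Hypothetical driver that guarantees close() is called for every editor. */
final class IndexingCycle {
    private IndexingCycle() {
    }

    static void run(List<? extends ClosableEditor> editors) throws IOException {
        IOException failure = null;
        try {
            for (ClosableEditor editor : editors) {
                editor.index();
            }
        } catch (IOException e) {
            // Remember the failure so it can be rethrown after cleanup.
            failure = e;
        } finally {
            // Close every editor, including those that never got to index.
            for (ClosableEditor editor : editors) {
                try {
                    editor.close();
                } catch (IOException closeFailure) {
                    if (failure == null) {
                        failure = closeFailure;
                    } else {
                        failure.addSuppressed(closeFailure);
                    }
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

With this shape an editor no longer needs to infer the end of the cycle from {{leave}}; the driver owns the lifecycle, and a failure in one editor does not prevent the others from releasing their resources.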

So in case of failure LuceneIndexEditor should

# Close the IndexWriter
# -Also ensure that any new files which have been added by the IndexWriter get 
removed-

Note that cleaning here may not cover all cases. For example, if the IndexUpdate 
completes fine but the AsyncIndexUpdate merge fails, the new files should also be 
considered orphaned and be removed. This cleanup would need to be performed in 
both phases: index open for read and index open for write.

* Index open for read - Here it may happen that the previous indexing failed (say 
the merge failed) and did not get a chance to clean up. So when a new reader is 
closed it may pull down files with the same name, which would conflict with 
existing files. So it would need to clean the local index of any such files first, 
before pulling down new files from the remote directory.
* Index open for write - Here it can also happen that the previous indexing failed 
and left some files behind. Suppose further that this indexing happens on a different 
cluster node, i.e. not the one where the last indexing happened. Here 
it would try to download files with the same name, which could lead to a corrupt 
index. 
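
A rough sketch of the "clean first, then pull" step described in both cases above. The comment does not say how an orphan is recognized, so this example assumes one possible rule: a local file is stale if the remote directory has no file with that name, or has one with a different length. {{LocalIndexCleaner}} and the {{remoteLengths}} map are hypothetical and only illustrate the idea:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that clears stale files before remote files are copied down. */
final class LocalIndexCleaner {
    private LocalIndexCleaner() {
    }

    /**
     * Deletes local files that are missing remotely or whose length differs
     * from the remote file of the same name (an assumed staleness rule).
     * Returns the names of the files that were removed.
     */
    static List<String> removeStaleFiles(Path localDir, Map<String, Long> remoteLengths)
            throws IOException {
        // Collect candidates first so nothing is deleted while the directory is listed.
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(localDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                Long remoteLength = remoteLengths.get(file.getFileName().toString());
                if (remoteLength == null || remoteLength.longValue() != Files.size(file)) {
                    stale.add(file);
                }
            }
        }
        List<String> removed = new ArrayList<>();
        for (Path file : stale) {
            Files.delete(file);
            removed.add(file.getFileName().toString());
        }
        return removed;
    }
}
```

A length comparison cannot detect a same-length file with different content, and on Windows the delete itself may fail for memory-mapped files, so a real implementation would need a stronger check and a retry or deferred-delete strategy.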

*Current Index File Garbage Collection Logic*

Currently the files are removed by the COR directory upon close. It removes those 
files which were present in the local directory when the COR was initialized but 
are not present in the remote directory.

In doing so it also accounts for any new files opened by the COW via the shared 
working set managed in IndexCopier.
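
The selection rule described above amounts to a set difference. The following is an illustrative sketch only, not the actual IndexCopier code; {{OrphanFileSelector}} and its parameter names are hypothetical:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of the garbage collection rule applied on COR close. */
final class OrphanFileSelector {
    private OrphanFileSelector() {
    }

    /**
     * Returns the local files eligible for deletion: those seen locally when
     * the COR directory was initialized, minus files still present remotely,
     * minus files in the working set shared with the COW directory.
     */
    static Set<String> filesToDelete(Set<String> localAtInit,
                                     Set<String> remote,
                                     Set<String> sharedWorkingSet) {
        Set<String> candidates = new TreeSet<>(localAtInit);
        candidates.removeAll(remote);
        candidates.removeAll(sharedWorkingSet);
        return candidates;
    }
}
```

Because the candidate set is a snapshot taken at initialization, files created by a failed cycle after that snapshot are not covered by this rule on their own.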


> Remove orphan file from local directory in case indexing fails
> --------------------------------------------------------------
>
>                 Key: OAK-7065
>                 URL: https://issues.apache.org/jira/browse/OAK-7065
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Priority: Major
>             Fix For: 1.10
>
>
> If an indexing cycle fails for some reason, it may leave orphan files in the local 
> directory. In the next indexing cycle Lucene would then try to create files 
> with the same names on local disk, and this may fail on Windows, where such files 
> may have been memory mapped and hence cannot be deleted.
> We should analyze such a scenario and see whether the system can handle the failure 
> case properly.



