[ https://issues.apache.org/jira/browse/OAK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346271#comment-16346271 ]
Chetan Mehrotra edited comment on OAK-7065 at 1/31/18 5:24 AM:
---------------------------------------------------------------

IndexUpdate should support a close operation for IndexEditor which should be invoked at the end of the indexing cycle, both in case of success and failure. Currently index editors like LuceneIndexEditor implement this close logic by relying on the {{leave}} call and checking whether the parent is null. However, if there is an exception in any other editor then the editors do not get a chance to close.

So in case of failure LuceneIndexEditor should
# Close the IndexWriter
# -Also ensure that any new files which have been added by the IndexWriter get removed-

Note that cleaning here may not cover all cases. For example, if the IndexUpdate completes fine but the AsyncIndexUpdate merge fails, then the new files should also be considered orphans and be removed.

This cleaning would need to be performed for both the index open for read and the index open for write phases
* Index open for read - Here it may happen that the previous indexing failed (say the merge failed) and did not get a chance to clean up. So when a new reader is closed it may pull down files with the same names, which would conflict with the existing files. So it would need to clean the local index of any such files first, before pulling down new files from the remote directory
* Index open for write - Here it can also happen that the previous indexing failed and left some files behind. Further, suppose this indexing happens on a different cluster node, i.e. different from the one where the last indexing happened. Here it would try to download files with the same names and thus lead to a corrupt index scenario.

*Current Index File Garbage Collection Logic*

Currently the files are removed by the COR directory upon close. It removes those files which are present in the local directory at the time the COR is initialized but not present in the remote directory.
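
As a rough illustration of the rule just described (present locally at initialization, absent remotely), here is a minimal sketch. It is not the actual COR implementation in Oak; the class name {{OrphanFileSketch}}, the method {{findOrphans}} and the file names are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class OrphanFileSketch {

    /**
     * Hypothetical helper: returns the names of files that were present in the
     * local directory when it was initialized but are absent from the remote
     * directory. The input sets are not modified.
     */
    static Set<String> findOrphans(Set<String> localAtInit, Set<String> remote) {
        Set<String> orphans = new TreeSet<>(localAtInit); // copy, keep sorted
        orphans.removeAll(remote);                        // set difference
        return orphans;
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only
        Set<String> local = new TreeSet<>(Arrays.asList("_0.cfs", "_1.cfs", "segments_1"));
        Set<String> remote = new TreeSet<>(Arrays.asList("_0.cfs", "segments_1"));
        System.out.println("Candidates for removal: " + findOrphans(local, remote));
    }
}
```

The sketch only computes the candidate set; whether and when a candidate can actually be deleted is a separate concern.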
In doing so it also accounts for any new file opened by COW via the shared working set managed in IndexCopier.


was (Author: chetanm):
IndexUpdate should support a close operation for IndexEditor which should be invoked at the end of the indexing cycle, both in case of success and failure. Currently index editors like LuceneIndexEditor implement this close logic by relying on the {{leave}} call and checking whether the parent is null. However, if there is an exception in any other editor then the editors do not get a chance to close.

So in case of failure LuceneIndexEditor should
# Close the IndexWriter
# Also ensure that any new files which have been added by the IndexWriter get removed

> Remove orphan file from local directory in case indexing fails
> --------------------------------------------------------------
>
>                 Key: OAK-7065
>                 URL: https://issues.apache.org/jira/browse/OAK-7065
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Priority: Major
>             Fix For: 1.10
>
>
> If an indexing cycle fails for some reason it may leave orphan files in the local
> directory. Later on, in the next indexing cycle, Lucene would try to create files
> with the same names on local disk, and this may fail on Windows where such files
> may have been memory mapped and hence cannot be deleted.
> We should analyze such a scenario and see if the system can handle the failure
> case properly



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)