[ https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881257#comment-16881257 ]
Julian Reschke commented on OAK-3629:
-------------------------------------

trunk: (1.5.5) [r1751420|http://svn.apache.org/r1751420] [r1751236|http://svn.apache.org/r1751236] [r1750842|http://svn.apache.org/r1750842] [r1750769|http://svn.apache.org/r1750769]

> Index corruption seen with CopyOnRead when index definition is recreated
> ------------------------------------------------------------------------
>
>                 Key: OAK-3629
>                 URL: https://issues.apache.org/jira/browse/OAK-3629
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Blocker
>              Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
>             Fix For: 1.5.5, 1.6.0
>
>
> The CopyOnRead logic relies on {{reindexCount}} to determine the name of the
> directory into which index files are copied. In the normal flow, when the
> index is reindexed this count is increased and the newer index files are
> copied to a new directory.
> However, if the index definition node gets recreated by some deployment
> process, this count is reset to 0. As a result, index files newly created by
> reindexing are copied to an already existing directory, and that can lead to
> corruption.
> So what happened here was:
> # The system started with index definition I1 and indexing completed with
> index files saved under index/hash(indexpath)/1 (where 1 is the current
> reindex count).
> # A new index definition package was deployed, which reset the reindex count.
> A reindex then happened again and the CopyOnRead logic, per the current
> design, reused the existing index directory. It so happens that Lucene
> creates files with the same name and the same size but different content.
> This defeats CopyOnRead's length-based index corruption check and thus
> causes the new Lucene index to become corrupt.
> *Note that the corruption here is transient, i.e. the persisted index is not
> corrupted*. Only the locally copied index gets corrupted. Cleaning up the
> index directory fixes the issue and can be used as a workaround.
> *Fix*
> After discussing with [~tmueller], the following approach can be used.
> Instead of relying on the reindex count, we can maintain a hidden, randomly
> generated uuid and store it in the index config. This would be used to derive
> the name of the directory on the filesystem. If the index definition gets
> reset, the uuid can be regenerated.
> *Workaround*
> Clean the directory used by CopyOnRead, which is <repo home>/index, before
> restart.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
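
The directory-naming problem and the proposed fix from the issue description can be illustrated with a small sketch. This is not Oak code: the class {{IndexDirNaming}}, its methods, the example index path and the use of SHA-256 as a stand-in for hash(indexpath) are all assumptions made for illustration. It also assumes the reindex count is back at 1 after the definition is recreated and reindexed, as the scenario in the description suggests.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Illustrative only: none of these names come from the Oak code base.
public class IndexDirNaming {

    // Hypothetical stand-in for hash(indexpath): hex-encoded SHA-256 of the path.
    public static String hashPath(String indexPath) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(indexPath.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Naming scheme described as the old behaviour: directory derived from reindexCount.
    public static String dirByReindexCount(String indexPath, long reindexCount) {
        return "index/" + hashPath(indexPath) + "/" + reindexCount;
    }

    // Naming scheme proposed in the fix: directory derived from a random uuid.
    public static String dirByUid(String indexPath, String uid) {
        return "index/" + hashPath(indexPath) + "/" + uid;
    }

    // A fresh uuid would be generated whenever the index definition is (re)created.
    public static String newUid() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String path = "/oak:index/lucene"; // hypothetical index path

        // Original definition after its first reindex, and the recreated
        // definition after its first reindex: both are assumed to have count 1.
        String before = dirByReindexCount(path, 1);
        String after = dirByReindexCount(path, 1);
        System.out.println("reindexCount scheme reuses directory: " + before.equals(after));

        // With a uuid regenerated on recreation, the directories differ.
        String uidBefore = dirByUid(path, newUid());
        String uidAfter = dirByUid(path, newUid());
        System.out.println("uuid scheme reuses directory: " + uidBefore.equals(uidAfter));
    }
}
```

With the count-based name, both generations of the index resolve to the same local directory, which is the precondition for the stale-file reuse described above. With a regenerated uuid, the recreated definition gets a directory of its own and the old local copy is never consulted.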