[ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881257#comment-16881257
 ] 

Julian Reschke commented on OAK-3629:
-------------------------------------

trunk: (1.5.5) [r1751420|http://svn.apache.org/r1751420] 
[r1751236|http://svn.apache.org/r1751236] 
[r1750842|http://svn.apache.org/r1750842] 
[r1750769|http://svn.apache.org/r1750769]

> Index corruption seen with CopyOnRead when index definition is recreated
> ------------------------------------------------------------------------
>
>                 Key: OAK-3629
>                 URL: https://issues.apache.org/jira/browse/OAK-3629
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Blocker
>              Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
>             Fix For: 1.5.5, 1.6.0
>
>
> The CopyOnRead logic relies on {{reindexCount}} to determine the name of the 
> directory into which index files are copied. In the normal flow, when an 
> index is reindexed this count is incremented, and the newer index files are 
> copied to a new directory.
> However, if the index definition node is recreated by some deployment 
> process, this count is reset to 0. As a result, the index files created by 
> the next reindexing are copied into an already existing directory, which can 
> lead to corruption.
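The collision described above can be sketched as follows. This is a minimal illustration, not Oak's actual code: the class name, the `hash` helper (SHA-256 here), the `localDir` method, and the index path `/oak:index/lucene` are all assumptions made for the example. Only the layout index/hash(indexpath)/<reindex count> comes from the issue text.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ReindexCountDirName {

    // Hypothetical hash of the index path; the hash Oak really uses may differ.
    static String hash(String indexPath) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(indexPath.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Local directory derived only from the index path and the reindex count.
    static Path localDir(Path indexRoot, String indexPath, long reindexCount) {
        return indexRoot.resolve(hash(indexPath)).resolve(Long.toString(reindexCount));
    }

    public static void main(String[] args) {
        Path root = Paths.get("index");
        // First deployment: reindex count is 1.
        Path first = localDir(root, "/oak:index/lucene", 1);
        // Definition recreated: count reset to 0, then the reindex bumps it to 1 again.
        Path afterRecreate = localDir(root, "/oak:index/lucene", 1);
        // Both generations of index files map to the same local directory.
        System.out.println(first.equals(afterRecreate)); // prints "true"
    }
}
```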
> This is what happened here:
> # The system started with index definition I1 and indexing completed, with 
> index files saved under index/hash(indexpath)/1 (where 1 is the current 
> reindex count).
> # A new index definition package was deployed, which reset the reindex 
> count. Reindexing then happened again and the CopyOnRead logic, per its 
> current design, reused the existing index directory. It so happens that 
> Lucene can create a file with the same name and the same size but different 
> content. This defeats CopyOnRead's length-based corruption check and thus 
> causes the new Lucene index to become corrupt.
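To illustrate why a length-based check cannot catch this case, here is a small sketch. The class and method names are hypothetical, and the byte arrays stand in for a stale local file and the newly reindexed file of the same name. The CRC32 comparison is included only to make the content difference visible; it is not the fix proposed in this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class LengthCheckDemo {

    // Length-only comparison, in the spirit of the check described in the issue.
    static boolean sameLength(byte[] local, byte[] remote) {
        return local.length == remote.length;
    }

    // Content checksum, used here purely to show that the contents differ.
    static long crc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Same (hypothetical) file name, same size, different content.
        byte[] staleLocalCopy = "generation-A".getBytes(StandardCharsets.UTF_8);
        byte[] newRemoteFile = "generation-B".getBytes(StandardCharsets.UTF_8);

        // The length check considers the stale local copy valid.
        System.out.println(sameLength(staleLocalCopy, newRemoteFile)); // prints "true"
        // The contents are nevertheless different.
        System.out.println(crc(staleLocalCopy) == crc(newRemoteFile)); // prints "false"
    }
}
```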
> *Note that the corruption here is transient, i.e. the persisted index is not 
> corrupted*; only the locally copied index is. Cleaning up the index directory 
> fixes the issue and can be used as a workaround.
> *Fix*
> After discussion with [~tmueller], the following approach can be used.
> Instead of relying on the reindex count, we can maintain a hidden, randomly 
> generated UUID and store it in the index config. It would be used to derive 
> the name of the directory on the filesystem. If the index definition gets 
> reset, the UUID is regenerated.
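A minimal sketch of the proposed approach, under stated assumptions: the index config is modelled as a plain `Map`, and the property name `:uid` as well as the class and method names are hypothetical, not taken from the actual patch. The point is only that a recreated definition starts without the hidden property and therefore gets a fresh directory name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class UuidDirName {

    // Hypothetical name for the hidden property holding the generated UUID.
    static final String UID_PROP = ":uid";

    // Returns the stored UUID, generating and storing one if it is absent.
    static String dirName(Map<String, String> indexConfig) {
        return indexConfig.computeIfAbsent(UID_PROP, k -> UUID.randomUUID().toString());
    }

    public static void main(String[] args) {
        Map<String, String> definition = new HashMap<>();
        String first = dirName(definition);
        // Same definition: the stored UUID is reused, so the directory is stable.
        System.out.println(first.equals(dirName(definition))); // prints "true"

        // Definition recreated by a deployment: the hidden property is gone,
        // so a new UUID and hence a new directory name is generated.
        Map<String, String> recreated = new HashMap<>();
        System.out.println(first.equals(dirName(recreated))); // prints "false"
    }
}
```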
> *Workaround*
> Clean the directory used by CopyOnRead, which is <repo home>/index, before 
> restarting.
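The workaround could be scripted roughly as below. This is a sketch only: the class name is hypothetical, the repository home is passed in as an argument rather than assumed, and it should be run only while the repository is stopped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class CleanCopyOnReadDir {

    // Deletes the given directory and everything below it; no-op if it is absent.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CleanCopyOnReadDir <repo home>");
            return;
        }
        // <repo home>/index is the CopyOnRead directory named in the issue.
        deleteRecursively(Paths.get(args[0]).resolve("index"));
    }
}
```

Because the persisted index is not affected, the local copies are expected to be fetched again after the restart.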



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
