[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2019-07-09 Thread Thomas Mueller (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-3629:

Labels:   (was: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4)

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Blocker
> Fix For: 1.5.5, 1.6.0
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-07-04 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Labels: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4  (was: )

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Blocker
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.6, 1.5.5
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-02-01 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Fix Version/s: (was: 1.3.15)
   1.6

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Blocker
> Fix For: 1.6
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-01-27 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3629:
---
Fix Version/s: (was: 1.4)
   1.3.15

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.3.15
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-01-27 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3629:
---
Priority: Blocker  (was: Major)

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Blocker
> Fix For: 1.3.15
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-01-20 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3629:
---
Priority: Major  (was: Minor)

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.4
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2016-01-03 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Fix Version/s: (was: 1.3.13)
   1.4

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.4
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2015-12-07 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Fix Version/s: (was: 1.3.12)
   1.3.13

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.13
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 
> *Workaround*
> Clean the directory used by CopyOnRead which is /index before 
> restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2015-11-24 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Description: 
CopyOnRead logic relies on {{reindexCount}} to determine the name of directory 
in which index files would be copied. In normal flow if the index is reindexed 
then this count would get increased and newer index files would get copied to a 
new directory.

However if the index definition node gets recreated due to some deployment 
process then this count gets reset to 0. Due to which newly created index files 
from reindexing would start getting copied to already existing directory and 
that can lead to corruption.

So what happened here was
# System started with index definition I1 and indexing got complete with index 
files saved under index/hash(indexpath)/1 (where 1 is current reindex count)
# A new index definition package was deployed which reset the index count. Now 
reindex happened again and the CopyOnRead logic per current design reused the 
existing index directory. And it so happens that Lucene create file with same 
name and same size but different content. This trips the CopyOnRead defense of 
length based index corruption check and thus cause new lucene index to corrupt

*Note that here corruption is transient i.e. persisted index is not corrupted*. 
Just that locally copied index gets corrupted. Cleaning up the index directory 
would fix the issue and that can be used as a workaround.

*Fix*

After discussing with [~tmueller] following approach can be used.

Instead of relying on reindex count we can maintain a hidden randomly generated 
uuid and store it in the index config. This would be used to derive the name of 
directory on filesystem. If the index definition gets reset then the uuid can 
be regenerated. 

*Workaround*

Clean the directory used by CopyOnRead which is /index before restart



  was:
CopyOnRead logic relies on {{reindexCount}} to determine the name of directory 
in which index files would be copied. In normal flow if the index is reindexed 
then this count would get increased and newer index files would get copied to a 
new directory.

However if the index definition node gets recreated due to some deployment 
process then this count gets reset to 0. Due to which newly created index files 
from reindexing would start getting copied to already existing directory and 
that can lead to corruption.

So what happened here was
# System started with index definition I1 and indexing got complete with index 
files saved under index/hash(indexpath)/1 (where 1 is current reindex count)
# A new index definition package was deployed which reset the index count. Now 
reindex happened again and the CopyOnRead logic per current design reused the 
existing index directory. And it so happens that Lucene create file with same 
name and same size but different content. This trips the CopyOnRead defense of 
length based index corruption check and thus cause new lucene index to corrupt

*Note that here corruption is transient i.e. persisted index is not corrupted*. 
Just that locally copied index gets corrupted. Cleaning up the index directory 
would fix the issue and that can be used as a workaround.

*Fix*

After discussing with [~tmueller] following approach can be used.

Instead of relying on reindex count we can maintain a hidden randomly generated 
uuid and store it in the index config. This would be used to derive the name of 
directory on filesystem. If the index definition gets reset then the uuid can 
be regenerated. 




> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.12
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> t

[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2015-11-22 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain updated OAK-3629:
---
Fix Version/s: (was: 1.0.25)
   (was: 1.2.9)
   (was: 1.3.11)
   1.3.12

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.12
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2015-11-16 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3629:
--
Fix Version/s: (was: 1.0.24)
   1.0.25

> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.11, 1.2.9, 1.0.25
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it in the index config. This would be used to derive 
> the name of directory on filesystem. If the index definition gets reset then 
> the uuid can be regenerated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3629) Index corruption seen with CopyOnRead when index defnition is recreated

2015-11-13 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3629:
-
Description: 
CopyOnRead logic relies on {{reindexCount}} to determine the name of directory 
in which index files would be copied. In normal flow if the index is reindexed 
then this count would get increased and newer index files would get copied to a 
new directory.

However if the index definition node gets recreated due to some deployment 
process then this count gets reset to 0. Due to which newly created index files 
from reindexing would start getting copied to already existing directory and 
that can lead to corruption.

So what happened here was
# System started with index definition I1 and indexing got complete with index 
files saved under index/hash(indexpath)/1 (where 1 is current reindex count)
# A new index definition package was deployed which reset the index count. Now 
reindex happened again and the CopyOnRead logic per current design reused the 
existing index directory. And it so happens that Lucene create file with same 
name and same size but different content. This trips the CopyOnRead defense of 
length based index corruption check and thus cause new lucene index to corrupt

*Note that here corruption is transient i.e. persisted index is not corrupted*. 
Just that locally copied index gets corrupted. Cleaning up the index directory 
would fix the issue and that can be used as a workaround.

*Fix*

After discussing with [~tmueller] following approach can be used.

Instead of relying on reindex count we can maintain a hidden randomly generated 
uuid and store it in the index config. This would be used to derive the name of 
directory on filesystem. If the index definition gets reset then the uuid can 
be regenerated. 



  was:
CopyOnRead logic relies on {{reindexCount}} to determine the name of directory 
in which index files would be copied. In normal flow if the index is reindexed 
then this count would get increased and newer index files would get copied to a 
new directory.

However if the index definition node gets recreated due to some deployment 
process then this count gets reset to 0. Due to which newly created index files 
from reindexing would start getting copied to already existing directory and 
that can lead to corruption.

Note that here corruption is transient i.e. persisted index is not corrupted. 
Just that locally copied index gets corrupted. Cleaning up the index directory 
would fix the issue and that can be used as a workaround.

*Fix*

After discussing with [~tmueller] following approach can be used.

Instead of relying on reindex count we can maintain a hidden randomly generated 
uuid and store it in the index config. This would be used to derive the name of 
directory on filesystem. If the index definition gets reset then the uuid can 
be regenerated. 




> Index corruption seen with CopyOnRead when index defnition is recreated
> ---
>
> Key: OAK-3629
> URL: https://issues.apache.org/jira/browse/OAK-3629
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.11, 1.0.24, 1.2.9
>
>
> CopyOnRead logic relies on {{reindexCount}} to determine the name of 
> directory in which index files would be copied. In normal flow if the index 
> is reindexed then this count would get increased and newer index files would 
> get copied to a new directory.
> However if the index definition node gets recreated due to some deployment 
> process then this count gets reset to 0. Due to which newly created index 
> files from reindexing would start getting copied to already existing 
> directory and that can lead to corruption.
> So what happened here was
> # System started with index definition I1 and indexing got complete with 
> index files saved under index/hash(indexpath)/1 (where 1 is current reindex 
> count)
> # A new index definition package was deployed which reset the index count. 
> Now reindex happened again and the CopyOnRead logic per current design reused 
> the existing index directory. And it so happens that Lucene create file with 
> same name and same size but different content. This trips the CopyOnRead 
> defense of length based index corruption check and thus cause new lucene 
> index to corrupt
> *Note that here corruption is transient i.e. persisted index is not 
> corrupted*. Just that locally copied index gets corrupted. Cleaning up the 
> index directory would fix the issue and that can be used as a workaround.
> *Fix*
> After discussing with [~tmueller] following approach can be used.
> Instead of relying on reindex count we can maintain a hidden randomly 
> generated uuid and store it i