[ 
https://issues.apache.org/jira/browse/OAK-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312866#comment-16312866
 ] 

Chetan Mehrotra edited comment on OAK-7122 at 1/5/18 10:24 AM:
---------------------------------------------------------------

Implemented the script at [1]. Currently it builds up the structure in memory. 
If this proves to be problematic for large indexes, we can look into building the 
structure on the file system.

*Usage*

{code}
java -DindexPath=/path/to/indexing-result/indexes/lucene/data \
        -jar oak-run-*.jar \
        console /path/to/segmentstore \
    ":load https://raw.githubusercontent.com/chetanmeh/oak-console-scripts/master/src/main/groovy/lucene/luceneIndexDumper.groovy";
{code}

[1] 
https://github.com/chetanmeh/oak-console-scripts/tree/master/src/main/groovy/lucene


> Implement script to compare lucene indexes logically
> ----------------------------------------------------
>
>                 Key: OAK-7122
>                 URL: https://issues.apache.org/jira/browse/OAK-7122
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
>
>
> With Document Traversal based indexing we have implemented a newer indexing 
> logic. To validate that the index produced by it is the same as the one 
> produced by the existing indexing flow, we need to implement a script which 
> enables comparing the index content logically.
> This was recently discussed on the Lucene mailing list [1], and the suggestion 
> there was that it can be done by un-inverting the index. To enable that we 
> need to implement a script which can:
> # Open a Lucene index
> # Map each Lucene Document to the path of its node
> # For each document, determine which fields are associated with it (stored 
> and non-stored)
> # Dump this content to a file sorted by path, with the fields on each line 
> sorted by name
> Such dumps can then be generated for the old and new indexes and compared via 
> a simple text diff.
> [1] http://lucene.markmail.org/thread/wt22gk6aufs4uz55
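
The steps in the description above can be illustrated with a minimal Python sketch. This is not the luceneIndexDumper.groovy script and it does not use the Lucene API. It assumes the inverted index has already been read into a plain dict of the form {field: {term: [paths]}}, which is a hypothetical stand-in for Lucene's term dictionary, and it does not model stored fields. The function names uninvert, dump and compare are made up for this illustration.

```python
from collections import defaultdict
import difflib


def uninvert(inverted_index):
    """Turn {field: {term: [paths]}} into {path: {field: sorted terms}}."""
    docs = defaultdict(lambda: defaultdict(set))
    for field, postings in inverted_index.items():
        for term, paths in postings.items():
            for path in paths:
                docs[path][field].add(term)
    # Freeze into plain dicts with sorted term lists so output is deterministic.
    return {
        path: {field: sorted(terms) for field, terms in fields.items()}
        for path, fields in docs.items()
    }


def dump(docs):
    """Render one line per document, sorted by path, fields sorted by name."""
    lines = []
    for path in sorted(docs):
        fields = docs[path]
        rendered = ", ".join(
            f"{name}=[{','.join(fields[name])}]" for name in sorted(fields)
        )
        lines.append(f"{path}|{rendered}")
    return lines


def compare(old_index, new_index):
    """Return a unified diff of the two dumps; an empty list means no difference."""
    old_lines = dump(uninvert(old_index))
    new_lines = dump(uninvert(new_index))
    return list(
        difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm="")
    )


if __name__ == "__main__":
    # Same logical content, different physical ordering (assumed sample data).
    old = {"title": {"oak": ["/a", "/b"], "lucene": ["/a"]}, "tags": {"x": ["/b"]}}
    new = {"tags": {"x": ["/b"]}, "title": {"lucene": ["/a"], "oak": ["/b", "/a"]}}
    for line in dump(uninvert(old)):
        print(line)
    print("differences:", compare(old, new))
```

Because both the paths and the field names are sorted before rendering, two indexes with the same logical content produce identical dumps regardless of the order in which terms and documents were written, so a plain text diff only reports real differences.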



