[ 
https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711507#comment-16711507
 ] 

Thomas Mueller commented on OAK-7947:
-------------------------------------

The attached patch solves the issue. It contains various changes; some of 
them are possibly not needed, and some might be incorrect or problematic. This 
is work in progress. Still, it would be nice to get some feedback from those 
who are more familiar with this code, for example [~catholicon] [~teofili] 
[~chetanm]. Changes I made:

* IndexTracker.getIndexDefinition now constructs the node and returns it if 
the index isn't in the indices map yet. I don't know why it returned null 
before; it seems wrong to me.
* LuceneIndexNodeManager always opened the index; I don't know why. 
SearcherHolder no longer always does that: I basically made SearcherHolder 
open the index lazily.
* LucenePropertyIndex.acquireIndexNode is called when planning, and that 
method opens the index files; I don't know why. I created a class 
LazyLuceneIndexNode that wraps LuceneIndexNode and creates it lazily.
* OakStreamingIndexFile now logs the directory name as well, not just the file 
name.
* DefaultIndexReader now opens the directory (DirectoryReader.open) lazily, 
only when getReader is called.
* FulltextIndexPlanner.estimatedEntryCount now only calls getNumDocs when 
really needed (that is, only if "entryCount" isn't set in the index 
definition). That should avoid having to open the index if we know the 
entryCount is high.
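
To illustrate the idea behind the lazy changes above, here is a minimal sketch. 
It is not taken from the patch: the class names Lazy, EntryCountEstimator and 
LazyOpenDemo are hypothetical, and a negative value standing for "entryCount 
not set" is an assumption made only for this example. The expensive open 
happens on first use, and the planner-side estimate skips it entirely when a 
configured entry count is available.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical lazy holder: runs the expensive "open" at most once, on first get(). */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> opener;
    private T value;        // guarded by "this"
    private boolean opened; // guarded by "this"

    Lazy(Supplier<T> opener) {
        this.opener = opener;
    }

    @Override
    public synchronized T get() {
        if (!opened) {
            value = opener.get(); // the costly step, e.g. opening index files
            opened = true;
        }
        return value;
    }

    synchronized boolean isOpened() {
        return opened;
    }
}

/** Hypothetical planner helper: only touches the index if no count is configured. */
final class EntryCountEstimator {
    /** Assumption for this sketch: a negative configuredEntryCount means "not set". */
    static long estimate(long configuredEntryCount, Lazy<Long> numDocs) {
        if (configuredEntryCount >= 0) {
            return configuredEntryCount; // index stays closed
        }
        return numDocs.get(); // falls back to the real document count
    }
}

final class LazyOpenDemo {
    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        Lazy<Long> numDocs = new Lazy<>(() -> {
            opens.incrementAndGet(); // stands in for downloading and opening the index
            return 42L;
        });
        System.out.println(EntryCountEstimator.estimate(1000L, numDocs));
        System.out.println("opened after configured estimate: " + numDocs.isOpened());
        System.out.println(EntryCountEstimator.estimate(-1L, numDocs));
        System.out.println("number of opens: " + opens.get());
    }
}
```

The synchronized get() keeps the sketch simple; whether the real classes need 
that, or a different locking scheme, depends on how they are used.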

> Lazy loading of Lucene index files startup
> ------------------------------------------
>
>                 Key: OAK-7947
>                 URL: https://issues.apache.org/jira/browse/OAK-7947
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>         Attachments: OAK-7947.patch
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the 
> first query is run, to do cost calculation). This is a performance problem if 
> the index files are large, and need to be downloaded from the data store.


