[ https://issues.apache.org/jira/browse/OAK-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724965#comment-16724965 ]
Vikas Saurabh commented on OAK-7947:
------------------------------------

Attached a zip - [^lucene-index-open-access.zip] which contains:
* logging-directory.patch - a patch that adds a logging {{Directory}} implementation
* open-close-dir-calls.txt - all calls that the patch logged for an 11G damAssetLucene index (same one I listed above)
* open-dir-calls.txt - all calls to simply open the index
* close-dir-calls.txt - calls to close the index

A few things that were quite interesting:
* *All* index files were read, although mostly only a few reads were incurred
* seeks were only incurred on {{.tim}}, {{.tip}} and {{.cfs}} files - {{.cfs}} files tended to be in the 100MB range
* seeks in {{.tim}} and {{.cfs}} went backwards too - so they could require opening the input stream multiple times
* only a few reads occur even after a seek (there could be other useful patterns to find as well)

> Lazy loading of Lucene index files startup
> ------------------------------------------
>
>                 Key: OAK-7947
>                 URL: https://issues.apache.org/jira/browse/OAK-7947
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>         Attachments: OAK-7947.patch, lucene-index-open-access.zip
>
>
> Right now, all Lucene index binaries are loaded on startup (I think when the
> first query is run, to do cost calculation). This is a performance problem if
> the index files are large, and need to be downloaded from the data store.
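
To illustrate why backward seeks matter when a file is only available as a forward-only stream, here is a minimal sketch. It is not taken from the attached patch and uses neither the Oak nor the Lucene API: {{BackwardSeekDemo}}, {{ForwardOnlyReader}}, {{readAt}} and {{opensFor}} are hypothetical names, and an in-memory byte array stands in for a binary that would be downloaded from the data store. The reader emulates random access over an {{InputStream}} and counts how often the stream has to be reopened.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

final class BackwardSeekDemo {

    /** Hypothetical helper: random-access reads emulated on a forward-only stream. */
    static final class ForwardOnlyReader {
        private final Supplier<InputStream> source; // opens a fresh stream at offset 0
        private InputStream in;
        private long pos;   // offset of the next byte the current stream would return
        private int opens;  // how many times a stream was opened

        ForwardOnlyReader(Supplier<InputStream> source) {
            this.source = source;
        }

        /** Reads one byte at the given absolute offset, or returns -1 past the end. */
        int readAt(long offset) throws IOException {
            if (in == null || offset < pos) {
                // A forward-only stream cannot rewind: discard it and start over.
                if (in != null) {
                    in.close();
                }
                in = source.get();
                pos = 0;
                opens++;
            }
            while (pos < offset) {
                long skipped = in.skip(offset - pos);
                if (skipped <= 0) {
                    // skip() may legally return 0; fall back to reading one byte.
                    if (in.read() < 0) {
                        return -1; // end of stream
                    }
                    skipped = 1;
                }
                pos += skipped;
            }
            int b = in.read();
            if (b >= 0) {
                pos++;
            }
            return b;
        }

        int opens() {
            return opens;
        }
    }

    /** Replays the offsets against a 256-byte in-memory "file"; returns the number of opens. */
    static int opensFor(long... offsets) {
        byte[] file = new byte[256];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) i;
        }
        ForwardOnlyReader reader = new ForwardOnlyReader(() -> new ByteArrayInputStream(file));
        try {
            for (long offset : offsets) {
                reader.readAt(offset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reader.opens();
    }

    public static void main(String[] args) {
        System.out.println(opensFor(10, 50, 200)); // forward-only access: prints 1
        System.out.println(opensFor(200, 50, 10)); // every read seeks backwards: prints 3
    }
}
```

With forward-only access a single open is enough, while each backward seek forces another open. In this sketch that would mean another fetch of the same binary, unless the content is cached or the backing store supports range reads.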