[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Ard Schrijvers (JIRA) Mon, 27 Aug 2007 08:49:53 -0700

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523033 ]


Ard Schrijvers commented on JCR-1064:
-------------------------------------

Aaah, I am sorry for the System.out. I replaced a patch and left a System.out 
call in it. Stupid, I'll remove it! I'll fix the other 4 (-) points as well.

About the parent handler: I knew that the system index can be in the old format, 
but AFAICS this is never an issue. When I am searching the index for workspace 
X, I think it does not matter whether the parent index is in the old format. (I 
ran the tests with the parent index in the old format and the workspace index 
in the new format, and this is no problem.)

As for your example:

A user may do the following:
- Upgrade a pre 1.4 repository (-> all indexes are V1)
- Re-index a workspace (-> workspace index will be V2)
- Execute a query on the workspace (-> will use V2 for queries) 

This runs fine; I tested it exactly this way. Workspaces with the old index 
format can coexist with workspaces in the new format, and the system index can 
be in either format.

It is hard to make this nicely backwards compatible, because MultiIndex creates 
the index itself when none exists yet.

For example, when in SearchIndex.doInit() the following line is executed

index = new MultiIndex(indexDir, this, context.getItemStateManager(),
                context.getRootId(), excludedIDs, nsMappings);

the system index is created. Because this happens *before* the 
setIndexFormatVersion part of doInit(), this part in NodeIndexer

if (indexFormatVersion == IndexFormatVersion.V2) {
    addPropertyName(doc, propState.getName());
}

will never be called, since indexFormatVersion == null at that point. This 
means the system index is always indexed without the PROPERTIES_SET field, and 
is therefore always in the old format.
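
A minimal sketch of the ordering problem described above. InitOrderSketch and 
its members are hypothetical, simplified stand-ins and not the real Jackrabbit 
classes; the only thing taken from the discussion is the V2 guard and the fact 
that the version field is still null while the first documents are indexed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in; not the real SearchIndex/NodeIndexer.
class InitOrderSketch {
    enum FormatVersion { V1, V2 }

    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    private FormatVersion indexFormatVersion; // stays null until explicitly set
    final List<String> indexedFields = new ArrayList<>();

    // Mirrors the guard quoted above: only V2 records the property name.
    private void indexProperty(String propertyName) {
        indexedFields.add(propertyName);
        if (indexFormatVersion == FormatVersion.V2) {
            indexedFields.add(PROPERTIES_SET + "=" + propertyName);
        }
    }

    // Order described above: the initial indexing runs first,
    // the format version is assigned only afterwards.
    void initIndexThenVersion() {
        indexProperty("jcr:primaryType");
        indexFormatVersion = FormatVersion.V2;
    }

    public static void main(String[] args) {
        InitOrderSketch sketch = new InitOrderSketch();
        sketch.initIndexThenVersion();
        // No PROPERTIES_SET entry is present, because the guard compared
        // against a null version while the property was being indexed.
        System.out.println(sketch.indexedFields);
    }
}
```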

Now, I just tested setting the default index format before the new MultiIndex 
call, like:

setIndexFormatVersion(IndexFormatVersion.V2);
index = new MultiIndex(indexDir, this, context.getItemStateManager(),
        context.getRootId(), excludedIDs, nsMappings);

which might later in doInit() be set to V1.

So when a new index is created here, I get an index with the PROPERTIES_SET. 
But... I do not know whether the new MultiIndex(...) call also indexes when an 
index already exists, in which case it might index PROPERTIES_SET while the 
index should stay in the old format. I hope I am somewhat clear about the 
problems? :-)
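
The open question can be modelled roughly as follows. This is only a sketch 
under stated assumptions: ReorderedInitSketch, its doInit() parameters and the 
"doc" strings are hypothetical, and the constructorIndexes flag stands for the 
unknown behaviour of new MultiIndex(...) on an existing index, which the real 
code may or may not exhibit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the reordered doInit(); names are illustrative only.
class ReorderedInitSketch {
    enum FormatVersion { V1, V2 }

    FormatVersion indexFormatVersion;
    final List<String> writtenDocs = new ArrayList<>();

    /**
     * @param existingVersion    version of an index already on disk, or null if none
     * @param constructorIndexes assumption: whether index construction writes
     *                           documents even when an index already exists
     */
    void doInit(FormatVersion existingVersion, boolean constructorIndexes) {
        indexFormatVersion = FormatVersion.V2; // default set before construction

        if (existingVersion == null || constructorIndexes) {
            // Stands in for whatever index construction writes at this point.
            writtenDocs.add(indexFormatVersion == FormatVersion.V2
                    ? "doc+PROPERTIES_SET" : "doc");
        }

        if (existingVersion == FormatVersion.V1) {
            indexFormatVersion = FormatVersion.V1; // later fallback to old format
        }
    }

    public static void main(String[] args) {
        ReorderedInitSketch fresh = new ReorderedInitSketch();
        fresh.doInit(null, false);
        System.out.println("fresh index: " + fresh.writtenDocs);

        ReorderedInitSketch risky = new ReorderedInitSketch();
        risky.doInit(FormatVersion.V1, true);
        // A V1 index that received a V2-style document: the case to rule out.
        System.out.println("existing V1: " + risky.writtenDocs
                + " version=" + risky.indexFormatVersion);
    }
}
```

If construction never writes to an existing index (constructorIndexes is 
false in this model), the reordering is harmless for old indexes; otherwise an 
old-format index could end up with documents of both styles.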

I'll re-attach the patch with your first 4 (-) points solved, and wait to see 
whether you can comment on my point about the parent handler.

Thanks for reviewing :-) 






> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //[EMAIL PROTECTED] is transformed into the 
> org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the 
> MatchAllWeight uses the MatchAllScorer. The calculateDocFilter() in 
> MatchAllScorer does not scale and becomes slow for a growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices built without this performance improvement should still 
> work and fall back to the original implementation.
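
The idea in the issue description can be illustrated without Lucene. The 
sketch below is an assumption-laden model, not the patch: 
PropertyExistenceSketch and its methods are hypothetical, plain maps stand in 
for the index, and the per-document scan merely stands in for the original 
implementation, which may work differently in detail.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of "which documents have property X?".
class PropertyExistenceSketch {
    enum FormatVersion { V1, V2 }

    final FormatVersion version;
    // docId -> property names of that document
    final Map<Integer, Set<String>> docs = new HashMap<>();
    // property name -> docIds; plays the role of the PROPERTIES_SET field (V2 only)
    final Map<String, Set<Integer>> propertiesSet = new HashMap<>();

    PropertyExistenceSketch(FormatVersion version) {
        this.version = version;
    }

    void addDocument(int docId, String... propertyNames) {
        Set<String> names = new HashSet<>();
        Collections.addAll(names, propertyNames);
        docs.put(docId, names);
        if (version == FormatVersion.V2) {
            for (String name : names) {
                propertiesSet.computeIfAbsent(name, k -> new HashSet<>()).add(docId);
            }
        }
    }

    Set<Integer> docsWithProperty(String name) {
        if (version == FormatVersion.V2) {
            // Direct lookup by property name.
            return new TreeSet<>(
                    propertiesSet.getOrDefault(name, Collections.<Integer>emptySet()));
        }
        // Fallback for old-format indexes: visit every document.
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> entry : docs.entrySet()) {
            if (entry.getValue().contains(name)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Both formats return the same result set; only the V1 path has to touch every 
document, which is the cost that grows with the number of nodes.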

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
