Hi Chetan,
first of all, thanks for all your great work on this.
I generally agree with you that we need to be on par with JR2 in terms of
capabilities.
Looking at the index configuration in more detail, what about the
following format:
--------------------------------
"indexRules" : {
  "rule0" : {
    "name" : "title", /* Unscoped property */
    "boost" : 2.0
  },
  "rule1" : {
    "type" : "nt:unstructured",
    "name" : "title", /* Scoped property */
    "boost" : 1.5
  },
  "rule2" : {
    "type" : "nt:file",
    "name" : "title", /* Scoped property */
    "boost" : 2.0,
    "condition" : "@priority = 'high'"
  }
}
--------------------------------
The rationale is to handle every rule with the same structure. What do you
think? Would it be feasible?
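
To make the idea concrete, here is a minimal sketch of how rules with a
uniform structure could be resolved for a given node and property. It is
plain Python for illustration only, not Oak code. The names resolve_boost
and condition_holds, the "most specific matching rule wins" policy, and
the reduction of "condition" to a single @prop = 'value' equality check
are all assumptions made for the example; node type inheritance is
ignored.

```python
import re

# Rules in the uniform format (plain dicts standing in for the index
# definition nodes).
INDEX_RULES = {
    "rule0": {"name": "title", "boost": 2.0},
    "rule1": {"type": "nt:unstructured", "name": "title", "boost": 1.5},
    "rule2": {"type": "nt:file", "name": "title", "boost": 2.0,
              "condition": "@priority = 'high'"},
}

# Hypothetical, simplified condition syntax: only "@prop = 'value'".
_CONDITION = re.compile(r"^@([\w:]+)\s*=\s*'([^']*)'$")


def condition_holds(condition, node_props):
    """Evaluate a single equality condition against the node's properties."""
    match = _CONDITION.match(condition.strip())
    if match is None:
        raise ValueError("unsupported condition: " + condition)
    prop, expected = match.groups()
    return node_props.get(prop) == expected


def resolve_boost(rules, node_type, prop_name, node_props, default=1.0):
    """Return the boost of the most specific rule matching the property.

    Specificity (an assumed policy): +1 for a matching "type", +1 for a
    satisfied "condition". Ties keep the rule seen first.
    """
    best_score, best_boost = -1, default
    for rule in rules.values():
        if rule.get("name") != prop_name:
            continue
        score = 0
        if "type" in rule:
            if rule["type"] != node_type:
                continue
            score += 1
        if "condition" in rule:
            if not condition_holds(rule["condition"], node_props):
                continue
            score += 1
        if score > best_score:
            best_score, best_boost = score, rule["boost"]
    return best_boost
```

With these rules, "title" on an nt:unstructured node resolves to 1.5
(rule1 is more specific than rule0), while a property with no matching
rule falls back to the default of 1.0.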
Regards,
Tommaso
2014-11-05 5:57 GMT+01:00 Chetan Mehrotra <[email protected]>:
> Hi Team,
>
> With OAK-2178 some basic support for boosting has been added. However,
> Jackrabbit used to support much more fine-grained boosting [1]. So for
> the boost feature to be usable in real-world scenarios, should we aim to
> implement similar support, i.e. provide
>
> 1. Conditional boosting based on some criteria
> 2. Node level boosting based on NodeType
>
> Q.1 - Should we support all or only some of that? It would introduce
> some complexity, but for the feature to be useful these probably need
> to be supported.
>
> Q.2 - Config format - If we need to support all (or some) of that, we
> would need to decide on the index definition format.
>
> Configuration Format
> -----------------------------
>
> As documented in [2], the new configuration format proposed and used
> with the Lucene Property Index looks like the following:
>
> "assetIndex":
> {
>   "jcr:primaryType":"oak:QueryIndexDefinition",
>   "declaringNodeTypes":"app:Asset",
>   "includePropertyNames":["title", "type"],
>   "type":"lucene",
>   "async":"async",
>   "fulltextEnabled":false,
>   "orderedProps":["jcr:content/jcr:lastModified"],
>   "properties": {
>     "title" : { "boost" : 2.0 }
>   }
> }
>
> This works fine for a property index, where we restrict the
> definition to a specific NodeType and specific property names.
>
> However, for a full text index, which is more generic, we would need a
> way to distinguish properties for specific nodeTypes.
>
> If we need to use the same format to capture index rules at [2], then
> one way would be to capture nodeType-scoped property definitions
> separately:
>
> --------------------
> "properties": {
>   "title" : { "boost" : 2.0 } /* Unscoped property */
> },
> "indexRules" : {
>   "nt:unstructured" : {
>     "properties" : {
>       "title" : { /* Scoped property */
>         "boost" : 1.5
>       }
>     }
>   },
>   "nt:file" : {
>     "boost" : 2.0,
>     "condition" : "@priority = 'high'"
>   }
> }
> -----------------------
>
> With the current design most of the conditions can be supported, except
> those involving an ancestor, as the Oak NodeState model does not allow
> traversing up easily.
>
> Thoughts?
>
> Chetan Mehrotra
> [1] http://wiki.apache.org/jackrabbit/IndexingConfiguration
> [2] http://jackrabbit.apache.org/oak/docs/query/lucene.html
>