[ 
https://issues.apache.org/jira/browse/OAK-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899392#comment-15899392
 ] 

Thomas Mueller commented on OAK-5899:
-------------------------------------

> scale of how-usable is it

Yes. Many relational databases base their [cost 
estimation|https://en.wikipedia.org/wiki/Query_optimization#Cost_estimation] 
on histograms. Even [SQLite supports 
that|https://www.sqlite.org/compile.html#enable_stat4]. The H2 database uses 
"selectivity" on a [per-column 
basis|http://h2database.com/html/functions.html#selectivity].

I think Lucene doesn't provide that, as it's mainly used for fulltext search, 
and not so much for relational queries. But for our case, just having an 
estimate on the number of entries for a certain property value (cardinality) 
would be very useful. A configuration option would help a lot. An "analyze" 
tool for Oak could update those values at runtime, similar to what the SQL 
command "analyze" does for relational databases 
([Oracle|https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_4005.htm],
 [PostgreSQL|https://www.postgresql.org/docs/current/static/sql-analyze.html], 
[MySQL|https://dev.mysql.com/doc/refman/5.7/en/analyze-table.html], 
[SQLite|https://www.sqlite.org/lang_analyze.html], 
[H2|http://h2database.com/html/grammar.html#analyze]).
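
To make the idea concrete, here is a minimal sketch in Java of per-value cardinality statistics that an "analyze" run could refresh. The class {{PropertyCardinalityStats}} and its methods are hypothetical names made up for illustration; they are not part of the Oak or Lucene API, and falling back to the average for unknown values is just one possible choice.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an Oak or Lucene class.
final class PropertyCardinalityStats {

    // Number of indexed entries seen for each property value.
    private final Map<String, Long> countsByValue = new HashMap<>();
    private long totalEntries;

    // Recompute the statistics from scratch, as an "analyze" run might do.
    void analyze(Iterable<String> propertyValues) {
        countsByValue.clear();
        totalEntries = 0;
        for (String value : propertyValues) {
            countsByValue.merge(value, 1L, Long::sum);
            totalEntries++;
        }
    }

    // Estimated number of entries matching [property] = value.
    // Values not seen in the last analyze run fall back to the average
    // number of entries per distinct value (an assumption of this sketch).
    long estimateEntries(String value) {
        Long exact = countsByValue.get(value);
        if (exact != null) {
            return exact;
        }
        if (countsByValue.isEmpty()) {
            return 0;
        }
        return Math.max(1, totalEntries / countsByValue.size());
    }
}
```

A query engine could then compare {{estimateEntries("baz")}} for a restriction like {{\[bar]='baz'}} against the estimates of other available indexes and pick the cheapest one.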

> PropertyDefinitions should allow for some tweakability to declare usefulness
> ----------------------------------------------------------------------------
>
>                 Key: OAK-5899
>                 URL: https://issues.apache.org/jira/browse/OAK-5899
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Vikas Saurabh
>            Priority: Minor
>             Fix For: 1.8
>
>
> At times, we have property definitions which are added to support dense 
> results right out of the index (e.g. {{contains(\*, 'foo') AND 
> \[bar]='baz'}}).
> In such cases, the added property definition "might" not be the best one to 
> answer queries which only have the property restriction (e.g. only 
> {{\[bar]='baz'}}).
> There should be a way for a property definition to declare this. Maybe there 
> are cases of some spectrum too - i.e. not only a boolean usable-or-not, but 
> some kind of scale of how usable it is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)