[
https://issues.apache.org/jira/browse/SOLR-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637502#comment-13637502
]
Linbin Chen edited comment on SOLR-4701 at 4/21/13 8:23 AM:
------------------------------------------------------------
frange already uses PostFilter, but CollectorFilterQParserPlugin can create
other collector filter queries as well.
Use cases for this approach:
case 1: "in" query
This is like the SQL IN operator: "select * from a where user=123 and status in (1,2,3)"
Suppose the field 'status' can take 10 distinct values (0,1,2,3,4,5,6,7,8,9).
The index has 10 million rows, on average 1 million rows per 'status' value.
user:123 may match about 2k rows, while status:\(1 OR 2 OR 3\) matches 3 million rows.
user:123&fq={!cf name=in}status:\(1,2,3\) is faster than user:123 AND status:\(1
OR 2 OR 3\)
The filterCache could be used for the status:(1 OR 2 OR 3) query, but with 10
status values the possible combinations number
C(n,0)+C(n,1)+...+C(n,n)=pow(2,n); for n=10 that is up to 1024 OpenBitSets.
filterCache: 1024 OpenBitSets (maxSize=10 million) use RAM = 1.25G
cf.in uses the FieldCache, RAM = 10M*4 = 40M
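The memory figures above can be checked with a few lines of arithmetic. The sketch below is only an estimate under stated assumptions: one bit per document for each cached OpenBitSet, one 4-byte int per document for the FieldCache entry, and no object overhead. The class and method names are hypothetical and are not part of the patch.

```java
// Back-of-the-envelope memory estimate; all names here are hypothetical.
class FilterCacheMemoryEstimate {

    // Number of distinct subsets of n status values:
    // C(n,0) + C(n,1) + ... + C(n,n) = 2^n.
    static long subsetCount(int n) {
        return 1L << n;
    }

    // Assumption: one bit per document, as in a bit set sized to maxDoc.
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // Assumption: one 4-byte int per document, as in an int[] cache entry.
    static long intFieldCacheBytes(long maxDoc) {
        return maxDoc * 4L;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;
        long filterCacheBytes = subsetCount(10) * bitSetBytes(maxDoc);
        long fieldCacheBytes = intFieldCacheBytes(maxDoc);
        System.out.println("filterCache worst case (bytes): " + filterCacheBytes);
        System.out.println("FieldCache int[] (bytes):       " + fieldCacheBytes);
    }
}
```

With these assumptions the worst case is 1,280,000,000 bytes for the filterCache (1.25G when 1G is counted as 1024M) against 40,000,000 bytes for the FieldCache array.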
In the near-realtime case, the filterCache caches per query, but cf.in caches
per atomicReader, so its hit ratio will be higher.
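To illustrate the idea behind case 1, here is a minimal, library-free sketch of an "in" check applied only to documents that already matched the main query. It is not the code from SOLR-4701.patch: the class name, the method signature, and the use of a plain int[] as a stand-in for a per-segment FieldCache array are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an "in" collector filter; not the patch's code.
class InCollectorFilterSketch {

    // candidateDocs: doc ids that already matched the main query (e.g. user:123).
    // statusByDoc:   stand-in for a per-segment int[] of field values, index = doc id.
    // allowed:       the values listed in status:(1,2,3).
    // not:           mirrors the not=true local param, inverting the test.
    static List<Integer> filter(int[] candidateDocs, int[] statusByDoc,
                                Set<Integer> allowed, boolean not) {
        List<Integer> kept = new ArrayList<>();
        for (int doc : candidateDocs) {
            boolean in = allowed.contains(statusByDoc[doc]);
            if (in != not) { // keep members normally, non-members when not=true
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] statusByDoc = {0, 1, 2, 3, 4, 5};
        int[] candidates = {1, 3, 4};
        Set<Integer> allowed = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(filter(candidates, statusByDoc, allowed, false));
        System.out.println(filter(candidates, statusByDoc, allowed, true));
    }
}
```

The point of the design is that the work is proportional to the number of candidate documents (about 2k in the example above) rather than to the 3 million documents matching the status clause.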
case 2: bit query
This is like an options search, such as Linux file attributes R/W/X (R=100, W=010, X=001).
Assume the bit match logic is query_bit & field_bit != 0
search R OR W
{code}
{!cf name=bit}file_attr:(6)
{code}
I have not uploaded the bit query patch yet. It is easy to implement by
extending CollectorFilterable under CollectorFilterQParserPlugin.
In my approach a long is used to store 54 option bits.
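The bit query of case 2 can be sketched as follows, assuming the intended match test is a bitwise AND between the query mask and the stored field value (a document matches when it shares at least one set bit with the query). The class and constant names are hypothetical; the bit query patch itself has not been uploaded.

```java
// Hypothetical sketch of the bit match test; not the patch's code.
class BitCollectorFilterSketch {

    // File attribute flags as in the example: R=100, W=010, X=001 (binary).
    static final long R = 0b100L;
    static final long W = 0b010L;
    static final long X = 0b001L;

    // Assumption: a document matches when it shares at least one bit
    // with the query mask.
    static boolean matches(long queryBits, long fieldBits) {
        return (queryBits & fieldBits) != 0L;
    }

    public static void main(String[] args) {
        long query = R | W; // 6, as in {!cf name=bit}file_attr:(6)
        System.out.println("R only: " + matches(query, R));
        System.out.println("X only: " + matches(query, X));
        System.out.println("R+X:    " + matches(query, R | X));
    }
}
```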
was (Author: chenlb):
frange already uses PostFilter, but CollectorFilterQParserPlugin can create
other collector filter queries as well.
Use cases for this approach:
case 1: "in" query
This is like the SQL IN operator: "select * from a where user=123 and status in (1,2,3)"
Suppose the field 'status' can take 10 distinct values (0,1,2,3,4,5,6,7,8,9).
The index has 10 million rows, on average 1 million rows per 'status' value.
user:123 may match about 2k rows, while status:\(1 OR 2 OR 3\) matches 3 million rows.
user:123&fq={!cf name=in}status:\(1,2,3\) is faster than user:123 AND status:\(1
OR 2 OR 3\)
The filterCache could be used for the status:(1 OR 2 OR 3) query, but with 10
status values the possible combinations number
C(n,0)+C(n,1)+...+C(n,n)=pow(2,n); for n=10 that is up to 1024 OpenBitSets.
filterCache: 1024 OpenBitSets (maxSize=10 million) use RAM = 1.25G
cf.in uses RAM = 10M*4 = 40M
case 2: bit query
This is like an options search, such as Linux file attributes R/W/X (R=100, W=010, X=001).
Assume the bit match logic is query_bit & field_bit != 0
search R OR W
{code}
{!cf name=bit}file_attr:(6)
{code}
I have not uploaded the bit query patch yet. It is easy to implement by
extending CollectorFilterable under CollectorFilterQParserPlugin.
In my approach a long is used to store 54 option bits.
> CollectorFilterQParserPlugin support Filter Collector at search with
> PostFilter
> -------------------------------------------------------------------------------
>
> Key: SOLR-4701
> URL: https://issues.apache.org/jira/browse/SOLR-4701
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 4.2
> Reporter: Linbin Chen
> Fix For: 4.3
>
> Attachments: SOLR-4701.patch
>
>
> example:
> * {code}fq={!cf name=in}status:(-1, 2){code}
> * {code}fq={!cf name=in not=true}status:(3,4){code}
> * {code}fq={!cf name=range}price:[100 TO 500]{code}
> * {code}fq={!cf name=range}log(page_view):[50 TO 120]{code}
> The in operation works like SQL IN and is faster than an OR boolean query.
> In most cases, range is faster than a TrieField range query in Lucene.
> How to use it:
> add the following to solrconfig.xml
> {code:xml}
> <queryParser name="cf" class="solr.CollectorFilterQParserPlugin"/>
> {code}
> cf does not use the query cache; it uses a PostFilter to filter in the collector