[jira] [Comment Edited] (SOLR-4701) CollectorFilterQParserPlugin support Filter Collector at search with PostFilter

Linbin Chen (JIRA) Sun, 21 Apr 2013 01:25:20 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637502#comment-13637502
 ]


Linbin Chen edited comment on SOLR-4701 at 4/21/13 8:23 AM:
------------------------------------------------------------

frange already uses PostFilter, but CollectorFilterQParserPlugin makes it 
possible to create other collector filter queries.

Use cases for this approach:

case 1: in query

This works like the SQL IN operator: "select * from a where user=123 and status in (1,2,3)".

Suppose the 'status' field takes one of 10 values (0,1,2,3,4,5,6,7,8,9).

The index has 10 million rows, on average 1 million per 'status' value.

user:123 may match about 2k rows, while status:(1 OR 2 OR 3) matches 3 million rows.

user:123&fq={!cf name=in}status:(1,2,3) is faster than user:123 AND status:(1 
OR 2 OR 3), because the filter only has to check the rows that user:123 matches.
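
To make the idea concrete, here is a minimal sketch in plain Java of how such an "in" filter could work at collect time. It is not the code from the attached patch: DocCollector, inFilter and run are hypothetical names, and the int array stands in for per-document field values. The filter looks up the status of each document the main query already matched and passes it on only if the value is in the wanted set (or not in it, when not=true).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of an "in" post filter: it checks only documents the main query already matched. */
final class InPostFilterSketch {

    /** Minimal stand-in for a collector; hypothetical, not the real Lucene API. */
    interface DocCollector {
        void collect(int doc);
    }

    /** Wraps a delegate and forwards a document only if its status passes the membership test. */
    static DocCollector inFilter(int[] statusByDoc, Set<Integer> wanted, boolean not, DocCollector delegate) {
        return doc -> {
            boolean in = wanted.contains(statusByDoc[doc]);
            if (in != not) {
                delegate.collect(doc);
            }
        };
    }

    /** Feeds the documents matched by the main query (for example user:123) through the filter. */
    static List<Integer> run(int[] statusByDoc, int[] mainQueryHits, Set<Integer> wanted, boolean not) {
        List<Integer> kept = new ArrayList<>();
        DocCollector filter = inFilter(statusByDoc, wanted, not, doc -> kept.add(doc));
        for (int doc : mainQueryHits) {
            filter.collect(doc);
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] statusByDoc = {0, 1, 2, 3, 4, 5, 1, 9}; // one status value per document id
        int[] hits = {1, 3, 4, 6, 7};                 // documents matched by the main query
        Set<Integer> wanted = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(run(statusByDoc, hits, wanted, false)); // [1, 3, 6]
        System.out.println(run(statusByDoc, hits, wanted, true));  // [4, 7]
    }
}
```

The cost of the filter grows with the number of main-query hits (about 2k in the example), not with the 3 million rows that match the status values.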

The filterCache could be used for the status:(1 OR 2 OR 3) query, but 10 status 
values can be combined in C(n,0)+C(n,1)+...+C(n,n)=pow(2,n) ways; with n=10 
that means 1024 OpenBitSets. 

filterCache: 1024 OpenBitSets (maxSize=10 million) need RAM = 1.25G

cf.in uses the FieldCache and needs RAM = 10M*4 = 40M
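
The arithmetic behind these two figures can be checked with a short sketch. The class and method names are made up for illustration, and the numbers assume one bit per document for each cached bit set and one 4-byte int per document for the field values; real object overhead is ignored.

```java
/** Back-of-the-envelope memory comparison for the figures quoted above. */
final class FilterMemorySketch {

    /** Number of subsets of n distinct values: C(n,0) + C(n,1) + ... + C(n,n) = 2^n. */
    static long subsetCount(int n) {
        long total = 0;
        long binom = 1; // C(n,0)
        for (int k = 0; k <= n; k++) {
            total += binom;
            binom = binom * (n - k) / (k + 1); // C(n,k+1) derived from C(n,k)
        }
        return total;
    }

    /** Bytes needed for `sets` bit sets holding one bit per document. */
    static long bitSetBytes(long sets, long maxDoc) {
        return sets * (maxDoc / 8);
    }

    /** Bytes needed for one 4-byte int per document. */
    static long intArrayBytes(long maxDoc) {
        return maxDoc * 4;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;
        long sets = subsetCount(10);
        System.out.println(sets);                      // 1024
        System.out.println(bitSetBytes(sets, maxDoc)); // 1280000000
        System.out.println(intArrayBytes(maxDoc));     // 40000000
    }
}
```

Each bit set takes 1.25 million bytes, so 1024 of them take 1,280,000,000 bytes (the 1.25G above, counting 1024M per G), against 40,000,000 bytes for the single int array.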

In the near-realtime case, filterCache entries are cached per query, but cf.in 
caches per AtomicReader, so its hit ratio will be higher.
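
A rough sketch of why a per-reader cache survives frequent reopens: values cached per segment stay valid as long as that segment is unchanged, so a reopen that only adds a small new segment loads only that segment. The class below is a toy model with hypothetical names; segment names stand in for whatever key the real cache uses, and the int arrays are placeholders for real field values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a cache keyed by segment, which keeps its entries across a reopen. */
final class SegmentCacheSketch {
    private final Map<String, int[]> valuesBySegment = new HashMap<>();
    int loads = 0; // how many times values had to be read from the index

    /** Returns the cached values for a segment, loading them once on first use. */
    int[] valuesFor(String segmentName, int docCount) {
        return valuesBySegment.computeIfAbsent(segmentName, name -> {
            loads++;
            return new int[docCount]; // placeholder for the real per-document values
        });
    }

    /** Simulates one search touching every segment of the current index view. */
    void search(List<String> segments) {
        for (String segment : segments) {
            valuesFor(segment, 1000);
        }
    }

    public static void main(String[] args) {
        SegmentCacheSketch cache = new SegmentCacheSketch();
        cache.search(Arrays.asList("_0", "_1"));       // first view: two segments loaded
        cache.search(Arrays.asList("_0", "_1", "_2")); // after a reopen: only _2 is new
        System.out.println(cache.loads);               // 3
    }
}
```

A cache keyed by query on the whole index view would instead have to rebuild its entry for every new view.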


case 2: bit query

This is like an options search, such as Linux file attributes R/W/X (R=100, W=010, X=001).

Assume the bit logic is: query_bit & field_bit != 0

To search for R OR W:
{code}
{!cf name=bit}file_attr:(6)
{code}
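
As the bit query patch is not attached, the following is only a sketch of the matching rule, with hypothetical names. It assumes the intended test is a bitwise AND between the query bits and the field bits: a document matches if it has at least one of the requested bits set. (A bitwise OR with a non-zero query value would be non-zero for every document.)

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a bit filter: a document matches if it shares at least one bit with the query. */
final class BitFilterSketch {
    static final long R = 0b100;
    static final long W = 0b010;
    static final long X = 0b001;

    /** Assumed matching rule: (query_bit & field_bit) != 0. */
    static boolean matchesAny(long queryBits, long fieldBits) {
        return (queryBits & fieldBits) != 0;
    }

    /** Returns the ids of the documents whose attribute bits match the query bits. */
    static List<Integer> filter(long[] attrByDoc, long queryBits) {
        List<Integer> kept = new ArrayList<>();
        for (int doc = 0; doc < attrByDoc.length; doc++) {
            if (matchesAny(queryBits, attrByDoc[doc])) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long[] fileAttr = {R, W, X, R | W | X, 0, R | X}; // one attribute mask per document id
        System.out.println(filter(fileAttr, R | W));      // query value 6: [0, 1, 3, 5]
    }
}
```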

I have not yet uploaded the bit query patch. It is easy to implement by 
extending CollectorFilterable under CollectorFilterQParserPlugin.

In my approach, a long is used to store 54 bit options.
                
      was (Author: chenlb):
    frange already uses PostFilter, but CollectorFilterQParserPlugin makes it 
possible to create other collector filter queries.

Use cases for this approach:

case 1: in query

This works like the SQL IN operator: "select * from a where user=123 and status in (1,2,3)".

Suppose the 'status' field takes one of 10 values (0,1,2,3,4,5,6,7,8,9).

The index has 10 million rows, on average 1 million per 'status' value.

user:123 may match about 2k rows, while status:(1 OR 2 OR 3) matches 3 million rows.

user:123&fq={!cf name=in}status:(1,2,3) is faster than user:123 AND status:(1 
OR 2 OR 3)

The filterCache could be used for the status:(1 OR 2 OR 3) query, but 10 status 
values can be combined in C(n,0)+C(n,1)+...+C(n,n)=pow(2,n) ways; with n=10 
that means 1024 OpenBitSets. 

filterCache: 1024 OpenBitSets (maxSize=10 million) need RAM = 1.25G

cf.in needs RAM = 10M*4 = 40M


case 2: bit query

This is like an options search, such as Linux file attributes R/W/X (R=100, W=010, X=001).

Assume the bit logic is: query_bit & field_bit != 0

To search for R OR W:
{code}
{!cf name=bit}file_attr:(6)
{code}

I have not yet uploaded the bit query patch. It is easy to implement by 
extending CollectorFilterable under CollectorFilterQParserPlugin.

In my approach, a long is used to store 54 bit options.
                  
> CollectorFilterQParserPlugin support Filter Collector at search with 
> PostFilter
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-4701
>                 URL: https://issues.apache.org/jira/browse/SOLR-4701
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.2
>            Reporter: Linbin Chen
>             Fix For: 4.3
>
>         Attachments: SOLR-4701.patch
>
>
> example:
>  * {code}fq={!cf name=in}status:(-1, 2){code}
>  * {code}fq={!cf name=in not=true}status:(3,4){code}
>  * {code}fq={!cf name=range}price:[100 TO 500]{code}
>  * {code}fq={!cf name=range}log(page_view):[50 TO 120]{code}
> The in operator works like SQL IN and is faster than an OR boolean query.
> In most cases, range is faster than a TrieField range query in Lucene.
> How to use it:
> Add to solrconfig.xml:
> {code:xml}
> <queryParser name="cf" class="solr.CollectorFilterQParserPlugin"/>
> {code}
> cf does not use the query cache; it uses a PostFilter to filter at the collector.

