[ 
https://issues.apache.org/jira/browse/SOLR-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitaliy Zhovtyuk updated SOLR-2908:
-----------------------------------

    Attachment: SOLR-2908.patch

If you limit the number of terms, you need to pass at least the sort order to 
the shards so that they return the most relevant terms (if that matters).
Added the shards.terms.params.override=true parameter, which controls whether 
the terms parameters (terms.limit, terms.sort, terms.maxcount, terms.mincount) 
are passed to the shards.
Using this parameter with terms.sort=index (no sorting) is OK, but using 
shards.terms.params.override with terms.sort=count can produce results that 
are inconsistent with those of a single core.
See org.apache.solr.handler.component.DistributedTermsComponentParametersTest. 
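
A rough sketch of the decision described above, using plain maps instead of Solr classes. The default branch mirrors the createShardQuery snippet quoted in the issue description; the override branch (passing the terms parameters through unchanged) is only an assumption about how the flag behaves, and the class and method names are hypothetical, not taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ShardTermsParamsSketch {

    // Parameter name from this comment; everything else here is illustrative.
    static final String OVERRIDE = "shards.terms.params.override";

    // Builds the parameters that would be sent to each shard.
    static Map<String, String> shardParams(Map<String, String> request) {
        Map<String, String> shard = new LinkedHashMap<>(request);
        if (Boolean.parseBoolean(request.getOrDefault(OVERRIDE, "false"))) {
            // Assumed override behaviour: keep terms.limit, terms.sort,
            // terms.maxcount and terms.mincount exactly as requested.
            return shard;
        }
        // Default behaviour, as in the quoted createShardQuery:
        // no count filters, no limit, index order on the shards.
        shard.remove("terms.maxcount");
        shard.remove("terms.mincount");
        shard.put("terms.limit", "-1");
        shard.put("terms.sort", "index");
        return shard;
    }

    public static void main(String[] args) {
        Map<String, String> request = new LinkedHashMap<>();
        request.put("terms.limit", "5");
        request.put("terms.sort", "count");
        System.out.println("default:  " + shardParams(request));

        request.put(OVERRIDE, "true");
        System.out.println("override: " + shardParams(request));
    }
}
```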

For example, we use 
{code}shards.terms.params.override=true&terms.limit=5&terms.sort=count{code}
and data
{code}    index(id, 18, "b_t", "snake spider shark snail slug seal");
    index(id, 19, "b_t", "snake spider shark snail slug");
    index(id, 20, "b_t", "snake spider shark snail");
    index(id, 21, "b_t", "snake spider shark");
    index(id, 22, "b_t", "snake spider");
    index(id, 23, "b_t", "snake");
    index(id, 24, "b_t", "ant zebra");
    index(id, 25, "b_t", "zebra");
{code}

With a single core, the results will be: 
{code}snake=6 spider=5 shark=4 snail=3 slug=2{code}

For 2 shards, the results will be:
shard 1:  {code} snake=3 spider=3 shark=2 snail=2 ant=1 {code}
shard 2: {code} snake=3 spider=2 shark=2 seal=1 slug=1 {code}

Combined result: {code} snake=6 spider=5 shark=4 snail=2 ant=1 {code}
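
The arithmetic in the example above can be replayed with a small standalone simulation. This is only a sketch and does not use Solr: the assignment of documents to shards (19, 20, 22, 24 on shard 1; 18, 21, 23, 25 on shard 2) is an assumption chosen to match the per-shard counts shown, ties between equal counts are assumed to be broken in index (alphabetical) order, and all class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShardTermsLimitSketch {

    // Assumed document-to-shard assignment, consistent with the counts above.
    static final List<String> SHARD1 = List.of(
        "snake spider shark snail slug", "snake spider shark snail",
        "snake spider", "ant zebra");
    static final List<String> SHARD2 = List.of(
        "snake spider shark snail slug seal", "snake spider shark",
        "snake", "zebra");

    // Counts how many documents contain each whitespace-separated term.
    static Map<String, Integer> countTerms(List<String> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String doc : docs) {
            for (String term : doc.split("\\s+")) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Emulates terms.sort=count with terms.limit=limit:
    // highest count first, ties in index order (an assumption).
    static Map<String, Integer> topByCount(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> top = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(limit, entries.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    // All documents in one core: counts are complete before the limit applies.
    static Map<String, Integer> singleCore(int limit) {
        List<String> all = new ArrayList<>(SHARD1);
        all.addAll(SHARD2);
        return topByCount(countTerms(all), limit);
    }

    // Each shard sorts and limits first; the truncated lists are then summed.
    static Map<String, Integer> distributed(int limit) {
        Map<String, Integer> merged = new TreeMap<>();
        for (List<String> shard : List.of(SHARD1, SHARD2)) {
            topByCount(countTerms(shard), limit)
                .forEach((term, count) -> merged.merge(term, count, Integer::sum));
        }
        return topByCount(merged, limit);
    }

    public static void main(String[] args) {
        System.out.println("single core: " + singleCore(5));
        System.out.println("two shards:  " + distributed(5));
    }
}
```

With a limit of 5 this prints snail=3 and slug=2 for the single core but snail=2 and ant=1 for the two shards: shard 2 cut off its snail=1 entry, and each shard returned at most one of the two slug occurrences, so the merged counts undercount both terms.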

I think this parameter override is useful with sorting combined with custom 
routing, where all occurrences of a term are located on the same shard and 
are therefore sorted and limited there correctly.

> To push the terms.limit parameter from the master core to all the shard cores.
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-2908
>                 URL: https://issues.apache.org/jira/browse/SOLR-2908
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 1.4.1
>         Environment: Linux server. 64 bit processor and 16GB Ram.
>            Reporter: sivaganesh
>            Priority: Critical
>              Labels: patch
>             Fix For: 4.7
>
>         Attachments: SOLR-2908.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When we pass the terms.limit parameter to the master (which has many shard 
> cores), it is not pushed down to the individual cores. Instead, the 
> default value of -1 is assigned to the terms.limit parameter in the 
> underlying shard cores. As a result, the time the master core takes to 
> return the requested number of terms grows with the number of 
> underlying shard cores. This affects the performance of the auto-suggest 
> feature. 
> We thought we could have a parameter to explicitly override the -1 being set 
> for terms.limit in the shard cores.
> We looked at the source code (TermsComponent.java) and came to the same 
> conclusion. 
> Please help us push the terms.limit parameter to the shard cores. 
> Please find the code snippet below.
> private ShardRequest createShardQuery(SolrParams params) {
>     ShardRequest sreq = new ShardRequest();
>     sreq.purpose = ShardRequest.PURPOSE_GET_TERMS;
>     // base shard request on original parameters
>     sreq.params = new ModifiableSolrParams(params);
>     // remove any limits for shards, we want them to return all possible
>     // responses
>     // we want this so we can calculate the correct counts
>     // dont sort by count to avoid that unnecessary overhead on the shards
>     sreq.params.remove(TermsParams.TERMS_MAXCOUNT);
>     sreq.params.remove(TermsParams.TERMS_MINCOUNT);
>     sreq.params.set(TermsParams.TERMS_LIMIT, -1);
>     sreq.params.set(TermsParams.TERMS_SORT, TermsParams.TERMS_SORT_INDEX);
>     return sreq;
>   }
> Solr Version:
> Solr Specification Version: 1.4.0.2010.01.13.08.09.44 
>  Solr Implementation Version: 1.5-dev exported - yonik - 2010-01-13 08:09:44 
>  Lucene Specification Version: 2.9.1-dev 
>  Lucene Implementation Version: 2.9.1-dev 888785 - 2009-12-09 18:03:31 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
