[jira] [Commented] (SOLR-13125) Optimize Queries when sorting by router.field

Gus Heck (JIRA) Tue, 16 Jul 2019 07:19:55 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886151#comment-16886151
 ]


Gus Heck commented on SOLR-13125:
---------------------------------

The idea behind this patch is interesting. Unless I misunderstand the intent, 
the idea is to short circuit the response collection when the TRA collection 
names tell us that further responses will all return docs that are too far down 
the result list to ever be included. Unfortunately I don't think this patch 
does that. Issues I see:
 * This patch overrides finishStage() instead of handleResponses(), which means 
that by the time your logic runs all responses have already been received.
 * I don't see logic to handle values for the start parameter.
 * I'm also not sure I like the tests checking debug messages rather than 
actual code behavior. That could get out of sync.
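
To illustrate the start parameter point: with a sort on router.field desc, the 
page is only filled once start + rows matching docs have been seen from the 
newest collections, not just rows. A minimal sketch, assuming the per-collection 
match counts are available newest first; TraShortCircuit and collectionsNeeded 
are hypothetical names for illustration, not Solr APIs:

```java
// Hypothetical helper, not part of Solr: names and signature are illustrative only.
final class TraShortCircuit {

    /**
     * @param hitsNewestFirst matching-document counts per collection, newest collection first
     * @param start           value of the start (offset) parameter
     * @param rows            value of the rows (limit) parameter
     * @return number of leading collections whose responses are needed to fill
     *         the requested page, or hitsNewestFirst.length if all are needed
     */
    static int collectionsNeeded(long[] hitsNewestFirst, int start, int rows) {
        // Docs before 'start' still have to be accounted for, so the target
        // is start + rows rather than rows alone.
        long required = (long) start + rows;
        long seen = 0;
        for (int i = 0; i < hitsNewestFirst.length; i++) {
            seen += hitsNewestFirst[i];
            if (seen >= required) {
                return i + 1;
            }
        }
        return hitsNewestFirst.length;
    }
}
```

With counts of 40, 70 and 500 and rows=100, two collections suffice when 
start=0, but all three are needed when start=50.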

In any case, it's unclear to me whether this can be handled in a search 
component without core changes. Even if you override handleResponses() instead, 
you can't stop SearchHandler from looping and attempting to take() the results 
of every request that was sent (unless you throw an exception, but that won't 
be good). What you would need to do is somehow influence the futures that Solr 
is waiting on to return early and empty once your request has been filled up 
from the most recent collections (see 
org/apache/solr/handler/component/HttpShardHandler.java:281). Barring that, you 
could perhaps find a way to empty the pending queue, but that means you still 
have to wait for at least one uninteresting request to complete. The futures 
themselves would be waiting on the call to makeLoadBalancedRequest() at 
org/apache/solr/handler/component/HttpShardHandler.java:201, so I think this 
optimization requires the addition of an explicit short-circuit enabling hook. 
Possibly this could be a new method for SearchComponents to override, but we 
need to think about how that would play with the assumptions of existing code.
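
As a rough illustration of what such a hook would have to achieve, here is a 
sketch in plain java.util.concurrent. It is not Solr code and does not use 
HttpShardHandler internals; OrderedShortCircuit and awaitUntilFilled are 
hypothetical names, and each request is assumed to simply return its hit count. 
Responses are awaited in collection order (newest first) and the outstanding 
futures are cancelled once enough docs have been seen:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Solr code: shows the intended control flow only.
final class OrderedShortCircuit {

    /**
     * Sends all requests at once, then waits for responses newest first and
     * stops as soon as 'required' (start + rows) hits have been seen.
     *
     * @return how many responses were actually awaited
     */
    static int awaitUntilFilled(List<Callable<Long>> requestsNewestFirst, long required) {
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, requestsNewestFirst.size()));
        List<Future<Long>> futures = new ArrayList<>();
        try {
            for (Callable<Long> request : requestsNewestFirst) {
                futures.add(pool.submit(request));
            }
            long seen = 0;
            int awaited = 0;
            for (Future<Long> future : futures) {
                seen += future.get(); // blocks only on the next-newest collection
                awaited++;
                if (seen >= required) {
                    break; // older collections cannot contribute to this page
                }
            }
            return awaited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Cancelling a completed future has no effect; pending ones are interrupted.
            for (Future<Long> future : futures) {
                future.cancel(true);
            }
            pool.shutdownNow();
        }
    }
}
```

The real SearchHandler loop takes responses in completion order rather than 
collection order, which is part of why a dedicated hook seems necessary.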

> Optimize Queries when sorting by router.field
> ---------------------------------------------
>
>                 Key: SOLR-13125
>                 URL: https://issues.apache.org/jira/browse/SOLR-13125
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: mosh
>            Priority: Minor
>         Attachments: SOLR-13125-no-commit.patch, SOLR-13125.patch, 
> SOLR-13125.patch, SOLR-13125.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are currently testing TRA using Solr 7.7, having >300 shards in the alias, 
> with much growth in the coming months.
> The "hot" data (in our case, more recent) will be stored on stronger 
> nodes (SSD, more RAM, etc.).
> A proposal for optimizing queries sorted by router.field (the field which TRA 
> uses to route the data to the correct collection) has emerged.
> Perhaps, in queries which are sorted by router.field, Solr could be smart 
> enough to wait for the more recent collections and, in case the limit was 
> reached, cancel the other queries (or just not block and wait for the results)?
> For example:
> When querying a TRA with a filter on a different field than 
> router.field, but sorting by router.field desc, limit=100.
> Since this is a TRA, Solr will issue queries for all the collections in the 
> alias.
> But to optimize this particular type of query, Solr could wait for the most 
> recent collection in the TRA, see whether the result set matches or exceeds 
> the limit. If so, the query could be returned to the user without waiting for 
> the rest of the shards. If not, the issuing node will block until the second 
> query returns, and so forth, until the limit of the request is reached.
> This might also be useful for deep paging, querying each collection and only 
> skipping to the next once there are no more results in the specified 
> collection.
> Thoughts or inputs are always welcome.
> This is just my two cents, and I'm always happy to brainstorm.
> Thanks in advance.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
