[ https://issues.apache.org/jira/browse/SOLR-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856079#action_12856079 ]
Shawn Smith commented on SOLR-1880:
-----------------------------------

We mainly use Solr to fetch just document IDs, then look up those IDs in a database. So this would make a big difference for us. In particular, we have a few reports that fetch the IDs of the top ~50,000 documents (rows=50000). With so many IDs to return, the GET_TOP_IDS requests execute in a couple of hundred milliseconds but the GET_FIELDS requests take 5-10 seconds. So on those queries we'd get more than a 10x speedup by skipping the 2nd request.

> Performance: Distributed Search should skip GET_FIELDS stage if EXECUTE_QUERY
> stage gets all fields
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1880
>                 URL: https://issues.apache.org/jira/browse/SOLR-1880
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>            Reporter: Shawn Smith
>
> Right now, a typical distributed search using QueryComponent makes two HTTP
> requests to each shard:
> # STAGE_EXECUTE_QUERY executes one HTTP request to each shard to get top N
> ids and sort keys, merges the results to produce a final list of document IDs
> (PURPOSE_GET_TOP_IDS).
> # STAGE_GET_FIELDS executes a second HTTP request to each shard to get the
> document field values for the final list of document IDs (PURPOSE_GET_FIELDS).
> If the "fl" param is just "id" or just "id,score", all document data to
> return is already fetched by STAGE_EXECUTE_QUERY. The second
> STAGE_GET_FIELDS query is completely unnecessary. Eliminating that 2nd HTTP
> request can make a big difference in overall performance.
> Also, if the "fl" param only gets id, score and sort columns, it would probably
> be cheaper to fetch the final sort column data in STAGE_EXECUTE_QUERY, which
> has to read the sort column data anyway, and skip STAGE_GET_FIELDS.
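For illustration, the "fl" check described in the issue could be sketched roughly as follows. This is a standalone sketch, not Solr code and not the proposed patch: the class GetFieldsSkipSketch and its methods are hypothetical names, and it assumes that "fl" is a comma- or space-separated list of plain field names and that the unique key field name (e.g. "id") is passed in by the caller.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; none of these names exist in Solr.
final class GetFieldsSkipSketch {

    // Split an "fl" value on commas and whitespace (assumed format).
    static Set<String> parseFl(String fl) {
        Set<String> fields = new LinkedHashSet<>();
        if (fl == null) {
            return fields;
        }
        for (String f : fl.split("[,\\s]+")) {
            if (!f.isEmpty()) {
                fields.add(f);
            }
        }
        return fields;
    }

    // True when "fl" asks only for the unique key, optionally with "score",
    // i.e. everything requested is already returned by the first stage.
    static boolean canSkipGetFields(String fl, String uniqueKeyField) {
        Set<String> fields = parseFl(fl);
        // Assumption: a missing "fl", or one without the unique key,
        // may need stored fields, so do not skip the second request.
        if (!fields.contains(uniqueKeyField)) {
            return false;
        }
        for (String f : fields) {
            if (!f.equals(uniqueKeyField) && !f.equals("score")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canSkipGetFields("id,score", "id"));
        System.out.println(canSkipGetFields("id,title", "id"));
    }
}
```

With a check like this, a coordinator would issue the second per-shard request only when canSkipGetFields returns false. The sort-column case from the last paragraph of the issue is not covered here.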