[ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096078#comment-14096078 ]
Joel Bernstein commented on SOLR-5244:
--------------------------------------

Yes, I think we should do what you suggest. I may not have time to implement this before Solr 4.10, though. I don't think we need to hold this up, because the client interface will remain stable and we can simply slide in the new SearchHandler in a later release.

Also, moving forward there'll be different types of export functionality, and a specialized SearchHandler will be needed to sort out the different options.

> Exporting Full Sorted Result Sets
> ---------------------------------
>
>                 Key: SOLR-5244
>                 URL: https://issues.apache.org/jira/browse/SOLR-5244
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 5.0
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>         Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch, SOLR-5244.patch,
> SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch
>
>
> This ticket allows Solr to export full sorted result sets. A new export
> request handler has been created that sets up the default writer type
> (SortingResponseWriter) and the required rank query (ExportQParserPlugin).
> The syntax is:
> {code}
> /solr/collection1/export?q=*:*&fl=a,b,c&sort=a desc,b desc
> {code}
> This capability will open up Solr for a whole range of uses that were
> typically done using aggregation engines like Hadoop. For example:
> *Large Distributed Joins*
> A client outside of Solr calls two different Solr collections and returns the
> results sorted by a join key. The client iterates through both streams and
> performs a merge join.
> *Fully Distributed Field Collapsing/Grouping*
> A client outside of Solr makes individual calls to all the servers in a
> single collection and returns results sorted by the collapse key. The client
> merge joins the sorted lists on the collapse key to perform the field
> collapse.
> *High Cardinality Distributed Aggregation*
> A client outside of Solr makes individual calls to all the servers in a
> single collection and sorts on a high cardinality field. The client then
> merge joins the sorted lists to perform the high cardinality aggregation.
> *Large Scale Time Series Rollups*
> A client outside Solr makes individual calls to all servers in a collection
> and sorts on time dimensions. The client merge joins the sorted result sets
> and rolls up the time dimensions as it iterates through the data.
> In these scenarios Solr is being used as a distributed sorting engine.
> Developers can write clients that take advantage of this sorting capability
> in any way they wish.
> *Session Analysis and Aggregation*
> A client outside Solr makes individual calls to all servers in a collection
> and sorts on the sessionID. The client merge joins the sorted results and
> aggregates sessions as it iterates through the results.
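
The client-side merge join that the use cases above rely on can be sketched roughly as follows. This is only an illustration, not code from the attached patches: it assumes each stream has already been fetched and parsed into Python dicts, that both streams are sorted ascending on the join key (a descending sort would need the comparisons flipped), and the function name `merge_join` and the field names in the usage note are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key):
    """Inner-join two iterables of dicts, both sorted ascending on `key`.

    Hypothetical helper for illustration; rows are assumed to be dicts
    that all contain `key`, and key values are assumed to be comparable.
    """
    get = itemgetter(key)
    # groupby batches consecutive rows that share a key, which is valid
    # precisely because both inputs are already sorted on that key.
    left_groups = groupby(left, key=get)
    right_groups = groupby(right, key=get)

    done = (None, None)  # sentinel: the stream is exhausted
    lk, lrows = next(left_groups, done)
    rk, rrows = next(right_groups, done)

    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no partner yet: advance the left stream.
            lk, lrows = next(left_groups, done)
        elif lk > rk:
            # Right key has no partner yet: advance the right stream.
            rk, rrows = next(right_groups, done)
        else:
            # Keys match: emit every left/right pairing for this key.
            right_block = list(rrows)  # only one key's rows held in memory
            for l in lrows:
                for r in right_block:
                    # On field-name clashes the left row's value wins.
                    yield {**r, **l}
            lk, lrows = next(left_groups, done)
            rk, rrows = next(right_groups, done)
```

For example, with two small hypothetical streams sorted on `id`, `list(merge_join(a, b, "id"))` yields one combined row per matching pair and skips ids that appear on only one side. Because each input is consumed strictly in order, the client only ever buffers the rows for a single key, which is what makes fully sorted exports useful for this pattern.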