Re: update Lucene
Clearly I meant "...along with *Lucene* jars" :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Otis Gospodnetic
> To: solr-dev@lucene.apache.org
> Sent: Wednesday, May 27, 2009 11:59:18 PM
> Subject: Re: update Lucene
>
> I wonder if it would be useful to commit Lucene's CHANGES.txt into Solr along
> with Solr jars. It would then be very easy to tell what changed in Lucene since
> the version Solr has and the current version of Lucene (or some newer released
> version, if we were able to be behind).
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Yonik Seeley
> > To: solr-dev@lucene.apache.org
> > Sent: Wednesday, May 27, 2009 4:58:39 PM
> > Subject: update Lucene
> >
> > I think we should upgrade Lucene again since the index file format has changed:
> > https://issues.apache.org/jira/browse/LUCENE-1654
> >
> > This also contains a fix for unifying the FieldCache and
> > ExtendedFieldCache instances.
> >
> > $ svn diff -r r776177 CHANGES.txt
> > Index: CHANGES.txt
> > ===================================================================
> > --- CHANGES.txt (revision 776177)
> > +++ CHANGES.txt (working copy)
> > @@ -27,7 +27,11 @@
> >     implement Searchable or extend Searcher, you should change you
> >     code to implement this method. If you already extend
> >     IndexSearcher, no further changes are needed to use Collector.
> > -   (Shai Erera via Mike McCandless)
> > +
> > +   Finally, the values Float.Nan, Float.NEGATIVE_INFINITY and
> > +   Float.POSITIVE_INFINITY are not valid scores. Lucene uses these
> > +   values internally in certain places, so if you have hits with such
> > +   scores it will cause problems. (Shai Erera via Mike McCandless)
> >
> > Changes in runtime behavior
> >
> > @@ -107,10 +111,10 @@
> >     that's visited. All core collectors now use this API. (Mark
> >     Miller, Mike McCandless)
> >
> > -8. LUCENE-1546: Add IndexReader.flush(String commitUserData), allowing
> > -   you to record an opaque commitUserData into the commit written by
> > -   IndexReader. This matches IndexWriter's commit methods. (Jason
> > -   Rutherglen via Mike McCandless)
> > +8. LUCENE-1546: Add IndexReader.flush(Map commitUserData), allowing
> > +   you to record an opaque commitUserData (maps String -> String) into
> > +   the commit written by IndexReader. This matches IndexWriter's
> > +   commit methods. (Jason Rutherglen via Mike McCandless)
> >
> > 9. LUCENE-652: Added org.apache.lucene.document.CompressionTools, to
> >    enable compressing & decompressing binary content, external to
> > @@ -135,6 +139,9 @@
> >     not make sense for all subclasses of MultiTermQuery. Check individual
> >     subclasses to see if they support #getTerm(). (Mark Miller)
> >
> > +14. LUCENE-1636: Make TokenFilter.input final so it's set only
> > +    once. (Wouter Heijke, Uwe Schindler via Mike McCandless).
> > +
> > Bug fixes
> >
> > 1. LUCENE-1415: MultiPhraseQuery has incorrect hashCode() and equals()
> > @@ -176,6 +183,9 @@
> >     sort) by doc Id in a consistent manner (i.e., if Sort.FIELD_DOC was
> >     used vs. when it wasn't). (Shai Erera via Michael McCandless)
> >
> > +10. LUCENE-1647: Fix case where IndexReader.undeleteAll would cause
> > +    the segment's deletion count to be incorrect. (Mike McCandless)
> > +
> > New features
> >
> > 1. LUCENE-1411: Added expert API to open an IndexWriter on a prior
> > @@ -186,10 +196,11 @@
> >     when building transactional support on top of Lucene. (Mike
> >     McCandless)
> >
> > - 2. LUCENE-1382: Add an optional arbitrary String "commitUserData" to
> > -    IndexWriter.commit(), which is stored in the segments file and is
> > -    then retrievable via IndexReader.getCommitUserData instance and
> > -    static methods. (Shalin Shekhar Mangar via Mike McCandless)
> > + 2. LUCENE-1382: Add an optional arbitrary Map (String -> String)
> > +    "commitUserData" to IndexWriter.commit(), which is stored in the
> > +    segments file and is then retrievable via
> > +    IndexReader.getCommitUserData instance and static methods.
> > +    (Shalin Shekhar Mangar via Mike McCandless)
> >
> > 3. LUCENE-1406: Added Arabic analyzer. (Robert Muir via Grant Ingersoll)
> >
> > @@ -311,6 +322,10 @@
> > 25. LUCENE-1634: Add calibrateSizeByDeletes to LogMergePolicy, to take
> >     deletions into account when considering merges. (Yasuhiro Matsuda
> >     via Mike McCandless)
> > +
> > +26. LUCENE-1550: Added new n-gram based String distance measure for
> > +    spell checking. See the Javadocs for NGramDistance.java for a
> > +    reference paper on why this is helpful (Tom Morton via Grant Ingersoll)
> > +
> >
> > Optimizations
> >
> >
> > -Yonik
> > http://www.lucidimagination.com
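The LUCENE-1382/LUCENE-1546 change discussed in the diff above replaces the single opaque String commitUserData with a String -> String map that is written with each commit and retrievable later. A toy Python model of those semantics (this is not Lucene's API; ToyIndex and its method names are purely illustrative):

```python
# Toy model of commitUserData semantics: each commit point carries an
# arbitrary str -> str map that can be read back later, e.g. to record
# where indexing left off. Hypothetical names; not Lucene code.

class ToyIndex:
    def __init__(self):
        self.commits = []   # list of (docs_snapshot, user_data) commit points
        self.pending = []   # docs added since the last commit

    def add(self, doc):
        self.pending.append(doc)

    def commit(self, user_data=None):
        # the user_data map is stored alongside the committed segment state
        self.commits.append((list(self.pending), dict(user_data or {})))
        self.pending = []

    def last_commit_user_data(self):
        # analogous in spirit to IndexReader.getCommitUserData
        return self.commits[-1][1] if self.commits else {}
```

A caller might stamp each commit with a checkpoint token and read it back on restart to resume indexing.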
Re: Streaming Docs, Terms, TermVectors
On a single server, Solr already does streaming of returned documents... the stored fields of selected docs are retrieved one at a time as they are written to the socket. The servlet container already handles sending out chunked encoding for large responses too. -Yonik http://www.lucidimagination.com On Sat, May 30, 2009 at 12:45 PM, Grant Ingersoll wrote: > Anyone have any thoughts on what is involved with streaming lots of results > out of Solr? > > For instance, if I wanted to get something like 1M docs out of Solr (or > more) via *:* query, how can I tractably do this? Likewise, if I wanted to > return all the terms in the index or all the Term Vectors. > > Obviously, it is impossible to load all of these things into memory and then > create a response, so I was wondering if anyone had any ideas on how to > stream them. > > Thanks, > Grant >
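The one-at-a-time pattern Yonik describes can be sketched as a generator: rather than materializing the whole result set, each selected document's stored fields are fetched lazily and written straight to the output stream, so only one document is in memory at a time. This is an illustration of the idea, not Solr's response-writer code; fetch_stored_fields is a hypothetical stand-in for the index lookup.

```python
import json

def stream_response(doc_ids, fetch_stored_fields):
    """Yield a JSON response piecewise, one document per chunk."""
    yield '{"docs": ['
    for i, doc_id in enumerate(doc_ids):
        if i:
            yield ","
        # only this one document's stored fields are held in memory here
        yield json.dumps(fetch_stored_fields(doc_id))
    yield "]}"
```

A servlet container consuming such a generator can emit each chunk with HTTP chunked transfer encoding, which is what makes large responses tractable.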
Re: Streaming Docs, Terms, TermVectors
Don't stream, request chunks of 10 or 100 at a time. It works fine and you don't have to write or test any new code. In addition, it works well with HTTP caches, so if two clients want to get the same data, the second can get it from the cache. We do that at Netflix. Each front-end box does a series of queries to get all the movie titles, then loads them into a local index for autocomplete. wunder On 5/30/09 11:01 AM, "Kaktu Chakarabati" wrote: > For a streaming-like solution, it is possible infact to have a working > buffer in-memory that emits chunks on an http connection which is kept alive > by the server until the full response has been sent. > This is quite similar for example to how video streaming protocols which can > operate on top of HTTP work ( cf. a more general discussion on > http://ajaxpatterns.org/HTTP_Streaming#In_A_Blink ). > Another (non-mutually exclusive) possibility is to introduce a novel binary > format for the transmission of such data ( i.e a new wt=<..> type ) over > http (or any other comm. protocol) so that data can be more effectively > compressed and made to better fit into memory. > One such format which has been widely circulating and already has many open > source projects implementing it is Adobe's AMF ( > http://osflash.org/documentation/amf ). It is however a proprietary format > so i'm not sure whether it is incorporable under apache foundation terms. > > -Chak > > > On Sat, May 30, 2009 at 9:58 AM, Dietrich Featherston > wrote: > >> I was actually curious about the same thing. Perhaps an endpoint reference >> could be passed in the request where the documents can be sent >> asynchronously, such as a jms topic. >> >> solr/query?q=*:*&epr=/my/topic&eprtype=jms >> >> Then we would need to consider how to break up the response, how to cancel >> a running query, etc. >> >> Is this along the lines of what you're looking for? 
I would be interested >> in looking at how the request/response contract changes and what types of >> endpoint references would be supported. >> >> Thanks, >> D >> >> On May 30, 2009, at 12:45 PM, Grant Ingersoll wrote: >> >> Anyone have any thoughts on what is involved with streaming lots of >>> results out of Solr? >>> >>> For instance, if I wanted to get something like 1M docs out of Solr (or >>> more) via *:* query, how can I tractably do this? Likewise, if I wanted to >>> return all the terms in the index or all the Term Vectors. >>> >>> Obviously, it is impossible to load all of these things into memory and >>> then create a response, so I was wondering if anyone had any ideas on how to >>> stream them. >>> >>> Thanks, >>> Grant >>> >>
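The chunked-retrieval approach Walter describes needs no new Solr code: the client just pages through the result set with the standard start/rows parameters. A minimal sketch of the paging loop (the query and parameter values are illustrative):

```python
def page_params(total_hits, rows=100):
    """Yield the successive start/rows request windows that cover
    total_hits results, for use as Solr query parameters."""
    for start in range(0, total_hits, rows):
        # each window is one ordinary, cacheable HTTP request
        yield {"q": "*:*", "start": start, "rows": rows, "wt": "json"}
```

Because each page is an ordinary GET, intermediate HTTP caches can serve repeat requests, which is the property the Netflix setup relies on.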
Re: Streaming Docs, Terms, TermVectors
For a streaming-like solution, it is in fact possible to have a working buffer in-memory that emits chunks on an http connection which is kept alive by the server until the full response has been sent. This is quite similar, for example, to how video streaming protocols that operate on top of HTTP work (cf. a more general discussion at http://ajaxpatterns.org/HTTP_Streaming#In_A_Blink ). Another (non-mutually exclusive) possibility is to introduce a novel binary format for the transmission of such data (i.e. a new wt=<..> type) over http (or any other comm. protocol) so that data can be more effectively compressed and made to better fit into memory. One such format which has been widely circulating and already has many open source projects implementing it is Adobe's AMF ( http://osflash.org/documentation/amf ). It is however a proprietary format, so I'm not sure whether it is incorporable under Apache foundation terms. -Chak On Sat, May 30, 2009 at 9:58 AM, Dietrich Featherston wrote: > I was actually curious about the same thing. Perhaps an endpoint reference > could be passed in the request where the documents can be sent > asynchronously, such as a jms topic. > > solr/query?q=*:*&epr=/my/topic&eprtype=jms > > Then we would need to consider how to break up the response, how to cancel > a running query, etc. > > Is this along the lines of what you're looking for? I would be interested > in looking at how the request/response contract changes and what types of > endpoint references would be supported. > > Thanks, > D > > > > > > > On May 30, 2009, at 12:45 PM, Grant Ingersoll wrote: > > Anyone have any thoughts on what is involved with streaming lots of >> results out of Solr? >> >> For instance, if I wanted to get something like 1M docs out of Solr (or >> more) via *:* query, how can I tractably do this? Likewise, if I wanted to >> return all the terms in the index or all the Term Vectors. 
>> >> Obviously, it is impossible to load all of these things into memory and >> then create a response, so I was wondering if anyone had any ideas on how to >> stream them. >> >> Thanks, >> Grant >> >
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714750#action_12714750 ] Martijn van Groningen commented on SOLR-236: I'm looking forward to hearing about your experiences with this patch, particularly in production. I think that in order to make collapsing work on multi-shard systems, the process method of the CollapseComponent needs to be modified. CollapseComponent already subclasses QueryComponent (which already supports querying on multi-shard systems), so it should not be that difficult. > Field collapsing > > > Key: SOLR-236 > URL: https://issues.apache.org/jira/browse/SOLR-236 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Emmanuel Keller > Fix For: 1.5 > > Attachments: collapsing-patch-to-1.3.0-dieter.patch, > collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, > collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-solr-236-2.patch, > field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, > field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, > field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, > field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, > SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, > solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch > > > This patch includes a new feature called "Field collapsing". > "Used in order to collapse a group of results with similar value for a given > field to a single entry in the result set. Site collapsing is a special case > of this, where all results for a given web site is collapsed into one or two > entries in the result set, typically with an associated "more documents from > this site" link. See also Duplicate detection." > http://www.fastsearch.com/glossary.aspx?m=48&amid=299 > The implementation adds 3 new query parameters (SolrParams): > "collapse.field" to choose the field used to group results > "collapse.type" normal (default value) or adjacent > "collapse.max" to select how many continuous results are allowed before > collapsing > TODO (in progress): > - More documentation (on source code) > - Test cases > Two patches: > - "field_collapsing.patch" for current development version > - "field_collapsing_1.1.0.patch" for Solr-1.1.0 > P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
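The collapse.type and collapse.max semantics described in the issue can be illustrated in plain Python over an ordered result list: "adjacent" only collapses runs of consecutive documents sharing the field value, while "normal" collapses across the whole list. This is a sketch of the described behavior, not the patch's actual code.

```python
def collapse(docs, field, ctype="normal", max_docs=1):
    """Keep up to max_docs docs per group (collapse.max-style), where a
    group is a field value ("normal") or a consecutive run ("adjacent")."""
    out, counts = [], {}
    prev = object()  # sentinel so the first doc always starts a run
    for doc in docs:
        val = doc[field]
        if ctype == "adjacent" and val != prev:
            counts[val] = 0          # new run: reset this value's counter
        prev = val
        counts[val] = counts.get(val, 0) + 1
        if counts[val] <= max_docs:  # collapse everything past max_docs
            out.append(doc)
    return out
```

So for results [a, a, b, a] collapsed on that field, "normal" keeps one a and one b, while "adjacent" keeps a, b, a, since the second group of a's is not adjacent to the first.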
Re: Streaming Docs, Terms, TermVectors
I was actually curious about the same thing. Perhaps an endpoint reference could be passed in the request where the documents can be sent asynchronously, such as a jms topic. solr/query?q=*:*&epr=/my/topic&eprtype=jms Then we would need to consider how to break up the response, how to cancel a running query, etc. Is this along the lines of what you're looking for? I would be interested in looking at how the request/response contract changes and what types of endpoint references would be supported. Thanks, D On May 30, 2009, at 12:45 PM, Grant Ingersoll wrote: Anyone have any thoughts on what is involved with streaming lots of results out of Solr? For instance, if I wanted to get something like 1M docs out of Solr (or more) via *:* query, how can I tractably do this? Likewise, if I wanted to return all the terms in the index or all the Term Vectors. Obviously, it is impossible to load all of these things into memory and then create a response, so I was wondering if anyone had any ideas on how to stream them. Thanks, Grant
Streaming Docs, Terms, TermVectors
Anyone have any thoughts on what is involved with streaming lots of results out of Solr? For instance, if I wanted to get something like 1M docs out of Solr (or more) via *:* query, how can I tractably do this? Likewise, if I wanted to return all the terms in the index or all the Term Vectors. Obviously, it is impossible to load all of these things into memory and then create a response, so I was wondering if anyone had any ideas on how to stream them. Thanks, Grant
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714742#action_12714742 ] Oleg Gnatovskiy commented on SOLR-236: -- Hey guys, are there any plans to make field collapsing work on multi shard systems? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714738#action_12714738 ] Thomas Traeger commented on SOLR-236: - The problem is solved, thanks. I will use your patch for my current project, which is planned to go live in 5 weeks. If I find any more issues I will report them here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Updated: (SOLR-1155) Change DirectUpdateHandler2 to allow concurrent adds during an autocommit
Seems ok now... On Fri, May 29, 2009 at 7:47 PM, Mike Klaas wrote: > I'd like to take a look at this but JIRA seems to be down. Is anyone else > experiencing this? > > -Mike > > > On Wed, May 13, 2009 at 7:41 AM, Jayson Minard (JIRA) wrote: > >> >> [ >> https://issues.apache.org/jira/browse/SOLR-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] >> >> Jayson Minard updated SOLR-1155: >> >> >> Attachment: Solr-1155.patch >> >> Resolve TODO for commitWithin, and updated AutoCommitTrackerTest to >> validate the fix. >> >> > Change DirectUpdateHandler2 to allow concurrent adds during an autocommit >> > - >> > >> > Key: SOLR-1155 >> > URL: https://issues.apache.org/jira/browse/SOLR-1155 >> > Project: Solr >> > Issue Type: Improvement >> > Components: search >> > Affects Versions: 1.3 >> > Reporter: Jayson Minard >> > Attachments: Solr-1155.patch, Solr-1155.patch >> > >> > >> > Currently DirectUpdateHandler2 will block adds during a commit, and it >> seems to be possible with recent changes to Lucene to allow them to run >> concurrently. >> > See: >> http://www.nabble.com/Autocommit-blocking-adds---AutoCommit-Speedup--td23435224.html >> >> -- >> This message is automatically generated by JIRA. >> - >> You can reply to this email to add a comment to the issue online. >> >> >
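The idea behind SOLR-1155 is to stop blocking adds for the full duration of a commit: hold a short lock only to swap out the pending buffer, then flush the snapshot while new adds land in a fresh buffer. A minimal threading sketch of that double-buffer pattern (names are illustrative, not DirectUpdateHandler2's actual fields):

```python
import threading

class UpdateHandler:
    def __init__(self):
        self.lock = threading.Lock()
        self.pending = []    # adds since the last commit
        self.committed = []  # stand-in for the flushed index state

    def add(self, doc):
        with self.lock:      # held only for the append, never for a flush
            self.pending.append(doc)

    def commit(self):
        with self.lock:      # brief critical section: swap buffers
            batch, self.pending = self.pending, []
        # slow flush happens outside the lock; concurrent add() calls
        # go into the fresh pending buffer instead of blocking
        self.committed.extend(batch)
```

The key property is that commit() holds the lock only for the buffer swap, so an autocommit no longer serializes against incoming adds.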
[jira] Updated: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated SOLR-236: --- Attachment: field-collapse-solr-236-2.patch Thanks for the feedback. I fixed the problem you described and have added a new patch containing the fix. The problem occurred when sorting was done on one or more normal fields and on score. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714676#action_12714676 ] Thomas Traeger commented on SOLR-236: - I made some tests with your patch and trunk (rev. 779497). It looks good so far, but I get occasional null pointer exceptions when using the sort parameter:

http://localhost:8983/solr/select?q=*:*&collapse.field=manu&sort=score%20desc,alphaNameSort%20asc

java.lang.NullPointerException
    at org.apache.lucene.search.FieldComparator$RelevanceComparator.copy(FieldComparator.java:421)
    at org.apache.solr.search.CollapseFilter$DocumentComparator.compare(CollapseFilter.java:649)
    at org.apache.solr.search.CollapseFilter$DocumentPriorityQueue.lessThan(CollapseFilter.java:596)
    at org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:153)
    at org.apache.solr.search.CollapseFilter.normalCollapse(CollapseFilter.java:321)
    at org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:211)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:67)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

These queries work as expected:

http://localhost:8983/solr/select?q=*:*&collapse.field=manu&sort=score%20desc
http://localhost:8983/solr/select?q=*:*&sort=score%20desc,alphaNameSort%20asc