OK. The intent is to collapse on the domain field.
Here's a query that works fine, and the way I want, with the collapsing query parser:

/select?defType=dismax&fl=score,content,description,keywords,title&fq={!collapse%20field=domain%20nullPolicy=expand}&pf=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06&q=bernie+sanders&qf=title%20description%20keywords%20content%20url

This is a complex query with about 20 single-character terms, mixed letters and digits:

/select?defType=dismax&fl=score,content,description,keywords,title&fq={!collapse%20field=domain%20nullPolicy=expand}&pf=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06&q=1+2+e+3+s+a+d+f+r+4+5+t+g+6+7+8+7+1+2+3+6&qf=title%20description%20keywords%20content%20url

This query crashes Solr: the process is killed by the OOM killer. Removing the collapsing query parser filter {!collapse field=domain nullPolicy=expand} eliminates the problem, and in my testing Solr then never crashes on any query, although a search of 20 single letters and digits separated by spaces is still very slow. With the collapsing query parser, the single numeric terms cause Solr to crash. Using whole words works, but it is slow if there are too many terms.

The debug output on all successful queries shows no errors. The default is 10 rows. A cold (uncached) search on a 2-word phrase takes 2-4 seconds. Adding more than 3-4 numbers separated by spaces to the search kills it. There is no debug output for the failed queries, because Solr is killed by the OOM killer.

Extreme queries are long multi-term queries, or long queries of single digits and letters with spaces in between. Something like '1 3 s 2 c 4 5 t s 5 6 3 a s 4 e 6 1 4 3 2 4 5 6' will cause it to search for all those individual terms, each of which is likely to be very frequent. This type of query seems to make Solr work really hard. While it's not likely that users would make such searches, I need to prevent Solr from crashing with the collapsing query parser.
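A minimal sketch of guarding against such input in the layer in front of Solr, written in Python. The function name, the term cap, and the decision to drop single-character terms are illustrative assumptions rather than a tested fix, and dropping short terms does change what a user can search for.

```python
# Hypothetical pre-filter for user input before it is sent to Solr.
# MAX_TERMS and MIN_TERM_LEN are assumed values, not recommendations.
MAX_TERMS = 8
MIN_TERM_LEN = 2


def sanitize_query(raw: str, max_terms: int = MAX_TERMS,
                   min_len: int = MIN_TERM_LEN) -> str:
    """Drop very short and repeated terms, then cap the number of terms."""
    kept = []
    seen = set()
    for term in raw.split():
        # Skip single characters (usually very frequent in the index)
        # and terms that were already kept.
        if len(term) < min_len or term in seen:
            continue
        seen.add(term)
        kept.append(term)
        if len(kept) == max_terms:
            break
    return " ".join(kept)
```

With these assumed limits, 'bernie sanders' passes through unchanged, while a query made only of single letters and digits reduces to an empty string, in which case the search API could return an empty result without calling Solr at all.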
This type of query can put a heavy load on many kinds of search systems and can be used in DoS attacks against them. If you have a 100M-doc index handy, try a 20-term query made of digits and letters separated by spaces to see what I mean.

I can try to prevent these types of queries through the search API by rewriting the user input. However, if there is a way to make Solr time out instead of being killed, that would be preferable. Otherwise I'll have to find a different way to limit the number of results per domain.

I have some more RAM to put in the server tomorrow, which might help. I don't mind if the complex searches are slow, but crashing is not good, especially with the OOM killer killing Solr completely.

Currently this is a master/slave setup: 150M docs, 800GB, 24GB RAM, 16GB heap.

-----
Bee Keeper at IZaBEE.com
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html