Re: Query of death? Collapsing Query Parser - Solr 7.5

2019-03-26 Thread IZaBEE_Keeper
OK..

The intent is to collapse on the field domain..

Here's a query that works fine and the way I want with the Collapsing query
parser..

/select?defType=dismax&fl=score,content,description,keywords,title&fq={!collapse%20field=domain%20nullPolicy=expand}&pf=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06&q=bernie+sanders&qf=title%20description%20keywords%20content%20url

This is a complex query of about 20 single-character terms, a mix of
letters and digits..

/select?defType=dismax&fl=score,content,description,keywords,title&fq={!collapse%20field=domain%20nullPolicy=expand}&pf=content^0.05%20description^0.03%20keywords^0.03%20title^0.05%20url^0.06&q=1+2+e+3+s+a+d+f+r+4+5+t+g+6+7+8+7+1+2+3+6&qf=title%20description%20keywords%20content%20url

This query crashes Solr: the process is killed by the OOM killer..
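
For reference, the two requests above can be built programmatically. This is a minimal sketch, assuming the standard dismax parameter names (fl, fq, pf, q, qf) and that the boosted field list is the phrase-fields (pf) parameter; the function name build_select_url is hypothetical and not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_select_url(user_query, base="/select"):
    """Build the dismax + collapse request as a URL-encoded query string.

    Parameter names are the standard dismax ones; the values are copied
    from the requests shown in this post.
    """
    params = [
        ("defType", "dismax"),
        ("fl", "score,content,description,keywords,title"),
        # Collapse results on the domain field, keeping docs with no domain.
        ("fq", "{!collapse field=domain nullPolicy=expand}"),
        ("pf", "content^0.05 description^0.03 keywords^0.03 title^0.05 url^0.06"),
        ("q", user_query),
        ("qf", "title description keywords content url"),
    ]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("bernie sanders"))
    print(build_select_url("1 2 e 3 s a d f r 4 5 t g 6 7 8 7 1 2 3 6"))
```

Encoding the values this way avoids hand-escaping the braces and carets in the fq and pf parameters.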

Removing the collapsing query parser {!collapse field=domain
nullPolicy=expand} eliminates the problem; in my testing Solr then never
crashes on any query.. A search of 20 letters & digits separated by spaces
is still very slow though..

With the collapsing query parser the single numeric terms cause Solr to
crash.. Using whole words works, but it is slow if there are too many terms..

The debug on all successful queries shows no errors.. the default is 10
rows.. a cold search (not cached) on a 2 word phrase takes 2-4 seconds.
Adding more than 3-4 numbers with spaces to the search kills it..

There is no debug for the failed queries as solr is killed by the process
killer..

Extreme queries are long multi-term queries or long queries of single numbers
& letters with spaces in between.  Something like '1 3 s 2 c 4 5 t s 5 6 3 a
s 4 e 6 1 4 3 2 4 5 6 ' will cause it to search for all those individual
terms, which are likely to be very frequent.. This type of query seems to
make Solr work really hard..

While it's not likely that users would make such searches, I need to prevent
Solr from crashing with the collapsing query parser.. This type of query can
put a heavy load on various types of search systems and can be used in DoS
attacks targeting search systems.. You can try a 20 term query made of
numbers & letters with spaces between to see what I mean if you have a 100M
doc index handy..

I can try to prevent these types of queries through the search API by
rewriting the user input.. However if there is a way to make solr time out
instead of being killed that would be preferable.. Otherwise I'll have to
find a different way to limit the number of results per domain..
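
The input rewriting mentioned above could look roughly like this. It is a minimal sketch, not a tested fix: the function names and the limits (at most 8 terms, tokens of at least 2 characters) are hypothetical values chosen for illustration. timeAllowed is a standard Solr query parameter given in milliseconds, but whether it interrupts the collapse step or prevents the OOM kill is an assumption that would need testing.

```python
MAX_TERMS = 8        # hypothetical cap on the number of query terms
MIN_TERM_LENGTH = 2  # hypothetical: drop single-character tokens


def rewrite_user_query(raw, max_terms=MAX_TERMS, min_len=MIN_TERM_LENGTH):
    """Drop very short tokens and cap the number of terms in a user query."""
    terms = raw.split()
    kept = [t for t in terms if len(t) >= min_len]
    # If every token was too short, keep the first one so the query is not empty.
    if not kept:
        kept = terms[:1]
    return " ".join(kept[:max_terms])


def guarded_params(raw, time_allowed_ms=5000):
    """Return the q and timeAllowed parameters for a guarded request."""
    return {
        "q": rewrite_user_query(raw),
        # Standard Solr parameter, in milliseconds; its effect on the
        # collapse step is not verified here.
        "timeAllowed": str(time_allowed_ms),
    }


if __name__ == "__main__":
    print(guarded_params("bernie sanders"))
    print(guarded_params("1 2 e 3 s a d f r 4 5 t g 6 7 8 7 1 2 3 6"))
```

Dropping single-character tokens also changes legitimate searches such as "vitamin c", so the thresholds are a trade-off rather than a safe default.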

I have some more RAM to put in the server tomorrow, which might help..  I
don't mind if the complex searches are slow.. but crashing out is not good..
especially with the process killer killing Solr completely..

Currently this is on a master/slave setup: 150M docs, 800GB index, 24GB RAM,
16GB heap..



-
Bee Keeper at IZaBEE.com
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Query of death? Collapsing Query Parser - Solr 7.5

2019-03-25 Thread IZaBEE_Keeper
Hi..

I'm wondering if I've found a query of death or just a really expensive
query.. It's killing my Solr with OOM..

Collapsing query parser using:
fq={!collapse field=domain nullPolicy=expand}

Everything works fine using words & phrases.. However as soon as there are
numbers involved it crashes out with the OOM killer..

The server has nowhere near enough RAM for the index of 800GB & 150M docs..

But a dismax query like '1 2 s 2 s 3 e d 4 r f 3 e s 7 2 1 4 6 7 8 2 9 0 3'
will make it crash..

fq={!collapse field=domain nullPolicy=expand}
PhraseFields( 'content^0.05 description^0.03 keywords^0.03 title^0.05
url^0.06' )
BoostQuery( 'host:"' . $q . '"^0.6 host:"twitter.com"^0.35 domain:"' . $q .
'"^0.6' )

Without the fq it works just fine and only times out on extreme queries..
eventually it finds them..

Do I just need more RAM or is there another way to prevent Solr from
crashing?

Solr 7.5, 24GB RAM, 16GB heap, on an SSD LV..





Migrate Solr Master To Cloud 7.5

2019-03-21 Thread IZaBEE_Keeper
Hi..

I have a large Solr 7.5 index, over 150M docs and 800GB, in a master/slave
setup.. I need to migrate the core to a SolrCloud instance with pull
replicas, as the index will eventually exceed the roughly 2.1B doc limit for
a single core..

I found this..
http://lucene.472066.n3.nabble.com/Copy-existing-index-from-standalone-Solr-to-Solr-cloud-td4149920.html

  
It's a bit outdated but sounds like it might work..

Does anyone have any other advice/links for this type of migration? 

Right now I just need to convert the master to cloud before it gets much
bigger.. Re-indexing is an option, but I would rather convert, which is
likely much faster..

Thanks.. 


