Re: Solr Max Query length

2016-04-23 Thread Erick Erickson
I do have to ask how you're generating 4k queries. I've certainly seen situations where this is a good thing, but that is a ginormous query. Best, Erick On Sat, Apr 23, 2016 at 6:55 AM, Kelly, Frank wrote: > Yes switching to the POST saved me from having to change Jetty settings > > Thanks for

Re: need help with keyword spamming

2016-04-23 Thread Erick Erickson
The problem here is defining "irrelevant". There's nothing in Solr that magically can determine "this term is irrelevant in this doc, but this other one isn't". Best, Erick On Sat, Apr 23, 2016 at 11:08 AM, GW wrote: > No. My project is retail based. I mean people putting in a slew of > irreleva

Re: Replicas for same shard not in sync

2016-04-23 Thread jimi.hullegard
Hi, An extra tip, on top of everything that Erick said: Add an extra field to all documents that contains the date the document was indexed. That way, you can always compare the Solr documents on different machines and quickly see what "version" exists on each machine. And you don't have to

Re: need help with keyword spamming

2016-04-23 Thread GW
No. My project is retail based. I mean people putting in a slew of irrelevant keywords in addition to relevant keywords in an attempt to get hits on searches and hits outside of context. I used a filter factory to remove duplicates. On 23 April 2016 at 11:30, Doug Turnbull < dturnb...@opensourcec

Re: need help with keyword spamming

2016-04-23 Thread Doug Turnbull
By keyword spamming, do you mean stuffing the same term over and over to game term frequency? If so, you might want to try tuning BM25 similarity for your needs. It has a saturation point for term frequency. http://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-releva
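
To illustrate the saturation point mentioned above, here is a minimal Python sketch of the term-frequency part of the standard BM25 formula, next to the square-root damping used by Lucene's classic TF-IDF similarity. It is not Solr code: the function names are made up for this example, and the defaults k1=1.2 and b=0.75 are the commonly used BM25 values, assumed here rather than taken from the thread.

```python
import math


def bm25_tf_component(tf, k1=1.2, b=0.75, doc_len=100.0, avg_doc_len=100.0):
    """Term-frequency part of BM25 (IDF omitted).

    k1 controls how quickly repeated terms stop adding score;
    b controls how strongly document length is normalized.
    All parameter values here are illustrative assumptions.
    """
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + norm)


def classic_tf_component(tf):
    """Square-root damping, as in classic TF-IDF scoring (grows without bound)."""
    return math.sqrt(tf)


if __name__ == "__main__":
    # Repeating a term more often keeps raising the classic score,
    # while the BM25 component approaches its ceiling of k1 + 1.
    for tf in (1, 5, 25, 125, 625):
        print(tf, round(bm25_tf_component(tf), 3), round(classic_tf_component(tf), 3))
```

Lowering k1 makes the curve flatten sooner, so a term stuffed fifty times scores barely higher than one used a handful of times.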

need help with keyword spamming

2016-04-23 Thread GW
Hey all, I'm just finishing up a project and I'm hoping for some direction on dealing with keyword spamming. I don't have any urgent issues. I can foresee some bumps in the road. I'm using a custom spider that pulls inventory data from several dozen sources into a single doc schema. 1 record per

Re: Solr Max Query length

2016-04-23 Thread Kelly, Frank
Yes, switching to the POST saved me from having to change Jetty settings. Thanks for the pointer! -Frank Frank Kelly Principal Software Engineer Predictive Analytics Team (SCBE/HAC/CDA) HERE 5 Wayside Rd, Burlington, MA 01803, USA 42° 29' 7" N 71° 11' 32" W
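
For readers hitting the same limit, the switch to POST described above might look roughly like the following Python sketch. It only builds the request; the host, collection name, and the `build_solr_post` helper are hypothetical, and it assumes the select handler accepts form-encoded parameters in the request body, which keeps the long query out of the URL.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_solr_post(select_url, query, rows=10):
    """Build (but do not send) a POST request carrying the query in the body.

    select_url is a placeholder for your own Solr select handler URL.
    """
    body = urlencode({"q": query, "rows": rows, "wt": "json"}).encode("utf-8")
    return Request(
        select_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # A deliberately long query, similar in size to the 4k queries discussed.
    long_query = " OR ".join(f"id:{i}" for i in range(1000))
    # Hypothetical host and collection; replace with your own.
    req = build_solr_post("http://localhost:8983/solr/mycollection/select", long_query)
    print(req.get_method(), len(req.data), "bytes in body")
    # Sending it would be urllib.request.urlopen(req); omitted since it needs a live server.
```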