Hi Doug,

Your blog write-up on relevancy is very interesting; I didn't know this. It looks like I have to go back to the drawing board and figure out an alternative solution: somehow get the data from those group-based fields into a single field using copyField.
Thanks,

Steve

On Wed, May 20, 2015 at 11:17 AM, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:

> Steven,
>
> I'd be concerned about your relevance with that many qf fields. Dismax takes a "winner takes all" point of view to search. Field scores can vary by an order of magnitude (or even two) despite the attempts of query normalization. You can read more here:
>
> http://opensourceconnections.com/blog/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/
>
> I'm about to win the "blasphemer" merit badge, but ad-hoc, all-field-like searching over many fields is actually a good use case for Elasticsearch's cross field queries.
>
> https://www.elastic.co/guide/en/elasticsearch/guide/master/_cross_fields_queries.html
>
> http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/
>
> It wouldn't be hard (and actually a great feature for the project) to get the Lucene query associated with cross field search into Solr. You could easily write a plugin to integrate it into a query parser:
>
> https://github.com/elastic/elasticsearch/blob/master/src/main/java/org/apache/lucene/queries/BlendedTermQuery.java
>
> Hope that helps
> -Doug
>
> --
> Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983 | http://www.opensourceconnections.com
> Author: Relevant Search <http://manning.com/turnbull> from Manning Publications
> This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.
>
> On Wed, May 20, 2015 at 8:27 AM, Steven White <swhite4...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > My solution requires that users in group-A can only search against a set of fields-A, users in group-B can only search against a set of fields-B, etc. There can be several groups, as many as 100 or even more. To meet this need, I build my search by passing in the list of fields via "qf". What goes into "qf" can be large: as many as 1500 fields, and each field name averages 15 characters long, so in effect the data passed via "qf" will be over 20K characters.
> >
> > Given the above, besides the fact that a search for "apple" translates to 20K characters passing over the network, what else within Solr and Lucene should I be worried about, if anything? Will I hit some kind of a limit? Will each search now require more CPU cycles? Memory? Etc.
> >
> > If the network traffic becomes an issue, my alternative solution is to create a /select handler for each group and in that handler list the fields under "qf".
> >
> > I have considered creating pseudo-fields for each group and then using copyField into that group. During search, I can then "qf" against that one field. Unfortunately, this is not ideal for my solution because the fields that go into each group change dynamically (at least once a month), and when they do change, I have to re-index everything (this I have to avoid) to sync that group-field.
> >
> > I'm using "qf" with edismax and my Solr version is 5.1.
> >
> > Thanks
> >
> > Steve
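
For reference, the size estimate in the original question can be checked with a small Python sketch. It joins a group's field list into one space-separated "qf" value and form-encodes the edismax parameters so they could be sent in a POST body rather than on the URL. The field names and the helper functions build_qf and build_edismax_body are hypothetical, made up for illustration; only the defType, q and qf parameters come from the thread. This is a rough sketch under those assumptions, not a recommendation to search 1500 fields.

```python
from urllib.parse import urlencode


def build_qf(fields):
    """Join field names into a single space-separated qf value."""
    return " ".join(fields)


def build_edismax_body(query, fields):
    """Return a form-encoded body carrying the edismax parameters."""
    params = {"defType": "edismax", "q": query, "qf": build_qf(fields)}
    return urlencode(params)


# Hypothetical field names: 1500 fields, each exactly 15 characters long.
fields = ["grp_a_fld_%05d" % i for i in range(1500)]

qf = build_qf(fields)
# 1500 names * 15 characters + 1499 separating spaces = 23999 characters.
print(len(qf))

# The encoded body is slightly longer because of the other parameters.
body = build_edismax_body("apple", fields)
print(len(body))
```

The arithmetic matches the "over 20K characters" figure: the field names alone account for 22,500 characters and the separators add the rest.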