Hi Doug,

Your blog write-up on relevancy is very interesting; I didn't know this.
Looks like I have to go back to the drawing board and figure out an
alternative solution: somehow get the data from those group-based fields
into a single field using copyField.

Thanks

Steve

On Wed, May 20, 2015 at 11:17 AM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Steven,
>
> I'd be concerned about your relevance with that many qf fields. Dismax
> takes a "winner takes all" point of view to search. Field scores can vary
> by an order of magnitude (or even two) despite the attempts of query
> normalization. You can read more here:
>
> http://opensourceconnections.com/blog/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/
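
A minimal sketch of the "winner takes all" behaviour described above,
assuming the usual disjunction-max combination: the best field's score
plus a tie-breaker fraction of the remaining fields' scores. The field
names and scores are made up for illustration; real Lucene scores depend
on each field's term statistics.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores the way a disjunction-max query does:
    the best field's score plus `tie` times the sum of the other fields."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for a single document (assumed values).
scores = {"title": 0.4, "body": 0.3, "sku_notes": 6.0}

# With tie=0.0 only the highest-scoring field counts.
winner_only = dismax_score(list(scores.values()))

# With a small tie-breaker the other fields contribute a little.
with_tie = dismax_score(list(scores.values()), tie=0.1)
```

With scores this far apart, the one high-scoring field decides the
ranking, and the more fields are listed in qf, the more likely it is that
some obscure field produces such a score.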
>
> I'm about to win the "blasphemer" merit badge, but ad-hoc, all-field-like
> searching over many fields is actually a good use case for Elasticsearch's
> cross field queries.
>
> https://www.elastic.co/guide/en/elasticsearch/guide/master/_cross_fields_queries.html
>
> http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/
>
> It wouldn't be hard (and it would actually be a great feature for the
> project) to get the Lucene query associated with cross field search into
> Solr. You could easily write a plugin to integrate it into a query parser:
>
> https://github.com/elastic/elasticsearch/blob/master/src/main/java/org/apache/lucene/queries/BlendedTermQuery.java
>
> Hope that helps
> -Doug
> --
> Doug Turnbull | Search Relevance Consultant | OpenSource Connections,
> LLC | 240.476.9983 | http://www.opensourceconnections.com
> Author: Relevant Search <http://manning.com/turnbull> from Manning
> Publications
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
> On Wed, May 20, 2015 at 8:27 AM, Steven White <swhite4...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > My solution requires that users in group-A can only search against a
> > set of fields-A, users in group-B can only search against a set of
> > fields-B, etc.  There can be several groups, as many as 100 or even
> > more.  To meet this need, I build my search by passing in the list of
> > fields via "qf".  What goes into "qf" can be large: as many as 1500
> > fields, and each field name averages 15 characters, so in effect the
> > data passed via "qf" will be over 20K characters.
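
As a rough check of that size estimate, here is a small sketch that
builds a space-separated qf value from 1500 field names of 15 characters
each. The field names are hypothetical placeholders, and the count
ignores any URL encoding or per-field boosts.

```python
# Hypothetical field names, each exactly 15 characters long
# ("field_" is 6 characters, followed by a 9-digit counter).
field_names = [f"field_{i:09d}" for i in range(1500)]

# qf is a whitespace-separated list of field names.
qf = " ".join(field_names)

# 1500 * 15 name characters plus 1499 separating spaces.
qf_length = len(qf)
```

Under these assumptions the qf value comes to just under 24,000
characters, which is consistent with the "over 20K" figure.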
> >
> > Given the above, besides the fact that a search for "apple" translates
> > to 20K characters passing over the network, what else within Solr and
> > Lucene should I be worried about, if anything?  Will I hit some kind
> > of limit?  Will each search now require more CPU cycles?  Memory?
> > Etc.
> >
> > If the network traffic becomes an issue, my alternative solution is to
> > create a /select handler for each group and in that handler list the
> > fields under "qf".
> >
> > I have considered creating a pseudo-field for each group and then
> > using copyField to copy into it.  During search, I can then "qf"
> > against that one field.  Unfortunately, this is not ideal for my
> > solution because the fields that go into each group change
> > dynamically (at least once a month), and when they do change, I have
> > to re-index everything (which I have to avoid) to sync that group
> > field.
> >
> > I'm using "qf" with edismax and my Solr version is 5.1.
> >
> > Thanks
> >
> > Steve
> >
>
