Oh, great! Thank you, this is helpful!

On 1/24/20, 6:43 PM, "Walter Underwood" <wun...@wunderwood.org> wrote:

Click-based weights are vulnerable to spamming. Some of us fondly remember when Google was showing Microsoft as the first hit for “evil empire” thanks to a click attack.

For our ecommerce search, we use the actual titles of books weighted by order volume. Decorated titles are reduced to a base title, so “Managerial Accounting: Student Value Edition” becomes just “Managerial Accounting”. Showing all the variations is the job of the real results page.

wunder
Walter Underwood
wun...@wunderwood.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__observer.wunderwood.org_&d=DwIFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M&m=3oEhRJWEHDoz3HXt87Y_FXxPTUZg1zSA5r4P6urviug&s=87IOY_vKNONtR2r2IkW-NnZ4Rn3wI-OIO6RSdqdOMfU&e= (my blog)

> On Jan 24, 2020, at 7:07 AM, Lucky Sharma <goku0...@gmail.com> wrote:
>
> Hi Audrey,
> As suggested by Erik, you can index the data into a separate collection.
> Instead of adding weights in the document, you can also use
> LTR (Learning to Rank) within Solr to rerank the documents.
> To improve relevance within the autosuggestion and capture the
> positional context of the user for multi-token keywords, you can also
> use bigrams/trigrams to generate edge n-grams.
>
> Regards,
> Lucky Sharma
>
> On Fri, 24 Jan, 2020, 8:28 pm Lucky Sharma, <goku0...@gmail.com> wrote:
>
>> Hi Audrey,
>> As suggested by Erik, you can index the data into a separate collection.
>> Instead of adding weights in the document, you can also use LTR
>> within Solr to rerank on the features.
>>
>> Regards,
>> Lucky Sharma
>>
>> On Fri, 24 Jan, 2020, 8:01 pm Audrey Lorberfeld -
>> audrey.lorberf...@ibm.com, <audrey.lorberf...@ibm.com> wrote:
>>
>>> Erik,
>>>
>>> Thank you! Yes, that's exactly how we were thinking of architecting it.
>>> And our ML engineer suggested something else for the suggestion weights,
>>> actually -- to build a model that would programmatically update the weights
>>> based on those suggestions' live clicks @ position k, etc. Pretty cool
>>> idea...
>>>
>>> On 1/23/20, 2:26 PM, "Erik Hatcher" <erik.hatc...@gmail.com> wrote:
>>>
>>> It's a great idea. And then index that file into a separate lean
>>> collection of just the suggestions, along with the weight as another field
>>> on those documents, to use for ranking them at query time with standard
>>> /select queries. (this separate suggest collection would also have
>>> appropriate tokenization to match the partial words as the user types, like
>>> ngramming)
>>>
>>> Erik
>>>
>>>> On Jan 20, 2020, at 11:54 AM, Audrey Lorberfeld -
>>>> audrey.lorberf...@ibm.com <audrey.lorberf...@ibm.com> wrote:
>>>>
>>>> David,
>>>>
>>>> Thank you, that is useful. So, would you recommend using a (clean)
>>>> field over an external dictionary file? We have lots of "top queries" and
>>>> measure their nDCG. A thought was to programmatically generate an external
>>>> file where the weight per query term (or phrase) == its nDCG. Bad idea?
>>>>
>>>> Best,
>>>> Audrey
>>>>
>>>> On 1/20/20, 11:51 AM, "David Hastings" <hastings.recurs...@gmail.com> wrote:
>>>>
>>>> I've used this quite a bit. My biggest piece of advice is to choose a field
>>>> that you know is clean, with well-defined terms/words. You don't want an
>>>> autocomplete that has a massive dictionary; it will also make the
>>>> start/reload times pretty slow.
>>>>
>>>> On Mon, Jan 20, 2020 at 11:47 AM Audrey Lorberfeld -
>>>> audrey.lorberf...@ibm.com <audrey.lorberf...@ibm.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> We plan to incorporate a query autocomplete functionality into our search
>>>>> engine (like this:
>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_solr_guide_8-5F1_suggester.html&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M&m=L8V-izaMW_v4j-1zvfiXSqm6aAoaRtk-VJXA6okBs_U&s=vnE9KGyF3jky9fSi22XUJEEbKLM1CA7mWAKrl2qhKC0&e=
>>>>> ). And I was wondering if anyone has personal experience with this
>>>>> component and would like to share? Basically, we are just looking for some
>>>>> best practices from more experienced Solr admins so that we have a starting
>>>>> place to launch this in our beta.
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best,
>>>>> Audrey
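
For anyone reading this thread later, here is a rough sketch of the approach Walter describes at the top: real titles reduced to a base title and weighted by order volume. Several things here are assumptions and not anything confirmed in the thread: the "cut at the first colon" rule is only a stand-in heuristic for however decorated titles are actually reduced, the sample orders are made up, and the function names are hypothetical. The tab-separated "suggestion, weight" output is meant to resemble what a file-based suggester dictionary might accept; check the Solr Suggester documentation for the exact format before relying on it.

```python
from collections import defaultdict


def base_title(title: str) -> str:
    """Reduce a decorated title to a base title.

    Assumption: the decoration follows the first colon. A real catalog
    would likely need a richer set of rules.
    """
    return title.split(":", 1)[0].strip()


def build_suggestions(orders):
    """Aggregate order counts per base title.

    orders: iterable of (title, order_count) pairs (hypothetical input shape).
    Returns (base_title, total_orders) pairs, highest weight first,
    with ties broken alphabetically.
    """
    weights = defaultdict(int)
    for title, count in orders:
        weights[base_title(title)] += count
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))


def to_dictionary_lines(suggestions):
    """Format each suggestion as 'text<TAB>weight' (assumed file layout)."""
    return [f"{text}\t{weight}" for text, weight in suggestions]


# Made-up sample data, for illustration only.
orders = [
    ("Managerial Accounting: Student Value Edition", 120),
    ("Managerial Accounting", 300),
    ("Financial Accounting: Global Edition", 80),
]

suggestions = build_suggestions(orders)
for line in to_dictionary_lines(suggestions):
    print(line)
```

Because the weights come from orders and not from clicks on the suggestions themselves, this kind of weighting is harder to game than the click-based scheme discussed above, which is the point Walter raises.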