Hi all, reviving this thread.
For those of you who use an external file for your suggestions, how do you
decide from your query logs what suggestions to include? Just starting out with
some exploratory analysis of clicks, dwell times, etc., and would love to hear
any advice from the community.
Oh, great! Thank you, this is helpful!
On 1/24/20, 6:43 PM, "Walter Underwood" wrote:
Click-based weights are vulnerable to spamming. Some of us fondly remember when
Google was showing Microsoft as the first hit for “evil empire” thanks to a
click attack.
For our ecommerce
Click-based weights are vulnerable to spamming. Some of us fondly remember when
Google was showing Microsoft as the first hit for “evil empire” thanks to a
click attack.
For our ecommerce search, we use the actual titles of books weighted by order
volume.
Decorated titles are reduced to a base
David,
True! But we are hoping that these are purely seen as suggestions and that
people, if they know exactly what they are wanting to type/looking for, will
simply ignore the dropdown options.
On 1/24/20, 10:03 AM, "David Hastings" wrote:
This is a really cool idea! My only concern
Hi Audrey,
As suggested by Erik, you can index the data into a separate collection and
Instead of adding weights in the document, you can also use
LTR (Learning to Rank) within Solr to rerank the documents.
And also to increase relevance within the autosuggestion and making
This is a really cool idea! My only concern is that the edge-case
searches, where a user knows exactly what they want to find, would be
autocompleted into something that happens to be more "successful" rather
than what they were looking for. For example, I want to know the legal
implications of
Hi Audrey,
As suggested by Erik, you can index the data into a separate collection and
Instead of adding weights in the document, you can also use LTR
within Solr to rerank on the features.
Regards,
Lucky Sharma
On Fri, 24 Jan, 2020, 8:01 pm Audrey Lorberfeld - audrey.lorberf...@ibm.com,
Hi Alessandro,
I'm so happy there is someone who's done extensive work with QAC here!
Right now, we measure nDCG via a Dynamic Bayesian Network. To break it down,
we:
- use a DBN model to generate a "score" for each query_url pair.
- We then plug that score into a mathematical formula we
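The formula in the message above is cut off, so the following is only a sketch of the standard textbook nDCG@k, not necessarily the exact formula used there. It assumes each result for a query already has a numeric relevance score (for example, one produced by a click model such as a DBN); the function names are made up for illustration.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_k(gains, k):
    """nDCG@k for one query.

    `gains` holds the relevance scores of the results in the order they
    were shown. The ideal ordering is the same scores sorted descending.
    """
    ideal = dcg(sorted(gains, reverse=True)[:k])
    if ideal <= 0:
        return 0.0  # no relevant results at all, so define nDCG as 0
    return dcg(gains[:k]) / ideal
```

A ranking that is already in the best possible order scores 1.0, and pushing a relevant result further down lowers the value.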
Erik,
Thank you! Yes, that's exactly how we were thinking of architecting it. And our
ML engineer suggested something else for the suggestion weights, actually -- to
build a model that would programmatically update the weights based on those
suggestions' live clicks @ position k, etc. Pretty
It's a great idea. And then index that file into a separate lean collection
of just the suggestions, along with the weight as another field on those
documents, to use for ranking them at query time with standard /select queries.
(this separate suggest collection would also have appropriate
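As a rough sketch of what querying such a lean suggestion collection might look like, the snippet below builds a standard /select URL that prefix-matches the typed text and sorts by the weight field. The collection name `suggestions`, the field names `suggestion` and `weight`, and the base URL are all hypothetical, and the prefix match assumes the field is indexed as a lowercased string; your schema and analysis chain may well call for something different.

```python
from urllib.parse import urlencode

def build_suggest_url(base_url, collection, prefix, rows=5):
    """Build a /select URL that returns the top-weighted suggestions for a prefix.

    Assumes a hypothetical schema with a lowercased string field `suggestion`
    and a numeric field `weight` on every suggestion document.
    """
    params = {
        # Solr's prefix query parser: matches values that start with the typed text.
        "q": "{!prefix f=suggestion}" + prefix.strip().lower(),
        # Rank purely by the precomputed weight stored on each document.
        "sort": "weight desc",
        "fl": "suggestion,weight",
        "rows": str(rows),
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"
```

For example, `build_suggest_url("http://localhost:8983/solr", "suggestions", "evil em")` produces a URL you could fetch with any HTTP client on each keystroke.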
I have been working extensively on query autocompletion, these blogs should
be helpful to you:
https://sease.io/2015/07/solr-you-complete-me.html
https://sease.io/2018/06/apache-lucene-blendedinfixsuggester-how-it-works-bugs-and-improvements.html
Your idea of using search quality evaluation to
Not a bad idea at all. However, I've never used an external file before, just
a field in the index, so it's not an area I'm familiar with.
On Mon, Jan 20, 2020 at 11:55 AM Audrey Lorberfeld -
audrey.lorberf...@ibm.com wrote:
> David,
>
> Thank you, that is useful. So, would you recommend using a (clean)
David,
Thank you, that is useful. So, would you recommend using a (clean) field over
an external dictionary file? We have lots of "top queries" and measure their
nDCG. A thought was to programmatically generate an external file where the
weight per query term (or phrase) == its nDCG. Bad
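A minimal sketch of generating such a file, assuming you already have a mapping from query text to nDCG. It writes one tab-separated "suggestion, weight" line per query, which is the layout Solr's file-based dictionary reads by default. One assumption worth checking against your Solr version: as far as I know, suggester weights are handled as whole numbers, so a raw nDCG between 0 and 1 could all collapse to 0. The sketch therefore scales by an arbitrary factor of 1000. The function name and the normalization rules are illustrative only.

```python
def write_suggest_dictionary(ndcg_by_query, out, scale=1000):
    """Write 'suggestion<TAB>weight' lines, highest weight first.

    `ndcg_by_query` maps raw query strings to nDCG values in [0, 1].
    `out` is any writable text file object. Returns the rows written.
    """
    best = {}
    for query, ndcg in ndcg_by_query.items():
        # Normalize whitespace and case so near-duplicate queries merge.
        text = " ".join(query.split()).lower()
        if not text:
            continue  # skip empty or whitespace-only queries
        clamped = min(max(float(ndcg), 0.0), 1.0)
        weight = int(round(clamped * scale))  # integer weight (assumed requirement)
        best[text] = max(best.get(text, 0), weight)
    rows = sorted(best.items(), key=lambda r: (-r[1], r[0]))
    for text, weight in rows:
        out.write(f"{text}\t{weight}\n")
    return rows
```

Usage would be along the lines of `with open("suggestions.txt", "w", encoding="utf-8") as f: write_suggest_dictionary(scores, f)`, rerun whenever the nDCG numbers are refreshed.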
I've used this quite a bit. My biggest piece of advice is to choose a field
that you know is clean, with well-defined terms/words. You don't want an
autocomplete that has a massive dictionary; it will also make the
start/reload times pretty slow.
On Mon, Jan 20, 2020 at 11:47 AM Audrey Lorberfeld -