[ 
https://issues.apache.org/jira/browse/SOLR-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942908#comment-15942908
 ] 

Alessandro Benedetti commented on SOLR-10359:
---------------------------------------------

Hi [~arafalov],
let me answer your considerations:
?? Logging the search queries and results received (either as count or as 
specific ids). And - maybe - statistics on that. ??

The main intention here was to store "impressions", the results we show to the 
users for each query.
This is a required component when calculating Click Through Rate.
In my opinion, logging queries could even be a third aspect, not strictly 
related to the scope of this Jira issue, but definitely related to the 
component.
Indeed, it could end up in a sibling module, possibly with a different storage 
and data structure, but I agree it is an aspect definitely worth considering; 
I would love to have it as well.

?? The second one seems to be happening well out of Solr control (UI clicks, 
what user selected, etc). ??

Absolutely correct, and actually this is the main point of this new feature: 
allowing Solr to process, out of the box, user interactions with Solr results, 
to better evaluate how the search engine behaves.
So it is actually a way of giving Solr control over events happening outside, 
but strictly related to relevancy measurement.

The main point is to expose an easy REST endpoint out of the box that will 
allow users to post events to Solr and then evaluate their search engine (and 
compare different A/B tests) through different evaluation metrics.
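To make this concrete, an event POST body might look like the following sketch 
(the endpoint path is not defined yet; the field names simply mirror the data 
model proposed in the issue description below):

```json
{
  "query": "solar panels",
  "result_id": "doc-42",
  "result_position": 3,
  "timestamp": "2017-03-27T10:15:00Z",
  "relevancy_rating": 1,
  "test_group": "testB"
}
```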
?? I am not sure if that fits into Solr itself ??
I do believe it could be a really nice addition to Solr. I see a lot of Solr 
users struggling to identify the quality of their search engine and how well 
the system behaves, both in general and for specific queries.
Giving them the ability to evaluate and compare their system with a set of 
established evaluation metrics could be very useful.
This would allow anyone to do it without installing additional software or 
building complicated collections or abstractions that are not easy for 
everyone to get right.
The main scope is to simplify the approach and make it super easy to do out 
of the box.
??  Commercial platforms (such as Fusion) might be integrating it, but they 
control more of a stack ??
This is correct, but as far as I know, while the data collected may be 
similar, the target is completely different.
I think at the moment signal processing in Fusion is used as an additional 
factor for relevancy tuning (basically pushing well-clicked documents up the 
ranking).
The first scope of this new feature will be focused on evaluating relevancy, 
not tuning it.
I agree of course that as soon as the component(s) are in place, this will 
open the door to a big new world, and we could add additional ways of using 
the collected data (LTR training sets, relevancy tuning, etc.).
But at the moment this is out of scope.

Answering [~diegoceccarelli]:
1) UserInteraction is probably a better name than Users Events, I like it!

??how to create a unique search id? (should be the responsibility of Solr? I 
think yes)??
If I get what you are referring to, I do believe Solr should be responsible 
for generating a query Id per user interaction, but also for storing the 
original query "plain", as it will be quite interesting to analyse the 
evaluation metrics on a per-query basis (in a human-readable way). To perform 
the aggregations, I do agree that a clever Id could make them more performant.
Is this what you were referring to as "search Id"?
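As a sketch of the kind of id generation I have in mind (purely illustrative: 
it assumes the id is just a hash of the normalised query text, so identical 
queries aggregate under the same key):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SearchIdSketch {
    /** Derives a deterministic id from the normalised query text, so the
     *  same query always aggregates under the same key, while the plain
     *  query string is stored alongside it for human-readable reports. */
    static String searchId(String query) {
        try {
            String normalised = query.trim().toLowerCase();
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(normalised.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.substring(0, 16); // a short prefix is enough as a key
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always present
        }
    }
}
```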

??if I want to use metric like the CTR (i.e., Click Through Rate, number of 
clicks / number of impressions) in the scoring formula how can I do that 
without joining the two collections? ( (maybe that could be a way to 'import' a 
particular metric into the main collection? )??
A) For a query-time approach, the quickest thing that comes to mind is 
defining a custom function query that accesses the data structure we defined 
to store the user interactions and calculates the aggregation on a 
per-document basis.
This could imply that we need to design the underlying data structure 
differently, as I don't know if running those aggregations on a Lucene index 
will be fast enough (the function query will need to be calculated for every 
document).
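To illustrate, this is roughly the per-document value such a ctr() function 
query would have to compute (names are hypothetical; a real implementation 
would plug into Lucene's value source machinery and read from the auxiliary 
store rather than from in-memory maps):

```java
import java.util.Map;

public class CtrFunctionSketch {
    /** The per-document value a hypothetical ctr() function query would
     *  return: clicks / impressions for that document, 0 when the document
     *  was never shown (avoiding a division by zero). */
    static float ctrValue(String docId,
                          Map<String, Integer> clicksByDoc,
                          Map<String, Integer> impressionsByDoc) {
        int impressions = impressionsByDoc.getOrDefault(docId, 0);
        if (impressions == 0) {
            return 0f;
        }
        return clicksByDoc.getOrDefault(docId, 0) / (float) impressions;
    }
}
```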

B) For an indexing-time approach, we could:
1) design an additional data structure in the index, specifically designed to 
map a docId to the metric value. This could potentially be in the docValues 
format with some tweaks.
2) design a daemon (I need to take a deeper look at them, as I have not yet 
used them in Solr, but I saw them mentioned in relation to streaming 
expressions) that runs in the background from time to time and generates this 
data structure.
3) at query time, the scorer could access the data structure and get the 
value (potentially the function query in point A could be used together with 
this approach).
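A rough sketch of points 1-3, with the docValues-like structure modelled as a 
plain float array keyed by internal docId (all names hypothetical; the real 
structure would live in the auxiliary index):

```java
import java.util.Map;

public class PrecomputedMetricSketch {
    // Point 1: docId -> metric value, rebuilt in the background
    // (a stand-in for a docValues-like structure in the auxiliary index).
    private volatile float[] metricByDoc;

    // Point 2: what the periodic background daemon would run, aggregating
    // raw click/impression counts into the per-document structure.
    void recompute(int maxDoc, Map<Integer, Integer> clicks,
                   Map<Integer, Integer> impressions) {
        float[] fresh = new float[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            int shown = impressions.getOrDefault(doc, 0);
            fresh[doc] = shown == 0 ? 0f
                    : clicks.getOrDefault(doc, 0) / (float) shown;
        }
        metricByDoc = fresh; // atomic snapshot swap for concurrent readers
    }

    // Point 3: the constant-time lookup the scorer would perform per document.
    float metric(int doc) {
        return metricByDoc[doc];
    }
}
```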

??how this could work in case of multiple shards??
I would assume that moving to SolrCloud will complicate things.
Probably we should route the interactions the same way we route the 
documents, but then we could have a problem with the impressions (as the 
result set a shard sees will not coincide with what the user finally sees).
So the component that addresses impression collection should only collect the 
aggregated impressions (as shown to the users) and route the impression data 
to the correct shard as well.
Each shard will then store the data structure locally in a storeDir.
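As a toy illustration of the routing idea (SolrCloud's actual compositeId 
routing is more sophisticated; this only shows that an impression record and 
the document it refers to must share the same routing key, so they land on 
the same shard):

```java
public class ImpressionRoutingSketch {
    /** Toy router: send each impression record to the shard that owns
     *  the result document, by hashing the same key the documents use. */
    static int shardFor(String resultDocId, int numShards) {
        // Mask the sign bit instead of Math.abs, which can overflow
        // on Integer.MIN_VALUE.
        return (resultDocId.hashCode() & 0x7fffffff) % numShards;
    }
}
```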

??it should be easy to implement complex metrics that are computed from simple 
metrics, some examples: 1. the click through rate: for a document, or a 
document and a particular query, collect the number of clicks and divide by the 
number of impressions (ignoring multiple requests from the same user? 2. time 
spent on a document after a query: if a log time of click and time of closure 
of a document, I want to compute how much time the users spent on the document 
3. number of clicks per query.??
I do agree. I would say we should probably focus initially only on point 1, 
but I am super excited about adding the other metrics and data later.
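A minimal sketch of point 1, computing CTR per (test group, query) pair from 
a flat list of logged events (names hypothetical; following the proposed data 
model, an impression carries relevancy_rating 0 and a click a rating above 0):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CtrPerQuerySketch {
    // One logged event: an impression (relevancy_rating 0 in the proposed
    // data model) or a click (rating > 0), tagged with its A/B test group.
    record Event(String query, String testGroup, boolean isClick) {}

    /** Point 1 above: CTR per (test group, query) =
     *  number of clicks / number of impressions. */
    static Map<String, Float> ctr(List<Event> events) {
        Map<String, int[]> counts = new HashMap<>(); // [impressions, clicks]
        for (Event e : events) {
            int[] c = counts.computeIfAbsent(
                    e.testGroup() + "/" + e.query(), k -> new int[2]);
            if (e.isClick()) c[1]++; else c[0]++;
        }
        Map<String, Float> out = new HashMap<>();
        counts.forEach((key, c) ->
                out.put(key, c[0] == 0 ? 0f : c[1] / (float) c[0]));
        return out;
    }
}
```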

?? with respect to the data model, I would add:
a user-id
a blob containing an optional payload
score of the document ??

+1

Quite happy to discuss it, any additional feedback is welcome :)




> User Events Logger Component
> ----------------------------
>
>                 Key: SOLR-10359
>                 URL: https://issues.apache.org/jira/browse/SOLR-10359
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Alessandro Benedetti
>              Labels: CTR, evaluation
>
> *Introduction*
> Being able to evaluate the quality of your search engine is becoming more and 
> more important day by day.
> This issue is a milestone towards integrating online evaluation metrics 
> with Solr.
> *Scope*
> Scope of this issue is to provide a set of components able to :
> 1) Collect search result impressions ( results shown per query)
> 2) Collect user events ( user interactions on the search results per query, 
> e.g. clicks, bookmarking, etc. )
> 3) Calculate evaluation metrics on demand, such as Click Through Rate, DCG ...
> *Technical Design*
> A SearchComponent can be designed :
> *UsersEventsLoggerComponent*
> A property (such as storeDir) will define where the data collected will be 
> stored.
> Different data structures can be explored, to keep it simple, a first 
> implementation can be a Lucene Index.
> *Data Model*
> The user event can be modelled in the following way :
> <query> - the user query the event is related to
> <result_id> - the ID of the search result involved in the interaction
> <result_position> - the position in the ranking of the search result involved 
> in the interaction
> <timestamp> - time when the interaction happened
> <relevancy_rating> - 0 for impressions, a value between 1-5 to identify the 
> type of user event, the semantic will depend on the domain and use cases
> <test_group> - this can identify a variant, in A/B testing
> *Impressions Logging*
> When the SearchComponent is assigned to a request handler, every time it 
> processes a request and returns a result set for a query to the user, the 
> component will collect the impressions ( results returned) and index them in 
> the auxiliary Lucene index.
> This will happen in parallel, as soon as the results are returned, to avoid 
> affecting query time.
> Of course an impact on CPU load and memory is expected; it will be 
> interesting to minimise it.
> *User Events Logging*
> An UpdateHandler will be exposed to accept POST requests and collect user 
> events.
> Every time a request is sent, the user event will be indexed in the 
> underlying auxiliary Lucene index.
> *Stats Calculation*
> A RequestHandler will be exposed to be able to calculate stats and 
> aggregations for the metrics :
> /evaluation?metric=ctr&stats=query&compare=testA,testB
> This request could calculate the CTR for our testA and testB to compare.
> Showing stats in total and per query ( to highlight the queries with 
> lower/higher CTR).
> The calculations will happen separating the <test_group> for an easy 
> comparison.
> It will be important to keep this as simple as possible for a first 
> version, and then extend it as much as we like.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
