Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer
Hi Experts,

We are migrating our entire search platform from Sphinx to Solr and want to do this without any flaws, so any suggestions would be greatly appreciated.

Thanks!

On Fri, Sep 6, 2019 at 11:13 AM Doss wrote:
> Dear Experts,
> [...]
Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer
Jörn,

Thanks for the input, I learned something new today!

https://cwiki.apache.org/confluence/display/solr/BloomIndexComponent

This works at the segment level, but our requirement is at the document level.

Thanks,
Mohandoss.

On Fri, Sep 6, 2019 at 11:41 AM Jörn Franke wrote:
> I am not 100% sure if Solr has something out of the box, but you could
> implement a bloom filter https://en.wikipedia.org/wiki/Bloom_filter and
> store it in Solr.
> [...]
Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer
I am not 100% sure whether Solr has something out of the box, but you could implement a Bloom filter (https://en.wikipedia.org/wiki/Bloom_filter) and store it in Solr. It is a probabilistic data structure that does not grow as entries are added, yet it can cover your use case.

However, it has a caveat: in your case, it can only say for sure that person A has NOT visited person B. If you want to know whether person A has visited person B, there may be false positives (with a known probability).

Nevertheless, it still seems to address your use case, as you want to show only profiles that have not been visited.

> On 06.09.2019 at 07:43, Doss wrote:
>
> Dear Experts,
> [...]
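To make the caveat concrete, here is a minimal sketch of a Bloom filter in Python. It is only an illustration, not something Solr provides: the class name, the bit-array size, the number of hash functions, and the double-hashing scheme over SHA-256 are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=5):
        # Sizes are illustrative assumptions; real values depend on the
        # expected number of viewers and the acceptable false-positive rate.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely never added"; True means "probably added".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# One filter per profile B, holding the ids of members who viewed B.
viewers_of_b = BloomFilter()
viewers_of_b.add("123456")
print(viewers_of_b.might_contain("123456"))  # an added id is always reported
```

The filter never grows past `num_bits`, whatever the number of viewers. The cost is that an occasional unvisited profile may be reported as visited and therefore hidden from the search results.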
Suggestion Needed: Exclude documents that are already served / viewed by a customer
Dear Experts,

For a matchmaking portal, we have a requirement: if a customer has viewed the complete details of a bride or groom, we have to exclude that profile id from further search results. Currently, along with other details, we store the viewers' profile ids in a multivalued field against that bride's or groom's details.

E.g., if A viewed B, then in B's document we add A's id to the field saw_me.

While searching, if the searching member's id is, say, 123456, we fire a query like:

fq=-saw_me:(123456)

Problem #1: The saw_me field value is growing very quickly.
Problem #2: Removing ids that have been deleted from the base. Right now we do this as follows:

Query #1: fq=saw_me:(123456)=DocId   // get all document ids that have the deleted id in the saw_me field
Query #2: {"DocId":"234567","saw_me":{"remove":"123456"}}   // loop through the results of the first query and fire the update query one by one

We feel this way of handling it is not optimal, so we need expert advice. Please guide.
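As a possible refinement of Problem #2, Solr's JSON update handler accepts an array of documents, so the atomic `remove` operations from Query #2 could be collected into a single request body instead of being fired one by one. The sketch below only builds that body in Python. The helper names are hypothetical, `DocId` is assumed to be the unique key as in the queries above, and actually sending the body (an HTTP POST to the collection's update handler) is left out.

```python
import json


def build_remove_updates(doc_ids, deleted_id, key_field="DocId", viewed_field="saw_me"):
    """Build one atomic-update document per matching profile.

    doc_ids    -- ids returned by Query #1 (documents whose saw_me holds deleted_id)
    deleted_id -- the member id that was deleted from the base
    key_field and viewed_field are taken from the post; adjust to the real schema.
    """
    return [
        {key_field: doc_id, viewed_field: {"remove": deleted_id}}
        for doc_id in doc_ids
    ]


def to_update_body(updates):
    """Serialize the update documents as a single JSON array for one request."""
    return json.dumps(updates)


# Example: ids that Query #1 might have returned (made-up values).
matching_ids = ["234567", "345678"]
body = to_update_body(build_remove_updates(matching_ids, "123456"))
print(body)
```

For very large result sets, the list of ids would probably need to be split into chunks so that each request stays a reasonable size.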