Thank you for these useful resources. Please allow me some time to look into them. I’ll let you know as soon as possible!
Thanks
Hank

> On May 10, 2024, at 12:34 PM, Michael Wechner <michael.wech...@wyona.com> wrote:
>
> Also, we might want to consider how this relates to
>
> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/Rescorer.html
>
> In vector search, reranking has become quite popular, e.g.
>
> https://docs.cohere.com/docs/reranking
>
> IIUC LangChain (Python), for example, adds the reranker as an argument to the
> searcher/retriever
>
> https://python.langchain.com/v0.1/docs/integrations/retrievers/cohere-reranker/
>
> So maybe the following might make sense as well
>
> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
> TopDocs topDocsVector = vectorSearcher.search(query, 50, new CohereReranker());
>
> TopDocs topDocs = TopDocs.merge(new RRFRanker(), topDocsKeyword, topDocsVector);
>
> WDYT?
>
> Thanks
>
> Michael
>
>
> On 10.05.24 at 21:08, Michael Wechner wrote:
>> Great, yes, let's get started :-)
>>
>> What about the following pseudo code, assuming that there might be
>> alternative ranking algorithms to RRF
>>
>> // assumes a separate reader for the vector index
>> StoredFields storedFieldsKeyword = indexReaderKeyword.storedFields();
>> StoredFields storedFieldsVector = indexReaderVector.storedFields();
>>
>> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
>> TopDocs topDocsVector = vectorSearcher.search(vectorQuery, 50);
>>
>> // proposed API: fuse both result lists with a pluggable ranker
>> Ranker ranker = new RRFRanker();
>> TopDocs topDocs = TopDocs.rank(ranker, topDocsKeyword, topDocsVector);
>>
>> for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
>>     Document docK = storedFieldsKeyword.document(scoreDoc.doc);
>>     Document docV = storedFieldsVector.document(scoreDoc.doc);
>>     ....
>> }
>>
>> See also
>>
>> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/TopDocs.html
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html
>>
>> WDYT?
>>
>> Thanks
>>
>> Michael
>>
>>
>> On 10.05.24 at 20:01, Chang Hank wrote:
>>> Hi Michael,
>>>
>>> Sounds good to me.
>>> Let’s do it!!
>>>
>>> Cheers,
>>> Hank
>>>
>>>> On May 10, 2024, at 10:50 AM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>
>>>> Hi Hank
>>>>
>>>> Very cool!
>>>>
>>>> Adrien Grand suggested implementing it as a utility method on the TopDocs
>>>> class, and since Adrien has worked on Lucene for a decade
>>>> https://www.elastic.co/de/blog/author/adrien-grand
>>>> I guess it makes sense to follow his advice :-)
>>>>
>>>> We could create a PR and work together on it, WDYT?
>>>>
>>>> All the best
>>>>
>>>> Michael
>>>>
>>>> On 10.05.24 at 18:51, Chang Hank wrote:
>>>>> Hi Michael,
>>>>>
>>>>> Thank you for the reply.
>>>>> This is a really cool issue to work on, and I’m happy to work on it with
>>>>> you. I’ll try to do some research on RRF first.
>>>>> Also, are we going to implement this on the TopDocs class?
>>>>>
>>>>> Best,
>>>>> Hank
>>>>>
>>>>>
>>>>>> On May 9, 2024, at 11:08 PM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>>>
>>>>>> Hi Hank
>>>>>>
>>>>>> Thanks for offering your help!
>>>>>>
>>>>>> I recently suggested implementing RRF (Reciprocal Rank Fusion)
>>>>>>
>>>>>> https://lists.apache.org/thread/vvwvjl0gk67okn8z1wg33ogyf9qm07sz
>>>>>>
>>>>>> but still have not found the time to really work on this.
>>>>>>
>>>>>> Maybe you would be interested in doing this, or we could work on it
>>>>>> together somehow?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> On 10.05.24 at 07:27, Chang Hank wrote:
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I’m Hank Chang, currently studying Information Retrieval topics. I’m
>>>>>>> really interested in contributing to Apache Lucene and enhancing my
>>>>>>> understanding of the field.
>>>>>>> I’ve reviewed several issues posted on the GitHub repository but
>>>>>>> haven’t found a straightforward starting point. Could someone please
>>>>>>> recommend suitable issues for a newcomer like me or suggest areas I
>>>>>>> could assist with?
>>>>>>>
>>>>>>> Thank you for your time and guidance.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Hank Chang
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
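
For reference while reading up on RRF: the fusion step that the quoted pseudo code delegates to an RRFRanker could look roughly like the sketch below. It is a minimal, hypothetical illustration in plain Java and not Lucene API. The class name RrfSketch, the method fuse, and the use of bare integer doc IDs instead of TopDocs are assumptions made for the example. Each document is scored as the sum of 1 / (k + rank) over the lists it appears in, with ranks starting at 1. The example uses k = 60, the constant commonly used for RRF. The sketch also assumes that a doc ID appears at most once per list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Lucene: fuses ranked doc-ID lists with RRF. */
public class RrfSketch {

    /**
     * Returns doc IDs ordered by descending RRF score, where
     * score(d) = sum over all lists containing d of 1 / (k + rank of d in that list)
     * and ranks start at 1. Assumes each list contains a doc ID at most once.
     */
    public static List<Integer> fuse(int k, List<List<Integer>> rankings) {
        // LinkedHashMap keeps first-seen order, which makes tie-breaking deterministic.
        Map<Integer, Double> scores = new LinkedHashMap<>();
        for (List<Integer> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                int rank = i + 1;
                scores.merge(ranking.get(i), 1.0 / (k + rank), Double::sum);
            }
        }
        List<Map.Entry<Integer, Double>> entries = new ArrayList<>(scores.entrySet());
        // List.sort is stable, so documents with equal scores keep first-seen order.
        entries.sort(Map.Entry.<Integer, Double>comparingByValue(Comparator.reverseOrder()));
        List<Integer> fused = new ArrayList<>();
        for (Map.Entry<Integer, Double> entry : entries) {
            fused.add(entry.getKey());
        }
        return fused;
    }

    public static void main(String[] args) {
        List<Integer> keywordHits = List.of(1, 2, 3); // best keyword hit first
        List<Integer> vectorHits = List.of(3, 1, 4);  // best vector hit first
        System.out.println(fuse(60, List.of(keywordHits, vectorHits))); // prints [1, 3, 2, 4]
    }
}
```

Because only ranks enter the formula, the keyword and vector scores never need to be normalized against each other, which is presumably why RRF fits the TopDocs-level utility discussed in the thread. A real implementation would still have to decide how to identify the same document across two searchers.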