Thank you for these useful resources. Please allow me some time to look 
into them. 
I’ll let you know ASAP!

Thanks

Hank

> On May 10, 2024, at 12:34 PM, Michael Wechner <michael.wech...@wyona.com> 
> wrote:
> 
> also we might want to consider how this relates to
> 
> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/Rescorer.html
> 
> In vector search, reranking has become quite popular, e.g.
> 
> https://docs.cohere.com/docs/reranking
> 
> IIUC LangChain (Python), for example, adds the reranker as an argument to the 
> searcher/retriever
> 
> https://python.langchain.com/v0.1/docs/integrations/retrievers/cohere-reranker/
> 
> So maybe the following might make sense as well
> 
> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
> TopDocs topDocsVector = vectorSearcher.search(query, 50, new CohereReranker());
> 
> TopDocs topDocs = TopDocs.merge(new RRFRanker(), topDocsKeyword, topDocsVector);
> 
> WDYT?
> 
> Thanks
> 
> Michael
> 
> 
> On 10.05.24 at 21:08, Michael Wechner wrote:
>> great, yes, let's get started :-)
>> 
>> What about the following pseudo code, assuming that there might be 
>> alternative ranking algorithms to RRF
>> 
>> StoredFields storedFieldsKeyword = indexReaderKeyword.storedFields();
>> // indexReaderVector: assumed name for the vector index reader
>> StoredFields storedFieldsVector = indexReaderVector.storedFields();
>> 
>> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
>> TopDocs topDocsVector = vectorSearcher.search(vectorQuery, 50);
>> 
>> Ranker ranker = new RRFRanker();
>> TopDocs topDocs = TopDocs.rank(ranker, topDocsKeyword, topDocsVector);
>> 
>> for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
>>     Document docK = storedFieldsKeyword.document(scoreDoc.doc);
>>     Document docV = storedFieldsVector.document(scoreDoc.doc);
>>     ....
>> } 
>> 
>> See also
>> 
>> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/TopDocs.html
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html
>> 
>> WDYT?
>> 
>> Thanks
>> 
>> Michael
>> 
>> 
>> 
>> 
>> On 10.05.24 at 20:01, Chang Hank wrote:
>>> Hi Michael,
>>> 
>>> Sounds good to me. 
>>> Let’s do it!!
>>> 
>>> Cheers,
>>> Hank
>>> 
>>>> On May 10, 2024, at 10:50 AM, Michael Wechner <michael.wech...@wyona.com> 
>>>> wrote:
>>>> 
>>>> Hi Hank
>>>> 
>>>> Very cool!
>>>> 
>>>> Adrien Grand suggested implementing it as a utility method on the TopDocs 
>>>> class, and since Adrien worked on Lucene for a decade
>>>> https://www.elastic.co/de/blog/author/adrien-grand
>>>> I guess it makes sense to follow his advice :-)
>>>> 
>>>> We could create a PR and work together on it, WDYT?
>>>> 
>>>> All the best
>>>> 
>>>> Michael
>>>> 
>>>> On 10.05.24 at 18:51, Chang Hank wrote:
>>>>> Hi Michael, 
>>>>> 
>>>>> Thank you for the reply.
>>>>> This is a really cool issue to work on, and I’m happy to work on it with 
>>>>> you. I’ll research RRF first.
>>>>> Also, are we going to implement this on the TopDocs class?
>>>>> 
>>>>> Best,
>>>>> Hank
>>>>> 
>>>>> 
>>>>>> On May 9, 2024, at 11:08 PM, Michael Wechner <michael.wech...@wyona.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi Hank
>>>>>> 
>>>>>> Thanks for offering your help!
>>>>>> 
>>>>>> I recently suggested implementing RRF (Reciprocal Rank Fusion)
>>>>>> 
>>>>>> https://lists.apache.org/thread/vvwvjl0gk67okn8z1wg33ogyf9qm07sz
>>>>>> 
>>>>>> but still have not found the time to really work on this.
>>>>>> 
>>>>>> Maybe you would be interested in doing this, or we could work on it 
>>>>>> together somehow?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 10.05.24 at 07:27, Chang Hank wrote:
>>>>>>> Hi everyone,
>>>>>>> 
>>>>>>> I’m Hank Chang, and I am currently studying Information Retrieval. I’m 
>>>>>>> really interested in contributing to Apache Lucene and deepening my 
>>>>>>> understanding of the field.
>>>>>>> I’ve reviewed several issues posted on the GitHub repository but 
>>>>>>> haven’t found a straightforward starting point. Could someone please 
>>>>>>> recommend suitable issues for a newcomer like me or suggest areas I 
>>>>>>> could assist with?
>>>>>>> 
>>>>>>> Thank you for your time and guidance.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Hank Chang
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org 
>>>>>>> <mailto:dev-unsubscr...@lucene.apache.org>
>>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org 
>>>>>>> <mailto:dev-h...@lucene.apache.org>
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
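For reference, the RRF merge that the quoted pseudo code calls TopDocs.rank / TopDocs.merge could look roughly like the following. This is only a minimal sketch in plain Java, not Lucene code: the class RrfSketch and its fuse method are hypothetical names, each ranked list is reduced to doc IDs in rank order, and k = 60 is the constant commonly used for RRF rather than anything decided in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Lucene): fuses ranked doc ID lists with Reciprocal Rank Fusion. */
final class RrfSketch {

    private RrfSketch() {}

    /**
     * Computes score(d) = sum over all lists containing d of 1 / (k + rank),
     * where rank is the 1-based position of d in that list.
     *
     * @param k           rank constant that dampens the weight of top ranks (assumed: 60)
     * @param topN        maximum number of fused hits to return
     * @param rankedLists doc IDs per result list, best hit first
     * @return (doc ID, fused score) entries, best first
     */
    static List<Map.Entry<Integer, Double>> fuse(int k, int topN, List<List<Integer>> rankedLists) {
        Map<Integer, Double> scores = new HashMap<>();
        for (List<Integer> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                // i is 0-based, so the 1-based rank is i + 1
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<Map.Entry<Integer, Double>> fused = new ArrayList<>(scores.entrySet());
        // Highest fused score first; ties broken by doc ID so the order is deterministic
        fused.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : Integer.compare(a.getKey(), b.getKey());
        });
        return fused.subList(0, Math.min(topN, fused.size()));
    }
}
```

With fuse(60, 3, List.of(List.of(1, 2, 3), List.of(3, 1, 4))), doc 1 comes out first because it sits near the top of both lists, followed by doc 3 and then doc 2. Only ranks are used, so the keyword and vector scores do not need to be on a comparable scale.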
