Re: Seeking Insights on New Features in Lucene

2024-05-30 Thread Chang Hank
Thanks for this reference. I'm sorry, I just joined this month and hadn’t read 
this announcement before. 
I really appreciate your help!

Thanks,
Hank


> On May 30, 2024, at 8:04 PM, Robert Muir  wrote:
> 
> Check out this thread which lists some:
> https://lists.apache.org/thread/4bhnkkvvodxxgrpj4yqm5yrgj0ppc59r
> 
> On Thu, May 30, 2024 at 10:49 PM Chang Hank  wrote:
>> 
>> Hi all,
>> 
>> I’m curious about the future development of Lucene and would like to know if 
>> there are any planned new features.
>> Could you share some insights into the main focus areas for upcoming 
>> releases? Are there specific features or improvements that the community is 
>> currently working on and that I could help with?
>> 
>> Best regards,
>> Hank
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 





Seeking Insights on New Features in Lucene

2024-05-30 Thread Chang Hank
Hi all, 

I’m curious about the future development of Lucene and would like to know if 
there are any planned new features.
Could you share some insights into the main focus areas for upcoming releases? 
Are there specific features or improvements that the community is currently 
working on and that I could help with?

Best regards,
Hank

Re: Improve testing

2024-05-24 Thread Chang Hank
After I added a new test case, ./gradlew check failed, and the failure seems to 
be caused by the new test case.
Is there anything I need to do before running ./gradlew check?

Best,
Hank

> On May 24, 2024, at 12:53 PM, Chang Hank  wrote:
> 
> Hi Robert, 
> 
> Thanks for your advice, will look into it!!
> 
> Best,
> Hank
>> On May 24, 2024, at 12:46 PM, Robert Muir  wrote:
>> 
>> On Fri, May 24, 2024 at 2:33 PM Chang Hank  wrote:
>>> 
>>> Hi all,
>>> 
>>> I want to improve the code coverage for Lucene. Which package would you 
>>> recommend testing? Do we need more coverage in the Core package?
>>> 
>> 
>> Hello,
>> 
>> I'd recommend looking at the help/tests.txt file, you can generate the
>> coverage report easily and find untested code:
>> 
>> https://github.com/apache/lucene/blob/main/help/tests.txt#L193
>> 
>> 
> 





Re: Improve testing

2024-05-24 Thread Chang Hank
Hi Robert, 

Thanks for your advice, will look into it!!

Best,
Hank
> On May 24, 2024, at 12:46 PM, Robert Muir  wrote:
> 
> On Fri, May 24, 2024 at 2:33 PM Chang Hank  wrote:
>> 
>> Hi all,
>> 
>> I want to improve the code coverage for Lucene. Which package would you 
>> recommend testing? Do we need more coverage in the Core package?
>> 
> 
> Hello,
> 
> I'd recommend looking at the help/tests.txt file, you can generate the
> coverage report easily and find untested code:
> 
> https://github.com/apache/lucene/blob/main/help/tests.txt#L193
> 
> 





Improve testing

2024-05-24 Thread Chang Hank
Hi all,

I want to improve the code coverage for Lucene. Which package would you 
recommend testing? Do we need more coverage in the Core package?

Best,
Hank



Re: Any recommended issues to work on for a newcomer?

2024-05-18 Thread Chang Hank
Hey Michael,

I wrote a first version of my idea for implementing RRF in Lucene. Here is the 
link to the code:
https://gist.github.com/hack4chang/ee2b37eab80bd82e574ff4f94ed204e9
Right now I have two questions: one is about the shardIndex to be returned, 
the other is about the TotalHits value. Please take a look at the code and 
kindly leave some comments.

Thanks,
Hank
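
For orientation, here is a minimal sketch of the kind of RRF merge discussed in
this thread. It is plain Java and does not use Lucene classes: Hit, FusedHit and
RrfSketch.rrf are hypothetical stand-ins, with the parameter order borrowed from
the rrf(int topN, int k, ...) shape suggested further down the thread. It
assumes a document is identified by its (shardIndex, doc) pair and that each
ranked list contributes 1 / (k + rank) with a 1-based rank. It does not try to
answer the TotalHits question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: plain-Java stand-ins, not Lucene's TopDocs/ScoreDoc API. */
final class RrfSketch {

  /** Hypothetical hit: a document is identified by (shardIndex, doc). */
  record Hit(int shardIndex, int doc) {}

  /** Hypothetical fused result: the hit plus its accumulated RRF score. */
  record FusedHit(Hit hit, double score) {}

  /**
   * Fuses ranked lists with Reciprocal Rank Fusion.
   * Each list adds 1 / (k + rank) to a hit's score, where rank starts at 1.
   */
  static List<FusedHit> rrf(int topN, int k, List<List<Hit>> rankedLists) {
    Map<Hit, Double> scores = new HashMap<>();
    for (List<Hit> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        // Records provide equals/hashCode, so the same (shardIndex, doc)
        // seen in several lists accumulates into one entry.
        scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
      }
    }

    List<FusedHit> fused = new ArrayList<>();
    scores.forEach((hit, score) -> fused.add(new FusedHit(hit, score)));

    // Highest fused score first; ties broken by shardIndex, then doc,
    // so the output order is deterministic.
    fused.sort(
        Comparator.comparingDouble(FusedHit::score).reversed()
            .thenComparingInt(f -> f.hit().shardIndex())
            .thenComparingInt(f -> f.hit().doc()));

    return List.copyOf(fused.subList(0, Math.min(topN, fused.size())));
  }
}
```

With k = 60, a keyword list [(0,1), (0,2), (1,1)] and a vector list
[(0,2), (1,5), (0,1)], hit (0,2) ranks first because it is near the top of both
lists. Hits (0,1) and (1,1) stay separate because they share a doc id but live
on different shards.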

> On May 18, 2024, at 2:01 PM, Chang Hank  wrote:
> 
> Or maybe we can first create an issue and PR based on the issue number?
> WDYT?
> 
> Best,
> 
> Hank
> 
>> On May 18, 2024, at 11:29 AM, Chang Hank  wrote:
>> 
>> Hey Michael, 
>> 
>> Sorry I was a bit busy this week, but I’ve looked into the resources you 
>> provided and also some useful advice from Alessandro and Adrien.
>> 
>> I have a brief understanding of how RRF works, but I’m not quite sure how 
>> we should implement it. Based on the advice from Alessandro and Adrien, it 
>> seems we need to consider that the search results are located on different 
>> shards. According to Alessandro, we should aggregate the ranked lists from 
>> all distributed nodes and then apply RRF.
>> Are we going to implement this aggregation logic inside our RRF method? 
>> 
>> Also, could you please create a PR so we can discuss the details further?
>> 
>> All the best,
>> 
>> Hank
>> 
>>> On May 13, 2024, at 10:09 AM, Michael Wechner  
>>> wrote:
>>> 
>>> Great, sounds like we have a plan :-)
>>> 
>>> Hank and I can get started trying to understand the internals better ...
>>> 
>>> Thanks
>>> 
>>> Michael
>>> 
>>> On 13.05.24 at 18:21, Alessandro Benedetti wrote:
>>>> Sure, we can make it work, but in a distributed environment you first have 
>>>> to run each query distributed (aggregating all nodes) and then apply RRF on 
>>>> top of the aggregated ranked lists.
>>>> Doing RRF per node first and then aggregating per shard won't return the 
>>>> same results, I suspect.
>>>> When I go back to working on the task I'll be able to elaborate more!
>>>> 
>>>> Cheers
>>>> --
>>>> Alessandro Benedetti
>>>> Director @ Sease Ltd.
>>>> Apache Lucene/Solr Committer
>>>> Apache Solr PMC Member
>>>> 
>>>> e-mail: a.benede...@sease.io <mailto:a.benede...@sease.io>
>>>> 
>>>> 
>>>> Sease - Information Retrieval Applied
>>>> Consulting | Training | Open Source
>>>> 
>>>> Website: Sease.io <http://sease.io/>
>>>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter 
>>>> <https://twitter.com/seaseltd> | Youtube 
>>>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github 
>>>> <https://github.com/seaseltd>
>>>> 
>>>> On Mon, 13 May 2024 at 14:12, Adrien Grand <jpou...@gmail.com> wrote:
>>>>> > Maybe Adrien Grand and others might also have some feedback :-)
>>>>> 
>>>>> I'd suggest that the signature look something like `TopDocs 
>>>>> TopDocs#rrf(int topN, int k, TopDocs[] hits)` to be consistent with 
>>>>> `TopDocs#merge`. Internally, it should look at `ScoreDoc#shardIndex` and 
>>>>> `ScoreDoc#doc` to figure out which hits map to the same document.
>>>>> 
>>>>> > Back in the day, I was reasoning on this and I didn't think Lucene was 
>>>>> > the right place for an interleaving algorithm, given that Reciprocal 
>>>>> > Rank Fusion is affected by distribution and it's not supposed to work 
>>>>> > per node.
>>>>> 
>>>>> To me this is like `TopDocs#merge`. There are changes needed on the 
>>>>> application side to hook this call into the logic that combines hits that 
>>>>> come from multiple shards (multiple queries in the case of RRF), but 
>>>>> Lucene can still provide the merging logic.
>>>>> 
>>>>> On Mon, May 13, 2024 at 1:41 PM Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>>> Thanks for your feedback Alessandro!
>>>>>> 
>>>>>> I am using Lucene independently of Solr, OpenSearch, or Elasticsearch, but 
>>>>>> would like to combine different result sets using RRF, and therefore think 
>>>>>> that Lucene itself could be a go

Re: Any recommended issues to work on for a newcomer?

2024-05-18 Thread Chang Hank
Or maybe we can first create an issue and PR based on the issue number?
WDYT?

Best,

Hank

> On May 18, 2024, at 11:29 AM, Chang Hank  wrote:
> 
> Hey Michael, 
> 
> Sorry I was a bit busy this week, but I’ve looked into the resources you 
> provided and also some useful advice from Alessandro and Adrien.
> 
> I have a brief understanding of how RRF works, but I’m not quite sure how 
> we should implement it. Based on the advice from Alessandro and Adrien, it 
> seems we need to consider that the search results are located on different 
> shards. According to Alessandro, we should aggregate the ranked lists from 
> all distributed nodes and then apply RRF.
> Are we going to implement this aggregation logic inside our RRF method? 
> 
> Also, could you please create a PR so we can discuss the details further?
> 
> All the best,
> 
> Hank
> 
>> On May 13, 2024, at 10:09 AM, Michael Wechner  
>> wrote:
>> 
>> Great, sounds like we have a plan :-)
>> 
>> Hank and I can get started trying to understand the internals better ...
>> 
>> Thanks
>> 
>> Michael
>> 
>> On 13.05.24 at 18:21, Alessandro Benedetti wrote:
>>> Sure, we can make it work, but in a distributed environment you first have 
>>> to run each query distributed (aggregating all nodes) and then apply RRF on 
>>> top of the aggregated ranked lists.
>>> Doing RRF per node first and then aggregating per shard won't return the 
>>> same results, I suspect.
>>> When I go back to working on the task I'll be able to elaborate more!
>>> 
>>> Cheers
>>> --
>>> Alessandro Benedetti
>>> Director @ Sease Ltd.
>>> Apache Lucene/Solr Committer
>>> Apache Solr PMC Member
>>> 
>>> e-mail: a.benede...@sease.io <mailto:a.benede...@sease.io>
>>> 
>>> 
>>> Sease - Information Retrieval Applied
>>> Consulting | Training | Open Source
>>> 
>>> Website: Sease.io <http://sease.io/>
>>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter 
>>> <https://twitter.com/seaseltd> | Youtube 
>>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github 
>>> <https://github.com/seaseltd>
>>> 
>>> On Mon, 13 May 2024 at 14:12, Adrien Grand <jpou...@gmail.com> wrote:
>>>> > Maybe Adrien Grand and others might also have some feedback :-)
>>>> 
>>>> I'd suggest that the signature look something like `TopDocs 
>>>> TopDocs#rrf(int topN, int k, TopDocs[] hits)` to be consistent with 
>>>> `TopDocs#merge`. Internally, it should look at `ScoreDoc#shardIndex` and 
>>>> `ScoreDoc#doc` to figure out which hits map to the same document.
>>>> 
>>>> > Back in the day, I was reasoning on this and I didn't think Lucene was 
>>>> > the right place for an interleaving algorithm, given that Reciprocal 
>>>> > Rank Fusion is affected by distribution and it's not supposed to work 
>>>> > per node.
>>>> 
>>>> To me this is like `TopDocs#merge`. There are changes needed on the 
>>>> application side to hook this call into the logic that combines hits that 
>>>> come from multiple shards (multiple queries in the case of RRF), but 
>>>> Lucene can still provide the merging logic.
>>>> 
>>>> On Mon, May 13, 2024 at 1:41 PM Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>> Thanks for your feedback Alessandro!
>>>>> 
>>>>> I am using Lucene independently of Solr, OpenSearch, or Elasticsearch, but 
>>>>> would like to combine different result sets using RRF, and therefore think 
>>>>> that Lucene itself could actually be a good place.
>>>>> 
>>>>> Looking forward to your additional elaboration!
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Michael
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 13.05.2024 at 12:34, Alessandro Benedetti <a.benede...@sease.io> wrote:
>>>>>> 
>>>>>> This is not strictly related to Lucene, but I'll give a talk at Berlin 
>>>>>> Buzzwords on how I am implementing Reciprocal Rank Fusion in Apache Solr.
>>>>>> I'll resume my work on the contribution next week and have more to share 
>>>>>> later.
>>>>>> 
>
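
Alessandro's point about ordering can be illustrated with a toy example. The
sketch below is plain Java with hypothetical names (Scored, RrfOrderSketch) and
made-up scores. The string ids stand in for (shardIndex, doc) pairs, and the
tie-breaking by id is an assumption. Shard 0 holds documents a and b; shard 1
holds only c, which scores lowest for both queries. Fusing per shard first puts
c at rank 1 in both of its local lists, so it gets the highest possible RRF
score, while aggregating first ranks it last.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: toy data and hypothetical names, not Lucene or Solr code. */
final class RrfOrderSketch {

  /** Hypothetical scored hit; id stands in for a (shardIndex, doc) pair. */
  record Scored(String id, double score) {}

  // Made-up per-shard results for a keyword query and a vector query.
  static final List<Scored> KEYWORD_SHARD0 = List.of(new Scored("a", 3.0), new Scored("b", 2.0));
  static final List<Scored> KEYWORD_SHARD1 = List.of(new Scored("c", 1.0));
  static final List<Scored> VECTOR_SHARD0 = List.of(new Scored("b", 0.9), new Scored("a", 0.8));
  static final List<Scored> VECTOR_SHARD1 = List.of(new Scored("c", 0.1));

  /** Sorts by descending score (ties by id) and returns the ids in rank order. */
  static List<String> rank(List<Scored> hits) {
    List<Scored> sorted = new ArrayList<>(hits);
    sorted.sort(Comparator.comparingDouble(Scored::score).reversed()
        .thenComparing(Scored::id));
    List<String> ids = new ArrayList<>();
    for (Scored s : sorted) {
      ids.add(s.id());
    }
    return ids;
  }

  /** RRF over ranked id lists: each list adds 1 / (k + rank), rank starting at 1. */
  static List<Scored> rrf(int k, List<List<String>> rankedLists) {
    Map<String, Double> scores = new HashMap<>();
    for (List<String> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
      }
    }
    List<Scored> fused = new ArrayList<>();
    scores.forEach((id, score) -> fused.add(new Scored(id, score)));
    return fused;
  }

  /** Returns a new list holding the hits of both shards. */
  static List<Scored> concat(List<Scored> first, List<Scored> second) {
    List<Scored> all = new ArrayList<>(first);
    all.addAll(second);
    return all;
  }

  /** Aggregates each query across shards first, then applies RRF once. */
  static List<String> aggregateThenFuse(int k) {
    List<String> keyword = rank(concat(KEYWORD_SHARD0, KEYWORD_SHARD1));
    List<String> vector = rank(concat(VECTOR_SHARD0, VECTOR_SHARD1));
    return rank(rrf(k, List.of(keyword, vector)));
  }

  /** Applies RRF on each shard separately, then merges shards by fused score. */
  static List<String> fusePerShardThenMerge(int k) {
    List<Scored> shard0 = rrf(k, List.of(rank(KEYWORD_SHARD0), rank(VECTOR_SHARD0)));
    List<Scored> shard1 = rrf(k, List.of(rank(KEYWORD_SHARD1), rank(VECTOR_SHARD1)));
    return rank(concat(shard0, shard1));
  }
}
```

With k = 60, aggregateThenFuse returns [a, b, c] and fusePerShardThenMerge
returns [c, a, b], so the two orders of operation disagree on this data.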

Re: Any recommended issues to work on for a newcomer?

2024-05-18 Thread Chang Hank
>>>>> --
>>>>> Alessandro Benedetti
>>>>> Director @ Sease Ltd.
>>>>> Apache Lucene/Solr Committer
>>>>> Apache Solr PMC Member
>>>>> 
>>>>> e-mail: a.benede...@sease.io <mailto:a.benede...@sease.io>
>>>>> 
>>>>> 
>>>>> Sease - Information Retrieval Applied
>>>>> Consulting | Training | Open Source
>>>>> 
>>>>> Website: Sease.io <http://sease.io/>
>>>>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter 
>>>>> <https://twitter.com/seaseltd> | Youtube 
>>>>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github 
>>>>> <https://github.com/seaseltd>
>>>>> 
>>>>> On Sat, 11 May 2024 at 09:10, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>>> sure, no problem!
>>>>>> 
>>>>>> Maybe Adrien Grand and others might also have some feedback :-)
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> On 10.05.24 at 23:03, Chang Hank wrote:
>>>>>>> Thank you for these useful resources. Please allow me to spend some 
>>>>>>> time looking into them. 
>>>>>>> I’ll let you know asap!!
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> Hank
>>>>>>> 
>>>>>>>> On May 10, 2024, at 12:34 PM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>>>>> 
>>>>>>>> also we might want to consider how this relates to
>>>>>>>> 
>>>>>>>> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/Rescorer.html
>>>>>>>> 
>>>>>>>> In vector search reranking has become quite popular, e.g.
>>>>>>>> 
>>>>>>>> https://docs.cohere.com/docs/reranking
>>>>>>>> 
>>>>>>>> IIUC LangChain (python) for example adds the reranker as an argument 
>>>>>>>> to the searcher/retriever
>>>>>>>> 
>>>>>>>> https://python.langchain.com/v0.1/docs/integrations/retrievers/cohere-reranker/
>>>>>>>> 
>>>>>>>> So maybe the following might make sense as well
>>>>>>>> 
>>>>>>>> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
>>>>>>>> TopDocs topDocsVector = vectorSearcher.search(query, 50, new 
>>>>>>>> CohereReranker());
>>>>>>>> 
>>>>>>>> TopDocs topDocs = TopDocs.merge(new RRFRanker(), topDocsKeyword, 
>>>>>>>> topDocsVector);
>>>>>>>> 
>>>>>>>> WDYT?
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> Michael
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 10.05.24 at 21:08, Michael Wechner wrote:
>>>>>>>>> great, yes, let's get started :-)
>>>>>>>>> 
>>>>>>>>> What about the following pseudo code, assuming that there might be 
>>>>>>>>> alternative ranking algorithms to RRF
>>>>>>>>> 
>>>>>>>>> StoredFields storedFieldsKeyword = indexReaderKeyword.storedFields();
>>>>>>>>> StoredFields storedFieldsVector = indexReaderKeyword.storedFields();
>>>>>>>>> 
>>>>>>>>> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
>>>>>>>>> TopDocs topDocsVector = vectorSearcher.search(vectorQuery, 50);
>>>>>>>>> 
>>>>>>>>> Ranker ranker = new RRFRanker();
>>>>>>>>> TopDocs topDocs = TopDocs.rank(ranker, topDocsKeyword, topDocsVector);
>>>>>>>>> 
>>>>>>>>> for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
>>>>>>>>> Document docK = storedFieldsKeywo

Re: Any recommended issues to work on for a newcomer?

2024-05-10 Thread Chang Hank
Thank you for these useful resources. Please allow me to spend some time 
looking into them. 
I’ll let you know asap!!

Thanks

Hank

> On May 10, 2024, at 12:34 PM, Michael Wechner  
> wrote:
> 
> also we might want to consider how this relates to
> 
> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/Rescorer.html
> 
> In vector search reranking has become quite popular, e.g.
> 
> https://docs.cohere.com/docs/reranking
> 
> IIUC LangChain (python) for example adds the reranker as an argument to the 
> searcher/retriever
> 
> https://python.langchain.com/v0.1/docs/integrations/retrievers/cohere-reranker/
> 
> So maybe the following might make sense as well
> 
> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
> TopDocs topDocsVector = vectorSearcher.search(query, 50, new 
> CohereReranker());
> 
> TopDocs topDocs = TopDocs.merge(new RRFRanker(), topDocsKeyword, 
> topDocsVector);
> 
> WDYT?
> 
> Thanks
> 
> Michael
> 
> 
> On 10.05.24 at 21:08, Michael Wechner wrote:
>> great, yes, let's get started :-)
>> 
>> What about the following pseudo code, assuming that there might be 
>> alternative ranking algorithms to RRF
>> 
>> // Pseudo code: Ranker, RRFRanker and TopDocs.rank are proposed names
>> StoredFields storedFieldsKeyword = indexReaderKeyword.storedFields();
>> StoredFields storedFieldsVector = indexReaderKeyword.storedFields();
>> 
>> TopDocs topDocsKeyword = keywordSearcher.search(keywordQuery, 10);
>> TopDocs topDocsVector = vectorSearcher.search(vectorQuery, 50);
>> 
>> Ranker ranker = new RRFRanker();
>> TopDocs topDocs = TopDocs.rank(ranker, topDocsKeyword, topDocsVector);
>> 
>> for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
>>   Document docK = storedFieldsKeyword.document(scoreDoc.doc);
>>   Document docV = storedFieldsVector.document(scoreDoc.doc);
>> }
>> 
>> See also:
>> 
>> https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/TopDocs.html
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html
>> 
>> WDYT?
>> 
>> Thanks
>> 
>> Michael
>> 
>> 
>> 
>> 
>> On 10.05.24 at 20:01, Chang Hank wrote:
>>> Hi Michael,
>>> 
>>> Sounds good to me. 
>>> Let’s do it!!
>>> 
>>> Cheers,
>>> Hank
>>> 
>>>> On May 10, 2024, at 10:50 AM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>> 
>>>> Hi Hank
>>>> 
>>>> Very cool!
>>>> 
>>>> Adrien Grand suggested implementing it as a utility method on the TopDocs 
>>>> class, and since Adrien worked for a decade on Lucene
>>>> https://www.elastic.co/de/blog/author/adrien-grand
>>>> I guess it makes sense to follow his advice :-)
>>>> 
>>>> We could create a PR and work together on it, WDYT?
>>>> 
>>>> All the best
>>>> 
>>>> Michael
>>>> 
>>>> On 10.05.24 at 18:51, Chang Hank wrote:
>>>>> Hi Michael, 
>>>>> 
>>>>> Thank you for the reply.
>>>>> This is a really cool issue to work on, and I’m happy to work on it with 
>>>>> you. I’ll try to do some research on RRF first.
>>>>> Also, are we going to implement this on the TopDocs class?
>>>>> 
>>>>> Best,
>>>>> Hank
>>>>> 
>>>>> 
>>>>>> On May 9, 2024, at 11:08 PM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>>>>> 
>>>>>> Hi Hank
>>>>>> 
>>>>>> Thanks for offering your help!
>>>>>> 
>>>>>> I recently suggested implementing RRF (Reciprocal Rank Fusion)
>>>>>> 
>>>>>> https://lists.apache.org/thread/vvwvjl0gk67okn8z1wg33ogyf9qm07sz
>>>>>> 
>>>>>> but still have not found the time to really work on this.
>>>>>> 
>>>>>> Maybe you would be interested in doing this, or we could work on it 
>>>>>> together somehow?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 10.05.24 at 07:27, Chang Hank wrote:
>>>>>>> Hi everyone,
>>>>>>> 
>>>>>>> I’m Hank Chang, currently studying Information Retrieval topics. I’m 
>>>>>>> really interested in contributing to Apache Lucene and enhancing my 
>>>>>>> understanding of the field.
>>>>>>> I’ve reviewed several issues posted on the GitHub repository but 
>>>>>>> haven’t found a straightforward starting point. Could someone please 
>>>>>>> recommend suitable issues for a newcomer like me or suggest areas I 
>>>>>>> could assist with?
>>>>>>> 
>>>>>>> Thank you for your time and guidance.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Hank Chang
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 



Re: Any recommended issues to work on for a newcomer?

2024-05-10 Thread Chang Hank
Hi Michael,

Sounds good to me. 
Let’s do it!!

Cheers,
Hank

> On May 10, 2024, at 10:50 AM, Michael Wechner  
> wrote:
> 
> Hi Hank
> 
> Very cool!
> 
> Adrien Grand suggested implementing it as a utility method on the TopDocs 
> class, and since Adrien worked for a decade on Lucene
> https://www.elastic.co/de/blog/author/adrien-grand
> I guess it makes sense to follow his advice :-)
> 
> We could create a PR and work together on it, WDYT?
> 
> All the best
> 
> Michael
> 
> On 10.05.24 at 18:51, Chang Hank wrote:
>> Hi Michael, 
>> 
>> Thank you for the reply.
>> This is a really cool issue to work on, and I’m happy to work on it with you. 
>> I’ll try to do some research on RRF first.
>> Also, are we going to implement this on the TopDocs class?
>> 
>> Best,
>> Hank
>> 
>> 
>>> On May 9, 2024, at 11:08 PM, Michael Wechner <michael.wech...@wyona.com> wrote:
>>> 
>>> Hi Hank
>>> 
>>> Thanks for offering your help!
>>> 
>>> I recently suggested implementing RRF (Reciprocal Rank Fusion)
>>> 
>>> https://lists.apache.org/thread/vvwvjl0gk67okn8z1wg33ogyf9qm07sz
>>> 
>>> but still have not found the time to really work on this.
>>> 
>>> Maybe you would be interested in doing this, or we could work on it 
>>> together somehow?
>>> 
>>> Thanks
>>> 
>>> Michael
>>> 
>>> 
>>> 
>>> On 10.05.24 at 07:27, Chang Hank wrote:
>>>> Hi everyone,
>>>> 
>>>> I’m Hank Chang, currently studying Information Retrieval topics. I’m 
>>>> really interested in contributing to Apache Lucene and enhancing my 
>>>> understanding of the field.
>>>> I’ve reviewed several issues posted on the GitHub repository but haven’t 
>>>> found a straightforward starting point. Could someone please recommend 
>>>> suitable issues for a newcomer like me or suggest areas I could assist 
>>>> with?
>>>> 
>>>> Thank you for your time and guidance.
>>>> 
>>>> Best regards,
>>>> Hank Chang
>>>> 
>>> 
>>> 
>>> 
>> 
> 



Re: Any recommended issues to work on for a newcomer?

2024-05-10 Thread Chang Hank
Hi Michael, 

Thank you for the reply.
This is a really cool issue to work on, and I’m happy to work on it with you. 
I’ll try to do some research on RRF first.
Also, are we going to implement this on the TopDocs class?

Best,
Hank


> On May 9, 2024, at 11:08 PM, Michael Wechner  
> wrote:
> 
> Hi Hank
> 
> Thanks for offering your help!
> 
> I recently suggested implementing RRF (Reciprocal Rank Fusion)
> 
> https://lists.apache.org/thread/vvwvjl0gk67okn8z1wg33ogyf9qm07sz
> 
> but still have not found the time to really work on this.
> 
> Maybe you would be interested in doing this, or we could work on it together 
> somehow?
> 
> Thanks
> 
> Michael
> 
> 
> 
> On 10.05.24 at 07:27, Chang Hank wrote:
>> Hi everyone,
>> 
>> I’m Hank Chang, currently studying Information Retrieval topics. I’m really 
>> interested in contributing to Apache Lucene and enhancing my understanding 
>> of the field.
>> I’ve reviewed several issues posted on the GitHub repository but haven’t 
>> found a straightforward starting point. Could someone please recommend 
>> suitable issues for a newcomer like me or suggest areas I could assist with?
>> 
>> Thank you for your time and guidance.
>> 
>> Best regards,
>> Hank Chang
>> 
> 
> 
> 



Any recommended issues to work on for a newcomer?

2024-05-09 Thread Chang Hank
Hi everyone, 

I’m Hank Chang, currently studying Information Retrieval topics. I’m really 
interested in contributing to Apache Lucene and enhancing my understanding of 
the field.
I’ve reviewed several issues posted on the GitHub repository but haven’t found 
a straightforward starting point. Could someone please recommend suitable 
issues for a newcomer like me or suggest areas I could assist with?

Thank you for your time and guidance.

Best regards,
Hank Chang