Re: Any recommended issues to work on for a newcomer?

2024-05-13 Thread Michael Wechner
Great, sounds like we have a plan :-) Hank and I can get started trying to understand the internals better ... Thanks Michael On 13.05.24 at 18:21, Alessandro Benedetti wrote: Sure, we can make it work, but in a distributed environment you first have to run each query distributed (aggregating a

Re: Any recommended issues to work on for a newcomer?

2024-05-13 Thread Alessandro Benedetti
Sure, we can make it work, but in a distributed environment you first have to run each query distributed (aggregating all nodes) and then apply RRF on top of the aggregated ranked lists. Doing RRF per node first and then aggregating per shard won't return the same results, I suspect. When I go back to workin
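
To make the ordering concern above concrete, here is a minimal toy sketch in plain Java (no Lucene dependency). Everything in it is an assumption made for illustration: the class name `RrfOrderSketch`, the document ids, the two-shard layout, the constant k = 60, and the reading of "RRF per node" as "fuse each shard's local ranked lists, then merge shards by fused score". It is not how Solr or Lucene implement this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RrfOrderSketch {
    /** Fused RRF score per document id; each list is ranked best hit first. */
    public static Map<String, Double> rrfScores(int k, List<List<String>> rankedLists) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> list : rankedLists) {
            for (int i = 0; i < list.size(); i++) {
                // Rank is 1-based, so the contribution is 1 / (k + rank).
                scores.merge(list.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return scores;
    }

    /** Document ids sorted by descending fused score, ties broken by id. */
    public static List<String> ranked(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : a.compareTo(b);
        });
        return ids;
    }

    /** Aggregate each query across all shards first, then fuse the global lists. */
    public static List<String> globalFirst() {
        // Toy assumption: both queries rank d1 > d2 > d3 globally.
        List<String> queryA = List.of("d1", "d2", "d3");
        List<String> queryB = List.of("d1", "d2", "d3");
        return ranked(rrfScores(60, List.of(queryA, queryB)));
    }

    /** Fuse on each shard with local ranks, then merge shards by fused score. */
    public static List<String> perShardFirst() {
        // Toy assumption: shard 0 holds d1 and d2, shard 1 holds only d3.
        Map<String, Double> merged = new LinkedHashMap<>();
        merged.putAll(rrfScores(60, List.of(List.of("d1", "d2"), List.of("d1", "d2"))));
        merged.putAll(rrfScores(60, List.of(List.of("d3"), List.of("d3"))));
        return ranked(merged);
    }

    public static void main(String[] args) {
        System.out.println("global first:    " + globalFirst());
        System.out.println("per-shard first: " + perShardFirst());
    }
}
```

In this toy setup d3 is the weakest hit globally, but it is rank 1 on its own shard, so fusing per shard lifts it above d2. Under these assumptions the two orders of operation disagree.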

Re: Any recommended issues to work on for a newcomer?

2024-05-13 Thread Adrien Grand
> Maybe Adrien Grand and others might also have some feedback :-) I'd suggest a signature that looks something like `TopDocs TopDocs#rrf(int topN, int k, TopDocs[] hits)` to be consistent with `TopDocs#merge`. Internally, it should look at `ScoreDoc#shardId` and `ScoreDoc#doc` to figure out which h
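
A rough sketch of what a method with the suggested shape could do, written in plain Java so it runs without Lucene on the classpath. `Hit` and `DocKey` are hypothetical stand-ins (not Lucene classes) for a scored hit and its (shard, doc) identity, and `RrfSketch.rrf` is an illustrative name, not an existing Lucene API. The tie-breaking rule is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RrfSketch {
    /** Hypothetical stand-in for a scored hit; not Lucene's ScoreDoc. */
    public record Hit(int shardIndex, int doc, float score) {}

    /** Identity of a document across ranked lists: shard plus doc id. */
    public record DocKey(int shardIndex, int doc) {}

    /**
     * Reciprocal Rank Fusion: each list is ranked best hit first, and a hit at
     * 1-based rank r contributes 1 / (k + r) to its document's fused score.
     */
    public static List<Hit> rrf(int topN, int k, List<List<Hit>> rankedLists) {
        Map<DocKey, Double> fused = new HashMap<>();
        for (List<Hit> list : rankedLists) {
            for (int i = 0; i < list.size(); i++) {
                Hit h = list.get(i);
                DocKey key = new DocKey(h.shardIndex(), h.doc());
                fused.merge(key, 1.0 / (k + i + 1), Double::sum);
            }
        }

        List<Hit> out = new ArrayList<>();
        fused.forEach((key, s) -> out.add(new Hit(key.shardIndex(), key.doc(), s.floatValue())));

        // Highest fused score first; ties broken by shard, then doc (an assumption).
        out.sort((a, b) -> {
            int c = Float.compare(b.score(), a.score());
            if (c != 0) return c;
            c = Integer.compare(a.shardIndex(), b.shardIndex());
            return c != 0 ? c : Integer.compare(a.doc(), b.doc());
        });
        return out.subList(0, Math.min(topN, out.size()));
    }

    public static void main(String[] args) {
        List<Hit> lexical = List.of(new Hit(0, 1, 3f), new Hit(0, 2, 2f));
        List<Hit> vector = List.of(new Hit(0, 2, 9f), new Hit(0, 3, 1f));
        for (Hit h : rrf(2, 60, List.of(lexical, vector))) {
            System.out.println(h);
        }
    }
}
```

Only the rank of each hit within its list matters here; the original scores are ignored, which is what lets RRF combine lists whose scores are not comparable.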

Re: Any recommended issues to work on for a newcomer?

2024-05-13 Thread Michael Wechner
Thanks for your feedback, Alessandro! I am using Lucene independently of Solr, OpenSearch, or Elasticsearch, but would like to combine different result sets using RRF, so I think that Lucene itself could actually be a good place for it. Looking forward to your additional elaboration! Thanks Michael

Re: Any recommended issues to work on for a newcomer?

2024-05-13 Thread Alessandro Benedetti
This is not strictly related to Lucene, but I'll give a talk at Berlin Buzzwords on how I am implementing Reciprocal Rank Fusion in Apache Solr. I'll resume my work on the contribution next week and have more to share later. Back in the day, I was thinking about this and I didn't think Lucene was th