Hi Robert,

I will be very happy to see this problem fixed :-) I cannot imagine what reasons people have to use software with bugs; I guess that other bugs in Lucene get removed. Anyway, if you are finally going to fix the problem, this is good news :-) Thank you very much for your time.

jose

On Wed, May 5, 2010 at 3:10 PM, Robert Muir <rcm...@gmail.com> wrote:
> 2010/5/5 José Ramón Pérez Agüera <jose.agu...@gmail.com>
>
>> Hi Robert,
>>
>> the problem is not the linear combination of fields, the problem is to
>> apply the boost factor per field after the term frequency saturation
>> function and then make the linear combination of fields. Every system
>> that implements BM25F, including Terrier, takes care of that, because if
>> you don't do it you have a bug in your ranking function and not just a
>> different ranking function.
>>
>
> José, well then this should not be much of a problem to handle in
> LUCENE-2392, because as I mentioned, if you have a tf() or idf() it's really
> because you decided to do this yourself. So you could easily apply the boost
> inside your log or sqrt or whatever, if you want.
>
> But what I propose we do is make sure the relevance functions we provide
> (especially any default for 4.0) take care of this for your structured case,
> while still providing the capability for someone to get the old behavior
> [see below]
>
>
>> If you implement this little
>> change, the Lucene ranking function will work properly with structured
>> documents and all your other concerns about allowing users to
>> implement different ranking functions for different situations will
>> not be affected by this change.
>>
>>
> Well, I'm not sure all my concerns go away! I think it's best to implement a
> change like this in the flexible scoring framework (LUCENE-2392), so that
> users, if they want, can get the old behavior: "the bug" as you call it.
>
> The reason I say this is the unique cases of Lucene: some people are
> doing scoring in very crazy ways and if they aren't able to get the old
> behavior with regards to boosting, they might be upset... even if it is
> really giving them worse relevance...
>
> --
> Robert Muir
> rcm...@gmail.com
>

--
Jose R. Pérez-Agüera

Clinical Assistant Professor
Metadata Research Center
School of Information and Library Science
University of North Carolina at Chapel Hill
email: jagu...@email.unc.edu
Web page: http://www.unc.edu/~jaguera/
MRC website: http://ils.unc.edu/mrc/
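
For readers following the thread, the difference between the two boosting orders discussed above can be shown with a small sketch. It is not Lucene or Terrier code: the function names, the field names, the boost values and the choice of k1 = 1.2 are all assumptions made for illustration, and IDF and length normalization are left out to keep the comparison short.

```python
# Illustrative sketch only: hypothetical helper names, not a Lucene or Terrier API.
# IDF and document-length normalization are omitted on purpose.

def saturate(tf, k1=1.2):
    """BM25-style term-frequency saturation; stays below 1 for any tf >= 0."""
    return tf / (k1 + tf)


def score_boost_after(field_tfs, boosts, k1=1.2):
    """Saturate each field's tf separately, then boost and sum the results."""
    return sum(boosts[field] * saturate(tf, k1) for field, tf in field_tfs.items())


def score_boost_before(field_tfs, boosts, k1=1.2):
    """BM25F-style: boost the raw tfs, sum across fields, then saturate once."""
    combined_tf = sum(boosts[field] * tf for field, tf in field_tfs.items())
    return saturate(combined_tf, k1)


# Assumed example: one query term occurring in two fields of one document.
boosts = {"title": 3.0, "body": 1.0}
doc = {"title": 2, "body": 10}

after = score_boost_after(doc, boosts)
before = score_boost_before(doc, boosts)
print(f"boost after saturation:  {after:.3f}")
print(f"boost before saturation: {before:.3f}")
```

With the boost applied after saturation, a single term's contribution is no longer bounded by the saturation function, so one term repeated across boosted fields can dominate the score. Applying the boost to the raw frequencies first keeps the per-term contribution below 1, which is the behavior José describes for BM25F.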