Is there any reason why you are not storing each DOC_TYPE in its own index?
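
To make the effect concrete, here is a rough sketch in plain Java (no Lucene calls) of how IDF differs when document frequencies are counted per DOC_TYPE, as they would be with one index per type, rather than across one mixed index. The class PerTypeIdf, the Doc model, the sample data and the MAIL/CONTACT type names are all made up for illustration. The formula 1 + ln((docCount + 1) / (docFreq + 1)) is assumed to match the classic TF-IDF similarity, so please check it against the Lucene version you run.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: none of these names are Lucene APIs.
final class PerTypeIdf {

    /** Toy document: a DOC_TYPE label plus its distinct terms. */
    static final class Doc {
        final String docType;
        final Set<String> terms;

        Doc(String docType, String... terms) {
            this.docType = docType;
            this.terms = new HashSet<>(Arrays.asList(terms));
        }
    }

    /** Assumed classic TF-IDF formula: 1 + ln((docCount + 1) / (docFreq + 1)). */
    static double idf(long docFreq, long docCount) {
        return 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
    }

    /** Number of documents of the given type; a null docType means all types. */
    static long docCount(List<Doc> docs, String docType) {
        long n = 0;
        for (Doc d : docs) {
            if (docType == null || docType.equals(d.docType)) {
                n++;
            }
        }
        return n;
    }

    /** Number of documents of the given type (null = all) that contain the term. */
    static long docFreq(List<Doc> docs, String term, String docType) {
        long n = 0;
        for (Doc d : docs) {
            boolean typeMatches = docType == null || docType.equals(d.docType);
            if (typeMatches && d.terms.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Made-up data: "invoice" is common in MAIL documents and rare in CONTACT documents. */
    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("MAIL", "invoice", "payment"),
            new Doc("MAIL", "invoice", "overdue"),
            new Doc("MAIL", "invoice", "receipt"),
            new Doc("MAIL", "invoice", "reminder"),
            new Doc("CONTACT", "invoice", "accounts"),
            new Doc("CONTACT", "alice"),
            new Doc("CONTACT", "bob"),
            new Doc("CONTACT", "carol"));
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        String term = "invoice";
        // A single mixed index yields one IDF value for the term.
        System.out.printf("all types: idf=%.3f%n",
            idf(docFreq(docs, term, null), docCount(docs, null)));
        // One index per DOC_TYPE yields one IDF value per <Term/DOC_TYPE> pair.
        for (String type : new String[] {"MAIL", "CONTACT"}) {
            System.out.printf("%s: idf=%.3f%n", type,
                idf(docFreq(docs, term, type), docCount(docs, type)));
        }
    }
}
```

With this data the term gets a low IDF among the MAIL documents, where it is common, and a higher one among the CONTACT documents, where it is rare. The single value from the mixed index falls between the two.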

On Tue, Dec 3, 2019 at 1:50 PM Ravikumar Govindarajan
<ravikumar.govindara...@gmail.com> wrote:
>
> Hello,
>
> We are using TF-IDF for scoring (Yet to migrate to BM25). Different
> entities (DOC_TYPES) are crunched & stored together in a single index.
>
> When it comes to IDF, I find that there is a single value computed across
> documents & stored as part of TermStats, whereas our documents are not
> homogeneous. So, a single IDF value doesn't work for us
>
> We would like to compute IDF for each <Term/DOC_TYPE> pair, store it &
> later use the paired-IDF values during query time. Is something like this
> possible via Codecs or other mechanisms?
>
> Any help is much appreciated
>
> --
> Ravi

--
Adrien