How did you implement BM25PartialVectorReducer and BM25Converter? The current TFIDFConverter and TFIDFPartialVectorReducer implementations are MapReduce (MR), and Mahout is not accepting any new MapReduce code.

On Wed, Oct 1, 2014 at 7:18 AM, Arian Pasquali <ar...@arianpasquali.com> wrote:

> Hey guys,
> I think it is fair to give you some feedback.
> I managed to implement BM25+ <http://en.wikipedia.org/wiki/Okapi_BM25> term
> score on Mahout.
> It was straightforward using the current TFIDF implementation as an example.
>
> Basically what I did was implement the interface
> org.apache.mahout.vectorizer.Weight, create a BM25Converter and
> BM25PartialVectorReducer similar to TFIDFConverter
> <https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/vectorizer/tfidf/TFIDFConverter.html>
> and TFIDFPartialVectorReducer
> <https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/vectorizer/tfidf/TFIDFPartialVectorReducer.html>
> respectively.
>
> cheers
> Arian
>
> Arian Pasquali
> http://about.me/arianpasquali
>
> 2014-09-24 14:14 GMT+01:00 Arian Pasquali <ar...@arianpasquali.com>:
>
> > Yes,
> > I'm studying his work <http://nlp.uned.es/~jperezi/Lucene-BM25/> and the
> > current mahout's tfidf code.
> > Trying to understand how I would port that to mr.
> > I ll try to share something if I succeed.
> >
> > Arian Pasquali
> > http://about.me/arianpasquali
> >
> > 2014-09-24 5:12 GMT+01:00 Suneel Marthi <suneel.mar...@gmail.com>:
> >
> >> Lucene 4.x supports okapi-bm25. So it should be easy to implement.
> >>
> >> On Tue, Sep 23, 2014 at 11:57 PM, Ted Dunning <ted.dunn...@gmail.com>
> >> wrote:
> >>
> >> > Should be pretty easy. I haven't heard of anyone doing it.
> >> >
> >> > Sent from my iPhone
> >> >
> >> > > On Sep 23, 2014, at 18:53, Arian Pasquali <ar...@arianpasquali.com>
> >> > > wrote:
> >> > >
> >> > > Hi,
> >> > > I was wondering if would be possible to support bm25 term weighting
> >> > > extending Mahout's tf-idf implementation.
> >> > >
> >> > > I was curious to know if anyone here has already tried to do so.
> >> > > If not, what would be your suggestion for such implementation on
> >> > > Mahout?
> >> > >
> >> > > Arian Pasquali
> >> > > http://about.me/arianpasquali
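
For reference, the BM25+ term score discussed in this thread can be sketched as below. This is not the code from the thread and it does not use any Mahout classes: the class name Bm25PlusSketch, the method names, the avgDocLength parameter, the constants k1 = 1.2, b = 0.75, delta = 1.0, and the smoothed IDF variant are all assumptions chosen for illustration. How such a function would be wired into a Weight implementation, a converter, or a non-MapReduce pipeline is left open.

```java
/** Standalone BM25+ term-score sketch; deliberately independent of Mahout's classes. */
final class Bm25PlusSketch {
    // Assumed, commonly used defaults; tune them for a real corpus.
    static final double K1 = 1.2;    // term-frequency saturation
    static final double B = 0.75;    // strength of document-length normalization
    static final double DELTA = 1.0; // BM25+ lower bound added for matching terms

    /** Smoothed IDF (one common variant, always positive for df <= numDocs). */
    static double idf(int df, int numDocs) {
        return Math.log(1.0 + (numDocs - df + 0.5) / (df + 0.5));
    }

    /**
     * BM25+ score of one term in one document.
     * tf: term count in the document; df: number of documents containing the term;
     * docLength: tokens in this document; avgDocLength: mean tokens per document;
     * numDocs: corpus size.
     */
    static double score(int tf, int df, int docLength, double avgDocLength, int numDocs) {
        if (tf <= 0 || df <= 0) {
            return 0.0; // the term does not occur, so it contributes nothing
        }
        // Length normalization: documents longer than average are penalized.
        double norm = 1.0 - B + B * (docLength / avgDocLength);
        // Saturating term-frequency component of classic BM25.
        double saturated = (tf * (K1 + 1.0)) / (tf + K1 * norm);
        // BM25+ adds DELTA so long documents are not over-penalized.
        return idf(df, numDocs) * (saturated + DELTA);
    }

    public static void main(String[] args) {
        // Example: term appears 3 times in a 120-token document,
        // in 5 of 1000 documents, with an average length of 100 tokens.
        System.out.println(score(3, 5, 120, 100.0, 1000));
    }
}
```

Unlike TF-IDF, the score depends on the average document length, so any port has to compute that corpus-level statistic before weighting the vectors.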