We are moving to higher-performance platforms than Hadoop MapReduce, like
Spark. You can still write map/reduce-style code, but Mahout is not taking new
Hadoop MR code.

On Oct 1, 2014, at 6:30 AM, Arian Pasquali <ar...@arianpasquali.com> wrote:

Yes Suneel,
Indeed, it is in MR fashion.

What exactly did you mean when you said Mahout is not accepting any new
MapReduce code?
Do you mean for submitting a patch?
I'm sure there might be better ways to implement it, but I'm more
interested in the results right now.

What would be your suggestion?

best

Arian Pasquali
http://about.me/arianpasquali

2014-10-01 13:10 GMT+01:00 Suneel Marthi <smar...@apache.org>:

> How did you implement BM25PartialVectorReducer and BM25Converter? The
> present implementations for TFIDFConverter and Reducer are MR.
> Mahout is not accepting any new MapReduce code.
> 
> On Wed, Oct 1, 2014 at 7:18 AM, Arian Pasquali <ar...@arianpasquali.com>
> wrote:
> 
>> Hey guys,
>> I think it is fair to give you some feedback.
>> I managed to implement BM25+ <http://en.wikipedia.org/wiki/Okapi_BM25> term
>> score on Mahout.
>> It was straightforward using the current TFIDF implementation as an
>> example.
>> 
>> Basically what I did was implement the interface
>> org.apache.mahout.vectorizer.Weight, create a BM25Converter and
>> BM25PartialVectorReducer similar to TFIDFConverter
>> <https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/vectorizer/tfidf/TFIDFConverter.html>
>> and
>> TFIDFPartialVectorReducer
>> <https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/vectorizer/tfidf/TFIDFPartialVectorReducer.html>
>> respectively.
>> 
>> cheers
>> Arian
>> 
>> Arian Pasquali
>> http://about.me/arianpasquali
>> 
>> 2014-09-24 14:14 GMT+01:00 Arian Pasquali <ar...@arianpasquali.com>:
>> 
>>> Yes,
>>> I'm studying his work <http://nlp.uned.es/~jperezi/Lucene-BM25/> and
>>> Mahout's current tf-idf code.
>>> Trying to understand how I would port that to MR.
>>> I'll try to share something if I succeed.
>>> 
>>> Arian Pasquali
>>> http://about.me/arianpasquali
>>> 
>>> 2014-09-24 5:12 GMT+01:00 Suneel Marthi <suneel.mar...@gmail.com>:
>>> 
>>>> Lucene 4.x supports okapi-bm25. So it should be easy to implement.
>>>> 
>>>> On Tue, Sep 23, 2014 at 11:57 PM, Ted Dunning <ted.dunn...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Should be pretty easy. I haven't heard of anyone doing it.
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Sep 23, 2014, at 18:53, Arian Pasquali <ar...@arianpasquali.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> I was wondering if it would be possible to support BM25 term weighting
>>>>>> by extending Mahout's tf-idf implementation.
>>>>>> 
>>>>>> I was curious to know if anyone here has already tried to do so.
>>>>>> If not, what would be your suggestion for such an implementation on
>>>>>> Mahout?
>>>>>> 
>>>>>> 
>>>>>> Arian Pasquali
>>>>>> http://about.me/arianpasquali
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
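
For readers following the thread: the BM25 code discussed above was not posted, so what follows is only a minimal sketch of what a BM25 term weight behind a Weight-style interface might look like. Several things here are assumptions and not taken from the thread. The nested Weight interface is a local stand-in assumed to mirror org.apache.mahout.vectorizer.Weight. The class name Bm25Weight and the parameters k1, b, delta and avgDocLength are hypothetical. The IDF uses the non-negative log(1 + (N - df + 0.5) / (df + 0.5)) form. Setting delta to 0 gives classic BM25, and a positive delta gives the BM25+ lower-bounding variant.

```java
/** Sketch only: BM25 term weighting behind a Weight-style interface. */
public class Bm25Sketch {

  /**
   * Local stand-in, assumed to mirror org.apache.mahout.vectorizer.Weight.
   * Check the real interface before relying on this signature.
   */
  interface Weight {
    double calculate(int tf, int df, int length, int numDocs);
  }

  /** Hypothetical class; all four constructor parameters are illustrative. */
  static final class Bm25Weight implements Weight {
    private final double k1;           // term-frequency saturation
    private final double b;            // strength of length normalization
    private final double delta;        // 0 = classic BM25, > 0 = BM25+
    private final double avgDocLength; // must be supplied by the caller

    Bm25Weight(double k1, double b, double delta, double avgDocLength) {
      this.k1 = k1;
      this.b = b;
      this.delta = delta;
      this.avgDocLength = avgDocLength;
    }

    @Override
    public double calculate(int tf, int df, int length, int numDocs) {
      if (tf <= 0 || df <= 0) {
        return 0.0; // term absent from the document or from the corpus
      }
      // Non-negative IDF variant (an assumption; other variants exist).
      double idf = Math.log(1.0 + (numDocs - df + 0.5) / (df + 0.5));
      // Length normalization: longer-than-average documents are penalized.
      double norm = k1 * (1.0 - b + b * length / avgDocLength);
      double tfPart = tf * (k1 + 1.0) / (tf + norm);
      return idf * (tfPart + delta);
    }
  }

  public static void main(String[] args) {
    Weight bm25 = new Bm25Weight(1.2, 0.75, 0.0, 100.0);
    // tf = 3, df = 10, document length = 120, corpus size = 1000
    System.out.println(bm25.calculate(3, 10, 120, 1000));
  }
}
```

Unlike tf-idf, BM25 needs the average document length, which the four-argument calculate method does not carry. The sketch passes it through the constructor, so a job built this way would have to compute that statistic in an earlier pass.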
