Re: mahout tf-idf vs lucene tf-idf

2016-06-04 Thread Ted Dunning
On Sat, Jun 4, 2016 at 10:14 AM, forme book wrote:

> Lucene already provides this implementation by default. What I struggle to
> understand is the advantage of having lucene.vector in Mahout when Lucene
> offers that feature out of the box.
>
> Maybe I'm missing something big, but what is the connection between them?
> Could you please explain a possible use case?
>

The point of the Mahout implementation is that it worked well with the
Mahout math library.

If you don't need any of the other machinery of Mahout, then avoiding the
extra dependency might be a much better option for you.


mahout tf-idf vs lucene tf-idf

2016-06-04 Thread forme book
Hi,

I'm starting to study text processing, and I see that to compare two texts
it is possible to obtain a vector model through the TF-IDF technique.
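
As a rough illustration of that idea, here is a minimal sketch in Python. It
assumes the plain textbook weighting tf * log(N / df), which is not necessarily
the formula Lucene or Mahout use, and the function names tf_idf_vectors and
cosine are made up for this example:

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn tokenized documents into dense TF-IDF vectors.

    docs is a list of token lists. Uses the textbook weight
    tf * log(N / df); this is an assumption for illustration,
    not the exact scoring of Lucene or Mahout.
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        vectors.append([tf[term] * idf[term] for term in vocab])
    return vocab, vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Calling tf_idf_vectors on a list of token lists gives one vector per text, and
cosine on two of those vectors gives a similarity score between the two texts.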

With Mahout it is possible to create vectors from text using lucene.vector
which, if I have understood correctly, takes a Lucene index and maps it to
TF-IDF vectors.

Lucene already provides this implementation by default. What I struggle to
understand is the advantage of having lucene.vector in Mahout when Lucene
offers that feature out of the box.

Maybe I'm missing something big, but what is the connection between them?
Could you please explain a possible use case?

Thanks for your help

Richard