Re: Question on RecommenderJob

2014-09-23 Thread Wei Li
OK, got your points, thanks Ferrel and Peng. On Sun, Sep 21, 2014 at 11:39 PM, Pat Ferrel pat.fer...@gmail.com wrote: @li Mahout down-samples the input data based on how “important” the cooccurrence of interactions seems to be. I’d use SIMILARITY_LOGLIKELIHOOD for the best measure of this.
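
For readers unfamiliar with the log-likelihood measure mentioned above, here is a minimal sketch of the standard log-likelihood ratio (LLR) on a 2x2 cooccurrence table. It is written in Python for brevity and is not taken from Mahout's source. The function names and the count layout (k11 = both items seen together, k12 and k21 = one without the other, k22 = neither) are assumptions made for illustration.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts:
    # N * ln(N) - sum(k * ln(k)), where N is the total count.
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # k11: users who interacted with both items
    # k12: users who interacted with item A only
    # k21: users who interacted with item B only
    # k22: users who interacted with neither item
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative values from rounding.
    return max(0.0, 2.0 * (row_entropy + col_entropy - matrix_entropy))
```

A table whose rows and columns are independent scores close to zero, while a table in which the two items almost always occur together scores high. That is the sense in which the score ranks how "important" a cooccurrence is.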

Apache Mahout 0.9 LDA CVB Example

2014-09-23 Thread Shahid Shaikh
Hi, I am currently working on a project that needs categorization of documents (unstructured data) based on the internal context of each document. I am using the Apache Mahout clustering solution for this. So far we have explored Kmeans and Canopy with Kmeans. We have also used Lucene

Mahout Classification Beginner

2014-09-23 Thread user12121
Hey, I've just begun working with Mahout and I'm having trouble with very basic stuff. Essentially, I want to take the common iris.csv data and classify it using the LogisticRegression classifier. I'm having trouble with the vectorization step, as the features are integers and the

Re: NumberFormatException when running mahout

2014-09-23 Thread Bart Vandewoestyne
On 09/23/2014 07:48 AM, Ted Dunning wrote: On Mon, Sep 22, 2014 at 8:13 AM, Bart Vandewoestyne bart.vandewoest...@telenet.be wrote: 14/09/22 17:05:01 INFO mapreduce.Job: Task Id : attempt_1410945757266_2536_m_00_0, Status : FAILED Error: java.lang.NumberFormatException: For input string:

Universal Recommender

2014-09-23 Thread Pat Ferrel
Name suggestions are appreciated but this is meant to be about the similarity engine (search engine) recommender. Recently Lucidworks (the Solr people) announced Fusion, a closed source extension to the Lucidworks offering. It includes a recommender API, which makes it easier to deal with

word weights using BM25

2014-09-23 Thread Arian Pasquali
Hi, I was wondering if it would be possible to support BM25 term weighting by extending Mahout's tf-idf implementation. I was curious to know if anyone here has already tried to do so. If not, what would be your suggestion for such an implementation in Mahout? Arian Pasquali

Re: word weights using BM25

2014-09-23 Thread Ted Dunning
Should be pretty easy. I haven't heard of anyone doing it. Sent from my iPhone On Sep 23, 2014, at 18:53, Arian Pasquali ar...@arianpasquali.com wrote: Hi, I was wondering if it would be possible to support BM25 term weighting by extending Mahout's tf-idf implementation. I was curious to

Re: word weights using BM25

2014-09-23 Thread Suneel Marthi
Lucene 4.x supports okapi-bm25. So it should be easy to implement. On Tue, Sep 23, 2014 at 11:57 PM, Ted Dunning ted.dunn...@gmail.com wrote: Should be pretty easy. I haven't heard of anyone doing it. Sent from my iPhone On Sep 23, 2014, at 18:53, Arian Pasquali ar...@arianpasquali.com
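
To make the BM25 discussion in this thread concrete, here is a minimal sketch of a common form of the Okapi BM25 term weight, written in Python. It is not Mahout or Lucene code. The function name `bm25_weight` and its parameters are hypothetical, the defaults k1 = 1.2 and b = 0.75 are the commonly used values, and the smoothed idf shown is one of several variants in use.

```python
import math


def bm25_weight(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    # tf: occurrences of the term in this document
    # df: number of documents containing the term
    # n_docs: total number of documents in the collection
    # doc_len / avg_doc_len: this document's length and the collection average
    if tf == 0:
        return 0.0
    # Smoothed idf; the "1 +" keeps the value positive for very common terms.
    idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
    # Length normalization: b = 0 disables it, b = 1 applies it fully.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    # Saturating term-frequency component, bounded above by (k1 + 1).
    return idf * tf * (k1 + 1.0) / (tf + norm)
```

Compared with plain tf-idf, the term-frequency part saturates as tf grows instead of rising linearly, and longer-than-average documents are penalized. Those are the two behaviors a BM25 extension of a tf-idf weighting step would need to add.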