date:20130907

Hadoop implementation of ParallelSGDFactorizer

2013-09-07 Thread Tevfik Aytekin

Hi, There seems to be no Hadoop implementation of ParallelSGDFactorizer. ALSWRFactorizer has a Hadoop implementation. ParallelSGDFactorizer (since it is based on stochastic gradient descent) is much faster than ALSWRFactorizer. I don't know Hadoop well. But it seems to me that a Hadoop

Re: Kmeans - clustering help

2013-09-07 Thread P Kal

It seems that I've had the wrong idea the entire time. Thanks for the help. On Fri, Sep 6, 2013 at 3:45 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: seq2sparse uses Lucene Standard tokenization to generate the tfidf vectors. But since your data is in CSV format (from the example u had

Re: Hadoop implementation of ParallelSGDFactorizer

2013-09-07 Thread Sebastian Schelter

IIRC the algorithm behind ParallelSGDFactorizer needs shared memory, which is not given in a shared-nothing environment. On 07.09.2013 19:08, Tevfik Aytekin wrote: Hi, There seems to be no Hadoop implementation of ParallelSGDFactorizer. ALSWRFactorizer has a Hadoop implementation.
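
To illustrate the shared-memory point, here is a minimal sketch of lock-free parallel SGD for matrix factorization. It is not Mahout's code: the function names, the toy hyperparameters, and the lock-free update scheme are all assumptions made for illustration. What it shows is that every worker thread reads and writes the same factor arrays in place, which is exactly what separate Hadoop tasks in a shared-nothing cluster cannot do without extra synchronization or model averaging.

```python
import math
import random
import threading


def rmse(ratings, user_factors, item_factors):
    """Root mean squared error of the current factorization."""
    total = 0.0
    for user, item, rating in ratings:
        pred = sum(p * q for p, q in zip(user_factors[user], item_factors[item]))
        total += (rating - pred) ** 2
    return math.sqrt(total / len(ratings))


def sgd_worker(shard, user_factors, item_factors, lr, reg, epochs, seed):
    """Run SGD over one shard, updating the SHARED factor lists in place."""
    rng = random.Random(seed)
    order = list(shard)
    for _ in range(epochs):
        rng.shuffle(order)
        for user, item, rating in order:
            p = user_factors[user]   # same list object every thread sees
            q = item_factors[item]
            err = rating - sum(a * b for a, b in zip(p, q))
            for k in range(len(p)):
                pk, qk = p[k], q[k]
                # No locks: occasional overwritten updates are tolerated.
                p[k] = pk + lr * (err * qk - reg * pk)
                q[k] = qk + lr * (err * pk - reg * qk)


def factorize(ratings, num_users, num_items, rank=2, num_threads=4,
              lr=0.02, reg=0.02, epochs=200):
    """Hypothetical helper: factorize (user, item, rating) triples with threads."""
    rng = random.Random(42)
    user_factors = [[rng.uniform(0.01, 0.1) for _ in range(rank)]
                    for _ in range(num_users)]
    item_factors = [[rng.uniform(0.01, 0.1) for _ in range(rank)]
                    for _ in range(num_items)]
    # Each thread gets a slice of the ratings but NOT a copy of the model.
    shards = [ratings[i::num_threads] for i in range(num_threads)]
    workers = [
        threading.Thread(target=sgd_worker,
                         args=(shard, user_factors, item_factors,
                               lr, reg, epochs, i))
        for i, shard in enumerate(shards)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return user_factors, item_factors
```

In a MapReduce setting each mapper would hold its own copy of `user_factors` and `item_factors`, so the in-place updates above would never be seen by the other tasks.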

Re: Solr recommender

2013-09-07 Thread Ted Dunning

On Fri, Sep 6, 2013 at 9:33 AM, Pat Ferrel pat.fer...@gmail.com wrote: One of the unique things about the Solr recommender is online recs. Two scenarios come to mind: 1) ask the user to pick from among a list of videos, taking the picks as preferences and making recs. Make more and see if

Re: Hadoop implementation of ParallelSGDFactorizer

2013-09-07 Thread Tevfik Aytekin

Sebastian, what is IIRC? On Sat, Sep 7, 2013 at 8:24 PM, Sebastian Schelter ssc.o...@googlemail.com wrote: IIRC the algorithm behind ParallelSGDFactorizer needs shared memory, which is not given in a shared-nothing environment. On 07.09.2013 19:08, Tevfik Aytekin wrote: Hi, There seems to

Re: Hadoop implementation of ParallelSGDFactorizer

2013-09-07 Thread Ted Dunning

That means "If I Recall Correctly". It is internet slang. See also http://en.wiktionary.org/wiki/Appendix:English_internet_slang On Sat, Sep 7, 2013 at 12:39 PM, Tevfik Aytekin tevfik.ayte...@gmail.com wrote: Sebastian, what is IIRC? On Sat, Sep 7, 2013 at 8:24 PM, Sebastian Schelter

Re: Mahout readable output

2013-09-07 Thread Ted Dunning

Darius' comments are good. You also have to think about what similar means to you. From the data you describe, I see several possibilities: - geo-location from machine id (if it includes IP address) - content from the query - frequency of posting - diurnal phase of posting (tells us time

Re: Solr recommender

2013-09-07 Thread Pat Ferrel

On Sep 7, 2013, at 10:36 AM, Ted Dunning ted.dunn...@gmail.com wrote: On Fri, Sep 6, 2013 at 9:33 AM, Pat Ferrel pat.fer...@gmail.com wrote: One of the unique things about the Solr recommender is online recs. Two scenarios come to mind: 1) ask the user to pick from among a list of

Re: Solr recommender

2013-09-07 Thread Ted Dunning

On Sat, Sep 7, 2013 at 2:35 PM, Pat Ferrel p...@occamsmachete.com wrote: ... Clustering can be done by doing SVD or ALS on the user x thing matrix first or by directly clustering the columns of the user x thing matrix after some kind of IDF weighting. I think that only the streaming
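
As a rough sketch of the second option mentioned in this message (clustering the columns of the user x thing matrix after IDF-style weighting), the code below weights a small binary matrix and compares two "thing" columns. The message does not say which weighting is meant, so the choice here is an assumption: each user row is scaled by log(number of things / things that user touched), so very active users contribute less. The function names and toy data are hypothetical, and the actual clustering step is left out.

```python
import math


def idf_weight_rows(matrix):
    """Scale each user row by log(num_things / things_touched_by_user).

    Assumed weighting, for illustration only: users who touched
    everything get weight 0 and stop linking unrelated things.
    """
    num_things = len(matrix[0])
    weighted = []
    for row in matrix:
        touched = sum(1 for v in row if v)
        weight = math.log(num_things / touched) if touched else 0.0
        weighted.append([v * weight for v in row])
    return weighted


def column_cosine(matrix, a, b):
    """Cosine similarity between columns a and b (0.0 if either is all zero)."""
    col_a = [row[a] for row in matrix]
    col_b = [row[b] for row in matrix]
    dot = sum(x * y for x, y in zip(col_a, col_b))
    norm_a = math.sqrt(sum(x * x for x in col_a))
    norm_b = math.sqrt(sum(y * y for y in col_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Rows are users, columns are things; 1 means the user interacted.
raw = [
    [1, 1, 1],  # a user who touched everything
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
weighted = idf_weight_rows(raw)
```

In the toy data, things 0 and 2 look related in `raw` only because of the first user; after weighting, that user carries no signal, so `column_cosine(weighted, 0, 2)` drops to zero while things 0 and 1 stay similar. The weighted column similarities could then feed whatever clustering algorithm is used.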
