Re: Install Mahout without Internet

2014-06-25 Thread Dominik Hübner
As long as your compiling machine has internet access, you could also build a fat jar with all dependencies: http://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven On 25 Jun 2014, at 16:10, Adri Gómez adri12...@gmail.com wrote: Hi, I

Re: Netflix dataset

2014-02-13 Thread Dominik Hübner
As far as I remember, they disallowed further distribution after a paper was published describing how IMDB profiles can be used to de-anonymize the Netflix dataset. On 13 Feb 2014, at 14:01, Abhijith CHandraprabhu abhiji...@gmail.com wrote: Unfortunately the dataset is no longer available as

Re: Solr-recommender for Mahout 0.9

2013-11-07 Thread Dominik Hübner
Does anyone know what the difference is between keeping the ids in a space-delimited string and indexing a multivalued field of ids? I recently tried the latter since ... it felt right; however, I am not sure what advantages each of the two has. On 07 Nov 2013, at 18:18, Pat Ferrel

Re: Tweaking ALS models to filter out highly related items when an item has been purchased

2013-09-05 Thread Dominik Hübner
Just a quick assumption, maybe I have not thought this through enough: 1. Users probably tend to compare products = similar VIEWS 2. Users might also tend to PURCHASE accessory products, like the laptop bag you mentioned. Maybe you could filter out products that have a similarity computed

Re: Tweaking ALS models to filter out highly related items when an item has been purchased

2013-09-05 Thread Dominik Hübner
a penalty score. When done with this, I re-sort the results and the duplicative content falls to the bottom of the recommendations. On Thu, Sep 5, 2013 at 1:15 AM, Dominik Hübner cont...@dhuebner.com wrote: Just a quick assumption, maybe I have not thought this through enough: 1
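
The penalize-and-re-sort idea in the snippet above could look roughly like the following. This is only a minimal sketch under stated assumptions: the function name, the `threshold` and `penalty` values, and the pairwise `similarity` lookup are all hypothetical and do not come from the thread or from Mahout.

```python
def rerank_with_penalty(recommendations, purchased, similarity,
                        threshold=0.8, penalty=0.5):
    """Down-weight recommended items that are near-duplicates of purchased ones.

    recommendations: list of (item, score) pairs.
    purchased: iterable of items the user already bought.
    similarity: dict mapping frozenset({a, b}) to a similarity value
                (hypothetical lookup; missing pairs count as 0.0).
    """
    rescored = []
    for item, score in recommendations:
        # Highest similarity between this candidate and any purchased item.
        max_sim = max(
            (similarity.get(frozenset((item, p)), 0.0) for p in purchased),
            default=0.0,
        )
        if max_sim >= threshold:
            score *= penalty  # assumed multiplicative penalty
        rescored.append((item, score))
    # Re-sort so that penalized items fall toward the bottom.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

With a multiplicative penalty, accessory items that are not similar to the purchased product keep their position, while near-substitutes are pushed down rather than removed outright.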

Content-Based Recommendation Approaches

2013-08-06 Thread Dominik Hübner
I've just finished implementing a collaborative recommender system based on views, sales and similar user-product interactions. This approach works quite well and I am pleased with the results. Nevertheless, I want to know what influence the integration of some item-specific features would

Re: Setting up a recommender

2013-07-19 Thread Dominik Hübner
+1 for getting something like that in a future release of Mahout On Jul 19, 2013, at 10:02 PM, Sebastian Schelter s...@apache.org wrote: It would be awesome if we could get a nice, easily deployable implementation of that approach into Mahout before 1.0 2013/7/19 Ted Dunning

Re: Blending initial recommendations for cross recommendation

2013-06-04 Thread Dominik Hübner
-as-recommendation On Jun 1, 2013, at 12:19 PM, Dominik Hübner cont...@dhuebner.com wrote: Thanks for those detailed responses! So I assume that the problem of scaling the initial recommendations can be implicitly solved by learning the weights for the linear combination. The log rank Ted mentioned
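
One way to picture blending with a linear combination over log ranks is sketched below. Everything here is an assumption for illustration: the thread does not spell out the exact form of the log-rank feature, so the sketch uses the negative natural log of the 1-based rank, treats a missing item as ranked one past the end of the list, and takes hand-picked weights where the thread talks about learning them.

```python
import math


def blend_by_log_rank(ranked_lists, weights):
    """Blend several ranked item lists with a weighted sum of -log(rank).

    ranked_lists: list of lists, each ordered best-first.
    weights: one weight per list (hand-picked here; assumed to be learned
             in a real system).
    Returns all items ordered by blended score, best first.
    """
    items = {item for ranked in ranked_lists for item in ranked}
    blended = {}
    for item in items:
        total = 0.0
        for ranked, weight in zip(ranked_lists, weights):
            # 1-based rank; items absent from a list rank just past its end.
            rank = ranked.index(item) + 1 if item in ranked else len(ranked) + 1
            total += weight * -math.log(rank)
        blended[item] = total
    # Sort by descending score, breaking ties by item name for determinism.
    return sorted(blended, key=lambda i: (-blended[i], i))
```

Because only ranks enter the score, the raw scores of the initial recommenders never need to be on a common scale, which is the scaling concern raised in the message.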

Re: Blending initial recommendations for cross recommendation

2013-06-01 Thread Dominik Hübner
? On Fri, May 31, 2013 at 3:07 PM, Dominik Hübner cont...@dhuebner.com wrote: Hey, I have implemented a cross recommender based on the approach Ted Dunning proposed (cannot find the original post, but here is a follow-up: http://www.mail-archive.com/user@mahout.apache.org/msg12983.html

Blending initial recommendations for cross recommendation

2013-05-31 Thread Dominik Hübner
Hey, I have implemented a cross recommender based on the approach Ted Dunning proposed (cannot find the original post, but here is a follow-up: http://www.mail-archive.com/user@mahout.apache.org/msg12983.html). Currently I am struggling with the last step of blending the initial

Clustering product views and sales

2013-05-06 Thread Dominik Hübner
I am currently working on a dataset containing product views and sales of about 10^7 users and 6000 items for my master's thesis in CS. My goal is to build product clusters from this. As expected, item-(row)-vectors are VERY sparse. My current approach is to implement PCA using the SVDSolver

Re: Clustering product views and sales

2013-05-06 Thread Dominik Hübner
And running the clustering on the cooccurrence matrix or doing PCA by removing eigenvalues/vectors? On May 6, 2013, at 8:52 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner cont...@dhuebner.com wrote: Oh, and I forgot how the views and sales

Re: Clustering product views and sales

2013-05-06 Thread Dominik Hübner
). What is the high-level goal that you are trying to solve with this clustering? On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner cont...@dhuebner.com wrote: And running the clustering on the cooccurrence matrix or doing PCA by removing eigenvalues/vectors? On May 6, 2013, at 8:52 PM, Ted

Re: Clustering product views and sales

2013-05-06 Thread Dominik Hübner
to best performance. On Mon, May 6, 2013 at 12:35 PM, Dominik Hübner cont...@dhuebner.com wrote: Well, as you already might have guessed, I am building a product recommender system for my thesis. I am planning to evaluate ALS (both implicit and explicit) as well as item-similarity

Re: Clustering product views and sales

2013-05-06 Thread Dominik Hübner
sparsification or ALS+reconstruction. These indicators can be historical items or static items such as geo information. These indicators can be combined in a single step using a search engine. On Mon, May 6, 2013 at 2:58 PM, Dominik Hübner cont...@dhuebner.com wrote: The cluster

Re: Mahout in Action with Mahout 0.7

2013-04-14 Thread Dominik Hübner
Hey, have a look here: https://github.com/tdunning/MiA/tree/mahout-0.7 (examples from the book ported to 0.7). On Apr 14, 2013, at 8:12 PM, Max Bridgewater max.bridgewa...@gmail.com wrote: Hi Folks, I am learning Mahout using Mahout in Action. The problem: it is not based on the current

Re: JobConf and ClassPath

2013-04-09 Thread Dominik Hübner
Do you run Hadoop in pseudo-distributed mode or on a real cluster? If Hadoop is running on a cluster, the classes on your classpath are of course not reachable. I usually let Maven include the Mahout jars in my project's package. This way your application becomes independent of what kind of Hadoop

Re: JobConf and ClassPath

2013-04-09 Thread Dominik Hübner
Try adding this to your pom file: <build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-assembly-plugin</artifactId> <executions> <execution>

QR decomposition in ALS-WR code

2013-03-15 Thread Dominik Hübner
I was recently having a look at the ALS-WR factorisation code. Why is there a QR decomposition before computing u_i or m_j instead of multiplying the inverse of A_i with V_i straightaway? (reference to these two classes
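
To make the question concrete: solving the linear system A_i u_i = V_i through a QR factorization avoids forming the inverse of A_i explicitly, which is generally considered the numerically safer route. Whether that is the reason in the ALS-WR code is not confirmed by this message. The sketch below is a plain-Python illustration, not Mahout's implementation; the function names are hypothetical and the matrix is assumed to be small, square and non-singular.

```python
def qr_decompose(a):
    """Modified Gram-Schmidt QR of a square matrix given as a list of rows.

    Returns (q_cols, r): the orthonormal columns of Q and the upper
    triangular matrix R, such that A = Q R.
    """
    n = len(a)
    cols = [[a[row][col] for row in range(n)] for col in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        # Remove the components along the already computed q vectors.
        for i, q in enumerate(q_cols):
            r[i][j] = sum(qk * vk for qk, vk in zip(q, v))
            v = [vk - r[i][j] * qk for qk, vk in zip(q, v)]
        r[j][j] = sum(vk * vk for vk in v) ** 0.5
        q_cols.append([vk / r[j][j] for vk in v])
    return q_cols, r


def solve_qr(a, b):
    """Solve A x = b as R x = Q^T b, without ever forming the inverse of A."""
    q_cols, r = qr_decompose(a)
    n = len(a)
    y = [sum(qk * bk for qk, bk in zip(q, b)) for q in q_cols]  # Q^T b
    x = [0.0] * n
    # Back substitution on the upper triangular system R x = y.
    for i in reversed(range(n)):
        tail = sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - tail) / r[i][i]
    return x
```

In this sketch `a` plays the role of A_i and `b` the role of V_i; the returned vector corresponds to u_i (or m_j for the item-side update).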