Build Error when running Mahout examples

2012-12-02 Thread Reinhard Denis Najogie
Hi all, I just installed Apache Mahout and now I want to try running an example dataset with it. I followed the example here: https://cwiki.apache.org/MAHOUT/recommendationexamples.html Then I tried the MovieLens example with this command: mvn -q exec:java -Dexec.mainClass="

Re: Build Error when running Mahout examples

2012-12-02 Thread Abhijith CHandraprabhu
Hello, I too am a newbie with Mahout; however, I tried the same command you have, and it worked. I had ratings.dat in my /home/abijith/Downloads, and I went into the examples directory in /workspace/mahout-0.7/examples and executed the same command you have, except I changed the location of the inpu

Mahout Amazon EMR usage cost

2012-12-02 Thread Koobas
I was wondering if somebody could give me a rough estimate of the cost of running Mahout on Amazon's Elastic MapReduce for a specific problem. I am working with a common case of implicit feedback. I have a simple, boolean input, i.e., user-item pairs (userID, itemID). I would like to find 50 neares

Re: Mahout Amazon EMR usage cost

2012-12-02 Thread Sean Owen
My guess is: less than $10. Little enough that I wouldn't worry about it. But I have not tried it directly. You just have 10K items, so it ought to be relatively quick to find similar items for them. You will want to look at ItemSimilarityJob. Setting some parameters like --maxSimilaritiesPerRow a

Re: Mahout Amazon EMR usage cost

2012-12-02 Thread Koobas
Thank you very much. The pointer to Myrrix is a very useful piece of information. Myrrix, however, relies on an iterative sparse matrix factorization to do PCA. I want to produce Amazon-like recommendations, i.e., "70% of users who bought this also bought that." So, I specifically want the direct k
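
As a rough illustration of the "X% of users who bought this also bought that" figure described above, the following is a minimal in-memory sketch in Python. It is not Mahout's ItemSimilarityJob and says nothing about how that job computes similarities; the function name `also_bought` and the `top_k` parameter are hypothetical, chosen only for this example. It assumes the boolean (userID, itemID) pairs fit in memory.

```python
from collections import defaultdict
from itertools import permutations


def also_bought(pairs, top_k=50):
    """Hypothetical helper: for each item, list up to top_k other items
    with the fraction of that item's buyers who also bought the other item."""
    # Boolean feedback: keep the distinct items per user, ignoring duplicates.
    items_by_user = defaultdict(set)
    for user_id, item_id in pairs:
        items_by_user[user_id].add(item_id)

    buyers = defaultdict(int)                        # item -> number of buyers
    together = defaultdict(lambda: defaultdict(int)) # item a -> item b -> co-buyers
    for items in items_by_user.values():
        for item in items:
            buyers[item] += 1
        # Ordered pairs, because the fraction is relative to buyers of `a`.
        for a, b in permutations(items, 2):
            together[a][b] += 1

    result = {}
    for a, counts in together.items():
        ranked = sorted(
            ((b, n / buyers[a]) for b, n in counts.items()),
            key=lambda t: (-t[1], str(t[0])),  # highest fraction first, ties by item id
        )
        result[a] = ranked[:top_k]
    return result
```

Calling `also_bought(pairs, top_k=50)` on a list of (userID, itemID) tuples returns a dictionary keyed by item. The fraction is asymmetric by design: it is divided by the number of buyers of the first item, so it reads directly as "share of buyers of this item who also bought that one". Items that never co-occur with any other item do not appear in the result.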

CVB CPU Utilization

2012-12-02 Thread Markus Paaso
Hi, I am having trouble utilizing all available CPU power with the 'mahout cvb' command. CPU usage is only about 35% and IO wait is ~0%. I have 8 cores and 28 GB of memory in a single computer that is running Mahout 0.7-cdh-4.1.2 with Hadoop 2.0.0-cdh4.1.2 in pseudo-distributed mode. How can I take adv