Re: Predictive analysis problem

2011-09-11 Thread Dmitriy Lyubimov
Yep, I was at that presentation. I think they just went on to say that at that scale it is much more effective to get rid of these 2 percent than to keep sending them for triage and trying to figure out what it is about them that makes their hardware fail. Very good presentation for any farm a

Re: Predictive analysis problem

2011-09-11 Thread Ted Dunning
Yeah... interesting you should say that. One of the things that MapR does is to monitor disk speeds and mark disks as bad when they start to stand out from the background (in a bad way). That can be a bit pessimistic, but it is really best to get the bad apples out early even at the cost of some
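To make the "stand out from the background" idea concrete, here is a minimal sketch of one way to flag slow disks. It is not MapR's method, which the message does not describe; it assumes a robust z-score based on the median and the median absolute deviation (MAD). The function name `slow_disks`, the throughput readings, and the 3.5 cutoff are all hypothetical choices for illustration.

```python
from statistics import median


def slow_disks(throughput_mb_s, threshold=3.5):
    """Return the disks whose throughput is far below the fleet median.

    throughput_mb_s: dict mapping a disk name to a measured throughput.
    threshold: how many robust z-score units below the median counts as bad
               (an assumed cutoff, not a value taken from the thread).
    """
    values = list(throughput_mb_s.values())
    med = median(values)
    # MAD: median of absolute deviations from the median; robust to outliers.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All disks look (almost) identical; nothing stands out.
        return []
    flagged = []
    for disk, value in throughput_mb_s.items():
        # 0.6745 rescales the MAD so the score is comparable to a standard
        # z-score when the data are roughly normal.
        score = 0.6745 * (value - med) / mad
        if score < -threshold:  # only the slow side counts as "bad"
            flagged.append(disk)
    return flagged


if __name__ == "__main__":
    readings = {"sda": 100, "sdb": 98, "sdc": 102, "sdd": 101, "sde": 40}
    print(slow_disks(readings))
```

Using the median and MAD rather than the mean and standard deviation keeps a single very slow disk from dragging the baseline toward itself. A lower threshold retires disks earlier, which is the pessimistic trade-off the message mentions.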

Recommendation with a dataset with no/same preference

2011-09-11 Thread Manju
Dear Mahout team, I need some advice. The books "Mahout/Hadoop in Action" and online information have helped me digest the basic concepts and set up a single-node Hadoop + Mahout (run examples/write test programs/build etc.). I am prototyping a solution for an analytics problem using User/Itemrecom

Re: Recommendation with a dataset with no/same preference

2011-09-11 Thread Sean Owen
This is small enough that you can fit this into memory on one machine, and you do not need Hadoop. I would simply start with a GenericBooleanPrefItemBasedRecommender, and attach it to a LogLikelihoodSimilarity similarity metric. Wrap the LogLikelihoodSimilarity in a CachingItemSimilarity. You can
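For readers who want to see what a log-likelihood similarity over boolean preferences computes, here is a minimal standalone sketch in plain Python. It does not use the Mahout API. It assumes the standard G^2 log-likelihood ratio on a 2x2 co-occurrence table, squashes it into [0, 1) with 1 - 1/(1 + LLR), and scores candidate items by summing their similarity to the items a user already has. The squashing, the scoring rule, the function names, and the sample data are assumptions made for illustration, not something stated in the thread.

```python
import math


def llr(k11, k12, k21, k22):
    """G^2 log-likelihood ratio for a 2x2 contingency table of counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for count, r, c in cells:
        if count > 0:  # an empty cell contributes nothing
            total += count * math.log(count * n / (rows[r] * cols[c]))
    return max(0.0, 2.0 * total)  # clip tiny negative rounding errors


def item_similarity(users_a, users_b, num_users):
    """Similarity of two items, given the sets of users who have each one."""
    both = len(users_a & users_b)
    only_a = len(users_a) - both
    only_b = len(users_b) - both
    neither = num_users - both - only_a - only_b
    return 1.0 - 1.0 / (1.0 + llr(both, only_a, only_b, neither))


def recommend(prefs, user, top_n=3):
    """prefs maps each user to the set of items they have (no ratings)."""
    users_by_item = {}
    for u, items in prefs.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(u)
    seen = prefs[user]
    scores = {}
    for candidate, candidate_users in users_by_item.items():
        if candidate in seen:
            continue
        # Sum similarity to everything the user already has.
        scores[candidate] = sum(
            item_similarity(candidate_users, users_by_item[s], len(prefs))
            for s in seen
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    prefs = {
        "u1": {"A", "B"},
        "u2": {"A", "B", "C"},
        "u3": {"C", "D"},
        "u4": {"D"},
    }
    print(recommend(prefs, "u1"))
```

The sketch recomputes every similarity on each call. Caching the pairwise similarities, which is the role of the CachingItemSimilarity wrapper suggested above, avoids that repeated work when the item set is small and stable.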

Error while running any clustering tasks

2011-09-11 Thread Varun Thacker
I'm using Mahout 0.5. I am using Lucene (the matching version in the pom.xml) to index a tiny data set for testing. This is what the index looks like: _0.fdt _0.fnm _0.nrm _0.tii _0.tvd _0.tvx segments.gen _0.fdx _0.frq _0.prx _0.tis _0.tvf segments_1 Now I use this command to create vectors the

Re: Recommendation with a dataset with no/same preference

2011-09-11 Thread Ted Dunning
Binary preferences are fine. In fact, I generally recommend that all ratings and related information be distilled down to a single binary indicator such as you already have. The fact that you have so few items will be both your advantage and disadvantage. It will help you avoid problems with spa

Re: Recommendation with a dataset with no/same preference

2011-09-11 Thread Manju
Ted and Sean, Thanks for the suggestion/advice. My prototype ran successfully (programmatically :)) with GenericBooleanPrefItemBasedRecommender. I am reviewing/reflecting on the output. Thanks again. Manju From: Ted Dunning To: user@mahout.apache.org; Manju Cc

Re: Recommendation with a dataset with no/same preference

2011-09-11 Thread Ted Dunning
Good luck. Let us know how it turns out. On Sun, Sep 11, 2011 at 2:55 PM, Manju wrote: > Ted and Sean, > Thanks for the suggestion/advice. My prototype ran successfully > (programmatically :)) with GenericBooleanPrefItemBasedRecommender. I am > reviewing/reflecting on the output. > Thanks again.

Spec for a common import/export service for Mahout jobs

2011-09-11 Thread Lance Norskog
https://cwiki.apache.org/confluence/display/MAHOUT/Import+Export+Sequence+File+Formats Please have a look; comment or rewrite as you please. It's a wish list of what I would want, approaching Mahout either as an experienced user or as a newbie. -- Lance Norskog goks...@gmail.com