I am new to Mahout. I have checked out the trunk with svn and installed it using mvn. Now I
wish to write a Java program (instead of the shell script
build-reuters.sh/cluster-reuters.sh) that performs a kmeans clustering by
calling the methods or by creating instance (if possible) in the classes
which convert the
I think mahout-core (and its internal dependencies) can do most of
what you need.
You will have to create your vectors yourself and write to HDFS.
Then use KMeansDriver's run method to do clustering.
Then use ClusterOutputPostProcessor to separate out vectors belonging to
different
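To make the steps above concrete, here is a minimal in-memory sketch of what k-means clustering computes: assign each vector to its nearest centroid, recompute the centroids, and finally report which cluster each vector belongs to. This is plain Java and deliberately does not use Mahout's classes; `KMeansSketch` and its method names are hypothetical, and the real `KMeansDriver` works on vectors stored in HDFS rather than on arrays.

```java
public class KMeansSketch {
    // Squared Euclidean distance between two equal-length vectors.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the centroid closest to the given point.
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Runs a fixed number of assign/update iterations and returns, for each
    // point, the index of the cluster it was assigned to in the last iteration.
    static int[] cluster(double[][] points, double[][] initialCentroids, int iterations) {
        double[][] centroids = new double[initialCentroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            centroids[c] = initialCentroids[c].clone();
        }
        int[] assignment = new int[points.length];
        for (int it = 0; it < iterations; it++) {
            // Assignment step: attach every point to its nearest centroid.
            for (int p = 0; p < points.length; p++) {
                assignment[p] = nearest(points[p], centroids);
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[centroids.length][points[0].length];
            int[] counts = new int[centroids.length];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < points[p].length; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster's centroid where it was
                }
                for (int d = 0; d < centroids[c].length; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }
}
```

Grouping the input vectors by the returned index gives one bucket per cluster, which is roughly the role the post-processing step plays after the driver has run.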
Have you tried this link?
http://shuyo.wordpress.com/2011/02/14/mahout-development-environment-with-maven-and-eclipse-2/
It explains how to import the Mahout in Action examples into Eclipse.
Just add the Hadoop and Mahout dependencies to pom.xml and there is a small
Mahout in Action example to run
I think you've got some old code lying around; this class doesn't exist anymore.
On Tue, Jan 3, 2012 at 2:32 PM, Andrea Leistra
andrea.leis...@concur.com wrote:
This morning I checked out the mahout trunk. When attempting mvn install I
get the following error:
[ERROR] COMPILATION ERROR :
The recent data is usually just the user history, not the off-line
item-item relationship build.
For brand new items, there is the cold start problem, but this is often
handled by putting these items on a New Arrivals page so that you can
expose them to users until you get enough data to include
Your math is correct.
When you say you have 105 features, what do you mean? Are these textual
features? Or what?
On Tue, Jan 3, 2012 at 2:53 PM, Grant Ingersoll gsing...@apache.org wrote:
I'm trying to run the full ASF email SGD classifier problem and am facing
heap size issues. My current
Do these algorithms have good locality? For doing giant online
computations it might be worth storing these in memory-mapped files.
Or, give up and get the M/R SGD code in.
On Tue, Jan 3, 2012 at 2:59 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Your math is correct.
When you say you have 105
No. They don't have particularly good locality. They would have moderate
hotspots, but these would be scattered all over. The hotspots might allow L2
cache to help, but would not allow disk based data to work.
The major opportunity for improvement here is to incorporate some of the
advances that
Hi All,
I'm currently running an item based recommendation
using KnnItemBasedRecommender. My data set isn't very large at
approximately 30k preferences over 10k items. When running
an AverageAbsoluteDifferenceRecommenderEvaluator evaluation on a 0.9
training set the result is ~0.80 (on a
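For readers unsure how to interpret the ~0.80 figure: an average-absolute-difference evaluation reports the mean of |actual - predicted| over the held-out preferences, so lower is better. The sketch below shows that calculation in plain Java; `MaeSketch` and the sample ratings used with it are hypothetical and it does not call Mahout's evaluator.

```java
public class MaeSketch {
    // Mean of |actual[i] - predicted[i]| over all held-out preferences.
    static double averageAbsoluteDifference(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double total = 0.0;
        for (int i = 0; i < actual.length; i++) {
            total += Math.abs(actual[i] - predicted[i]);
        }
        return total / actual.length;
    }
}
```

With held-out ratings of 4, 3, 5 and predictions of 3.5, 4, 5, the absolute errors are 0.5, 1 and 0, so the score is 0.5.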
On Jan 3, 2012, at 5:59 PM, Ted Dunning wrote:
Your math is correct.
When you say you have 105 features, what do you mean?
Sorry, that should have been 105 categories/labels. I'm trying to do the ASF
email equivalent of 20 news groups, but in this case it's 105 ASF projects.
The basic
If you can use an SVD-based recommender, here is a way to update an
SVD in constant time that is much much smaller than the original
decomposition.
http://www.merl.com/papers/docs/TR2006-059.pdf
On Tue, Jan 3, 2012 at 1:44 PM, Ted Dunning ted.dunn...@gmail.com wrote:
The recent data is usually
This topic has been discussed earlier. Check out this thread. This might
answer your question.
http://comments.gmane.org/gmane.comp.apache.mahout.user/10988
On 03-01-2012 22:18, prasenjit mukherjee wrote:
After I use mahout kmeans to create the clusters, does Mahout have
any tools/utilities
Ahh... of course. I should have understood that from the multiplication
you did since 104 = 105-1.
On Tue, Jan 3, 2012 at 7:58 PM, Grant Ingersoll gsing...@apache.org wrote:
On Jan 3, 2012, at 5:59 PM, Ted Dunning wrote:
Your math is correct.
When you say you have 105 features, what do
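The multiplication being discussed is not shown in this excerpt. Assuming it estimates the heap taken by a dense weight matrix with one row per category except a reference category (hence 104 = 105 - 1), the arithmetic would look like the sketch below. `WeightMatrixSize` is a hypothetical helper, and the feature count passed in is a placeholder, not a number from the thread.

```java
public class WeightMatrixSize {
    // Bytes for a dense (numCategories - 1) x numFeatures matrix of 8-byte doubles.
    static long denseWeightBytes(int numCategories, int numFeatures) {
        return (long) (numCategories - 1) * numFeatures * 8L;
    }
}
```

For example, `denseWeightBytes(105, 1000)` is 104 * 1000 * 8 = 832000 bytes; the estimate grows linearly with the number of features.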
That is the opposite of what you'd expect, and I think that's a possible
explanation you've identified, but still seems unlikely to me. Something
else may be wrong. Is this repeatable, and not just a fluke of the random
number generator? What are the exact args you're using, just to make sure