This is a little meditation on user vs. item matrix density. The
heavy users and heavy items can be subsampled, once they are
identified. Hadoop's built-in sort does give a very simple
map-increase way to do this sort.
http://ultrawhizbang.blogspot.com/2011/08/sorted-recommender-data.html
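A minimal single-machine sketch of the idea above, not the Hadoop job from the linked post: count ratings per user, sort the counts in descending order so the heavy users surface first, then randomly subsample any user above a cap. The function name `subsample_heavy_users`, the `max_per_user` cap, and the `(user, item, value)` tuple layout are assumptions made for illustration. The same approach applies to heavy items by grouping on the item field instead.

```python
import random
from collections import Counter


def subsample_heavy_users(ratings, max_per_user, seed=0):
    """Cap each user's ratings at max_per_user by random subsampling.

    ratings: list of (user, item, value) tuples (hypothetical layout).
    Returns (ranked, kept): users ranked by descending rating count,
    and the ratings that survive the cap.
    """
    rng = random.Random(seed)

    # Group ratings by user.
    by_user = {}
    for row in ratings:
        by_user.setdefault(row[0], []).append(row)

    # Sort users by how many ratings they have, heaviest first.
    counts = Counter({user: len(rows) for user, rows in by_user.items()})
    ranked = counts.most_common()

    # Keep light users whole; subsample the heavy ones down to the cap.
    kept = []
    for user, n in ranked:
        rows = by_user[user]
        if n > max_per_user:
            rows = rng.sample(rows, max_per_user)
        kept.extend(rows)
    return ranked, kept
```

On a cluster the counting and sorting would be spread across map and reduce phases; the logic per user stays the same.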
On
That's correct. Well you just have to recompose the user row you are
interested in. It will no longer be sparse, at all. Those new values are
your estimated ratings.
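As a rough illustration of recomposing one user row, assuming the discussion concerns a low-rank factorization such as a truncated SVD (the preview itself does not say so): each estimated rating is the sum over factors of the user's factor value, the singular value, and the item's factor value. The function name and argument layout below are hypothetical.

```python
def recompose_user_row(user_factors, singular_values, item_factors):
    """Estimate one user's ratings for every item from low-rank factors.

    user_factors:    the user's row of U, length k
    singular_values: the diagonal of S, length k
    item_factors:    one row of V per item, each of length k
    Assumes a factorization of the form U * S * V^T (an assumption,
    not something stated in the thread preview).
    """
    return [
        sum(u * s * v for u, s, v in zip(user_factors, singular_values, item_row))
        for item_row in item_factors
    ]
```

The result has one value per item, which is why the recomposed row is dense even when the original row was sparse.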
On Fri, Aug 26, 2011 at 12:07 AM, Jeff Hansen dsche...@gmail.com wrote:
I also think I may have missed a big step of the puzzle.
I got this axis of interest concept from a presentation by one of the
Netflix team runners-up; I don't know which one. He did not give a name for
it. Is there a standard term? I hate just making up new words.
Also, there are clusters of items at both ends, but there are also items
along the axis
Hi there,
I'm really new to Apache Mahout. I did the Quickstart and have now started this
example:
https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
I wanted to run the Demo section. If I start Jetty via mvn jetty:run-war, I
get the following exceptions:
2011-08-26
The problem may be your changes; I would start with exactly what's in the
distribution, since it works.
I am not sure that the demo will work when accessed through Jetty, as a web
service. I don't know that Jetty has Axis in it. But the servlet-based API
should work fine.
Really, you'd want to
Hi Sean,
thank you for your hints. I have now used the original pom, where packaging is
set to jar, so no .war file gets created. Do you have an idea what to change?
Thanks,
RK
-Original Message-
From: Sean Owen [mailto:sro...@gmail.com]
Sent: Friday, August 26, 2011 16:38
To:
Hi,
I am working on text mining of a huge data set. I have a big set of strings
(separated by a newline character), on which I want to run an algorithm
that can give me similarity distances between the strings. Further, I want
to use those distances to group the strings based on their similarities.
Thanks for the math Ted -- that was very helpful.
I've been using sparseMatrix() from library(Matrix) -- largely based on your
response to somebody else's email. I've been playing with smaller matrices
mainly for my own learning purposes -- it's much easier to read through 200
movies (most of
Atul, welcome to Mahout!
Although there are many interesting things you can do with your data, I would
recommend using k-means clustering to get a feel for Mahout's input mechanism,
sequence files etc.
You can find detailed explanation of clustering
on our wiki. The Mahout in Action book is
Ah. You are right. This doesn't work at the moment. The good news is I can
and have fixed it. And, the process for building the demo is faster and
simpler now. See my updated instructions on the wiki, and make sure to get
all the latest code from Subversion.
The bad news is that it has become
On Fri, Aug 26, 2011 at 8:29 AM, Jeff Hansen dsche...@gmail.com wrote:
Thanks for the math Ted -- that was very helpful.
NP.
... I've been playing with smaller matrices
mainly for my own learning purposes -- it's much easier to read through 200
movies (most of which I've heard of) and
Which is why Hoss always sends http://people.apache.org/~hossman/#threadhijack
when that happens.
On Aug 25, 2011, at 8:48 AM, Jeff Eastman wrote:
And, people (incl. me) often hit reply[-all] to a message to get the to:
and cc: fields, add an entirely new subject: and begin an entirely new
Thanks for doing that wiki maintenance.
In the long run, is this the best way to package the query phase of a
recommender? There are design tensions: understandability, ease of
demonstrating, pathway to making a packaged app.
On Fri, Aug 26, 2011 at 9:14 AM, Sean Owen sro...@gmail.com wrote:
http://www.nytimes.com/interactive/2010/01/10/nyregion/20100110-netflix-map.html
Do not fear demographics. Yes, some people rent movies with all-black casts,
and other people rent movies with all-white casts. And the Walmarts in the
SF East Bay have pallets full of Tyler Perry videos, while most