On Saturday 28 November 2009 08:30:26 Sean Owen wrote:
I'm all for generating and publishing this.
Great. Then I will go and tweak the checks to match our guidelines, twiddle a
bit with the output format and then integrate the stuff into our nightly
build.
I didn't see anything big flagged,
On Fri, Nov 27, 2009 at 11:23 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Summarize yes.
But this is, actually, theoretically better because the summarization
introduces useful smoothing. That way you get recommendations for items
even if there is no direct overlap.
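
A minimal sketch of the smoothing effect described above. The item-to-group mapping here is a hypothetical stand-in for whatever summarization (clustering, a decomposition, ...) is actually used, and all names and data are made up for illustration:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def summarize(prefs, item_to_group):
    """Collapse item-level preferences onto coarser groups (assumed mapping)."""
    out = {}
    for item, w in prefs.items():
        group = item_to_group[item]
        out[group] = out.get(group, 0.0) + w
    return out

# Two users who share no items at all.
alice = {"item1": 1.0, "item2": 1.0}
bob = {"item3": 1.0, "item4": 1.0}

# Hypothetical summary: items 1 and 3 fall in one group, 2 and 4 in another.
groups = {"item1": "g1", "item2": "g2", "item3": "g1", "item4": "g2"}

raw_sim = cosine(alice, bob)
smoothed_sim = cosine(summarize(alice, groups), summarize(bob, groups))
print(raw_sim, smoothed_sim)
```

On the raw item vectors the two users have zero similarity, so neither could drive recommendations for the other. After summarizing, they overlap, which is the sense in which summarization acts as smoothing.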
Summarize, smooth,
Restricted Boltzmann machines are of real interest, but again, I repeat the
obligatory warning about replicating all things from the Netflix
competition.
To take a few concrete examples,
- user biases were a huge advance in terms of RMS error, but they don't
affect the ordering of the results presented
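
To illustrate the user-bias point, here is a small sketch with made-up numbers. It assumes the bias is a single per-user constant added to every predicted rating for that user:

```python
from math import sqrt

def rank_items(scores):
    """Item ids ordered from highest to lowest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)

def add_user_bias(scores, bias):
    """Shift every prediction for one user by the same constant."""
    return {item: s + bias for item, s in scores.items()}

def rmse(predicted, actual):
    """Root mean squared error over the items in `actual`."""
    return sqrt(sum((predicted[i] - actual[i]) ** 2 for i in actual) / len(actual))

# Hypothetical data for one user who rates lower than the model expects.
actual = {"a": 2.5, "b": 3.5, "c": 2.0}
unbiased = {"a": 3.2, "b": 4.1, "c": 2.7}
biased = add_user_bias(unbiased, -0.7)

print(rmse(unbiased, actual), rmse(biased, actual))
print(rank_items(unbiased), rank_items(biased))
```

The error drops sharply once the bias is applied, but the list shown to the user is in the same order either way, since adding a constant to every score cannot reorder them.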
Jake,
Do you have any concrete information about how much difference there
actually is in these decompositions?
On Sat, Nov 28, 2009 at 8:31 AM, Jake Mannix jake.man...@gmail.com wrote:
or more precisely, a sparse SVD which doesn't treat
missing data as the numerical 0 or mean of the values
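
One common way to get a factorization that ignores missing entries is to fit the factors by stochastic gradient descent over the observed ratings only. The sketch below is that swapped-in technique, not necessarily the decomposition meant in this thread; the data, the function names and the hyperparameters are all assumptions for illustration:

```python
import random
from math import sqrt

def predict(P, Q, u, i):
    """Dot product of user u's and item i's factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def factorize(observed, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Fit P (users) and Q (items) using only the observed (u, i) -> r cells.

    Missing cells never enter the loss, so they are treated neither as 0
    nor as the mean rating.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in observed.items():
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(observed, P, Q):
    """Error measured on the observed cells only."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in observed.items())
    return sqrt(se / len(observed))

# Hypothetical sparse ratings: 3 users x 4 items, 8 of 12 cells observed.
ratings = {
    (0, 0): 5.0, (0, 1): 4.0, (0, 3): 1.0,
    (1, 0): 4.0, (1, 2): 2.0, (1, 3): 1.0,
    (2, 1): 1.0, (2, 2): 5.0,
}

P0, Q0 = factorize(ratings, 3, 4, epochs=0)  # initial random factors
P, Q = factorize(ratings, 3, 4)              # trained factors
print(rmse(ratings, P0, Q0), rmse(ratings, P, Q))
print(predict(P, Q, 2, 0))  # estimate for a cell that was never observed
```

A dense SVD would first have to fill the four empty cells with something, and that filler value would then pull the factors toward it.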
Isabel,
Wow, this looks great. There's lots of information in here. Sean
definitely has a point that it would be very nice to eliminate the
information about things we're not really concerned with. Also, I
wonder if these are cases where we need to add more checks. Is there
some report that
On Saturday 28 November 2009 21:29:05 Drew Farris wrote:
It will be interesting to see the reports for the other modules as
well: examples, utils, matrix.
As a little preview: Just substitute mahout-core with mahout-modulename in
the url below:
df/mapred works with the old hadoop API
df/mapreduce works with hadoop 0.20 API
On Saturday, November 28, 2009, Sean Owen sro...@gmail.com wrote:
I'm all for generating and publishing this.
The CPD results highlight a question I had: what's up with the amount
of duplication between