Re: Publish code quality reports on web-site?

2009-11-28 Thread deneche abdelhakim
df/mapred works with the old Hadoop API; df/mapreduce works with the Hadoop 0.20 API. On Saturday, November 28, 2009, Sean Owen wrote: > I'm all for generating and publishing this. > > > The CPD results highlight a question I had: what's up with the amount > of duplication between org/apache/mahout/df/

Re: Publish code quality reports on web-site?

2009-11-28 Thread Isabel Drost
On Saturday 28 November 2009 21:29:05 Drew Farris wrote: > It will be interesting to see the reports for the other modules as > well. examples, utils, matrix. As a little preview: Just substitute mahout-core with mahout- in the url below: http://people.apache.org/~isabel/mahout_site/mahout-co

[jira] Updated: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2009-11-28 Thread Drew Farris (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Farris updated MAHOUT-11: -- Attachment: MAHOUT-11-all-cleanup-20091128.patch MAHOUT-11-all-cleanup-20091128.patch eliminates the

Re: Publish code quality reports on web-site?

2009-11-28 Thread Drew Farris
Isabel, Wow, this looks great. There's lots of information in here. Sean definitely has a point where it would be very nice to eliminate the information about things we're not really concerned with. Also, I wonder if these are cases where we need to add more checks. Is there some report that tracks

Re: NMF for Taste

2009-11-28 Thread Ted Dunning
Jake, Do you have any concrete information about how much difference there actually is in these decompositions? On Sat, Nov 28, 2009 at 8:31 AM, Jake Mannix wrote: > or more precisely, a sparse SVD which doesn't treat > missing data as the numerical 0 or mean of the values > -- Ted Dunning,

Re: NMF for Taste

2009-11-28 Thread Ted Dunning
Restricted Boltzmann machines are of real interest, but again, I repeat the obligatory warning about replicating all things from the Netflix competition. To take a few concrete examples, - user biases were a huge advance in terms of RMS error, but they don't affect the ordering of the results presented to

Re: NMF for Taste

2009-11-28 Thread Jake Mannix
On Fri, Nov 27, 2009 at 11:23 PM, Ted Dunning wrote: > Summarize yes. > > But this is, actually, theoretically better because the summarization > introduces useful smoothing. That way you get recommendations for items > even if there is no direct overlap. > Summarize, smooth, and enhance clus

[jira] Created: (MAHOUT-210) Publish code quality reports through maven

2009-11-28 Thread Isabel Drost (JIRA)
Publish code quality reports through maven -- Key: MAHOUT-210 URL: https://issues.apache.org/jira/browse/MAHOUT-210 Project: Mahout Issue Type: New Feature Components: Website Affects Ver

Re: Publish code quality reports on web-site?

2009-11-28 Thread Isabel Drost
On Saturday 28 November 2009 08:30:26 Sean Owen wrote: > I'm all for generating and publishing this. Great. Then I will go and tweak the checks to match our guidelines, twiddle a bit with the output format and then integrate the stuff into our nightly build. > I didn't see anything big flagged,