Re: [jira] [Commented] (MAHOUT-737) Implicit Alternating Least Squares SVD

2012-01-14 Thread Ted Dunning
I would recommend not worrying about having a special symmetric matrix for now. It won't make a huge difference and it will be a pain to convert the one that ssvd uses. The diagonal matrix could make a much bigger difference. Sent from my iPhone On Jan 14, 2012, at 15:06, Lance Norskog wro

Build failed in Jenkins: Mahout-Quality #1306

2012-01-14 Thread Apache Jenkins Server
See -- [...truncated 33132 lines...] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec Running org.apache.mahout.cf.taste.common.TopKMinKTest Tests run: 6, Failures: 0, Errors: 0, S

Re: [jira] [Commented] (MAHOUT-737) Implicit Alternating Least Squares SVD

2012-01-14 Thread Lance Norskog
There is a packed symmetric matrix impl in the Stochastic SVD stuff, but it is hard-coded to a packed implementation. org.apache.mahout.math.hadoop.stochasticsvd.UpperTriangular - mahout/math You could recode this to use the Vector class for storage. Be sure to run all of the Matrix unit tests if
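For readers unfamiliar with packed symmetric storage, the idea mentioned above can be sketched as follows. This is only an illustration of the general technique: the class name `PackedSymmetricMatrix` and its methods are hypothetical, and it is not the code of `UpperTriangular` or any other Mahout class. It stores the n(n+1)/2 entries of the upper triangle in one row-major array and maps (row, col) and (col, row) to the same slot.

```java
/**
 * Illustrative sketch only (hypothetical class, not Mahout's UpperTriangular).
 * Stores a symmetric n x n matrix as its upper triangle in a packed array.
 */
final class PackedSymmetricMatrix {
  private final int n;
  private final double[] values; // n * (n + 1) / 2 entries, row-major upper triangle

  PackedSymmetricMatrix(int n) {
    if (n < 0) {
      throw new IllegalArgumentException("n must be non-negative");
    }
    this.n = n;
    this.values = new double[n * (n + 1) / 2];
  }

  /** Maps (row, col) to its slot; (row, col) and (col, row) share one slot. */
  private int index(int row, int col) {
    if (row < 0 || row >= n || col < 0 || col >= n) {
      throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
    }
    int i = Math.min(row, col);
    int j = Math.max(row, col);
    // Rows 0..i-1 hold n + (n - 1) + ... + (n - i + 1) = i * n - i * (i - 1) / 2 entries.
    return i * n - i * (i - 1) / 2 + (j - i);
  }

  double get(int row, int col) {
    return values[index(row, col)];
  }

  void set(int row, int col, double value) {
    values[index(row, col)] = value;
  }

  /** Number of doubles actually stored. */
  int storedEntries() {
    return values.length;
  }
}
```

Compared with a dense n x n array, this roughly halves memory, at the cost of an index computation on every access.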

[jira] [Commented] (MAHOUT-890) Performance issue in FPGrowth

2012-01-14 Thread Robin Anil (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186346#comment-13186346 ] Robin Anil commented on MAHOUT-890: --- Strike that, it works in the order 920, 921, 927, 8

[jira] [Commented] (MAHOUT-890) Performance issue in FPGrowth

2012-01-14 Thread Robin Anil (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186345#comment-13186345 ] Robin Anil commented on MAHOUT-890: --- The Patch fails when applying. Can you send me a ne

Jenkins build is back to normal : Mahout-Examples-Cluster-Reuters-II #11

2012-01-14 Thread Apache Jenkins Server
See

Re: Hadoop 1.0

2012-01-14 Thread Shannon Quinn
I spent a lot of time trying to get the mapside-joins in the DistributedRowMatrix.multiply() to work without the .mapred.join package, and it simply can't be done without some major hacks, or when the Hadoop folks decide to include the mapreduce.lib.join.* package in a stable release. iPhone'd

CF: parallel SGD for matrix factorization by Gemulla et al.

2012-01-14 Thread Zeno Gantner
Hi list, I was talking to Isabel Drost in December, and we talked about a nice paper from last year's KDD conference that suggests a neat trick that allows doing SGD for matrix factorization in parallel. She said this would be interesting for some of you here. Here is the paper: http://www.mpi-i

Re: Hadoop 1.0

2012-01-14 Thread Sean Owen
True that but I think most of the use of .mapred. is not of this form. It's still using the old Mappers and Reducers and InputFormats and such. Maybe it's all actually somehow necessary to still use ChainReducer or MultipleInputs though my impression was that most of it was not. For example right

Re: Hadoop 1.0

2012-01-14 Thread Jake Mannix
Re: o.a.h.mapred package dependency: haven't we been over this a thousand times? If we are not *forcing* our users to upgrade Hadoop past 0.20.2-ish, and we want to have nice things like mapside joins, ChainMapper/ChainReducer, and MultipleOutputs, then we're sometimes stuck in the old-and-faded A

[jira] [Commented] (MAHOUT-945) The variance calculation of Random forest regression tree

2012-01-14 Thread Ikumasa Mukai (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186196#comment-13186196 ] Ikumasa Mukai commented on MAHOUT-945: -- Appreciate your quick checking. Of course I c

[jira] [Commented] (MAHOUT-945) The variance calculation of Random forest regression tree

2012-01-14 Thread Sean Owen (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186184#comment-13186184 ] Sean Owen commented on MAHOUT-945: -- That's good, but the new implementation just duplicat

[jira] [Commented] (MAHOUT-943) Improve the way to make the split point on DF.

2012-01-14 Thread Ikumasa Mukai (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186179#comment-13186179 ] Ikumasa Mukai commented on MAHOUT-943: -- I posted a patch for Regressionsplit.java on

[jira] [Updated] (MAHOUT-945) The variance calculation of Random forest regression tree

2012-01-14 Thread Ikumasa Mukai (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ikumasa Mukai updated MAHOUT-945: - Status: Patch Available (was: Open) > The variance calculation of Random forest regression t

[jira] [Updated] (MAHOUT-945) The variance calculation of Random forest regression tree

2012-01-14 Thread Ikumasa Mukai (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ikumasa Mukai updated MAHOUT-945: - Attachment: MAHOUT-945.patch At Ted-san's suggestion, I made a patch for using Welford's method t
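As background for the patch mentioned above, Welford's method computes a running mean and variance in a single pass without accumulating large sums of squares. The following is a minimal sketch of the general technique, not the contents of MAHOUT-945.patch; the class name `WelfordVariance` and its method names are hypothetical.

```java
/**
 * Illustrative sketch of Welford's online variance update
 * (hypothetical class, not taken from the MAHOUT-945 patch).
 */
final class WelfordVariance {
  private long count;
  private double mean;
  private double m2; // running sum of squared deviations from the current mean

  /** Folds one observation into the running statistics. */
  void add(double x) {
    count++;
    double delta = x - mean;   // deviation from the old mean
    mean += delta / count;     // update the mean
    m2 += delta * (x - mean);  // uses both the old and the new mean
  }

  long count() {
    return count;
  }

  double mean() {
    return mean;
  }

  /** Variance dividing by n; returns 0 when no values were added. */
  double populationVariance() {
    return count > 0 ? m2 / count : 0.0;
  }

  /** Variance dividing by n - 1; returns 0 for fewer than two values. */
  double sampleVariance() {
    return count > 1 ? m2 / (count - 1) : 0.0;
  }
}
```

The update is numerically steadier than the textbook "sum of squares minus square of sum" formula, which can lose precision when the mean is large relative to the spread.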

[jira] [Issue Comment Edited] (MAHOUT-945) The variance calculation of Random forest regression tree

2012-01-14 Thread Ikumasa Mukai (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186174#comment-13186174 ] Ikumasa Mukai edited comment on MAHOUT-945 at 1/14/12 11:59 AM:

Re: [jira] [Commented] (MAHOUT-737) Implicit Alternating Least Squares SVD

2012-01-14 Thread Tamas Jambor
thanks, ideally I would need both symmetric and diagonal. On Sat, Jan 14, 2012 at 8:26 AM, Sebastian Schelter wrote: > I think Tamas referred to matrices that are symmetric (only the upper > triangular half would need to be stored) not diagonal matrices. > > > On 14.01.2012 05:25, Lance Norskog

Re: [jira] [Commented] (MAHOUT-737) Implicit Alternating Least Squares SVD

2012-01-14 Thread Sebastian Schelter
I think Tamas referred to matrices that are symmetric (only the upper triangular half would need to be stored) not diagonal matrices. On 14.01.2012 05:25, Lance Norskog wrote: > org.apache.mahout.math.DiagonalMatrix > > It even supports sparse values in the diagonal! > > On Fri, Jan 13, 2012 at