[jira] Updated: (MAHOUT-596) Testing if the weight assigned to points when calling the observe method in AbstractCluster incorrectly affect the number of points in a cluster

2011-01-26 Thread Yuval Merhav (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Merhav updated MAHOUT-596: Attachment: TestAbstractCluster.java Default: Three one dimensional points with value of 1.0 each.

[jira] Created: (MAHOUT-596) Testing if the weight assigned to points when calling the observe method in AbstractCluster incorrectly affect the number of points in a cluster

2011-01-26 Thread Yuval Merhav (JIRA)
Testing if the weight assigned to points when calling the observe method in AbstractCluster incorrectly affect the number of points in a cluster

[jira] Updated: (MAHOUT-542) MapReduce implementation of ALS-WR

2011-01-26 Thread Danny Bickson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Bickson updated MAHOUT-542: Attachment: (was: unifiedpatch) > MapReduce implementation of ALS-WR

[jira] Updated: (MAHOUT-542) MapReduce implementation of ALS-WR

2011-01-26 Thread Danny Bickson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Bickson updated MAHOUT-542: Attachment: MAHOUT-542-4.patch Hi everyone, Issue solved. I've created a new patch and tested it

[jira] Commented: (MAHOUT-594) FileWriter may garble non-ASCII output if the environment variable LANG/LC_ALL is not appropriate.

2011-01-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987338#action_12987338 ] Hudson commented on MAHOUT-594: Integrated in Mahout-Quality #588 (See [https://hudson.apac

[jira] Commented: (MAHOUT-594) FileWriter may garble non-ASCII output if the environment variable LANG/LC_ALL is not appropriate.

2011-01-26 Thread Shige Takeda (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987312#action_12987312 ] Shige Takeda commented on MAHOUT-594: no problem as long as this issue is addressed. t

[jira] Commented: (MAHOUT-155) ARFF VectorIterable

2011-01-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987299#action_12987299 ] Grant Ingersoll commented on MAHOUT-155: It is in the code (since 0.3) as a baselin

[jira] Updated: (MAHOUT-594) FileWriter may garble non-ASCII output if the environment variable LANG/LC_ALL is not appropriate.

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-594: Resolution: Fixed Status: Resolved (was: Patch Available) I committed a variant on your patch wh

[jira] Commented: (MAHOUT-155) ARFF VectorIterable

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987282#action_12987282 ] Sean Owen commented on MAHOUT-155: Coincidentally -- I was applying another patch and not

Re: Slope One recommender and MySQLJDBCDiffStorage

2011-01-26 Thread Sean Owen
I'd bet there is a problem. I recently changed this radically and don't have good tests in place. I will look into these. On Wed, Jan 26, 2011 at 3:46 PM, Eric Sellin wrote: > Hello all, > > I am looking at the Mahout 0.4 code for MySQLJDBCDiffStorage in the > Slope One recommender, and in parti

[jira] Commented: (MAHOUT-542) MapReduce implementation of ALS-WR

2011-01-26 Thread Danny Bickson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987175#action_12987175 ] Danny Bickson commented on MAHOUT-542: Thanks Sebastian! I have access to some large

[jira] Updated: (MAHOUT-594) FileWriter may garble non-ASCII output if the environment variable LANG/LC_ALL is not appropriate.

2011-01-26 Thread Shige Takeda (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shige Takeda updated MAHOUT-594: Status: Patch Available (was: Open) In order to minimize the change, I introduced FileWriterUTF8 a
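
The concern behind MAHOUT-594 is that a `FileWriter` opened without an explicit charset falls back to the platform default encoding (as it did in the Java versions of that era), which follows LANG/LC_ALL. The following is only a minimal sketch of the general remedy, not the attached patch: the class `Utf8WriterSketch` and its methods are hypothetical names chosen for illustration, and the patch's own `FileWriterUTF8...` class is not reproduced here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; not part of Mahout or of the MAHOUT-594 patch.
public class Utf8WriterSketch {

  /** Opens a writer that always encodes as UTF-8, independent of LANG/LC_ALL. */
  public static Writer openUtf8Writer(Path path) throws IOException {
    return new BufferedWriter(
        new OutputStreamWriter(Files.newOutputStream(path), StandardCharsets.UTF_8));
  }

  /** Writes text to a temporary file and reads it back, both explicitly as UTF-8. */
  public static String roundTrip(String text) throws IOException {
    Path tmp = Files.createTempFile("utf8-sketch", ".txt");
    try {
      try (Writer w = openUtf8Writer(tmp)) {
        w.write(text);
      }
      return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
    } finally {
      Files.deleteIfExists(tmp);
    }
  }

  public static void main(String[] args) throws IOException {
    // Three non-ASCII (Japanese) characters, written as Unicode escapes.
    String sample = "\u65e5\u672c\u8a9e";
    System.out.println(sample.equals(roundTrip(sample)));
  }
}
```

Naming the charset at the point where the stream is wrapped keeps the output stable across machines with different locale settings, which is the property the issue title asks for.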

[jira] Updated: (MAHOUT-594) FileWriter may garble non-ASCII output if the environment variable LANG/LC_ALL is not appropriate.

2011-01-26 Thread Shige Takeda (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shige Takeda updated MAHOUT-594: Attachment: 0001-set-file-reader-and-writer-character-encoding-to-utf.patch > FileWriter may garble

Re: The number of points in AbstractCluster

2011-01-26 Thread Yuval Merhav
Thanks for your reply Grant. Sure, I will write a test case to see how the weight affects the number of points in a cluster (and more importantly the centroid). For the cluster dumper, I would not call it a bug, but it might be a limitation for users. I think that it prints the clusters from the
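
The question in this thread (how a per-point weight interacts with the point count and the centroid) can be illustrated with a small self-contained sketch. It assumes a hypothetical accumulator, `WeightedObservationSketch`, that adds the weight to a running total on every `observe` call; this is an assumption made for illustration and is not a copy of Mahout's `AbstractCluster`. The input mirrors the default case from the MAHOUT-596 attachment: three one-dimensional points with value 1.0 each.

```java
// Hypothetical accumulator for illustration only; not Mahout's AbstractCluster.
public class WeightedObservationSketch {
  private int count;          // number of observe() calls
  private double totalWeight; // sum of the weights passed to observe()
  private double weightedSum; // sum of value * weight

  /** Records one one-dimensional point with the given weight. */
  public void observe(double value, double weight) {
    count++;
    totalWeight += weight;
    weightedSum += value * weight;
  }

  public int count() {
    return count;
  }

  public double totalWeight() {
    return totalWeight;
  }

  /** Weighted mean of the observed values. */
  public double centroid() {
    return weightedSum / totalWeight;
  }

  /** Observes three points of value 1.0, each with the same weight. */
  public static WeightedObservationSketch threeOnes(double weight) {
    WeightedObservationSketch s = new WeightedObservationSketch();
    for (int i = 0; i < 3; i++) {
      s.observe(1.0, weight);
    }
    return s;
  }

  public static void main(String[] args) {
    WeightedObservationSketch s = threeOnes(0.5);
    System.out.println("count=" + s.count()
        + " totalWeight=" + s.totalWeight()
        + " centroid=" + s.centroid());
  }
}
```

With a weight of 0.5, the call count is 3 while the accumulated weight is 1.5, so any code that reports the accumulated weight as the "number of points" would disagree with the true count even though the centroid stays at 1.0. A test case along these lines would make that difference visible.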

Slope One recommender and MySQLJDBCDiffStorage

2011-01-26 Thread Eric Sellin
Hello all, I am looking at the Mahout 0.4 code for MySQLJDBCDiffStorage in the Slope One recommender, and in particular the first two SQL statements. In the getDiffSQL statement, isn't it missing the stdevColumn in the second SELECT of the UNION, and more importantly, shouldn't the avgColumn in t

Re: The number of points in AbstractCluster

2011-01-26 Thread Grant Ingersoll
Hi Yuval, I haven't looked, but I don't want to leave you hanging. This is definitely something we should check on and you may very well have found a bug. Perhaps you can write up a test case? I will try to look at this soon, if someone else doesn't beat me to it. -Grant On Jan 22, 2011, at

[jira] Commented: (MAHOUT-384) Implement of AVF algorithm

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987050#action_12987050 ] Sean Owen commented on MAHOUT-384: I'd say you're welcome to polish it up and commit then

[jira] Commented: (MAHOUT-155) ARFF VectorIterable

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987048#action_12987048 ] Sean Owen commented on MAHOUT-155: "Later" is probably a better label, yes. I don't think

[jira] Commented: (MAHOUT-384) Implement of AVF algorithm

2011-01-26 Thread Robin Anil (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987041#action_12987041 ] Robin Anil commented on MAHOUT-384: This was missing some tests which Tony mentioned he

[jira] Commented: (MAHOUT-155) ARFF VectorIterable

2011-01-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987039#action_12987039 ] Grant Ingersoll commented on MAHOUT-155: Sean, just b/c something isn't active does

[jira] Commented: (MAHOUT-308) Improve Lanczos to handle extremely large feature sets (without hashing)

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987036#action_12987036 ] Sean Owen commented on MAHOUT-308: This one's also going stale. Jake, do you have thought

[jira] Resolved: (MAHOUT-155) ARFF VectorIterable

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-155. Resolution: Won't Fix Fix Version/s: (was: 0.5) Seems to have died > ARFF VectorIterable >

[jira] Commented: (MAHOUT-384) Implement of AVF algorithm

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987034#action_12987034 ] Sean Owen commented on MAHOUT-384: This is another one that seems to have died? Tony, Rob

[jira] Updated: (MAHOUT-334) Proposal for GSoC2010 (Linear SVM for Mahout)

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-334: Resolution: Won't Fix Fix Version/s: (was: 0.5) Status: Resolved (was: Patch Availa

[jira] Commented: (MAHOUT-416) Input format of Random Forest in Mahout 0.3 is N*M matrix. It doesn't take sparse matrix as an input.

2011-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987028#action_12987028 ] Sean Owen commented on MAHOUT-416: Can you be more specific about the change you'd like t

[jira] Commented: (MAHOUT-590) add TSV (Tab Separate Value) input file support to SequenceFilesFromDirectory

2011-01-26 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986999#action_12986999 ] Isabel Drost commented on MAHOUT-590: Please ignore last comment. What I actually mean

[jira] Updated: (MAHOUT-590) add TSV (Tab Separate Value) input file support to SequenceFilesFromDirectory

2011-01-26 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabel Drost updated MAHOUT-590: Attachment: MAHOUT-590.patch Please apply with git -p1 ... > add TSV (Tab Separate Value) input fi

[jira] Reopened: (MAHOUT-590) add TSV (Tab Separate Value) input file support to SequenceFilesFromDirectory

2011-01-26 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabel Drost reopened MAHOUT-590: I agree the problem should be solved in a different tool. However I think there might be a way to re