[jira] [Created] (MAHOUT-1338) Reduce mahout-integration transitive dependencies to avoid JAR hell, version conflicts

2013-09-18 Thread Sean Owen (JIRA)
Sean Owen created MAHOUT-1338: - Summary: Reduce mahout-integration transitive dependencies to avoid JAR hell, version conflicts Key: MAHOUT-1338 URL: https://issues.apache.org/jira/browse/MAHOUT-1338 Proj

[jira] [Updated] (MAHOUT-1338) Reduce mahout-integration transitive dependencies to avoid JAR hell, version conflicts

2013-09-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-1338: -- Attachment: MAHOUT-1338.patch > Reduce mahout-integration transitive dependencies to avoid JAR hel

[jira] [Updated] (MAHOUT-1338) Reduce mahout-integration transitive dependencies to avoid JAR hell, version conflicts

2013-09-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-1338: -- Status: Patch Available (was: Open) > Reduce mahout-integration transitive dependencies to avoid

[jira] [Updated] (MAHOUT-1325) SequenceFilesFromMailArchivesTest.testMapReduce is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic updated MAHOUT-1325: - Attachment: org.apache.mahout.text.SequenceFilesFromMailArchivesTest-output.txt Whenever mapred

[jira] [Resolved] (MAHOUT-1324) SequenceFilesFromLuceneStorageMRJobTest.testRun is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic resolved MAHOUT-1324. -- Resolution: Duplicate Fix Version/s: (was: 0.9) Resolving as duplicate of MAHOUT-13

[jira] [Commented] (MAHOUT-1325) SequenceFilesFromMailArchivesTest.testMapReduce is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770867#comment-13770867 ] Stevo Slavic commented on MAHOUT-1325: -- This unwanted behavior appears to be caused

[jira] [Resolved] (MAHOUT-1321) TestSequenceFilesFromDirectory.testSequenceFileFromDirectoryMapReduce is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic resolved MAHOUT-1321. -- Resolution: Duplicate Fix Version/s: (was: 0.9) Resolving as duplicate of MAHOUT-13

[jira] [Updated] (MAHOUT-1325) MapReduce unit tests cannot be run in parallel

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic updated MAHOUT-1325: - Summary: MapReduce unit tests cannot be run in parallel (was: SequenceFilesFromMailArchivesTest

[jira] [Commented] (MAHOUT-1325) MapReduce unit tests cannot be run in parallel

2013-09-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770890#comment-13770890 ] Sean Owen commented on MAHOUT-1325: --- Fair enough, I thought it was actually a small eno

[jira] [Commented] (MAHOUT-1325) MapReduce unit tests cannot be run in parallel

2013-09-18 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770917#comment-13770917 ] Suneel Marthi commented on MAHOUT-1325: --- That pretty much explains the mysterious a

[jira] [Commented] (MAHOUT-1322) TestDistributedRowMatrix.testTranspose is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770930#comment-13770930 ] Stevo Slavic commented on MAHOUT-1322: -- We have to check if this one is, like MAHOUT

[jira] [Commented] (MAHOUT-1323) ParallelALSFactorizationJobTest.completeJobToyExample is unstable

2013-09-18 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770931#comment-13770931 ] Stevo Slavic commented on MAHOUT-1323: -- We have to check if this one is, like MAHOUT

Why Kahan summation was not used anywhere?

2013-09-18 Thread Peng Cheng
For a large-scale computational engine this seems unpolished. Most summations, averages, and dot products of vectors still use naive summation despite its O(n) error. Is there a reason? All the best, Yours Peng

Re: Why Kahan summation was not used anywhere?

2013-09-18 Thread Ted Dunning
This has come up before. As background, if you add up lots of numbers using a straightforward loop, you can lose precision. In the worst case the loss is O(n \epsilon), but in virtually all real examples the loss is O(\epsilon \sqrt{n}). If we are summing a billion numbers, the square root
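For readers unfamiliar with the technique under discussion, here is a minimal sketch of Kahan (compensated) summation next to a naive loop. It is illustrative only and is not taken from Mahout; the class name KahanSumSketch and the methods naiveSum and kahanSum are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of Mahout.
final class KahanSumSketch {

    // Straightforward loop: each addition may round away low-order bits.
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Kahan summation: carry the rounding error of each addition forward
    // in a separate compensation term and feed it back into the next step.
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0;           // running estimate of lost low-order bits
        for (double v : values) {
            double y = v - compensation;     // apply the correction from the previous step
            double t = sum + y;              // low-order bits of y may be lost here
            compensation = (t - sum) - y;    // recover what was lost (algebraically zero)
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // One large value followed by many values too small to change it
        // when added one at a time.
        double[] xs = new double[1_000_001];
        xs[0] = 1.0;
        Arrays.fill(xs, 1, xs.length, 1e-16);

        System.out.println("naive: " + naiveSum(xs));
        System.out.println("kahan: " + kahanSum(xs));
    }
}
```

In this constructed input every small term is below half a unit in the last place of 1.0, so the naive loop never moves off 1.0, while the compensated loop accumulates the small terms. This is a deliberately adversarial case; as the reply notes, typical inputs lose far less precision.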

Mahout 0.8 compilation issue on hadoop 1.0.3

2013-09-18 Thread Mehant Baid
I was trying to compile Mahout 0.8 on Hadoop version 1.0.3. In DummyStatusReporter.java the @Override annotation is added to the method getProgress(). The getProgress() method does not exist in its base class (StatusReporter.java) in Hadoop versions 1.0.3 and 1.0.4 and is only included in 1.1.1 on