[jira] [Created] (MAHOUT-1320) BallKMeansTest.testClustering is unstable
Stevo Slavic created MAHOUT-1320: Summary: BallKMeansTest.testClustering is unstable Key: MAHOUT-1320 URL: https://issues.apache.org/jira/browse/MAHOUT-1320 Project: Mahout Issue Type: Bug Components: Clustering Affects Versions: 0.8 Environment: Apache Jenkins nodes Reporter: Stevo Slavic Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 From time to time this test fails with following in build log: {noformat} Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 48.134 sec FAILURE! - in org.apache.mahout.clustering.streaming.cluster.BallKMeansTest testClustering(org.apache.mahout.clustering.streaming.cluster.BallKMeansTest) Time elapsed: 2.051 sec FAILURE! java.lang.AssertionError: expected:625.0 but was:796.0 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:494) at org.junit.Assert.assertEquals(Assert.java:592) at org.apache.mahout.clustering.streaming.cluster.BallKMeansTest.testClustering(BallKMeansTest.java:119) {noformat} Here is a bit more of build log output, which also shows other tests were running in parallel with this one: {noformat} [INFO] --- maven-surefire-plugin:2.15:test (default-test) @ mahout-core --- [INFO] Surefire report directory: /home/jenkins/jenkins-slave/workspace/Mahout-Quality/trunk/core/target/surefire-reports [INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, useUnlimitedThreads=false --- T E S T S --- --- T E S T S --- Running org.apache.mahout.common.distance.TestChebyshevMeasure Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.043 sec - in org.apache.mahout.common.distance.TestChebyshevMeasure Running org.apache.mahout.common.distance.TestMinkowskiMeasure Running org.apache.mahout.common.distance.TestMahalanobisDistanceMeasure Running org.apache.mahout.common.distance.TestManhattanDistanceMeasure Running org.apache.mahout.common.distance.CosineDistanceMeasureTest Running 
org.apache.mahout.common.distance.TestTanimotoDistanceMeasure Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.143 sec - in org.apache.mahout.common.distance.TestMinkowskiMeasure Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.078 sec - in org.apache.mahout.common.distance.TestMahalanobisDistanceMeasure Running org.apache.mahout.common.distance.TestWeightedManhattanDistanceMeasure Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.099 sec - in org.apache.mahout.common.distance.TestManhattanDistanceMeasure Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.075 sec - in org.apache.mahout.common.distance.CosineDistanceMeasureTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.094 sec - in org.apache.mahout.common.distance.TestTanimotoDistanceMeasure Running org.apache.mahout.common.distance.TestWeightedEuclideanDistanceMeasureTest Running org.apache.mahout.common.distance.TestEuclideanDistanceMeasure Running org.apache.mahout.common.iterator.TestFixedSizeSampler Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.135 sec - in org.apache.mahout.common.distance.TestWeightedManhattanDistanceMeasure Running org.apache.mahout.common.iterator.CountingIteratorTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in org.apache.mahout.common.iterator.CountingIteratorTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec - in org.apache.mahout.common.iterator.TestFixedSizeSampler Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.111 sec - in org.apache.mahout.common.distance.TestWeightedEuclideanDistanceMeasureTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.121 sec - in org.apache.mahout.common.distance.TestEuclideanDistanceMeasure Running org.apache.mahout.common.iterator.TestSamplingIterator Running org.apache.mahout.common.iterator.TestStableFixedSizeSampler Running 
org.apache.mahout.common.DummyRecordWriterTest Running org.apache.mahout.common.StringUtilsTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.093 sec - in org.apache.mahout.common.iterator.TestStableFixedSizeSampler Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.113 sec - in org.apache.mahout.common.DummyRecordWriterTest Running org.apache.mahout.common.AbstractJobTest Running org.apache.mahout.common.IntPairWritableTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec - in org.apache.mahout.common.IntPairWritableTest
[jira] [Commented] (MAHOUT-1302) SequenceFilesFromMailArchivesTest.testSequential failing
[ https://issues.apache.org/jira/browse/MAHOUT-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753449#comment-13753449 ]

Stevo Slavic commented on MAHOUT-1302:
--
The issue is fixed. I was able to reproduce the failing test by changing the order of dirs and files (dirs first, files after, and then it fails), and the code now in place guarantees the order the test expects (files first, then dirs). Anyway, I will close the issue once the test passes on either the ubuntu-3 or ubuntu-6 node.

SequenceFilesFromMailArchivesTest.testSequential failing
Key: MAHOUT-1302
URL: https://issues.apache.org/jira/browse/MAHOUT-1302
Project: Mahout
Issue Type: Bug
Components: Integration
Affects Versions: 0.8
Environment: ubuntu-3 and ubuntu-6 Apache Jenkins nodes
Reporter: Stevo Slavic
Assignee: Suneel Marthi
Priority: Minor
Labels: test
Fix For: 0.9

SequenceFilesFromMailArchivesTest.testSequential is failing only on the ubuntu-3 and ubuntu-6 Jenkins nodes. Because of that, Mahout-Quality and integration job builds either fail or succeed depending on where they are run.

The test fails because it expects the entries in the chunk-0 SequenceFile to be in a specific order, but that order is not guaranteed because of the way chunk-0 is created and filled: SequenceFilesFromMailArchives traverses its input using Java's File[] java.io.File.listFiles(FileFilter filter), which does not guarantee the order of files and directories.

Unless we want SequenceFileIterator to guarantee order by sorting, the test needs to be changed to verify the presence of the given files and their content, but not their exact order.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
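The ordering problem described in MAHOUT-1302 can be illustrated with a small sketch. This is not the committed fix; the class name `OrderedListing` and both method names are hypothetical, and the only assumption taken from the thread is the target order (plain files first, then directories). Sorting by name within each group is an extra choice made here to keep the result stable across file systems.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

public class OrderedListing {

  /** Sorts in place: plain files first, then directories, each group by name. */
  public static void sortFilesFirst(File[] entries) {
    Arrays.sort(entries, Comparator
        .comparing(File::isDirectory)    // false (not a directory) sorts before true
        .thenComparing(File::getName));  // stable tie-break within each group
  }

  /** Lists the children of dir in the deterministic order defined above. */
  public static File[] listFilesFirst(File dir) {
    File[] children = dir.listFiles();   // order is unspecified by the JDK
    if (children == null) {
      return new File[0];                // dir is not a directory, or an I/O error occurred
    }
    sortFilesFirst(children);
    return children;
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    for (File f : listFilesFirst(dir)) {
      System.out.println((f.isDirectory() ? "dir  " : "file ") + f.getName());
    }
  }
}
```

The alternative mentioned in the issue, asserting on the presence and content of entries rather than on their position, avoids depending on any traversal order at all.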
Re: You are invited to Apache Mahout meet-up
That was a great talk, Ted; any chance you could share your slides?

On Sat, Aug 24, 2013 at 4:19 PM, Andrew Musselman andrew.mussel...@gmail.com wrote:
That's a fine meetup; see you then.

On Aug 23, 2013, at 10:57 PM, Ted Dunning ted.dunn...@gmail.com wrote:
See this URL. http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/events/120290942/
Sent from my iPhone

On Aug 23, 2013, at 9:07, Andrew Musselman andrew.mussel...@gmail.com wrote:
Excellent! Where are you speaking?

On Fri, Aug 23, 2013 at 9:02 AM, Ted Dunning ted.dunn...@gmail.com wrote:
I will be speaking in Seattle next week. It would be great to see everybody there.

On Fri, Aug 23, 2013 at 8:55 AM, Pat Ferrel pat.fer...@gmail.com wrote:
In Seattle too. I think Jake Mannix is in Seattle and already has a more general meetup here. http://www.meetup.com/Seattle-DAML/ It seems very non-Mahout specific, I haven't attended.

On Aug 22, 2013, at 8:38 PM, Andrew Musselman andrew.mussel...@gmail.com wrote:
Likewise; we talked about getting some other local Mahout meetups going. I'm in Seattle and I know there are other people up here. Let's get one started too.
[jira] [Updated] (MAHOUT-1320) BallKMeansTest.testClustering is unstable
[ https://issues.apache.org/jira/browse/MAHOUT-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-1320: Status: Patch Available (was: Open) Here is a patch that should solve this. BallKMeansTest.testClustering is unstable - Key: MAHOUT-1320 URL: https://issues.apache.org/jira/browse/MAHOUT-1320 Project: Mahout Issue Type: Bug Components: Clustering Affects Versions: 0.8 Environment: Apache Jenkins nodes Reporter: Stevo Slavic Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 Attachments: MAHOUT-1320.patch
[jira] [Updated] (MAHOUT-1320) BallKMeansTest.testClustering is unstable
[ https://issues.apache.org/jira/browse/MAHOUT-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-1320: Attachment: MAHOUT-1320.patch The problem was that the test data generation was not picking up the test seed, so the test was not deterministic. BallKMeansTest.testClustering is unstable - Key: MAHOUT-1320 URL: https://issues.apache.org/jira/browse/MAHOUT-1320 Project: Mahout Issue Type: Bug Components: Clustering Affects Versions: 0.8 Environment: Apache Jenkins nodes Reporter: Stevo Slavic Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 Attachments: MAHOUT-1320.patch
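The seed problem named in the MAHOUT-1320 update can be sketched in isolation. The example below is not the attached patch and does not use any Mahout helper; it relies only on `java.util.Random`, and the names `SeededTestData`, `TEST_SEED`, and `samplePoints` are hypothetical. It shows why test data drawn from a generator with a fixed seed is reproducible, while data drawn from an unseeded generator would differ between runs and could make an exact assertion fail intermittently.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededTestData {

  /** Hypothetical fixed seed standing in for a shared test seed. */
  public static final long TEST_SEED = 0xCAFEBABEL;

  /** Draws n Gaussian points of the given dimension from the supplied generator. */
  public static double[][] samplePoints(Random rng, int n, int dim) {
    double[][] points = new double[n][dim];
    for (double[] point : points) {
      for (int j = 0; j < dim; j++) {
        point[j] = rng.nextGaussian();
      }
    }
    return points;
  }

  public static void main(String[] args) {
    // Two generators with the same seed produce identical data.
    double[][] a = samplePoints(new Random(TEST_SEED), 5, 3);
    double[][] b = samplePoints(new Random(TEST_SEED), 5, 3);
    System.out.println(Arrays.deepEquals(a, b)); // prints "true"
  }
}
```

Passing the generator in as a parameter, rather than creating it inside the data-generation method, makes it harder for a test to bypass the seed by accident.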
Build failed in Jenkins: Mahout-Quality #2216
See https://builds.apache.org/job/Mahout-Quality/2216/ -- [...truncated 310940 lines...] [java] Applying edu.umd.cs.findbugs.detect.ComparatorIdiom to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindFieldSelfAssignment to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindSelfComparison to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindSelfComparison2 to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.DroppedException to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.LoadOfKnownNullValue to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.DumbMethodInvocations to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.URLProblems to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.DumbMethods to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.NumberConstructor to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindSqlInjection to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindDoubleCheck to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindFinalizeInvocations to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindHEmismatch to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying 
edu.umd.cs.findbugs.detect.FindNakedNotify to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindReturnRef to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindRunInvocations to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.SwitchFallthrough to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindSpinLoop to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindNonShortCircuit to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindTwoLockWait to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindUnconditionalWait to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.DontUseEnum to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindUnsyncGet to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.InitializationChain to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.IteratorIdioms to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.PreferZeroLengthArrays to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.SynchronizingOnContentsOfFieldToProtectField to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.MutableLock to 
org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.MutableStaticFields to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.Naming to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.ReadReturnShouldBeChecked to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.SerializableIdiom to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.StartInConstructor to org/apache/mahout/cf/taste/impl/recommender/svd/ParallelSGDFactorizer [java] Applying edu.umd.cs.findbugs.detect.FindBadForLoop to
[jira] [Created] (MAHOUT-1321) TestSequenceFilesFromDirectory.testSequenceFileFromDirectoryMapReduce is unstable
Stevo Slavic created MAHOUT-1321: Summary: TestSequenceFilesFromDirectory.testSequenceFileFromDirectoryMapReduce is unstable Key: MAHOUT-1321 URL: https://issues.apache.org/jira/browse/MAHOUT-1321 Project: Mahout Issue Type: Bug Components: Integration Affects Versions: 0.8 Environment: ubuntu4 Apache Jenkins CI node Reporter: Stevo Slavic Assignee: Stevo Slavic Fix For: 0.9 Relevant Mahout-Quality job execution #2216 build output: {noformat} [INFO] --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration --- [INFO] Surefire report directory: /home/jenkins/jenkins-slave/workspace/Mahout-Quality/trunk/integration/target/surefire-reports [INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, useUnlimitedThreads=false --- T E S T S --- --- T E S T S --- Running org.apache.mahout.text.LuceneStorageConfigurationTest Running org.apache.mahout.text.LuceneSegmentInputSplitTest Running org.apache.mahout.text.LuceneSegmentInputFormatTest Running org.apache.mahout.text.SequenceFilesFromLuceneStorageDriverTest Running org.apache.mahout.text.MailArchivesClusteringAnalyzerTest Running org.apache.mahout.text.LuceneSegmentRecordReaderTest Running org.apache.mahout.text.SequenceFilesFromMailArchivesTest Running org.apache.mahout.text.SequenceFilesFromLuceneStorageMRJobTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.153 sec - in org.apache.mahout.text.MailArchivesClusteringAnalyzerTest Running org.apache.mahout.text.SequenceFilesFromLuceneStorageTest Running org.apache.mahout.utils.vectors.VectorHelperTest Running org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest Running org.apache.mahout.text.TestSequenceFilesFromDirectory Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.04 sec - in org.apache.mahout.utils.vectors.VectorHelperTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.455 sec - in org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest Tests run: 2, Failures: 0, Errors: 0, 
Skipped: 0, Time elapsed: 0.905 sec - in org.apache.mahout.text.LuceneStorageConfigurationTest Running org.apache.mahout.utils.vectors.arff.MapBackedARFFModelTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.029 sec - in org.apache.mahout.utils.vectors.arff.MapBackedARFFModelTest Running org.apache.mahout.utils.vectors.arff.DriverTest Running org.apache.mahout.utils.vectors.arff.ARFFVectorIterableTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.194 sec - in org.apache.mahout.utils.vectors.arff.ARFFVectorIterableTest Running org.apache.mahout.utils.vectors.arff.ARFFTypeTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.791 sec - in org.apache.mahout.utils.vectors.arff.DriverTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec - in org.apache.mahout.utils.vectors.arff.ARFFTypeTest Running org.apache.mahout.utils.vectors.lucene.DriverTest Running org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest Running org.apache.mahout.utils.vectors.lucene.LuceneIterableTest Running org.apache.mahout.utils.vectors.io.VectorWriterTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.421 sec - in org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.161 sec - in org.apache.mahout.text.SequenceFilesFromMailArchivesTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.704 sec - in org.apache.mahout.utils.vectors.lucene.LuceneIterableTest Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.302 sec FAILURE! - in org.apache.mahout.text.TestSequenceFilesFromDirectory testSequenceFileFromDirectoryMapReduce(org.apache.mahout.text.TestSequenceFilesFromDirectory) Time elapsed: 3.133 sec FAILURE! 
java.lang.AssertionError: expected:1 but was:0 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.mahout.text.TestSequenceFilesFromDirectory.checkMRResultFiles(TestSequenceFilesFromDirectory.java:282) at org.apache.mahout.text.TestSequenceFilesFromDirectory.testSequenceFileFromDirectoryMapReduce(TestSequenceFilesFromDirectory.java:135) {noformat}
Build failed in Jenkins: mahout-nightly » Mahout Integration #1337
See https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/1337/ -- Aug 29, 2013 11:05:25 PM org.apache.maven.cli.event.ExecutionEventLogger projectStarted INFO: Aug 29, 2013 11:05:25 PM org.apache.maven.cli.event.ExecutionEventLogger projectStarted INFO: Aug 29, 2013 11:05:25 PM org.apache.maven.cli.event.ExecutionEventLogger projectStarted INFO: Building Mahout Integration 0.9-SNAPSHOT Aug 29, 2013 11:05:25 PM org.apache.maven.cli.event.ExecutionEventLogger projectStarted INFO: Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration --- [INFO] Deleting https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-resources-plugin:2.6:resources (default-resources) @ mahout-integration --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 0 resource Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:27 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-compiler-plugin:3.1:compile (default-compile) @ mahout-integration --- [INFO] Changes detected - recompiling the module! [INFO] Compiling 127 source files to https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/classes [WARNING] Note: https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/src/main/java/org/apache/mahout/cf/taste/impl/model/mongodb/MongoDBDataModel.java uses or overrides a deprecated API. [WARNING] Note: Recompile with -Xlint:deprecation for details. 
[WARNING] Note: https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/src/main/java/org/apache/mahout/cf/taste/impl/model/mongodb/MongoDBDataModel.java uses unchecked or unsafe operations. [WARNING] Note: Recompile with -Xlint:unchecked for details. Aug 29, 2013 11:05:30 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:30 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-resources-plugin:2.6:testResources (default-testResources) @ mahout-integration --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 10 resources Aug 29, 2013 11:05:30 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:30 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ mahout-integration --- [INFO] Changes detected - recompiling the module! [INFO] Compiling 38 source files to https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/test-classes [WARNING] Note: Some input files use or override a deprecated API. [WARNING] Note: Recompile with -Xlint:deprecation for details. 
Aug 29, 2013 11:05:31 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: Aug 29, 2013 11:05:31 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted INFO: --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration --- [INFO] Surefire report directory: https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/surefire-reports [INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, useUnlimitedThreads=false --- T E S T S --- --- T E S T S --- Running org.apache.mahout.utils.vectors.arff.MapBackedARFFModelTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.022 sec - in org.apache.mahout.utils.vectors.arff.MapBackedARFFModelTest Running org.apache.mahout.text.MailArchivesClusteringAnalyzerTest Running org.apache.mahout.utils.vectors.io.VectorWriterTest Running org.apache.mahout.text.LuceneStorageConfigurationTest Running org.apache.mahout.utils.nlp.collocations.llr.BloomTokenFilterTest Running org.apache.mahout.utils.regex.RegexUtilsTest Running org.apache.mahout.utils.vectors.VectorHelperTest Running org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest Running org.apache.mahout.utils.email.MailProcessorTest Running
Build failed in Jenkins: mahout-nightly #1337
See https://builds.apache.org/job/mahout-nightly/1337/

--
[...truncated 1821 lines...]
INFO:
Aug 29, 2013 11:05:07 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-jar-plugin:2.4:jar (default-jar) @ mahout-core ---
[INFO] Building jar: https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar
Aug 29, 2013 11:05:08 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO:
Aug 29, 2013 11:05:08 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-jar-plugin:2.4:test-jar (default) @ mahout-core ---
[INFO] Building jar: https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar
[WARNING] Artifact org.apache.mahout:mahout-core:test-jar:tests:0.9-SNAPSHOT already attached to project, ignoring duplicate
Aug 29, 2013 11:05:09 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO:
Aug 29, 2013 11:05:09 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-assembly-plugin:2.4:single (job) @ mahout-core ---
[INFO] Reading assembly descriptor: src/main/assembly/job.xml
[INFO] Building jar: https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar
[WARNING] Artifact org.apache.mahout:mahout-core:jar:job:0.9-SNAPSHOT already attached to project, ignoring duplicate
Aug 29, 2013 11:05:15 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO:
Aug 29, 2013 11:05:15 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-source-plugin:2.2.1:jar-no-fork (attach-sources) @ mahout-core ---
[WARNING] Artifact org.apache.mahout:mahout-core:java-source:sources:0.9-SNAPSHOT already attached to project, ignoring duplicate
Aug 29, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO:
Aug 29, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-install-plugin:2.4:install (default-install) @ mahout-core ---
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar to /home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.jar
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/pom.xml to /home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.pom
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar to /home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-tests.jar
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar to /home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-job.jar
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-sources.jar to /home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-sources.jar
Aug 29, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO:
Aug 29, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger mojoStarted
INFO: --- maven-deploy-plugin:2.5:deploy (default-deploy) @ mahout-core ---
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml (2 KB at 2.2 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130829.230517-34.jar
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130829.230517-34.jar (1332 KB at 2522.4 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130829.230517-34.pom
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130829.230517-34.pom (7 KB at 88.8 KB/sec)
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml (344 B at 1.4 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded:
[jira] [Updated] (MAHOUT-1320) BallKMeansTest.testClustering is unstable
[ https://issues.apache.org/jira/browse/MAHOUT-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-1320:

Attachment: MAHOUT-1320.patch

Here is a revised patch with all suggestions implemented (except the change to RandomWrapper and the call to useTestSeed in the hypercube generation function).

BallKMeansTest.testClustering is unstable

Key: MAHOUT-1320
URL: https://issues.apache.org/jira/browse/MAHOUT-1320
Project: Mahout
Issue Type: Bug
Components: Clustering
Affects Versions: 0.8
Environment: Apache Jenkins nodes
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.9
Attachments: MAHOUT-1320.patch, MAHOUT-1320.patch

From time to time this test fails with the following in the build log:

{noformat}
Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 48.134 sec FAILURE! - in org.apache.mahout.clustering.streaming.cluster.BallKMeansTest
testClustering(org.apache.mahout.clustering.streaming.cluster.BallKMeansTest) Time elapsed: 2.051 sec FAILURE!
java.lang.AssertionError: expected:625.0 but was:796.0
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:494)
    at org.junit.Assert.assertEquals(Assert.java:592)
    at org.apache.mahout.clustering.streaming.cluster.BallKMeansTest.testClustering(BallKMeansTest.java:119)
{noformat}

Here is a bit more of the build log output, which also shows that other tests were running in parallel with this one:

{noformat}
[INFO] --- maven-surefire-plugin:2.15:test (default-test) @ mahout-core ---
[INFO] Surefire report directory: /home/jenkins/jenkins-slave/workspace/Mahout-Quality/trunk/core/target/surefire-reports
[INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, useUnlimitedThreads=false
--- T E S T S ---
--- T E S T S ---
Running org.apache.mahout.common.distance.TestChebyshevMeasure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.043 sec - in org.apache.mahout.common.distance.TestChebyshevMeasure
Running org.apache.mahout.common.distance.TestMinkowskiMeasure
Running org.apache.mahout.common.distance.TestMahalanobisDistanceMeasure
Running org.apache.mahout.common.distance.TestManhattanDistanceMeasure
Running org.apache.mahout.common.distance.CosineDistanceMeasureTest
Running org.apache.mahout.common.distance.TestTanimotoDistanceMeasure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.143 sec - in org.apache.mahout.common.distance.TestMinkowskiMeasure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.078 sec - in org.apache.mahout.common.distance.TestMahalanobisDistanceMeasure
Running org.apache.mahout.common.distance.TestWeightedManhattanDistanceMeasure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.099 sec - in org.apache.mahout.common.distance.TestManhattanDistanceMeasure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.075 sec - in org.apache.mahout.common.distance.CosineDistanceMeasureTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.094 sec - in org.apache.mahout.common.distance.TestTanimotoDistanceMeasure
Running org.apache.mahout.common.distance.TestWeightedEuclideanDistanceMeasureTest
Running org.apache.mahout.common.distance.TestEuclideanDistanceMeasure
Running org.apache.mahout.common.iterator.TestFixedSizeSampler
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.135 sec - in org.apache.mahout.common.distance.TestWeightedManhattanDistanceMeasure
Running org.apache.mahout.common.iterator.CountingIteratorTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in org.apache.mahout.common.iterator.CountingIteratorTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec - in org.apache.mahout.common.iterator.TestFixedSizeSampler
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.111 sec - in org.apache.mahout.common.distance.TestWeightedEuclideanDistanceMeasureTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.121 sec - in org.apache.mahout.common.distance.TestEuclideanDistanceMeasure
Running org.apache.mahout.common.iterator.TestSamplingIterator
Running org.apache.mahout.common.iterator.TestStableFixedSizeSampler
Running org.apache.mahout.common.DummyRecordWriterTest
Running org.apache.mahout.common.StringUtilsTest
Tests run: 5, Failures:
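The comment above mentions useTestSeed and a hypercube generation function, which suggests that random test data plays a part in the instability. The following is a minimal sketch of that general idea only, not the attached patch and not Mahout code: the class SeededHypercubeSamples, its generate method and all parameters are hypothetical, and plain java.util.Random stands in for whatever generator the test actually uses.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper (not Mahout code): sample points near corners of the unit hypercube. */
final class SeededHypercubeSamples {

    /**
     * Generates n points in d dimensions. Each coordinate is a randomly chosen
     * corner value (0 or 1) plus Gaussian noise scaled by the given factor.
     */
    static double[][] generate(int n, int d, double noise, long seed) {
        // A fixed seed makes java.util.Random produce the same sequence on every run.
        Random rng = new Random(seed);
        double[][] points = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                double corner = rng.nextBoolean() ? 1.0 : 0.0;
                points[i][j] = corner + noise * rng.nextGaussian();
            }
        }
        return points;
    }

    public static void main(String[] args) {
        double[][] first = generate(5, 3, 0.01, 42L);
        double[][] second = generate(5, 3, 0.01, 42L);
        // Same seed, same data: an assertion on exact counts is repeatable.
        System.out.println(Arrays.deepEquals(first, second));
    }
}
```

With data generated this way, an exact-count assertion such as the one failing at BallKMeansTest.java:119 would at least see the same input on every run; whether that alone removes the instability reported here is not established by this thread.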
[jira] [Commented] (MAHOUT-1286) Memory-efficient DataModel, supporting fast online updates and element-wise iteration
[ https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754269#comment-13754269 ]

Peng Cheng commented on MAHOUT-1286:

Hi Gokhan,

No problem, but it only has two files. I'll post the patch immediately.

-Yours Peng

Memory-efficient DataModel, supporting fast online updates and element-wise iteration

Key: MAHOUT-1286
URL: https://issues.apache.org/jira/browse/MAHOUT-1286
Project: Mahout
Issue Type: Improvement
Components: Collaborative Filtering
Affects Versions: 0.9
Reporter: Peng Cheng
Labels: collaborative-filtering, datamodel, patch, recommender
Fix For: 0.9
Attachments: InMemoryDataModel.java, InMemoryDataModelTest.java
Original Estimate: 336h
Remaining Estimate: 336h

Most DataModel implementations in the current CF component use hash maps to enable fast 2D indexing and updates. This is not memory-efficient for big data sets; e.g., the Netflix Prize dataset takes 11G of heap space as a FileDataModel. An improved DataModel implementation should use more compact data structures (like arrays); this trades a little time complexity in 2D indexing for a vast improvement in memory efficiency. In addition, any online recommender or online-to-batch converted recommender will not be affected by this in the training process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
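To illustrate the trade-off the issue description has in mind, here is a minimal sketch of array-backed preference storage for a single user. It is not the attached InMemoryDataModel.java; the class CompactUserPrefs and its methods are hypothetical. Item IDs are kept sorted in a long[] with values in a parallel float[], so a lookup costs a binary search instead of a hash probe, and no per-entry objects are allocated.

```java
import java.util.Arrays;

/** Hypothetical sketch (not the attached patch): one user's preferences in parallel arrays. */
final class CompactUserPrefs {
    private long[] itemIDs = new long[4];   // kept sorted ascending in [0, size)
    private float[] values = new float[4];  // values[i] is the preference for itemIDs[i]
    private int size;

    /** Inserts or updates a preference: O(log n) search, plus an O(n) shift on insert. */
    void set(long itemID, float value) {
        int pos = Arrays.binarySearch(itemIDs, 0, size, itemID);
        if (pos >= 0) {          // item already present: update in place
            values[pos] = value;
            return;
        }
        int insertAt = -(pos + 1);
        if (size == itemIDs.length) {   // grow both arrays together
            itemIDs = Arrays.copyOf(itemIDs, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail right by one slot to keep the IDs sorted.
        System.arraycopy(itemIDs, insertAt, itemIDs, insertAt + 1, size - insertAt);
        System.arraycopy(values, insertAt, values, insertAt + 1, size - insertAt);
        itemIDs[insertAt] = itemID;
        values[insertAt] = value;
        size++;
    }

    /** Returns the stored preference, or Float.NaN if the item is absent. */
    float get(long itemID) {
        int pos = Arrays.binarySearch(itemIDs, 0, size, itemID);
        return pos >= 0 ? values[pos] : Float.NaN;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        CompactUserPrefs prefs = new CompactUserPrefs();
        prefs.set(30L, 3.0f);
        prefs.set(10L, 1.0f);
        prefs.set(10L, 5.0f);  // update, not a second entry
        System.out.println(prefs.get(10L) + " " + prefs.size());
    }
}
```

A full data model would hold one such structure per user (and possibly per item); element-wise iteration then reduces to walking the arrays in order.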
[jira] [Updated] (MAHOUT-1286) Memory-efficient DataModel, supporting fast online updates and element-wise iteration
[ https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peng Cheng updated MAHOUT-1286:

Attachment: Semifinal-implementation-added.patch

Sorry about the late reply. Please note that the code can still be optimized in many places; I'll keep maintaining it and keep an ear out for all suggestions.

Memory-efficient DataModel, supporting fast online updates and element-wise iteration

Key: MAHOUT-1286
URL: https://issues.apache.org/jira/browse/MAHOUT-1286
Project: Mahout
Issue Type: Improvement
Components: Collaborative Filtering
Affects Versions: 0.9
Reporter: Peng Cheng
Labels: collaborative-filtering, datamodel, patch, recommender
Fix For: 0.9
Attachments: InMemoryDataModel.java, InMemoryDataModelTest.java, Semifinal-implementation-added.patch
Original Estimate: 336h
Remaining Estimate: 336h

Most DataModel implementations in the current CF component use hash maps to enable fast 2D indexing and updates. This is not memory-efficient for big data sets; e.g., the Netflix Prize dataset takes 11G of heap space as a FileDataModel. An improved DataModel implementation should use more compact data structures (like arrays); this trades a little time complexity in 2D indexing for a vast improvement in memory efficiency. In addition, any online recommender or online-to-batch converted recommender will not be affected by this in the training process.