Build failed in Jenkins: mahout-nightly #1300

2013-07-23 Thread Apache Jenkins Server
See 

--
[...truncated 1925 lines...]
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130723.235741-12-job.jar
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130723.235741-12-job.jar
 (19462 KB at 24086.0 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (2 KB at 23.3 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130723.235741-12-sources.jar
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130723.235741-12-sources.jar
 (1155 KB at 9094.4 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (2 KB at 22.2 KB/sec)
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: Building Mahout Integration 0.9-SNAPSHOT
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration 
---
[INFO] Deleting 

Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:resources (default-resources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:compile (default-compile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 131 source files to 

[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: 

 uses unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 10 resources
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 39 source files to 

[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
Jul 23, 2013 11:57:49 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:49 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration 
---
[INFO] Surefire report directory: 

[INFO] parallel='classes', perCoreThreadCount=false, threadCount=1

Build failed in Jenkins: mahout-nightly » Mahout Integration #1300

2013-07-23 Thread Apache Jenkins Server
See 


--
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: Building Mahout Integration 0.9-SNAPSHOT
Jul 23, 2013 11:57:43 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration 
---
[INFO] Deleting 

Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:resources (default-resources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:45 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:compile (default-compile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 131 source files to 

[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: 

 uses unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 10 resources
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:48 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 39 source files to 

[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
Jul 23, 2013 11:57:49 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Jul 23, 2013 11:57:49 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration 
---
[INFO] Surefire report directory: 

[INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, 
useUnlimitedThreads=false

---
 T E S T S
---

Running 
org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.138 sec - in 
org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Running org.apache.mahout.clustering.TestClusterEvaluator
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.031 sec - 
in org.apache.mahout.clustering.TestClusterEvaluator
Running org.apache.mahout.clustering.TestClusterDumper
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.01 sec - in 
org.apache.mahout.clustering.TestClusterDumper
Running org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.795 sec - in 
org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Running org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Tests

Build failed in Jenkins: Mahout-Quality #2154

2013-07-23 Thread Apache Jenkins Server
See 

--
[...truncated 197593 lines...]
Running org.apache.mahout.classifier.df.data.DatasetTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.326 sec - in 
org.apache.mahout.classifier.df.data.DatasetTest
Running org.apache.mahout.classifier.df.data.DescriptorUtilsTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.058 sec - in 
org.apache.mahout.classifier.df.data.DescriptorUtilsTest
Running org.apache.mahout.classifier.df.data.DataLoaderTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.867 sec - in 
org.apache.mahout.classifier.df.data.DataLoaderTest
Running org.apache.mahout.classifier.df.data.DataConverterTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.38 sec - in 
org.apache.mahout.classifier.df.data.DataConverterTest
Running org.apache.mahout.classifier.df.data.DataTest
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.777 sec - in 
org.apache.mahout.classifier.df.data.DataTest
Running org.apache.mahout.classifier.df.mapreduce.partial.Step1MapperTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.443 sec - in 
org.apache.mahout.classifier.df.mapreduce.partial.Step1MapperTest
Running org.apache.mahout.classifier.df.mapreduce.partial.TreeIDTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.239 sec - in 
org.apache.mahout.classifier.df.mapreduce.partial.TreeIDTest
Running org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilderTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.447 sec - in 
org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilderTest
Running org.apache.mahout.classifier.df.mapreduce.inmem.InMemInputSplitTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 sec - in 
org.apache.mahout.classifier.df.mapreduce.inmem.InMemInputSplitTest
Running org.apache.mahout.classifier.df.mapreduce.inmem.InMemInputFormatTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.179 sec - in 
org.apache.mahout.classifier.df.mapreduce.inmem.InMemInputFormatTest
Running org.apache.mahout.classifier.df.builder.DecisionTreeBuilderTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.09 sec - in 
org.apache.mahout.classifier.df.builder.DecisionTreeBuilderTest
Running org.apache.mahout.classifier.df.builder.DefaultTreeBuilderTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.077 sec - in 
org.apache.mahout.classifier.df.builder.DefaultTreeBuilderTest
Running org.apache.mahout.classifier.df.builder.InfiniteRecursionTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.319 sec - in 
org.apache.mahout.classifier.df.builder.InfiniteRecursionTest
Running org.apache.mahout.classifier.df.DecisionForestTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.321 sec - in 
org.apache.mahout.classifier.df.DecisionForestTest
Running org.apache.mahout.classifier.df.node.NodeTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.036 sec - in 
org.apache.mahout.classifier.df.node.NodeTest
Running org.apache.mahout.classifier.df.tools.VisualizerTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.356 sec - in 
org.apache.mahout.classifier.df.tools.VisualizerTest
Running org.apache.mahout.classifier.RegressionResultAnalyzerTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.043 sec - in 
org.apache.mahout.classifier.RegressionResultAnalyzerTest
Running org.apache.mahout.classifier.ConfusionMatrixTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec - in 
org.apache.mahout.classifier.ConfusionMatrixTest
Running org.apache.mahout.classifier.naivebayes.NaiveBayesTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.336 sec - in 
org.apache.mahout.classifier.naivebayes.NaiveBayesTest
Running 
org.apache.mahout.classifier.naivebayes.ComplementaryNaiveBayesClassifierTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.059 sec - in 
org.apache.mahout.classifier.naivebayes.ComplementaryNaiveBayesClassifierTest
Running org.apache.mahout.classifier.naivebayes.NaiveBayesModelTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.056 sec - in 
org.apache.mahout.classifier.naivebayes.NaiveBayesModelTest
Running org.apache.mahout.classifier.naivebayes.StandardNaiveBayesClassifierTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.063 sec - in 
org.apache.mahout.classifier.naivebayes.StandardNaiveBayesClassifierTest
Running 
org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapperTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.167 sec - in 
org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapperTest
Running org.apache.mahout.classifier.naivebayes.training.ThetaMapperTest
Tests run: 1,

Re: [jira] [Commented] (MAHOUT-1286) Memory-efficient DataModel, supporting fast online updates and element-wise iteration

2013-07-23 Thread Peng Cheng
That's exactly what I'm trying to do right now :) (I'm testing 
FastByIDArrayMap). But we probably have more problems than just HashMap: 
based on the heap dump analysis, PreferenceArray will probably be 
our next target. This is awesome, as your FactorizablePreferences didn't 
use it in the first place.


Yours Peng

On 13-07-23 05:46 PM, Sebastian Schelter wrote:

IMHO you will always have memory issues if you try to provide constant-time
random access. That's why I proposed to create a special memory-efficient
DataModel for sequential access.


2013/7/23 Peng Cheng (JIRA) 


 [
https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717659#comment-13717659]

Peng Cheng commented on MAHOUT-1286:


Aye aye, I just did. It turns out that instances of
PreferenceArray$PreferenceView have taken 1.7G. Quite unexpected, right?
Thanks a lot for the advice.
My next experiment will just use GenericPreference[] directly; there will
be no more PreferenceArray.

Class Name | Objects | Shallow Heap | Retained Heap
---
org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray$PreferenceView | 72,237,632 | 1,733,703,168 | >= 1,733,703,168
long[] | 480,199 | 818,209,680 | >= 818,209,680
float[] | 480,190 | 410,563,592 | >= 410,563,592
java.lang.Object[] | 18,230 | 361,525,488 | >= 2,443,647,088
org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray | 480,189 | 15,366,048 | >= 1,237,456,672
java.util.ArrayList | 17,811 | 427,464 | >= 2,092,416,104
char[] | 2,150 | 272,632 | >= 272,632
byte[] | 141 | 54,048 | >= 54,048
java.lang.String | 2,119 | 50,856 | >= 271,920
java.util.concurrent.ConcurrentHashMap$HashEntry | 673 | 21,536 | >= 38,104
java.net.URL | 229 | 14,656 | >= 40,720
java.util.HashMap$Entry | 344 | 11,008 | >= 68,760
---



Memory-efficient DataModel, supporting fast online updates and element-wise iteration
-

 Key: MAHOUT-1286
 URL: https://issues.apache.org/jira/browse/MAHOUT-1286
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.9
Reporter: Peng Cheng
Assignee: Sean Owen
   Original Estimate: 336h
  Remaining Estimate: 336h

Most DataModel implementations in the current CF component use hash maps to
enable fast 2D indexing and updates. This is not memory-efficient for big
data sets; e.g., the Netflix Prize dataset takes 11G of heap space as a
FileDataModel.

An improved implementation of DataModel should use a more compact data
structure (like arrays). This can trade a little time complexity in 2D
indexing for a vast improvement in memory efficiency. In addition, any online
recommender or online-to-batch converted recommender will not be affected
by this in the training process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA
administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira






Re: [jira] [Commented] (MAHOUT-1286) Memory-efficient DataModel, supporting fast online updates and element-wise iteration

2013-07-23 Thread Sebastian Schelter
IMHO you will always have memory issues if you try to provide constant-time
random access. That's why I proposed to create a special memory-efficient
DataModel for sequential access.
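
A sequential-access store of the kind proposed here could look roughly like the sketch below. It is only an illustration: the class SequentialPreferences and its Visitor interface are hypothetical names, not part of Mahout, and the layout (three parallel primitive arrays) is an assumption about what "memory-efficient" might mean in practice. Each preference costs 20 bytes of payload (two longs and a float) with no per-preference object, and the only supported read path is a full scan.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Mahout API): preferences kept in three parallel primitive arrays. */
public class SequentialPreferences {
    private long[] userIDs = new long[16];
    private long[] itemIDs = new long[16];
    private float[] values = new float[16];
    private int size = 0;

    /** Callback invoked once per stored preference, in insertion order. */
    public interface Visitor {
        void visit(long userID, long itemID, float value);
    }

    /** Appends one preference, doubling the backing arrays when they are full. */
    public void add(long userID, long itemID, float value) {
        if (size == userIDs.length) {
            int newCapacity = size * 2;
            userIDs = Arrays.copyOf(userIDs, newCapacity);
            itemIDs = Arrays.copyOf(itemIDs, newCapacity);
            values = Arrays.copyOf(values, newCapacity);
        }
        userIDs[size] = userID;
        itemIDs[size] = itemID;
        values[size] = value;
        size++;
    }

    public int size() {
        return size;
    }

    /** Sequential scan over all preferences; there is deliberately no random-access lookup. */
    public void forEach(Visitor visitor) {
        for (int i = 0; i < size; i++) {
            visitor.visit(userIDs[i], itemIDs[i], values[i]);
        }
    }

    public static void main(String[] args) {
        SequentialPreferences prefs = new SequentialPreferences();
        prefs.add(1L, 101L, 4.0f);
        prefs.add(1L, 102L, 3.5f);
        prefs.add(2L, 101L, 5.0f);

        final float[] sum = {0f};
        prefs.forEach((userID, itemID, value) -> sum[0] += value);
        System.out.println(prefs.size() + " preferences, rating sum " + sum[0]);
    }
}
```

A scan-only interface like this suits training loops that visit every rating once per pass; anything that needs "the rating of user u for item i" would still need an index on top.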


2013/7/23 Peng Cheng (JIRA) 

>
> [
> https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717659#comment-13717659]
>
> Peng Cheng commented on MAHOUT-1286:
> 
>
> Aye aye, I just did. It turns out that instances of
> PreferenceArray$PreferenceView have taken 1.7G. Quite unexpected, right?
> Thanks a lot for the advice.
> My next experiment will just use GenericPreference[] directly; there will
> be no more PreferenceArray.
>
> Class Name | Objects | Shallow Heap | Retained Heap
> ---
> org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray$PreferenceView | 72,237,632 | 1,733,703,168 | >= 1,733,703,168
> long[] | 480,199 | 818,209,680 | >= 818,209,680
> float[] | 480,190 | 410,563,592 | >= 410,563,592
> java.lang.Object[] | 18,230 | 361,525,488 | >= 2,443,647,088
> org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray | 480,189 | 15,366,048 | >= 1,237,456,672
> java.util.ArrayList | 17,811 | 427,464 | >= 2,092,416,104
> char[] | 2,150 | 272,632 | >= 272,632
> byte[] | 141 | 54,048 | >= 54,048
> java.lang.String | 2,119 | 50,856 | >= 271,920
> java.util.concurrent.ConcurrentHashMap$HashEntry | 673 | 21,536 | >= 38,104
> java.net.URL | 229 | 14,656 | >= 40,720
> java.util.HashMap$Entry | 344 | 11,008 | >= 68,760
> ---
>
>
> > Memory-efficient DataModel, supporting fast online updates and element-wise iteration
> >
> -
> >
> > Key: MAHOUT-1286
> > URL: https://issues.apache.org/jira/browse/MAHOUT-1286
> > Project: Mahout
> >  Issue Type: Improvement
> >  Components: Collaborative Filtering
> >Affects Versions: 0.9
> >Reporter: Peng Cheng
> >Assignee: Sean Owen
> >   Original Estimate: 336h
> >  Remaining Estimate: 336h
> >
> > Most DataModel implementations in the current CF component use hash maps to
> > enable fast 2D indexing and updates. This is not memory-efficient for big
> > data sets; e.g., the Netflix Prize dataset takes 11G of heap space as a
> > FileDataModel.
> > An improved implementation of DataModel should use a more compact data
> > structure (like arrays). This can trade a little time complexity in 2D
> > indexing for a vast improvement in memory efficiency. In addition, any online
> > recommender or online-to-batch converted recommender will not be affected
> > by this in the training process.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>


[jira] [Commented] (MAHOUT-1286) Memory-efficient DataModel, supporting fast online updates and element-wise iteration

2013-07-23 Thread Peng Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717659#comment-13717659
 ] 

Peng Cheng commented on MAHOUT-1286:


Aye aye, I just did. It turns out that instances of PreferenceArray$PreferenceView 
have taken 1.7G. Quite unexpected, right? Thanks a lot for the advice.
My next experiment will just use GenericPreference[] directly; there will be 
no more PreferenceArray.

Class Name | Objects | Shallow Heap | Retained Heap
---
org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray$PreferenceView | 72,237,632 | 1,733,703,168 | >= 1,733,703,168
long[] | 480,199 | 818,209,680 | >= 818,209,680
float[] | 480,190 | 410,563,592 | >= 410,563,592
java.lang.Object[] | 18,230 | 361,525,488 | >= 2,443,647,088
org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray | 480,189 | 15,366,048 | >= 1,237,456,672
java.util.ArrayList | 17,811 | 427,464 | >= 2,092,416,104
char[] | 2,150 | 272,632 | >= 272,632
byte[] | 141 | 54,048 | >= 54,048
java.lang.String | 2,119 | 50,856 | >= 271,920
java.util.concurrent.ConcurrentHashMap$HashEntry | 673 | 21,536 | >= 38,104
java.net.URL | 229 | 14,656 | >= 40,720
java.util.HashMap$Entry | 344 | 11,008 | >= 68,760
---
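
To see why the PreferenceView row dominates, it can help to divide each shallow-heap total by its object count. The small sketch below does that arithmetic with the figures from the histogram above; the class and method names are hypothetical, and the division assumes every instance of a non-array class has the same shallow size.

```java
/** Hypothetical helper: per-instance shallow size from a heap histogram row. */
public class HeapHistogramMath {

    /** Shallow bytes per instance, assuming all instances of the class are the same size. */
    static long bytesPerInstance(long shallowHeapBytes, long objectCount) {
        return shallowHeapBytes / objectCount;
    }

    public static void main(String[] args) {
        // Figures copied from the histogram rows above.
        long viewBytes = bytesPerInstance(1_733_703_168L, 72_237_632L);
        long arrayBytes = bytesPerInstance(15_366_048L, 480_189L);

        System.out.println("PreferenceView: " + viewBytes + " bytes each");
        System.out.println("GenericUserPreferenceArray: " + arrayBytes + " bytes each");
    }
}
```

With these inputs the two results are 24 and 32 bytes per instance. The individual objects are small; the 1.7G comes from there being more than 72 million PreferenceView instances alive at once.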


> Memory-efficient DataModel, supporting fast online updates and element-wise 
> iteration
> -
>
> Key: MAHOUT-1286
> URL: https://issues.apache.org/jira/browse/MAHOUT-1286
> Project: Mahout
>  Issue Type: Improvement
>  Components: Collaborative Filtering
>Affects Versions: 0.9
>Reporter: Peng Cheng
>Assignee: Sean Owen
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Most DataModel implementations in the current CF component use hash maps to enable 
> fast 2D indexing and updates. This is not memory-efficient for big data sets; 
> e.g., the Netflix Prize dataset takes 11G of heap space as a FileDataModel.
> An improved implementation of DataModel should use a more compact data structure 
> (like arrays). This can trade a little time complexity in 2D indexing for a 
> vast improvement in memory efficiency. In addition, any online recommender or 
> online-to-batch converted recommender will not be affected by this in the 
> training process.
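
The trade-off described in the issue (compact arrays in exchange for slower 2D indexing) might be sketched as follows. This is an assumption about one possible layout, not the implementation proposed in MAHOUT-1286: the class CompactUserPreferences is a hypothetical name, and it stores one user's preferences as two parallel arrays sorted by item ID, so a lookup is a binary search (O(log n)) rather than a hash probe.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Mahout API): one user's preferences as parallel arrays sorted by item ID. */
public class CompactUserPreferences {
    private final long[] itemIDs;  // sorted ascending
    private final float[] values;  // values[i] is the preference for itemIDs[i]

    /** Assumes sortedItemIDs is already sorted ascending and matches values in length. */
    public CompactUserPreferences(long[] sortedItemIDs, float[] values) {
        if (sortedItemIDs.length != values.length) {
            throw new IllegalArgumentException("itemIDs and values must have the same length");
        }
        this.itemIDs = sortedItemIDs.clone();
        this.values = values.clone();
    }

    /** O(log n) lookup; returns Float.NaN when there is no preference for the item. */
    public float getValue(long itemID) {
        int index = Arrays.binarySearch(itemIDs, itemID);
        return index >= 0 ? values[index] : Float.NaN;
    }

    public static void main(String[] args) {
        CompactUserPreferences prefs = new CompactUserPreferences(
                new long[] {101L, 205L, 309L},
                new float[] {4.0f, 2.5f, 5.0f});

        System.out.println(prefs.getValue(205L)); // preference stored for item 205
        System.out.println(prefs.getValue(999L)); // no preference for item 999
    }
}
```

Online updates are the weak point of this layout: inserting a new item ID means shifting or reallocating the arrays, which is the time cost the issue proposes to accept in return for the smaller footprint.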

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Jenkins build is back to normal : Mahout-Examples-Cluster-Reuters-II #551

2013-07-23 Thread Apache Jenkins Server
See 



Re: [VOTE] Release Mahout 0.8

2013-07-23 Thread Grant Ingersoll
Working on this today.  LDAP at ASF seems to be mucked up at the moment, so 
when it becomes available, I'll do the necessary copying, etc.

-Grant

On Jul 19, 2013, at 9:55 AM, Grant Ingersoll  wrote:

> This passes.  I will finish off the release either tonight or tomorrow AM.
> 
> On Jul 19, 2013, at 3:06 AM, Jake Mannix  wrote:
> 
>> +1 from me, I used the jars to run some LDA (on a couple hundred million
>> documents) on the work cluster (1.0.something small), and it worked fine.
>> Other clustering example (with reuters) also worked as expected.
>> 
>> 
>> 
>> On Thu, Jul 18, 2013 at 11:27 AM, Suneel Marthi 
>> wrote:
>> 
>>> +1 from me.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: Sebastian Schelter 
>>> To: dev@mahout.apache.org
>>> Sent: Thursday, July 18, 2013 1:22 PM
>>> Subject: Re: [VOTE] Release Mahout 0.8
>>> 
>>> 
>>> +1 from me, recommender stuff worked fine in my tests
>>> 
>>> 
>>> 2013/7/18 Grant Ingersoll 
>>> 
 +1 from me.
 
 On Jul 16, 2013, at 4:52 PM, Grant Ingersoll 
>>> wrote:
 
> Applying a forcing function:
> 
> Please vote on releasing the 0.8 artifacts at
 
>>> https://repository.apache.org/content/repositories/orgapachemahout-113/org/apache/mahout/
 .
> 
> Release notes are at
 https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
> 
> [] +1 Looks good
> [] 0 - No opinion
> [] -1 Don't release
> 
> Vote criteria from https://www.apache.org/dev/release.html
> 
> What are the ASF requirements on approving a release?
> Votes on whether a package is ready to be released use majority
>>> approval
 -- i.e., at least three PMC members must vote affirmatively for release,
 and there must be more positive than negative votes. Releases may not be
 vetoed. Before voting +1 PMC members are required to download the signed
 source code package, compile it as provided, and test the resulting
 executable on their own platform, along with also verifying that the
 package meets the requirements of the ASF policy on releases.
> 
> Thanks,
> Grant
 
 
 
>>> 
>> 
>> 
>> 
>> -- 
>> 
>>  -jake
> 
> 
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com
> 
> 
> 
> 
> 


Grant Ingersoll | @gsingers
http://www.lucidworks.com