[jira] [Updated] (MAHOUT-1270) Broken link on Developer Resources page

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1270:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Broken link on Developer Resources page
 ---

 Key: MAHOUT-1270
 URL: https://issues.apache.org/jira/browse/MAHOUT-1270
 Project: Mahout
  Issue Type: Bug
  Components: Website
Affects Versions: 0.7
Reporter: Erhan Bagdemir
Assignee: Robin Anil
Priority: Minor
 Fix For: 0.8


 The "How to contribute" link on the page
 https://cwiki.apache.org/confluence/display/MAHOUT/Developer+Resources
 is broken :-|
 https://cwiki.apache.org/MAHOUT/how-to-contribute.html returns 404.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAHOUT-1202) Speed up Vector operations

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1202:
--

Fix Version/s: 0.8

 Speed up Vector operations
 --

 Key: MAHOUT-1202
 URL: https://issues.apache.org/jira/browse/MAHOUT-1202
 Project: Mahout
  Issue Type: Improvement
  Components: Math
Affects Versions: 0.8
Reporter: Dan Filimon
 Fix For: 0.8


 Vector assign() and aggregate() can be made significantly faster under some
 conditions by taking into account the properties of the vectors we're
 working with.
 This issue relates to the design document at 
 https://docs.google.com/document/d/1g1PjUuvjyh2LBdq2_rKLIcUiDbeOORA1sCJiSsz-JVU/edit#heading=h.koi571fvwha3jj
 and the patch at
 https://reviews.apache.org/r/10669
 The benchmarks are at
 https://docs.google.com/spreadsheet/ccc?key=0AochdzPoBmWodG9RTms1UG40YlNQd3ByUFpQY0FLWmcpli=1#gid=10
 and while there are a few regressions (involving RandomAccessSparseVectors,
 which will be fixed later), it improves a lot of benchmarks and cleans up
 the code significantly.
 Part 1, the new function interfaces, is merged. [Committed revision 1478853.]



[jira] [Updated] (MAHOUT-1216) Add locality sensitive hashing and a LocalitySensitiveHash searcher

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1216:
--

Fix Version/s: 0.8

 Add locality sensitive hashing and a LocalitySensitiveHash searcher
 ---

 Key: MAHOUT-1216
 URL: https://issues.apache.org/jira/browse/MAHOUT-1216
 Project: Mahout
  Issue Type: New Feature
  Components: Math
Affects Versions: 0.8
Reporter: Dan Filimon
 Fix For: 0.8


 This issue tackles the LocalitySensitiveHashSearch, which was initially
 supposed to be part of MAHOUT-1156.
 It adds HashedVector (the class that adds the LSH to vectors) and a new
 searcher (although a better implementation is possible), and adds support in
 the existing tests and the new StreamingKMeans infrastructure.



[jira] [Updated] (MAHOUT-1205) ParallelALSFactorizationJob should leverage the distributed cache

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1205:
--

Fix Version/s: 0.8

 ParallelALSFactorizationJob should leverage the distributed cache
 -

 Key: MAHOUT-1205
 URL: https://issues.apache.org/jira/browse/MAHOUT-1205
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.8
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
 Fix For: 0.8


 ParallelALSFactorizationJob should use the DistributedCache to broadcast the 
 feature matrices only once per re-computation.



[jira] [Updated] (MAHOUT-1198) Allow Latex in javadox

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1198:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Allow Latex in javadox
 --

 Key: MAHOUT-1198
 URL: https://issues.apache.org/jira/browse/MAHOUT-1198
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Ted Dunning
 Fix For: 0.8


 We are headed into a release (hopefully), and now would be a good time to add
 the capability to generate javadocs with embedded LaTeX.
 Following a hint from Commons Math, I tested a way to inject MathJax into the
 header of the resulting web site and got good results (see
 http://tdunning.github.io/bandit-ranking/ and especially the docs for
 GammaNormalDistribution and BetaBinomialDistribution).
 The basic idea is that we need to add the following config to the javadoc
 plugin:
 {quote}
 <configuration>
   <additionalparam>-header &apos;&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML&quot;&gt;&lt;/script&gt;&apos;</additionalparam>
 </configuration>
 {quote}
 Having done this, \[ \] and \( \) can be used to embed latex equations in the 
 javadocs.
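
 To illustrate what the delimiters would look like in practice, here is a
 minimal sketch of a Javadoc comment that uses them. The class
 LogisticExample and its method are hypothetical and are not Mahout code; the
 formulas would only be typeset if a MathJax header like the one in this
 issue is actually configured.

```java
/**
 * Hypothetical example class (not part of Mahout) showing LaTeX in Javadoc.
 * With a MathJax header configured, inline math such as \( \sigma(x) \)
 * and display math such as
 * \[ \sigma(x) = \frac{1}{1 + e^{-x}} \]
 * would be typeset in the generated HTML.
 */
final class LogisticExample {
  private LogisticExample() {}

  /**
   * Computes the logistic function \( \sigma(x) = 1 / (1 + e^{-x}) \).
   *
   * @param x any real number
   * @return a value between 0 and 1
   */
  static double logistic(double x) {
    return 1.0 / (1.0 + Math.exp(-x));
  }
}
```

 One thing to watch in real code: a backslash followed by the letter u starts
 a Unicode escape even inside a Java comment, so LaTeX commands beginning
 with that letter would need care.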



[jira] [Updated] (MAHOUT-1182) remove useless append

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1182:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 remove useless append
 -

 Key: MAHOUT-1182
 URL: https://issues.apache.org/jira/browse/MAHOUT-1182
 Project: Mahout
  Issue Type: Improvement
  Components: Integration
Affects Versions: 0.7
Reporter: Dave Brosius
Priority: Trivial
 Fix For: 0.8

 Attachments: uselessappend.txt


 .append() removed



[jira] [Updated] (MAHOUT-1119) code bug in org.apache.mahout.text.SequenceFilesFromDirectory

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1119:
--

Fix Version/s: 0.8

 code bug in org.apache.mahout.text.SequenceFilesFromDirectory
 -

 Key: MAHOUT-1119
 URL: https://issues.apache.org/jira/browse/MAHOUT-1119
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.7
 Environment: linux、JDK1.6
Reporter: 徐家
Assignee: Sebastian Schelter
  Labels: SequenceFilesFromDirectory
 Fix For: 0.8

   Original Estimate: 1h
  Remaining Estimate: 1h

 In org.apache.mahout.text.SequenceFilesFromDirectory, from line 89 to 96, the
 code is
   pathFilterClass.getConstructor(Configuration.class,
                                  String.class,
                                  Map.class,
                                  ChunkedWriter.class,
                                  Charset.class,
                                  FileSystem.class);
   pathFilter = constructor.newInstance(conf, keyPrefix, options,
                                        writer, charset, fs);
 Obviously, the call to constructor.newInstance lacks the charset parameter.
 If I implement a subclass of SequenceFilesFromDirectoryFilter, there will be
 a runtime error.



[jira] [Updated] (MAHOUT-1148) QR Decomposition is too slow

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1148:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 QR Decomposition is too slow
 

 Key: MAHOUT-1148
 URL: https://issues.apache.org/jira/browse/MAHOUT-1148
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Ted Dunning
 Fix For: 0.8


 A user reported that QR decomposition is too slow.  I coded up a replacement
 that can be 10x faster in certain cases; the new version is also tested.



[jira] [Updated] (MAHOUT-1127) OnlineLogisticRegression test is flaky (and wrong)

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1127:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 OnlineLogisticRegression test is flaky (and wrong)
 --

 Key: MAHOUT-1127
 URL: https://issues.apache.org/jira/browse/MAHOUT-1127
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Ted Dunning
 Fix For: 0.8






[jira] [Updated] (MAHOUT-1174) Lanczos code and javadocs should refer users to the SSVD stuff

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1174:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Lanczos code and javadocs should refer users to the SSVD stuff
 --

 Key: MAHOUT-1174
 URL: https://issues.apache.org/jira/browse/MAHOUT-1174
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Ted Dunning
Assignee: Ted Dunning
 Fix For: 0.8






[jira] [Updated] (MAHOUT-1166) Multithreaded version of distributed ALS

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1166:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Multithreaded version of distributed ALS
 

 Key: MAHOUT-1166
 URL: https://issues.apache.org/jira/browse/MAHOUT-1166
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: MAHOUT-1166.patch


 Our implementation of ALS broadcasts the feature matrices in each iteration.
 Therefore, it makes sense to run the mappers in multithreaded mode so that
 we do not have to load one copy of the feature matrix per core, but can share
 a single read-only in-memory copy.
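
 The sharing idea can be sketched in plain Java, outside of Hadoop. This is
 only an illustration under stated assumptions: the class
 SharedFeatureMatrixSketch and the method dotAll are hypothetical, a thread
 pool stands in for multithreaded mappers, and the attached patch may well
 work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Mahout code: worker threads share one matrix.
final class SharedFeatureMatrixSketch {
  private SharedFeatureMatrixSketch() {}

  /** Computes dot(features[rowId], query) for each rowId using a thread pool. */
  static double[] dotAll(final double[][] features, int[] rowIds,
                         final double[] query, int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Double>> pending = new ArrayList<>();
      for (final int rowId : rowIds) {
        // Every task reads the same array instance; no per-thread copy is made.
        Callable<Double> task = () -> dot(features[rowId], query);
        pending.add(pool.submit(task));
      }
      double[] results = new double[rowIds.length];
      for (int i = 0; i < results.length; i++) {
        results[i] = pending.get(i).get();
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  private static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}
```

 Sharing is safe here only because no task ever writes to the matrix, which
 matches the read-only assumption in the description.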



[jira] [Updated] (MAHOUT-1173) Reactivate checkstyle

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1173:
--

Fix Version/s: 0.8

 Reactivate checkstyle 
 --

 Key: MAHOUT-1173
 URL: https://issues.apache.org/jira/browse/MAHOUT-1173
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: mahout-checkstyle.xml


 I would like to reactivate checkstyle in our build. IMHO we should not make 
 it fail on checkstyle errors at the moment (anyone disagree on this?).



[jira] [Updated] (MAHOUT-1143) DecisionForest classifier should output label string instead of code

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1143:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 DecisionForest classifier should output label string instead of code
 

 Key: MAHOUT-1143
 URL: https://issues.apache.org/jira/browse/MAHOUT-1143
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 0.7
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
Priority: Critical
 Fix For: 0.8

 Attachments: MAHOUT-1143.patch


 When calling TestForest on a classification problem, the output labels are
 numerical values corresponding to each label's internal code. TestForest
 should output the string label instead of the code.
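
 A rough sketch of the requested behaviour, assuming the internal code is
 simply an index into the list of label strings. That assumption and the
 class LabelDecoderSketch are illustrative only and are not taken from the
 attached patch.

```java
// Hypothetical sketch, not Mahout code: map an internal label code to its string.
final class LabelDecoderSketch {
  private final String[] labels;

  LabelDecoderSketch(String[] labels) {
    this.labels = labels.clone();
  }

  /** Returns the label string for a predicted code, or "unknown" if out of range. */
  String decode(double predictedCode) {
    if (Double.isNaN(predictedCode)) {
      return "unknown";
    }
    int code = (int) predictedCode;
    if (code < 0 || code >= labels.length) {
      return "unknown";
    }
    return labels[code];
  }
}
```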



[jira] [Updated] (MAHOUT-1157) AbstractCluster.formatVector iteration bug.

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1157:
--

Fix Version/s: 0.8

 AbstractCluster.formatVector iteration bug.
 ---

 Key: MAHOUT-1157
 URL: https://issues.apache.org/jira/browse/MAHOUT-1157
 Project: Mahout
  Issue Type: Bug
  Components: Clustering
Affects Versions: 0.7
Reporter: Adam Bozanich
 Fix For: 0.8

 Attachments: mahout.patch


 AbstractCluster.formatVector's use of the size field of the given vector 
 causes problems when the vector is sparse.
 I clustered a handful of vectors which had been initialized with a 
 cardinality of Integer.MAX_VALUE. Running seqdump on the resulting 
 clusteredPoints took over four minutes.  This is because formatVector() was 
 iterating over the entire integer space for every vector.
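
 The difference between iterating over the cardinality and iterating over the
 stored entries can be sketched as follows. This is not the attached patch:
 the class SparseFormatSketch is hypothetical, and a Map from index to value
 stands in for a sparse vector.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Mahout code.
final class SparseFormatSketch {
  private SparseFormatSketch() {}

  /**
   * Formats only the stored (non-zero) entries, in index order, so the cost
   * depends on the number of stored entries rather than on the cardinality.
   */
  static String format(Map<Integer, Double> nonZeros) {
    StringBuilder buf = new StringBuilder("[");
    boolean first = true;
    for (Map.Entry<Integer, Double> e : new TreeMap<>(nonZeros).entrySet()) {
      if (!first) {
        buf.append(", ");
      }
      buf.append(e.getKey()).append(':').append(e.getValue());
      first = false;
    }
    return buf.append(']').toString();
  }
}
```

 With two stored entries the loop body runs twice, even if the indices are
 close to Integer.MAX_VALUE.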



[jira] [Updated] (MAHOUT-1151) Object reuse in distributed ALS

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1151:
--

Fix Version/s: 0.8

 Object reuse in distributed ALS
 ---

 Key: MAHOUT-1151
 URL: https://issues.apache.org/jira/browse/MAHOUT-1151
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.8
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: MAHOUT-1151-2.patch, MAHOUT-1151.patch


 In order to improve the performance of our distributed ALS code, we should try
 to avoid object instantiation as much as possible, especially when it is done
 per input tuple.



[jira] [Updated] (MAHOUT-1019) VectorDistanceSimilarityJob

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1019:
--

Fix Version/s: 0.8

 VectorDistanceSimilarityJob
 ---

 Key: MAHOUT-1019
 URL: https://issues.apache.org/jira/browse/MAHOUT-1019
 Project: Mahout
  Issue Type: Improvement
  Components: Math
Affects Versions: 0.8
 Environment: all
Reporter: Timothy Potter
Priority: Minor
  Labels: VectorDistanceSimilarityJob, distance, vector
 Fix For: 0.8

 Attachments: MAHOUT-1019.patch

   Original Estimate: 12h
  Remaining Estimate: 12h

 The VectorDistanceSimilarityJob is a fantastic tool, but poses the risk of
 creating terabytes of output of dubious value. For example, I have ~10K seed
 vectors and millions of vectors to compute the similarity against. I would
 therefore like to add an optional parameter to this job that specifies a
 maximum distance threshold and prevents any distances above this value from
 being written to the output. The default would be 1.0d, so no filtering is
 applied, which ensures backwards compatibility; if the parameter is supplied,
 only rows where the distance is less than the threshold would be output from
 the mapper. This can reduce the storage requirements of the output immensely.
 The parameter could be named something like: noOutputIfDistanceGreaterThan
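
 The proposed filter can be sketched in plain Java as below. This is not the
 attached patch: the class DistanceThresholdSketch is hypothetical, Euclidean
 distance is an assumed measure, and the sketch skips only distances strictly
 greater than the threshold (so passing positive infinity disables filtering).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Mahout code.
final class DistanceThresholdSketch {
  private DistanceThresholdSketch() {}

  /** Returns "seedIndex,vectorIndex,distance" rows, skipping distances above maxDistance. */
  static List<String> distances(double[][] seeds, double[][] vectors, double maxDistance) {
    List<String> rows = new ArrayList<>();
    for (int s = 0; s < seeds.length; s++) {
      for (int v = 0; v < vectors.length; v++) {
        double d = euclidean(seeds[s], vectors[v]);
        if (d <= maxDistance) {
          rows.add(s + "," + v + "," + d);
        }
      }
    }
    return rows;
  }

  private static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }
}
```

 Filtering at the point where a row would be emitted is what saves the
 storage, since rejected pairs never reach the output at all.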



[jira] [Updated] (MAHOUT-1082) driver seqdirectory fails with param -filter set

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1082:
--

Fix Version/s: 0.8

 driver seqdirectory fails with param -filter set
 

 Key: MAHOUT-1082
 URL: https://issues.apache.org/jira/browse/MAHOUT-1082
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.7
Reporter: Johannes Rauber
Priority: Minor
 Fix For: 0.8


 The following error is thrown when a custom implementation of
 PrefixAdditionFilter is specified with the -filter parameter for seqdirectory.
 Exception in thread "main" java.lang.IllegalArgumentException: wrong number of arguments
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
   at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:96)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:53)
 In class org.apache.mahout.text.SequenceFilesFromDirectory, line 96, the
 following additional parameter should be inserted into the reflective call of
 the constructor: charset
 Raises error:
 pathFilter = constructor.newInstance(conf, keyPrefix, options, writer, fs);
 Fix:
 pathFilter = constructor.newInstance(conf, keyPrefix, options, writer, charset, fs);
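
 The failure mode can be reproduced in isolation. The sketch below uses a
 hypothetical stand-in class (ReflectionAritySketch.Filter, with three
 constructor parameters instead of six) and shows that passing fewer
 arguments to Constructor.newInstance than the looked-up constructor declares
 raises IllegalArgumentException at run time rather than at compile time.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch, not Mahout code.
final class ReflectionAritySketch {
  private ReflectionAritySketch() {}

  /** Stand-in for a filter class whose constructor takes three arguments. */
  static final class Filter {
    final String prefix;
    final String charset;
    final Integer depth;

    public Filter(String prefix, String charset, Integer depth) {
      this.prefix = prefix;
      this.charset = charset;
      this.depth = depth;
    }
  }

  /** Looks up the three-argument constructor and invokes it with the given arguments. */
  static String tryConstruct(Object... args) {
    try {
      Constructor<Filter> ctor =
          Filter.class.getConstructor(String.class, String.class, Integer.class);
      ctor.newInstance(args);
      return "ok";
    } catch (IllegalArgumentException e) {
      // Thrown when the argument count or types do not match the constructor.
      return "IllegalArgumentException";
    } catch (ReflectiveOperationException e) {
      return e.getClass().getSimpleName();
    }
  }
}
```

 Calling tryConstruct with all three arguments succeeds, while omitting the
 middle one fails in the same way as the stack trace in this issue.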



[jira] [Updated] (MAHOUT-1075) ClusterDumper output file should be optional

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1075:
--

Fix Version/s: 0.8

 ClusterDumper output file should be optional
 

 Key: MAHOUT-1075
 URL: https://issues.apache.org/jira/browse/MAHOUT-1075
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Dave Byrne
 Fix For: 0.8

 Attachments: clusterdumper_out.patch


 The ClusterDumper output option should be optional, defaulting to System.out
 if -o is not specified.
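
 A minimal sketch of that fallback, assuming the parsed -o value arrives as a
 possibly null String. The class OutputTargetSketch and its methods are
 hypothetical and are not taken from the attached patch.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Mahout code.
final class OutputTargetSketch {
  private OutputTargetSketch() {}

  /** True when no output path was given, i.e. output should go to standard output. */
  static boolean writesToStdout(String outputPath) {
    return outputPath == null || outputPath.isEmpty();
  }

  /** Opens the file if a path was given; otherwise wraps System.out. */
  static Writer open(String outputPath) {
    if (writesToStdout(outputPath)) {
      return new OutputStreamWriter(System.out, StandardCharsets.UTF_8);
    }
    try {
      return new OutputStreamWriter(new FileOutputStream(outputPath), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

 A caller would probably want to flush rather than close the writer in the
 System.out case, so that later console output still works.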



[jira] [Updated] (MAHOUT-1093) CrossFoldLearner trains in all folds if tracking key is negative

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1093:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 CrossFoldLearner trains in all folds if tracking key is negative
 

 Key: MAHOUT-1093
 URL: https://issues.apache.org/jira/browse/MAHOUT-1093
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Eric Springer
Assignee: Sebastian Schelter
 Fix For: 0.8


 See: https://github.com/apache/mahout/pull/7



[jira] [Updated] (MAHOUT-1171) PMD regression

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1171:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 PMD regression
 --

 Key: MAHOUT-1171
 URL: https://issues.apache.org/jira/browse/MAHOUT-1171
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Ted Dunning
Assignee: Ted Dunning
 Fix For: 0.8






[jira] [Updated] (MAHOUT-1167) Parallel item similarity precomputation on a single machine

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1167:
--

Fix Version/s: 0.8

 Parallel item similarity precomputation on a single machine
 ---

 Key: MAHOUT-1167
 URL: https://issues.apache.org/jira/browse/MAHOUT-1167
 Project: Mahout
  Issue Type: New Feature
  Components: Collaborative Filtering
Affects Versions: 0.8
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: MAHOUT-1167.patch


 We need some code for item-based CF use cases with an intermediate data size
 (e.g., a few million interactions). In such cases, the data might be too big
 to allow online computation of similarities and recommendations, but at the
 same time, going to Hadoop might still not be necessary or desirable.
 In such a case, it makes sense to precompute item similarities on a single
 machine.



[jira] [Updated] (MAHOUT-1150) ARFF Integration does not support quoted identifiers

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1150:
--

Fix Version/s: 0.8

 ARFF Integration does not support quoted identifiers
 

 Key: MAHOUT-1150
 URL: https://issues.apache.org/jira/browse/MAHOUT-1150
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.7
 Environment: All
Reporter: Marty Kube
 Fix For: 0.8

 Attachments: MAHOUT-1150.patch


 I ran the NSL-KDD data set (http://nsl.cs.unb.ca/NSL-KDD/) through the ARFF
 integration.  The process failed to parse the ARFF-formatted file.  The file
 has quoted identifiers:
 @relation 'KDDTrain-20Percent'
 @attribute 'duration' real
 @attribute 'protocol_type' {'tcp','udp', 'icmp'} 
 The quotes caused the problem.  The official ARFF BNF shows that quotes
 should be supported:
 https://list.scms.waikato.ac.nz/mailman/htdig/wekalist/2008-January/012153.html
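
 As a rough illustration of what handling quoted identifiers involves, the
 sketch below extracts the attribute name from an @attribute line whether or
 not it is wrapped in single quotes. The class ArffNameSketch is hypothetical
 and is not the attached patch; it deliberately ignores escaped quotes and
 double-quoted names.

```java
// Hypothetical sketch, not Mahout code.
final class ArffNameSketch {
  private static final String KEYWORD = "@attribute";

  private ArffNameSketch() {}

  /** Returns the attribute name from an "@attribute" line, without surrounding single quotes. */
  static String attributeName(String line) {
    String trimmed = line.trim();
    if (!trimmed.regionMatches(true, 0, KEYWORD, 0, KEYWORD.length())) {
      throw new IllegalArgumentException("Not an @attribute line: " + line);
    }
    String rest = trimmed.substring(KEYWORD.length()).trim();
    if (rest.startsWith("'")) {
      // Quoted name: take everything up to the closing quote.
      int close = rest.indexOf('\'', 1);
      if (close < 0) {
        throw new IllegalArgumentException("Unterminated quote: " + line);
      }
      return rest.substring(1, close);
    }
    // Unquoted name: take everything up to the first whitespace character.
    int end = 0;
    while (end < rest.length() && !Character.isWhitespace(rest.charAt(end))) {
      end++;
    }
    return rest.substring(0, end);
  }
}
```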



[jira] [Updated] (MAHOUT-1114) Some delegating vectors have subtle clone bug

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1114:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Some delegating vectors have subtle clone bug
 -

 Key: MAHOUT-1114
 URL: https://issues.apache.org/jira/browse/MAHOUT-1114
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.7
Reporter: Ted Dunning
 Fix For: 0.8


 Cloning a Centroid returns a WeightedVector.
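
 The general shape of this kind of bug can be sketched with stand-in classes.
 WeightedPoint, UnfixedCentroidPoint and CentroidPoint below are hypothetical
 and only mirror the Centroid/WeightedVector relationship loosely; the actual
 fix in Mahout may differ.

```java
// Hypothetical sketch, not Mahout code.
class WeightedPoint {
  final double weight;

  WeightedPoint(double weight) {
    this.weight = weight;
  }

  @Override
  public WeightedPoint clone() {
    // Always builds the base type, regardless of the runtime class.
    return new WeightedPoint(weight);
  }
}

// Inherits clone(), so cloning silently yields a WeightedPoint.
class UnfixedCentroidPoint extends WeightedPoint {
  final int key;

  UnfixedCentroidPoint(int key, double weight) {
    super(weight);
    this.key = key;
  }
}

// Overrides clone() with a covariant return type and keeps the subclass state.
class CentroidPoint extends WeightedPoint {
  final int key;

  CentroidPoint(int key, double weight) {
    super(weight);
    this.key = key;
  }

  @Override
  public CentroidPoint clone() {
    return new CentroidPoint(key, weight);
  }
}
```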



[jira] [Updated] (MAHOUT-1031) Drop empty vectors in encoding pipeline

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1031:
--

Fix Version/s: 0.8

 Drop empty vectors in encoding pipeline
 ---

 Key: MAHOUT-1031
 URL: https://issues.apache.org/jira/browse/MAHOUT-1031
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Robin Anil
Assignee: Robin Anil
 Fix For: 0.8

 Attachments: MAHOUT-1031.patch, MAHOUT-1031.patch






[jira] [Updated] (MAHOUT-1113) Need test case to demonstrate simple use of SGD

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1113:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Need test case to demonstrate simple use of SGD
 ---

 Key: MAHOUT-1113
 URL: https://issues.apache.org/jira/browse/MAHOUT-1113
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.7
Reporter: Ted Dunning
Priority: Minor
 Fix For: 0.8


 Need a test case that shows how to use SGD on a well-known data set like the
 Iris data.



[jira] [Updated] (MAHOUT-707) Setup Jenkins Jobs to validate our Examples/bin Scripts

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-707:
-

Affects Version/s: 0.7
Fix Version/s: 0.8

 Setup Jenkins Jobs to validate our Examples/bin Scripts
 ---

 Key: MAHOUT-707
 URL: https://issues.apache.org/jira/browse/MAHOUT-707
 Project: Mahout
  Issue Type: Task
Affects Versions: 0.7
Reporter: Grant Ingersoll
 Fix For: 0.8


 We should set up Jenkins to run our example scripts on a regular basis (see
 MAHOUT-694) and check for breakage.



[jira] [Updated] (MAHOUT-1061) mapreduce split causes ClassNotFound exception

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1061:
--

Fix Version/s: 0.8

 mapreduce split causes ClassNotFound exception
 --

 Key: MAHOUT-1061
 URL: https://issues.apache.org/jira/browse/MAHOUT-1061
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.7
Reporter: David Engel
Assignee: Sebastian Schelter
  Labels: patch
 Fix For: 0.8


 Running the split program in mapreduce mode, e.g. mahout split -xm mapreduce 
 ... results in a ClassNotFound exception because the job jar is not set.  
 The following patch fixes the problem for me.
 diff -ur mahout-distribution-0.7.orig/integration/src/main/java/org/apache/mahout/utils/SplitInputJob.java mahout-distribution-0.7/integration/src/main/java/org/apache/mahout/utils/SplitInputJob.java
 --- mahout-distribution-0.7.orig/integration/src/main/java/org/apache/mahout/utils/SplitInputJob.java 2012-06-12 03:30:39.0 -0500
 +++ mahout-distribution-0.7/integration/src/main/java/org/apache/mahout/utils/SplitInputJob.java 2012-08-20 17:28:18.0 -0500
 @@ -114,6 +114,7 @@
  
  // Setup job with new API
  Job job = new Job(oldApiJob);
 +job.setJarByClass(SplitInputJob.class);
  FileInputFormat.addInputPath(job, inputPath);
  FileOutputFormat.setOutputPath(job, outputPath);
  job.setNumReduceTasks(1);



[jira] [Updated] (MAHOUT-917) Build takes too long

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-917:
-

Affects Version/s: 0.6
   0.7
Fix Version/s: 0.8

 Build takes too long
 

 Key: MAHOUT-917
 URL: https://issues.apache.org/jira/browse/MAHOUT-917
 Project: Mahout
  Issue Type: Improvement
  Components: build
Affects Versions: 0.6, 0.7
Reporter: Frank Scholten
Assignee: Sebastian Schelter
 Fix For: 0.8


 On my machine a full mvn clean install takes 55 minutes.
 As an experiment I set all MapReduce job tests for all clustering algorithms
 to be ignored. This reduces the build to 45 minutes. There are a lot of these
 long-running tests in the project.
 What about creating a separate Maven profile for the nightly build that runs
 all MapReduce job tests? For this we have to move these MapReduce tests
 to separate classes with a naming convention such as *JobTest or
 *IntegrationTest and add some Maven configuration.



[jira] [Updated] (MAHOUT-1104) Improve Javadoc for AbstractVectorClassifier

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1104:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Improve Javadoc for AbstractVectorClassifier
 

 Key: MAHOUT-1104
 URL: https://issues.apache.org/jira/browse/MAHOUT-1104
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 0.7
Reporter: Timothy Mann
Priority: Minor
  Labels: classification, documentation, patch
 Fix For: 0.8

 Attachments: classifier_jdoc.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Modify javadocs for AbstractVectorClassifier to clarify what classify and 
 classifyFull methods do.
 Override javadoc for classify and classifyScalar methods in 
 AbstractNaiveBayesClassifier to reflect the fact that they are not supported.



[jira] [Updated] (MAHOUT-1141) Driver for cvb0_local does not warn about missing maxIterations command line parameter

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1141:
--

Fix Version/s: 0.8

 Driver for cvb0_local does not warn about missing maxIterations command line 
 parameter
 --

 Key: MAHOUT-1141
 URL: https://issues.apache.org/jira/browse/MAHOUT-1141
 Project: Mahout
  Issue Type: Bug
  Components: Clustering
Affects Versions: 0.7, 0.8
 Environment: MacOS 10.8, Java 7
Reporter: Samar Lotia
Priority: Minor
 Fix For: 0.8


 The driver for cvb0_local does not seem to verify whether the caller has 
 specified the required maxIterations command line parameter. This results in 
 an exception much further down, which pretty much requires reading the 
 source to discover the cause of the error.
 Exception in thread "main" java.lang.ClassCastException: java.lang.Integer 
 cannot be cast to java.lang.String
   at 
 org.apache.mahout.clustering.lda.cvb.InMemoryCollapsedVariationalBayes0.main2(InMemoryCollapsedVariationalBayes0.java:374)
   at 
 org.apache.mahout.clustering.lda.cvb.InMemoryCollapsedVariationalBayes0.run(InMemoryCollapsedVariationalBayes0.java:521)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at 
 org.apache.mahout.clustering.lda.cvb.InMemoryCollapsedVariationalBayes0.main(InMemoryCollapsedVariationalBayes0.java:525)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
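
The check asked for above could look roughly like the following. This is a minimal sketch, not the actual Mahout driver code: the class RequiredOptionCheck, its methods, and the "--maxIterations" flag spelling are all hypothetical and only illustrate failing early with a message that names the missing option.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of up-front option validation; not the actual Mahout driver code. */
final class RequiredOptionCheck {

  /** Parses "--name value" pairs into a map; a flag without a value maps to "". */
  static Map<String, String> parse(String[] args) {
    Map<String, String> options = new HashMap<>();
    for (int i = 0; i < args.length; i++) {
      if (args[i].startsWith("--")) {
        String name = args[i].substring(2);
        boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("--");
        options.put(name, hasValue ? args[++i] : "");
      }
    }
    return options;
  }

  /** Returns the option as an int, or fails with a message that names the option. */
  static int requireInt(Map<String, String> options, String name) {
    String raw = options.get(name);
    if (raw == null) {
      throw new IllegalArgumentException("Missing required option --" + name);
    }
    try {
      return Integer.parseInt(raw);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "Option --" + name + " must be an integer, got '" + raw + "'", e);
    }
  }

  public static void main(String[] args) {
    // Fails immediately, with a readable message, when the option is absent.
    int maxIterations = requireInt(parse(args), "maxIterations");
    System.out.println("maxIterations = " + maxIterations);
  }
}
```

Validating at the entry point keeps the failure next to its cause, instead of surfacing later as a cast error deep in the job.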



[jira] [Updated] (MAHOUT-760) org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest test fails during install

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-760:
-

Fix Version/s: 0.7

 org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest test fails during install 
 --

 Key: MAHOUT-760
 URL: https://issues.apache.org/jira/browse/MAHOUT-760
 Project: Mahout
  Issue Type: Bug
  Components: Frequent Itemset/Association Rule Mining
Affects Versions: 0.5
 Environment: #uname -a
 Linux hostname 2.6.27.54-0.2-default #1 SMP 2010-10-19 18:40:07 +0200 
 x86_64 x86_64 x86_64 GNU/Linux
 # java -version
 java version "1.6.0"
 Java(TM) SE Runtime Environment (build 
 pxa6460sr9ifix-20110211_02(SR9+IZ94423))
 IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 
 jvmxa6460sr9-20101124_69295 (JIT enabled, AOT enabled)
 J9VM - 20101124_069295
 JIT  - r9_20101028_17488ifx2
 GC   - 20101027_AA)
 JCL  - 20110211_02
Reporter: Chintamani
Assignee: Sean Owen
Priority: Minor
  Labels: hadoop, ibm-jdk
 Fix For: 0.7


 mvn install core fails because of a single failed test - 
 org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest with the following error 
 (extracted from 
 target/surefire-reports/org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest.txt)
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.615 sec <<< 
 FAILURE!
 testStartParallelFPGrowth(org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest)  
 Time elapsed: 6.587 sec  <<< FAILURE!
 org.junit.ComparisonFailure: expected:<{[D=0, E=1, A=0, B=0, C]=1}> but 
 was:<{[A=0, B=0, C=1, D=0, E]=1}>
 at org.junit.Assert.assertEquals(Assert.java:123)
 at org.junit.Assert.assertEquals(Assert.java:145)
 at 
 org.apache.mahout.fpm.pfpgrowth.PFPGrowthTest.testStartParallelFPGrowth(PFPGrowthTest.java:95)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at 
 org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:119)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:101)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at 
 org.apache.maven.surefire.booter.ProviderFactory$ClassLoaderProxy.invoke(ProviderFactory.java:103)
 at $Proxy0.invoke(Unknown Source)
 at 
 org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:150)
 at 
 org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcess(SurefireStarter.java:91)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:69)
 Every other test in all the components succeeds.
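
The failure above reports the same five entries in a different order. JUnit raises ComparisonFailure when two strings differ, so the test is presumably comparing textual forms of a map. The sketch below is a hypothetical illustration rather than code from PFPGrowthTest: it shows why such a comparison depends on iteration order, and two order-independent alternatives.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of order-sensitive vs. order-independent map comparison. */
final class MapComparisonSketch {

  /** Builds a map whose iteration order is the order of the given keys. */
  static Map<String, Integer> build(String[] keys, int[] values) {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < keys.length; i++) {
      map.put(keys[i], values[i]);
    }
    return map;
  }

  /** Order-sensitive: compares the printed forms, as a string assertion would. */
  static boolean sameText(Map<String, Integer> a, Map<String, Integer> b) {
    return a.toString().equals(b.toString());
  }

  /** Order-independent text: sort the keys before printing. */
  static String canonicalText(Map<String, Integer> map) {
    return new TreeMap<>(map).toString();
  }

  public static void main(String[] args) {
    Map<String, Integer> expected =
        build(new String[] {"D", "E", "A", "B", "C"}, new int[] {0, 1, 0, 0, 1});
    Map<String, Integer> actual =
        build(new String[] {"A", "B", "C", "D", "E"}, new int[] {0, 0, 1, 0, 1});
    System.out.println("same text:      " + sameText(expected, actual));
    System.out.println("Map.equals:     " + expected.equals(actual));
    System.out.println("canonical text: " + canonicalText(actual));
  }
}
```

Comparing the maps with equals, or comparing sorted text, avoids relying on an iteration order that a different JVM is free to change.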



[jira] [Updated] (MAHOUT-1118) SLF4J Log4j bindings are messed up causing examples to fail

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1118:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 SLF4J Log4j bindings are messed up causing examples to fail
 ---

 Key: MAHOUT-1118
 URL: https://issues.apache.org/jira/browse/MAHOUT-1118
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 0.8

 Attachments: MAHOUT-1118.patch


 We are routinely seeing the following failures when running the examples on 
 Jenkins and they are due to old SLF4j bindings on Cassandra and HBase:
 {code}
 Training on /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/
 hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running 
 locally
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in 
 [jar:file:/x1/jenkins/jenkins-slave/workspace/Mahout-Examples-Classify-20News/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/x1/jenkins/jenkins-slave/workspace/Mahout-Examples-Classify-20News/trunk/examples/target/dependency/slf4j-jcl-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/x1/jenkins/jenkins-slave/workspace/Mahout-Examples-Classify-20News/trunk/examples/target/dependency/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
 SLF4J: Your binding is version 1.5.5 or earlier.
 SLF4J: Upgrade your binding to version 1.6.x.
 Exception in thread "main" java.lang.NoSuchMethodError: 
 org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
   at org.slf4j.LoggerFactory.bind(LoggerFactory.java:128)
   at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:107)
   at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:295)
   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:269)
   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:281)
   at org.apache.mahout.driver.MahoutDriver.<clinit>(MahoutDriver.java:89)
 Could not find the main class: org.apache.mahout.driver.MahoutDriver.  
 Program will exit.
 Build step 'Execute shell' marked build as failure
 Sending e-mails to: dev@mahout.apache.org ssc.o...@googlemail.com p
 {code}
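
To see which jars contribute a binding on a given classpath, one can list every copy of the binder class that the log above names. This is a diagnostic sketch only: the class Slf4jBindingCheck is hypothetical, and the resource path is the one that appears in the SLF4J messages quoted in this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic: lists every SLF4J binder class visible to a class loader. */
final class Slf4jBindingCheck {

  /** Resource path taken from the SLF4J "Found binding in" messages above. */
  static final String BINDER_RESOURCE = "org/slf4j/impl/StaticLoggerBinder.class";

  /** Returns the URLs of all copies of the given resource on the loader's classpath. */
  static List<URL> findResources(ClassLoader loader, String resource) {
    try {
      return Collections.list(loader.getResources(resource));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<URL> bindings = findResources(ClassLoader.getSystemClassLoader(), BINDER_RESOURCE);
    System.out.println(bindings.size() + " binding(s) found");
    for (URL url : bindings) {
      System.out.println("  " + url);
    }
    if (bindings.size() > 1) {
      System.out.println("More than one binding: exclude all but one from the build.");
    }
  }
}
```

Run with the same classpath as the failing example, it should print the jars listed in the log, which shows where a dependency exclusion is needed.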



[jira] [Updated] (MAHOUT-804) Each page in Mahout's Confluence Wiki has 2 URLs, with differing page styles and search behaviours

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-804:
-

Affects Version/s: 0.6
   0.7
Fix Version/s: 0.8

 Each page in Mahout's Confluence Wiki has 2 URLs, with differing page styles 
 and search behaviours
 --

 Key: MAHOUT-804
 URL: https://issues.apache.org/jira/browse/MAHOUT-804
 Project: Mahout
  Issue Type: Improvement
  Components: Website
Affects Versions: 0.6, 0.7
Reporter: Dan Brickley
  Labels: atlassian, confluence, wiki
 Fix For: 0.8


 There are two styles of URL in circulation for URLs into Mahout's Wiki 
 (presumably an Apache-wide configuration issue):
 https://cwiki.apache.org/MAHOUT/svd-singular-value-decomposition.html vs
 https://cwiki.apache.org/confluence/display/MAHOUT/SVD+-+Singular+Value+Decomposition
 They appear to be the self-same confluence 3.4.9 installation (or its raw 
 filetree). Each has a different search box at the top of the page. The 
 version with 'confluence/' in the path does a confluence search, and returns 
 similar URLs as results. The one with '.html' suffixes does a 
 domain-constrained Google search. 
 Despite markup canonicalising the confluence variant, i.e. <link 
 rel="canonical" 
 href="https://cwiki.apache.org/confluence/display/MAHOUT/SVD+-+Singular+Value+Decomposition">
 appearing in the confluence pages, it seems the Google search results 
 typically throw people into the other version of the Wiki site.
 This is all mildly confusing, mildly annoying but overall mostly harmless. It 
 could be having some negative impact on Google rank and suchlike, since 
 incoming links will be split between the two styles. Maybe this could be 
 passed along to the Wiki admins? 
 Which version does the Mahout team consider canonical URLs (for external 
 links etc)?



[jira] [Updated] (MAHOUT-1136) Cannot import project into eclipse with m2e 1.2

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1136:
--

Fix Version/s: 0.8

 Cannot import project into eclipse with m2e 1.2
 ---

 Key: MAHOUT-1136
 URL: https://issues.apache.org/jira/browse/MAHOUT-1136
 Project: Mahout
  Issue Type: Bug
  Components: build
Affects Versions: 0.7
Reporter: Stevo Slavic
  Labels: m2e
 Fix For: 0.8

 Attachments: MAHOUT-1136.patch


 It seems the fix for MAHOUT-1043 wasn't good: in pluginExecutionFilter, 
 versionRange should be used instead of version.
 Related SO entry: http://stackoverflow.com/a/6701595/381140



[jira] [Updated] (MAHOUT-1088) biased item-based recommender

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1088:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 biased item-based recommender
 -

 Key: MAHOUT-1088
 URL: https://issues.apache.org/jira/browse/MAHOUT-1088
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Sebastian Schelter
Assignee: Sean Owen
 Fix For: 0.8

 Attachments: MAHOUT-1088.patch


 User-item baseline estimation offers a simple yet very effective way to improve 
 the rating prediction of recommenders (see 
 http://dl.acm.org/citation.cfm?id=1644874 for details).
 We should offer an item-based recommender that incorporates this technique.
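
For readers unfamiliar with the technique, a baseline estimate is commonly written as b_ui = mu + b_u + b_i: the global mean rating plus a user offset and an item offset. The sketch below assumes that common formulation, with biases computed as shrunken averages of residuals. The class BaselineSketch and its methods are hypothetical and are not the attached patch or Mahout API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of user-item baseline estimates; not Mahout API. */
final class BaselineSketch {

  /** Each row of ratings is {userId, itemId, rating}. */
  static double globalMean(double[][] ratings) {
    double sum = 0.0;
    for (double[] r : ratings) {
      sum += r[2];
    }
    return sum / ratings.length;
  }

  /**
   * Shrunken mean residual per id in keyColumn (0 = user, 1 = item), after
   * subtracting mu and the already-known bias of the other side.
   * A larger lambda pulls biases of rarely seen ids towards zero.
   */
  static Map<Long, Double> biases(double[][] ratings, int keyColumn, double mu,
                                  Map<Long, Double> otherBias, double lambda) {
    Map<Long, Double> sums = new HashMap<>();
    Map<Long, Integer> counts = new HashMap<>();
    int otherColumn = 1 - keyColumn;
    for (double[] r : ratings) {
      long key = (long) r[keyColumn];
      double residual = r[2] - mu - otherBias.getOrDefault((long) r[otherColumn], 0.0);
      sums.merge(key, residual, Double::sum);
      counts.merge(key, 1, Integer::sum);
    }
    Map<Long, Double> result = new HashMap<>();
    for (Map.Entry<Long, Double> e : sums.entrySet()) {
      result.put(e.getKey(), e.getValue() / (lambda + counts.get(e.getKey())));
    }
    return result;
  }

  /** Baseline estimate mu + b_u + b_i; unknown users or items contribute 0. */
  static double estimate(double mu, Map<Long, Double> userBias,
                         Map<Long, Double> itemBias, long user, long item) {
    return mu + userBias.getOrDefault(user, 0.0) + itemBias.getOrDefault(item, 0.0);
  }
}
```

An item-based recommender could then predict the deviation from this baseline rather than the raw rating, which is one way to read the proposal above.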



[jira] [Updated] (MAHOUT-1131) Can't execute alternative FPG implementation from command line

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1131:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Can't execute alternative FPG implementation from command line
 --

 Key: MAHOUT-1131
 URL: https://issues.apache.org/jira/browse/MAHOUT-1131
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
Reporter: Kirill A. Korinskiy
 Fix For: 0.8

 Attachments: MAHOUT-1131.patch


 When I execute ./bin/mahout fpg -i input -o output -2, the -2 option 
 (execute the alternative FPG implementation) doesn't work.
 The attached patch fixes it.



[jira] [Updated] (MAHOUT-1089) SGD matrix factorization for rating prediction with user and item biases

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1089:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 SGD matrix factorization for rating prediction with user and item biases
 

 Key: MAHOUT-1089
 URL: https://issues.apache.org/jira/browse/MAHOUT-1089
 Project: Mahout
  Issue Type: New Feature
  Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Zeno Gantner
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: MAHOUT-1089.patch, RatingSGDFactorizer.java, 
 RatingSGDFactorizer.java


 A matrix factorization that is trained with standard SGD on all features at 
 the same time, in contrast to ExpectationMaximizationFactorizer, which learns 
 feature by feature.
 In addition to the free features, it models a rating bias for each user and 
 item.
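
As an illustration of the description above, one SGD step for a biased factorization updates both biases and every feature from the same prediction error. The sketch below uses the standard L2-regularized update rule. The class BiasedSgdSketch, its field names, and the initialization are assumptions made for illustration and are not taken from the attached RatingSGDFactorizer.

```java
import java.util.Random;

/** Hypothetical sketch of SGD matrix factorization with biases; not the attached patch. */
final class BiasedSgdSketch {
  private final double mu;
  private final double learningRate;
  private final double regularization;
  private final double[] userBias;
  private final double[] itemBias;
  private final double[][] userFactors;
  private final double[][] itemFactors;

  BiasedSgdSketch(int numUsers, int numItems, int numFeatures, double mu,
                  double learningRate, double regularization, long seed) {
    this.mu = mu;
    this.learningRate = learningRate;
    this.regularization = regularization;
    Random random = new Random(seed);
    userBias = new double[numUsers];
    itemBias = new double[numItems];
    userFactors = smallRandom(numUsers, numFeatures, random);
    itemFactors = smallRandom(numItems, numFeatures, random);
  }

  private static double[][] smallRandom(int rows, int cols, Random random) {
    double[][] m = new double[rows][cols];
    for (double[] row : m) {
      for (int f = 0; f < cols; f++) {
        row[f] = 0.1 * random.nextGaussian();
      }
    }
    return m;
  }

  /** Prediction: global mean + user bias + item bias + dot product of the factors. */
  double predict(int user, int item) {
    double sum = mu + userBias[user] + itemBias[item];
    for (int f = 0; f < userFactors[user].length; f++) {
      sum += userFactors[user][f] * itemFactors[item][f];
    }
    return sum;
  }

  /** One SGD step on a single rating; all features are updated in the same step. */
  void update(int user, int item, double rating) {
    double err = rating - predict(user, item);
    userBias[user] += learningRate * (err - regularization * userBias[user]);
    itemBias[item] += learningRate * (err - regularization * itemBias[item]);
    for (int f = 0; f < userFactors[user].length; f++) {
      double pu = userFactors[user][f];
      double qi = itemFactors[item][f];
      userFactors[user][f] += learningRate * (err * qi - regularization * pu);
      itemFactors[item][f] += learningRate * (err * pu - regularization * qi);
    }
  }

  /** Root mean squared error over {user, item} pairs and their ratings. */
  double rmse(int[][] pairs, double[] ratings) {
    double sum = 0.0;
    for (int k = 0; k < ratings.length; k++) {
      double err = ratings[k] - predict(pairs[k][0], pairs[k][1]);
      sum += err * err;
    }
    return Math.sqrt(sum / ratings.length);
  }
}
```

Training is then a loop over epochs that calls update for each observed rating. Learning feature by feature would instead fit one column of the factor matrices at a time.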



[jira] [Updated] (MAHOUT-1106) SVD++

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1106:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 SVD++
 -

 Key: MAHOUT-1106
 URL: https://issues.apache.org/jira/browse/MAHOUT-1106
 Project: Mahout
  Issue Type: New Feature
  Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Zeno Gantner
Assignee: Sebastian Schelter
 Fix For: 0.8

 Attachments: SVDPlusPlusFactorizer.java


 Initial shot at SVD++.
 Relies on the RatingSGDFactorizer class introduced in MAHOUT-1089.
 One could also think about several enhancements, e.g. having separate 
 regularization constants for user and item factors.
 I am also the author of the SVDPlusPlus class in MyMediaLite, so if there are 
 any similarities, no need to worry -- I am okay with relicensing this to the 
 Apache 2.0 license.
 https://github.com/zenogantner/MyMediaLite/blob/master/src/MyMediaLite/RatingPrediction/SVDPlusPlus.cs
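
For orientation, the SVD++ prediction rule as usually published extends the biased model with implicit feedback: the user vector p_u is augmented by the normalized sum of one vector y_j per item j the user has rated. The following sketch shows only that prediction step. The class SvdPlusPlusSketch and the parameter names are hypothetical and are not taken from the attached SVDPlusPlusFactorizer.

```java
/** Hypothetical sketch of the SVD++ prediction rule; not the attached factorizer. */
final class SvdPlusPlusSketch {

  /**
   * Computes mu + b_u + b_i + q_i . (p_u + |N(u)|^(-1/2) * sum of y_j),
   * where implicitFactors holds one vector y_j per item rated by the user.
   */
  static double predict(double mu, double userBias, double itemBias,
                        double[] userFactors, double[] itemFactors,
                        double[][] implicitFactors) {
    int numFeatures = userFactors.length;
    double[] combined = userFactors.clone();
    if (implicitFactors.length > 0) {
      double norm = 1.0 / Math.sqrt(implicitFactors.length);
      for (double[] y : implicitFactors) {
        for (int f = 0; f < numFeatures; f++) {
          combined[f] += norm * y[f];
        }
      }
    }
    double dot = 0.0;
    for (int f = 0; f < numFeatures; f++) {
      dot += itemFactors[f] * combined[f];
    }
    return mu + userBias + itemBias + dot;
  }

  public static void main(String[] args) {
    double[][] implicit = {{1, 1}, {1, -1}, {0, 0}, {2, 0}};
    double prediction = predict(3.0, 0.5, -0.2,
        new double[] {1, 0}, new double[] {0.5, 2}, implicit);
    System.out.println("prediction = " + prediction);
  }
}
```

Separate regularization constants for user and item factors, as suggested above, would only affect the training updates, not this prediction rule.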



[jira] [Updated] (MAHOUT-852) Upgrade Lucene dependency to 3.4

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-852:
-

Affects Version/s: 0.5
Fix Version/s: 0.6

 Upgrade Lucene dependency to 3.4
 

 Key: MAHOUT-852
 URL: https://issues.apache.org/jira/browse/MAHOUT-852
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.5
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Trivial
 Fix For: 0.6


 As the title says, commit coming shortly once the tests are done running



[jira] [Updated] (MAHOUT-1194) Allow to change java target version during the build

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1194:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Allow to change java target version during the build
 

 Key: MAHOUT-1194
 URL: https://issues.apache.org/jira/browse/MAHOUT-1194
 Project: Mahout
  Issue Type: Task
Affects Versions: 0.7
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Priority: Minor
 Fix For: 0.8

 Attachments: bugMAHOUT-1194.patch


 It seems that the current build has a hard-coded Java target for JDK6. I think 
 that it would be useful to parametrise that, so that it can be easily 
 overridden on the command line.



[jira] [Updated] (MAHOUT-1111) Logging bindings not working in current trunk as of github 2012-November-9 18:41

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1111:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 Logging bindings not working in current trunk as of github 2012-November-9 
 18:41
 

 Key: MAHOUT-1111
 URL: https://issues.apache.org/jira/browse/MAHOUT-1111
 Project: Mahout
  Issue Type: Bug
  Components: build, Examples
Affects Versions: 0.7
 Environment: == Most Recent Commit
 commit 1743c1521679daab600a982be6e53751730e
 Author: Paritosh Ranjan pran...@apache.org
 Date:   Thu Nov 1 13:02:03 2012 +
 MAHOUT-1109, Creatinng parent directories if not present while creating 
 file
 
 git-svn-id: https://svn.apache.org/repos/asf/mahout/trunk@1404572 
 13f79535-4
 
 github runs behind svn, apologies if this is fixed. I can't find an online 
 svn commit log in the apache SVN server.
Reporter: Lance Norskog
Assignee: Sebastian Schelter
Priority: Blocker
 Fix For: 0.8

 Attachments: multiple-slf4j.patch


 Current commit is 1743c1521679daab600a982be6e53751730e
 On trunk, running examples/bin/classify-20newsgroups.sh gives this error:
 {noformat}
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
 SLF4J: Your binding is version 1.5.5 or earlier.
 SLF4J: Upgrade your binding to version 1.6.x.
 Exception in thread "main" java.lang.NoSuchMethodError: 
 org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
   at org.slf4j.LoggerFactory.bind(LoggerFactory.java:128)
   at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:107)
   at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:295)
   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:269)
   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:281)
   at org.apache.mahout.driver.MahoutDriver.<clinit>(MahoutDriver.java:89)
 {noformat}
 Marked Blocker since script just plain does not run.
 Here is the complete trace from running the script under shell's -x option:
 {noformat}
 @mac bin [trunk] $ sh -x classify-20newsgroups.sh 
 + '[' '' = --help ']'
 + '[' '' = '--?' ']'
 + SCRIPT_PATH=classify-20newsgroups.sh
 + '[' classify-20newsgroups.sh '!=' classify-20newsgroups.sh ']'
 ++ pwd
 + START_PATH=/Users/lancenorskog/Documents/open/mahout/examples/bin
 + WORK_DIR=/tmp/mahout-work-lancenorskog
 + algorithm=(cnaivebayes naivebayes sgd clean)
 + '[' -n '' ']'
 + echo 'Please select a number to choose the corresponding task to run'
 Please select a number to choose the corresponding task to run
 + echo '1. cnaivebayes'
 1. cnaivebayes
 + echo '2. naivebayes'
 2. naivebayes
 + echo '3. sgd'
 3. sgd
 + echo '4. clean -- cleans up the work area in /tmp/mahout-work-lancenorskog'
 4. clean -- cleans up the work area in /tmp/mahout-work-lancenorskog
 + read -p 'Enter your choice : ' choice
 Enter your choice : 1
 + echo 'ok. You chose 1 and we'\''ll use cnaivebayes'
 ok. You chose 1 and we'll use cnaivebayes
 + alg=cnaivebayes
 + echo 'creating work directory at /tmp/mahout-work-lancenorskog'
 creating work directory at /tmp/mahout-work-lancenorskog
 + mkdir -p /tmp/mahout-work-lancenorskog
 + '[' '!' -e /tmp/mahout-work-lancenorskog/20news-bayesinput ']'
 + '[' '!' -e /tmp/mahout-work-lancenorskog/20news-bydate ']'
 + cd /Users/lancenorskog/Documents/open/mahout/examples/bin
 + cd ../..
 + set -e
 + '[' xcnaivebayes == xnaivebayes -o xcnaivebayes == xcnaivebayes ']'
 + c=
 + '[' xcnaivebayes == xcnaivebayes ']'
 + c=' -c'
 + set -x
 + echo 'Preparing 20newsgroups data'
 Preparing 20newsgroups data
 + rm -rf /tmp/mahout-work-lancenorskog/20news-all
 + mkdir /tmp/mahout-work-lancenorskog/20news-all
 + cp -R 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/alt.atheism 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/comp.graphics 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/comp.os.ms-windows.misc
  
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/comp.sys.ibm.pc.hardware
  
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/comp.sys.mac.hardware
  
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/comp.windows.x 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/misc.forsale 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/rec.autos 
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/rec.motorcycles
  
 /tmp/mahout-work-lancenorskog/20news-bydate/20news-bydate-test/rec.sport.baseball
  
 

[jira] [Updated] (MAHOUT-1087) ExpectationMaximizationSVDFactorizer doesn't do expectation maximization

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1087:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 ExpectationMaximizationSVDFactorizer doesn't do expectation maximization
 

 Key: MAHOUT-1087
 URL: https://issues.apache.org/jira/browse/MAHOUT-1087
 Project: Mahout
  Issue Type: Improvement
  Components: Collaborative Filtering
Affects Versions: 0.7
Reporter: Sebastian Schelter
Assignee: Sean Owen
 Fix For: 0.8


 This factorizer simply learns the user and item features via SGD as described 
 in Simon Funk's famous blogpost, which is not expectation maximization, so I 
 suggest we rename it to FunkSVD.



Updating JIRA issues

2013-07-25 Thread Sean Owen
IIRC when you bulk update issues, you can choose to *not* send e-mail.
Might be good if affecting many at once like this!


[jira] [Updated] (MAHOUT-1083) CIReducer in kmeans doesn't work well

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1083:
--

Affects Version/s: 0.7
Fix Version/s: 0.8

 CIReducer in kmeans doesn't work well
 -

 Key: MAHOUT-1083
 URL: https://issues.apache.org/jira/browse/MAHOUT-1083
 Project: Mahout
  Issue Type: Bug
Affects Versions: 0.7
 Environment: hadoop-2.0.0-alpha: pseudo cluster and single node cluster
 hadoop-1.0.3: pseudo cluster
 hadoop-0.20.2: pseudo cluster
 mahout: mahout-0.7
 os: ubuntu 11.04
 jdk: jdk1.6.0_27
Reporter: liutengfei
 Fix For: 0.8

 Attachments: MAHOUT-1083.patch


 The function reduce in mahout-0.7-kmeans-CIReducer.java doesn't work the 
 way it looks like it should.
   protected void reduce(IntWritable key, Iterable<ClusterWritable> values, 
       Context context) throws IOException, InterruptedException {
     Iterator<ClusterWritable> iter = values.iterator();
     ClusterWritable first = null;
     while (iter.hasNext()) {
       ClusterWritable cw = iter.next();
       if (first == null) {
         first = cw;
       } else {
         first.getValue().observe(cw.getValue());
       }
     }
     List<Cluster> models = new ArrayList<Cluster>();
     models.add(first.getValue());
     classifier = new ClusterClassifier(models, policy);
     classifier.close();
     context.write(key, first);
   }
 Apparently, the variable first is meant to collect all the output data of the 
 maps. In fact, however, the value of first changes after the statement 
 ClusterWritable cw = iter.next(); and is then the same as the new variable cw. 
 I don't know why, but the results show that the code behaves as if it were 
 written like this: ClusterWritable cw = first = iter.next();.
 Is cw a reference to iter?
 Does iter.next() just change the value of iter itself to the next?
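
One plausible explanation, assuming standard Hadoop behaviour: the framework reuses a single value instance while iterating over the reduce values, so a reference saved from an earlier next() call later shows the contents of the most recent record. The sketch below reproduces the effect without Hadoop. The classes ReusedValueDemo and Box are hypothetical stand-ins for the reducer and for ClusterWritable.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a reduce-side iterator that reuses one value object. */
final class ReusedValueDemo {

  /** Mutable holder, playing the role of a Writable. */
  static final class Box {
    int value;
    Box(int value) { this.value = value; }
    Box copy() { return new Box(value); }
  }

  /** Iterator that refills and returns the same Box on every call to next(). */
  static Iterator<Box> reusingIterator(final int[] data) {
    final Box shared = new Box(0);
    return new Iterator<Box>() {
      private int pos = 0;
      @Override public boolean hasNext() { return pos < data.length; }
      @Override public Box next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        shared.value = data[pos++];
        return shared;
      }
      @Override public void remove() { throw new UnsupportedOperationException(); }
    };
  }

  /** Keeps a reference to the first element; later next() calls overwrite it. */
  static int firstByReference(int[] data) {
    Iterator<Box> iter = reusingIterator(data);
    Box first = null;
    while (iter.hasNext()) {
      Box current = iter.next();
      if (first == null) {
        first = current;
      }
    }
    return first.value;
  }

  /** Copies the first element, so later next() calls cannot change it. */
  static int firstByCopy(int[] data) {
    Iterator<Box> iter = reusingIterator(data);
    Box first = null;
    while (iter.hasNext()) {
      Box current = iter.next();
      if (first == null) {
        first = current.copy();
      }
    }
    return first.value;
  }
}
```

With input {1, 2, 3}, firstByReference returns 3 and firstByCopy returns 1. Under the same assumption, the reducer would need to copy the first ClusterWritable before observing the remaining values.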



Re: Updating JIRA issues

2013-07-25 Thread Suneel Marthi
Sorry about that... will keep in mind the next time.





 From: Sean Owen sro...@gmail.com
To: Mahout Dev List dev@mahout.apache.org 
Sent: Thursday, July 25, 2013 8:46 AM
Subject: Updating JIRA issues
 

IIRC when you bulk update issues, you can choose to *not* send e-mail.
Might be good if affecting many at once like this!

Re: 0.8

2013-07-25 Thread Suneel Marthi
With Isabel's help, updated the 0.8 Release notes on the Wiki and below is the 
text version of the Release notes. 

Check out the Wiki version at

https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8

---

The Apache Mahout PMC is pleased to announce the release of Mahout 0.8. 
Mahout's goal is to build scalable machine learning libraries focused 
primarily in the areas of collaborative filtering (recommenders), 
clustering and classification (known as the 3Cs), as well as the 
necessary infrastructure to support those implementations including, but 
not limited to, math packages for statistics, linear algebra and others 
as well as Java primitive collections, local and distributed vector and 
matrix classes and a variety of integrative code to work with popular 
packages like Apache Hadoop, Apache Lucene, Apache HBase, Apache 
Cassandra and much more. The 0.8 release is mainly a cleanup release in 
preparation for an upcoming 1.0 release, but there are several 
significant new features, which are highlighted below.

To get started with Apache Mahout 0.8, download the release artifacts and 
signatures at http://www.apache.org/dyn/closer.cgi/mahout. The examples 
directory contains several working examples of the core 
functionality available in Mahout. These can be run via scripts in the 
examples/bin directory. Most examples do not need a Hadoop cluster in 
order to run.

Please pay attention to the section labelled FUTURE PLANS below for more 
information about upcoming releases of Mahout.

As with any release, we wish to thank all of the users and contributors 
to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for 
individual credits, as there are too many to list here.

RELEASE HIGHLIGHTS

The highlights of the Apache Mahout 0.8 release include, but are not 
limited to the list below. For further information, see the included 
CHANGELOG file.

- Numerous performance improvements to Vector and Matrix 
implementations, APIs and their iterators (see also MAHOUT-1192, 
MAHOUT-1202)
- Numerous performance improvements to the recommender implementations 
(see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151, 
MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
- MAHOUT-1088: Support for biased item-based recommender
- MAHOUT-1089: SGD matrix factorization for rating prediction with user and 
item biases
- MAHOUT-1106: Support for SVD++
- MAHOUT-944: Support for converting one or more Lucene storage indexes 
to SequenceFiles as well as an upgrade of the supported Lucene version 
to Lucene 4.3.1.
- MAHOUT-1154 and friends: New streaming k-means implementation that offers 
on-line (and fast) clustering
- MAHOUT-833: Make conversion to SequenceFiles Map-Reduce, 'seqdirectory' can 
now be run as a MapReduce job.
- MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension of 
vector to hash (indexes or values).
- MAHOUT-884: Matrix Concat utility, presently only concatenates two matrices.
- MAHOUT-1244: Upgraded to use Lucene 4.3
- MAHOUT-1187: Upgraded to CommonsLang3
- MAHOUT-916: Speedup the Mahout build by making tests run in parallel.
- The usual bug fixes. See JIRA [2] for more
information on the 0.8 release.

A total of 218 separate JIRA issues are addressed in this release.

CONTRIBUTING

Mahout is always looking for contributions focused on the 3Cs. If you are 
interested in contributing, please see the How to contribute page 
(https://cwiki.apache.org/MAHOUT/how-to-contribute.html) on the Mahout wiki or 
contact us via email at dev@mahout.apache.org.

FUTURE PLANS

0.9

As the project moves towards a 1.0 release, the community is working to 
clean up and/or remove parts of the code base that are under-supported 
or that underperform as well as to better focus the energy and 
contributions on key algorithms that are proven to scale in production 
and have seen wide-spread adoption. To this end, in the next release, 
the project is planning on removing support for the following algorithms
 unless there is sustained support and improvement of them before the 
next release.

The algorithms to be removed are:
- From Clustering:
Dirichlet
MeanShift
MinHash
Eigencuts
- From Classification (both are sequential implementations)
Winnow
Perceptron
- Frequent Pattern Mining
- Collaborative Filtering
All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and 
org.apache.mahout.cf.taste.impl.recommender.slopeone
Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender
- Mahout Math
Lanczos in favour of SSVD
Hadoop entropy stuff in org.apache.mahout.math.stats.entropy

If you are interested in supporting one or more of these algorithms, please make 
it known on dev@mahout.apache.org and via JIRA issues that fix and/or improve 
them. Please also provide 
supporting 

Re: 0.8

2013-07-25 Thread Grant Ingersoll
Awesome, I will send out the announcement as soon as I check the mirrors.

On Jul 25, 2013, at 2:44 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:

 With Isabel's help, updated the 0.8 Release notes on the Wiki and below is 
 the text version of the Release notes. 
 
 Checkout the Wiki version at
 
 https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
 

Apache Mahout 0.8 Released

2013-07-25 Thread Grant Ingersoll
The Apache Mahout PMC is pleased to announce the release of Mahout 0.8. 
Mahout's goal is to build scalable machine learning libraries focused 
primarily in the areas of collaborative filtering (recommenders), 
clustering and classification (known collectively as the 3Cs), as well as the 
necessary infrastructure to support those implementations including, but
not limited to, math packages for statistics, linear algebra and others
as well as Java primitive collections, local and distributed vector and
matrix classes and a variety of integrative code to work with popular 
packages like Apache Hadoop, Apache Lucene, Apache HBase, Apache 
Cassandra and much more. The 0.8 release is mainly a cleanup release in
preparation for an upcoming 1.0 release, but there are several 
significant new features, which are highlighted below.

To get started with Apache Mahout 0.8, download the release artifacts and 
signatures at http://www.apache.org/dyn/closer.cgi/mahout or visit the central 
Maven repository. 

In addition to the release highlights and artifacts, please pay attention to 
the section labelled FUTURE PLANS below for more information about upcoming 
releases of Mahout.

As with any release, we wish to thank all of the users and contributors 
to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for 
individual credits, as there are too many to list here.

GETTING STARTED

In the release package, the examples directory contains several working 
examples of the core functionality available in Mahout. These can be run via 
scripts in the examples/bin directory, which will prompt you for more 
information to help you try things out. Most examples do not need a Hadoop 
cluster in order to run.

RELEASE HIGHLIGHTS

The highlights of the Apache Mahout 0.8 release include, but are not 
limited to, the list below. For further information, see the included 
CHANGELOG file.

- Numerous performance improvements to Vector and Matrix 
implementations, APIs and their iterators (see also MAHOUT-1192, 
MAHOUT-1202)
- Numerous performance improvements to the recommender implementations 
(see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151, 
MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
- MAHOUT-1088: Support for biased item-based recommender
- MAHOUT-1089: SGD matrix factorization for rating prediction with user and 
item biases
- MAHOUT-1106: Support for SVD++
- MAHOUT-944: Support for converting one or more Lucene storage indexes 
to SequenceFiles as well as an upgrade of the supported Lucene version 
to Lucene 4.3.1.
- MAHOUT-1154 and friends: New streaming k-means implementation that offers 
on-line (and fast) clustering
- MAHOUT-833: Conversion to SequenceFiles now supports MapReduce: 'seqdirectory' can 
now be run as a MapReduce job.
- MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension of 
vector to hash (indexes or values).
- MAHOUT-884: Matrix Concat utility, presently only concatenates two matrices.
- MAHOUT-1244: Upgraded to use Lucene 4.3
- MAHOUT-1187: Upgraded to CommonsLang3
- MAHOUT-916: Speed up the Mahout build by making tests run in parallel.
- The usual bug fixes. See JIRA [2] for more
information on the 0.8 release.

A total of 218 separate JIRA issues are addressed in this release.

CONTRIBUTING

Mahout is always looking for contributions focused on the 3Cs. If you are 
interested in contributing, please see our contribution page, 
https://cwiki.apache.org/MAHOUT/how-to-contribute.html, on the Mahout wiki or 
contact us via email at dev@mahout.apache.org.

FUTURE PLANS

0.9

As the project moves towards a 1.0 release, the community is working to 
clean up and/or remove parts of the code base that are under-supported 
or that underperform as well as to better focus the energy and 
contributions on key algorithms that are proven to scale in production 
and have seen widespread adoption. To this end, in the next release, 
the project is planning on removing support for the following algorithms
unless there is sustained support and improvement of them before the 
next release.

The algorithms to be removed are:
- From Clustering:
  Dirichlet
  MeanShift
  MinHash
  Eigencuts

- From Classification (both are sequential implementations)
  Winnow
  Perceptron

- Frequent Pattern Mining

- Collaborative Filtering
  All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
  SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and 
  org.apache.mahout.cf.taste.impl.recommender.slopeone
  Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
  TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender

- Mahout Math
  Lanczos in favour of SSVD
  Hadoop entropy stuff in org.apache.mahout.math.stats.entropy

If you are interested in supporting one or more of these algorithms, please make 
it known on dev@mahout.apache.org and via JIRA issues that fix and/or improve 
them. Please also provide 
supporting evidence as to their 

[jira] [Updated] (MAHOUT-1291) MahoutDriver yields cosmetically suboptimal exception when bin/mahout runs without args, on some Hadoop versions

2013-07-25 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated MAHOUT-1291:
--

Attachment: MAHOUT-1291.patch

 MahoutDriver yields cosmetically suboptimal exception when bin/mahout runs 
 without args, on some Hadoop versions
 

 Key: MAHOUT-1291
 URL: https://issues.apache.org/jira/browse/MAHOUT-1291
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Sean Owen
Priority: Trivial
 Fix For: 0.9

 Attachments: MAHOUT-1291.patch


 If you run bin/mahout without arguments, an error is correctly displayed 
 about lack of an argument. The part that displays the error is actually 
 within Hadoop code. In some versions of Hadoop, in the error case, it will 
 quit the JVM with System.exit(). In others, it does not.
 In the calling code in MahoutDriver, in this error case, the main() method 
 does not actually return. So, for versions where Hadoop code doesn't 
 immediately exit the JVM, execution continues. This yields another exception. 
 It's pretty harmless but ugly.
 Attached is a one-line fix, to return from main() in the error case, which is 
 more correct to begin with.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAHOUT-1291) MahoutDriver yields cosmetically suboptimal exception when bin/mahout runs without args, on some Hadoop versions

2013-07-25 Thread Sean Owen (JIRA)
Sean Owen created MAHOUT-1291:
-

 Summary: MahoutDriver yields cosmetically suboptimal exception 
when bin/mahout runs without args, on some Hadoop versions
 Key: MAHOUT-1291
 URL: https://issues.apache.org/jira/browse/MAHOUT-1291
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Sean Owen
Priority: Trivial
 Fix For: 0.9
 Attachments: MAHOUT-1291.patch

If you run bin/mahout without arguments, an error is correctly displayed about 
lack of an argument. The part that displays the error is actually within Hadoop 
code. In some versions of Hadoop, in the error case, it will quit the JVM with 
System.exit(). In others, it does not.

In the calling code in MahoutDriver, in this error case, the main() method does 
not actually return. So, for versions where Hadoop code doesn't immediately 
exit the JVM, execution continues. This yields another exception. It's pretty 
harmless but ugly.

Attached is a one-line fix, to return from main() in the error case, which is 
more correct to begin with.
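
The control flow described above can be illustrated with a minimal sketch. It 
does not use the real MahoutDriver or Hadoop classes: DriverSketch, reportError, 
runWithoutReturn and runWithReturn are hypothetical names, and reportError 
stands in for library code that reports the error but does not exit the JVM. 
The actual patch may look different.

```java
import java.util.ArrayList;
import java.util.List;

public class DriverSketch {
    // Records what happened so the two variants can be compared.
    public static final List<String> log = new ArrayList<>();

    // Hypothetical stand-in for library code that reports the problem
    // but does NOT call System.exit().
    public static void reportError(String message) {
        log.add("error: " + message);
    }

    // Variant without the fix: execution falls through after the error is reported.
    public static void runWithoutReturn(String[] args) {
        if (args.length < 1) {
            reportError("no program name given");
        }
        try {
            log.add("running " + args[0]); // throws when args is empty
        } catch (ArrayIndexOutOfBoundsException e) {
            log.add("second exception: " + e.getClass().getSimpleName());
        }
    }

    // Variant with the fix: return as soon as the error has been reported.
    public static void runWithReturn(String[] args) {
        if (args.length < 1) {
            reportError("no program name given");
            return;
        }
        log.add("running " + args[0]);
    }

    public static void main(String[] args) {
        runWithoutReturn(new String[0]);
        System.out.println(log);
        log.clear();
        runWithReturn(new String[0]);
        System.out.println(log);
    }
}
```

In the first variant the log ends up with two entries (the reported error plus 
a follow-on exception); in the second it contains only the reported error.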



[jira] [Updated] (MAHOUT-1290) Issue when running Mahout Recommender Demo

2013-07-25 Thread Helder Garay Martins (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helder Garay Martins updated MAHOUT-1290:
-

Labels: newdev patch  (was: )
Status: Patch Available  (was: Open)

- Added Jetty dependency to the examples module, and removed it from 
integration module
- Added Jetty conf file to the examples module, and removed it from integration 
module

 Issue when running Mahout Recommender Demo
 --

 Key: MAHOUT-1290
 URL: https://issues.apache.org/jira/browse/MAHOUT-1290
 Project: Mahout
  Issue Type: Bug
  Components: Examples
Affects Versions: 0.8
Reporter: Suneel Marthi
  Labels: patch, newdev
 Fix For: 0.9


 When running jetty:run under *mahout-integration*, a ClassNotFoundException 
 is thrown for
  org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommender.
 The problem occurs because the webapp folder wasn't moved to the examples 
 dir and the Jetty dependency wasn't added as a Maven plugin when the 
 GroupLens example moved to the examples submodule. 



[jira] [Updated] (MAHOUT-1290) Issue when running Mahout Recommender Demo

2013-07-25 Thread Helder Garay Martins (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helder Garay Martins updated MAHOUT-1290:
-

Attachment: MAHOUT-1290.patch

I'm not sure if the patch was sent before, so I'm sending it here again just to 
be sure.

 Issue when running Mahout Recommender Demo
 --

 Key: MAHOUT-1290
 URL: https://issues.apache.org/jira/browse/MAHOUT-1290
 Project: Mahout
  Issue Type: Bug
  Components: Examples
Affects Versions: 0.8
Reporter: Suneel Marthi
  Labels: newdev, patch
 Fix For: 0.9

 Attachments: MAHOUT-1290.patch


 When running jetty:run under *mahout-integration*, a ClassNotFoundException 
 is thrown for
  org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommender.
 The problem occurs because the webapp folder wasn't moved to the examples 
 dir and the Jetty dependency wasn't added as a Maven plugin when the 
 GroupLens example moved to the examples submodule. 



[jira] [Created] (MAHOUT-1292) lucene2seq creates single document from index

2013-07-25 Thread Liz Merkhofer (JIRA)
Liz Merkhofer created MAHOUT-1292:
-

 Summary: lucene2seq creates single document from index
 Key: MAHOUT-1292
 URL: https://issues.apache.org/jira/browse/MAHOUT-1292
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.8
Reporter: Liz Merkhofer


Lucene2seq creates only one sequencefile, rather than a file for each document 
in the index.

Running lucene2seq on my Solr (4.3) index produces a file with a header and, it 
seems, the field I specified from the index, concatenated for all the 
documents. After running this through seq2sparse and rowid (to prepare for 
cvb), the resulting matrix has only one row, though it should create one row 
per document.

This issue prevents, at least, data from a Lucene index from being easily used 
as input for cvb. Lucene.vector is also currently inadequate: the keys to its 
sequence files are LongWritable, and rowid will convert only Text keys to 
IntWritable, as is necessary for the keys in cvb.
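
To make the expected shape concrete, here is a small sketch of the kind of key 
conversion a rowid-style step performs: each document key gets a sequential 
integer row index, so N input records should give N rows. It uses plain Java 
collections rather than Hadoop or Mahout classes, and RowIdSketch and 
assignRowIds are hypothetical names, not the actual implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowIdSketch {
    // Maps a sequential int row index to each document key, in input order.
    public static Map<Integer, String> assignRowIds(List<String> docKeys) {
        Map<Integer, String> docIndex = new LinkedHashMap<>();
        int row = 0;
        for (String key : docKeys) {
            docIndex.put(row++, key);
        }
        return docIndex;
    }

    public static void main(String[] args) {
        // Three separate records in, three rows out (the expected behaviour).
        System.out.println(assignRowIds(Arrays.asList("doc-a", "doc-b", "doc-c")));
        // One concatenated record in, one row out (the symptom reported here).
        System.out.println(assignRowIds(Arrays.asList("doc-a doc-b doc-c")));
    }
}
```

If the upstream step emits a single record for the whole index, a conversion 
like this can only ever produce a single row, which matches the one-row matrix 
seen above.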



[jira] [Updated] (MAHOUT-1292) lucene2seq creates single document from index

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi updated MAHOUT-1292:
--

Fix Version/s: 0.9

 lucene2seq creates single document from index
 -

 Key: MAHOUT-1292
 URL: https://issues.apache.org/jira/browse/MAHOUT-1292
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.8
Reporter: Liz Merkhofer
Assignee: Suneel Marthi
  Labels: cvb, lucene, solr
 Fix For: 0.9


 Lucene2seq creates only one sequencefile, rather than a file for each 
 document in the index.
 Running lucene2seq on my Solr (4.3) index produces a file with a header and, 
 it seems, the field I specified from the index, concatenated for all the 
 documents. After running this through seq2sparse and rowid (to prepare for 
 cvb), the resulting matrix has only one row, though it should create one row 
 per document.
 This issue prevents, at least, data from a Lucene index from being easily 
 used as input for cvb. Lucene.vector is also currently inadequate: the keys 
 to its sequence files are LongWritable, and rowid will convert only Text keys 
 to IntWritable, as is necessary for the keys in cvb.



[jira] [Assigned] (MAHOUT-1292) lucene2seq creates single document from index

2013-07-25 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi reassigned MAHOUT-1292:
-

Assignee: Suneel Marthi

 lucene2seq creates single document from index
 -

 Key: MAHOUT-1292
 URL: https://issues.apache.org/jira/browse/MAHOUT-1292
 Project: Mahout
  Issue Type: Bug
  Components: Integration
Affects Versions: 0.8
Reporter: Liz Merkhofer
Assignee: Suneel Marthi
  Labels: cvb, lucene, solr

 Lucene2seq creates only one sequencefile, rather than a file for each 
 document in the index.
 Running lucene2seq on my Solr (4.3) index produces a file with a header and, 
 it seems, the field I specified from the index, concatenated for all the 
 documents. After running this through seq2sparse and rowid (to prepare for 
 cvb), the resulting matrix has only one row, though it should create one row 
 per document.
 This issue prevents, at least, data from a Lucene index from being easily 
 used as input for cvb. Lucene.vector is also currently inadequate: the keys 
 to its sequence files are LongWritable, and rowid will convert only Text keys 
 to IntWritable, as is necessary for the keys in cvb.



Build failed in Jenkins: Mahout-Examples-Cluster-Reuters-II #553

2013-07-25 Thread Apache Jenkins Server
See https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters-II/553/

--
[...truncated 2174 lines...]
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/ByteBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/CharBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/IntBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/ShortBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/LongBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/FloatBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/buffer/DoubleBufferConsumer.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/ByteArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/CharArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/IntArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/ShortArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/LongArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/FloatArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/DoubleArrayList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractByteList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractCharList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractIntList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractShortList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractLongList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractFloatList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/list/AbstractDoubleList.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteByteProcedure.java
[INFO] Writing to 

Re: 0.8

2013-07-25 Thread Dmitriy Lyubimov
On Thu, Jul 25, 2013 at 6:44 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:

 With Isabel's help, updated the 0.8 Release notes on the Wiki and below is
 the text version of the Release notes.

 Checkout the Wiki version at

 https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8

 ---

 The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
 Mahout's goal is to build scalable machine learning libraries focused
 primarily in the areas of collaborative filtering (recommenders),
 clustering and classification (known as the 3Cs), as well as the
 necessary infrastructure to support those implementations including, but
  not limited to, math packages for statistics, linear algebra and others
  as well as Java primitive collections, local and distributed vector and
  matrix classes and a variety of integrative code to work with popular
 packages like Apache Hadoop, Apache Lucene, Apache HBase, Apache
 Cassandra and much more. The 0.8 release is mainly a clean up release in
  preparation for an upcoming 1.0 release, but there are several
 significant new features, which are highlighted below.

 To get started with Apache Mahout 0.8, download the release artifacts and
 signatures at http://www.apache.org/dyn/closer.cgi/mahout. The examples
 directory contains several working examples of the core
 functionality available in Mahout. These can be run via scripts in the
 examples/bin directory. Most examples do not need a Hadoop cluster in
 order to run.

 Please pay attention to the section labelled FUTURE PLANS below for more
 information about upcoming releases of Mahout.

 As with any release, we wish to thank all of the users and contributors
 to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for
 individual credits, as there are too many to list here.

 RELEASE HIGHLIGHTS

 The highlights of the Apache Mahout 0.8 release include, but are not
 limited to the list below. For further information, see the included
 CHANGELOG file.

 - Numerous performance improvements to Vector and Matrix
 implementations, API's and their iterators (see also MAHOUT-1192,
 MAHOUT-1202)
 - Numerous performance improvements to the recommender implementations
 (see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151,
 MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
 - MAHOUT-1088: Support for biased item-based recommender
 - MAHOUT-1089: SGD matrix factorization for rating prediction with user
 and item biases
 - MAHOUT-1106: Support for SVD++
 - MAHOUT-944: Support for converting one or more Lucene storage indexes
 to SequenceFiles as well as an upgrade of the supported Lucene version
 to Lucene 4.3.1.
 - MAHOUT-1154 and friends: New streaming k-means implementation that
 offers on-line (and fast) clustering
 - MAHOUT-833: Make conversion to SequenceFiles Map-Reduce, 'seqdirectory'
 can now be run as a MapReduce job.
 - MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension
 of vector to hash (indexes or values).
 - MAHOUT-884: Matrix Concat utility, presently only concatenates two
 matrices.
 - MAHOUT-1244: Upgraded to use Lucene 4.3
 - MAHOUT-1187: Upgraded to CommonsLang3
 - MAHOUT-916: Speedup the Mahout build by making tests run in parallel.
 - The usual bug fixes. See JIRA [2] for more
 information on the 0.8 release.

 A total of 218 separate JIRA issues are addressed in this release.

 CONTRIBUTING

 Mahout is always looking for contributions focused on the 3Cs. If you are
 interested in contributing, please see our
 https://cwiki.apache.org/MAHOUT/how-to-contribute.html on the Mahout wiki
 or contact us via email at dev@mahout.apache.org.

 FUTURE PLANS

 0.9

 As the project moves towards a 1.0 release, the community is working to
 clean up and/or remove parts of the code base that are under-supported
 or that underperform as well as to better focus the energy and
 contributions on key algorithms that are proven to scale in production
 and have seen wide-spread adoption. To this end, in the next release,
 the project is planning on removing support for the following algorithms
  unless there is sustained support and improvement of them before the
 next release.

 The algorithms to be removed are:
 - From Clustering:
 Dirichlet
 MeanShift
 MinHash
 Eigencuts
 - From Classification (both are sequential implementations)
 Winnow
 Perceptron
 - Frequent Pattern Mining
 - Collaborative Filtering
 All recommenders in org.apache.mahout.cf.taste.
 impl.recommender.knn
 SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and
 org.apache.mahout.cf.taste.impl.recommender.slopeone
 Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
 TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender
 - Mahout Math


What does it mean -- remove Mahout Math?


 Lanczos in favour of SSVD
 Hadoop entropy stuff in org.apache.mahout.math.stats.entropy

 If you are interested 

Re: 0.8

2013-07-25 Thread Sebastian Schelter
It means to aim to remove the following things *from Mahout Math*:

- Lanczos (use SSVD instead)
- Hadoop entropy stuff in org.apache.mahout.math.stats.entropy


2013/7/25 Dmitriy Lyubimov dlie...@gmail.com

 On Thu, Jul 25, 2013 at 6:44 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:

  With Isabel's help, updated the 0.8 Release notes on the Wiki and below is
  the text version of the Release notes.
 
  Checkout the Wiki version at
 
  https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
 
  ---
 
  The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
  Mahout's goal is to build scalable machine learning libraries focused
  primarily on collaborative filtering (recommenders), clustering and
  classification (known as the 3Cs), as well as the infrastructure needed
  to support those implementations. That infrastructure includes, but is
  not limited to, math packages for statistics, linear algebra and others,
  Java primitive collections, local and distributed vector and matrix
  classes, and a variety of integration code for popular packages such as
  Apache Hadoop, Apache Lucene, Apache HBase, Apache Cassandra and more.
  The 0.8 release is mainly a cleanup release in preparation for an
  upcoming 1.0 release, but it also contains several significant new
  features, which are highlighted below.
 
  To get started with Apache Mahout 0.8, download the release artifacts and
  signatures at http://www.apache.org/dyn/closer.cgi/mahout. The examples
  directory contains several working examples of the core
  functionality available in Mahout. These can be run via scripts in the
  examples/bin directory. Most examples do not need a Hadoop cluster in
  order to run.
 
  Please pay attention to the section labelled FUTURE PLANS below for more
  information about upcoming releases of Mahout.
 
  As with any release, we wish to thank all of the users and contributors
  to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for
  individual credits, as there are too many to list here.
 
  RELEASE HIGHLIGHTS
 
  The highlights of the Apache Mahout 0.8 release include, but are not
  limited to, the items below. For further information, see the included
  CHANGELOG file.
 
  - Numerous performance improvements to Vector and Matrix
  implementations, APIs and their iterators (see also MAHOUT-1192,
  MAHOUT-1202)
  - Numerous performance improvements to the recommender implementations
  (see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151,
  MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
  - MAHOUT-1088: Support for biased item-based recommender
  - MAHOUT-1089: SGD matrix factorization for rating prediction with user
  and item biases
  - MAHOUT-1106: Support for SVD++
  - MAHOUT-944: Support for converting one or more Lucene storage indexes
  to SequenceFiles as well as an upgrade of the supported Lucene version
  to Lucene 4.3.1.
  - MAHOUT-1154 and friends: New streaming k-means implementation that
  offers on-line (and fast) clustering
  - MAHOUT-833: Conversion to SequenceFiles is now MapReduce-based;
  'seqdirectory' can now be run as a MapReduce job.
  - MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension
  of vector to hash (indexes or values).
  - MAHOUT-884: Matrix Concat utility; it presently concatenates only two
  matrices.
  - MAHOUT-1244: Upgraded to use Lucene 4.3
  - MAHOUT-1187: Upgraded to CommonsLang3
  - MAHOUT-916: Speed up the Mahout build by making tests run in parallel.
  - The usual bug fixes. See JIRA [2] for more information on the 0.8
  release.
 
  A total of 218 separate JIRA issues are addressed in this release.
 
  CONTRIBUTING
 
  Mahout is always looking for contributions focused on the 3Cs. If you are
  interested in contributing, please see the "How to contribute" page
  (https://cwiki.apache.org/MAHOUT/how-to-contribute.html) on the Mahout
  wiki or contact us via email at dev@mahout.apache.org.
 
  FUTURE PLANS
 
  0.9
 
  As the project moves towards a 1.0 release, the community is working to
  clean up or remove parts of the code base that are under-supported or
  that underperform, and to better focus energy and contributions on key
  algorithms that are proven to scale in production and have seen
  widespread adoption. To this end, the project plans to remove support
  for the following algorithms in the next release unless they receive
  sustained support and improvement before then.
 
  The algorithms to be removed are:
  - From Clustering:
    - Dirichlet
    - MeanShift
    - MinHash
    - Eigencuts
  - From Classification (both are sequential implementations):
    - Winnow
    - Perceptron
  - Frequent Pattern Mining
  - Collaborative Filtering:
    - All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
    - SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone
      and org.apache.mahout.cf.taste.impl.recommender.slopeone
  

Re: 0.8

2013-07-25 Thread Dmitriy Lyubimov
Oh, of course.


On Thu, Jul 25, 2013 at 3:37 PM, Sebastian Schelter s...@apache.org wrote:

 It means to aim to remove the following things *from Mahout Math*:

 - Lanczos (use SSVD instead)
 - Hadoop entropy stuff in org.apache.mahout.math.stats.entropy



Re: 0.8

2013-07-25 Thread Grant Ingersoll

On Jul 25, 2013, at 11:08 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:

 What does it mean -- remove Mahout Math?

It is a high-level bullet; see the items underneath. Unfortunately, they don't
translate to text format very well.

Build failed in Jenkins: Mahout-Quality #2156

2013-07-25 Thread Apache Jenkins Server
See https://builds.apache.org/job/Mahout-Quality/2156/

--
[...truncated 197709 lines...]
Running org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec - in 
org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.311 sec - in 
org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Running org.apache.mahout.cf.taste.common.CommonTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec - in 
org.apache.mahout.cf.taste.common.CommonTest
Running org.apache.mahout.clustering.meanshift.TestMeanShift
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.961 sec - in 
org.apache.mahout.clustering.meanshift.TestMeanShift
Running org.apache.mahout.clustering.classify.ClusterClassificationDriverTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.954 sec - in 
org.apache.mahout.clustering.classify.ClusterClassificationDriverTest
Running org.apache.mahout.clustering.dirichlet.TestMapReduce
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.627 sec - in 
org.apache.mahout.clustering.dirichlet.TestMapReduce
Running org.apache.mahout.clustering.dirichlet.TestDistributions
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.143 sec - in 
org.apache.mahout.clustering.dirichlet.TestDistributions
Running org.apache.mahout.clustering.dirichlet.TestDirichletClustering
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.14 sec - in 
org.apache.mahout.clustering.dirichlet.TestDirichletClustering
Running org.apache.mahout.clustering.TestGaussianAccumulators
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.441 sec - in 
org.apache.mahout.clustering.TestGaussianAccumulators
Running org.apache.mahout.clustering.lda.cvb.TestCVBModelTrainer
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.283 sec - in 
org.apache.mahout.clustering.lda.cvb.TestCVBModelTrainer
Running org.apache.mahout.clustering.canopy.TestCanopyCreation
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.645 sec - in 
org.apache.mahout.clustering.canopy.TestCanopyCreation
Running org.apache.mahout.clustering.kmeans.TestEigenSeedGenerator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.506 sec - in 
org.apache.mahout.clustering.kmeans.TestEigenSeedGenerator
Running org.apache.mahout.clustering.kmeans.TestRandomSeedGenerator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.507 sec - in 
org.apache.mahout.clustering.kmeans.TestRandomSeedGenerator
Running org.apache.mahout.clustering.kmeans.TestKmeansClustering
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.592 sec - in 
org.apache.mahout.clustering.kmeans.TestKmeansClustering
Running org.apache.mahout.clustering.TestClusterInterface
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec - in 
org.apache.mahout.clustering.TestClusterInterface
Running org.apache.mahout.clustering.minhash.TestMinHashClustering
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.359 sec - in 
org.apache.mahout.clustering.minhash.TestMinHashClustering
Running org.apache.mahout.clustering.topdown.PathDirectoryTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec - in 
org.apache.mahout.clustering.topdown.PathDirectoryTest
Running 
org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessorTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.731 sec - in 
org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessorTest
Running 
org.apache.mahout.clustering.topdown.postprocessor.ClusterCountReaderTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.729 sec - in 
org.apache.mahout.clustering.topdown.postprocessor.ClusterCountReaderTest
Running org.apache.mahout.clustering.streaming.cluster.StreamingKMeansTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.173 sec - in 
org.apache.mahout.clustering.streaming.cluster.StreamingKMeansTest
Running org.apache.mahout.clustering.streaming.cluster.BallKMeansTest
Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 33.151 sec <<<
FAILURE! - in org.apache.mahout.clustering.streaming.cluster.BallKMeansTest
testClustering(org.apache.mahout.clustering.streaming.cluster.BallKMeansTest)
Time elapsed: 0.877 sec <<< FAILURE!
java.lang.AssertionError: expected:<625.0> but was:<787.0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:494)
at org.junit.Assert.assertEquals(Assert.java:592)
at 

Build failed in Jenkins: mahout-nightly #1302

2013-07-25 Thread Apache Jenkins Server
See https://builds.apache.org/job/mahout-nightly/1302/changes

Changes:

[dlyubimov] MAHOUT-1280: moving UpperTriangularMatrix to mahout-math as well as 
adding Symmetric matrix as a first class citizen.

Squashed commit of the following:

commit 50d97093eff5416b7b644efaae159ea35d7e7279
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date:   Wed Jul 17 23:35:49 2013 -0700

Illegal like()

commit 7ce78c1dfc7b2c15fef787380e617b873df5890d
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date:   Wed Jul 10 12:54:46 2013 -0700

Bug fixes in constructor-by-vector

commit ef11cfa02727fb29b2533c0848734809f77f8a3e
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date:   Wed Jul 10 11:22:06 2013 -0700

Switching SSVD uses to UpperTriangular.

commit 3e73a8cd7ba32cb8696d76b93ec287540c710f68
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date:   Wed Jul 10 10:55:11 2013 -0700

Adding test for dense symmetric matrix asserting Eigen decomposition 
equivalent to that over a dense matrix.

commit 6fc530b75215c5ad1c0b5561ff3af724c6e48c6b
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date:   Tue Jul 9 18:30:59 2013 -0700

Moving UpperTriangular matrix to mahout.math; adding DenseSymmetric matrix.

--
[...truncated 1546 lines...]
Running org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in 
org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.271 sec - in 
org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Running org.apache.mahout.cf.taste.impl.common.RunningAverageTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec - in 
org.apache.mahout.cf.taste.impl.common.RunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.RefreshHelperTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec - in 
org.apache.mahout.cf.taste.impl.common.RefreshHelperTest
Running org.apache.mahout.cf.taste.impl.common.FastIDSetTest
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.243 sec - in 
org.apache.mahout.cf.taste.impl.common.FastIDSetTest
Running org.apache.mahout.cf.taste.impl.common.RunningAverageAndStdDevTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.062 sec - in 
org.apache.mahout.cf.taste.impl.common.RunningAverageAndStdDevTest
Running org.apache.mahout.cf.taste.impl.common.CacheTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.885 sec - in 
org.apache.mahout.cf.taste.impl.common.CacheTest
Running org.apache.mahout.cf.taste.impl.common.BitSetTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec - in 
org.apache.mahout.cf.taste.impl.common.BitSetTest
Running org.apache.mahout.cf.taste.impl.common.LongPrimitiveArrayIteratorTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in 
org.apache.mahout.cf.taste.impl.common.LongPrimitiveArrayIteratorTest
Running org.apache.mahout.cf.taste.impl.common.WeightedRunningAverageTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in 
org.apache.mahout.cf.taste.impl.common.WeightedRunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.FastMapTest
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.396 sec - in 
org.apache.mahout.cf.taste.impl.common.FastMapTest
Running org.apache.mahout.cf.taste.impl.common.SamplingLongPrimitiveIteratorTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.924 sec - in 
org.apache.mahout.cf.taste.impl.common.SamplingLongPrimitiveIteratorTest
Running org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarityTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.086 sec - in 
org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarityTest
Running org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarityTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.097 sec - in 
org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarityTest
Running 
org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarityTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.098 sec - in 
org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarityTest
Running 
org.apache.mahout.cf.taste.impl.similarity.AveragingPreferenceInferrerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.085 sec - in 
org.apache.mahout.cf.taste.impl.similarity.AveragingPreferenceInferrerTest
Running org.apache.mahout.cf.taste.impl.similarity.file.FileItemSimilarityTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.082 sec - in 
org.apache.mahout.cf.taste.impl.similarity.file.FileItemSimilarityTest
Running