Thanks, Andrew M. I see that some of the example scripts need to be fixed, as they still refer to the deprecated algorithms. I see that Streaming KMeans has failed for you as well.
I'll be rolling back the release today to fix these issues.

On Tuesday, January 21, 2014 1:22 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

Builds on Ubuntu 12.04 from tarball and zip, and on AWS's default 64-bit Linux AMI from tarball. All tests pass.

*Output of examples:*

*asf-email-examples.sh, run on mahout.apache.org <http://mahout.apache.org>:*

*recommendations:*

[ec2-user@ip-10-73-146-199 bin]$ hadoop fs -cat /user/ec2-user/asf-output/prefs/recommendations/part-r-00000 | less
1 [21935:1.0,23122:1.0,24084:1.0,26397:1.0,1755:1.0,20743:1.0,13428:1.0,19483:1.0,24067:1.0]
4 [14372:1.0,28069:1.0,12258:1.0,18412:1.0,26707:1.0,14610:1.0,2909:1.0,14777:1.0,11792:1.0,26764:1.0]
6 [5442:1.0,18416:1.0,17554:1.0,14610:1.0,16767:1.0,16740:1.0,26743:1.0,11792:1.0,26707:1.0,28116:1.0]
8 [12758:1.0,19409:1.0,11112:1.0]
11 [25890:1.0,26743:1.0,9122:1.0,14512:1.0,28116:1.0,17499:1.0,14976:1.0,14561:1.0,3686:1.0,26707:1.0]
14 [29596:1.0,25567:1.0,19520:1.0,26327:1.0,13809:1.0,29435:1.0,17331:1.0,17290:1.0,17819:1.0,3829:1.0]
15 [15355:1.0,15322:1.0,23191:1.0,7990:1.0,15318:1.0,15236:1.0,17789:1.0,15286:1.0,20916:1.0,2812:1.0]
16 [23647:1.0,18137:1.0,1692:1.0,11490:1.0,4303:1.0,12906:1.0,5120:1.0,29503:1.0,19409:1.0,27700:1.0]
18 [29738:1.0,12070:1.0,24078:1.0,19449:1.0,17819:1.0,11549:1.0,25410:1.0,15228:1.0,24930:1.0,23708:1.0]
19 [28008:1.0,18416:1.0,2909:1.0,29250:1.0,28023:1.0,14974:1.0]
20 [19313:1.0,3464:1.0,12394:1.0,18665:1.0,16601:1.0,25816:1.0,10212:1.0,11626:1.0,18577:1.0,16734:1.0]
[snip]

*clustering; kmeans:*

[snip]
Weight : [props - optional]: Point:
1.0 : [distance-squared=1.0193102046188427]: /commits/200802.gz/20835820.1202052180347.JavaMail.www-data@brutus = [1065:0.195, 1977:0.355, 2246:0.091, 3008:0.078, 5336:0.110, 7573:0.204, 7683:0.126, 7715:0.365, 7812:0.180, 7832:0.075, 8268:0.093, 9779:0.159, 10257:0.133, 10972:0.158, 11663:0.143, 15313:0.065, 17007:0.244, 19359:0.183, 19399:0.338, 19525:0.139, 20224:0.140, 24649:0.095, 25003:0.076, 29143:0.156, 30459:0.075, 31537:0.156, 31559:0.075, 31668:0.139, 33208:0.117, 33425:0.218, 36491:0.075, 38378:0.130, 39789:0.110, 40743:0.190, 45775:0.086]
1.0 : [distance-squared=0.9823018320457279]: /commits/200808.gz/1722278226.1219149603005.JavaMail.www-data@brutus = [1065:0.188, 2246:0.088, 3008:0.076, 3620:0.239, 5200:0.104, 5336:0.106, 6404:0.088, 7552:0.335, 7683:0.122, 7715:0.376, 7812:0.173, 7832:0.072, 10257:0.128, 11663:0.195, 15313:0.063, 16660:0.094, 19359:0.177, 19525:0.134, 19551:0.101, 20025:0.183, 21233:0.098, 24649:0.092, 25003:0.112, 27650:0.283, 27653:0.216, 29143:0.150, 30459:0.072, 30868:0.208, 31559:0.126, 31565:0.203, 33208:0.113, 36491:0.073, 36610:0.141, 36767:0.208, 38378:0.125, 39789:0.106, 45775:0.083]
1.0 : [distance-squared=0.9509142993214911]: /commits/201006.gz/5844140.863.1277658000780.JavaMail.confluence@thor = [648:0.100, 914:0.066, 2040:0.076, 2246:0.078, 3008:0.048, 4419:0.076, 4452:0.070, 5200:0.065, 5203:0.140, 5336:0.067, 6404:0.056, 7235:0.048, 7310:0.077, 7464:0.067, 7471:0.060, 7489:0.093, 7505:0.123, 7683:0.077, 7715:0.145, 7814:0.072, 7912:0.155, 8268:0.098, 9835:0.118, 10225:0.081, 10257:0.114, 11127:0.112, 11510:0.086, 11589:0.139, 11663:0.087, 12641:0.117, 13837:0.052, 14030:0.062, 14089:0.051, 14352:0.061, 14396:0.185, 17015:0.115, 17240:0.097, 18767:0.149, 19774:0.124, 20346:0.159, 21233:0.075, 23657:0.089, 23939:0.078, 23974:0.105, 23998:0.146, 24962:0.122, 25003:0.093, 25084:0.151, 25128:0.052, 29143:0.095, 30459:0.046, 30806:0.075, 31559:0.046, 31727:0.104, 31895:0.105, 31900:0.153, 32149:0.079, 32993:0.069, 33112:0.177, 33208:0.101, 33351:0.089, 33533:0.079, 33638:0.042, 35795:0.066, 36189:0.078, 36491:0.046, 36500:0.093, 36625:0.200, 37111:0.071, 39336:0.079, 39789:0.067, 39933:0.073, 39967:0.079, 41155:0.167, 41280:0.065, 41696:0.072, 41947:0.118, 43685:0.086, 44077:0.308, 44353:0.215, 44423:0.085, 45215:0.151, 45775:0.052, 46766:0.074, 47823:0.082, 48120:0.080, 48212:0.109, 48436:0.110]
[snip]

*clustering; dirichlet:*

Get this complaint:

Running Dirichlet with K = 8
Running on hadoop, using /home/ec2-user/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/ec2-user/mahout-distribution-0.9/examples/target/mahout-examples-0.9-job.jar
14/01/21 05:16:35 WARN driver.MahoutDriver: Unable to add class: dirichlet
14/01/21 05:16:35 WARN driver.MahoutDriver: No dirichlet.props found on classpath, will use command-line arguments only
Unknown program 'dirichlet' chosen.

*clustering: minhash:*

Running Minhash
Running on hadoop, using /home/ec2-user/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/ec2-user/mahout-distribution-0.9/examples/target/mahout-examples-0.9-job.jar
14/01/21 05:17:27 WARN driver.MahoutDriver: Unable to add class: minhash
14/01/21 05:17:27 WARN driver.MahoutDriver: No minhash.props found on classpath, will use command-line arguments only
Unknown program 'minhash' chosen.

*classification; standard:*

=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances   : 5384   87.7874%
Incorrectly Classified Instances :  749   12.2126%
Total Classified Instances       : 6133
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     c     d     <--Classified as
2949  7     531   25    | 3512  a = dev
0     0     0     0     | 0     b = general
99    8     1763  8     | 1878  c = user
41    1     29    672   | 743   d = commits
=======================================================
Statistics
-------------------------------------------------------
Kappa                             0.7877
Accuracy                          87.7874%
Reliability                       53.658%
Reliability (standard deviation)  0.4911

*classification; complementary:*

=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances   : 5530   90.1679%
Incorrectly Classified Instances :  603    9.8321%
Total Classified Instances       : 6133
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     c     d     <--Classified as
3168  0     276   68    | 3512  a = dev
0     0     0     0     | 0     b = general
196   0     1652  30    | 1878  c = user
25    0     8     710   | 743   d = commits
=======================================================
Statistics
-------------------------------------------------------
Kappa                             0.8259
Accuracy                          90.1679%
Reliability                       54.7459%
Reliability (standard deviation)  0.5005

14/01/21 05:28:42 INFO driver.MahoutDriver: Program took 20901 ms (Minutes: 0.34836666666666666)

*classification; sgd, with three categories:*

Running SGD Training
Running on hadoop, using /home/ec2-user/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/ec2-user/mahout-distribution-0.9/examples/target/mahout-examples-0.9-job.jar
14/01/21 05:58:00 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainASFEmail.props found on classpath, will use command-line arguments only
14/01/21 05:58:00 INFO common.AbstractJob: Command line arguments: {--cardinality=[100000], --categories=[3], --endPhase=[2147483647], --input=[asf-output/classification/sgd/splits/mapRedOut/], --output=[asf-output/classification/sgd/models], --poolSize=[5], --startPhase=[0], --tempDir=[temp], --threads=[20]}
24168 training files
0.00 0.00 0.00 0.00 0.0000000 0.0000000 1 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 2 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 3 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 4 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 6 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 8 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 10 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 12 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 15 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 20 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 25 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 30 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 40 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 50 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 60 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 70 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 80 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 100 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 120 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 140 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 150 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 200 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 250 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 300 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 400 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 500 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 600 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 700 0.000 0.00 none
0.00 0.00 0.00 0.00 0.0000000 0.0000000 800 0.000 0.00 none
0.13 32659.00 12672.00 82.50 1.3512194e-08 1.0019413e-08 1000 -0.607 75.78 none
0.13 32659.00 12672.00 82.50 1.3512194e-08 1.0019413e-08 1200 -0.607 75.78 none
0.13 32659.00 12672.00 82.50 1.3512194e-08 1.0019413e-08 1400 -0.607 75.78 none
0.13 32659.00 12672.00 82.50 1.3512194e-08 1.0019413e-08 1500 -0.607 75.78 none
0.24 43686.00 17924.00 329.50 1.0571799e-08 1.0032261e-08 2000 -0.487 82.65 none
0.24 49753.00 21610.00 330.71 1.3770070e-08 1.0011902e-08 2500 -0.439 83.90 none
0.24 49753.00 21610.00 330.71 1.3770070e-08 1.0011902e-08 3000 -0.439 83.90 none
0.32 50635.00 28531.00 437.09 1.0551175e-08 1.0000001e-08 4000 -0.351 88.14 none
0.32 50635.00 32642.00 437.09 1.0551175e-08 1.0000000e-08 5000 -0.378 87.10 none
0.32 50635.00 36461.00 437.09 1.0556652e-08 1.0000001e-08 6000 -0.372 86.89 none
0.32 50635.00 37768.00 437.09 1.0576742e-08 1.0000001e-08 7000 -0.334 89.26 none
0.32 50635.00 38807.00 437.09 1.0576742e-08 1.0000000e-08 8000 -0.368 87.52 none
0.32 50635.00 44731.00 437.09 1.0576716e-08 1.0000000e-08 10000 -0.374 87.39 none
0.32 50635.00 45672.00 437.09 1.0576716e-08 1.0000000e-08 12000 -0.298 88.26 none
Exception in thread "main" java.lang.IllegalStateException: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:175)
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:147)
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:132)
at org.apache.mahout.classifier.sgd.TrainASFEmail.run(TrainASFEmail.java:109)
at org.apache.mahout.classifier.sgd.TrainASFEmail.main(TrainASFEmail.java:142)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.mahout.math.DenseVector.setQuick(DenseVector.java:141)
at org.apache.mahout.classifier.sgd.DefaultGradient.apply(DefaultGradient.java:44)
at org.apache.mahout.classifier.sgd.AbstractOnlineLogisticRegression.train(AbstractOnlineLogisticRegression.java:167)
at org.apache.mahout.classifier.sgd.CrossFoldLearner.train(CrossFoldLearner.java:137)
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression$Wrapper.train(AdaptiveLogisticRegression.java:444)
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression$1.apply(AdaptiveLogisticRegression.java:158)
at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression$1.apply(AdaptiveLogisticRegression.java:153)
at org.apache.mahout.ep.EvolutionaryProcess$1.call(EvolutionaryProcess.java:148)
at org.apache.mahout.ep.EvolutionaryProcess$1.call(EvolutionaryProcess.java:145)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)

On Mon, Jan 20, 2014 at 9:37 AM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> Trying out the build today
>
> On Mon, Jan 20, 2014 at 6:00 AM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
>> This is an issue (a trivial one, though) that needs to be fixed for the 0.9
>> release. I will be rerolling the release today (in the next few hrs) and
>> putting out a new release candidate in staging.
>>
>> Thanks for reporting this, Andrew P.
>>
>> On Monday, January 20, 2014 12:34 AM, Andrew Palumbo <ap....@outlook.com> wrote:
>>
>> I ran through the tests on a CentOS VM, AMD64, 2 cores, 4 GB RAM. I had
>> a bit of trouble getting the Hadoop natives to compile and therefore may
>> have run into some problems because of the Hadoop setup. I ran into some
>> problems in the example scripts, particularly with
>> ./cluster-syntheticcontrol.sh ->4,5. I will run through the rest of the
>> examples when I'm sure I've got Hadoop set up right.
>>
>> Apache Maven 3.1.2-SNAPSHOT
>> Java version: 1.6.0_45, vendor: Sun Microsystems Inc.
>> Java home: /usr/java/jdk1.6.0_45/jre
>> OS name: "linux", version: "2.6.32-358.23.2.el6.x86_64", arch: "amd64", family: "unix"
>> $MAHOUT_LOCAL=true
>> Hadoop 2.2.0
>>
>> a) Verify that u can unpack the release (tar or zip) ...passed (tar)
>> [passed]
>>
>> b) Verify u r able to compile the distro
>>
>> mvn compile [passed with warnings]
>>
>> [WARNING] Expected all dependencies to require Scala version: 2.9.3
>> [WARNING] org.apache.mahout:mahout-math-scala:0.9 requires scala version: 2.9.3
>> [WARNING] org.scalatest:scalatest_2.9.2:1.9.1 requires scala version: 2.9.2
>> [WARNING] Multiple versions of scala libraries detected!
>>
>> c) Run through the unit tests: mvn clean test
>> mvn clean test [passed]
>>
>> d) Run the example scripts under $MAHOUT_HOME/examples/bin.
>> Please run through all the different options in each script
>>
>> Running example scripts with $MAHOUT_LOCAL=true
>>
>> ./cluster-syntheticcontrol.sh ->1 [works]
>> ./cluster-syntheticcontrol.sh ->2 [works]
>> ./cluster-syntheticcontrol.sh ->3 [works]
>>
>> ./cluster-syntheticcontrol.sh ->4 [exits, throws exception]
>> [...]
>> WARNING: Unable to add class: org.apache.mahout.clustering.syntheticcontrol.dirichlet.Job
>> java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.Job
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:171)
>> at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:128)
>> Jan 19, 2014 7:55:31 PM org.slf4j.impl.JCLLoggerAdapter warn
>>
>> ./cluster-syntheticcontrol.sh ->5 [exits, throws exception]
>>
>> WARNING: Unable to add class: org.apache.mahout.clustering.syntheticcontrol.meanshift.Job
>> java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.meanshift.Job
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:171)
>> at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:128)
>> Jan 19, 2014 7:59:51 PM org.slf4j.impl.JCLLoggerAdapter warn
>> WARNING: No org.apache.mahout.clustering.syntheticcontrol.meanshift.Job.props found on classpath, will use command-line arguments only
>> Unknown program 'org.apache.mahout.clustering.syntheticcontrol.meanshift.Job' chosen.
>>
>> ./classify-20newsgroups.sh ->1 [works]
>> ./classify-20newsgroups.sh ->2 [works]
>>
>> cluster-reuters.sh ->1 [works]
>> cluster-reuters.sh ->2 [works]
>> cluster-reuters.sh ->3 [works]
>>
>> Same error as noted previously in the thread:
>>
>> cluster-reuters.sh ->4 [0 clusters]
>>
>> [...]
>>
>> WARNING: No qualcluster.props found on classpath, will use command-line arguments only
>> Num clusters: 0; maxDistance: 0.000000
>> [Dunn Index] First: Infinity
>> [Davies-Bouldin Index] First: NaN
>> Jan 19, 2014 7:13:57 PM org.slf4j.impl.JCLLoggerAdapter info
>> INFO: Program took 669 ms (Minutes: 0.01115)
>> cluster,distance.mean,distance.sd,distance.q0,distance.q1,distance.q2,distance.q3,distance.q4,count,is.train
>>
>> > Date: Thu, 16 Jan 2014 06:41:09 -0800
>> > From: suneel_mar...@yahoo.com
>> > Subject: MAHOUT 0.9 Release - New URL
>> > To: u...@mahout.apache.org; dev@mahout.apache.org
>> >
>> > Third time's a Charm!!!
>> >
>> > Here's the new URL for Mahout 0.9 Release:
>> >
>> > https://repository.apache.org/content/repositories/orgapachemahout-1002/org/apache/mahout/mahout-distribution/0.9/
>> >
>> > For those volunteering to test this, some of the things to be verified:
>> >
>> > a) Verify that u can unpack the release (tar or zip)
>> > b) Verify u r able to compile the distro
>> > c) Run through the unit tests: mvn clean test
>> > d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please run through all the different options in each script.
>> >
>> > Committers and PMC members:
>> > ---------------------------------------
>> >
>> > Need 'at least 3 +1 votes' for the Release to pass.
>> >
>> > Thanks and Regards.