Re: MAHOUT 0.9 Release - New URL
Thanks Andrew M., see that some of the example scripts need to be fixed as they still refer to the deprecated algorithms. See that the Streaming KMeans has failed for you as well. I'll be rolling back the release today to fix these issues.

On Tuesday, January 21, 2014 1:22 AM, Andrew Musselman andrew.mussel...@gmail.com wrote:

Builds on Ubuntu 12.04 from tarball and zip, and on AWS's default 64-bit Linux AMI from tarball. All tests pass.

*Output of examples:*

*asf-email-examples.sh, run on mahout.apache.org:*

*recommendations:*

[ec2-user@ip-10-73-146-199 bin]$ hadoop fs -cat /user/ec2-user/asf-output/prefs/recommendations/part-r-0 | less
1 [21935:1.0,23122:1.0,24084:1.0,26397:1.0,1755:1.0,20743:1.0,13428:1.0,19483:1.0,24067:1.0]
4 [14372:1.0,28069:1.0,12258:1.0,18412:1.0,26707:1.0,14610:1.0,2909:1.0,14777:1.0,11792:1.0,26764:1.0]
6 [5442:1.0,18416:1.0,17554:1.0,14610:1.0,16767:1.0,16740:1.0,26743:1.0,11792:1.0,26707:1.0,28116:1.0]
8 [12758:1.0,19409:1.0,2:1.0]
11 [25890:1.0,26743:1.0,9122:1.0,14512:1.0,28116:1.0,17499:1.0,14976:1.0,14561:1.0,3686:1.0,26707:1.0]
14 [29596:1.0,25567:1.0,19520:1.0,26327:1.0,13809:1.0,29435:1.0,17331:1.0,17290:1.0,17819:1.0,3829:1.0]
15 [15355:1.0,15322:1.0,23191:1.0,7990:1.0,15318:1.0,15236:1.0,17789:1.0,15286:1.0,20916:1.0,2812:1.0]
16 [23647:1.0,18137:1.0,1692:1.0,11490:1.0,4303:1.0,12906:1.0,5120:1.0,29503:1.0,19409:1.0,27700:1.0]
18 [29738:1.0,12070:1.0,24078:1.0,19449:1.0,17819:1.0,11549:1.0,25410:1.0,15228:1.0,24930:1.0,23708:1.0]
19 [28008:1.0,18416:1.0,2909:1.0,29250:1.0,28023:1.0,14974:1.0]
20 [19313:1.0,3464:1.0,12394:1.0,18665:1.0,16601:1.0,25816:1.0,10212:1.0,11626:1.0,18577:1.0,16734:1.0]
[snip]

*clustering; kmeans:*

[snip]
Weight : [props - optional]: Point:
1.0 : [distance-squared=1.0193102046188427]: /commits/200802.gz/20835820.1202052180347.JavaMail.www-data@brutus = [1065:0.195, 1977:0.355, 2246:0.091, 3008:0.078, 5336:0.110, 7573:0.204, 7683:0.126, 7715:0.365, 7812:0.180, 7832:0.075, 
8268:0.093, 9779:0.159, 10257:0.133, 10972:0.158, 11663:0.143, 15313:0.065, 17007:0.244, 19359:0.183, 19399:0.338, 19525:0.139, 20224:0.140, 24649:0.095, 25003:0.076, 29143:0.156, 30459:0.075, 31537:0.156, 31559:0.075, 31668:0.139, 33208:0.117, 33425:0.218, 36491:0.075, 38378:0.130, 39789:0.110, 40743:0.190, 45775:0.086] 1.0 : [distance-squared=0.9823018320457279]: /commits/200808.gz/1722278226.1219149603005.JavaMail.www-data@brutus = [1065:0.188, 2246:0.088, 3008:0.076, 3620:0.239, 5200:0.104, 5336:0.106, 6404:0.088, 7552:0.335, 7683:0.122, 7715:0.376, 7812:0.173, 7832:0.072, 10257:0.128, 11663:0.195, 15313:0.063, 16660:0.094, 19359:0.177, 19525:0.134, 19551:0.101, 20025:0.183, 21233:0.098, 24649:0.092, 25003:0.112, 27650:0.283, 27653:0.216, 29143:0.150, 30459:0.072, 30868:0.208, 31559:0.126, 31565:0.203, 33208:0.113, 36491:0.073, 36610:0.141, 36767:0.208, 38378:0.125, 39789:0.106, 45775:0.083] 1.0 : [distance-squared=0.9509142993214911]: /commits/201006.gz/5844140.863.1277658000780.JavaMail.confluence@thor = [648:0.100, 914:0.066, 2040:0.076, 2246:0.078, 3008:0.048, 4419:0.076, 4452:0.070, 5200:0.065, 5203:0.140, 5336:0.067, 6404:0.056, 7235:0.048, 7310:0.077, 7464:0.067, 7471:0.060, 7489:0.093, 7505:0.123, 7683:0.077, 7715:0.145, 7814:0.072, 7912:0.155, 8268:0.098, 9835:0.118, 10225:0.081, 10257:0.114, 11127:0.112, 11510:0.086, 11589:0.139, 11663:0.087, 12641:0.117, 13837:0.052, 14030:0.062, 14089:0.051, 14352:0.061, 14396:0.185, 17015:0.115, 17240:0.097, 18767:0.149, 19774:0.124, 20346:0.159, 21233:0.075, 23657:0.089, 23939:0.078, 23974:0.105, 23998:0.146, 24962:0.122, 25003:0.093, 25084:0.151, 25128:0.052, 29143:0.095, 30459:0.046, 30806:0.075, 31559:0.046, 31727:0.104, 31895:0.105, 31900:0.153, 32149:0.079, 32993:0.069, 33112:0.177, 33208:0.101, 33351:0.089, 33533:0.079, 33638:0.042, 35795:0.066, 36189:0.078, 36491:0.046, 36500:0.093, 36625:0.200, 37111:0.071, 39336:0.079, 39789:0.067, 39933:0.073, 39967:0.079, 41155:0.167, 41280:0.065, 41696:0.072, 
41947:0.118, 43685:0.086, 44077:0.308, 44353:0.215, 44423:0.085, 45215:0.151, 45775:0.052, 46766:0.074, 47823:0.082, 48120:0.080, 48212:0.109, 48436:0.110]
[snip]

*clustering; dirichlet:*

Get this complaint:

Running Dirichlet with K = 8
Running on hadoop, using /home/ec2-user/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/ec2-user/mahout-distribution-0.9/examples/target/mahout-examples-0.9-job.jar
14/01/21 05:16:35 WARN driver.MahoutDriver: Unable to add class: dirichlet
14/01/21 05:16:35 WARN driver.MahoutDriver: No dirichlet.props found on classpath, will use command-line arguments only
Unknown program 'dirichlet' chosen.

*clustering: minhash:*

Running Minhash
Running on hadoop, using /home/ec2-user/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/ec2-user/mahout-distribution-0.9/examples/target/mahout-examples-0.9-job.jar
14/01/21 05:17:27 WARN driver.MahoutDriver: Unable to add class: minhash
Re: MAHOUT 0.9 Release - New URL
Sure thing; continuing to smoke test the other examples tonight

On Tue, Jan 21, 2014 at 9:23 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:

Thanks Andrew M., see that some of the example scripts need to be fixed as they still refer to the deprecated algorithms. See that the Streaming KMeans has failed for you as well. I'll be rolling back the release today to fix these issues.
Build failed in Jenkins: Mahout-Examples-Classify-20News #401
See https://builds.apache.org/job/Mahout-Examples-Classify-20News/401/changes

Changes:

[smarthi] Reverting back to 0.9-SNAPSHOT

------------------------------------------
[...truncated 3525 lines...]
777 494 778 494 779 493 780 492 781 492 782 491 783 491 784 490 785 490 786 489 787 489 788 488 789 488 790 488 791 487 792 487 793 487 794 487 795 487 796 486 797 485 798 485 799 484 800 481 801 481 802 480 803 477 804 477 805 477 806 477 807 477 808 476 809 475 810 475 811 474 812 474 813 474 814 473 815 473 816 473 817 472 818 472 819 471 820 471 821 471 822 470 823 470 824 470 825 470 826 469 827 469 828 468 829 468 830 468 831 466 832 466 833 466 834 465 835 465 836 464 837 464 838 464 839 463 840 463 841 462 842 461 843 461 844 461 845 461 846 461 847 461 848 460 849 460 850 459 851 458 852 458 853 458 854 456 855 455 856 455 857 455 858 454 859 454 860 453 861 453 862 452 863 452 864 452 865 451 866 451 867 451 868 451 869 450 870 450 871 449 872 448 873 448 874 448 875 448 876 447 877 447 878 447 879 446 880 446 881 445 882 445 883 444 884 443 885 443 886 443 887 442 888 442 889 441 890 441 891 441 892 440 893 439 894 438 895 437 896 437 897 437 898 436 899 436 900 436 901 435 902 435 903 434 904 434 905 433 906 433 907 433 908 432 909 432 910 431 911 431 912 431 913 431 914 431 915 430 916 430 917 430 918 428 919 428 920 426 921 425 922 425 923 425 924 425 925 425 926 425 927 424 928 423 929 421 930 421 931 421 932 421 933 418 934 418 935 417 936 416 937 416 938 416 939 415 940 415 941 415 942 415 943 414 944 414 945 414 946 413 947 413 948 412 949 411 950 410 951 410 952 410 953 409 954 408 955 407 956 406 957 406 958 405 959 405 960 405 961 405 962 404 963 404 964 404 965 404 966 404 967 404 968 404 969 404 970 403 971 402 972 402 973 401 974 401 975 401 976 400 977 400 978 399 979 399 980 398 981 397 982 397 983 397 984 396 985 396 986 395 987 395 988 395 989 394 990 394 991 394 992 394 993 393 994 393 995 393 996 393 997 393 998 393 999 393 1000 392
Jan 21, 2014 8:16:46 PM 
org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 398642 ms (Minutes: 6.6440334)
Testing on /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/ with model: /tmp/news-group.model
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.9-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Jan 21, 2014 8:16:46 PM org.slf4j.impl.JCLLoggerAdapter warn
WARNING: No org.apache.mahout.classifier.sgd.TestNewsGroups.props found on classpath, will use command-line arguments only
1 test files
Exception in thread main java.lang.IndexOutOfBoundsException: Index: 12, Size: 1
	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
	at java.util.ArrayList.get(ArrayList.java:322)
	at org.apache.mahout.classifier.sgd.TestNewsGroups.run(TestNewsGroups.java:95)
	at org.apache.mahout.classifier.sgd.TestNewsGroups.main(TestNewsGroups.java:61)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
Build step 'Execute shell' marked build as failure
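The trace above reports a lookup of index 12 in an ArrayList that holds a single element. The following is a minimal sketch of that failure mode and of a bounds check that reports it more descriptively. It is not taken from TestNewsGroups; the class IndexGuardSketch, the helper labelAt, and the sample label list are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexGuardSketch {

    // Hypothetical helper (not Mahout code): returns the label stored at the
    // given index, or fails with a message that says how many labels exist.
    static String labelAt(List<String> labels, int index) {
        if (index < 0 || index >= labels.size()) {
            throw new IllegalArgumentException(
                "No label for index " + index + "; only " + labels.size() + " label(s) loaded");
        }
        return labels.get(index);
    }

    public static void main(String[] args) {
        // Assumed sample data: a list with one entry, as in "Size: 1".
        List<String> labels = new ArrayList<>();
        labels.add("alt.atheism");

        try {
            // Unguarded lookup: throws IndexOutOfBoundsException.
            labels.get(12);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("Unguarded lookup failed: " + e.getMessage());
        }

        try {
            // Guarded lookup: throws IllegalArgumentException with context.
            labelAt(labels, 12);
        } catch (IllegalArgumentException e) {
            System.out.println("Guarded lookup failed: " + e.getMessage());
        }
    }
}
```

The guard does not fix whatever produced the mismatch between the index and the list size; it only makes the mismatch easier to read in a build log.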
RE: MAHOUT 0.9 Release - New URL
from the asf-email-examples.sh script:

# You will need to download or otherwise obtain some or all of the Amazon ASF Email Public Dataset (http://aws.amazon.com/datasets/7791434387204566) to use this script.
# To obtain a full copy you will need to launch an EC2 instance and mount the dataset to download it, otherwise you can get a sample of it at
# http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout

It looks like the http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout link is down. Is there somewhere else that we can get a subset of the ASF emails?

Date: Tue, 21 Jan 2014 09:48:06 -0800
Subject: Re: MAHOUT 0.9 Release - New URL
From: andrew.mussel...@gmail.com
To: dev@mahout.apache.org

Sure thing; continuing to smoke test the other examples tonight
[jira] [Updated] (MAHOUT-1398) FileDataModel should provide a constructor with a delimiterPattern
[ https://issues.apache.org/jira/browse/MAHOUT-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi updated MAHOUT-1398:
----------------------------------
    Affects Version/s: 0.8 (was: 0.9)
    Fix Version/s: 0.9 (was: 1.0)
    Assignee: Sebastian Schelter

Moving this to 0.9

FileDataModel should provide a constructor with a delimiterPattern
------------------------------------------------------------------
    Key: MAHOUT-1398
    URL: https://issues.apache.org/jira/browse/MAHOUT-1398
    Project: Mahout
    Issue Type: Improvement
    Components: Collaborative Filtering
    Affects Versions: 0.8
    Reporter: Roy Guo
    Assignee: Sebastian Schelter
    Priority: Minor
    Fix For: 0.9
    Attachments: MAHOUT-1398.patch

For now we only have ',' and '\t' as delimiters, which is really not enough for users. Of course users can override processLine etc. to achieve their goal (e.g. use four spaces as the delimiter pattern), but as a well-designed framework, Mahout should consider the varying demands of most users and make it very easy to use. Also, it will not take much time to implement. Can I push a patch for this?

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
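To illustrate the request in MAHOUT-1398, here is a minimal sketch of splitting a preference line on a caller-supplied delimiter pattern such as four spaces. It uses only java.util.regex from the JDK and is not the FileDataModel implementation or the attached patch; the class DelimiterPatternSketch, the helper splitLine, and the sample line are hypothetical.

```java
import java.util.regex.Pattern;

final class DelimiterPatternSketch {

    // Hypothetical helper (not part of Mahout): splits one line of a
    // preference file using a regular expression as the delimiter.
    static String[] splitLine(String line, Pattern delimiter) {
        return delimiter.split(line.trim());
    }

    public static void main(String[] args) {
        // Delimiter pattern from the issue's example: exactly four spaces.
        Pattern fourSpaces = Pattern.compile(" {4}");

        // Assumed sample line in userID, itemID, preference order.
        String[] fields = splitLine("42    1001    3.5", fourSpaces);

        long userId = Long.parseLong(fields[0]);
        long itemId = Long.parseLong(fields[1]);
        float preference = Float.parseFloat(fields[2]);

        System.out.println(userId + " " + itemId + " " + preference);
    }
}
```

Accepting a pattern rather than a single character lets one code path cover ',' and '\t' as well as multi-character separators, which is the flexibility the issue asks for.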
[jira] [Created] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
Suneel Marthi created MAHOUT-1400:
----------------------------------
    Summary: Remove references to deprecated and removed algorithms from examples scripts
    Key: MAHOUT-1400
    URL: https://issues.apache.org/jira/browse/MAHOUT-1400
    Project: Mahout
    Issue Type: Bug
    Components: Examples
    Affects Versions: 0.8
    Reporter: Suneel Marthi
    Assignee: Suneel Marthi
    Fix For: 0.9

Still see references to old clustering algorithms like Minhash, Dirichlet in asf-email-examples.sh and cluster-syntheticcontrol.sh. Also remove build-asf-email.sh and build-cluster-syntheticcontrol.sh from examples/bin.
[jira] [Assigned] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi reassigned MAHOUT-1400:
------------------------------------
    Assignee: Sebastian Schelter (was: Suneel Marthi)
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877844#comment-13877844 ]

Sebastian Schelter commented on MAHOUT-1400:
--------------------------------------------

I will remove the asf-email-examples script for this release, as (1) it still has references to the deprecated algorithms, and (2) the sample data is not available under the given address. We should definitely fix and reintroduce it for the next release.
[jira] [Resolved] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter resolved MAHOUT-1400.
---------------------------------------
    Resolution: Fixed
Re: MAHOUT 0.9 Release - New URL
Thanks Andrew for reporting that. I rolled back the release to fix this and a few other issues. We have removed asf-examples*.sh from trunk, as the sample file at the URL mentioned in your email is not available. This is something we need to fix and restore in 1.0.

On Tuesday, January 21, 2014 3:21 PM, Andrew Palumbo ap@outlook.com wrote:

from the asf-email-examples.sh script:

# You will need to download or otherwise obtain some or all of the Amazon ASF Email Public Dataset (http://aws.amazon.com/datasets/7791434387204566) to use this script.
# To obtain a full copy you will need to launch an EC2 instance and mount the dataset to download it, otherwise you can get a sample of it at
# http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout

It looks like the http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout link is down. Is there somewhere else that we can get a subset of the ASF emails?
Jenkins build is back to normal : Mahout-Examples-Classify-20News #402
See https://builds.apache.org/job/Mahout-Examples-Classify-20News/402/changes
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877872#comment-13877872 ]

Hudson commented on MAHOUT-1400:
--------------------------------

SUCCESS: Integrated in Mahout-Quality #2428 (See [https://builds.apache.org/job/Mahout-Quality/2428/])
MAHOUT-1400 Remove references to deprecated and removed algorithms from examples scripts (ssc: rev 1560185)
* /mahout/trunk/CHANGELOG
* /mahout/trunk/examples/bin/README.txt
MAHOUT-1400 Remove references to deprecated and removed algorithms from examples scripts (ssc: rev 1560178)
* /mahout/trunk/examples/bin/asf-email-examples.sh
* /mahout/trunk/examples/bin/build-asf-email.sh
* /mahout/trunk/examples/bin/build-cluster-syntheticcontrol.sh
* /mahout/trunk/examples/bin/cluster-syntheticcontrol.sh
[jira] [Resolved] (MAHOUT-1398) FileDataModel should provide a constructor with a delimiterPattern
[ https://issues.apache.org/jira/browse/MAHOUT-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter resolved MAHOUT-1398.
---------------------------------------
    Resolution: Fixed
Re: MAHOUT 0.9 Release - New URL
*classify-20newsgroups.sh*

*Complementary naive bayes:*

=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances   : 11207   98.9406%
Incorrectly Classified Instances :   120    1.0594%
Total Classified Instances       : 11327

=======================================================
Confusion Matrix
-------------------------------------------------------
   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t  <--Classified as
 475   0   0   1   0   0   0   0   0   0   0   0   0   0   1   0   1   0   0   0  |  478  a = alt.atheism
   0 597   1   1   0   1   1   0   0   0   0   1   0   2   1   0   0   0   0   0  |  605  b = comp.graphics
   0   1 620   3   0   1   0   0   0   0   0   1   0   0   1   0   0   0   0   0  |  627  c = comp.os.ms-windows.misc
   1   1   1 593   2   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0  |  599  d = comp.sys.ibm.pc.hardware
   0   1   1   0 568   0   1   0   0   0   1   1   2   0   0   0   0   1   0   0  |  576  e = comp.sys.mac.hardware
   0   4   2   0   0 581   0   0   0   0   0   0   0   0   0   0   0   0   0   0  |  587  f = comp.windows.x
   0   0   0   1   2   0 571   3   0   0   1   1   4   1   0   0   0   0   0   0  |  584  g = misc.forsale
   0   0   0   1   0   0   0 589   1   0   0   1   1   0   0   0   0   0   0   0  |  593  h = rec.autos
   0   0   0   0   0   0   0   1 565   0   0   0   0   0   1   0   0   0   0   0  |  567  i = rec.motorcycles
   0   0   0   0   0   0   0   0   0 600   2   0   0   0   1   0   0   0   0   0  |  603  j = rec.sport.baseball
   0   0   0   0   0   0   0   0   0   1 584   0   0   0   0   0   0   0   0   0  |  585  k = rec.sport.hockey
   0   0   0   0   0   0   0   0   0   0   0 579   0   0   0   0   0   1   0   0  |  580  l = sci.crypt
   0   0   0   1   3   0   2   0   0   2   0   0 567   1   2   1   0   0   0   0  |  579  m = sci.electronics
   0   0   0   0   0   0   0   0   0   0   0   0   1 605   0   0   0   0   0   0  |  606  n = sci.med
   0   0   0   0   0   0   0   0   0   0   0   0   0   0 602   0   0   0   0   0  |  602  o = sci.space
   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0 602   0   0   1   0  |  604  p = soc.religion.christian
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 556   0   0   0  |  556  q = talk.politics.mideast
   0   0   1   0   0   0   0   0   0   0   0   1   0   0   1   0   0 568   0   0  |  571  r = talk.politics.guns
  11   0   0   0   0   0   0   0   0   1   0   0   0   1   3   8   1   4 338   2  |  369  s = talk.religion.misc
   0   0   0   0   0   0   0   0   0   0   1   0   0   0   1   0   3   4   0 447  |  456  t = talk.politics.misc

=======================================================
Statistics
-------------------------------------------------------
Kappa                             0.9806
Accuracy                         98.9406%
Reliability                      94.0932%
Reliability (standard deviation)  0.2163

Jan 21, 2014 6:37:28 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 15870 ms (Minutes:
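The percentages in the summary follow directly from the instance counts: 11207 of 11327 instances were classified correctly, leaving 11327 - 11207 = 120 incorrect. A minimal sketch of that arithmetic, assuming nothing beyond the counts reported above (the class AccuracySketch and the helper percent are hypothetical names, not Mahout code):

```java
import java.util.Locale;

final class AccuracySketch {

    // Hypothetical helper: share of `part` in `total`, as a percentage.
    static double percent(long part, long total) {
        return 100.0 * part / total;
    }

    public static void main(String[] args) {
        // Counts taken from the summary of the classify-20newsgroups.sh run.
        long correct = 11207;
        long total = 11327;
        long incorrect = total - correct;

        System.out.printf(Locale.ROOT, "Correctly classified:   %d (%.4f%%)%n",
                correct, percent(correct, total));
        System.out.printf(Locale.ROOT, "Incorrectly classified: %d (%.4f%%)%n",
                incorrect, percent(incorrect, total));
    }
}
```

Rounded to four decimal places, this gives 98.9406% and 1.0594%, matching the reported accuracy and error rate.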
[jira] [Commented] (MAHOUT-1398) FileDataModel should provide a constructor with a delimiterPattern
[ https://issues.apache.org/jira/browse/MAHOUT-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877924#comment-13877924 ] Hudson commented on MAHOUT-1398: SUCCESS: Integrated in Mahout-Quality #2429 (See [https://builds.apache.org/job/Mahout-Quality/2429/]) MAHOUT-1398 FileDataModel should provide a constructor with a delimiterPattern (ssc: rev 1560202) * /mahout/trunk/CHANGELOG * /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/impl/model/file/FileDataModel.java * /mahout/trunk/core/src/test/java/org/apache/mahout/cf/taste/impl/model/file/FileDataModelTest.java
[jira] [Work started] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAHOUT-1401 started by Suneel Marthi. Resurrect Frequent Pattern mining - Key: MAHOUT-1401 URL: https://issues.apache.org/jira/browse/MAHOUT-1401 Project: Mahout Issue Type: Bug Reporter: Suneel Marthi Assignee: Suneel Marthi Priority: Critical -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (MAHOUT-1401) Resurrect Frequent Pattern mining
Suneel Marthi created MAHOUT-1401: - Summary: Resurrect Frequent Pattern mining Key: MAHOUT-1401 URL: https://issues.apache.org/jira/browse/MAHOUT-1401 Project: Mahout Issue Type: Bug Reporter: Suneel Marthi Assignee: Suneel Marthi Priority: Critical -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (MAHOUT-1402) Zero clusters using streaming k-means option in cluster-reuters.sh
Andrew Musselman created MAHOUT-1402:

Summary: Zero clusters using streaming k-means option in cluster-reuters.sh
Key: MAHOUT-1402
URL: https://issues.apache.org/jira/browse/MAHOUT-1402
Project: Mahout
Issue Type: Bug
Components: Clustering
Affects Versions: 0.8
Environment: AWS default Linux AMI
Reporter: Andrew Musselman
Fix For: 0.9

Running cluster-reuters.sh in examples/bin results in this:

[snip]
INFO: Number of Centroids: 0
Jan 22, 2014 1:52:22 AM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local23982482_0001
java.lang.IllegalArgumentException: Must have nonzero number of training and test vectors. Asked for %.1f %% of %d vectors for test [10.00149011612, 0]
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:120)
    at org.apache.mahout.clustering.streaming.cluster.BallKMeans.splitTrainTest(BallKMeans.java:176)
    at org.apache.mahout.clustering.streaming.cluster.BallKMeans.cluster(BallKMeans.java:192)
    at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.getBestCentroids(StreamingKMeansReducer.java:107)
    at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.reduce(StreamingKMeansReducer.java:73)
    at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.reduce(StreamingKMeansReducer.java:37)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:177)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
[snip]
WARNING: No qualcluster.props found on classpath, will use command-line arguments only
Num clusters: 0; maxDistance: 0.00
[Dunn Index] First: Infinity
[Davies-Bouldin Index] First: NaN
Jan 22, 2014 1:52:24 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 535 ms (Minutes: 0.008916)
cluster,distance.mean,distance.sd,distance.q0,distance.q1,distance.q2,distance.q3,distance.q4,count,is.train

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
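The exception in the report above is a precondition rejecting a train/test split over zero vectors. The following is a minimal standalone sketch of that kind of guard, not the actual BallKMeans.splitTrainTest code; the class SplitGuardSketch and the method testSetSize are hypothetical names. It also shows one likely reason the logged message contains raw "%.1f %% of %d" text: Guava's Preconditions.checkArgument substitutes only %s placeholders and appends unmatched arguments in square brackets, whereas String.format handles the full format syntax.

```java
import java.util.Locale;

/** Hypothetical sketch, not Mahout code: validates a train/test split before clustering. */
public class SplitGuardSketch {

    /**
     * Returns how many vectors go into the test set, or throws if either the
     * training set or the test set would be empty (for example when the input is empty).
     */
    public static int testSetSize(int numVectors, double testProbability) {
        int numTest = (int) (testProbability * numVectors);
        if (numTest <= 0 || numTest >= numVectors) {
            // String.format understands %.1f, %% and %d, so the message is fully rendered.
            throw new IllegalArgumentException(String.format(Locale.ROOT,
                "Must have nonzero number of training and test vectors. "
                    + "Asked for %.1f %% of %d vectors for test",
                testProbability * 100, numVectors));
        }
        return numTest;
    }

    public static void main(String[] args) {
        System.out.println(testSetSize(200, 0.25));
        try {
            // Zero input vectors, as when a reducer receives no centroids.
            testSetSize(0, 0.1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A guard like this only reports the problem more readably; the underlying question in the issue, why the reducer sees zero centroids in the first place, still has to be answered separately.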
[jira] [Commented] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878264#comment-13878264 ] Yoonmin Nam commented on MAHOUT-1401: - I believe we must focus on the availability of FPM algorithm during the resurrection.
[jira] [Commented] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878269#comment-13878269 ] Suneel Marthi commented on MAHOUT-1401: --- Sorry, could u explain that - 'availability of FPM algorithm' ? Not sure I get what u r trying to convey.
[jira] [Commented] (MAHOUT-1296) Remove deprecated algorithms
[ https://issues.apache.org/jira/browse/MAHOUT-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878274#comment-13878274 ] Hudson commented on MAHOUT-1296: FAILURE: Integrated in Mahout-Quality #2430 (See [https://builds.apache.org/job/Mahout-Quality/2430/]) MAHOUT-1296: /examples/src/main/java/org/apache/mahout/fpm directory should have been deleted as part of this jira. (smarthi: rev 1560250) * /mahout/trunk/examples/src/main/java/org/apache/mahout/fpm Remove deprecated algorithms Key: MAHOUT-1296 URL: https://issues.apache.org/jira/browse/MAHOUT-1296 Project: Mahout Issue Type: Improvement Reporter: Sebastian Schelter Assignee: Sebastian Schelter Fix For: 0.9 Remove the algorithms we chose to deprecate in MAHOUT-1250 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Build failed in Jenkins: Mahout-Quality #2430
See https://builds.apache.org/job/Mahout-Quality/2430/changes

Changes:
[smarthi] MAHOUT-1296: /examples/src/main/java/org/apache/mahout/fpm directory should have been deleted as part of this jira.

--
[...truncated 4251 lines...]
Running org.apache.mahout.math.neighborhood.SearchQualityTest
Running org.apache.mahout.math.neighborhood.SearchSanityTest
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.919 sec - in org.apache.mahout.math.hadoop.TestDistributedRowMatrix
Running org.apache.mahout.math.ssvd.SequentialOutOfCoreSvdTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.993 sec - in org.apache.mahout.math.hadoop.solver.TestDistributedConjugateGradientSolverCLI
Running org.apache.mahout.math.stats.OnlineAucTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.64 sec - in org.apache.mahout.math.hadoop.solver.TestDistributedConjugateGradientSolver
Running org.apache.mahout.math.stats.SamplerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.372 sec - in org.apache.mahout.math.stats.SamplerTest
Running org.apache.mahout.math.VectorWritableTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.488 sec - in org.apache.mahout.math.stats.OnlineAucTest
Running org.apache.mahout.cf.taste.common.CommonTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.566 sec - in org.apache.mahout.cf.taste.common.CommonTest
Tests run: 100, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.718 sec - in org.apache.mahout.math.VectorWritableTest
Running org.apache.mahout.cf.taste.hadoop.TopItemsQueueTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.136 sec - in org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.33 sec - in org.apache.mahout.cf.taste.hadoop.TopItemsQueueTest
Running org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducerTest
Running org.apache.mahout.cf.taste.hadoop.item.RecommenderJobTest
Running org.apache.mahout.cf.taste.hadoop.TasteHadoopUtilsTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.034 sec - in org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.632 sec - in org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.336 sec - in org.apache.mahout.cf.taste.hadoop.TasteHadoopUtilsTest
Running org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJobTest
Running org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJobTest
Running org.apache.mahout.cf.taste.impl.common.RefreshHelperTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.632 sec - in org.apache.mahout.cf.taste.impl.common.RefreshHelperTest
Running org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.526 sec - in org.apache.mahout.cf.taste.impl.common.InvertedRunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.CacheTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.285 sec - in org.apache.mahout.cf.taste.impl.common.CacheTest
Running org.apache.mahout.cf.taste.impl.common.RunningAverageTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.328 sec - in org.apache.mahout.cf.taste.impl.common.RunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.FastMapTest
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.175 sec - in org.apache.mahout.cf.taste.impl.common.FastMapTest
Running org.apache.mahout.cf.taste.impl.common.BitSetTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.303 sec - in org.apache.mahout.cf.taste.impl.common.BitSetTest
Running org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.079 sec - in org.apache.mahout.cf.taste.impl.common.FastByIDMapTest
Running org.apache.mahout.cf.taste.impl.common.LongPrimitiveArrayIteratorTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.331 sec - in org.apache.mahout.cf.taste.impl.common.LongPrimitiveArrayIteratorTest
Running org.apache.mahout.cf.taste.impl.common.WeightedRunningAverageTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.983 sec - in org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDPCASparseTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.292 sec - in org.apache.mahout.cf.taste.impl.common.WeightedRunningAverageTest
Running org.apache.mahout.cf.taste.impl.common.RunningAverageAndStdDevTest
Running org.apache.mahout.cf.taste.impl.common.FastIDSetTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.434 sec - in
[jira] [Commented] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878282#comment-13878282 ] Yoonmin Nam commented on MAHOUT-1401: - I mean that the first and most crucial problem with the current implementation of the FPM algorithm is the explosion of intermediate data when generating frequent patterns. This explosion makes FPM unusable in many cases, even with very small input. The problem comes from the limited intermediate buffer of MapReduce, but since we are considering FPM at the algorithm level, we should find alternatives that either avoid or handle that problem.
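To give a feel for why frequent pattern generation can outgrow even a small input, here is a minimal sketch of the general combinatorial bound: n distinct items admit 2^n - 1 non-empty itemsets. This is only an illustration of the worst case for pattern enumeration in general. It is not a measurement or model of Mahout's PFPGrowth intermediate data, and the class ItemsetGrowthSketch and method candidateItemsets are hypothetical names.

```java
/** Hypothetical illustration, not Mahout code: upper bound on itemsets over n items. */
public class ItemsetGrowthSketch {

    /** Number of non-empty itemsets over n distinct items, i.e. 2^n - 1. */
    public static long candidateItemsets(int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n must be between 0 and 62 to fit in a long");
        }
        return (1L << n) - 1;
    }

    public static void main(String[] args) {
        // The item count grows linearly here while the bound grows exponentially.
        for (int n : new int[] {10, 20, 30, 40}) {
            System.out.println(n + " items -> up to " + candidateItemsets(n) + " itemsets");
        }
    }
}
```

In practice a minimum-support threshold prunes most of these candidates, so the bound describes what an implementation has to guard against rather than what it typically produces.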
[jira] [Comment Edited] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878282#comment-13878282 ] Yoonmin Nam edited comment on MAHOUT-1401 at 1/22/14 5:34 AM: -- I mean that the very first, but crucial problem of current implementation of FPM algorithm is the explosion of intermediate data when generating frequent patterns. Above explosion makes FPM not available in many cases even we use very small input. That problem comes from the shortage of intermediate buffer of MapReduce, but as we consider the FPM in the algorithm-level, so we should find the another alternatives either to avoid or handle that problem I mentioned. was (Author: ronymin): I mean that the very first, but crucial problem of current implementation of FPM algorithm is the explosion of intermediate data when generating frequent patterns. Above explosion makes FPM not available in many cases even we use very small input. That problem comes from the shortage of intermediate buffer of MapReduce, but as we consider the FPM in the algorithm-level, so we should find the another alternatives either avoid or handle that problem I mentioned.
[jira] [Commented] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878287#comment-13878287 ] Suneel Marthi commented on MAHOUT-1401: --- We don't have the time in 0.9 Release to fix the implementation, that's definitely something that needs to be fixed in future releases. Let me go ahead and resurrect the deleted code for now.
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878293#comment-13878293 ]

Suneel Marthi commented on MAHOUT-1400:

factorize-netflix.sh: References a data set that Netflix took down after the competition and that is no longer available. Should we retire this script too?

Remove references to deprecated and removed algorithms from examples scripts

Key: MAHOUT-1400
URL: https://issues.apache.org/jira/browse/MAHOUT-1400
Project: Mahout
Issue Type: Bug
Components: Examples
Affects Versions: 0.8
Reporter: Suneel Marthi
Assignee: Sebastian Schelter
Fix For: 0.9

Still see references to old clustering algorithms like Minhash and Dirichlet in asf-email-examples.sh and cluster-syntheticcontrol.sh. Also remove build-asf-email.sh and build-cluster-syntheticcontrol.sh from examples/bin.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi resolved MAHOUT-1401. --- Resolution: Fixed Fix Version/s: 0.9 Code committed back into trunk. Resurrect Frequent Pattern mining - Key: MAHOUT-1401 URL: https://issues.apache.org/jira/browse/MAHOUT-1401 Project: Mahout Issue Type: Bug Reporter: Suneel Marthi Assignee: Suneel Marthi Priority: Critical Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (MAHOUT-1402) Zero clusters using streaming k-means option in cluster-reuters.sh
[ https://issues.apache.org/jira/browse/MAHOUT-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi reassigned MAHOUT-1402: - Assignee: Suneel Marthi
[jira] [Commented] (MAHOUT-1402) Zero clusters using streaming k-means option in cluster-reuters.sh
[ https://issues.apache.org/jira/browse/MAHOUT-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878304#comment-13878304 ] Suneel Marthi commented on MAHOUT-1402: --- The MR version of Streaming KMeans seems to be failing (the sequential mode passes), the reason being that the reducer is reading zero centroids from the mappers; need to investigate as to what's going on.
[jira] [Commented] (MAHOUT-1401) Resurrect Frequent Pattern mining
[ https://issues.apache.org/jira/browse/MAHOUT-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878330#comment-13878330 ]

Hudson commented on MAHOUT-1401:

SUCCESS: Integrated in Mahout-Quality #2431 (See [https://builds.apache.org/job/Mahout-Quality/2431/])
MAHOUT-1401: Resurrecting Frequent Pattern Mining (smarthi: rev 1560259)
* /mahout/trunk/CHANGELOG
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/AggregatorMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/AggregatorReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/CountDescendingPairComparator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/MultiTransactionTreeIterator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowth.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelCountingMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelCountingReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelFPGrowthCombiner.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelFPGrowthMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelFPGrowthReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/TransactionTree.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/TransactionTreeIterator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/ContextStatusUpdater.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/ContextWriteOutputCollector.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/SequenceFileOutputCollector.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/StatusUpdater.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/TopKPatternsOutputConverter.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/TransactionIterator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/integer/IntegerStringOutputConverter.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/string/StringOutputConverter.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/convertors/string/TopKStringPatterns.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/FPGrowth.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/FPTree.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/FPTreeDepthCache.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/FrequentPatternMaxHeap.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/LeastKCache.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/Pattern.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth2/FPGrowthIds.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth2/FPGrowthObj.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth2/FPTree.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthRetailDataTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthRetailDataTest2.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthRetailDataTestVs.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthSyntheticDataTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/FPGrowthTest2.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthRetailDataTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthRetailDataTest2.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthRetailDataTestVs.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthSynthDataTest2.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/PFPGrowthTest2.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/TransactionTreeTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/fpm/pfpgrowth/fpgrowth/FrequentPatternMaxHeapTest.java
*
Jenkins build is back to normal : Mahout-Quality #2431
See https://builds.apache.org/job/Mahout-Quality/2431/changes
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878339#comment-13878339 ] Sebastian Schelter commented on MAHOUT-1400: We should keep the example, the Netflix dataset is still regularly used in research papers.
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878346#comment-13878346 ] Suneel Marthi commented on MAHOUT-1400: --- Do we have a copy of the dataset someplace, we could add a reference to that in the script?
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878349#comment-13878349 ] Andrew Musselman commented on MAHOUT-1400: -- The ASF email dataset is usable via the AWS volume; perhaps the Netflix set can live in a snapshot too.
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878364#comment-13878364 ] Sean Owen commented on MAHOUT-1400: --- We don't automatically have permission to redistribute any dataset, even if it's still distributed publicly. In this case, I'm sure Netflix won't or can't grant that permission anyway. I would not host this or any data set via Apache unless it's clearly public domain, licensed appropriately (AL2 or appropriate Creative Commons) or permission has been given explicitly.
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878366#comment-13878366 ] Sebastian Schelter commented on MAHOUT-1400: Completely agree to that.
[jira] [Issue Comment Deleted] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi updated MAHOUT-1400: -- Comment: was deleted (was: Do we have a copy of the dataset someplace, we could add a reference to that in the script?)
[jira] [Commented] (MAHOUT-1400) Remove references to deprecated and removed algorithms from examples scripts
[ https://issues.apache.org/jira/browse/MAHOUT-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878370#comment-13878370 ] Suneel Marthi commented on MAHOUT-1400: --- +1 to Sean's comment.