[ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289373#comment-13289373 ]

Robin Anil commented on MAHOUT-939:
-----------------------------------

Verified that SGD works as well.
encoding time: 200 sec
training time: 510 sec
testing time: 4 sec

Running SGD Training
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/06/05 14:29:01 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainASFEmail.props found on classpath, will use command-line arguments only
12/06/05 14:29:01 INFO common.AbstractJob: Command line arguments: {--cardinality=[100000], --categories=[2], --endPhase=[2147483647], --input=[/tmp/mahout-asf/classification/sgd/splits/mapRedOut/], --output=[/tmp/mahout-asf/classification/sgd/models], --poolSize=[5], --startPhase=[0], --tempDir=[temp], --threads=[20]}
2012-06-05 14:29:01.949 java[91795:1903] Unable to load realm info from SCDynamicStore
159915 training files
0.00    0.00    0.00    0.00    0.0000000       0.0000000       1       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       2       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       3       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       4       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       6       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       8       0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       10      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       12      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       15      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       20      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       25      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       30      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       40      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       50      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       60      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       70      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       80      0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       100     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       120     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       140     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       150     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       200     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       250     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       300     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       400     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       500     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       600     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       700     0.000   0.00    none
0.00    0.00    0.00    0.00    0.0000000       0.0000000       800     0.000   0.00    none
0.00    290.00  284.00  0.00    0.079926227     1.0003742e-08   1000    -0.692  93.38   none
0.00    290.00  284.00  0.00    0.079926227     1.0003742e-08   1200    -0.692  93.38   none
0.00    290.00  284.00  0.00    0.079926227     1.0003742e-08   1400    -0.692  93.38   none
0.00    290.00  284.00  0.00    0.079926227     1.0003742e-08   1500    -0.692  93.38   none
0.00    286.00  500.00  0.01    0.079926227     1.0000000e-08   2000    -0.692  94.13   none
0.00    207.00  555.00  0.01    0.079926227     1.0000000e-08   2500    -0.692  94.13   none
0.00    207.00  555.00  0.01    0.079926227     1.0000000e-08   3000    -0.692  94.13   none
0.00    89.00   196.00  0.01    0.079926227     1.0000000e-08   4000    -0.692  94.23   none
0.00    74.00   244.00  0.01    0.079926264     1.0000000e-08   5000    -0.691  94.88   none
0.00    74.00   998.00  0.01    0.079955672     1.0000000e-08   6000    -0.691  93.95   none
0.00    70.00   1127.00 0.01    0.079955672     1.0000000e-08   7000    -0.691  93.80   none
0.00    70.00   2057.00 0.01    0.079955672     1.0000000e-08   8000    -0.691  93.73   none
0.00    70.00   630.00  0.01    0.079955672     1.0000000e-08   10000   -0.691  94.08   none
0.00    70.00   365.00  0.01    0.079955672     1.0000000e-08   12000   -0.691  93.99   none
0.00    66.00   674.00  0.01    0.079955672     1.0000000e-08   14000   -0.691  94.22   none
0.00    66.00   310.00  0.01    0.079955672     1.0000000e-08   15000   -0.691  94.25   none
0.00    66.00   449.00  0.01    0.079955672     1.0000000e-08   20000   -0.691  94.31   none
0.00    65.00   418.00  0.01    0.079955672     1.0000000e-08   25000   -0.691  94.26   none
0.00    63.00   409.00  0.01    0.079955672     1.0000000e-08   30000   -0.691  94.41   none
0.00    61.00   461.00  0.01    0.079955672     1.0000000e-08   40000   -0.691  94.55   none
0.00    61.00   855.00  0.01    0.079955672     1.0000000e-08   50000   -0.691  94.41   none
0.00    59.00   229.00  0.01    0.079955672     1.0000000e-08   60000   -0.691  93.88   none
0.00    59.00   211.00  0.01    0.079955672     1.0000000e-08   70000   -0.691  94.56   none
0.00    58.00   576.00  0.01    0.079955672     1.0000000e-08   80000   -0.691  93.99   none
0.00    55.00   250.00  0.01    0.079955672     1.0000000e-08   100000  -0.691  95.35   none
0.00    55.00   419.00  0.01    0.079955672     1.0000000e-08   120000  -0.691  94.26   none
0.00    55.00   94.00   0.01    0.079955672     1.0000000e-08   140000  -0.691  94.01   none
0.00    55.00   117.00  0.01    0.079955672     1.0000000e-08   150000  -0.691  93.45   none
exiting main, writing model to /tmp/mahout-asf/classification/sgd/models
Word counts
12/06/05 14:37:32 INFO driver.MahoutDriver: Program took 510979 ms (Minutes: 8.516316666666667)
Running Test
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/robinanil/mahout-revert/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/06/05 14:37:32 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TestASFEmail.props found on classpath, will use command-line arguments only
2012-06-05 14:37:33.401 java[93869:1903] Unable to load realm info from SCDynamicStore
40085 test files
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances          :      38346       95.6617%
Incorrectly Classified Instances        :       1739        4.3383%
Total Classified Instances              :      40085

=======================================================
Confusion Matrix
-------------------------------------------------------
a       b       <--Classified as
18683   1478     |  20161       a     = cocoon_apache_org
261     19663    |  19924       b     = commons_apache_org



Avg. Log-likelihood: -0.6909925745820731 25%-ile: -0.6923558418191567 75%-ile: -0.6915062054522141

12/06/05 14:37:37 INFO driver.MahoutDriver: Program took 4247 ms (Minutes: 0.07078333333333334)

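
For anyone who wants to cross-check the summary numbers, here is a minimal sketch that recomputes accuracy and per-class precision/recall from the confusion matrix above. It assumes rows are the actual class and columns are the predicted class (consistent with the row totals 20161 and 19924 printed in the log). It also assumes that "Avg. Log-likelihood" is the mean natural log of the probability assigned to the correct class; that reading is my interpretation, not something the log states. The function names are illustrative and not part of Mahout.

```python
import math

# Confusion matrix copied from the test output.
# Assumption: rows = actual class, columns = predicted class.
labels = ["cocoon_apache_org", "commons_apache_org"]
matrix = [
    [18683, 1478],   # actual a
    [261, 19663],    # actual b
]


def accuracy(m):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def recall(m, i):
    """Correct predictions for class i divided by actual instances of class i."""
    return m[i][i] / sum(m[i])


def precision(m, i):
    """Correct predictions for class i divided by all predictions of class i."""
    return m[i][i] / sum(row[i] for row in m)


def mean_correct_probability(avg_log_likelihood):
    """Geometric-mean probability implied by an average natural-log likelihood.

    Assumption: the reported value is a mean of ln(p_correct).
    """
    return math.exp(avg_log_likelihood)


if __name__ == "__main__":
    print(f"accuracy: {accuracy(matrix) * 100:.4f}%")
    for i, name in enumerate(labels):
        print(f"{name}: precision={precision(matrix, i):.4f} "
              f"recall={recall(matrix, i):.4f}")
    p = mean_correct_probability(-0.6909925745820731)
    print(f"implied geometric-mean probability of the correct class: {p:.4f}")
```

The recomputed accuracy agrees with the 95.6617% in the summary. Under the stated assumption, the average log-likelihood sits very close to ln(0.5), which would mean the model's probabilities are barely away from 0.5 even though the thresholded predictions are mostly correct.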
> ASF Email Classification Examples don't always produce good results
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-939
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-939
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.6
>            Reporter: Grant Ingersoll
>            Assignee: Robin Anil
>              Labels: MAHOUT_INTRO_CONTRIBUTE
>             Fix For: 0.8
>
>         Attachments: 939.patch, MAHOUT-939.patch, MAHOUT-939.patch, 
> MAHOUT-939.patch, asf_sample_list.txt, bayes.patch, strip_reject.patch
>
>
> The classification examples for the ASF email currently don't produce 
> good-quality results when there are more than a few labels.  We also need 
> to determine how much memory is required for vectors of cardinality 100K.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira