[ https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141321#comment-13141321 ]
Grant Ingersoll commented on MAHOUT-857:
----------------------------------------

Here's the conf. matrix I'm getting, which clearly points to some idiocy on my part:

{quote}
7532 test files
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances   :  374    4.9655%
Incorrectly Classified Instances : 7158   95.0345%
Total Classified Instances       : 7532
=======================================================
Confusion Matrix
-------------------------------------------------------
   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   <--Classified as
 123   0   1   1   1   2   6  19   2   2   5  23  27   8  53   3  14  17  12   0   0  |  319  a = alt.atheism
  55  16  28  14  80  24   3   8   4   3   8  86  27  28   0   2   3   0   0   0   0  |  389  b = comp.graphics
  38 171  57  14  49   5   3   6   2   4   3  25   7   6   1   1   0   2   0   0   0  |  394  c = comp.os.ms-windows.misc
  10  14 237  18  17  15   2   7   4   0   2  54   7   4   0   0   0   1   0   0   0  |  392  d = comp.sys.ibm.pc.hardware
  20  10  55 159  17  20   7  11   5   0   1  63  13   2   0   1   0   1   0   0   0  |  385  e = comp.sys.mac.hardware
  11  25   5   0 306  13   3   1   0   5   2  13   5   6   0   0   0   0   0   0   0  |  395  f = comp.windows.x
   2   1  23  14   6 310   1   3   3   1   1  10   6   5   0   3   0   1   0   0   0  |  390  g = misc.forsale
   8   1   6   2   9  11 270  15  10   3   3  37  11   4   0   2   0   4   0   0   0  |  396  h = rec.autos
   7   0   1   1   8   6  14 326   1   0   1  12  17   3   1   0   0   0   0   0   0  |  398  i = rec.motorcycles
  17   1   2   1   2   5   2   7 295  26   1  16  12   2   0   2   3   3   0   0   0  |  397  j = rec.sport.baseball
   6   1   0   0   1   3   3   6  55 291   1   7   4  14   2   4   1   0   0   0   0  |  399  k = rec.sport.hockey
  22   2   0   3   5   3   0   3   2   1 293  24  12   7   0   4   2  13   0   0   0  |  396  l = sci.crypt
  25   6  23  13  15  11  10  18   4   3  13 212  18  16   2   1   1   2   0   0   0  |  393  m = sci.electronics
  14   4   5   2   5   7   2  17   7   3   0  38 268  11   4   3   4   2   0   0   0  |  396  n = sci.med
  22   1   0   1   3   4   0   8   1   4   2  34  26 279   0   2   2   5   0   0   0  |  394  o = sci.space
  43   1   2   4   0   4   1  11   4   1   0   9  33   8 249   2   5  14   7   0   0  |  398  p = soc.religion.christian
  21   0   0   1   3   3   2  12   6   2   3  10  16   5   1 235   4  40   0   0   0  |  364  q = talk.politics.guns
  41   0   0   2   1   1   5   3   3   7   0  10  12   5   1   8 250  27   0   0   0  |  376  r = talk.politics.mideast
  34   0   0   1   2   4   3  16   2   1   5  14  12   6   4  67   8 131   0   0   0  |  310  s = talk.politics.misc
  50   0   0   1   2   0   1  15   7   0   3  11  21   7  53  17   6  19  38   0   0  |  251  t = talk.religion.misc
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  |    0  u = DEFAULT
Default Category: DEFAULT: 20
{quote}

> Rework 20 NewsGroup shell script example to include SGD Example
> ---------------------------------------------------------------
>
>                 Key: MAHOUT-857
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-857
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>         Attachments: MAHOUT-857.patch
>
>
> We have build-20news-bayes.sh that runs our NB stuff on 20 news groups. We
> also have an SGD example that works on 20 news groups, but no script to run
> it. I'm going to rename build-20news-bayes.sh to classify-20news.sh and
> incorporate the two.
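
As a sanity check on the summary figures in the comment above, here is a minimal Python sketch (not part of Mahout; `MATRIX_TEXT`, `parse_matrix`, and the other names are illustrative assumptions) that recomputes the totals from the confusion matrix. It also sums the cells one column to the left of the diagonal. That second sum is only an exploratory look at where the counts concentrate, not a diagnosis of the cause.

```python
# Confusion matrix copied from the comment: rows are the true labels (a..u),
# columns are the labels the classifier assigned (a..u).
MATRIX_TEXT = """
123 0 1 1 1 2 6 19 2 2 5 23 27 8 53 3 14 17 12 0 0
55 16 28 14 80 24 3 8 4 3 8 86 27 28 0 2 3 0 0 0 0
38 171 57 14 49 5 3 6 2 4 3 25 7 6 1 1 0 2 0 0 0
10 14 237 18 17 15 2 7 4 0 2 54 7 4 0 0 0 1 0 0 0
20 10 55 159 17 20 7 11 5 0 1 63 13 2 0 1 0 1 0 0 0
11 25 5 0 306 13 3 1 0 5 2 13 5 6 0 0 0 0 0 0 0
2 1 23 14 6 310 1 3 3 1 1 10 6 5 0 3 0 1 0 0 0
8 1 6 2 9 11 270 15 10 3 3 37 11 4 0 2 0 4 0 0 0
7 0 1 1 8 6 14 326 1 0 1 12 17 3 1 0 0 0 0 0 0
17 1 2 1 2 5 2 7 295 26 1 16 12 2 0 2 3 3 0 0 0
6 1 0 0 1 3 3 6 55 291 1 7 4 14 2 4 1 0 0 0 0
22 2 0 3 5 3 0 3 2 1 293 24 12 7 0 4 2 13 0 0 0
25 6 23 13 15 11 10 18 4 3 13 212 18 16 2 1 1 2 0 0 0
14 4 5 2 5 7 2 17 7 3 0 38 268 11 4 3 4 2 0 0 0
22 1 0 1 3 4 0 8 1 4 2 34 26 279 0 2 2 5 0 0 0
43 1 2 4 0 4 1 11 4 1 0 9 33 8 249 2 5 14 7 0 0
21 0 0 1 3 3 2 12 6 2 3 10 16 5 1 235 4 40 0 0 0
41 0 0 2 1 1 5 3 3 7 0 10 12 5 1 8 250 27 0 0 0
34 0 0 1 2 4 3 16 2 1 5 14 12 6 4 67 8 131 0 0 0
50 0 0 1 2 0 1 15 7 0 3 11 21 7 53 17 6 19 38 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"""


def parse_matrix(text):
    """Turn whitespace-separated rows of counts into a list of integer lists."""
    return [[int(tok) for tok in line.split()] for line in text.strip().splitlines()]


matrix = parse_matrix(MATRIX_TEXT)
n = len(matrix)

# Total instances and the diagonal (true label == assigned label).
total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(n))
accuracy = 100.0 * correct / total

# Exploratory: counts sitting one column to the left of the diagonal.
shifted = sum(matrix[i][i - 1] for i in range(1, n))

print(f"total={total} correct={correct} accuracy={accuracy:.4f}%")
print(f"one-left-of-diagonal={shifted} ({100.0 * shifted / total:.2f}%)")
```

The diagonal sum reproduces the 374 correct out of 7532 reported in the summary. The cells one column left of the diagonal hold far more instances than the diagonal itself, which would be consistent with the assigned label index being off by one from the true label, although the comment does not confirm a cause.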