Sam,

Per Ted's email below, please run with trunk for your work. Please look at Chapters 13-16 of the Mahout in Action book for sample code snippets for classifying 20 newsgroups with SGD. As far as I know (I could be wrong), there is presently no command-line option for running the 20 newsgroups example with SGD.
The only command-line tools for SGD, trainlogistic and runlogistic, expect the input files to be in CSV format, which is not what you have. I have a sample program for qualifying datasets (similar to the format you have) using SGD which I can share with you later today.

Regards,
Suneel

________________________________
From: Ted Dunning <ted.dunn...@gmail.com>
To: user@mahout.apache.org
Sent: Saturday, December 10, 2011 3:20 AM
Subject: Re: PLEASE HELP! - MAHOUT CLASSIFICATION

a) run with trunk
b) see https://github.com/tdunning/Chapter-16
c) also see org.apache.mahout.classifier.sgd.TrainNewsGroups

Your training data is tiny. The bayes classifiers are designed for large data. Poor results are not very surprising at this data size.

On Fri, Dec 9, 2011 at 8:03 PM, Sam Cunningham <sam_cun...@yahoo.com> wrote:

> I am running Mahout distribution v0.5. Though, I am not sure what difference
> would that make? I ran my dataset with bayes/cbayes only. I don't have any
> sample code for SGD or its command option. Is there any SGD example for 20news
> dataset so that I can follow (for training and testing)?