Hi Grant,

I am trying to run the classification example in
http://www.ibm.com/developerworks/java/library/j-mahout/ and am at step 3:

    ant install

However, it tries to download a 2 GB file. I might run out of space on my
Linux partition, and the download may also be interrupted on my connection.
Is there any way I can test the example on a smaller set of Wikipedia data,
or download the data offline?

Thanks
Neil
http://neilghosh.com

On Mon, Sep 27, 2010 at 6:12 PM, Grant Ingersoll <[email protected]> wrote:

> On Sep 24, 2010, at 1:12 PM, Neil Ghosh wrote:
>
> > Are there any other examples/documents/references on how to use Mahout
> > for text classification?
> >
> > I went through and ran the following:
> >
> > 1. Wikipedia Bayes Example
> >    <https://cwiki.apache.org/MAHOUT/wikipedia-bayes-example.html> -
> >    Classify Wikipedia data.
> > 2. Twenty Newsgroups
> >    <https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html> -
> >    Classify the classic Twenty Newsgroups data.
> >
> > However, these two are not very definitive and there isn't much
> > explanation for the examples. Please share if there is more documentation.
>
> What kinds of problems are you looking to solve? In general, we don't have
> too much in the way of special things for text, other than various
> utilities for converting text into Mahout's vector format based on various
> weighting schemes. Both of those examples just take the text, convert it
> into vectors, and then either train or test on them. I would agree, though,
> that a good tutorial is needed. The following is a bit out of date in terms
> of the actual commands, but I believe the concepts are still accurate:
> http://www.ibm.com/developerworks/java/library/j-mahout/
>
> See
> https://cwiki.apache.org/confluence/display/MAHOUT/Mahout+Wiki#MahoutWiki-ImplementationBackground
> (and the creating vectors section). Also see the Algorithms section.
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8

--
Thanks and Regards
Neil
http://neilghosh.com
