On Sep 28, 2010, at 2:54 PM, Ted Dunning wrote:

> Neil,
>
> That example should be updated to the current trunk version of the software.
> That isn't likely to happen right away, so you should
> adapt the procedures.
>
> On Tue, Sep 28, 2010 at 10:49 AM, Neil Ghosh <[email protected]> wrote:
> Hi Grant ,
>
> I am trying to run the classification example in
>
> http://www.ibm.com/developerworks/java/library/j-mahout/
>
> doing the step 3. ant install
>
> We don't use ant any more.

I used Ant to build/run the examples. The examples came w/ Mahout already built, so no need for Maven for the examples.

> You should use 'mvn install' here instead. Make sure you have checked out
> the trunk version of the software.
>
> However it is trying to download the 2GB file , I might run out of space in
> my linux partition , also download may be disturbed in my connection .
>
> Yes. These could happen. IF this is a problem, you might want to invest a
> tiny amount of money to rent an EC2 machine for a few hours. This literally
> will be less than a dollar, even if you have to go through the process
> several times.

Yes, it is going to get the Wikipedia data set. It expands to about 10GB, if I recall.

> Yes
> is there any way I can test the example in a smaller set of wikipedia data
> or download the data offline ?
>
> Sure. Try the 20newsgroups examples.

Yep, the principles here are the same. For the Wikipedia example, all I did was classify into Democrats and Republicans, but the underlying process really is no different.

> Also, you can download the wikipedia test data any way you like.

--------------------------
Grant Ingersoll
http://lucenerevolution.org
Apache Lucene/Solr Conference, Boston Oct 7-8
