It's MAHOUT_OPTS, not MAHOUT_OPTIONS. Read bin/mahout. There's also a JAVA_HEAP_MAX that is set from MAHOUT_HEAPSIZE.
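For illustration, here is a minimal sketch of how a launcher script can turn a heap-size variable into a JVM flag. It is not copied from bin/mahout: the function name heap_flag and the 1000 MB fallback are assumptions made up for this example, so check your own copy of the script for the real default and the real variable handling.

```shell
#!/bin/sh
# Sketch only: mimics the common pattern of deriving -Xmx from a
# heap-size variable given in megabytes. heap_flag and the 1000 MB
# fallback are hypothetical, not taken from bin/mahout.
heap_flag() {
  size_mb="$1"
  if [ -n "$size_mb" ]; then
    # A value was supplied, so build the flag from it.
    printf '%s\n' "-Xmx${size_mb}m"
  else
    # Nothing supplied, so fall back to the assumed default.
    printf '%s\n' "-Xmx1000m"
  fi
}

# Under this pattern, MAHOUT_HEAPSIZE=24000 would become -Xmx24000m.
heap_flag "${MAHOUT_HEAPSIZE:-}"
```

If a flag built this way and another -Xmx from MAHOUT_OPTS both end up on the java command line, which one takes effect depends on their order, so it is worth looking at the command line that actually gets run.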
Try running the script with "sh -x bin/mahout". This will show you the actual command line. Or 'ps -ef | cat'.

On Fri, May 4, 2012 at 4:49 PM, Ryan Rosario <[email protected]> wrote:
> Hi,
>
> I am trying to follow along with the 20 newsgroups example but using
> my own data. I am running the examples on a server with 24GB of RAM
> and 24 cores. When I get to the "Computing TF-IDF" stage, the whole
> process fails with the following exception. I have 14000 documents and
> 2 classes. The lexicon consists of 2705284 trigrams which I created
> myself. I then set the ng parameter equal to 1 since I already
> tokenized the words myself.
>
> The system at max has only been using 4-5GB total, and I have set
> MAHOUT_OPTIONS=-Xmx4g, MAHOUT_HEAPSIZE=24000,
> mapred.map.child.java.opts=-Xmx24g just to see if I could get Mahout
> to acknowledge the increase in heap space, but this does not seem to
> be helping at all.
>
> What else can I try to get past this problem? The system has plenty of RAM.
>
> Thanks,
> Ryan
>
> ./bin/mahout trainclassifier -i /user/ryan/pageclass-train -o pageclass-out -type cbayes -ng 1 -source
>
> ....
> 12/05/04 15:52:43 INFO cbayes.CBayesDriver: Calculating Tf-Idf...
> 12/05/04 15:52:46 INFO common.BayesTfIdfDriver: Counts of documents in Each Label
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>     at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>     at java.lang.StringCoding.encode(StringCoding.java:272)
>     at java.lang.String.getBytes(String.java:946)
>     at org.apache.hadoop.io.DefaultStringifier.fromString(DefaultStringifier.java:73)
>     at org.apache.mahout.classifier.bayes.mapreduce.common.BayesTfIdfDriver.runJob(BayesTfIdfDriver.java:88)
>     at org.apache.mahout.classifier.bayes.mapreduce.cbayes.CBayesDriver.runJob(CBayesDriver.java:51)
>     at org.apache.mahout.classifier.bayes.TrainClassifier.trainCNaiveBayes(TrainClassifier.java:58)
>     at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:151)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> --
> RRR

--
Lance Norskog
[email protected]
