Don't have any more ideas, good luck!

On Fri, May 4, 2012 at 7:37 PM, Ryan Rosario <[email protected]> wrote:
> Pseudo-distributed.
>
> Yes, it is 64 bit Java:
>
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
>
>
> On Fri, May 4, 2012 at 7:06 PM, Lance Norskog <[email protected]> wrote:
>> Is this pseudo-distributed or on a cluster? I don't know how to get
>> Java options out to a cluster.
>>
>> Also, an annoying question: is this a 64-bit Java?
>>
>> On Fri, May 4, 2012 at 6:36 PM, Ryan Rosario <[email protected]> wrote:
>>> Setting those options makes no difference. TFIDF calculation
>>> immediately stops with an OutOfMemory error. It doesn't even try.
>>> The system only uses about 1-4GB. The only way to get anything to run
>>> is to use something like <= 100 documents, which does not help me.
>>>
>>> MAHOUT_OPTS=-Xmx24g
>>> JAVA_HEAP_MAX=-Xmx24000m
>>>
>>> R.
>>>
>>>
>>> On Fri, May 4, 2012 at 6:13 PM, Lance Norskog <[email protected]> wrote:
>>>> It's MAHOUT_OPTS. Read bin/mahout. There's also a JAVA_HEAP_MAX that
>>>> uses Heapsize.
>>>>
>>>> Try running the script with "sh -x bin/mahout". This will show you the
>>>> actual command line. Or 'ps -ef | cat'.
>>>>
>>>> On Fri, May 4, 2012 at 4:49 PM, Ryan Rosario <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> I am trying to follow along with the 20 newsgroups example but using
>>>>> my own data. I am running the examples on a server with 24GB of RAM
>>>>> and 24 cores. When I get to the "Computing TF-IDF" stage, the whole
>>>>> process fails with the following exception. I have 14000 documents and
>>>>> 2 classes. The lexicon consists of 2705284 trigrams which I created
>>>>> myself. I then set the ng parameter equal to 1 since I already
>>>>> tokenized the words myself.
>>>>>
>>>>> The system at max has only been using 4-5GB total, and I have set
>>>>> MAHOUT_OPTIONS=-Xmx4g, MAHOUT_HEAPSIZE=24000,
>>>>> mapred.map.child.java.opts=-Xmx24g just to see if I could get Mahout
>>>>> to acknowledge the increase in heap space, but this does not seem to
>>>>> be helping at all.
>>>>>
>>>>> What else can I try to get past this problem? The system has plenty of
>>>>> RAM.
>>>>>
>>>>> Thanks,
>>>>> Ryan
>>>>>
>>>>> ./bin/mahout trainclassifier -i /user/ryan/pageclass-train -o pageclass-out -type cbayes -ng 1 -source
>>>>>
>>>>> ....
>>>>> 12/05/04 15:52:43 INFO cbayes.CBayesDriver: Calculating Tf-Idf...
>>>>> 12/05/04 15:52:46 INFO common.BayesTfIdfDriver: Counts of documents in Each Label
>>>>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>>>>     at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>>>>>     at java.lang.StringCoding.encode(StringCoding.java:272)
>>>>>     at java.lang.String.getBytes(String.java:946)
>>>>>     at org.apache.hadoop.io.DefaultStringifier.fromString(DefaultStringifier.java:73)
>>>>>     at org.apache.mahout.classifier.bayes.mapreduce.common.BayesTfIdfDriver.runJob(BayesTfIdfDriver.java:88)
>>>>>     at org.apache.mahout.classifier.bayes.mapreduce.cbayes.CBayesDriver.runJob(CBayesDriver.java:51)
>>>>>     at org.apache.mahout.classifier.bayes.TrainClassifier.trainCNaiveBayes(TrainClassifier.java:58)
>>>>>     at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:151)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>
>>>>>
>>>>> --
>>>>> RRR
>>>>>
>>>>>
>>>>> --
>>>>> RRR
>>>>
>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> [email protected]
>>>
>>>
>>>
>>> --
>>> RRR
>>
>>
>>
>> --
>> Lance Norskog
>> [email protected]
>
>
>
> --
> RRR

--
Lance Norskog
[email protected]
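
One observation on the trace quoted in this thread: the OutOfMemoryError is raised in thread "main", underneath org.apache.hadoop.util.RunJar and BayesTfIdfDriver.runJob. That suggests the JVM running out of heap is the one that launches the job, not a map task child, so mapred.map.child.java.opts would probably not affect it. Below is a minimal sketch for checking which -Xmx flags actually reach that JVM, along the lines of the "sh -x" and "ps -ef" suggestion. The helper name xmx_flags is made up for illustration, and matching the process on "RunJar" is an assumption about how the launching JVM appears in ps output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout or Hadoop): read a command line
# on stdin and print every -Xmx flag it contains, one per line.
xmx_flags() {
    # Split on spaces so each argument is on its own line, then keep
    # only the arguments that start with -Xmx. "|| true" keeps the exit
    # status zero when no flag is present.
    tr ' ' '\n' | grep -e '^-Xmx' || true
}

# Example with a made-up command line; prints -Xmx1000m and then -Xmx4g.
echo "java -Xmx1000m -Dfoo=bar -Xmx4g -classpath x org.apache.hadoop.util.RunJar" | xmx_flags

# Against a live run (assumption: the launching JVM has RunJar in its arguments):
# ps -ef | grep '[R]unJar' | xmx_flags
```

If the value printed for the live process is not the one that was exported, the variable being set is not the one the launching script reads, which narrows the search to that script.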
