I don't have any more ideas. Good luck!
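
For anyone following the thread: the open question below is whether the JVM that throws the OutOfMemoryError ever received the larger -Xmx value. A minimal sketch of one way to check that, assuming a standalone class (HeapCheck is a hypothetical helper written for this note, not part of Mahout or Hadoop). It reports the heap ceiling of the JVM it runs in and converts -Xmx strings such as the two quoted below into megabytes for comparison.

```java
import java.util.Locale;

// HeapCheck: hypothetical helper for illustration; not part of Mahout or Hadoop.
public class HeapCheck {

    /** Maximum heap the current JVM will attempt to use, in megabytes. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Converts an option such as "-Xmx24g" or "-Xmx24000m" to megabytes. */
    public static long parseXmxMb(String opt) {
        if (opt == null || !opt.startsWith("-Xmx") || opt.length() <= 4) {
            throw new IllegalArgumentException("Not an -Xmx option: " + opt);
        }
        String value = opt.substring(4).toLowerCase(Locale.ROOT);
        char unit = value.charAt(value.length() - 1);
        long bytes;
        if (Character.isDigit(unit)) {
            // No suffix: the value is already in bytes.
            bytes = Long.parseLong(value);
        } else {
            long n = Long.parseLong(value.substring(0, value.length() - 1));
            switch (unit) {
                case 'k': bytes = n * 1024L; break;
                case 'm': bytes = n * 1024L * 1024L; break;
                case 'g': bytes = n * 1024L * 1024L * 1024L; break;
                default:
                    throw new IllegalArgumentException("Unknown unit in: " + opt);
            }
        }
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Prints the ceiling for this JVM only; child JVMs are configured separately.
        System.out.println("Max heap (MB): " + maxHeapMb());
        for (String opt : args) {
            System.out.println(opt + " = " + parseXmxMb(opt) + " MB");
        }
    }
}
```

Running it as "java -Xmx24g HeapCheck" shows what a JVM started with that flag reports. The reported figure can be somewhat lower than the requested value, so treat it as a rough comparison. It says nothing about the JVM that bin/mahout launches. For that, the "sh -x bin/mahout" output suggested below is still the thing to read.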

On Fri, May 4, 2012 at 7:37 PM, Ryan Rosario <[email protected]> wrote:
> Pseudo-distributed.
>
> Yes, it is 64 bit Java:
>
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
>
>
> On Fri, May 4, 2012 at 7:06 PM, Lance Norskog <[email protected]> wrote:
>> Is this pseudo-distributed or on a cluster? I don't know how to get
>> Java options out to a cluster.
>>
>> Also, an annoying question: is this a 64-bit Java?
>>
>> On Fri, May 4, 2012 at 6:36 PM, Ryan Rosario <[email protected]> wrote:
>>> Setting those options makes no difference. TFIDF calculation
>>> immediately stops with an OutOfMemory error. It doesn't even try.
>>> The system only uses about 1-4GB. The only way to get anything to run
>>> is to use something like <= 100 documents, which does not help me.
>>>
>>> MAHOUT_OPTS=-Xmx24g
>>> JAVA_HEAP_MAX=-Xmx24000m
>>>
>>> R.
>>>
>>>
>>> On Fri, May 4, 2012 at 6:13 PM, Lance Norskog <[email protected]> wrote:
>>>> It's MAHOUT_OPTS. Read bin/mahout. There's also a JAVA_HEAP_MAX that
>>>> is set from MAHOUT_HEAPSIZE.
>>>>
>>>> Try running the script with "sh -x bin/mahout". This will show you the
>>>> actual command line. Or 'ps -ef | cat'.
>>>>
>>>> On Fri, May 4, 2012 at 4:49 PM, Ryan Rosario <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> I am trying to follow along with the 20 newsgroups example but using
>>>>> my own data. I am running the examples on a server with 24GB of RAM
>>>>> and 24 cores. When I get to the "Computing TF-IDF" stage, the whole
>>>>> process fails with the following exception. I have 14000 documents and
>>>>> 2 classes. The lexicon consists of 2705284 trigrams which I created
>>>>> myself. I then set the ng parameter equal to 1 since I already
>>>>> tokenized the words myself.
>>>>>
>>>>> The system at max has only been using 4-5GB total, and I have set
>>>>> MAHOUT_OPTIONS=-Xmx4g, MAHOUT_HEAPSIZE=24000,
>>>>> mapred.map.child.java.opts=-Xmx24g just to see if I could get Mahout
>>>>> to acknowledge the increase in heap space, but this does not seem to
>>>>> be helping at all.
>>>>>
>>>>> What else can I try to get past this problem? The system has plenty of 
>>>>> RAM.
>>>>>
>>>>> Thanks,
>>>>> Ryan
>>>>>
>>>>> ./bin/mahout trainclassifier -i /user/ryan/pageclass-train -o pageclass-out -type cbayes -ng 1 -source
>>>>>
>>>>> ....
>>>>> 12/05/04 15:52:43 INFO cbayes.CBayesDriver: Calculating Tf-Idf...
>>>>> 12/05/04 15:52:46 INFO common.BayesTfIdfDriver: Counts of documents in Each Label
>>>>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>>>>        at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>>>>>        at java.lang.StringCoding.encode(StringCoding.java:272)
>>>>>        at java.lang.String.getBytes(String.java:946)
>>>>>        at org.apache.hadoop.io.DefaultStringifier.fromString(DefaultStringifier.java:73)
>>>>>        at org.apache.mahout.classifier.bayes.mapreduce.common.BayesTfIdfDriver.runJob(BayesTfIdfDriver.java:88)
>>>>>        at org.apache.mahout.classifier.bayes.mapreduce.cbayes.CBayesDriver.runJob(CBayesDriver.java:51)
>>>>>        at org.apache.mahout.classifier.bayes.TrainClassifier.trainCNaiveBayes(TrainClassifier.java:58)
>>>>>        at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:151)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>>>        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>>>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>
>>>>>
>>>>> --
>>>>> RRR
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> [email protected]
>>>
>>>
>>>
>>> --
>>> RRR
>>
>>
>>
>> --
>> Lance Norskog
>> [email protected]
>
>
>
> --
> RRR



-- 
Lance Norskog
[email protected]
