I ran the same command as you and got this error. I don't know why.
10/08/06 11:46:05 ERROR driver.MahoutDriver: MahoutDriver failed with args:
[fpg, -i, accidents, -o, pattern, -k, 50, -method, mapreduce, -g, 20,
-regex, [ ], -s, 2]
null
Exception in thread "main" java.lang.NullPointerException
at java.util.Properties$LineReader.readLine(Properties.java:418)
at java.util.Properties.load0(Properties.java:337)
at java.util.Properties.load(Properties.java:325)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:98)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
-----Original Message-----
From: Ankur C. Goel [mailto:[email protected]]
Sent: August 4, 2010 18:02
To: [email protected]
Subject: Re: Error: Java heap space when running FPGrowth
Hi tanweiguo,
Which version of hadoop are you using? I ran the example on the
hadoop 0.20.2 release on a single-node cluster using the mahout binary:
$MAHOUT_INSTALL_DIR/bin/mahout fpg -i accidents -o pattern -k 50 -method
mapreduce -g 20 -regex [\ ] -s 2
and it worked for me.
In my single node setup, mapred.child.java.opts="-server -Xmx768m
-Djava.net.preferIPv4Stack=true"
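For readers of the archive: on Hadoop 0.20.x that same setting can be made cluster-wide in conf/mapred-site.xml (property name as quoted above; the value shown is simply what worked on the single-node box, not a recommendation):

```xml
<!-- conf/mapred-site.xml: JVM options passed to every child task JVM.
     Raise -Xmx if map/reduce tasks hit OutOfMemoryError. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-server -Xmx768m -Djava.net.preferIPv4Stack=true</value>
</property>
```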
Not sure if there is a way exposed to control the parallelism. Robin ?
-...@nkur
On 8/4/10 1:18 PM, "tanweiguo" <[email protected]> wrote:
I just followed the wiki to test FPGrowth:
https://cwiki.apache.org/MAHOUT/parallel-frequent-pattern-mining.html
1. Unzip accidents.dat.gz and put it into the HDFS accidents folder.
2. Run on a hadoop cluster (1 master and 3 slaves):
hadoop jar mahout-examples-0.3.job
org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver \
-i accidents \
-o patterns \
-k 50 \
-method mapreduce \
-g 10 \
-regex [\ ] \
-s 2
The first two MapReduce jobs (Parallel Counting Driver running over input:
accidents; PFP Transaction Sorting running over input: accidents) succeed.
However, the third MapReduce job (PFP Growth Driver running over input:
patterns/sortedoutput) always fails with this error message:
10/08/04 15:23:45 INFO input.FileInputFormat: Total input paths to
process : 1
10/08/04 15:23:46 INFO mapred.JobClient: Running job:
job_201007271506_0025
10/08/04 15:23:47 INFO mapred.JobClient: map 0% reduce 0%
10/08/04 15:24:05 INFO mapred.JobClient: map 13% reduce 0%
10/08/04 15:24:08 INFO mapred.JobClient: map 22% reduce 0%
10/08/04 15:24:11 INFO mapred.JobClient: map 24% reduce 0%
10/08/04 15:24:29 INFO mapred.JobClient: map 0% reduce 0%
10/08/04 15:24:31 INFO mapred.JobClient: Task Id :
attempt_201007271506_0025_m_000000_0, Status : FAILED
Error: java.lang.OutOfMemoryError: Java heap space
at org.apache.mahout.fpm.pfpgrowth.TransactionTree.resize(TransactionTree.java:446)
at org.apache.mahout.fpm.pfpgrowth.TransactionTree.createNode(TransactionTree.java:409)
at org.apache.mahout.fpm.pfpgrowth.TransactionTree.addPattern(TransactionTree.java:202)
at org.apache.mahout.fpm.pfpgrowth.TransactionTree.getCompressedTree(TransactionTree.java:285)
at org.apache.mahout.fpm.pfpgrowth.ParallelFPGrowthCombiner.reduce(ParallelFPGrowthCombiner.java:51)
at org.apache.mahout.fpm.pfpgrowth.ParallelFPGrowthCombiner.reduce(ParallelFPGrowthCombiner.java:33)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1214)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1227)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:648)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1135)
The parameter mapred.child.java.opts is set to -Xmx512m in my cluster.
I also tried -g 5 and -g 20; both failed with the same error message.
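Editor's note, in case it helps: if FPGrowthDriver parses its arguments through Hadoop's GenericOptionsParser (an assumption here; I have not checked the Mahout 0.3 source), the child heap could be raised for this one job without touching the cluster config:

```shell
# Assumes the driver honours generic -D options (GenericOptionsParser);
# if not, mapred.child.java.opts must be changed in mapred-site.xml instead.
hadoop jar mahout-examples-0.3.job \
  org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver \
  -Dmapred.child.java.opts=-Xmx1024m \
  -i accidents -o patterns -k 50 -method mapreduce -g 10 -regex [\ ] -s 2
```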
Another question: I see there is only one mapper. Which parameter should I
adjust to get more mappers and improve speed?
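Editor's note on the mapper count (an observation, not something the thread confirms): with FileInputFormat the number of map tasks equals the number of input splits, so a single input file no larger than one HDFS block yields exactly one mapper. On Hadoop 0.20.x with the new API, capping the maximum split size forces more splits, for example:

```shell
# mapred.max.split.size caps the split size in bytes for the new-API
# FileInputFormat; 8 MB here is an illustrative value. As above, this
# only takes effect if the driver passes -D options through
# GenericOptionsParser.
hadoop jar mahout-examples-0.3.job \
  org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver \
  -Dmapred.max.split.size=8388608 \
  -i accidents -o patterns -k 50 -method mapreduce -g 10 -regex [\ ] -s 2
```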