Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan
Strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops at chunk #571 (571*64MB = ~36.5GB). I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout. I have tested various parameters on 16GB of RAM, for example:

  <property>
    <name>mapred.map.child.java.opts</name>
    ...
  </property>
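
Note that wikipediaXMLSplitter likely runs in the client JVM launched by bin/mahout, so -Xmx values set in mapred-site.xml may never reach it. A minimal sketch of raising the client heap instead (the stock bin/mahout script reads MAHOUT_HEAPSIZE, in MB; 4096 is only an example value):

  # give the client JVM a larger heap before rerunning the splitter
  export MAHOUT_HEAPSIZE=4096
  $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
    -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml \
    -o wikipedia/chunks -c 64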

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan
I am pretty sure that there is something wrong with Hadoop/Mahout/Java. With any configuration, it gets stuck at chunk #571. The previous chunks are created rapidly, but it waits for about 30 minutes on #571, and that is where the heap size error occurs. I will try to submit a bug report.
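
While it sits on chunk #571, a thread dump should show where it is waiting. A quick sketch (assuming jstack from the same JDK and a single splitter process running):

  # dump the stalled JVM's threads to a file for inspection
  jstack $(pgrep -f wikipediaXMLSplitter) > splitter-threads.txt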

Solving heap size error

2014-03-11 Thread Mahmood Naderan
Hi, Recently I have faced a heap size error when I run:

  $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
    -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml \
    -o wikipedia/chunks -c 64

Here are the specs:
1- XML file size = 44GB
2- System memory = 54GB (on VirtualBox)
3- Heap size = 51GB
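
As a sanity check, the JVM's effective maximum heap for a given -Xmx can be confirmed on HotSpot with:

  # prints the MaxHeapSize the JVM actually settles on
  java -Xmx51g -XX:+PrintFlagsFinal -version | grep MaxHeapSize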

Re: Solving heap size error

2014-03-11 Thread Mahmood Naderan
As I posted earlier, here is the result of a successful test: a 5.4GB XML file (which is larger than enwiki-latest-pages-articles10.xml) with 4GB of RAM and -Xmx128m took 5 minutes to complete. I didn't find a larger Wikipedia XML file, so I still need to test 10GB, 20GB and 30GB files; see the sketch below.

Regards,
Mahmood
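
P.S. For the intermediate sizes, a loop like this would time each run (the file names are placeholders for whatever 10/20/30GB test files turn up):

  # time the splitter on progressively larger inputs
  for f in test-10g.xml test-20g.xml test-30g.xml; do
    time $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
      -d $f -o wikipedia/chunks-$(basename $f .xml) -c 64
  done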