The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops
at chunk #571 (571*64MB = ~36.5GB).
I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout.
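For reference, this is how I pass a larger heap to the splitter run. It is only a
sketch and assumes the bin/mahout launcher script reads the MAHOUT_HEAPSIZE
environment variable (a value in MB) and turns it into an -Xmx flag for the client
JVM; check your copy of bin/mahout before relying on it:

# Sketch, assuming bin/mahout honors MAHOUT_HEAPSIZE (in MB) for the client JVM
export MAHOUT_HEAPSIZE=4096
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
  -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml \
  -o wikipedia/chunks -c 64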
I have tested various parameters on 16GB RAM, for example:

<property>
  <name>mapred.map.child.java.opts</name>
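For completeness, this is the shape of that setting in mapred-site.xml; the heap
value below is only an illustrative placeholder, not something I have verified to
fix the problem:

<!-- Illustrative sketch: JVM options for map-task child processes.
     The -Xmx1024m value is a placeholder, not a recommendation. -->
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>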
I am pretty sure that there is something wrong with Hadoop/Mahout/Java. With
any configuration, it gets stuck at chunk #571. The previous chunks are created
rapidly, but I see it wait for about 30 minutes on #571, and that is the reason
for the heap size error.
I will try to submit a bug report.
Hi,
Recently I have run into a heap size error when I run
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d
$MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o
wikipedia/chunks -c 64
Here are the specs:
1- XML file size = 44GB
2- System memory = 54GB (on virtualbox)
3- Heap size = 51GB
As I posted earlier, here is the result of a successful test: a 5.4GB XML file
(which is larger than enwiki-latest-pages-articles10.xml) with 4GB of RAM and
-Xmx128m took 5 minutes to complete.
I didn't find a larger Wikipedia XML file; I still need to test 10GB, 20GB, and
30GB files.
Regards,
Mahmood