Max Garmash created GIRAPH-1026:
-----------------------------------

             Summary: New Out-of-core mechanism does not work
                 Key: GIRAPH-1026
                 URL: https://issues.apache.org/jira/browse/GIRAPH-1026
             Project: Giraph
          Issue Type: Bug
            Reporter: Max Garmash


After the new out-of-core (OOC) mechanism was released, we tested it on our data and it failed.

Our environment:
4x (CPU 6 cores / 12 threads, RAM 64GB) 

We can successfully process about 75 million vertices.
With 100-120 million vertices the job fails as follows:

2015-08-04 12:35:21,000 INFO  [AMRM Callback Handler Thread] 
yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:onContainersCompleted(574)) - Got container 
status for containerID=container_1438068521412_0193_01_000005, state=COMPLETE, 
exitStatus=-104, diagnostics=Container 
[pid=6700,containerID=container_1438068521412_0193_01_000005] is running beyond 
physical memory limits. Current usage: 20.3 GB of 20 GB physical memory used; 
22.4 GB of 42 GB virtual memory used. Killing container.
Dump of the process-tree for container_1438068521412_0193_01_000005 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6704 6700 6700 6700 (java) 78760 20733 24033841152 5317812 java 
-Xmx20480M -Xms20480M -cp 
.:${CLASSPATH}:./*:$HADOOP_CLIENT_CONF_DIR:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*:$HADOOP_MAPRED_HOME/*:$HADOOP_MAPRED_HOME/lib/*:$MR2_CLASSPATH:./*:/etc/hadoop/conf.cloudera.yarn:/run/cloudera-scm-agent/process/264-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/lib/*::./*:/etc/hadoop/conf.cloudera.yarn:/run/cloudera-scm-agent/process/264-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/lib/*::./*:/etc/hadoop/conf.cloudera.yarn:/run/cloudera-scm-agent/process/264-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-
yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/lib/*:
 org.apache.giraph.yarn.GiraphYarnTask 1438068521412 193 5 1 
        |- 6700 6698 6700 6700 (bash) 0 0 14376960 433 /bin/bash -c java 
-Xmx20480M -Xms20480M -cp 
.:${CLASSPATH}:./*:$HADOOP_CLIENT_CONF_DIR:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*:$HADOOP_MAPRED_HOME/*:$HADOOP_MAPRED_HOME/lib/*:$MR2_CLASSPATH:./*:/etc/hadoop/conf.cloudera.yarn:/run/cloudera-scm-agent/process/264-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-mapreduce/lib/*:
 org.apache.giraph.yarn.GiraphYarnTask 1438068521412 193 5 1 
1>/var/log/hadoop-yarn/container/application_1438068521412_0193/container_1438068521412_0193_01_000005/task-5-stdout.log
 
2>/var/log/hadoop-yarn/container/application_1438068521412_0193/container_1438068521412_0193_01_000005/task-5-stderr.log
  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
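
As a rough cross-check of the numbers in the log above, the resident set size reported in the process-tree dump can be converted from pages to GiB and compared with the container limit. This is only a sketch: the 4 KiB page size is an assumption (the log does not state it), and the helper names are made up for illustration. The page counts, the 20 GB limit, and the -Xmx value are taken from the log.

```python
# Assumption: 4 KiB memory pages (typical for x86-64 Linux; not stated in the log).
PAGE_SIZE_BYTES = 4096
GIB = 1024 ** 3


def rss_gib(rss_pages, page_size=PAGE_SIZE_BYTES):
    """Convert an RSSMEM_USAGE(PAGES) value from the process-tree dump to GiB."""
    return rss_pages * page_size / GIB


def headroom_gib(container_limit_gib, heap_mib):
    """Memory left in the container for everything outside the Java heap."""
    return container_limit_gib - heap_mib / 1024


# Values copied from the log: java (pid 6704) and bash (pid 6700).
java_rss = rss_gib(5317812)
bash_rss = rss_gib(433)
total_rss = java_rss + bash_rss

# Container limit is 20 GB; the JVM was started with -Xmx20480M.
headroom = headroom_gib(20, 20480)

print(f"java RSS:  {java_rss:.2f} GiB")
print(f"total RSS: {total_rss:.2f} GiB")
print(f"headroom outside the heap: {headroom:.2f} GiB")
```

Under that assumption the total comes to roughly 20.3 GiB, which matches the "20.3 GB of 20 GB" figure in the diagnostics. The maximum heap size is equal to the container limit, so there is no room left in the container for memory used outside the Java heap.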



Logs from the container:

2015-08-04 12:34:51,258 INFO  [netty-server-worker-4] handler.RequestDecoder 
(RequestDecoder.java:channelRead(74)) - decode: Server window metrics 
MBytes/sec received = 12.5315, MBytesReceived = 380.217, ave received req 
MBytes = 0.007, secs waited = 30.34
2015-08-04 12:35:16,258 INFO  [check-memory] ooc.CheckMemoryCallable 
(CheckMemoryCallable.java:call(221)) - call: Memory is very limited now. 
Calling GC manually. freeMemory = 924.27MB




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
