Hi Xiangrui,
(2014/06/18 8:49), Xiangrui Meng wrote:
Makoto, dense vectors are used in aggregation. If you have 32
partitions and each one sends a dense vector of size 1,354,731 to
the master, then the driver needs 300 MB+. That may be the problem.
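For reference, the quoted figure can be checked with a quick back-of-the-envelope calculation (assuming 8 bytes per double and ignoring JVM object overhead; the object name is just for illustration):

```scala
// Back-of-the-envelope estimate of the driver-side memory needed to
// receive one dense double vector from each partition at once.
object AggregationMemoryEstimate {
  def main(args: Array[String]): Unit = {
    val numPartitions = 32
    val vectorSize    = 1354731L   // elements per dense vector
    val bytesPerElem  = 8L         // a Double is 8 bytes
    val totalBytes    = numPartitions * vectorSize * bytesPerElem
    // ~346.8 MB before any object overhead or copies made while
    // merging, consistent with the "300 MB+" estimate above.
    println(f"~${totalBytes / 1e6}%.1f MB held on the driver")
  }
}
```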
It seems this could cause problems for convex optimization over
large training data; a merging tree, like allreduce, would help
reduce the memory requirements (though aggregation time might increase).
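The merging-tree idea can be sketched in plain Scala (a hypothetical illustration, not a Spark API: partial vectors are combined pairwise, so each merge step only ever holds two vectors at a time instead of all 32 at the root):

```scala
// Sketch of a pairwise merging tree for partial result vectors.
// Each round halves the number of partials; no single step needs
// to hold all 32 partial vectors simultaneously.
object TreeMergeSketch {
  def add(a: Array[Double], b: Array[Double]): Array[Double] = {
    require(a.length == b.length)
    Array.tabulate(a.length)(i => a(i) + b(i))
  }

  @annotation.tailrec
  def treeMerge(parts: Vector[Array[Double]]): Array[Double] =
    parts match {
      case Vector(only) => only
      case _ =>
        val next = parts.grouped(2).map {
          case Vector(a, b) => add(a, b)  // merge a pair
          case Vector(a)    => a          // odd one out, carried up
        }.toVector
        treeMerge(next)
    }

  def main(args: Array[String]): Unit = {
    val partials = Vector.fill(32)(Array.fill(4)(1.0))
    val merged   = treeMerge(partials)
    println(merged.mkString(", "))  // 32.0, 32.0, 32.0, 32.0
  }
}
```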
Which deploy mode are you using, standalone or local?
Standalone.
Setting --driver-memory 8G did not solve the aggregation problem.
Aggregation never finishes.
`ps aux | grep spark` on master is as follows:
myui 7049 79.3 1.1 8768868 592348 pts/2 Sl+ 11:10 0:14
/usr/java/jdk1.7/bin/java -cp
::/opt/spark-1.0.0/conf:/opt/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop0.20.2-cdh3u6.jar:/usr/lib/hadoop-0.20/conf
-XX:MaxPermSize=128m -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -Djava.library.path= -Xms2g -Xmx2g
org.apache.spark.deploy.SparkSubmit spark-shell --driver-memory 8G
--class org.apache.spark.repl.Main
myui 5694 2.5 0.5 6868296 292572 pts/2 Sl 10:59 0:17
/usr/java/jdk1.7/bin/java -cp
::/opt/spark-1.0.0/conf:/opt/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop0.20.2-cdh3u6.jar:/usr/lib/hadoop-0.20/conf
-XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m
-Xmx512m org.apache.spark.deploy.master.Master --ip 10.0.0.1 --port 7077
--webui-port 8081
----------------------------------------
Exporting SPARK_DAEMON_MEMORY=4g in spark-env.sh made no difference
for the evaluation either.
`ps aux | grep spark`
/usr/java/jdk1.7/bin/java -cp
::/opt/spark-1.0.0/conf:/opt/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop0.20.2-cdh3u6.jar:/usr/lib/hadoop-0.20/conf
-XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms4g -Xmx4g
org.apache.spark.deploy.master.Master --ip 10.0.0.1 --port 7077
--webui-port 8081
...
Thanks,
Makoto