Re: Re: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Glad to help!

On Fri, Aug 1, 2014 at 11:28 AM, Bin wrote:
> Hi Haiyang,
>
> Thanks, that really was the reason.
>
> Best,
> Bin
>
>
> On 2014-07-31 08:05:34, "Haiyang Fu" wrote:
>
>> Have you tried increasing the driver memory?
>>
>> On Thu, Jul 31, 2014 at 3:54 PM, Bin wrote:
>>> Hi All,
>>>
>>> [...]
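P.S., for anyone who finds this thread later: a rough sketch of the knobs involved, assuming Spark 1.0.x in standalone cluster deploy mode as in Bin's command (the 4g and 120 values below are examples only, not tested settings from this thread):

  // spark.driver.memory must be known before the driver JVM starts, so in
  // cluster deploy mode it goes on the launch command itself, e.g.:
  //   $SPARK_HOME/bin/spark-class org.apache.spark.deploy.Client launch \
  //     spark://10.196.135.101:7077 $jar_path $programname \
  //     -Dspark.driver.memory=4g ...
  import org.apache.spark.{SparkConf, SparkContext}

  // new SparkConf() also picks up the -Dspark.master / -Dspark.app.name
  // system properties passed on the launch command.
  val conf = new SparkConf()
    .set("spark.executor.memory", "20g")  // per-executor heap, as in Bin's command
    .set("spark.akka.askTimeout", "120")  // seconds; the 30-second default of
                                          // this property is the usual source of
                                          // "Futures timed out after [30 seconds]"
                                          // in Spark 1.x
  val sc = new SparkContext(conf)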
Re: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Have you tried increasing the driver memory?

On Thu, Jul 31, 2014 at 3:54 PM, Bin wrote:
> Hi All,
>
> The data size of my task is about 30 MB. It runs smoothly in local mode.
> However, when I submit it to the cluster, it throws the titled error
> (please see below for the complete output).
>
> Actually, my output is almost the same as
> http://stackoverflow.com/questions/24080891/spark-program-hangs-at-job-finished-toarray-workers-throw-java-util-concurren.
> I also call toArray on my data, which was the cause in his case.
>
> However, why does it run OK in local mode but not on the cluster? Each
> worker has over 60 GB of memory, and my run command is:
>
> "$SPARK_HOME/bin/spark-class org.apache.spark.deploy.Client launch
> spark://10.196.135.101:7077 $jar_path $programname
> -Dspark.master=spark://10.196.135.101:7077 -Dspark.cores.max=300
> -Dspark.executor.memory=20g -spark.jars=$jar_path
> -Dspark.default.parallelism=100
> -Dspark.hadoop.hadoop.job.ugi=$username,$groupname
> -Dspark.app.name=$appname $in_path $scala_out_path"
>
> Looking for help, and thanks a lot!
>
> Below please find the complete output:
>
> 14/07/31 15:06:53 WARN Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 14/07/31 15:06:53 INFO SecurityManager: Changing view acls to: spark
> 14/07/31 15:06:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark)
> 14/07/31 15:06:53 INFO Slf4jLogger: Slf4jLogger started
> 14/07/31 15:06:53 INFO Remoting: Starting remoting
> 14/07/31 15:06:54 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@tdw-10-215-140-22:39446]
> 14/07/31 15:06:54 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkExecutor@tdw-10-215-140-22:39446]
> 14/07/31 15:06:54 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://spark@tdw-10-196-135-106:38502/user/CoarseGrainedScheduler
> 14/07/31 15:06:54 INFO WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@tdw-10-215-140-22:34755/user/Worker
> 14/07/31 15:06:54 INFO WorkerWatcher: Successfully connected to akka.tcp://sparkWorker@tdw-10-215-140-22:34755/user/Worker
> 14/07/31 15:06:56 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
> 14/07/31 15:06:56 INFO SecurityManager: Changing view acls to: spark
> 14/07/31 15:06:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark)
> 14/07/31 15:06:56 INFO Slf4jLogger: Slf4jLogger started
> 14/07/31 15:06:56 INFO Remoting: Starting remoting
> 14/07/31 15:06:56 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@tdw-10-215-140-22:56708]
> 14/07/31 15:06:56 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@tdw-10-215-140-22:56708]
> 14/07/31 15:06:56 INFO SparkEnv: Connecting to MapOutputTracker: akka.tcp://spark@tdw-10-196-135-106:38502/user/MapOutputTracker
> 14/07/31 15:06:58 INFO SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@tdw-10-196-135-106:38502/user/BlockManagerMaster
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data1/sparkenv/local/spark-local-20140731150659-3f12
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data2/sparkenv/local/spark-local-20140731150659-1602
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data3/sparkenv/local/spark-local-20140731150659-d213
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data4/sparkenv/local/spark-local-20140731150659-f42e
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data5/sparkenv/local/spark-local-20140731150659-63d0
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data6/sparkenv/local/spark-local-20140731150659-9003
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data7/sparkenv/local/spark-local-20140731150659-f260
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data8/sparkenv/local/spark-local-20140731150659-6334
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data9/sparkenv/local/spark-local-20140731150659-3af4
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data10/sparkenv/local/spark-local-20140731150659-133d
> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at /data11/sparkenv/local/spark-local-20140731150659-ed08
> 14/07/31 15:06:59 INFO MemoryStore: MemoryStore started with capacity 11.5 GB.
> 14/07/31 15:06:59 INFO ConnectionManager: Bound socket to port 35127 with id = ConnectionManagerId(tdw-10-215-140-22,35127)
> 14/07/31 15:06:59 INFO BlockManagerMaster
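A side note on the toArray point above: in Spark 1.x, RDD.toArray is a deprecated alias of collect(), and both materialize the entire RDD in the driver's heap. That is why a job can run fine in local mode, where the driver and executors share one JVM, yet time out or run out of memory on a cluster whose driver heap is small. A rough sketch of sidestepping the round trip (the paths and the map step are placeholders, not taken from this thread):

  import org.apache.spark.{SparkConf, SparkContext}

  // Master and app name are assumed to come from the -Dspark.master and
  // -Dspark.app.name system properties on the launch command, as in Bin's setup.
  val sc = new SparkContext(new SparkConf())
  val result = sc.textFile("hdfs:///tmp/in_path")  // placeholder input path
    .map(_.toUpperCase)                            // stand-in for the real job logic

  // result.toArray / result.collect() ship every element back to the driver.
  // Writing straight from the executors avoids that round trip entirely:
  result.saveAsTextFile("hdfs:///tmp/scala_out_path")

  // If the driver only needs a peek at the data, bound what it pulls back:
  val preview = result.take(100)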