Re: Re: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]

2014-07-31 Thread Haiyang Fu
Glad to help you


On Fri, Aug 1, 2014 at 11:28 AM, Bin wrote:

> Hi Haiyang,
>
> Thanks, it really is the reason.
>
> Best,
> Bin
>
>
> On 2014-07-31 08:05:34, "Haiyang Fu" wrote:
>
> Have you tried to increase the driver memory?
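>
> For example, adding -Dspark.driver.memory=4g (value hypothetical) next to
> the other -Dspark.* flags in the launch command should raise the driver
> heap. Note that spark.driver.memory only takes effect when it is supplied
> at driver launch; setting it from inside the program after the JVM has
> started has no effect.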
>
>
> On Thu, Jul 31, 2014 at 3:54 PM, Bin wrote:
>
>> Hi All,
>>
>> The data size of my task is about 30 MB. It runs smoothly in local mode.
>> However, when I submit it to the cluster, it throws the error in the
>> subject line (please see below for the complete output).
>>
>> Actually, my output is almost the same as
>> http://stackoverflow.com/questions/24080891/spark-program-hangs-at-job-finished-toarray-workers-throw-java-util-concurren.
>> I also call toArray on my data, which was the cause in that case.
>>
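>> The relevant pattern in my code is roughly the following (simplified,
>> names changed; as I understand it, toArray is the old alias of collect()):
>>
>>   val rdd = sc.textFile(inPath)  // data stays partitioned on the executors
>>   val all = rdd.toArray()        // ships every partition back to the driver
>>
>>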
>> However, why does it run fine in local mode but not on the cluster? Each
>> worker has over 60g of memory, and my run command is:
>>
>> "$SPARK_HOME/bin/spark-class org.apache.spark.deploy.Client launch
>> spark://10.196.135.101:7077 $jar_path $programname -Dspark.
>> master=spark://10.196.135.101:7077 -Dspark.cores.max=300
>> -Dspark.executor.memory=20g -spark.jars=$jar_path 
>> -Dspark.default.parallelism=100
>>  -Dspark.hadoop.hadoop.job.ugi=$username,$groupname  
>> -Dspark.app.name=$appname
>> $in_path $scala_out_path"
>>
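>> For reference, assuming the program builds its own SparkConf, the same
>> properties could equivalently be set in code (a sketch, not my actual
>> source):
>>
>>   import org.apache.spark.{SparkConf, SparkContext}
>>
>>   val conf = new SparkConf()
>>     .setMaster("spark://10.196.135.101:7077")
>>     .set("spark.cores.max", "300")
>>     .set("spark.executor.memory", "20g")
>>     .set("spark.default.parallelism", "100")
>>   val sc = new SparkContext(conf)
>>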
>> Looking forward to your help, and thanks a lot!
>>
>> Below please find the complete output:
>>
>> 14/07/31 15:06:53 WARN Configuration: DEPRECATED: hadoop-site.xml found in 
>> the classpath. Usage of hadoop-site.xml is deprecated. Instead use 
>> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of 
>> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
>> 14/07/31 15:06:53 INFO SecurityManager: Changing view acls to: spark
>> 14/07/31 15:06:53 INFO SecurityManager: SecurityManager: authentication 
>> disabled; ui acls disabled; users with view permissions: Set(spark)
>> 14/07/31 15:06:53 INFO Slf4jLogger: Slf4jLogger started
>> 14/07/31 15:06:53 INFO Remoting: Starting remoting
>> 14/07/31 15:06:54 INFO Remoting: Remoting started; listening on addresses 
>> :[akka.tcp://sparkExecutor@tdw-10-215-140-22:39446]
>> 14/07/31 15:06:54 INFO Remoting: Remoting now listens on addresses: 
>> [akka.tcp://sparkExecutor@tdw-10-215-140-22:39446]
>> 14/07/31 15:06:54 INFO CoarseGrainedExecutorBackend: Connecting to driver: 
>> akka.tcp://spark@tdw-10-196-135-106:38502/user/CoarseGrainedScheduler
>> 14/07/31 15:06:54 INFO WorkerWatcher: Connecting to worker 
>> akka.tcp://sparkWorker@tdw-10-215-140-22:34755/user/Worker
>> 14/07/31 15:06:54 INFO WorkerWatcher: Successfully connected to 
>> akka.tcp://sparkWorker@tdw-10-215-140-22:34755/user/Worker
>> 14/07/31 15:06:56 INFO CoarseGrainedExecutorBackend: Successfully registered 
>> with driver
>> 14/07/31 15:06:56 INFO SecurityManager: Changing view acls to: spark
>> 14/07/31 15:06:56 INFO SecurityManager: SecurityManager: authentication 
>> disabled; ui acls disabled; users with view permissions: Set(spark)
>> 14/07/31 15:06:56 INFO Slf4jLogger: Slf4jLogger started
>> 14/07/31 15:06:56 INFO Remoting: Starting remoting
>> 14/07/31 15:06:56 INFO Remoting: Remoting started; listening on addresses 
>> :[akka.tcp://spark@tdw-10-215-140-22:56708]
>> 14/07/31 15:06:56 INFO Remoting: Remoting now listens on addresses: 
>> [akka.tcp://spark@tdw-10-215-140-22:56708]
>> 14/07/31 15:06:56 INFO SparkEnv: Connecting to MapOutputTracker: 
>> akka.tcp://spark@tdw-10-196-135-106:38502/user/MapOutputTracker
>> 14/07/31 15:06:58 INFO SparkEnv: Connecting to BlockManagerMaster: 
>> akka.tcp://spark@tdw-10-196-135-106:38502/user/BlockManagerMaster
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data1/sparkenv/local/spark-local-20140731150659-3f12
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data2/sparkenv/local/spark-local-20140731150659-1602
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data3/sparkenv/local/spark-local-20140731150659-d213
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data4/sparkenv/local/spark-local-20140731150659-f42e
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data5/sparkenv/local/spark-local-20140731150659-63d0
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data6/sparkenv/local/spark-local-20140731150659-9003
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data7/sparkenv/local/spark-local-20140731150659-f260
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data8/sparkenv/local/spark-local-20140731150659-6334
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data9/sparkenv/local/spark-local-20140731150659-3af4
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data10/sparkenv/local/spark-local-20140731150659-133d
>> 14/07/31 15:06:59 INFO DiskBlockManager: Created local directory at 
>> /data11/sparkenv/local/spark-local-20140731150659-ed08
>> 14/07/31 15:06:59 INFO MemoryStore: MemoryStore started with capacity 11.5 GB.
>> 14/07/31 15:06:59 INFO ConnectionManager: Bound socket to port 35127 with id 
>> = ConnectionManagerId(tdw-10-215-140-22,35127)
>> 14/07/31 15:06:59 INFO BlockManagerMaster
