[ 
https://issues.apache.org/jira/browse/SPARK-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guoqiang Li reopened SPARK-7934:
--------------------------------

> In some cases, Spark hangs in yarn-client mode.
> -----------------------------------------------
>
>                 Key: SPARK-7934
>                 URL: https://issues.apache.org/jira/browse/SPARK-7934
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.3.1
>            Reporter: Guoqiang Li
>
> The conf/spark-defaults.conf contains:
> {noformat}
> spark.executor.extraJavaOptions -XX:+UseG1GC -XX:ConcGCThreadss=5 
> -XX:MaxPermSize=512m -Xss2m
> {noformat}
> Note: {{-XX:ConcGCThreadss=5}} is misspelled; the valid HotSpot option is {{-XX:ConcGCThreads}}.
> The logs:
>  {noformat}
> 15/05/29 10:20:20 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 15/05/29 10:20:20 INFO SecurityManager: Changing view acls to: spark
> 15/05/29 10:20:20 INFO SecurityManager: Changing modify acls to: spark
> 15/05/29 10:20:20 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(spark); users 
> with modify permissions: Set(spark)
> 15/05/29 10:20:20 INFO HttpServer: Starting HTTP Server
> 15/05/29 10:20:20 INFO Server: jetty-8.y.z-SNAPSHOT
> 15/05/29 10:20:20 INFO AbstractConnector: Started 
> SocketConnector@0.0.0.0:54276
> 15/05/29 10:20:20 INFO Utils: Successfully started service 'HTTP class 
> server' on port 54276.
> 15/05/29 10:20:31 INFO SparkContext: Running Spark version 1.3.1
> 15/05/29 10:20:31 WARN SparkConf: The configuration option 
> 'spark.yarn.user.classpath.first' has been replaced as of Spark 1.3 and may 
> be removed in the future. Use spark.{driver,executor}.userClassPathFirst 
> instead.
> 15/05/29 10:20:31 INFO SecurityManager: Changing view acls to: spark
> 15/05/29 10:20:31 INFO SecurityManager: Changing modify acls to: spark
> 15/05/29 10:20:31 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(spark); users 
> with modify permissions: Set(spark)
> 15/05/29 10:20:32 INFO Slf4jLogger: Slf4jLogger started
> 15/05/29 10:20:32 INFO Remoting: Starting remoting
> 15/05/29 10:20:33 INFO Remoting: Remoting started; listening on addresses 
> :[akka.tcp://sparkdri...@10dian71.domain.test:55492]
> 15/05/29 10:20:33 INFO Utils: Successfully started service 'sparkDriver' on 
> port 55492.
> 15/05/29 10:20:33 INFO SparkEnv: Registering MapOutputTracker
> 15/05/29 10:20:33 INFO SparkEnv: Registering BlockManagerMaster
> 15/05/29 10:20:33 INFO DiskBlockManager: Created local directory at 
> /tmp/spark-94c41fce-1788-484e-9878-88d1bf8c7247/blockmgr-b3d7ba9d-6656-408f-b9e2-683784493f22
> 15/05/29 10:20:33 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
> 15/05/29 10:20:34 INFO HttpFileServer: HTTP File server directory is 
> /tmp/spark-271bab98-b4e8-4b02-8267-0020a38f355b/httpd-92bb8c15-51a7-4b40-9d01-2fb01cfbb148
> 15/05/29 10:20:34 INFO HttpServer: Starting HTTP Server
> 15/05/29 10:20:34 INFO Server: jetty-8.y.z-SNAPSHOT
> 15/05/29 10:20:34 INFO AbstractConnector: Started 
> SocketConnector@0.0.0.0:38530
> 15/05/29 10:20:34 INFO Utils: Successfully started service 'HTTP file server' 
> on port 38530.
> 15/05/29 10:20:34 INFO SparkEnv: Registering OutputCommitCoordinator
> 15/05/29 10:20:34 INFO Server: jetty-8.y.z-SNAPSHOT
> 15/05/29 10:20:34 INFO AbstractConnector: Started 
> SelectChannelConnector@0.0.0.0:4040
> 15/05/29 10:20:34 INFO Utils: Successfully started service 'SparkUI' on port 
> 4040.
> 15/05/29 10:20:34 INFO SparkUI: Started SparkUI at 
> http://10dian71.domain.test:4040
> 15/05/29 10:20:34 INFO SparkContext: Added JAR 
> file:/opt/spark/spark-1.3.0-cdh5/lib/hadoop-lzo-0.4.15-gplextras5.0.1-SNAPSHOT.jar
>  at 
> http://192.168.10.71:38530/jars/hadoop-lzo-0.4.15-gplextras5.0.1-SNAPSHOT.jar 
> with timestamp 1432866034769
> 15/05/29 10:20:34 INFO SparkContext: Added JAR 
> file:/opt/spark/classes/toona-assembly.jar at 
> http://192.168.10.71:38530/jars/toona-assembly.jar with timestamp 
> 1432866034972
> 15/05/29 10:20:35 INFO RMProxy: Connecting to ResourceManager at 
> 10dian72/192.168.10.72:9080
> 15/05/29 10:20:36 INFO Client: Requesting a new application from cluster with 
> 9 NodeManagers
> 15/05/29 10:20:36 INFO Client: Verifying our application has not requested 
> more than the maximum memory capability of the cluster (10240 MB per 
> container)
> 15/05/29 10:20:36 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 15/05/29 10:20:36 INFO Client: Setting up container launch context for our AM
> 15/05/29 10:20:36 INFO Client: Preparing resources for our AM container
> 15/05/29 10:20:37 INFO Client: Uploading resource 
> file:/opt/spark/spark-1.3.0-cdh5/lib/spark-assembly-1.3.2-SNAPSHOT-hadoop2.3.0-cdh5.0.1.jar
>  -> 
> hdfs://ns1/user/spark/.sparkStaging/application_1429108701044_0881/spark-assembly-1.3.2-SNAPSHOT-hadoop2.3.0-cdh5.0.1.jar
> 15/05/29 10:20:39 INFO Client: Uploading resource 
> hdfs://ns1:8020/input/lbs/recommend/toona/spark/conf -> 
> hdfs://ns1/user/spark/.sparkStaging/application_1429108701044_0881/conf
> 15/05/29 10:20:41 INFO Client: Setting up the launch environment for our AM 
> container
> 15/05/29 10:20:42 INFO SecurityManager: Changing view acls to: spark
> 15/05/29 10:20:42 INFO SecurityManager: Changing modify acls to: spark
> 15/05/29 10:20:42 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(spark); users 
> with modify permissions: Set(spark)
> 15/05/29 10:20:42 INFO Client: Submitting application 881 to ResourceManager
> 15/05/29 10:20:42 INFO YarnClientImpl: Submitted application 
> application_1429108701044_0881
> 15/05/29 10:20:43 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:43 INFO Client:
>          client token: N/A
>          diagnostics: N/A
>          ApplicationMaster host: N/A
>          ApplicationMaster RPC port: -1
>          queue: root.spark
>          start time: 1432865812121
>          final status: UNDEFINED
>          tracking URL: 
> http://10dian72:9082/proxy/application_1429108701044_0881/
>          user: spark
> 15/05/29 10:20:44 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:45 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:46 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:47 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:48 INFO Client: Application report for 
> application_1429108701044_0881 (state: ACCEPTED)
> 15/05/29 10:20:48 INFO YarnClientSchedulerBackend: ApplicationMaster 
> registered as 
> Actor[akka.tcp://sparkYarnAM@10dian147:46186/user/YarnAM#-1047019982]
> 15/05/29 10:20:48 INFO YarnClientSchedulerBackend: Add WebUI Filter. 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS 
> -> 10dian72, PROXY_URI_BASES -> 
> http://10dian72:9082/proxy/application_1429108701044_0881), 
> /proxy/application_1429108701044_0881
> 15/05/29 10:20:48 INFO JettyUtils: Adding filter: 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/05/29 10:20:49 INFO Client: Application report for 
> application_1429108701044_0881 (state: RUNNING)
> 15/05/29 10:20:49 INFO Client:
>          client token: N/A
>          diagnostics: N/A
>          ApplicationMaster host: 10dian147
>          ApplicationMaster RPC port: 0
>          queue: root.spark
>          start time: 1432865812121
>          final status: UNDEFINED
>          tracking URL: 
> http://10dian72:9082/proxy/application_1429108701044_0881/
>          user: spark
> 15/05/29 10:20:49 INFO YarnClientSchedulerBackend: Application 
> application_1429108701044_0881 has started running.
> 15/05/29 10:20:49 INFO NettyBlockTransferService: Server created on 60983
> 15/05/29 10:20:49 INFO BlockManagerMaster: Trying to register BlockManager
> 15/05/29 10:20:49 INFO BlockManagerMasterActor: Registering block manager 
> 10dian71.domain.test:60983 with 4.1 GB RAM, BlockManagerId(<driver>, 
> 10dian71.domain.test, 60983)
> 15/05/29 10:20:49 INFO BlockManagerMaster: Registered BlockManager
> 15/05/29 10:21:34 WARN ReliableDeliverySupervisor: Association with remote 
> system [akka.tcp://sparkYarnAM@10dian147:46186] has failed, address is now 
> gated for [5000] ms. Reason is: [Disassociated].
> 15/05/29 10:21:39 INFO YarnClientSchedulerBackend: ApplicationMaster 
> registered as 
> Actor[akka.tcp://sparkYarnAM@10dian107:39361/user/YarnAM#-1054488890]
> 15/05/29 10:21:39 INFO YarnClientSchedulerBackend: Add WebUI Filter. 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS 
> -> 10dian72, PROXY_URI_BASES -> 
> http://10dian72:9082/proxy/application_1429108701044_0881), 
> /proxy/application_1429108701044_0881
> 15/05/29 10:21:39 INFO JettyUtils: Adding filter: 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/05/29 10:22:20 WARN ReliableDeliverySupervisor: Association with remote 
> system [akka.tcp://sparkYarnAM@10dian107:39361] has failed, address is now 
> gated for [5000] ms. Reason is: [Disassociated].
> 15/05/29 10:22:25 INFO YarnClientSchedulerBackend: ApplicationMaster 
> registered as 
> Actor[akka.tcp://sparkYarnAM@10dian148:43452/user/YarnAM#1073876277]
> 15/05/29 10:22:25 INFO YarnClientSchedulerBackend: Add WebUI Filter. 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS 
> -> 10dian72, PROXY_URI_BASES -> 
> http://10dian72:9082/proxy/application_1429108701044_0881), 
> /proxy/application_1429108701044_0881
> 15/05/29 10:22:25 INFO JettyUtils: Adding filter: 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/05/29 10:23:06 WARN ReliableDeliverySupervisor: Association with remote 
> system [akka.tcp://sparkYarnAM@10dian148:43452] has failed, address is now 
> gated for [5000] ms. Reason is: [Disassociated].
> 15/05/29 10:23:10 INFO YarnClientSchedulerBackend: ApplicationMaster 
> registered as 
> Actor[akka.tcp://sparkYarnAM@10dian138:57898/user/YarnAM#-1389221836]
> 15/05/29 10:23:10 INFO YarnClientSchedulerBackend: Add WebUI Filter. 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS 
> -> 10dian72, PROXY_URI_BASES -> 
> http://10dian72:9082/proxy/application_1429108701044_0881), 
> /proxy/application_1429108701044_0881
> 15/05/29 10:23:10 INFO JettyUtils: Adding filter: 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>  {noformat}
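
A misspelled option such as the one in the quoted spark-defaults.conf can often be caught before submission, because HotSpot refuses to start when it meets an unrecognized -XX option. The following is a minimal sketch of such a pre-flight check, not part of Spark. The helper names extra_java_options and jvm_accepts are hypothetical. The parser assumes each setting sits on one line with the key and value separated by whitespace, and the check assumes the local java binary resembles the JVM used on the cluster nodes.

```python
import shlex
import subprocess


def extra_java_options(conf_text, key="spark.executor.extraJavaOptions"):
    """Return the JVM options configured for `key` as a list of tokens.

    Assumption: simplified parsing of spark-defaults.conf, where each
    setting is a single line of the form "<key> <value>".
    """
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split(None, 1)
        if parts[0] == key:
            return shlex.split(parts[1]) if len(parts) > 1 else []
    return []


def jvm_accepts(options, java="java"):
    """Start a throwaway JVM with `options` and report whether it starts.

    Runs "<java> <options> -version"; a non-zero exit code means the JVM
    rejected at least one option.
    """
    result = subprocess.run(
        [java, *options, "-version"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

One possible use is to read conf/spark-defaults.conf, pass its text to extra_java_options, and abort the submission when jvm_accepts returns False for the resulting list. The check only shows that the local JVM accepts the options; a different JVM version on the executors may behave differently.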



