> > But in some cases all of the applications will fail, caused by "SparkContext
> > did not initialize after waiting for 150000 ms".
> > See attachment (hive.spark.client.server.connect.timeout is set to 5min).
*The error log is different from the original mail.*

Container: container_1448873753366_113453_01_000001 on 10.247.169.134_8041
============================================================================
LogType: stderr
LogLength: 3302
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
15/12/09 02:11:48 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/12/09 02:11:48 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1448873753366_113453_000001
15/12/09 02:11:49 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mqq); users with modify permissions: Set(mqq)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
15/12/09 02:11:49 INFO client.RemoteDriver: Connecting to: 10.179.12.140:58013
15/12/09 02:11:49 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /10.179.12.140:58013
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /10.179.12.140:58013
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: java.net.ConnectException: Connection refused: /10.179.12.140:58013
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /10.179.12.140:58013)
15/12/09 02:11:59 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 150000 ms. Please check earlier log output for errors. Failing the application.
15/12/09 02:11:59 INFO util.Utils: Shutdown hook called

2015-12-09 19:22 GMT+08:00 Jone Zhang <joyoungzh...@gmail.com>:
> Hive version is 1.2.1
> Spark version is 1.4.1
> Hadoop version is 2.5.1
>
> The application_1448873753366_121062 will succeed in the above mail.
>
> But in some cases all of the applications will fail, caused by "SparkContext
> did not initialize after waiting for 150000 ms".
> See attachment (hive.spark.client.server.connect.timeout is set to 5min).
>
> Thanks.
> Best wishes.
>
> 2015-12-09 17:56 GMT+08:00 Jone Zhang <joyoungzh...@gmail.com>:
>
>> *Hi, Xuefu:*
>>
>> *See attachment 1.*
>> *When the queue resources are not enough,*
>> *the application application_1448873753366_121022 will be pending.*
>> *Two minutes later, the application application_1448873753366_121055 will
>> be submitted and will also be pending,*
>> *and then application_1448873753366_121062.*
>>
>> *See attachment 2.*
>> *When the queue resources are free,*
>> *the application application_1448873753366_121062 begins to run,*
>> *and application_1448873753366_121022 and application_1448873753366_121055
>> fail fast.*
>>
>> *Logs of application_1448873753366_121022 are as follows (same as
>> application_1448873753366_121055):*
>> Container: container_1448873753366_121022_03_000001 on 10.226.136.122_8041
>> ============================================================================
>> LogType: stderr
>> LogLength: 4664
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> 15/12/09 16:29:45 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
>> 15/12/09 16:29:46 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1448873753366_121022_000003
>> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:47 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context initialization
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>> 15/12/09 16:29:47 INFO client.RemoteDriver: Connecting to: 10.179.12.140:38842
>> 15/12/09 16:29:48 WARN rpc.Rpc: Invalid log level null, reverting to default.
>> 15/12/09 16:29:48 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:48 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:57 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 150000 ms. Please check earlier log output for errors. Failing the application.
>> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1448873753366_121022
>> 15/12/09 16:29:57 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>>
>>
>>
>> Container: container_1448873753366_121022_01_000001 on 10.226.148.5_8041
>> ==========================================================================
>> LogType: stderr
>> LogLength: 4284
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> 15/12/09 16:29:17 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
>> 15/12/09 16:29:18 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1448873753366_121022_000001
>> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>> 15/12/09 16:29:19 INFO client.RemoteDriver: Connecting to: 10.179.12.140:38842
>> 15/12/09 16:29:20 WARN rpc.Rpc: Invalid log level null, reverting to default.
>> 15/12/09 16:29:20 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:20 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:29 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 150000 ms. Please check earlier log output for errors. Failing the application.
>> 15/12/09 16:29:29 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>>
>>
>>
>> Container: container_1448873753366_121022_02_000001 on 10.239.243.69_8041
>> ===========================================================================
>> LogType: stderr
>> LogLength: 4284
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in the future
>> 15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
>> 15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1448873753366_121022_000002
>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context initialization
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>> 15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to: 10.179.12.140:38842
>> 15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to default.
>> 15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
>>         at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 150000 ms. Please check earlier log output for errors. Failing the application.
>> 15/12/09 16:29:43 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
>>
>>
>> *I find the submit cycle is based on hive.spark.client.server.connect.timeout,
>> which is 120000 ms in our team.*
>> *And spark.yarn.am.waitTime is 150s.*
>>
>> *My questions are:*
>> *1. Is the repeated submission a bug or by design? How should I set the value of
>> hive.spark.client.server.connect.timeout when the queue resources are not
>> enough?*
>> *2. Why does the application fail fast, caused by "SparkContext did not
>> initialize after waiting for 150000 ms"?*
>>
>> *Thanks.*
>> *Best wishes.*
>>
>>
>
>
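
For anyone hitting the same symptom, the two timeouts discussed in this thread can be raised per-session from the Hive CLI or Beeline before submitting the query. This is only a sketch of the workaround being discussed, not a confirmed fix, and the values below are illustrative rather than recommended:

```sql
-- How long the Hive client waits for the RemoteDriver (running in the YARN AM)
-- to connect back; the thread observes that re-submissions happen on this cycle.
-- Accepts a time unit suffix in Hive 1.2 (e.g. 300s = 5 minutes).
set hive.spark.client.server.connect.timeout=300s;

-- Spark-on-YARN setting, passed through by Hive on Spark: how long the
-- ApplicationMaster waits for the SparkContext before failing with
-- "SparkContext did not initialize after waiting for ... ms".
set spark.yarn.am.waitTime=300s;
```

Note that raising these only widens the window while the application sits pending in the queue; it does not address why multiple applications are submitted for one query in the first place.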