[ 
https://issues.apache.org/jira/browse/HIVE-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148177#comment-15148177
 ] 

JoneZhang commented on HIVE-12650:
----------------------------------

Hi all,
Sorry for the late reply.

Yes, hive.spark.client.server.connect.timeout and spark.yarn.am.waitTime are 
not related to each other.
hive.spark.client.server.connect.timeout is the timeout for the handshake 
between the RPC server and the client. When no container is available, the 
Hive client exits after hive.spark.client.server.connect.timeout.
spark.yarn.am.waitTime is the time the Spark AM waits for the SparkContext to 
be created after the AM has been launched.
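
To illustrate why the two settings are independent, here is a small Python sketch of the timeline described above. It is a simplified model, not Hive or Spark code: the function name `simulate`, the `Timeline` record, and the `driver_startup_s` parameter are made up for illustration, and it assumes the Hive client's timer starts at job submission. The 90s and 100s defaults are the values quoted in the issue description.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """All times are seconds after job submission (toy model)."""
    hive_client_gives_up_at: float
    driver_connects_at: float
    am_wait_deadline_at: float
    result: str


def simulate(container_wait_s,
             hive_connect_timeout_s=90.0,
             am_wait_time_s=100.0,
             driver_startup_s=1.0):
    """Model what happens when YARN takes container_wait_s to launch the AM.

    driver_startup_s is a hypothetical delay between the AM launching and
    the remote driver trying to connect back to the Hive client.
    """
    # Assumption: the Hive client starts counting at submission time.
    hive_gives_up = hive_connect_timeout_s
    # The driver can only connect back once the AM container is running.
    driver_connects = container_wait_s + driver_startup_s
    # spark.yarn.am.waitTime is counted from AM launch, not from submission.
    am_deadline = container_wait_s + am_wait_time_s

    if driver_connects <= hive_gives_up:
        result = "handshake completes"
    else:
        # The Hive client has already exited, so nothing is listening.
        result = "connection refused"
    return Timeline(hive_gives_up, driver_connects, am_deadline, result)
```

With these assumptions, `simulate(300)` reports a refused connection whatever value `am_wait_time_s` has, because the AM's wait only starts once a container has been granted.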

There are two types of error in the logs:
1. "Client closed before SASL negotiation finished" occurs when the job is 
resubmitted. See https://issues.apache.org/jira/browse/HIVE-12649.
2. "Connection refused: /hiveclientip:port" occurs when the AM tries to 
connect back to Hive.

Container: container_1448873753366_113453_01_000001 on 10.247.169.134_8041
============================================================================
LogType: stderr
LogLength: 3302
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in 
the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled in 
the future
15/12/09 02:11:48 INFO yarn.ApplicationMaster: Registered signal handlers for 
[TERM, HUP, INT]
15/12/09 02:11:48 INFO yarn.ApplicationMaster: ApplicationAttemptId: 
appattempt_1448873753366_113453_000001
15/12/09 02:11:49 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(mqq); users with 
modify permissions: Set(mqq)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Starting the user application in 
a separate Thread
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context 
initialization
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context 
initialization ... 
15/12/09 02:11:49 INFO client.RemoteDriver: Connecting to: 10.179.12.140:58013
15/12/09 02:11:49 ERROR yarn.ApplicationMaster: User class threw exception: 
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection 
refused: /10.179.12.140:58013
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection 
refused: /10.179.12.140:58013
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at 
org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: java.net.ConnectException: Connection refused: /10.179.12.140:58013
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at 
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
        at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Final app status: FAILED, 
exitCode: 15, (reason: User class threw exception: 
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection 
refused: /10.179.12.140:58013)
15/12/09 02:11:59 ERROR yarn.ApplicationMaster: SparkContext did not initialize 
after waiting for 150000 ms. Please check earlier log output for errors. 
Failing the application.
15/12/09 02:11:59 INFO util.Utils: Shutdown hook called
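
The two error types listed above can be told apart by searching a container log for their messages. A minimal Python sketch, assuming the log is available as a string; the function name `classify_log` and the labels are hypothetical, and whitespace is collapsed first because the messages may be wrapped across lines, as in the log above.

```python
import re

# Label -> message text taken from the two cases described in this comment.
ERROR_SIGNATURES = {
    "resubmission (see HIVE-12649)":
        "Client closed before SASL negotiation finished",
    "AM could not connect back to Hive":
        "java.net.ConnectException: Connection refused",
}


def classify_log(text):
    """Return the labels of all known error signatures found in text."""
    # Collapse line breaks and runs of spaces so wrapped messages still match.
    flat = re.sub(r"\s+", " ", text)
    return [label for label, signature in ERROR_SIGNATURES.items()
            if signature in flat]
```

For example, passing the stderr contents shown above would return only the "AM could not connect back to Hive" label.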

> Increase default value of hive.spark.client.server.connect.timeout to exceed 
> spark.yarn.am.waitTime
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12650
>                 URL: https://issues.apache.org/jira/browse/HIVE-12650
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 1.2.1
>            Reporter: JoneZhang
>            Assignee: Xuefu Zhang
>
> I think hive.spark.client.server.connect.timeout should be set greater than 
> spark.yarn.am.waitTime. The default value for 
> spark.yarn.am.waitTime is 100s, and the default value for 
> hive.spark.client.server.connect.timeout is 90s, which is not good. We can 
> increase it to a larger value such as 120s.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
