[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-68760841 Thanks for the review and commit. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-67061747 @rxin, any conclusion on this?
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-65761296 @rxin, let me summarize the configuration defaults I have used. I put a value of 100 in the initial pull request with the intention of having a further discussion on appropriate defaults. There are two possible approaches: we can continue using the same defaults as earlier, which means spark.network.timeout would have a different effective default value in each place it is used, or we can decide on a single fixed default value. I think the latter should be done, but an appropriate value has to be decided.

1. spark.core.connection.ack.wait.timeout - a default value of 60s was used earlier.
2. spark.shuffle.io.connectionTimeout - a default value of 120s was used earlier.
3. spark.akka.timeout - a default value of 100s was used earlier.
4. spark.storage.blockManagerSlaveTimeoutMs - here the default was 3 times the value of spark.executor.heartbeatInterval or 45 sec., whichever is higher.

Based on these cases, I think we can fix a default timeout value of 120 sec. for spark.network.timeout. The only issue I can see is in case 4, but 120 sec. should be a good enough upper cap.
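The fallback scheme discussed in this thread (each specific timeout falls back to the shared spark.network.timeout, which in turn falls back to a fixed default) can be sketched as below. This is a hedged illustration only, not the actual Spark code: the `TimeoutFallback` class, its `Map`-backed conf, and `getTimeoutSeconds` are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the proposed fallback chain: a specific timeout key
// falls back to the shared spark.network.timeout, which itself defaults
// to a fixed value (120 seconds, as proposed in the comment above).
public class TimeoutFallback {
    private final Map<String, Integer> conf = new HashMap<>();

    public void set(String key, int seconds) {
        conf.put(key, seconds);
    }

    // Resolve a specific timeout key in seconds, falling back to the
    // shared key, then to the fixed default.
    public int getTimeoutSeconds(String specificKey) {
        Integer specific = conf.get(specificKey);
        if (specific != null) return specific;
        Integer shared = conf.get("spark.network.timeout");
        return shared != null ? shared : 120; // proposed fixed default
    }

    public static void main(String[] args) {
        TimeoutFallback tf = new TimeoutFallback();
        // Nothing configured: fixed default wins.
        System.out.println(tf.getTimeoutSeconds("spark.shuffle.io.connectionTimeout")); // 120
        // Shared timeout configured: it overrides the fixed default.
        tf.set("spark.network.timeout", 100);
        System.out.println(tf.getTimeoutSeconds("spark.shuffle.io.connectionTimeout")); // 100
        // Specific timeout configured: it overrides the shared one.
        tf.set("spark.shuffle.io.connectionTimeout", 60);
        System.out.println(tf.getTimeoutSeconds("spark.shuffle.io.connectionTimeout")); // 60
    }
}
```

The point of the chain is that existing per-subsystem settings keep working unchanged, while users who set only spark.network.timeout get a single consistent value everywhere.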
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21361036

--- Diff: docs/configuration.md ---
@@ -777,6 +777,16 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
+  <td><code>spark.network.timeout</code></td>
+  <td>100</td>
+  <td>
+    Default timeout for all network interactions, in seconds. This config will be used in
+    place of spark.core.connection.ack.wait.timeout, spark.akka.timeout,
--- End diff --

Ok...
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21357610

--- Diff: docs/configuration.md ---
@@ -777,6 +777,16 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
+  <td><code>spark.network.timeout</code></td>
+  <td>100</td>
+  <td>
+    Default timeout for all network interactions, in seconds. This config will be used in
+    place of spark.core.connection.ack.wait.timeout, spark.akka.timeout,
--- End diff --

I am not sure what you mean here. You mean I should change the statement to "This config will be used in place of spark.core.connection.ack.wait.timeout, spark.akka.timeout, spark.storage.blockManagerSlaveTimeoutMs and spark.shuffle.io.connectionTimeout, if they are not configured"?
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21216319

--- Diff: network/common/src/main/java/org/apache/spark/network/util/TransportConf.java ---
@@ -37,7 +37,8 @@ public boolean preferDirectBufs() {
   /** Connect timeout in secs. Default 120 secs. */
   public int connectionTimeoutMs() {
-    return conf.getInt("spark.shuffle.io.connectionTimeout", 120) * 1000;
+    int timeout = conf.getInt("spark.shuffle.io.connectionTimeout", conf.getInt("spark.network.timeout", 100));
--- End diff --

Alright, will fix it.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21216817

--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -53,7 +53,9 @@ class BlockManagerMasterActor(val isLocal: Boolean, conf: SparkConf, listenerBus
   private val akkaTimeout = AkkaUtils.askTimeout(conf)
   val slaveTimeout = conf.getLong("spark.storage.blockManagerSlaveTimeoutMs",
-    math.max(conf.getInt("spark.executor.heartbeatInterval", 1) * 3, 45000))
--- End diff --

@rxin, how about this?

    val slaveTimeout = conf.getLong("spark.storage.blockManagerSlaveTimeoutMs",
      conf.getInt("spark.network.timeout",
        math.max(conf.getInt("spark.executor.heartbeatInterval", 1) * 3, 45000) / 1000) * 1000)
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
GitHub user varunsaxena opened a pull request: https://github.com/apache/spark/pull/3562

[SPARK-4688] Have a single shared network timeout in Spark

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/varunsaxena/spark SPARK-4688

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3562.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3562

commit 594226c5bdfc01babda3975188dc5f307c2a4842
Author: varunsaxena vsaxena.va...@gmail.com
Date: 2014-12-02T21:56:57Z

    SPARK-4688
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-65316168 Wanted to know what the default spark.network.timeout value should be. I have kept it at 100 sec. Should it be different?
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21197916

--- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala ---
@@ -65,7 +65,8 @@ private[spark] object AkkaUtils extends Logging {
   val akkaThreads = conf.getInt("spark.akka.threads", 4)
   val akkaBatchSize = conf.getInt("spark.akka.batchSize", 15)
-  val akkaTimeout = conf.getInt("spark.akka.timeout", 100)
+  val akkaTimeout = conf.getInt("spark.akka.timeout",
--- End diff --

Ok... Will make this change.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-65320054 Made the changes as per the review. Also updated configuration.md.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on a diff in the pull request: https://github.com/apache/spark/pull/3562#discussion_r21214791

--- Diff: core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala ---
@@ -81,7 +81,8 @@ private[nio] class ConnectionManager(
   private val ackTimeoutMonitor = new HashedWheelTimer(Utils.namedThreadFactory("AckTimeoutMonitor"))
-  private val ackTimeout = conf.getInt("spark.core.connection.ack.wait.timeout", 60)
+  private val ackTimeout =
+    conf.getInt("spark.core.connection.ack.wait.timeout", conf.getInt("spark.network.timeout", 100))
--- End diff --

@Lewuathe, that is what I wanted to know: what should I keep as the default value? Keep a single fixed timeout value for the config spark.network.timeout, or vary the default based on the defaults of the earlier configurations this config intends to replace? As I am pretty new to Spark, I am not aware of what default value would be suitable. Maybe @rxin can confirm.