GitHub user CodingCat opened a pull request:
https://github.com/apache/spark/pull/85
SPARK-1192: The document for most of the parameters used in core component
I grepped the code in the core component and found that around 30 parameters
are used in the implementation but are undocumented. From reading the source
code, I found that some of them are actually very useful to users.
I suggest writing complete documentation for these parameters.
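A search like the one described could be sketched roughly as follows. This is only an illustration, not the script actually used for this PR: the regular expression, the function name find_spark_keys, and the sample Scala line are assumptions, and a pattern this simple will miss keys that are built by string concatenation.

```python
import re

# Matches string literals such as "spark.task.cpus" (assumed key shape).
SPARK_KEY = re.compile(r'"(spark\.[A-Za-z0-9_.]+)"')


def find_spark_keys(source):
    """Return the sorted, de-duplicated spark.* keys found in source text."""
    return sorted(set(SPARK_KEY.findall(source)))


# Hypothetical sample line, modelled on how settings are read in the code.
sample = 'val cpus = conf.getInt("spark.task.cpus", 1)'
print(find_spark_keys(sample))
```

Comparing the resulting key list against the configuration docs would then show which parameters are undocumented.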
Also, some parameters have confusing names:
1. spark.shuffle.copier.threads - this parameter controls how many
threads are used when you start a Netty-based shuffle service, but the
name does not convey this.
2. spark.shuffle.sender.port - a similar problem to the one above: when
you use the Netty-based shuffle receiver, you also have to set up a Netty-based
sender. This parameter sets the port used by the Netty sender, but the
name does not convey this.
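For reference, here is a minimal sketch of how the two settings above might be passed together with spark.shuffle.use.netty, assuming they are supplied as Java system properties (-Dkey=value). The values and the helper name to_java_opts are illustrative assumptions, not defaults or recommendations.

```python
# Illustrative values only; they are assumptions, not Spark defaults.
netty_shuffle_settings = {
    "spark.shuffle.use.netty": "true",      # enable the Netty-based shuffle
    "spark.shuffle.copier.threads": "6",    # threads for the Netty-based shuffle service
    "spark.shuffle.sender.port": "6653",    # port used by the Netty sender
}


def to_java_opts(settings):
    """Render settings as space-separated -Dkey=value flags, sorted by key."""
    return " ".join("-D%s=%s" % (k, v) for k, v in sorted(settings.items()))


print(to_java_opts(netty_shuffle_settings))
```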
---
To facilitate the review, I mostly made one commit per parameter, but
some closely related parameters are combined. The commit title follows the
format
"%s - %s L%d" % (parameter name, file/class name, line number)
---
spark.deploy.retainedApplications - Master.scala L49
spark.dead.worker.persistence - Master.scala L50
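The title format described above can be written out as a small Python helper. The function name commit_title and its argument names are hypothetical; only the format string and the sample values come from this description.

```python
def commit_title(parameter, location, line):
    """Build a commit title of the form '<parameter> - <file/class> L<line>'."""
    return "%s - %s L%d" % (parameter, location, line)


print(commit_title("spark.deploy.retainedApplications", "Master.scala", 49))
```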
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/CodingCat/spark config_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/85.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #85
----
commit 4a6816b4d8fca663dd6b7fa2cb4eca9aaef58d62
Author: CodingCat <[email protected]>
Date: 2014-03-05T04:23:21Z
spark.deploy.retainedApplications
commit 43eaac1d279ffbc9ffd92c12698bcfcde9c7d60d
Author: CodingCat <[email protected]>
Date: 2014-03-05T04:30:55Z
spark.dead.worker.persistence
commit ef2e8f0ffdd0e1b4ff456f2e836cc17deff45bc0
Author: CodingCat <[email protected]>
Date: 2014-03-05T04:36:55Z
spark.deploy.recoveryDirectory - Master.scala L51
commit 8ae5f2c31c49955c0fe643ac9051262e82a0de86
Author: CodingCat <[email protected]>
Date: 2014-03-05T05:00:09Z
spark.deploy.zookeeper.url - SparkCuratorUtil.scala L34
commit e1b401259b072276d2f3728b1ec83019d9fb22a2
Author: CodingCat <[email protected]>
Date: 2014-03-05T05:31:41Z
spark.repl.class.uri - Executor.scala 313
commit ab807a5931e415ea3586337411426322513d52ba
Author: CodingCat <[email protected]>
Date: 2014-03-05T14:13:25Z
spark.core.connection.*.threads.* - ConnectionManager.scala
commit be4baaf20bae3b5526badb6b2e7e0195da21f9c9
Author: CodingCat <[email protected]>
Date: 2014-03-05T14:21:28Z
spark.shuffle.netty.connect.timeout - ShuffleCopier.scala L38
commit 3dc1c74a1c74e71602c434ddb6412fe2c4da88d1
Author: CodingCat <[email protected]>
Date: 2014-03-05T14:50:05Z
spark.mesos.extra.cores - CoarseMesosSchedulerBackend L79
commit 6eb9cafa1cf73fba5ef8a210fa46f800cf3064a0
Author: CodingCat <[email protected]>
Date: 2014-03-05T14:54:49Z
spark.scheduler.allocation.file - SchedulableBuilder.scala L55
commit bc55539b270a1d87ae078b58a5b6f94f94b30d03
Author: CodingCat <[email protected]>
Date: 2014-03-05T15:05:30Z
spark.resultGetter.threads - TaskResultGetter L33
commit 7d4af4c8fff6932e31c7c563563009e3e1549ecd
Author: CodingCat <[email protected]>
Date: 2014-03-05T15:10:32Z
spark.starvation.timeout - TaskSchedulerImpl L63
commit dc2125bb582d29c87f64701f8c2f9e1dc299513b
Author: CodingCat <[email protected]>
Date: 2014-03-05T15:27:21Z
spark.task.cpus - L60
commit 059b7e7f0b9c062076b7a99259d92228a42faa77
Author: CodingCat <[email protected]>
Date: 2014-03-05T15:33:43Z
spark.logging.exceptionPrintInterval - TaskSetManager.scala L127
commit 05d9aa3842dee923f8cb4a070b3d53e9e77b11ed
Author: CodingCat <[email protected]>
Date: 2014-03-05T15:38:05Z
spark.jars - SparkContext.scala L122
commit f1de674f1fbed3e780adf73649d865fd605b572d
Author: CodingCat <[email protected]>
Date: 2014-03-05T16:09:02Z
spark.shuffle.copier.threads - BlockFetcherIterator L329
commit 5a2bdf28fe311d7b0695c01483ce88924c1bd1f9
Author: CodingCat <[email protected]>
Date: 2014-03-05T16:13:06Z
spark.shuffle.use.netty - BlockManager.scala L446
commit 60e0ee877152368fd2f25df5cb46927be3f6c6fd
Author: CodingCat <[email protected]>
Date: 2014-03-05T16:22:09Z
spark.shuffle.sender.port - BlockManager.scala L59
commit 6243bf52e3bc32ec3d991e19ac3943dce539ff3f
Author: CodingCat <[email protected]>
Date: 2014-03-05T21:37:30Z
spark.shuffle.sync - BlockManager L472
commit e8736a9e6428a5b769ac7ce45ad3008955d17fa4
Author: CodingCat <[email protected]>
Date: 2014-03-05T21:47:39Z
spark.storage.blockManagerTimeoutIntervalMs - BlockManager L874
commit 6291cc7829202297f442e3acf7f53f16e6b690be
Author: CodingCat <[email protected]>
Date: 2014-03-05T21:54:58Z
spark.storage.blockManagerSlaveTimeoutMs - BlockManagerMasterActor L53
commit cfb093660aad644740d6a6db1d115b8f943672b1
Author: CodingCat <[email protected]>
Date: 2014-03-05T22:17:58Z
spark.diskStore.subDirectories - DiskBlockManager L41
commit ddf48a320c3ae47913ba526616af0c428678305b
Author: CodingCat <[email protected]>
Date: 2014-03-06T00:06:11Z
spark.akka.batchSize - AkkaUtils.scala L53
commit e4062f000c87aa30956e799b64966427e8dde0be
Author: CodingCat <[email protected]>
Date: 2014-03-06T00:12:35Z
spark.akka.logAkkaConfig - AkkaUtils L61
commit ab25a96ab272719f1e1051f362efae2338a48e6c
Author: CodingCat <[email protected]>
Date: 2014-03-06T00:17:10Z
spark.akka.askTimeout/lookupTimeout - AkkaUtils L106/111
----