[jira] [Comment Edited] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler

Dongjoon Hyun (JIRA) Thu, 07 Feb 2019 01:18:06 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762499#comment-16762499
 ]


Dongjoon Hyun edited comment on SPARK-26082 at 2/7/19 9:17 AM:
---------------------------------------------------------------

Since this bug was introduced by SPARK-15994, which was added in Spark 2.1.0, I 
removed 2.0.x from the affected versions.

BTW, Spark 2.2.x is EOL (https://spark.apache.org/versioning-policy.html).


was (Author: dongjoon):
Since this bug was introduced by SPARK-15994, which was added in Spark 2.1.0, I 
removed 2.0.x from the affected versions.

> Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
> -----------------------------------------------------------------------
>
>                 Key: SPARK-26082
>                 URL: https://issues.apache.org/jira/browse/SPARK-26082
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 
> 2.3.1, 2.3.2
>            Reporter: Martin Loncaric
>            Priority: Major
>
> Currently in 
> [docs|https://spark.apache.org/docs/latest/running-on-mesos.html]:
> {quote}spark.mesos.fetcherCache.enable / false / If set to `true`, all URIs 
> (example: `spark.executor.uri`, `spark.mesos.uris`) will be cached by the 
> Mesos Fetcher Cache
> {quote}
> Currently in {{MesosClusterScheduler.scala}} (which passes the parameter to 
> the driver):
> {{private val useFetchCache = 
> conf.getBoolean("spark.mesos.fetchCache.enable", false)}}
> Currently in {{MesosCoarseGrainedSchedulerBackend.scala}} (which passes the 
> Mesos caching parameter to executors):
> {{private val useFetcherCache = 
> conf.getBoolean("spark.mesos.fetcherCache.enable", false)}}
> This naming discrepancy dates back to version 2.0.0 
> ([jira|http://mail-archives.apache.org/mod_mbox/spark-issues/201606.mbox/%3cjira.12979909.1466099309000.9921.1466101026...@atlassian.jira%3E]).
> This means that when {{spark.mesos.fetcherCache.enable=true}} is specified, 
> the Mesos cache will be used only for executors, and not for drivers.
> IMPACT:
> Not caching these driver files (typically including at least the Spark 
> binaries, a custom jar, and additional dependencies) adds considerable 
> network traffic and startup time when Spark applications are run frequently 
> on a Mesos cluster. In addition, with the cache off, extracted files like 
> {{spark-x.x.x-bin-*.tgz}} are copied into the sandbox and left there (rather 
> than being extracted directly without an extra copy), which can considerably 
> increase disk usage. Users CAN currently work around this by specifying the 
> {{spark.mesos.fetchCache.enable}} option, but this should at least be 
> mentioned in the documentation.
> SUGGESTED FIX:
> Add {{spark.mesos.fetchCache.enable}} to the documentation for versions 2 - 
> 2.4, and update {{MesosClusterScheduler.scala}} to use 
> {{spark.mesos.fetcherCache.enable}} going forward (literally a one-line 
> change).
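
The mismatch described above can be reproduced without Spark or Mesos. The 
following is a minimal sketch, not the actual scheduler code: {{Conf}} is a 
hypothetical stand-in for {{SparkConf.getBoolean}}, and the object and method 
names are invented for illustration. Only the two key strings come from the 
report. The fallback to the old key in {{driverUsesCacheFixed}} is an 
assumption added here for backward compatibility; the report itself only 
suggests switching the driver path to the documented key.

```scala
// Hypothetical stand-in for SparkConf; only getBoolean is modelled.
final case class Conf(settings: Map[String, String]) {
  def getBoolean(key: String, default: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)
}

object FetcherCacheKeys {
  // Documented key, read on the executor path according to the report.
  val DocumentedKey = "spark.mesos.fetcherCache.enable"
  // Undocumented key, read on the driver path according to the report.
  val DriverKey = "spark.mesos.fetchCache.enable"

  // Behaviour as reported: each path consults a different key.
  def executorUsesCache(conf: Conf): Boolean =
    conf.getBoolean(DocumentedKey, false)

  def driverUsesCacheToday(conf: Conf): Boolean =
    conf.getBoolean(DriverKey, false)

  // Sketch of a fix: prefer the documented key. Falling back to the old
  // key is an assumption of this example, not part of the report.
  def driverUsesCacheFixed(conf: Conf): Boolean =
    conf.getBoolean(DocumentedKey, conf.getBoolean(DriverKey, false))
}
```

With only {{spark.mesos.fetcherCache.enable=true}} set, {{executorUsesCache}} 
returns true while {{driverUsesCacheToday}} returns false, which is the 
reported behaviour. Setting {{spark.mesos.fetchCache.enable=true}} as well is 
the workaround mentioned in the description.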



