[jira] [Assigned] (SPARK-26082) Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler

Apache Spark (JIRA) Thu, 07 Feb 2019 01:07:40 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Apache Spark reassigned SPARK-26082:
------------------------------------

    Assignee: Apache Spark

> Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
> -----------------------------------------------------------------------
>
>                 Key: SPARK-26082
>                 URL: https://issues.apache.org/jira/browse/SPARK-26082
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 
> 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2
>            Reporter: Martin Loncaric
>            Assignee: Apache Spark
>            Priority: Major
>
> Currently in 
> [docs|https://spark.apache.org/docs/latest/running-on-mesos.html]:
> {quote}spark.mesos.fetcherCache.enable / false / If set to `true`, all URIs 
> (example: `spark.executor.uri`, `spark.mesos.uris`) will be cached by the 
> Mesos Fetcher Cache
> {quote}
> Currently in {{MesosClusterScheduler.scala}} (which passes the parameter to 
> the driver):
> {{private val useFetchCache = 
> conf.getBoolean("spark.mesos.fetchCache.enable", false)}}
> Currently in {{MesosCoarseGrainedSchedulerBackend.scala}} (which passes the 
> Mesos caching parameter to executors):
> {{private val useFetcherCache = 
> conf.getBoolean("spark.mesos.fetcherCache.enable", false)}}
> This naming discrepancy dates back to version 2.0.0 
> ([jira|http://mail-archives.apache.org/mod_mbox/spark-issues/201606.mbox/%3cjira.12979909.1466099309000.9921.1466101026...@atlassian.jira%3E]).
> This means that when {{spark.mesos.fetcherCache.enable=true}} is specified, 
> the Mesos cache will be used only for executors, and not for drivers.
> IMPACT:
> Not caching these driver files (typically including at least the Spark 
> binaries, a custom jar, and additional dependencies) adds considerable 
> network traffic and startup time when Spark applications are run frequently 
> on a Mesos cluster. In addition, with the cache off, extracted files such as 
> {{spark-x.x.x-bin-*.tgz}} are copied and left in the sandbox (rather than 
> extracted directly without an extra copy), which can considerably increase 
> disk usage. Users can currently work around this by specifying the 
> {{spark.mesos.fetchCache.enable}} option, but this should at least be 
> mentioned in the documentation.
> SUGGESTED FIX:
> Add {{spark.mesos.fetchCache.enable}} to the documentation for versions 2.0 
> through 2.4, and update {{MesosClusterScheduler.scala}} to use 
> {{spark.mesos.fetcherCache.enable}} going forward (literally a one-line 
> change).
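
The key mismatch described in the issue can be illustrated with a minimal, Spark-free sketch. A plain Map stands in for SparkConf, and the object and method names are hypothetical. The fallback to the old key in driverUsesCacheFixed is an assumption added here for backward compatibility; the issue itself only proposes switching to the documented key.

```scala
// A plain Map stands in for SparkConf so the sketch needs no Spark dependency.
object FetcherCacheKeys {
  // Key named in the documentation and read on the executor side.
  val DocumentedKey = "spark.mesos.fetcherCache.enable"
  // Key that MesosClusterScheduler reads on the driver side, per the issue.
  val DriverKey = "spark.mesos.fetchCache.enable"

  // Mimics the shape of conf.getBoolean(key, default).
  def getBoolean(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.trim.toBoolean).getOrElse(default)

  // Executor side: reads the documented key.
  def executorUsesCache(conf: Map[String, String]): Boolean =
    getBoolean(conf, DocumentedKey, default = false)

  // Driver side as reported: reads the undocumented key.
  def driverUsesCacheCurrent(conf: Map[String, String]): Boolean =
    getBoolean(conf, DriverKey, default = false)

  // Hypothetical fix: prefer the documented key, and fall back to the old key
  // (the fallback is an assumption, not part of the suggested fix).
  def driverUsesCacheFixed(conf: Map[String, String]): Boolean =
    getBoolean(conf, DocumentedKey, default = getBoolean(conf, DriverKey, default = false))
}
```

With only {{spark.mesos.fetcherCache.enable=true}} set, executorUsesCache returns true while driverUsesCacheCurrent returns false, which is the behavior reported above; driverUsesCacheFixed returns true for either key.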



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
