Dear all,

We are running Spark with Mesos as the master for resource management. Our cluster runs jobs that require very short response times (near-real-time applications), usually around 3-5 seconds.
To run Spark on Mesos, one has to set the SPARK_EXECUTOR_URI configuration, which tells Mesos where to fetch the Spark binary distribution each time it launches a new job. We noticed that this fetch and extraction is repeated on every run, even though the binary is essentially unchanged. More importantly, fetching and extracting the archive adds 2-3 seconds of latency, which is fatal for our near-real-time applications. In addition, after running many Spark jobs, the extracted Spark archives accumulate and occupy a large amount of disk space.

Is there a workaround to avoid this fetch-and-extract step, given that the Spark binary is already available locally on each Mesos agent? Please let me know if you need any further information. Thank you in advance.

Best regards

--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
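P.S. For concreteness, this is roughly how the configuration is set on our side; the library path, HDFS host, and Spark version below are illustrative, not our actual values:

```
# conf/spark-env.sh -- sketch of our Spark-on-Mesos setup (paths are examples)
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
# Mesos fetches and extracts this archive on an agent for every new job:
export SPARK_EXECUTOR_URI=hdfs://namenode:9000/dist/spark-2.3.0-bin-hadoop2.7.tgz
```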