I don't think it's sufficient to have them in YARN (or any other service)
without Spark being aware of them. If Spark is not aware of them, then there
is no way to really efficiently utilize these accelerators when you run
anything that requires non-accelerators (which is almost 100% of the cases
in
Hadoop / YARN 3.1 added GPU scheduling, and 3.2 is planned to add FPGA
scheduling, so it might be worth making the last point generic so that not
only the Spark scheduler but all supported schedulers can use GPUs.
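For context on what a generic, scheduler-agnostic request could look like: accelerator-aware scheduling later landed in Spark 3.0 (SPARK-24615) as generic resource properties rather than GPU-specific flags. A minimal sketch, assuming Spark 3.0+; the discovery-script path is a placeholder, not a real file:

```
# spark-defaults.conf style sketch (Spark 3.0+ resource scheduling).
# The executor asks the cluster manager (YARN, Kubernetes, standalone)
# for 2 GPUs and each task claims 1 of them.
spark.executor.resource.gpu.amount           2
spark.executor.resource.gpu.discoveryScript  /opt/spark/bin/getGpus.sh
spark.task.resource.gpu.amount               1
```

The `gpu` name here is just a resource type; the same properties work for other accelerators, which is what makes the request generic across schedulers.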
For the other two points, I just wonder whether it makes sense to address this
in the
ml
Thanks Reynold for summarizing the offline discussion! I added a few
comments inline. -Xiangrui
On Mon, May 7, 2018 at 5:37 PM Reynold Xin wrote:
> Hi all,
>
> Xiangrui and I were discussing with a heavy Apache Spark user last week on
> their experiences integrating machine
Hi all,
Xiangrui and I were discussing with a heavy Apache Spark user last week on
their experiences integrating machine learning (and deep learning)
frameworks with Spark and some of their pain points. A couple of things
were obvious, and I wanted to share our learnings with the list.
(1) Most
On Mon, May 7, 2018 at 1:44 AM, Anshi Shrivastava
wrote:
> I've found a KVStore wrapper which stores all the metrics in a LevelDB
> store. This KVStore wrapper is available as a Spark dependency, but we cannot
> access the metrics directly from Spark since they are
Hi Joseph and devs,
Happy to see the discussion of CP shuffle, as commented in
https://issues.apache.org/jira/browse/SPARK-20928?focusedCommentId=16245556&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16245556
Hi All,
I've been trying to debug the Spark UI source code to replicate the same
metric-monitoring mechanism in my application.
I've found a KVStore wrapper which stores all the metrics in a LevelDB
store. This KVStore wrapper is available as a Spark dependency, but we
cannot access the metrics
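Rather than reaching into the internal KVStore wrapper, a hedged alternative sketch: Spark's documented monitoring REST API (served by the driver UI under `/api/v1` since Spark 1.4) exposes the same metrics that back the UI. The host, port, and application id below are placeholders, not values from this thread:

```python
import json
from urllib.request import urlopen


def metrics_url(host: str, app_id: str, port: int = 4040,
                endpoint: str = "stages") -> str:
    """Build a URL for Spark's monitoring REST API on the driver UI."""
    return f"http://{host}:{port}/api/v1/applications/{app_id}/{endpoint}"


def fetch_metrics(host: str, app_id: str) -> list:
    """Fetch per-stage metrics as parsed JSON (needs a running driver)."""
    with urlopen(metrics_url(host, app_id)) as resp:
        return json.load(resp)
```

For example, `metrics_url("localhost", "app-1")` yields the stages endpoint for that application; swapping `endpoint` for `"executors"` or `"jobs"` reaches the other documented views, all without linking against Spark internals.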