ch more. Any failure during this long
time is pretty expensive.
Shay

From: Tom Graves
Sent: Thursday, November 3, 2022 7:56 PM
To: Artemis User; user@spark.apache.org; Shay Elbaz
Subject: [EXTERNAL] Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs
Stage level scheduling does not allow you to change configs right now. This is
something we thought about as a follow-on but have never implemented. How many
tasks in the DL stage are you running? The typical case is: run some ETL with
lots of tasks, do mapPartitions, and then run your DL stuff.
As Sean mentioned, it's only available at the stage level, but you said you don't
want to shuffle, so splitting into stages doesn't help you. Without more
details it seems like you could "hack" this by just requesting an executor with
1 GPU (allowing 2 tasks per GPU) and 2 CPUs, and the one task would
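The resource-request arithmetic behind that "hack" can be sketched in plain Python. This is a minimal illustration, not Spark API; the function name is hypothetical, but the division logic mirrors how Spark derives concurrent task slots per executor from spark.executor.cores, spark.task.cpus, and spark.task.resource.gpu.amount:

```python
def concurrent_tasks_per_executor(executor_cores, task_cpus,
                                  executor_gpus, task_gpu_amount):
    """Concurrent task slots on one executor: Spark runs as many tasks
    as the most constrained resource allows (minimum of CPU-limited
    and GPU-limited slot counts)."""
    by_cpu = executor_cores // task_cpus
    by_gpu = int(executor_gpus / task_gpu_amount)
    return min(by_cpu, by_gpu)

# The setup from this thread: 2 CPUs, 1 GPU, and a fractional GPU
# request (0.5) so two tasks share the single GPU.
print(concurrent_tasks_per_executor(executor_cores=2, task_cpus=1,
                                    executor_gpus=1, task_gpu_amount=0.5))
```

With a whole-GPU request (task_gpu_amount=1.0) the same executor would run only one task at a time, even though two CPU slots are free.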
Hey Martin,
I would encourage you to file issues in the spark-rapids repo for questions
with that plugin: https://github.com/NVIDIA/spark-rapids/issues
I'm assuming the query ran and you looked at the SQL UI or the .explain()
output and it was on CPU and not GPU? I am assuming you have the
I don't know if it all works, but some work was done to make the cluster
manager pluggable; see SPARK-13904.
Tom
On Wednesday, November 6, 2019, 07:22:59 PM CST, Klaus Ma wrote:
Any suggestions?
- Klaus
On Mon, Nov 4, 2019 at 5:04 PM Klaus Ma wrote:
Hi team,
AFAIK, we built
We are happy to announce the availability of Spark 2.2.2!
Apache Spark 2.2.2 is a maintenance release, based on the branch-2.2
maintenance branch of Spark. We strongly recommend all 2.2.x users upgrade
to this stable release. The release notes are available at
wrote:
I think this is happening in the driver. Could you check the classpath
of the JVM that gets started? If you use spark-submit on YARN, the
classpath is set up before R gets launched, so it should match the
behavior of Scala / Python.
Thanks
Shivaram
On Fri, Nov 6, 2015 at 1:39 PM, Tom
I'm trying to use the netlib-java stuff with MLlib and SparkR on YARN. I've
compiled with -Pnetlib-lgpl and see the necessary things in the Spark assembly
jar. The nodes have /usr/lib64/liblapack.so.3, /usr/lib64/libblas.so.3, and
/usr/lib/libgfortran.so.3.
Running: data <- read.df(sqlContext,
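A quick way to confirm those native libraries actually load on a given node is a small ctypes probe. This is a hedged sketch; check_native_libs is a hypothetical helper, not part of Spark or netlib-java:

```python
import ctypes

def check_native_libs(libs=("liblapack.so.3", "libblas.so.3",
                            "libgfortran.so.3")):
    """Return a dict mapping each shared-library name to whether
    it can be dlopen'd on this machine."""
    status = {}
    for lib in libs:
        try:
            ctypes.CDLL(lib)
            status[lib] = True
        except OSError:
            status[lib] = False
    return status

for lib, ok in check_native_libs().items():
    print(lib, "OK" if ok else "MISSING")
```

Running this on each worker node (e.g. via a one-line job) distinguishes "library missing on the node" from "library present but not picked up by the JVM classpath/loader".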
I would like to change the logging level for my application running on a
standalone Spark cluster. Is there an easy way to do that without changing
the log4j.properties on each individual node?
Thanks,
Tom
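One commonly used approach (not taken from this thread) is to ship a custom log4j configuration with the job rather than editing each node. A sketch, assuming a Spark/log4j 1.x cluster; my-log4j.properties and my_app.py are placeholders:

```shell
spark-submit \
  --files my-log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=my-log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=my-log4j.properties" \
  my_app.py
```

At runtime, sc.setLogLevel("WARN") on the SparkContext also adjusts the level programmatically without touching any files.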
It sounds like something is closing the HDFS filesystem before everyone is
really done with it. The filesystem gets cached and is shared, so if someone
closes it while other threads are still using it, you run into this error. Is
your application closing the filesystem? Are you using the
You need to look at the log files for YARN. Generally this can be done with
yarn logs -applicationId your_app_id. That only works if you have log
aggregation enabled, though. You should be able to see at least the application
master logs through the YARN ResourceManager web UI. I would try
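The log-fetching step above as a CLI sketch; the application id is a placeholder, and yarn.log-aggregation-enable is the standard YARN property controlling aggregation:

```shell
# Works after the application finishes, and only if log aggregation
# is enabled (yarn.log-aggregation-enable=true in yarn-site.xml).
yarn logs -applicationId <application_id> > app_logs.txt
```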
Since 1.0 is still in development you can pick up the latest docs in git:
https://github.com/apache/spark/tree/branch-1.0/docs
I didn't see anywhere that you said you started the Spark history server.
There are multiple things that need to happen for the Spark history server to
work:
1)
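The checklist is cut off here; for reference, the pieces commonly involved are event logging on the application side plus a running history server reading the same directory. A minimal sketch; the HDFS path is a placeholder:

```properties
# spark-defaults.conf
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-history
spark.history.fs.logDirectory    hdfs:///spark-history
```

The server itself is started with sbin/start-history-server.sh and serves its UI on port 18080 by default.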
Do we have a list of things we really want to get in for 1.X? Perhaps move
any JIRAs out to a 1.1 release if we aren't targeting them for 1.0.
It might be nice to send out reminders when these dates are approaching.
Tom
On Thursday, April 3, 2014 11:19 PM, Bhaskar Dutta bhas...@gmail.com
I had asked a similar question on the dev mailing list a while back (Jan 22nd).
See the archives:
http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser - look
for spork.
Basically Matei said:
Yup, that was it, though I believe people at Twitter picked it up again
recently.