[ANNOUNCE] Apache Spark 3.0.3 released

2021-06-24 Thread Yi Wu
We are happy to announce the availability of Spark 3.0.3! Spark 3.0.3 is a maintenance release containing stability fixes. This release is based on the branch-3.0 maintenance branch of Spark. We strongly recommend that all 3.0 users upgrade to this stable release. To download Spark 3.0.3, head over

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess that, coming from the corporate world, I overlooked how an open source project like Spark leverages resources and interest :). As @KlausMa kindly volunteered, it would be good to hear scheduling ideas on Spark on Kubernetes and of course as I am su

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
That's awesome, I'm just starting to get context around Volcano but maybe we can schedule an initial meeting for all of us interested in pursuing this to get on the same page.

On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote:
> Hi team,
>
> I'm kube-batch/Volcano founder, and I'm excited to hear t

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Lalwani, Jayesh
You can always chain aggregations by chaining multiple Structured Streaming jobs. It’s not a showstopper. Getting Spark on Kubernetes is important for organizations that want to pursue a multi-cloud strategy.

From: Mich Talebzadeh
Date: Wednesday, June 23, 2021 at 11:27 AM
To: "user @spark"
Cc

Re: Issue with Running Spark in Jupyter Notebook

2021-06-24 Thread Artemis User
Looks like you didn't set up your environment properly. I assume you are running this from a standalone Python program instead of from the pyspark shell. I would first run your code from the pyspark shell, then follow the Spark Python installation guide to set up your Python environment prope
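
The advice above to verify the environment could be sketched roughly as follows. This is only an illustration under stated assumptions: the error message in the original question is truncated, so the actual cause is unknown, and the function name `check_spark_environment` and the particular checks (is `pyspark` importable, is a `java` executable on `PATH`, are `JAVA_HOME` and `SPARK_HOME` set) are hypothetical choices, not something taken from the thread.

```python
import importlib.util
import os
import shutil


def check_spark_environment(env=None):
    """Report basic prerequisites for starting PySpark locally.

    Hypothetical helper for illustration only. `env` defaults to the
    current process environment; pass a dict to inspect other settings.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    return {
        # Can the pyspark package be found by this Python interpreter?
        "pyspark_importable": importlib.util.find_spec("pyspark") is not None,
        # Is a `java` executable reachable through PATH?
        "java_on_path": shutil.which("java", path=env.get("PATH", "")) is not None,
        # Is JAVA_HOME set, and does it point at an existing directory?
        "java_home_set": bool(java_home),
        "java_home_exists": bool(java_home) and os.path.isdir(java_home),
        # SPARK_HOME is optional for pip-installed PySpark; shown for reference.
        "spark_home": env.get("SPARK_HOME"),
    }


if __name__ == "__main__":
    for name, value in check_spark_environment().items():
        print(f"{name}: {value}")
```

Running this in the same Jupyter kernel that raised the error, rather than in a separate terminal, may be more informative, since the notebook kernel can see a different environment than the shell.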

Issue with Running Spark in Jupyter Notebook

2021-06-24 Thread Hsu, Philip
Hi there, My name is Philip, a master’s student at Imperial College London. I’m trying to use Spark to complete my course work assignment. I ran the following code:

    from pyspark import SparkContext
    sc = SparkContext.getOrCreate()

and got the following error message:

    Py4JJavaError: An error occ

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Klaus! I am interested in more details.

On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote:
> Hi team,
>
> I'm kube-batch/Volcano founder, and I'm excited to hear that the spark
> community also has such requirements :)
>
> Volcano provides several features for batch workload, e.g. fair-share

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Thanks Klaus. That will be great. It would also be helpful if you could elaborate on the need for this feature in light of the limitations of the current batch workload. Regards, Mich