Re: Dynamic Scaling without Kubernetes

2022-10-26 Thread Artemis User
Wouldn't you need to run Spark on Hadoop in order to use YARN?  I believe that YARN only manages Hadoop nodes, not Spark workers directly.  Besides, what I read was that you would need some extra plug-ins to be able to get nodes managed dynamically. Our use case would be like this: 1. A

Re: Dynamic Scaling without Kubernetes

2022-10-26 Thread Holden Karau
So Spark can dynamically scale on YARN, but standalone mode becomes a bit complicated — where do you envision Spark gets the extra resources from? On Wed, Oct 26, 2022 at 12:18 PM Artemis User wrote: > Has anyone tried to make a Spark cluster dynamically scalable, i.e., > adding a new worker

Dynamic Scaling without Kubernetes

2022-10-26 Thread Artemis User
Has anyone tried to make a Spark cluster dynamically scalable, i.e., automatically adding a new worker node to the cluster when no executors are available for a newly submitted job?  We need to make the whole cluster on-prem and really lightweight, so standalone mode is preferred and no

Re: Running 30 Spark applications at the same time is slower than one on average

2022-10-26 Thread Sean Owen
That just means G = GB mem, C = cores, but yeah the driver and executors are very small, possibly related. On Wed, Oct 26, 2022 at 12:34 PM Artemis User wrote: > Are these Cloudera specific acronyms? Not sure how Cloudera configures > Spark differently, but obviously the number of nodes is too

Re: Running 30 Spark applications at the same time is slower than one on average

2022-10-26 Thread Artemis User
Are these Cloudera-specific acronyms?  Not sure how Cloudera configures Spark differently, but obviously the number of nodes is too small, considering each app only uses a small number of cores and RAM.  So you may consider increasing the number of nodes.   When all these apps jam on a few

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Chao Sun
Congrats everyone, and thanks Yuming for driving the release! On Wed, Oct 26, 2022 at 7:37 AM beliefer wrote: > > Congratulations to everyone who has contributed to this release. > > > At 2022-10-26 14:21:36, "Yuming Wang" wrote: > > We are happy to announce the availability of Apache Spark 3.3.1! >

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread beliefer
Congratulations to everyone who has contributed to this release. At 2022-10-26 14:21:36, "Yuming Wang" wrote: We are happy to announce the availability of Apache Spark 3.3.1! Spark 3.3.1 is a maintenance release containing stability fixes. This release is based on the branch-3.3 maintenance

Re: Running 30 Spark applications at the same time is slower than one on average

2022-10-26 Thread Sean Owen
Resource contention. Now all the CPU and I/O are competing, which probably slows things down. On Wed, Oct 26, 2022, 5:37 AM eab...@163.com wrote: > Hi All, > > I have a CDH5.16.2 hadoop cluster with 1+3 nodes(64C/128G, 1NN/RM + > 3DN/NM), and yarn with 192C/240G. I used the following test scenario: > >

Running 30 Spark applications at the same time is slower than one on average

2022-10-26 Thread eab...@163.com
Hi All, I have a CDH5.16.2 Hadoop cluster with 1+3 nodes (64C/128G, 1NN/RM + 3DN/NM), and YARN with 192C/240G. I used the following test scenario: 1. Spark app resources: 2G driver memory / 2C driver vcores / 1 executor / 2G executor memory / 2C executor vcores. 2. One Spark app will use 5G4C on

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Jacek Laskowski
Yoohoo! Thanks Yuming for driving this release. A tiny step for Spark, a huge one for my clients (who are still on 3.2.1 or even older :)) Pozdrawiam, Jacek Laskowski https://about.me/JacekLaskowski "The Internals Of" Online Books Follow me on

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Yang,Jie(INF)
Thanks Yuming and all developers ~ Yang Jie From: Maxim Gekk Date: Wednesday, October 26, 2022, 15:19 To: Hyukjin Kwon Cc: "L. C. Hsieh" , Dongjoon Hyun , Yuming Wang , dev , User Subject: Re: [ANNOUNCE] Apache Spark 3.3.1 released Congratulations to everyone on the new release, and thanks to Yuming for his

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Maxim Gekk
Congratulations to everyone on the new release, and thanks to Yuming for his efforts. Maxim Gekk Software Engineer Databricks, Inc. On Wed, Oct 26, 2022 at 10:14 AM Hyukjin Kwon wrote: > Thanks, Yuming. > > On Wed, 26 Oct 2022 at 16:01, L. C. Hsieh wrote: > >> Thank you for driving the

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Hyukjin Kwon
Thanks, Yuming. On Wed, 26 Oct 2022 at 16:01, L. C. Hsieh wrote: > Thank you for driving the release of Apache Spark 3.3.1, Yuming! > > On Tue, Oct 25, 2022 at 11:38 PM Dongjoon Hyun > wrote: > > > > It's great. Thank you so much, Yuming! > > > > Dongjoon > > > > On Tue, Oct 25, 2022 at 11:23

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread L. C. Hsieh
Thank you for driving the release of Apache Spark 3.3.1, Yuming! On Tue, Oct 25, 2022 at 11:38 PM Dongjoon Hyun wrote: > > It's great. Thank you so much, Yuming! > > Dongjoon > > On Tue, Oct 25, 2022 at 11:23 PM Yuming Wang wrote: >> >> We are happy to announce the availability of Apache Spark

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Dongjoon Hyun
It's great. Thank you so much, Yuming! Dongjoon On Tue, Oct 25, 2022 at 11:23 PM Yuming Wang wrote: > We are happy to announce the availability of Apache Spark 3.3.1! > > Spark 3.3.1 is a maintenance release containing stability fixes. This > release is based on the branch-3.3 maintenance

[ANNOUNCE] Apache Spark 3.3.1 released

2022-10-26 Thread Yuming Wang
We are happy to announce the availability of Apache Spark 3.3.1! Spark 3.3.1 is a maintenance release containing stability fixes. This release is based on the branch-3.3 maintenance branch of Spark. We strongly recommend all 3.3 users to upgrade to this stable release. To download Spark 3.3.1,