Spark on Kubernetes scheduler variety

2021-06-17 Thread Holden Karau
Hi Folks, I'm continuing my adventures to make Spark on containers party and I was wondering if folks have experience with the different batch scheduler options that they prefer? I was thinking so that we can better support dynamic allocation it might make sense for us to support using different

Re: [VOTE] Release Spark 3.0.3 (RC1)

2021-06-17 Thread Sean Owen
+1 same result as ever. Signatures are OK, tags look good, tests pass. On Thu, Jun 17, 2021 at 5:11 AM Yi Wu wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.0.3. > > The vote is open until Jun 21st 3AM (PST) and passes if a majority +1 PMC > votes are cast,

Re: CRAN package SparkR

2021-06-17 Thread Felix Cheung
Any suggestions or comments on this? They are going to remove the package by 6-28. It seems to me that if we have a switch to opt in to install (not on by default), or prompt the user in an interactive session, that should be good enough as user confirmation. On Sun, Jun 13, 2021 at 11:25 PM Felix Cheung wrote: >

Re: UPDATE: Apache Spark 3.2 Release

2021-06-17 Thread Dongjoon Hyun
Thank you for the correction, Yikun. Yes, it's 3.3.1. :) On 2021/06/17 09:03:55, Yikun Jiang wrote: > - Apache Hadoop 3.3.2 becomes the default Hadoop profile for Apache Spark > 3.2 via SPARK-29250 today. We are observing big improvements in S3 use > cases. Please try it and share your

[VOTE] Release Spark 3.0.3 (RC1)

2021-06-17 Thread Yi Wu
Please vote on releasing the following candidate as Apache Spark version 3.0.3. The vote is open until Jun 21st 3AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.0.3 [ ] -1 Do not release this package because ...

Re: UPDATE: Apache Spark 3.2 Release

2021-06-17 Thread Yikun Jiang
- Apache Hadoop 3.3.2 becomes the default Hadoop profile for Apache Spark 3.2 via SPARK-29250 today. We are observing big improvements in S3 use cases. Please try it and share your experience. It should be Apache Hadoop 3.3.1 [1]. :) Note that Apache Hadoop 3.3.0 is the first Hadoop release

Re: Migrating from hive to spark

2021-06-17 Thread Mich Talebzadeh
OK, the first link offers some clues: ... Hive excels in batch disk processing with a MapReduce execution engine. Actually, Hive can also use Spark as its execution engine, which also has a Hive context allowing us to query Hive tables. Despite all the great things Hive can solve, this post is to

Migrating from hive to spark

2021-06-17 Thread Battula, Brahma Reddy
Hi Talebzadeh, It looks like I was confused, sorry. I have now changed the subject to make it clear. Facebook has tried migrating from Hive to Spark. Check the following links for details. https://www.dcsl.com/migrating-from-hive-to-spark/

Re: Apache Spark 3.2 Expectation

2021-06-17 Thread Hyukjin Kwon
*GA -> QA On Thu, 17 Jun 2021, 15:16 Hyukjin Kwon, wrote: > I think we should make sure to treat the items in the list as exceptions > to the code freeze, while discouraging pushing new APIs and features. > > During the GA period, ideally we should focus on bug fixes and polishing. > > It would be

Re: Apache Spark 3.2 Expectation

2021-06-17 Thread Hyukjin Kwon
I think we should make sure to treat the items in the list as exceptions to the code freeze, while discouraging pushing new APIs and features. During the GA period, ideally we should focus on bug fixes and polishing. It would be great if we can speed up on these items in the list too. On Thu, 17 Jun

Re: Apache Spark 3.2 Expectation

2021-06-17 Thread Gengliang Wang
Thanks for the suggestions from Dongjoon, Liangchi, Min, and Xiao! Now we make it clear that it's a soft cut and we can still merge important code changes to branch-3.2 before RC. Let's keep the branch cut date as July 1st. On Thu, Jun 17, 2021 at 1:41 PM Dongjoon Hyun wrote: > > First, I think