Re: Apache Spark 3.2 Expectation

2021-07-01 Thread Gengliang Wang
Hi all, I just cut branch-3.2 on GitHub and created version 3.3.0 on Jira. When merging PRs on the master branch before 3.2.0 RC, please help cherry-pick bug fixes and ongoing major features mentioned in this thread to branch-3.2, thanks! On Fri, Jul 2, 2021 at 2:31 AM Dongjoon Hyun wrote: >

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Thanks. I also have a three-node cluster in my lab running Red Hat 7.6 with 64GB of RAM etc. However, I doubt whether minikube will be useful. If we can get a Google Kubernetes Engine (GKE) cluster (which is a fully managed service) from Google on a loan

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Holden Karau
I do my own dev work on a personal cluster I have down in Fremont, which I’ve got set up using k3sup. I know some devs use minikube (and our integration tests can). But yeah, if there were a vendor willing to hand out Kube resources, that could simplify our dev cycles. On Thu, Jul 1, 2021 at 12:52 PM M

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Hi, A rather simple question. As Kubernetes is specialized work that takes some effort to set up properly, do we have a dev/test bed to conduct development work? What I am trying to get at is if there is official support for Volcano stuff that a vendor can provide free cluster usage in excha

Re: Apache Spark 3.2 Expectation

2021-07-01 Thread Dongjoon Hyun
Thank you, Gengliang! On Wed, Jun 30, 2021 at 10:56 PM Gengliang Wang wrote: > Hi all, > > Just as a gentle reminder, I will do the branch cut tomorrow. Please > focus on finalizing the work to land in Spark 3.2.0. > After the branch cut, we can still merge the ongoing major features > mentione

Re: Hive on Spark vs Spark on Hive(HiveContext)

2021-07-01 Thread Mich Talebzadeh
Hi Pralabh, You need to check which Spark versions are currently compatible as the Hive execution engine. This is my old file, which refers to spark-1.3.1 as the execution engine: set spark.home=/data6/hduser/spark-1.3.1-bin-hadoop2.6; --set spark.home=/usr/lib/spark-1.6.2-bin-h

Re: Hive on Spark vs Spark on Hive(HiveContext)

2021-07-01 Thread Pralabh Kumar
Hi Mich, Thanks for replying. Your answer really helps. The comparison was done in 2016; I would like to know the latest comparison with Spark 3.0. Also, what you are suggesting is to migrate queries to Spark, which is HiveContext or Hive on Spark, which is what Facebook also did. Is that understanding

Re: Hive on Spark vs Spark on Hive(HiveContext)

2021-07-01 Thread Mich Talebzadeh
Hi Pralabh, This question has been asked before :) A few years ago (late 2016), I made a presentation on running Hive queries on the Spark execution engine for Hortonworks. https://www.slideshare.net/MichTalebzadeh1/query-engines-for-hive-mr-spark-tez-with-llap-considerations The issue you will

Hive on Spark vs Spark on Hive(HiveContext)

2021-07-01 Thread Pralabh Kumar
Hi Dev, I have thousands of legacy Hive queries. As part of our move to Spark, we are planning to migrate the Hive queries to Spark. There are two approaches: 1. One is Hive on Spark, which is similar to changing the execution engine in Hive queries, as with Tez. 2. Another one is m