Re: Spark-sql can replace Hive ?

2021-06-15 Thread Mich Talebzadeh
OK, you mean use spark.sql as opposed to HiveContext.sql?

// HiveContext has been deprecated since Spark 2.0 in favour of SparkSession
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("")

replace with

spark.sql("")
?
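For reference, a minimal sketch of that replacement, assuming Spark 2.x or later where SparkSession supersedes HiveContext (the app name and table below are hypothetical):

import org.apache.spark.sql.SparkSession

// SparkSession supersedes HiveContext from Spark 2.0 onwards;
// enableHiveSupport() keeps tables registered in the Hive metastore visible.
val spark = SparkSession.builder()
  .appName("hive-to-spark-sql")   // hypothetical app name
  .enableHiveSupport()
  .getOrCreate()

// The old hiveContext.sql("...") call simply becomes spark.sql("...")
val df = spark.sql("SELECT * FROM some_db.some_table LIMIT 10")   // hypothetical table
df.show()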


   view my Linkedin profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 15 Jun 2021 at 18:00, Battula, Brahma Reddy 
wrote:

> Currently I am using the Hive SQL engine for ad-hoc queries. As Spark SQL also
> supports this, I want to migrate from Hive.
>
>
>
>
>
>
>
>
>
> *From: *Mich Talebzadeh 
> *Date: *Thursday, 10 June 2021 at 8:12 PM
> *To: *Battula, Brahma Reddy 
> *Cc: *ayan guha , dev@spark.apache.org <
> dev@spark.apache.org>, u...@spark.apache.org 
> *Subject: *Re: Spark-sql can replace Hive ?
>
> These are different things. Spark provides a computational layer and a
> dialect of SQL based on Hive.
>
>
>
> Hive is a DW on top of HDFS. What are you trying to replace?
>
>
>
> HTH
>
>
>
>
>
>
>view my Linkedin profile
> 
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
>
>
>
> On Thu, 10 Jun 2021 at 12:09, Battula, Brahma Reddy
>  wrote:
>
> Thanks for prompt reply.
>
>
>
> I want to replace Hive with Spark.
>
>
>
>
>
>
>
>
>
> *From: *ayan guha 
> *Date: *Thursday, 10 June 2021 at 4:35 PM
> *To: *Battula, Brahma Reddy 
> *Cc: *dev@spark.apache.org , u...@spark.apache.org <
> u...@spark.apache.org>
> *Subject: *Re: Spark-sql can replace Hive ?
>
> Would you mind expanding the ask? Spark SQL can use Hive by itself.
>
>
>
> On Thu, 10 Jun 2021 at 8:58 pm, Battula, Brahma Reddy
>  wrote:
>
> Hi
>
>
>
> Would like to know of any references/docs on replacing Hive with Spark SQL
> completely, like how to migrate the existing data in Hive?
>
>
>
> thanks
>
>
>
>
>
> --
>
> Best Regards,
> Ayan Guha
>
>


Re: Spark-sql can replace Hive ?

2021-06-15 Thread Battula, Brahma Reddy
Currently I am using the Hive SQL engine for ad-hoc queries. As Spark SQL also 
supports this, I want to migrate from Hive.
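A minimal sketch of what that can look like, assuming the existing hive-site.xml is placed on Spark's classpath so Spark SQL points at the same Hive metastore (database and table names below are hypothetical):

import org.apache.spark.sql.SparkSession

// Assumes hive-site.xml from the existing Hive deployment is on Spark's
// classpath (e.g. copied into $SPARK_HOME/conf), so Spark SQL reuses the
// same metastore and warehouse and no data has to be moved.
val spark = SparkSession.builder()
  .appName("adhoc-queries")   // hypothetical app name
  .enableHiveSupport()
  .getOrCreate()

// Existing Hive databases and tables are queryable as-is
spark.sql("SHOW DATABASES").show()
spark.sql("SELECT count(*) FROM sales_db.transactions").show()   // hypothetical table

The bin/spark-sql shell bundled with Spark distributions built with Hive support likewise offers a Hive-CLI-style prompt for ad-hoc queries against the same metastore.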




From: Mich Talebzadeh 
Date: Thursday, 10 June 2021 at 8:12 PM
To: Battula, Brahma Reddy 
Cc: ayan guha , dev@spark.apache.org 
, u...@spark.apache.org 
Subject: Re: Spark-sql can replace Hive ?
These are different things. Spark provides a computational layer and a dialect 
of SQL based on Hive.

Hive is a DW on top of HDFS. What are you trying to replace?

HTH





 
view my Linkedin profile



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.




On Thu, 10 Jun 2021 at 12:09, Battula, Brahma Reddy  
wrote:
Thanks for prompt reply.

I want to replace Hive with Spark.




From: ayan guha <guha.a...@gmail.com>
Date: Thursday, 10 June 2021 at 4:35 PM
To: Battula, Brahma Reddy 
Cc: dev@spark.apache.org <dev@spark.apache.org>, u...@spark.apache.org <u...@spark.apache.org>
Subject: Re: Spark-sql can replace Hive ?
Would you mind expanding the ask? Spark SQL can use Hive by itself.

On Thu, 10 Jun 2021 at 8:58 pm, Battula, Brahma Reddy 
 wrote:
Hi

Would like to know of any references/docs on replacing Hive with Spark SQL completely, 
like how to migrate the existing data in Hive?

thanks


--
Best Regards,
Ayan Guha


Re: Apache Spark 3.2 Expectation

2021-06-15 Thread Hyukjin Kwon
+1, thanks.

On Tue, 15 Jun 2021, 16:17 Gengliang Wang,  wrote:

> Hi,
>
> As the expected release date is close,  I would like to volunteer as the
> release manager for Apache Spark 3.2.0.
>
> Thanks,
> Gengliang
>
> On Mon, Apr 12, 2021 at 1:59 PM Wenchen Fan  wrote:
>
>> An update: we found a mistake that we picked the Spark 3.2 release date
>> based on the scheduled release date of 3.1. However, 3.1 was delayed and
>> released on March 2. In order to have a full 6 months of development for 3.2,
>> the target release date for 3.2 should be September 2.
>>
>> I'm updating the release dates in
>> https://github.com/apache/spark-website/pull/331
>>
>> Thanks,
>> Wenchen
>>
>> On Thu, Mar 11, 2021 at 11:17 PM Dongjoon Hyun 
>> wrote:
>>
>>> Thank you, Xiao, Wenchen and Hyukjin.
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>>
>>> On Thu, Mar 11, 2021 at 2:15 AM Hyukjin Kwon 
>>> wrote:
>>>
 Just for an update, I will send a discussion email about my idea late
 this week or early next week.

 On Thu, 11 Mar 2021 at 7:00 PM, Wenchen Fan wrote:

> There are many projects going on right now, such as new DS v2 APIs,
> ANSI interval types, join improvement, disaggregated shuffle, etc. I don't
> think it's realistic to do the branch cut in April.
>
> I'm +1 to release 3.2 around July, but it doesn't mean we have to cut
> the branch 3 months earlier. We should make the release process faster and
> cut the branch around June probably.
>
>
>
> On Thu, Mar 11, 2021 at 4:41 AM Xiao Li  wrote:
>
>> Below are some nice-to-have features we can work on in Spark 3.2: Lateral
>> Join support,
>> interval data type, timestamp without time zone, un-nesting arbitrary
>> queries, the returned metrics of DSV2, and error message standardization.
>> Spark 3.2 will be another exciting release I believe!
>>
>> Go Spark!
>>
>> Xiao
>>
>>
>>
>>
>> Dongjoon Hyun  wrote on Wed, 10 Mar 2021 at 12:25 PM:
>>
>>> Hi, Xiao.
>>>
>>> This thread started 13 days ago. Since you asked the community about
>>> major features or timelines at that time, could you share your roadmap 
>>> or
>>> expectations if you have something in your mind?
>>>
>>> > Thank you, Dongjoon, for initiating this discussion. Let us keep
>>> it open. It might take 1-2 weeks to collect from the community all the
>>> features we plan to build and ship in 3.2 since we just finished the 3.1
>>> voting.
>>> > TBH, cutting the branch this April does not look good to me. That
>>> means, we only have one month left for feature development of Spark 
>>> 3.2. Do
>>> we have enough features in the current master branch? If not, are we 
>>> able
>>> to finish major features we collected here? Do they have a timeline or
>>> project plan?
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>>
>>>
>>> On Wed, Mar 3, 2021 at 2:58 PM Dongjoon Hyun <
>>> dongjoon.h...@gmail.com> wrote:
>>>
 Hi, John.

 This thread aims to share your expectations and goals (and maybe
 work progress) for Apache Spark 3.2 because we are making this 
 together. :)

 Bests,
 Dongjoon.


 On Wed, Mar 3, 2021 at 1:59 PM John Zhuge 
 wrote:

> Hi Dongjoon,
>
> Is it possible to get ViewCatalog in? The community already had
> fairly detailed discussions.
>
> Thanks,
> John
>
> On Thu, Feb 25, 2021 at 8:57 AM Dongjoon Hyun <
> dongjoon.h...@gmail.com> wrote:
>
>> Hi, All.
>>
>> Since we have been preparing Apache Spark 3.2.0 in master branch
>> since December 2020, March seems to be a good time to share our 
>> thoughts
>> and aspirations on Apache Spark 3.2.
>>
>> According to the progress on Apache Spark 3.1 release, Apache
>> Spark 3.2 seems to be the last minor release of this year. Given the
>> timeframe, we might consider the following. (This is a small set. 
>> Please
>> add your thoughts to this limited list.)
>>
>> # Languages
>>
>> - Scala 2.13 Support: This was expected on 3.1 via SPARK-25075
>> but slipped out. Currently, we are trying to use Scala 2.13.5 via
>> SPARK-34505 and investigating the publishing issue. Thank you for 
>> your
>> contributions and feedback on this.
>>
>> - Java 17 LTS Support: Java 17 LTS will arrive in September 2021.
>> Like Java 11, we need lots of support from our dependencies. Let's 
>> see.
>>
>> - Python 3.6 Deprecation(?): Python 3.6 community support ends at
>> 2021-12-23. So, the deprecation is not required yet, but we had 
>> 

Re: Apache Spark 3.2 Expectation

2021-06-15 Thread Gengliang Wang
Hi,

As the expected release date is close,  I would like to volunteer as the
release manager for Apache Spark 3.2.0.

Thanks,
Gengliang

On Mon, Apr 12, 2021 at 1:59 PM Wenchen Fan  wrote:

> An update: we found a mistake that we picked the Spark 3.2 release date
> based on the scheduled release date of 3.1. However, 3.1 was delayed and
> released on March 2. In order to have a full 6 months of development for 3.2,
> the target release date for 3.2 should be September 2.
>
> I'm updating the release dates in
> https://github.com/apache/spark-website/pull/331
>
> Thanks,
> Wenchen
>
> On Thu, Mar 11, 2021 at 11:17 PM Dongjoon Hyun 
> wrote:
>
>> Thank you, Xiao, Wenchen and Hyukjin.
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Thu, Mar 11, 2021 at 2:15 AM Hyukjin Kwon  wrote:
>>
>>> Just for an update, I will send a discussion email about my idea late
>>> this week or early next week.
>>>
>>> On Thu, 11 Mar 2021 at 7:00 PM, Wenchen Fan wrote:
>>>
 There are many projects going on right now, such as new DS v2 APIs,
 ANSI interval types, join improvement, disaggregated shuffle, etc. I don't
 think it's realistic to do the branch cut in April.

 I'm +1 to release 3.2 around July, but it doesn't mean we have to cut
 the branch 3 months earlier. We should make the release process faster and
 cut the branch around June probably.



 On Thu, Mar 11, 2021 at 4:41 AM Xiao Li  wrote:

> Below are some nice-to-have features we can work on in Spark 3.2: Lateral
> Join support,
> interval data type, timestamp without time zone, un-nesting arbitrary
> queries, the returned metrics of DSV2, and error message standardization.
> Spark 3.2 will be another exciting release I believe!
>
> Go Spark!
>
> Xiao
>
>
>
>
> Dongjoon Hyun  wrote on Wed, 10 Mar 2021 at 12:25 PM:
>
>> Hi, Xiao.
>>
>> This thread started 13 days ago. Since you asked the community about
>> major features or timelines at that time, could you share your roadmap or
>> expectations if you have something in your mind?
>>
>> > Thank you, Dongjoon, for initiating this discussion. Let us keep it
>> open. It might take 1-2 weeks to collect from the community all the
>> features we plan to build and ship in 3.2 since we just finished the 3.1
>> voting.
>> > TBH, cutting the branch this April does not look good to me. That
>> means, we only have one month left for feature development of Spark 3.2. 
>> Do
>> we have enough features in the current master branch? If not, are we able
>> to finish major features we collected here? Do they have a timeline or
>> project plan?
>>
>> Bests,
>> Dongjoon.
>>
>>
>>
>> On Wed, Mar 3, 2021 at 2:58 PM Dongjoon Hyun 
>> wrote:
>>
>>> Hi, John.
>>>
>>> This thread aims to share your expectations and goals (and maybe
>>> work progress) for Apache Spark 3.2 because we are making this together. 
>>> :)
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>>
>>> On Wed, Mar 3, 2021 at 1:59 PM John Zhuge  wrote:
>>>
 Hi Dongjoon,

 Is it possible to get ViewCatalog in? The community already had
 fairly detailed discussions.

 Thanks,
 John

 On Thu, Feb 25, 2021 at 8:57 AM Dongjoon Hyun <
 dongjoon.h...@gmail.com> wrote:

> Hi, All.
>
> Since we have been preparing Apache Spark 3.2.0 in master branch
> since December 2020, March seems to be a good time to share our 
> thoughts
> and aspirations on Apache Spark 3.2.
>
> According to the progress on Apache Spark 3.1 release, Apache
> Spark 3.2 seems to be the last minor release of this year. Given the
> timeframe, we might consider the following. (This is a small set. 
> Please
> add your thoughts to this limited list.)
>
> # Languages
>
> - Scala 2.13 Support: This was expected on 3.1 via SPARK-25075 but
> slipped out. Currently, we are trying to use Scala 2.13.5 via 
> SPARK-34505
> and investigating the publishing issue. Thank you for your 
> contributions
> and feedback on this.
>
> - Java 17 LTS Support: Java 17 LTS will arrive in September 2021.
> Like Java 11, we need lots of support from our dependencies. Let's 
> see.
>
> - Python 3.6 Deprecation(?): Python 3.6 community support ends at
> 2021-12-23. So, the deprecation is not required yet, but we had better
> prepare it because we don't have an ETA of Apache Spark 3.3 in 2022.
>
> - SparkR CRAN publishing: As we know, it's discontinued so far.
> Resuming it depends on the success of Apache SparkR 3.1.1 CRAN 
> publishing.