Re: Spark 3.0.1 not connecting with Hive 2.1.1

2021-01-09 Thread Pradyumn Agrawal
Hi DB Tsai, Thanks for the JIRA link. I think this means I am blocked on the Hive side rather than on Spark. Regards Pradyumn Agrawal Media.net (India) On Sun, Jan 10, 2021 at 10:43 AM DB Tsai wrote: > Hi Pradyumn, > > I think it’s because of a HMS client backward compatibility issue > described here,

Re: Spark 3.0.1 not connecting with Hive 2.1.1

2021-01-09 Thread DB Tsai
Hi Pradyumn, I think it’s because of an HMS client backward-compatibility issue described here: https://issues.apache.org/jira/browse/HIVE-24608 Thanks, DB Tsai | ACI Spark Core | Apple, Inc > On Jan 9, 2021, at 9:53 AM, Pradyumn Agrawal wrote: > > Hi Michael, > Thanks for references,

Re: Use case advice

2021-01-09 Thread muru
You could try Delta Lake or Apache Hudi for this use case. On Sat, Jan 9, 2021 at 12:32 PM András Kolbert wrote: > Sorry if my terminology is misleading. > > What I meant under driver only is to use a local pandas dataframe (collect > the data to the master), and keep updating that instead of

Re: Use case advice

2021-01-09 Thread András Kolbert
Sorry if my terminology is misleading. What I meant by "driver only" is to use a local pandas dataframe (collecting the data to the master) and keep updating that, instead of dealing with a distributed Spark dataframe to hold this data. For example, we have a dataframe with all users and their

Re: Understanding Executors UI

2021-01-09 Thread Amit Sharma
I believe it’s a Spark UI issue where the correct value is not displayed. I believe it is resolved in Spark 3.0. Thanks Amit On Fri, Jan 8, 2021 at 4:00 PM Luca Canali wrote: > You report 'Storage Memory': 3.3TB/ 598.5 GB -> The first number is the > memory used for storage, the second one is the

Re: Use case advice

2021-01-09 Thread Artemis User
Could you please clarify what you mean by 1)? The driver is only responsible for submitting the Spark job, not for performing it. -- ND On 1/9/21 9:35 AM, András Kolbert wrote: Hi, I would like to get your advice on my use case. I have a few Spark Streaming applications where I need to keep updating a

Use case advice

2021-01-09 Thread András Kolbert
Hi, I would like to get your advice on my use case. I have a few Spark Streaming applications where I need to keep updating a dataframe after each batch. Each batch probably affects only a small fraction of the dataframe (5k out of 200k records). The options I have been considering so far: 1) keep

Re: Spark 3.0.1 not connecting with Hive 2.1.1

2021-01-09 Thread michael.yang
Hi Pradyumn, We integrated Spark 3.0.1 with Hive 2.1.1-cdh6.1.0, and querying Hive tables with spark-sql works fine. Make sure you configure spark-defaults.conf and spark-env.sh correctly and copy the Hive/Hadoop-related config files to the Spark conf folder. You can refer to the references below for details.

Re: how to integrate hbase and hive in spark3.0.1?

2021-01-09 Thread michael.yang
Hi all, We also encountered these exceptions when integrating Spark 3.0.1 with Hive 2.1.1-cdh6.1.0 and HBase 2.1.0-cdh-6.1.0. Does anyone have any ideas on how to resolve these exceptions? Thanks in advance. Best. Michael Yang -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/