Re: How Spark establishes connectivity to Hive

2022-03-15 Thread Artemis User
I guess it really depends on your configuration. The Hive metastore provides just the metadata/schema for your database, not the actual data storage. Hive runs on top of Hadoop. If you configure your Spark to run on the same Hadoop cluster using YARN, your SQL dataframe in Spark
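
A minimal sketch of the setup being described, assuming a remote Hive metastore at a hypothetical thrift URI; the metastore serves only schemas and table locations, while the data itself stays on the underlying storage (e.g. HDFS):

    import org.apache.spark.sql.SparkSession

    // Point Spark at the shared Hive metastore (host is illustrative).
    val spark = SparkSession.builder()
      .appName("hive-connectivity-demo")
      .config("hive.metastore.uris", "thrift://metastore-host:9083")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("SHOW DATABASES").show()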

How Spark establishes connectivity to Hive

2022-03-14 Thread Venkatesan Muniappan
Hi Team, I wanted to understand how Spark connects to Hive. Does it connect to the Hive metastore directly, bypassing the Hive server? Let's say we are inserting data into a Hive table with its I/O format as Parquet. Does Spark create the Parquet file from the DataFrame/RDD/Dataset and put it in its
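
For the Parquet case in the question, a sketch (the table name is illustrative, and the session is assumed to be the one sketched above, built with enableHiveSupport): Spark's own Parquet writer produces the files in the table's location and records the metadata through the metastore, without going through HiveServer2.

    val df = spark.range(100).selectExpr("id", "id * 2 AS doubled")

    // Spark writes the Parquet files itself and updates the metastore.
    df.write.mode("append").format("parquet").saveAsTable("demo_db.demo_table")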

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Mich Talebzadeh
Actually, can you create an uber jar file in a conventional way using those two Hadoop versions? You have HADOOP_AWS_VERSION=3.3.0 besides 3.2. HTH

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
I understand what you are saying. However, I am not sure how to implement this when I create a Docker image using Spark 3.2.1 with Hadoop 3.2, which already has the Guava jar added as part of the distribution. On Tue, Feb 15, 2022, 01:17 Mich Talebzadeh wrote: > Hi Raj, > > I found the old email. That is wha

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Mich Talebzadeh
Hi Raj, I found the old email. That is what I did, but it is 2018 stuff. The email says I sorted out this problem: I rewrote the assembly with shade rules to avoid old jar files as follows: lazy val root = (project in file(".")). settings( name := "${APPLICATION}", version := "1.0",
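
The truncated build file above, reconstructed as a sketch (names and versions are illustrative, and the sbt-assembly plugin is assumed); the point is the shade rule that relocates Guava so the application's copy cannot clash with the older one Hadoop drags in:

    // build.sbt
    lazy val root = (project in file("."))
      .settings(
        name := "my-spark-app",
        version := "1.0",
        scalaVersion := "2.12.15",
        libraryDependencies ++= Seq(
          "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided",
          "org.apache.hadoop" % "hadoop-aws" % "3.2.0"
        ),
        // Relocate Guava classes inside the uber jar.
        assembly / assemblyShadeRules := Seq(
          ShadeRule.rename("com.google.common.**" -> "shaded.com.google.common.@1").inAll
        )
      )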

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
Should we remove the existing jar and upgrade to a more recent version? On Tue, Feb 15, 2022, 01:08 Mich Talebzadeh wrote: > I recall I had similar issues running Spark on Google Dataproc. > > sounds like it gets Hadoop's jars on the classpath which include an older > version of Guava. The solu

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Mich Talebzadeh
I recall I had similar issues running Spark on Google Dataproc. It sounds like it gets Hadoop's jars on the classpath, which include an older version of Guava. The solution is to shade/relocate Guava in your distribution. HTH

Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
Hi Team, We are trying to build a Docker image using CentOS and trying to connect to S3. The same works with Hadoop 3.2.0 and Spark 3.1.2. #Installing spark binaries ENV SPARK_HOME /opt/spark ENV SPARK_VERSION 3.2.1 ENV HADOOP_VERSION 3.2.0 ARG HADOOP_VERSION_SHORT=3.2 ARG HADOOP_AWS_VERSION=3.3
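
A minimal S3A sketch for a setup like the one above (bucket and credential provider are illustrative); the usual failure mode is a mismatch between the hadoop-aws version and the Hadoop build Spark ships with, so the two should agree:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("s3a-demo")
      .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
      .config("spark.hadoop.fs.s3a.aws.credentials.provider",
        "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
      .getOrCreate()

    // Requires hadoop-aws and a matching aws-java-sdk-bundle on the classpath.
    spark.read.text("s3a://some-bucket/some-prefix/").show(5)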

Re: Spark hive build and connectivity

2020-10-25 Thread hopefulnick
For compatibility, it's recommended to: - Use a compatible version of Hive. - Build Spark without Hive and configure Hive to use Spark. Here is the way to build Spark with a custom Hive. It worked for me and I hope it helps you. Hive on Spark

Re: Spark hive build and connectivity

2020-10-22 Thread Ravi Shankar
Thanks! I have a very similar setup. I have built Spark with -Phive, which includes hive-2.3.7 jars, spark-hive* jars, and some hadoop-common* jars. At runtime, I set SPARK_DIST_CLASSPATH=$(hadoop classpath) and set spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars to $HIVE_HOME/l
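
A sketch of those two knobs (paths and versions are illustrative): Spark keeps its built-in Hive 2.3.x client for execution but loads a separate client version just for talking to the metastore.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("metastore-version-demo")
      .config("spark.sql.hive.metastore.version", "3.1.2")
      .config("spark.sql.hive.metastore.jars", "/opt/hive/lib/*") // i.e. $HIVE_HOME/lib
      .enableHiveSupport()
      .getOrCreate()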

Re: Spark hive build and connectivity

2020-10-22 Thread Kimahriman
I have always been a little confused about the different Hive version integrations as well. To expand on this question, we have a Hive 3.1.1 metastore that we can successfully interact with using the -Phive profile with Hive 2.3.7. We do not use the Hive 3.1.1 jars anywhere in our Spark applications

Re: Spark hive build and connectivity

2020-10-22 Thread Mich Talebzadeh
Hi, To access Hive tables, Spark uses the native API as below (the default), where you have set up a symlink under $SPARK_HOME/conf: hive-site.xml -> /data6/hduser/hive-3.0.0/conf/hive-site.xml val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc) HiveContext.sql("use ilayer") val account_table = HiveContext
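
The pattern from the snippet, completed as a sketch (the query is illustrative; the database name comes from the excerpt). With hive-site.xml symlinked into Spark's conf directory, HiveContext resolves tables through the shared metastore:

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc) // sc: an existing SparkContext
    hiveContext.sql("use ilayer")
    val account_table = hiveContext.sql("SELECT * FROM accounts LIMIT 10")
    account_table.show()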

Re: Spark hive build and connectivity

2020-10-22 Thread Artemis User
By default Spark will build with Hive 2.3.7, according to the Spark build doc. If you want to replace it with a different Hive jar, you need to change the Maven pom.xml file. -- ND On 10/22/20 11:35 AM, Ravi Shankar wrote: Hello all, I am trying to understand how the Spark SQL integration wi

Re: Spark hive build and connectivity

2020-10-22 Thread Ravi Shankar
Hello Mich, I am just trying to access Hive tables from my Hive 3.2.1 cluster using Spark. Basically I just want my Spark jobs to be able to access these Hive tables. I want to understand how Spark jobs interact with Hive to access these tables. - I see that whenever I build Spark with Hive suppo

Re: Spark hive build and connectivity

2020-10-22 Thread Mich Talebzadeh
Hi Ravi, What exactly are you trying to do? Do you want to enhance Spark SQL, or do you want to run Hive on the Spark engine? HTH

Spark hive build and connectivity

2020-10-22 Thread Ravi Shankar
Hello all, I am trying to understand how the Spark SQL integration with Hive works. Whenever I build Spark with the -Phive -Phive-thriftserver options, I see that it is packaged with hive-2.3.7*.jars and spark-hive*.jars. And the documentation claims that Spark can talk to different versions of Hive.

connectivity

2019-12-01 Thread Krishna Chandran Nair
Hi Team, Can anyone provide sample code to connect to ADLS on Azure using Azure Key Vault (user-managed key)?
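
A heavily hedged sketch only: ABFS (ADLS Gen2) access with a service principal, where the client secret would come from Azure Key Vault via your own lookup (the helper below is a hypothetical stand-in, not a Spark or Azure SDK call); account, tenant, and container values are illustrative.

    import org.apache.spark.sql.SparkSession

    // Hypothetical placeholder for a real Key Vault lookup.
    def fetchSecret(vault: String, name: String): String =
      sys.env.getOrElse("SP_CLIENT_SECRET", "")

    val acct = "myaccount.dfs.core.windows.net"
    val spark = SparkSession.builder().appName("adls-demo")
      .config(s"spark.hadoop.fs.azure.account.auth.type.$acct", "OAuth")
      .config(s"spark.hadoop.fs.azure.account.oauth.provider.type.$acct",
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
      .config(s"spark.hadoop.fs.azure.account.oauth2.client.id.$acct", "app-client-id")
      .config(s"spark.hadoop.fs.azure.account.oauth2.client.secret.$acct",
        fetchSecret("my-vault", "sp-client-secret"))
      .config(s"spark.hadoop.fs.azure.account.oauth2.client.endpoint.$acct",
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
      .getOrCreate()

    spark.read.parquet(s"abfss://container@$acct/path/").show(5)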

Spark SQL -JDBC connectivity

2016-08-09 Thread Soni spark
Hi, I would like to know the steps to connect to Spark SQL from the Spring framework (web UI), and also how to run and deploy the web application.
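
One common route (a sketch, assuming the Spark Thrift Server has been started with sbin/start-thriftserver.sh and the Hive JDBC driver is on the classpath; host, port, and query are illustrative) is plain JDBC, which a Spring app would typically wrap in a DataSource/JdbcTemplate:

    import java.sql.DriverManager

    val url = "jdbc:hive2://thrift-host:10000/default"
    val conn = DriverManager.getConnection(url, "user", "")
    try {
      val rs = conn.createStatement().executeQuery("SELECT 1")
      while (rs.next()) println(rs.getInt(1))
    } finally conn.close()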

Re: Checkpoint FS failure or connectivity issue

2015-06-29 Thread Tathagata Das
Yes, the observation is correct. That connectivity is assumed to be HA. On Mon, Jun 29, 2015 at 2:34 PM, Amit Assudani wrote: > Hi All, > > While using Checkpoints ( using HDFS ), if connectivity to hadoop > cluster is lost for a while and gets restored in some time, what happ

Checkpoint FS failure or connectivity issue

2015-06-29 Thread Amit Assudani
Hi All, While using checkpoints (on HDFS), if connectivity to the Hadoop cluster is lost for a while and gets restored some time later, what happens to the running streaming job? Is it always assumed that the connection to the checkpoint FS (in this case HDFS) would ALWAYS be HA and would never fail for
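
For reference, a minimal sketch of the setup in question (host, path, and interval are illustrative): the checkpoint directory lives on HDFS, so recoverability is only as good as that filesystem's availability.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("checkpoint-demo")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoints")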

Re: Spark / Thrift / ODBC connectivity

2014-08-29 Thread Cheng Lian
ve tables? > > As well, by any chance do we have any documents that point to how we can > connect something like Tableau to Spark SQL Thrift - similar to the SAP > ODBC connectivity http://www.saphana.com/docs/DOC-472? > > Thanks! > Denny > >

Spark / Thrift / ODBC connectivity

2014-08-28 Thread Denny Lee
point to how we can connect something like Tableau to Spark SQL Thrift - similar to the SAP ODBC connectivity http://www.saphana.com/docs/DOC-472? Thanks! Denny

Re: Spark SQL JDBC Connectivity

2014-07-30 Thread Michael Armbrust

Re: Spark SQL JDBC Connectivity

2014-07-30 Thread Venkat Subramanian
now. It is easy to do this; it took just a few hours and it works for our use case.

Re: Spark SQL JDBC Connectivity and more

2014-06-09 Thread Michael Armbrust
> [Venkat] Are you saying - pull in the SharkServer2 code in my standalone > spark application (as a part of the standalone application process), pass > in > the spark context of the standalone app to the SharkServer2 SparkContext at > startup, and voila we get SQL/JDBC interfaces for the RDDs of t
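
The Spark SQL successor to that SharkServer2 pattern (a sketch; the view name and data are illustrative) is HiveThriftServer2.startWithContext, which starts a Thrift server inside the application's own session so JDBC clients can query its registered views:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

    val spark = SparkSession.builder()
      .appName("embedded-thrift")
      .enableHiveSupport()
      .getOrCreate()

    // Expose an in-application dataset to external JDBC clients.
    spark.range(10).createOrReplaceTempView("my_rdd_view")
    HiveThriftServer2.startWithContext(spark.sqlContext)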

Re: Spark SQL JDBC Connectivity and more

2014-06-09 Thread Venkat Subramanian
s tables? Thanks for the clarification.

Re: Spark SQL JDBC Connectivity and more

2014-05-29 Thread Michael Armbrust
On Thu, May 29, 2014 at 3:26 PM, Venkat Subramanian wrote: > > 1) If I have a standalone Spark application that has already built an RDD, > how can SharkServer2, or for that matter Shark, access 'that' RDD and run > queries on it? All the examples I have seen for Shark, the RDD (tables) are > created

Re: Spark SQL JDBC Connectivity and more

2014-05-29 Thread Venkat Subramanian
may be a very common use case?

Re: Spark SQL JDBC Connectivity

2014-05-29 Thread Michael Armbrust
On Wed, May 28, 2014 at 11:39 PM, Venkat Subramanian wrote: > We are planning to use the latest Spark SQL on RDDs. If a third party > application wants to connect to Spark via JDBC, does Spark SQL have > support? > (We want to avoid going through the Shark/Hive JDBC layer as we need good > performance)

Spark SQL JDBC Connectivity

2014-05-28 Thread Venkat Subramanian