Thanks, Judy.
You are right. The query is going to Spark ThriftServer2. I have it set up on a
different port number.
I got the wrong impression because other jobs were running at the same time.
They should be Spark jobs instead of Hive jobs.
From: judyn...@exchange.microsoft.com
To: alee...@hotmail.com; sjbru...@uwaterloo.ca; user@spark.apache.org
Subject: RE: Is the Thrift server right for me?
Date: Wed, 11 Feb 2015 20:12:03 +0000

It should relay the queries to Spark (i.e., you shouldn’t see any MR jobs on
Hadoop, and you should see activity on the Spark app in the headnode UI).

Check your hive-site.xml. Are you pointing at the HiveServer2 port instead
of the Spark Thrift server port?

Their default ports are both 10000.
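
One way to check which port a given hive-site.xml points at is sketched below. This is only an illustration, assuming the port is set through the `hive.server2.thrift.port` property and that 10000 applies when the property is absent. The function name `thrift_port` and the sample XML are hypothetical, not taken from anyone's actual setup.

```python
import xml.etree.ElementTree as ET

# Assumed default when the property is not set (per the note above).
DEFAULT_THRIFT_PORT = 10000


def thrift_port(root):
    """Return hive.server2.thrift.port from a parsed hive-site.xml root element."""
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "hive.server2.thrift.port":
            value = (prop.findtext("value") or "").strip()
            # An empty <value> falls back to the default.
            return int(value) if value else DEFAULT_THRIFT_PORT
    return DEFAULT_THRIFT_PORT


# Hypothetical hive-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10001</value>
  </property>
</configuration>"""

print(thrift_port(ET.fromstring(SAMPLE)))
```

For a file on disk, the root can be obtained with `ET.parse(path).getroot()`. Running this against the hive-site.xml used by each server shows whether the two are configured on the same port.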

From: Andrew Lee [mailto:alee...@hotmail.com]
Sent: Wednesday, February 11, 2015 12:00 PM
To: sjbrunst; user@spark.apache.org
Subject: RE: Is the Thrift server right for me?

I have ThriftServer2 up and running. However, I notice that it relays the query
to HiveServer2 when I pass the hive-site.xml to it.

I'm not sure if this is the expected behavior, but based on what I have up and
running, ThriftServer2 invokes HiveServer2, which results in a MapReduce or Tez
query. In this case, you could just connect directly to HiveServer2 if Hive is
all you need.

If you are a programmer and want to mash up data from Hive with other tables and
data in Spark, then Spark ThriftServer2 seems to be a good integration point for
some use cases.

Please correct me if I misunderstood the purpose of Spark ThriftServer2.

> Date: Thu, 8 Jan 2015 14:49:00 -0700
> From: sjbru...@uwaterloo.ca
> To: user@spark.apache.org
> Subject: Is the Thrift server right for me?
>
> I'm building a system that collects data using Spark Streaming, does some
> processing with it, then saves the data. I want the data to be queried by
> multiple applications, and it sounds like the Thrift JDBC/ODBC server might
> be the right tool to handle the queries. However, the documentation for the
> Thrift server
> <http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server>
> seems to be written for Hive users who are moving to Spark. I never used
> Hive before I started using Spark, so it is not clear to me how best to use
> this.
>
> I've tried putting data into Hive, then serving it with the Thrift server.
> But I have not been able to update the data in Hive without first shutting
> down the server. This is a problem because new data is always being streamed
> in, and so the data must continuously be updated.
>
> The system I'm building is supposed to replace a system that stores the data
> in MongoDB. The dataset has now grown so large that the database index does
> not fit in memory, which causes major performance problems in MongoDB.
>
> If the Thrift server is the right tool for me, how can I set it up for my
> application? If it is not the right tool, what else can I use?
