Re: Spark as sql engine on S3

2016-07-08 Thread Mich Talebzadeh
There are two approaches here. The first is to use Hive as it is and replace the Hive execution engine with Spark. You can use beeline with the Hive thrift server to access your Hive tables. beeline connects to a thrift server (either Hive or Spark). If you use the Spark thrift server with beeline then you are going to
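
For illustration only, here is a rough Python sketch of how the two approaches might look from the beeline side. The host names, the table name `sales` and the helper `beeline_command` are hypothetical; port 10000 is the usual HiveServer2 default, and port 10001 is simply the port mentioned elsewhere in this thread for the Spark thrift server. Whether `hive.execution.engine=spark` is accepted depends on how the Hive installation was built and configured.

```python
import shlex

def beeline_command(host, port, query, use_hive_on_spark=False):
    """Build the argv for a beeline call against a HiveServer2-compatible thrift server.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if use_hive_on_spark:
        # Approach 1: keep Hive, but ask it to execute queries with Spark.
        query = "SET hive.execution.engine=spark; " + query
    url = f"jdbc:hive2://{host}:{port}/default"
    return ["beeline", "-u", url, "-e", query]

# Approach 1: Hive thrift server, Spark as the Hive execution engine.
hive_on_spark = beeline_command("hive-host", 10000, "SELECT COUNT(*) FROM sales",
                                use_hive_on_spark=True)
# Approach 2: Spark thrift server, same client, different endpoint.
spark_thrift = beeline_command("spark-host", 10001, "SELECT COUNT(*) FROM sales")

print(shlex.join(hive_on_spark))
print(shlex.join(spark_thrift))
```

The client side is the same in both cases; only the server that beeline talks to, and therefore the engine that runs the query, changes.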

Re: Spark as sql engine on S3

2016-07-08 Thread Ashok Kumar
Hi, As I said, we have been using Hive as our SQL engine for the datasets, but we are storing data externally in Amazon S3. Now you suggested Spark thrift server. I started Spark thrift server on port 10001 and used beeline to access the thrift server. Connecting to
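
A minimal sketch of what starting the Spark thrift server on port 10001 might look like, written as a small Python helper that assembles the command line. The install path `/opt/spark`, the master `yarn` and the helper name are assumptions and not taken from the message; `start-thriftserver.sh` and the `hive.server2.thrift.port` property are the standard Spark ones.

```python
import posixpath
import shlex

def start_thriftserver_command(spark_home, port, master):
    """Assemble the command that launches the Spark thrift server.

    Hypothetical helper: it returns the argv and leaves execution to the caller.
    """
    script = posixpath.join(spark_home, "sbin", "start-thriftserver.sh")
    return [
        script,
        "--master", master,
        # The listening port is passed as a Hive configuration property.
        "--hiveconf", f"hive.server2.thrift.port={port}",
    ]

# Assumed install location and cluster manager; adjust for the real environment.
command = start_thriftserver_command("/opt/spark", 10001, "yarn")
print(shlex.join(command))
```

Once the server is up, beeline can be pointed at the same port with a `jdbc:hive2://` URL.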

Re: Spark as sql engine on S3

2016-07-07 Thread ayan guha
Yes, it can.
On Fri, Jul 8, 2016 at 3:03 PM, Ashok Kumar wrote:
> thanks so basically Spark Thrift Server runs on a port much like beeline that uses JDBC to connect to Hive?
> Can Spark thrift server access Hive tables?
> regards
> On Friday, 8 July 2016, 5:27,

Re: Spark as sql engine on S3

2016-07-07 Thread Ashok Kumar
Thanks. So basically Spark Thrift Server runs on a port much like beeline that uses JDBC to connect to Hive? Can Spark thrift server access Hive tables? Regards
On Friday, 8 July 2016, 5:27, ayan guha wrote:
> Spark Thrift Server works as a JDBC server. You can

Re: Spark as sql engine on S3

2016-07-07 Thread ayan guha
Spark Thrift Server works as a JDBC server. You can connect to it from any JDBC tool, such as SQuirreL.
On Fri, Jul 8, 2016 at 3:50 AM, Ashok Kumar wrote:
> Hello gurus,
> We are storing data externally on Amazon S3
> What is the optimum or best way to use Spark
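
As a rough illustration of the "any JDBC tool" point, the sketch below puts together the two values such a tool typically asks for: the driver class and the connection URL. The host name, default port and helper name are hypothetical; `org.apache.hive.jdbc.HiveDriver` and the `jdbc:hive2://` scheme are the ones used for HiveServer2-compatible servers, which is the protocol the Spark thrift server speaks.

```python
def jdbc_settings(host, port=10001, database="default"):
    """Return the driver class and URL a JDBC client would need.

    Hypothetical helper; the port default is only an example value.
    """
    return {
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "url": f"jdbc:hive2://{host}:{port}/{database}",
    }

settings = jdbc_settings("spark-host")
for name, value in settings.items():
    print(f"{name}: {value}")
```

The client also needs the Hive JDBC driver jar on its classpath; where that jar lives depends on the installation.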

Spark as sql engine on S3

2016-07-07 Thread Ashok Kumar
Hello gurus, We are storing data externally on Amazon S3. What is the best way to use Spark as a SQL engine to access data on S3? Any info or write-up will be greatly appreciated. Regards