subject:"performance difference between Thrift server and SparkSQL\?"

Re: performance difference between Thrift server and SparkSQL?

2015-10-05 Thread Jeff Thompson

Thanks for the suggestion. The output from EXPLAIN is indeed equivalent in both sparkSQL and via the Thrift server. I did some more testing. The source of the performance difference is in the way I was triggering the sparkSQL query. I was using .count() instead of .collect(). When I use
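The comparison described in this message can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: `time_action` and `compare_actions` are hypothetical helper names, and `sql_context` is assumed to be an existing SQLContext or HiveContext from a pyspark shell. The point is that `count()` returns only a number to the driver while `collect()` brings back every matching row, so the two actions are not guaranteed to do the same work.

```python
import time


def time_action(action):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start


def compare_actions(sql_context, query):
    """Time count() and collect() on the same query (hypothetical helper).

    sql_context is assumed to expose .sql(query) returning a DataFrame,
    as SQLContext and HiveContext do in pyspark.
    """
    timings = {}
    # count() sends only the row count back to the driver.
    n_rows, timings["count"] = time_action(
        lambda: sql_context.sql(query).count()
    )
    # collect() materializes every matching row on the driver.
    rows, timings["collect"] = time_action(
        lambda: sql_context.sql(query).collect()
    )
    return n_rows, rows, timings
```

In a pyspark shell this might be called as `compare_actions(sqlContext, "SELECT * FROM my_table WHERE id = '12345'")`, using the query quoted in the original post.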

Re: performance difference between Thrift server and SparkSQL?

2015-10-03 Thread Michael Armbrust

Underneath the covers, the thrift server is just calling hiveContext.sql(...) so this is surprising. Maybe running EXPLAIN or EXPLAIN
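One way to inspect the plan from the pyspark side is sketched below. It assumes `sql_context` is an existing SQLContext or HiveContext; `show_plan` is a hypothetical helper name, and `DataFrame.explain(True)` is used here to print the extended plan, which may differ from the exact statement the reply goes on to suggest.

```python
def show_plan(sql_context, query):
    """Print the logical and physical plans for a query (hypothetical helper).

    sql_context is assumed to expose .sql(query) returning a DataFrame.
    """
    df = sql_context.sql(query)
    # explain(True) prints the extended plan to stdout and returns None.
    df.explain(True)
    return df
```

The printed plan can then be compared with the output of running the same query prefixed with EXPLAIN through beeline.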

performance difference between Thrift server and SparkSQL?

2015-10-03 Thread Jeff Thompson

Hi, I'm running a simple SQL query over a ~700 million row table of the form: SELECT * FROM my_table WHERE id = '12345'; When I submit the query via beeline and the JDBC thrift server, it returns in 35s. When I submit the exact same query using sparkSQL from a pyspark shell (sqlContext.sql("SELECT *
