I was really surprised by the results here, especially SparkSQL "not completing": http://www.citusdata.com/blog/86-making-postgresql-scale-hadoop-style
I was under the impression that SparkSQL performs really well because it can optimize the underlying RDD operations and load only the columns a query requires. That should mean that in most cases SparkSQL is as fast as plain Spark. I would be very interested to hear what others in the group have to say about this.

Thanks,
Soumya
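
P.S. To make clear what I mean by "load only the columns that are required", here is a toy sketch in plain Python. It is not Spark's API or implementation; the `ColumnarTable` class, the `average_where` helper, and the sample data are all made up for illustration. It only shows the idea that, with column-oriented storage, a query touching two columns never has to read the others.

```python
# Toy illustration only: this is NOT Spark's API or implementation.
# All names here (ColumnarTable, average_where) are hypothetical.

class ColumnarTable:
    """Stores each column as its own list and records which columns get read."""

    def __init__(self, columns):
        self.columns = columns        # dict: column name -> list of values
        self.columns_read = set()     # names of the columns actually touched

    def read_column(self, name):
        self.columns_read.add(name)
        return self.columns[name]


def average_where(table, value_col, filter_col, predicate):
    """Average value_col over the rows whose filter_col satisfies predicate.

    Only the two referenced columns are read; the rest are never loaded.
    """
    values = table.read_column(value_col)
    filters = table.read_column(filter_col)
    selected = [v for v, f in zip(values, filters) if predicate(f)]
    return sum(selected) / len(selected) if selected else None


table = ColumnarTable({
    "user_id": [1, 2, 3, 4],
    "country": ["US", "IN", "US", "DE"],
    "latency_ms": [120, 80, 100, 95],
    "payload": ["blob-a", "blob-b", "blob-c", "blob-d"],  # wide column the query never needs
})

# Roughly: SELECT AVG(latency_ms) FROM t WHERE country = 'US'
result = average_where(table, "latency_ms", "country", lambda c: c == "US")
print(result, sorted(table.columns_read))
```

After the query runs, `columns_read` contains only `country` and `latency_ms`; `user_id` and `payload` were never touched. That is the kind of saving I assumed SparkSQL would get on a columnar source.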