Hi,

On Tue, Mar 10, 2015 at 2:13 PM, Cesar Flores <ces...@gmail.com> wrote:
> I am new to the SchemaRDD class, and I am trying to decide between using SQL
> queries and Language Integrated Queries (
> https://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
> ).
>
> Can someone tell me what the main difference between the two
> approaches is, besides using different syntax? Are they interchangeable? Which
> one has better performance?

One difference is that the language-integrated queries are method calls on the SchemaRDD you want to work on, which requires that you have access to that object at the point where you build the query. The SQL queries are passed to a method of the SQLContext, and you have to call registerTempTable() on the SchemaRDD you want to use beforehand; that registration can happen at basically any location in your program, as long as it runs before the query.

(I am not sure I managed to express what I wanted to say.) That may influence how you design your program and how its different parts work together.

Tobias
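
P.S. To make the difference concrete, here is a minimal sketch of both styles against the Spark 1.2 Scala API. The Person case class, the table name "people", the sample rows, and the local master setting are assumptions made up for illustration; they are not from the original question. It only shows where each style needs the SchemaRDD object and says nothing about performance.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object QueryStyles {
  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val sc = new SparkContext(
      new SparkConf().setAppName("QueryStyles").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    // Brings the implicit RDD-to-SchemaRDD conversion and the
    // symbol-based query DSL into scope.
    import sqlContext._

    val people = sc.parallelize(Seq(Person("Ann", 15), Person("Bob", 30)))

    // SQL style: register the data under a name once; afterwards any code
    // holding the SQLContext can query it by that name.
    people.registerTempTable("people")
    val viaSql = sqlContext.sql(
      "SELECT name FROM people WHERE age >= 13 AND age <= 19")

    // Language-integrated style: the query is a chain of method calls,
    // so the SchemaRDD itself must be in scope here.
    val viaDsl = people.where('age >= 13).where('age <= 19).select('name)

    viaSql.collect().foreach(println)
    viaDsl.collect().foreach(println)

    sc.stop()
  }
}
```

In the SQL variant, the code that issues the query only needs the SQLContext and the table name, so registration and querying can live in different parts of the program. In the language-integrated variant, the SchemaRDD has to be passed to wherever the query is built.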