Hi,
On Wed, Mar 11, 2015 at 11:05 PM, Cesar Flores wrote:
>
> Thanks for both answers. One final question. *Isn't registerTempTable an
> extra step that SQL queries require, one that may decrease performance
> compared with the language-integrated method calls?*
>
As far as I know, registerTe
Hi:
Thanks for both answers. One final question. *Isn't registerTempTable an
extra step that SQL queries require, one that may decrease performance
compared with the language-integrated method calls?* The thing is that I
am planning to use them in the current version of the ML Pipeline
transform
Hi,
On Tue, Mar 10, 2015 at 2:13 PM, Cesar Flores wrote:
> I am new to the SchemaRDD class, and I am trying to decide between SQL
> queries and Language Integrated Queries (
> https://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
> ).
>
> Can someone tell me wha
They should have the same performance, as they are compiled down to the
same execution plan.
Note that starting in Spark 1.3, SchemaRDD has been renamed to DataFrame:
https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html
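
If you want to check this yourself, here is a minimal sketch that builds the
same query both ways and prints the plans side by side. It assumes the Spark
1.3 DataFrame API running in local mode; the Person case class, the "people"
table name, and the sample rows are made up for illustration and are not from
this thread. If both forms go through the same optimizer, the two explain()
outputs should show equivalent physical plans.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical schema, used only for this illustration.
case class Person(name: String, age: Int)

object PlanComparison {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("plan-comparison").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // enables toDF() and the $"col" syntax

    // Made-up sample data.
    val people = sc.parallelize(Seq(Person("Ann", 34), Person("Bob", 19))).toDF()

    // Registers a name for the DataFrame in the SQLContext so that
    // SQL text can refer to it.
    people.registerTempTable("people")

    // The same query written as SQL text and as language-integrated calls.
    val viaSql = sqlContext.sql("SELECT name FROM people WHERE age > 21")
    val viaDsl = people.filter($"age" > 21).select($"name")

    // Print the logical and physical plans for each form for comparison.
    viaSql.explain(true)
    viaDsl.explain(true)

    sc.stop()
  }
}
```

On Spark 1.2 the class is still called SchemaRDD and the imports differ, so
treat the above as a starting point rather than something to paste in as is.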
On Tue, Mar 10, 2015 at 2:13 PM
I am new to the SchemaRDD class, and I am trying to decide between SQL
queries and Language Integrated Queries (
https://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
).
Can someone tell me what is the main difference between the two approaches,
besides using diff