Re: Running spark function on parquet without sql

2015-03-15 Thread Cheng Lian
That's an unfortunate documentation bug in the programming guide... We failed to update it after making the change. Cheng

On 2/28/15 8:13 AM, Deborah Siegel wrote: Hi Michael, would you help me understand the apparent difference here? The Spark 1.2.1 programming guide indicates: Note

Re: Running spark function on parquet without sql

2015-02-27 Thread tridib
when cached the table using cacheTable()? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-function-on-parquet-without-sql-tp21833p21850.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Running spark function on parquet without sql

2015-02-27 Thread Deborah Siegel
Hi Michael, would you help me understand the apparent difference here? The Spark 1.2.1 programming guide indicates: Note that if you call schemaRDD.cache() rather than sqlContext.cacheTable(...), tables will *not* be cached using the in-memory columnar format, and therefore
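The difference the thread is asking about can be sketched as follows. This is a minimal sketch for the Spark 1.2.x API; the Parquet path and table name ("people.parquet", "people") are hypothetical:

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Load a Parquet file as a SchemaRDD (Spark 1.2.x API).
val schemaRDD = sqlContext.parquetFile("people.parquet")

// Register it as a temp table, then cache it via the SQLContext:
// this is the path the 1.2.1 guide says uses the in-memory columnar format.
schemaRDD.registerTempTable("people")
sqlContext.cacheTable("people")

// By contrast, the 1.2.1 guide warned that calling .cache() directly
// caches rows as ordinary objects, NOT in the columnar format.
// Per Cheng Lian's reply above, that note was a documentation bug left
// over after the two code paths were unified.
// schemaRDD.cache()
```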

Re: Running spark function on parquet without sql

2015-02-27 Thread Michael Armbrust
> From Zhan Zhang's reply, yes I still get Parquet's advantage.

You will need to at least use SQL or the DataFrame API (coming in Spark 1.3) to specify the columns that you want in order to get the Parquet benefits. The rest of your operations can be standard Spark. My next question is,
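Michael's point is that Parquet's main win here is column pruning: the reader only scans the column chunks you ask for. A hedged sketch, assuming a registered table "events" with hypothetical columns user_id and ts:

```scala
import org.apache.spark.SparkContext._ // for reduceByKey on pair RDDs

// Projecting only the needed columns lets the Parquet reader skip the rest.
val pruned = sqlContext.sql("SELECT user_id, ts FROM events")

// Roughly equivalent with the DataFrame API that arrived in Spark 1.3:
// val pruned = sqlContext.table("events").select("user_id", "ts")

// From here on, plain Spark transformations work as usual, since the
// result is an RDD of Rows:
val counts = pruned.map(row => (row.getString(0), 1L)).reduceByKey(_ + _)
```

The pruning happens at the scan, so the follow-on map/reduceByKey is ordinary Spark code with no SQL involved.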

Running spark function on parquet without sql

2015-02-26 Thread tridib
Regards Tridib -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-function-on-parquet-without-sql-tp21833.html

Re: Running spark function on parquet without sql

2015-02-26 Thread Zhan Zhang
as it would have been if I had used SQL? Please advise. Thanks Regards Tridib -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-function-on-parquet-without-sql-tp21833.html