Hi Jianshi,

When accessing a Hive table with the Parquet SerDe, Spark SQL tries to convert it into Spark SQL's native Parquet support for better performance. And yes, predicate push-down and column pruning are applied there. In 1.3.0, we'll also cover the write path, except for writing partitioned tables.
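To make the two loading paths concrete, here is a minimal sketch against the Spark 1.2/1.3 Scala API. The table name "events", its warehouse path, and the column names are hypothetical; it assumes an existing SparkContext `sc` and a Hive metastore table stored as Parquet:

```scala
import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)

// Path 1: go through the Hive metastore. Because the table uses the
// Parquet SerDe, Spark SQL converts the scan to its native Parquet
// support, so predicate push-down and column pruning still apply.
val fromMetastore =
  sqlContext.sql("SELECT id, name FROM events WHERE id > 100")

// Path 2: read the underlying Parquet files directly, bypassing the
// metastore entirely. (Path shown is hypothetical.)
val fromFiles = sqlContext.parquetFile("/user/hive/warehouse/events")
fromFiles.registerTempTable("events_direct")
val direct =
  sqlContext.sql("SELECT id, name FROM events_direct WHERE id > 100")
```

As I understand it, the metastore conversion described above is controlled by the `spark.sql.hive.convertMetastoreParquet` flag; when it is enabled (the default), both paths end up on the same native Parquet scan.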
Cheng

On Sun Feb 15 2015 at 9:22:15 AM Jianshi Huang <jianshi.hu...@gmail.com> wrote:

> Hi,
>
> If I have a table in Hive metastore saved as Parquet, and I want to use it
> in Spark, it seems Spark will use Hive's Parquet SerDe to load the actual
> data.
>
> So is there any difference here? Will predicate pushdown, pruning and
> future Parquet optimizations in Spark SQL work when using the Hive SerDe?
>
> That is: loading tables using parquetFile vs. loading tables from the Hive
> metastore with the Parquet SerDe.
>
> Thanks,
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/