Yes, it works! But the filter can't be pushed down. What if the custom ParquetInputFormat only implements the data source API?
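As an aside on the pushdown question: the data source API models filter pushdown by handing the filters to the relation's scan method. Below is a simplified, standalone sketch of that idea; the `Filter`, `EqualTo`, and `buildScan` names mirror `org.apache.spark.sql.sources` (`PrunedFilteredScan`), but the types here are stand-ins for illustration, not Spark's actual classes.

```scala
// Stand-ins for org.apache.spark.sql.sources.{Filter, EqualTo, GreaterThan}.
sealed trait Filter
case class EqualTo(attribute: String, value: Any) extends Filter
case class GreaterThan(attribute: String, value: Any) extends Filter

// A relation supporting pushdown receives the filters and evaluates them
// at scan time, instead of Spark filtering every row after a full scan.
object SketchRelation {
  val rows: Seq[Map[String, Any]] = Seq(
    Map("id" -> "1", "msg" -> "a"),
    Map("id" -> "2", "msg" -> "b"))

  // Analogous to PrunedFilteredScan.buildScan(requiredColumns, filters):
  // apply the pushed filters, then project only the requested columns.
  def buildScan(requiredColumns: Seq[String],
                filters: Seq[Filter]): Seq[Map[String, Any]] = {
    val kept = rows.filter { row =>
      filters.forall {
        case EqualTo(attr, v)     => row.get(attr).contains(v)
        case GreaterThan(attr, v) => row.get(attr).exists(_.toString > v.toString)
      }
    }
    kept.map(row => requiredColumns.map(c => c -> row(c)).toMap)
  }
}
```

A custom format exposed only through Hive's InputFormat interface never sees these filters, which is why implementing the data source API is what enables pushdown.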
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala

2015-01-16 21:51 GMT+08:00 Xiaoyu Wang <wangxy...@gmail.com>:
> Thanks, Yana!
> I will try it!
>
> On Jan 16, 2015, at 20:51, yana <yana.kadiy...@gmail.com> wrote:
>
> I think you might need to set spark.sql.hive.convertMetastoreParquet
> to false, if I understand that flag correctly.
>
> -------- Original message --------
> From: Xiaoyu Wang
> Date: 01/16/2015 5:09 AM (GMT-05:00)
> To: user@spark.apache.org
> Subject: Why does a Hive table with a custom Parquet format execute the
> "ParquetTableScan" physical plan, not "HiveTableScan"?
>
> Hi all!
>
> In Spark SQL 1.2.0, I created a Hive table with a custom Parquet
> input format and output format, like this:
>
> CREATE TABLE test(
>   id string,
>   msg string)
> CLUSTERED BY (id)
> SORTED BY (id ASC)
> INTO 10 BUCKETS
> ROW FORMAT SERDE
>   'com.a.MyParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'com.a.MyParquetInputFormat'
> OUTPUTFORMAT
>   'com.a.MyParquetOutputFormat';
>
> In the Spark shell, the plan of "select * from test" is:
>
> [== Physical Plan ==]
> [!OutputFaker [id#5,msg#6]]
> [ ParquetTableScan [id#12,msg#13], (ParquetRelation
> hdfs://hadoop/user/hive/warehouse/test.db/test, Some(Configuration:
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml),
> org.apache.spark.sql.hive.HiveContext@6d15a113, []), []]
>
> Not HiveTableScan! So it doesn't execute my custom input format.
> Why? How can I make it execute my custom input format?
>
> Thanks!
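For reference, the flag Yana mentions can be set from the Spark SQL shell; a minimal sketch, assuming Spark SQL 1.2's SET syntax (the same value can be set via HiveContext.setConf in Scala):

```sql
-- Disable converting metastore Parquet tables to Spark's native
-- ParquetTableScan, so the table's own SerDe/InputFormat is used
-- (HiveTableScan appears in the physical plan instead).
SET spark.sql.hive.convertMetastoreParquet=false;
```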