Yes, it works!
But filters can't be pushed down!

Should the custom parquet InputFormat be implemented against the data source API instead?

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
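
For reference, filter pushdown in that API comes from PrunedFilteredScan. A rough sketch of a relation built against the Spark 1.2 sources interfaces might look like the code below (MyParquetSource and MyParquetRelation are made-up names, the schema is hard-coded, and the actual parquet reading is stubbed out):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.sql.sources._

// Hypothetical provider so the table could be registered with
//   CREATE TEMPORARY TABLE test USING com.a.MyParquetSource OPTIONS (path '...')
class MyParquetSource extends RelationProvider {
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation =
    MyParquetRelation(parameters("path"))(sqlContext)
}

// Hypothetical relation; extending PrunedFilteredScan is what lets Spark SQL
// push the required columns and the filters it can handle down into the scan.
case class MyParquetRelation(path: String)(@transient val sqlContext: SQLContext)
  extends PrunedFilteredScan {

  // Schema of the underlying parquet files (hard-coded here for the sketch).
  override val schema: StructType = StructType(Seq(
    StructField("id", StringType, true),
    StructField("msg", StringType, true)))

  // Spark SQL passes in the pruned column list and the pushed-down filters;
  // a real implementation would read the parquet files under `path` and apply them.
  override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] =
    sqlContext.sparkContext.emptyRDD[Row]
}

Note that such a relation is registered with CREATE TEMPORARY TABLE ... USING rather than through the Hive metastore DDL, so it is a different integration path from a custom SerDe/InputFormat.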

2015-01-16 21:51 GMT+08:00 Xiaoyu Wang <wangxy...@gmail.com>:

> Thanks yana!
> I will try it!
>
> On Jan 16, 2015, at 20:51, yana <yana.kadiy...@gmail.com> wrote:
>
> I think you might need to set
> spark.sql.hive.convertMetastoreParquet to false if I understand that flag
> correctly
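>
> For example (just a sketch of setting the flag, assuming the shell's HiveContext is bound to sqlContext):
>
> sqlContext.setConf("spark.sql.hive.convertMetastoreParquet", "false")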
>
>
>
> -------- Original message --------
> From: Xiaoyu Wang
> Date:01/16/2015 5:09 AM (GMT-05:00)
> To: user@spark.apache.org
> Subject: Why does a custom parquet format hive table execute the "ParquetTableScan"
> physical plan, not "HiveTableScan"?
>
> Hi all!
>
> In Spark SQL 1.2.0, I created a Hive table with a custom parquet
> InputFormat and OutputFormat, like this:
> CREATE TABLE test(
>   id string,
>   msg string)
> CLUSTERED BY (
>   id)
> SORTED BY (
>   id ASC)
> INTO 10 BUCKETS
> ROW FORMAT SERDE
>   'com.a.MyParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'com.a.MyParquetInputFormat'
> OUTPUTFORMAT
>   'com.a.MyParquetOutputFormat';
>
> And in the spark shell, the plan of "select * from test" is:
>
> [== Physical Plan ==]
> [!OutputFaker [id#5,msg#6]]
> [ ParquetTableScan [id#12,msg#13], (ParquetRelation
> hdfs://hadoop/user/hive/warehouse/test.db/test, Some(Configuration:
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml),
> org.apache.spark.sql.hive.HiveContext@6d15a113, []), []]
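>
> (The bracketed lines above look like the collected output of an EXPLAIN query; a sketch of how to print such a plan in the shell, assuming the HiveContext is bound to sqlContext:)
>
> sqlContext.sql("explain select * from test").collect().foreach(println)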
>
> *Not HiveTableScan*!
> *So it doesn't execute my custom InputFormat!*
> Why? How can I make it execute my custom InputFormat?
>
> Thanks!
>
>
>
