[GitHub] [spark] gengliangwang commented on issue #24327: [SPARK-27418][SQL] Migrate Parquet to File Data Source V2

GitBox Fri, 31 May 2019 22:03:40 -0700

gengliangwang commented on issue #24327: [SPARK-27418][SQL] Migrate Parquet to 
File Data Source V2
URL: https://github.com/apache/spark/pull/24327#issuecomment-497913250
 
 
   @dongjoon-hyun I think Spark needs to read the actual physical schema to 
get the exact column names and data types for pushing down filters. If the 
names or data types do not match when performing filter pushdown, it might 
cause a regression.
   @rdblue has explained this in 
https://github.com/apache/spark/pull/21696#discussion_r199979463 .
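
   To illustrate the concern above, here is a minimal sketch (plain Python, not 
Spark's actual implementation) of resolving a filter column against a file's 
physical schema before pushing the filter down. The function name 
`resolve_pushdown_column`, the `(name, type)` schema representation, and the 
type strings are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a file's physical schema:
# (column name, physical type) pairs as they appear in the file footer.
PhysicalSchema = List[Tuple[str, str]]


def resolve_pushdown_column(
    filter_column: str,
    filter_type: str,
    physical_schema: PhysicalSchema,
    case_sensitive: bool = False,
) -> Optional[str]:
    """Return the exact physical column name to push the filter on,
    or None if pushing the filter down would be unsafe."""
    if case_sensitive:
        matches = [(n, t) for n, t in physical_schema if n == filter_column]
    else:
        wanted = filter_column.lower()
        matches = [(n, t) for n, t in physical_schema if n.lower() == wanted]

    # No match, or an ambiguous match (e.g. both "ID" and "id" exist):
    # skip pushdown and let the engine evaluate the filter itself.
    if len(matches) != 1:
        return None

    name, physical_type = matches[0]
    # A type mismatch between the filter and the file is also unsafe.
    if physical_type != filter_type:
        return None
    return name


schema = [("ID", "int64"), ("name", "binary")]
print(resolve_pushdown_column("id", "int64", schema))  # ID
print(resolve_pushdown_column("id", "int32", schema))  # None
```

   In this sketch the filter is pushed down only when exactly one physical 
column matches by name and its type agrees. Otherwise the filter is left for 
the engine to evaluate after the scan, which is slower but still returns 
correct results.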
   
   With the current DSV2 design, I think we have to implement Parquet V2 in 
this way.  Suggestions are welcome.



