[jira] [Comment Edited] (SPARK-33449) Add cache for Parquet Metadata

Yang Jie (Jira) Sun, 15 Nov 2020 19:01:03 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-33449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232478#comment-17232478
 ]


Yang Jie edited comment on SPARK-33449 at 11/16/20, 2:59 AM:
-------------------------------------------------------------

[~yumwang] Yes, I think this is a very good suggestion. We implemented Parquet 
and ORC metadata (footer) caching in 
[OAP|https://github.com/Intel-bigdata/OAP/tree/master/oap-cache]. This 
feature runs in a long-running query process at Baidu and has had a good effect.


was (Author: luciferyang):
[~yumwang] Yes, I think this is a very good suggestion. We implemented Parquet 
and ORC metadata (footer) caching in 
[OAP|https://github.com/Intel-bigdata/OAP/tree/master/oap-cache] and the effect 
is very good.

> Add cache for Parquet Metadata
> ------------------------------
>
>                 Key: SPARK-33449
>                 URL: https://issues.apache.org/jira/browse/SPARK-33449
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: Get Parquet metadata.png
>
>
> Getting Parquet metadata may take a lot of time; maybe we can cache it. Presto 
> supports this:
> https://github.com/prestodb/presto/pull/15276



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
