[ https://issues.apache.org/jira/browse/SPARK-33449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232478#comment-17232478 ]
Yang Jie edited comment on SPARK-33449 at 11/16/20, 2:59 AM:
-------------------------------------------------------------

[~yumwang] Yes, I think this is a very good suggestion. We implemented Parquet and ORC metadata (footer) caching in [OAP|https://github.com/Intel-bigdata/OAP/tree/master/oap-cache]. This feature runs in a long-running query process at Baidu, and it has worked well.


was (Author: luciferyang):
[~yumwang] Yes, I think this is a very good suggestion. We implemented Parquet and ORC metadata (footer) caching in [OAP|https://github.com/Intel-bigdata/OAP/tree/master/oap-cache] and the effect is very good.

> Add cache for Parquet Metadata
> ------------------------------
>
>                 Key: SPARK-33449
>                 URL: https://issues.apache.org/jira/browse/SPARK-33449
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: Get Parquet metadata.png
>
>
> Getting Parquet metadata may take a lot of time; maybe we can cache it. Presto
> supports it:
> https://github.com/prestodb/presto/pull/15276
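
For readers unfamiliar with the idea discussed in this issue, here is a minimal sketch of a footer cache. It is not the OAP, Presto, or Spark implementation. The class name `FooterCache`, the helper `read_tail`, and the choice of keying entries by path, modification time, and file size are all assumptions made for illustration. The reader function is pluggable, so a real Parquet footer reader could be passed in where this sketch uses a stand-in that returns the trailing bytes of a file.

```python
import os
from collections import OrderedDict
from typing import Any, Callable, Tuple


def read_tail(path: str, n: int = 8) -> bytes:
    """Stand-in for a real footer reader (hypothetical): return the last n bytes."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - n))
        return f.read(n)


class FooterCache:
    """LRU cache of file footers, keyed by (path, mtime_ns, size).

    Including mtime and size in the key means a rewritten file misses the
    cache instead of returning a stale footer. Entries for old versions of
    a file are not removed eagerly; they age out through LRU eviction.
    """

    def __init__(self, read_footer: Callable[[str], Any], max_entries: int = 1024):
        self._read_footer = read_footer      # assumed loader, supplied by the caller
        self._max_entries = max_entries
        self._entries: "OrderedDict[Tuple[str, int, int], Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> Any:
        st = os.stat(path)
        key = (path, st.st_mtime_ns, st.st_size)
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        footer = self._read_footer(path)     # the expensive read being avoided
        self._entries[key] = footer
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return footer
```

A caller would create one `FooterCache(read_tail)` per long-running process and call `cache.get(path)` wherever the footer is needed. Repeated calls for an unchanged file then skip the read, which is the saving this issue is after. The sketch is not thread-safe and stats the file on every lookup; a production cache would need to address both.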