[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152738#comment-17152738
 ] 

Lantao Jin edited comment on SPARK-29038 at 7/7/20, 1:14 PM:
-------------------------------------------------------------

Hi [~AidenZhang], our MV work in recent months has focused on two areas. One is
optimizing the rewrite algorithm, for example forbidding count distinct
post-aggregation and avoiding unnecessary rewrites during relation replacement.
The other is bug fixes in MV refresh: we use a Spark listener to deliver
metastore events that trigger a refresh. Some parts depend on a third-party
system, so maybe only the interfaces will be available in community Spark. I
haven't done partial/incremental refresh since it's not a blocker for us. I am
not sure whether the community is still interested in the feature, but we are
moving the existing implementation to Spark 3.0 now.
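
For readers wondering why count distinct is a problem after aggregation, here is a minimal sketch in plain Python (not Spark code). It assumes a hypothetical MV that stores per-region aggregates; the table, column names, and the `build_mv` helper are made up for illustration, and the comment above does not state that this is the exact case being forbidden. The point is only that a plain count can be rolled up from pre-aggregated rows, while a distinct count cannot.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, user_id).
rows = [("east", "u1"), ("east", "u2"), ("west", "u1"), ("west", "u3")]


def build_mv(detail_rows):
    """Simulate an MV like:
    SELECT region, COUNT(DISTINCT user_id), COUNT(*) FROM t GROUP BY region
    """
    users = defaultdict(set)
    totals = defaultdict(int)
    for region, user in detail_rows:
        users[region].add(user)
        totals[region] += 1
    return {
        region: {"distinct_users": len(users[region]), "row_count": totals[region]}
        for region in users
    }


mv = build_mv(rows)

# COUNT(*) is additive: summing the per-region counts gives the global count.
rolled_up_count = sum(v["row_count"] for v in mv.values())
true_count = len(rows)

# COUNT(DISTINCT) is not additive: "u1" appears in both regions, so summing
# the per-region distinct counts overcounts the global distinct count.
rolled_up_distinct = sum(v["distinct_users"] for v in mv.values())
true_distinct = len({user for _, user in rows})

print(rolled_up_count, true_count)        # counts agree
print(rolled_up_distinct, true_distinct)  # distinct counts disagree
```

A rewrite that answered a global count distinct by aggregating over such an MV would return a wrong result, which is one plausible reason to forbid it.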


was (Author: cltlfcjin):
Hi [~AidenZhang], my MV work in recent months has focused on two areas. One is
optimizing the rewrite algorithm, for example forbidding count distinct
post-aggregation and avoiding unnecessary rewrites during relation replacement.
The other is bug fixes in MV refresh: I use a Spark listener to deliver
metastore events that trigger a refresh. Some parts depend on a third-party
system, so maybe only the interfaces will be available in community Spark. I
haven't done partial/incremental refresh since it's not a blocker for us. I am
not sure whether the community is still interested in the feature, but we are
moving the existing implementation to Spark 3.0 now.

> SPIP: Support Spark Materialized View
> -------------------------------------
>
>                 Key: SPARK-29038
>                 URL: https://issues.apache.org/jira/browse/SPARK-29038
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> Materialized views are an important approach in DBMSs for caching data to
> accelerate queries. Because a materialized view is created through SQL, the
> data that can be cached is very flexible and can be configured arbitrarily
> according to specific usage scenarios. The Materialization Manager
> automatically updates the cached data according to changes in the detail
> source tables, simplifying the user's work. When a user submits a query, the
> Spark optimizer rewrites the execution plan based on the available
> materialized views to determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



