[jira] [Commented] (SPARK-26764) [SPIP] Spark Relational Cache

Nicholas Chammas (Jira) Wed, 21 Oct 2020 09:13:11 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-26764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218374#comment-17218374
 ]


Nicholas Chammas commented on SPARK-26764:
------------------------------------------

The SPIP PDF references a design doc, but I'm not clear on where the design doc 
actually is. Is this issue supposed to be linked to some other ones?

Also, appendix B suggests to me that this idea would mesh well with the 
existing proposals to support materialized views. I could actually see this as 
an enhancement to those proposals, like SPARK-29038.

In fact, when I look at the 
[design doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit#] 
for SPARK-29038, I see that goal 3 covers automatic query rewrites, which I 
think subsumes the main benefit of this proposal as compared to "traditional" 
materialized views.
{quote}
3. A query _rewrite_ capability to transparently rewrite a query to 
use a materialized view[1][2].
 a. Query rewrite capability is transparent to SQL applications.
 b. Query rewrite can be disabled at the system level or on individual 
 materialized view. Also it can be disabled for a specified query via hint.
 c. Query rewrite as a rule in optimizer should be made sure that it won’t 
 cause performance regression if it can use other index or cache.
{quote}
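
To make the quoted goal a bit more concrete, here is a toy Python sketch of what a transparent rewrite with the three off-switches from 3b might look like. It is not taken from the design doc or from Spark; `Query`, `MaterializedView`, `rewrite`, and the example table and view names are all hypothetical, and it only models SUM roll-ups over a single table. Goal 3c (avoiding performance regressions) is not modeled at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Toy query: SELECT <group_by>, SUM(<measure>) FROM <table> GROUP BY <group_by>."""
    table: str
    group_by: frozenset
    measure: str


@dataclass(frozen=True)
class MaterializedView:
    """Hypothetical view record: a name, its defining query, and a per-view switch."""
    name: str
    definition: Query
    rewrite_enabled: bool = True  # per-view switch (3b)


def rewrite(query, views, system_enabled=True, no_rewrite_hint=False):
    """Return an equivalent query over a matching view, or the query unchanged."""
    # System-level switch and per-query hint (3b) both bypass rewriting.
    if not system_enabled or no_rewrite_hint:
        return query
    for view in views:
        d = view.definition
        matches = (
            view.rewrite_enabled
            and d.table == query.table
            and d.measure == query.measure
            # The view must be grouped at the same or a finer granularity,
            # because a SUM can be rolled up from finer-grained SUMs.
            and query.group_by <= d.group_by
        )
        if matches:
            # The caller's SQL is untouched (3a); only the plan's source changes.
            return Query(view.name, query.group_by, query.measure)
    return query


views = [
    MaterializedView(
        "mv_sales_by_region_day",
        Query("sales", frozenset({"region", "day"}), "amount"),
    )
]
user_query = Query("sales", frozenset({"region"}), "amount")
rewritten = rewrite(user_query, views)
```

In this sketch the user still writes the query against `sales`; whether it is answered from the view is decided entirely inside `rewrite`, which is the "transparent" part of the goal as I read it.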

> [SPIP] Spark Relational Cache
> -----------------------------
>
>                 Key: SPARK-26764
>                 URL: https://issues.apache.org/jira/browse/SPARK-26764
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Adrian Wang
>            Priority: Major
>         Attachments: Relational+Cache+SPIP.pdf
>
>
> In modern database systems, relational cache is a common technology to boost 
> ad-hoc queries. While Spark provides caching natively, Spark SQL should be able 
> to use the relationships between relations to boost all eligible queries. 
> In this SPIP, we will enable Spark to use all defined cached relations where 
> possible, without explicit substitution in the user's query, and to keep some 
> user-defined caches available across sessions. Materialized views in many 
> database systems provide similar functionality.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
