+1 (binding)

On Thu, Feb 3, 2022 at 2:26 PM Erik Krogen <xkro...@apache.org> wrote:
> +1 (non-binding)
>
> Really looking forward to having this natively supported by Spark, so that
> we can get rid of our own hacks to tie in a custom view catalog
> implementation. I appreciate the care John has put into various parts of
> the design and believe this will provide a robust and flexible solution to
> this problem faced by various large-scale Spark users.
>
> Thanks John!
>
> On Thu, Feb 3, 2022 at 11:22 AM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> +1
>>
>> On Thu, Feb 3, 2022 at 11:19 AM John Zhuge <jzh...@apache.org> wrote:
>>
>>> Hi Spark community,
>>>
>>> I’d like to restart the vote for the ViewCatalog design proposal (SPIP
>>> <https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing>
>>> ).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to load,
>>> create, alter, and drop views in DataSourceV2.
>>>
>>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>>> update the PR <https://github.com/apache/spark/pull/28147> for review.
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Thanks!
>>>
>>> On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa <
>>> wa.moust...@gmail.com> wrote:
>>>
>>>> Considering the API aspect, the ViewCatalog API sounds like a good
>>>> idea. A view catalog will enable us to integrate Coral
>>>> <https://engineering.linkedin.com/blog/2020/coral> (our view SQL
>>>> translation and management layer) very cleanly to Spark. Currently we can
>>>> only do it by maintaining our special version of the
>>>> HiveExternalCatalog. Considering that views can be expanded
>>>> syntactically without necessarily invoking the analyzer, using a dedicated
>>>> view API can make performance better if performance is the concern.
>>>> Further, a catalog can still be both a table and view provider if it
>>>> chooses to based on this design, so I do not think we necessarily lose the
>>>> ability of providing both. Looking forward to more discussions on this and
>>>> making views a powerful tool in Spark.
>>>>
>>>> Thanks,
>>>> Walaa.
>>>>
>>>>
>>>> On Wed, May 26, 2021 at 9:54 AM John Zhuge <jzh...@apache.org> wrote:
>>>>
>>>>> Looks like we are running in circles. Should we have an online meeting
>>>>> to get this sorted out?
>>>>>
>>>>> Thanks,
>>>>> John
>>>>>
>>>>> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan <cloud0...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> OK, then I'd vote for TableViewCatalog, because
>>>>>> 1. This is how Hive catalog works, and we need to migrate Hive
>>>>>> catalog to the v2 API sooner or later.
>>>>>> 2. Because of 1, TableViewCatalog is easy to support in the current
>>>>>> table/view resolution framework.
>>>>>> 3. It's better to avoid name conflicts between table and views at the
>>>>>> API level, instead of relying on the catalog implementation.
>>>>>> 4. Caching invalidation is always a tricky problem.
>>>>>>
>>>>>> On Tue, May 25, 2021 at 3:09 AM Ryan Blue <rb...@netflix.com.invalid>
>>>>>> wrote:
>>>>>>
>>>>>>> I don't think that it makes sense to discuss a different approach in
>>>>>>> the PR rather than in the vote. Let's discuss this now since that's the
>>>>>>> purpose of an SPIP.
>>>>>>>
>>>>>>> On Mon, May 24, 2021 at 11:22 AM John Zhuge <jzh...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>>>>>>> proposal (SPIP).
>>>>>>>>
>>>>>>>> The proposal is to add a ViewCatalog interface that can be used to
>>>>>>>> load, create, alter, and drop views in DataSourceV2.
>>>>>>>>
>>>>>>>> The full SPIP doc is here:
>>>>>>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>>>>>>
>>>>>>>> Please vote on the SPIP in the next 72 hours.
>>>>>>>> Once it is approved, I’ll update the PR for review.
>>>>>>>>
>>>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>>>> [ ] +0
>>>>>>>> [ ] -1: I don’t think this is a good idea because …
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Software Engineer
>>>>>>> Netflix
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> John Zhuge
>>>>>
>>>>
>>>
>>> --
>>> John Zhuge
>>>
>>

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau