Thanks Piotr. I agree with both points. I added a doc comment to clarify both the description and name for this property. Hopefully, we're all in sync now:
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?disco=AAABFwRPGoA On Mon, Jun 24, 2024 at 4:58 AM Piotr Findeisen <[email protected]> wrote: > Hi, > For the MV to be useful, the grace period (max staleness) should be part > of materialized view definition. > Ultimately it's the query engine responsibility to implement grace period > behavior correctly, but the engine needs to know what amount of staleness > is OK for this particular view and that's ultimately user's choice and > needs to be persisted. > > re when clock starts. staleness clock cannot start at refresh query > *finish*, must start at refresh query start. > How long the refresh query run shouldn't matter. If it is run very very > slowly on a single node and takes hours to complete instead of minutes -- > doesn't matter for materialized view consumer perspective. > The only thing that matters is what *state* the fresh query was run on. > Refresh query start time is a pretty accurate approximation of that state. > Better yet is to track snapshots of tables involved in the mat view, and > then count staleness only if those tables change. > This is doable only if all involved tables are Iceberg tables and > reachable from where the mat view is being accessed, which may not be the > case in some producer/consumer scenarios. > > Best > PF > > > > > On Fri, 21 Jun 2024 at 18:28, Benny Chow <[email protected]> wrote: > >> Hi Dan, looks like it is pretty common across engines and sometimes part >> of the engine specific DDL operation to create the MV. So, I agree let's >> keep the "*materialization.max-staleness*" property. I'll also point >> out that when the clock starts for this max staleness check could be either >> when the refresh job starts or when the refresh job finishes... >> again another engine specific behavior. >> >> Benny >> >> On Fri, Jun 21, 2024 at 8:15 AM Daniel Weeks <[email protected]> wrote: >> >>> Benny, I think you bring up a good point about staleness in that >>> different clients may want different behaviors. However, defining a "grace >>> period" or "max staleness" is pretty common and makes a lot of sense when >>> working with expensive queries. Trino >>> <https://trino.io/docs/current/sql/create-materialized-view.html#synopsis> >>> and BigQuery >>> <https://cloud.google.com/bigquery/docs/materialized-views-create#max_staleness> >>> (I >>> found some references to traditional OLTP databases as well) support this >>> concept and I think it's helpful to standardize because it's a common >>> pattern especially where clients may not have the ability to refresh or >>> fallback to the view definition. >>> >>> Maybe adding the property and noting that it's acceptable for clients to >>> override the behavior with query specific options? >>> >>> -Dan >>> >>> >>> >>> On Thu, Jun 20, 2024 at 9:47 PM Jan Kaul <[email protected]> >>> wrote: >>> >>>> Thanks Benny for bringing these issues up. I would agree with both of >>>> your propositions. >>>> >>>> Regarding the naming of the fields, we can go with the naming that you >>>> suggested. I just wanted to wait if some more people chime in with their >>>> opinions. >>>> >>>> Jan >>>> On 20.06.24 23:16, Benny Chow wrote: >>>> >>>> > So basically this is just FYI and it is up to the consumer to assume >>>> what to do given the length of time between that timestamp and now? >>>> Yes.. some consumers will use it.. some won't. Only the engine >>>> producing the materialization will know when it started the refresh job. >>>> >>>> > Would the decision be on the engine implementation side (which does >>>> not have the domain knowledge), or would it be on the user's side? >>>> I think the decision would mostly be on the engine implementation >>>> side. I would imagine an engine would have a bunch of configuration >>>> properties associated with a view that control how/when the refresh is >>>> performed and settings around when the planner can still use the view's >>>> materialization when it is stale. It might be possible for the user to >>>> specify hints in the query or session/connection properties to control the >>>> staleness tolerance too. >>>> >>>> Benny >>>> >>>> >>>> On Thu, Jun 20, 2024 at 12:17 PM Walaa Eldin Moustafa < >>>> [email protected]> wrote: >>>> >>>>> So basically this is just FYI and it is up to the consumer to assume >>>>> what to do given the length of time between that timestamp and now? Would >>>>> the decision be on the engine implementation side (which does not have the >>>>> domain knowledge), or would it be on the user's side? In the latter case >>>>> the user would need to explicitly query the storage table directly, >>>>> correct? With a grace period I think we could push it down to the engine. >>>>> >>>>> Thanks, >>>>> Walaa. >>>>> >>>>> >>>>> On Thu, Jun 20, 2024 at 12:06 PM Benny Chow <[email protected]> wrote: >>>>> >>>>>> Piotr, thanks for the Trino pointers. I noticed that Trino stores >>>>>> the refresh start time as a snapshot summary property here >>>>>> <https://github.com/trinodb/trino/blob/6697fe24481a30d37eb91efd62666165acf379c2/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java#L3027>. >>>>>> I think this is exactly what I am asking for with >>>>>> "refresh-start-timestamp-ms". >>>>>> >>>>>> Walaa, no, the suggestion is to not have a grace period as this is >>>>>> engine specific on how it wants to handle staleness. >>>>>> "refresh-start-timestamp-ms" refers to 1 and not 2. We should already >>>>>> have 2 in the snapshot summary timestamp-ms property. >>>>>> >>>>>> When I say "fresh as of", I don't mean the AS OF construct. It's >>>>>> just making a guarantee to the consumer of the MV that the >>>>>> materialization >>>>>> contains data that is "fresh as of" a certain timestamp. So like, if you >>>>>> built a materialization on top of 100 tables (possibly a mix of Iceberg >>>>>> and >>>>>> non-Iceberg) and you know that the refresh job ran on say 6/20/2024 >>>>>> 12:02:10 UTC, then whatever data is in the materialization has to be >>>>>> "fresh >>>>>> as of" 6/20/2024 12:02:10 UTC. >>>>>> >>>>>> Thanks >>>>>> Benny >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Jun 20, 2024 at 11:19 AM Walaa Eldin Moustafa < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Benny, is the suggestion to couple the "refresh-start-timestamp-ms" >>>>>>> property with a grace period as well? Also, could you clarify which >>>>>>> timestamp "refresh-start-timestamp-ms" refers to: >>>>>>> (1) Timestamp when refresh is triggered >>>>>>> (2) Timestamp when refresh is concluded and the snapshot is written. >>>>>>> >>>>>>> Also when you say "fresh as of" this timestamp, do you mean "AS OF" >>>>>>> construct when used to query the materialized view? Or something else? >>>>>>> If >>>>>>> "AS OF" is what you meant, then this might answer my question about the >>>>>>> grace period, where it won't be needed. >>>>>>> >>>>>>> Thanks, >>>>>>> Walaa. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Jun 20, 2024 at 5:22 AM Piotr Findeisen < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Hi Benny, >>>>>>>> >>>>>>>> on the staleness topic I'd recommend to check how Trino implements >>>>>>>> materialized views in Iceberg and how it defines staleness. >>>>>>>> In particular >>>>>>>> >>>>>>>> - a view can have defined grace period which defines how stale the >>>>>>>> data can be for the materialization to be considered useful (defaults >>>>>>>> to >>>>>>>> unlimited) >>>>>>>> - staleness clock starts with the first table change after refresh >>>>>>>> - for unmanaged (non-iceberg) tables where we don't know when the >>>>>>>> table changed, the staleness clock starts right after refresh >>>>>>>> >>>>>>>> Best >>>>>>>> Piotr >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Wed, 19 Jun 2024 at 19:58, Benny Chow <[email protected]> wrote: >>>>>>>> >>>>>>>>> Hey Guys, >>>>>>>>> >>>>>>>>> Great progress on the MV spec and thanks a ton to Jan and Walaa >>>>>>>>> for driving this. One of our latest achievements was that we >>>>>>>>> finalized the >>>>>>>>> view lineage and materialization table refresh JSON so that we can >>>>>>>>> definitively and concisely describe what data is in the >>>>>>>>> materialization >>>>>>>>> table. >>>>>>>>> >>>>>>>>> Regarding the actual refresh process, I have two more suggestions: >>>>>>>>> >>>>>>>>> *When should a MV be refreshed? *There could be many different >>>>>>>>> refresh policies such as "on table data or view change", periodic, >>>>>>>>> scheduled and/or manual with the goal of reducing staleness while >>>>>>>>> minimizing cost to refresh. I don't think we should try to capture >>>>>>>>> this >>>>>>>>> configuration as part of the first iteration of the MV spec. So, I >>>>>>>>> suggest >>>>>>>>> we just remove the "*materialization.data.max-staleness*" view >>>>>>>>> property for now. There's a lot of comments on this in the spec and >>>>>>>>> many >>>>>>>>> contributors did suggest to not include it as well. >>>>>>>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?disco=AAABFwRPGoA >>>>>>>>> >>>>>>>>> *When refresh is done, what information is stored to help engines >>>>>>>>> evaluate materialization freshness?* We agreed on storing the >>>>>>>>> view lineage and materialization refresh-tables so that engines can >>>>>>>>> query >>>>>>>>> for the current table snapshot versions and view versions and compare >>>>>>>>> to >>>>>>>>> what is stored in the refresh-tables. However, there could be 100s of >>>>>>>>> tables/views here and it could be prohibitively expensive to do this >>>>>>>>> check. Instead, the engine may just use the materialization's >>>>>>>>> snapshot >>>>>>>>> summary timestamp-ms to determine the last refresh time and assume >>>>>>>>> the data >>>>>>>>> is fresh as of this timestamp. However, this assumption might be >>>>>>>>> naive if >>>>>>>>> the refresh job took 1 hour to run and source tables were queried at >>>>>>>>> different times throughout the execution of the job. So, I propose >>>>>>>>> we add >>>>>>>>> a "*refresh-start-timestamp-ms*" to the materialization snapshot >>>>>>>>> summary which tells users that the data in the materialization is at >>>>>>>>> least >>>>>>>>> as fresh as of this date (It might be fresher but not more stale). >>>>>>>>> >>>>>>>>> Thoughts? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Benny >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>
