Re: Iceberg MV Refresh

Benny Chow Mon, 24 Jun 2024 08:12:42 -0700

Thanks Piotr.  I agree with both points.  I added a doc comment to clarify
both the description and name for this property.  Hopefully, we're all in
sync now:


https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?disco=AAABFwRPGoA



On Mon, Jun 24, 2024 at 4:58 AM Piotr Findeisen <piotr.findei...@gmail.com>
wrote:

> Hi,
> For the MV to be useful, the grace period (max staleness) should be part
> of materialized view definition.
> Ultimately it's the query engine responsibility to implement grace period
> behavior correctly, but the engine needs to know what amount of staleness
> is OK for this particular view and that's ultimately user's choice and
> needs to be persisted.
>
> re when clock starts. staleness clock cannot start at refresh query
> *finish*, must start at refresh query start.
> How long the refresh query run shouldn't matter. If it is run very very
> slowly on a single node and takes hours to complete instead of minutes --
> doesn't matter for materialized view consumer perspective.
> The only thing that matters is what *state* the fresh query was run on.
> Refresh query start time is a pretty accurate approximation of that state.
> Better yet is to track snapshots of tables involved in the mat view, and
> then count staleness only if those tables change.
> This is doable only if all involved tables are Iceberg tables and
> reachable from where the mat view is being accessed, which may not be the
> case in some producer/consumer scenarios.
>
> Best
> PF
>
>
>
>
> On Fri, 21 Jun 2024 at 18:28, Benny Chow <btc...@gmail.com> wrote:
>
>> Hi Dan, looks like it is pretty common across engines and sometimes part
>> of the engine specific DDL operation to create the MV.  So, I agree let's
>> keep the "*materialization.max-staleness*" property.  I'll also point
>> out that when the clock starts for this max staleness check could be either
>> when the refresh job starts or when the refresh job finishes...
>> again another engine specific behavior.
>>
>> Benny
>>
>> On Fri, Jun 21, 2024 at 8:15 AM Daniel Weeks <dwe...@apache.org> wrote:
>>
>>> Benny, I think you bring up a good point about staleness in that
>>> different clients may want different behaviors.  However, defining a "grace
>>> period" or "max staleness" is pretty common and makes a lot of sense when
>>> working with expensive queries.  Trino
>>> <https://trino.io/docs/current/sql/create-materialized-view.html#synopsis>
>>> and BigQuery
>>> <https://cloud.google.com/bigquery/docs/materialized-views-create#max_staleness>
>>>  (I
>>> found some references to traditional OLTP databases as well) support this
>>> concept and I think it's helpful to standardize because it's a common
>>> pattern especially where clients may not have the ability to refresh or
>>> fallback to the view definition.
>>>
>>> Maybe adding the property and noting that it's acceptable for clients to
>>> override the behavior with query specific options?
>>>
>>> -Dan
>>>
>>>
>>>
>>> On Thu, Jun 20, 2024 at 9:47 PM Jan Kaul <jank...@mailbox.org.invalid>
>>> wrote:
>>>
>>>> Thanks Benny for bringing these issues up. I would agree with both of
>>>> your propositions.
>>>>
>>>> Regarding the naming of the fields, we can go with the naming that you
>>>> suggested. I just wanted to wait if some more people chime in with their
>>>> opinions.
>>>>
>>>> Jan
>>>> On 20.06.24 23:16, Benny Chow wrote:
>>>>
>>>> > So basically this is just FYI and it is up to the consumer to assume
>>>> what to do given the length of time between that timestamp and now?
>>>> Yes.. some consumers will use it.. some won't.   Only the engine
>>>> producing the materialization will know when it started the refresh job.
>>>>
>>>> > Would the decision be on the engine implementation side (which does
>>>> not have the domain knowledge), or would it be on the user's side?
>>>> I think the decision would mostly be on the engine implementation
>>>> side.  I would imagine an engine would have a bunch of configuration
>>>> properties associated with a view that control how/when the refresh is
>>>> performed and settings around when the planner can still use the view's
>>>> materialization when it is stale.  It might be possible for the user to
>>>> specify hints in the query or session/connection properties to control the
>>>> staleness tolerance too.
>>>>
>>>> Benny
>>>>
>>>>
>>>> On Thu, Jun 20, 2024 at 12:17 PM Walaa Eldin Moustafa <
>>>> wa.moust...@gmail.com> wrote:
>>>>
>>>>> So basically this is just FYI and it is up to the consumer to assume
>>>>> what to do given the length of time between that timestamp and now? Would
>>>>> the decision be on the engine implementation side (which does not have the
>>>>> domain knowledge), or would it be on the user's side? In the latter case
>>>>> the user would need to explicitly query the storage table directly,
>>>>> correct? With a grace period I think we could push it down to the engine.
>>>>>
>>>>> Thanks,
>>>>> Walaa.
>>>>>
>>>>>
>>>>> On Thu, Jun 20, 2024 at 12:06 PM Benny Chow <btc...@gmail.com> wrote:
>>>>>
>>>>>> Piotr, thanks for the Trino pointers.  I noticed that Trino stores
>>>>>> the refresh start time as a snapshot summary property here
>>>>>> <https://github.com/trinodb/trino/blob/6697fe24481a30d37eb91efd62666165acf379c2/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java#L3027>.
>>>>>> I think this is exactly what I am asking for with
>>>>>> "refresh-start-timestamp-ms".
>>>>>>
>>>>>> Walaa, no, the suggestion is to not have a grace period as this is
>>>>>> engine specific on how it wants to handle staleness.
>>>>>>  "refresh-start-timestamp-ms" refers to 1 and not 2.  We should already
>>>>>> have 2 in the snapshot summary timestamp-ms property.
>>>>>>
>>>>>> When I say "fresh as of", I don't mean the AS OF construct.  It's
>>>>>> just making a guarantee to the consumer of the MV that the 
>>>>>> materialization
>>>>>> contains data that is "fresh as of" a certain timestamp.  So like, if you
>>>>>> built a materialization on top of 100 tables (possibly a mix of Iceberg 
>>>>>> and
>>>>>> non-Iceberg) and you know that the refresh job ran on say 6/20/2024
>>>>>> 12:02:10 UTC, then whatever data is in the materialization has to be 
>>>>>> "fresh
>>>>>> as of" 6/20/2024 12:02:10 UTC.
>>>>>>
>>>>>> Thanks
>>>>>> Benny
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 20, 2024 at 11:19 AM Walaa Eldin Moustafa <
>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>
>>>>>>> Benny, is the suggestion to couple the "refresh-start-timestamp-ms"
>>>>>>> property with a grace period as well? Also, could you clarify which
>>>>>>> timestamp "refresh-start-timestamp-ms" refers to:
>>>>>>> (1) Timestamp when refresh is triggered
>>>>>>> (2) Timestamp when refresh is concluded and the snapshot is written.
>>>>>>>
>>>>>>> Also when you say "fresh as of" this timestamp, do you mean "AS OF"
>>>>>>> construct when used to query the materialized view? Or something else? 
>>>>>>> If
>>>>>>> "AS OF" is what you meant, then this might answer my question about the
>>>>>>> grace period, where it won't be needed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Walaa.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 20, 2024 at 5:22 AM Piotr Findeisen <
>>>>>>> piotr.findei...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Benny,
>>>>>>>>
>>>>>>>> on the staleness topic I'd recommend to check how Trino implements
>>>>>>>> materialized views in Iceberg and how it defines staleness.
>>>>>>>> In particular
>>>>>>>>
>>>>>>>> - a view can have defined grace period which defines how stale the
>>>>>>>> data can be for the materialization to be considered useful (defaults 
>>>>>>>> to
>>>>>>>> unlimited)
>>>>>>>> - staleness clock starts with the first table change after refresh
>>>>>>>> - for unmanaged (non-iceberg) tables where we don't know when the
>>>>>>>> table changed, the staleness clock starts right after refresh
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Piotr
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, 19 Jun 2024 at 19:58, Benny Chow <btc...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Guys,
>>>>>>>>>
>>>>>>>>> Great progress on the MV spec and thanks a ton to Jan and Walaa
>>>>>>>>> for driving this.  One of our latest achievements was that we 
>>>>>>>>> finalized the
>>>>>>>>> view lineage and materialization table refresh JSON so that we can
>>>>>>>>> definitively and concisely describe what data is in the 
>>>>>>>>> materialization
>>>>>>>>> table.
>>>>>>>>>
>>>>>>>>> Regarding the actual refresh process, I have two more suggestions:
>>>>>>>>>
>>>>>>>>> *When should a MV be refreshed?  *There could be many different
>>>>>>>>> refresh policies such as "on table data or view change", periodic,
>>>>>>>>> scheduled and/or manual with the goal of reducing staleness while
>>>>>>>>> minimizing cost to refresh.  I don't think we should try to capture 
>>>>>>>>> this
>>>>>>>>> configuration as part of the first iteration of the MV spec.  So, I 
>>>>>>>>> suggest
>>>>>>>>> we just remove the "*materialization.data.max-staleness*" view
>>>>>>>>> property for now.  There's a lot of comments on this in the spec and 
>>>>>>>>> many
>>>>>>>>> contributors did suggest to not include it as well.
>>>>>>>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?disco=AAABFwRPGoA
>>>>>>>>>
>>>>>>>>> *When refresh is done, what information is stored to help engines
>>>>>>>>> evaluate materialization freshness?*  We agreed on storing the
>>>>>>>>> view lineage and materialization refresh-tables so that engines can 
>>>>>>>>> query
>>>>>>>>> for the current table snapshot versions and view versions and compare 
>>>>>>>>> to
>>>>>>>>> what is stored in the refresh-tables.  However, there could be 100s of
>>>>>>>>> tables/views here and it could be prohibitively expensive to do this
>>>>>>>>> check.  Instead, the engine may just use the materialization's 
>>>>>>>>> snapshot
>>>>>>>>> summary timestamp-ms to determine the last refresh time and assume 
>>>>>>>>> the data
>>>>>>>>> is fresh as of this timestamp.  However, this assumption might be 
>>>>>>>>> naive if
>>>>>>>>> the refresh job took 1 hour to run and source tables were queried at
>>>>>>>>> different times throughout the execution of the job.  So, I propose 
>>>>>>>>> we add
>>>>>>>>> a "*refresh-start-timestamp-ms*" to the materialization snapshot
>>>>>>>>> summary which tells users that the data in the materialization is at 
>>>>>>>>> least
>>>>>>>>> as fresh as of this date  (It might be fresher but not more stale).
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Benny
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>

Re: Iceberg MV Refresh

Reply via email to