Re: Dedicated sync for Iceberg materialized view

Benny Chow Fri, 05 Dec 2025 13:53:37 -0800

>> Benny: Are you suggesting that the source-table-states should only
capture the leaf table nodes in the MV dependency pipeline?
Yes.  But to be clear with an example, suppose you have a MV like:


CREATE MV MV1 as SELECT * FROM T1 UNION ALL SELECT * FROM V1

And suppose V1 was defined as CREATE MV V1 as SELECT * FROM T1  --- Yes, T1
again to make this example interesting.

Then, I'm saying that the source-table-state for MV1 is going to somehow
combine the first T1 with the source table state from V1.
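
As a rough illustration of that combination, here is a minimal Python sketch. The `SourceTableState` type, the `flatten_states` helper, and the snapshot IDs are hypothetical and not taken from the spec PR; the point is only that MV1's flat list can end up holding T1 twice, once as read directly and once as inherited from V1's refresh state.

```python
from dataclasses import dataclass


# Hypothetical shape of one source-table-state entry; not taken from the spec PR.
@dataclass(frozen=True)
class SourceTableState:
    table: str
    snapshot_id: int


def flatten_states(direct, nested_mv_states):
    """Combine directly read tables with the states recorded by nested MVs.

    Duplicates are kept on purpose: the same table may legitimately
    appear at different snapshots.
    """
    combined = list(direct)
    for states in nested_mv_states:
        combined.extend(states)
    return combined


# V1 was last refreshed when T1 was at (made-up) snapshot 100.
v1_state = [SourceTableState("T1", 100)]
# MV1 reads T1 directly at snapshot 105 and also consumes V1.
mv1_state = flatten_states([SourceTableState("T1", 105)], [v1_state])
print(mv1_state)
```

How a consumer should interpret the two T1 entries is left open here, which is the question the rest of the thread debates.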

On Fri, Dec 5, 2025 at 1:25 PM Igor Belianski <[email protected]>
wrote:

> Hi Benny and Walaa,
>
> Could you please clarify the following statement from Benny's last email:
>
> "We should avoid the need for consumers to expand nested MVs. I think the
> producer should be combining the refresh states of all the nested MVs it
> uses into two flat lists of source views and source tables. These source
> tables can't contain storage tables."
>
> Benny: Are you suggesting that the source-table-states should only capture
> the leaf table nodes in the MV dependency pipeline?
>
> Walaa: If we completely enumerate all source tables recursively, we must
> consider the scenario of shared upstream tables (the "diamond pattern").
> Specifically:
>
> 1. Do we allow duplicate table entries in the list (e.g., if the same
> table was used at different snapshots across different refresh path
> traversals)?
> 2. Would we need to include the path in the refresh state entry to make
> this data interpretable?
>
> If we pursue the option of listing everything in the tree, we should
> choose between:
>
> - A) Permissive: Allow duplicate table entries, treating the list as a
> client hint for tables to check. This leaves it up to engines to
> disambiguate or skip entries, and the list may not be strictly exhaustive.
> - B) Prescriptive: Establish an exactly defined meaning for each entry,
> mandating clear rules for aggregation.
>
> I am highly hesitant to mandate option B (it would obviously be too
> prescriptive for most engines).
>
> Thanks,
> Igor
>
> On Thu, Dec 4, 2025 at 9:28 PM Benny Chow <[email protected]> wrote:
>
>> I agree with Walaa.  In the last sync, we talked about smart producers
>> and dumb consumers.  We should avoid the need for consumers to expand
>> nested MVs.  I think the producer should be combining the refresh states of
>> all the nested MVs it uses into two flat lists of source views and source
>> tables.  These source tables can't contain storage tables.  When planning
>> the refresh job, the producer can choose to use the nested MV's storage
>> table or not and the refresh state needs to reflect this decision
>> accordingly.
>>
>> There's also a corner case to consider.  It is completely
>> possible for a source table to show up in a materialization at different
>> snapshots.  In this scenario, it's up to the producer to decide whether to
>> allow this or not or maybe just record the earliest snapshot.  These
>> scenarios are inevitable when you get MVs built on MVs built on MVs such as
>> in ETL scenarios.
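
One possible producer policy for that corner case, sketched in Python under assumptions: entries are hypothetical `(table, snapshot_id, committed_at_ms)` tuples, and "earliest" is decided by commit timestamp because snapshot IDs by themselves are not assumed to be ordered. Whether a producer should collapse duplicates at all is exactly the open question in this thread.

```python
def keep_earliest(entries):
    """Collapse duplicate tables, keeping the entry with the oldest commit time.

    entries: iterable of (table_name, snapshot_id, committed_at_ms) tuples.
    Returns a dict mapping table_name -> (snapshot_id, committed_at_ms).
    """
    earliest = {}
    for table, snapshot_id, committed_at_ms in entries:
        current = earliest.get(table)
        if current is None or committed_at_ms < current[1]:
            earliest[table] = (snapshot_id, committed_at_ms)
    return earliest


# Made-up values: T1 shows up twice, at two different snapshots.
entries = [
    ("T1", 105, 1_700_000_500_000),
    ("T1", 100, 1_700_000_000_000),
    ("T2", 7, 1_700_000_300_000),
]
collapsed = keep_earliest(entries)
print(collapsed)
```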
>>
>> For max-staleness, I think it should strictly apply to only source tables
>> and not storage tables.  For the ETL pipeline use case, the consumer is
>> probably not going to care about this property.
>>
>> Thanks
>>
>> On Thu, Dec 4, 2025 at 5:37 PM Walaa Eldin Moustafa <
>> [email protected]> wrote:
>>
>>> I think this creates significant friction for engine implementations
>>> (and contradicts some of the principles we established earlier):
>>>
>>> * When the engine sees the tables backing mv_3, the spec provides no
>>> built-in way to distinguish MV storage tables from true physical tables.
>>> The engine must always perform an external lookup to determine whether a
>>> “table” is really an MV.
>>>
>>> * Freshness evaluation becomes inconsistent: nested logical views
>>> require only a one-shot leaf-table comparison, while nested MVs require
>>> recursive traversal because their refresh-state does not contain leaf
>>> snapshots.
>>>
>>> * Even if an engine uses the MV definition to detect deeper staleness,
>>> it cannot refresh the MV to a consistent base-table state. Option 2 refresh
>>> semantics stop at the immediate MV boundary, so recursion for freshness
>>> does not align with recursion for refresh.
>>>
>>> For these reasons, “allowing recursive expansion” is not practically
>>> usable. It introduces complexity without providing coherent semantics.
>>>
>>> In summary, treating MVs either as views or as tables yields a
>>> consistent model, but the optionality implied in Option 2 is misleading.
>>> The metadata does not support cleanly mixing the two modes.
>>>
>>> Thanks,
>>> Walaa.
>>>
>>> On Thu, Dec 4, 2025 at 11:44 AM Steven Wu <[email protected]> wrote:
>>>
>>>> Walaa,
>>>>
>>>> We are saying the `refresh-state` only has a source view state for the
>>>> source MV and a source table state for the source MV's storage table. It
>>>> would allow both evaluation strategies (recursive or not)
>>>> * non-recursive: as long as the MV refresh state is aligned with the
>>>> source MV's storage table (with max staleness config), it is fresh. This
>>>> semantic matches many ETL pipeline use cases.
>>>> * recursive: If an engine wants to enforce stronger freshness
>>>> semantics, it can recursively evaluate if source mv_1 and mv_2 themselves
>>>> are fresh. The current spec wording mentioned this is allowed: "query
>>>> engines may recursively expand the query tree to determine freshness".
>>>>
>>>> We want the spec definition to be flexible enough to support both use
>>>> cases.
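
The two strategies can be contrasted with a small Python sketch. The `catalog` layout, the snapshot numbers, and `is_fresh` are made up for illustration, and the max-staleness window is left out to keep the comparison short.

```python
# Hypothetical in-memory catalog. Plain tables have only a current snapshot;
# MVs also record, per source, the snapshot seen at their last refresh.
# For a source MV, the recorded snapshot is that of its storage table.
catalog = {
    "table_a": {"current": 11},
    "mv_1": {"current": 20, "refresh_state": {"table_a": 10}},  # table_a moved on
    "mv_3": {"current": 30, "refresh_state": {"mv_1": 20}},
}


def is_fresh(name, recursive):
    entry = catalog[name]
    state = entry.get("refresh_state")
    if state is None:
        return True  # a plain table has no freshness of its own
    for source, recorded in state.items():
        if catalog[source]["current"] != recorded:
            return False  # the source (or source MV's storage table) has advanced
        if recursive and not is_fresh(source, recursive=True):
            return False  # a source MV is itself stale
    return True


non_recursive = is_fresh("mv_3", recursive=False)
recursive = is_fresh("mv_3", recursive=True)
print(non_recursive, recursive)
```

With these made-up values, mv_3 is aligned with mv_1's storage table, so the non-recursive check passes, while the recursive check fails because mv_1 lags behind table_a.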
>>>>
>>>> Thanks,
>>>> Steven
>>>>
>>>>
>>>> On Wed, Dec 3, 2025 at 5:30 PM Walaa Eldin Moustafa <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Steven,
>>>>>
>>>>> > In option 2, when determining the freshness of mv_3, engines can
>>>>> choose to recursively evaluate the freshness of mv_1 and mv_2 since they
>>>>> are also MVs. But engines can also choose not to.
>>>>>
>>>>> Does not "evaluating freshness of mv_1 and mv_2" mean that engines
>>>>> consider mv_1 and mv_2 as views? "Tables" do not have freshness.
>>>>>
>>>>> Thanks,
>>>>> Walaa.
>>>>>
>>>>>
>>>>> On Tue, Nov 25, 2025 at 11:49 PM Jan Kaul via dev <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Thank you Steven,
>>>>>>
>>>>>> I've included the "max-staleness" in the PR. Please have a look and
>>>>>> give feedback on the phrasing.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jan
>>>>>> On 11/19/25 22:37, Steven Wu wrote:
>>>>>>
>>>>>> Thanks everyone for joining today's sync. We had a good discussion on
>>>>>> how to interpret the "max staleness" config.
>>>>>>
>>>>>> You can find the meeting notes here.
>>>>>>
>>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.eho7jgm13usg
>>>>>>
>>>>>> Recording is also linked in the doc (thanks Kevin).
>>>>>>
>>>>>> For the next step, maybe we can collaborate on the MV spec PR to
>>>>>> flesh out the exact wording for the staleness config and its semantics.
>>>>>> https://github.com/apache/iceberg/pull/11041/files
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 18, 2025 at 1:05 PM Benny Chow <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks Igor.  The PR has a suggestion for exactly what you
>>>>>>> suggested.  I called it a "*warm*" state, which is a state where a
>>>>>>> stale materialization can still be used.
>>>>>>> https://github.com/apache/iceberg/pull/11041/files#r2474661166
>>>>>>>
>>>>>>> I think if we continue with the assumption that MVs can only
>>>>>>> reference Iceberg tables and views, then it makes sense for the
>>>>>>> max-staleness grace period to be dynamic based on snapshot history.
>>>>>>> This is what Trino does:
>>>>>>> https://trino.io/docs/current/connector/iceberg.html?utm_source=chatgpt.com#materialized-views
>>>>>>>
>>>>>>> If there are non-Iceberg tables in the view SQL, then the grace
>>>>>>> period will have to be based on last refresh which is also what Trino
>>>>>>> describes here:
>>>>>>> https://trino.io/docs/current/sql/create-materialized-view.html#mv-grace-period
>>>>>>>
>>>>>>> Should we call out both scenarios in the MV spec?  I think it is
>>>>>>> worth being explicit here.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 18, 2025 at 11:03 AM Igor Belianski <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Re: max-staleness-ms interpretation
>>>>>>>> Proposal:
>>>>>>>>    A Materialized View (MV) is considered fresh if and only if the
>>>>>>>> stored results are equivalent to those that would have been obtained
>>>>>>>> by running the MV's defining query at some point in time within the
>>>>>>>> interval:
>>>>>>>>  [CurrentTime - max-staleness-ms, CurrentTime]
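
A minimal Python sketch of this definition for a single source table, assuming that an MV built from snapshot i stays equivalent to its defining query until snapshot i+1 is committed. The function name, parameters, and timestamps are hypothetical.

```python
def is_fresh(history_ms, refreshed_index, now_ms, max_staleness_ms):
    """Check the interval definition against one source table's history.

    history_ms: commit times of the source table's snapshots, oldest first.
    refreshed_index: position of the snapshot the MV was built from.
    """
    window_start = now_ms - max_staleness_ms
    if refreshed_index == len(history_ms) - 1:
        return True  # the MV matches the current snapshot
    # The MV result stayed correct until the next snapshot was committed,
    # so it is fresh if that moment falls inside the staleness window.
    valid_until = history_ms[refreshed_index + 1]
    return valid_until > window_start


# Made-up timeline: MV built from the snapshot at t=1000, source changed at t=5000.
history = [1_000, 5_000]
within_window = is_fresh(history, 0, now_ms=10_000, max_staleness_ms=6_000)
outside_window = is_fresh(history, 0, now_ms=10_000, max_staleness_ms=4_000)
print(within_window, outside_window)
```

Note that the check depends on when the source last changed, not on when the MV was refreshed, which is the property that makes the grace period dynamic.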
>>>>>>>>
>>>>>>>> Note: this definition allows for the optimization proposed by option 2
>>>>>>>> (implementing which is definitely a great idea), but doesn't mandate
>>>>>>>> it. One can also imagine other optimizations that would be possible
>>>>>>>> given the definition above, and that would be left up to the engines
>>>>>>>> to implement.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 18, 2025 at 10:54 AM Steven Wu <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> A reminder for tomorrow's community sync for the MV spec.
>>>>>>>>> https://calendar.app.google/T4zSk6qKWoy1vV6P7
>>>>>>>>>
>>>>>>>>> We have one open question from the last meeting on how
>>>>>>>>> `max-staleness-ms` should be interpreted. You can find more details
>>>>>>>>> in the meeting notes.
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o
>>>>>>>>>
>>>>>>>>> Please also bring other topics that we should discuss.
>>>>>>>>>
>>>>>>>>> On Sat, Nov 1, 2025 at 10:14 PM Steven Wu <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sorry for the delay. Here are the recording and meeting notes for
>>>>>>>>>> the MV sync meeting on Wednesday, Oct 29.
>>>>>>>>>>
>>>>>>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o
>>>>>>>>>>
>>>>>>>>>> We have started to collect them in the above google doc.
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 27, 2025 at 8:58 AM Péter Váry <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> If we have materialized views (MVs) and support for incremental
>>>>>>>>>>> change scans, then by introducing a Java-based representation of
>>>>>>>>>>> the view, we can expose a scan API that always returns up-to-date
>>>>>>>>>>> results for the MV.
>>>>>>>>>>>
>>>>>>>>>>> The scan could include multiple tasks:
>>>>>>>>>>>
>>>>>>>>>>>    - A task for reading the current version of the MV.
>>>>>>>>>>>    - An incremental change log scan covering the range between
>>>>>>>>>>>    the snapshot ID of the source table at the time the MV was last
>>>>>>>>>>>    refreshed and its current snapshot ID, applying the Java
>>>>>>>>>>>    representation of the view when transformations are required.
>>>>>>>>>>>
>>>>>>>>>>> This approach allows us to build an always up-to-date index
>>>>>>>>>>> table/single source MV, using existing components.
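
The idea can be pictured with a small Python sketch (the proposal itself talks about a Java API; Python is used here only for brevity). Everything in it is an assumption: `up_to_date_scan`, the `(op, row)` changelog shape, and the per-row `transform` are hypothetical, and the approach only fits single-source views whose logic can be applied row by row.

```python
def up_to_date_scan(stored_rows, changelog, transform):
    """Merge the stored MV rows with source changes since the last refresh.

    stored_rows: rows in the MV storage table as of the last refresh.
    changelog: (op, row) pairs committed to the source since then,
               where op is "insert" or "delete".
    transform: the view logic applied to a single source row.
    """
    rows = list(stored_rows)
    for op, source_row in changelog:
        view_row = transform(source_row)
        if op == "insert":
            rows.append(view_row)
        elif op == "delete":
            rows.remove(view_row)  # raises ValueError if the row is unknown
    return rows


def project_id(row):
    # Stand-in for a simple projection view: SELECT id FROM source.
    return (row["id"],)


stored = [(1,), (2,)]
changelog = [
    ("insert", {"id": 3, "name": "c"}),
    ("delete", {"id": 1, "name": "a"}),
]
result = up_to_date_scan(stored, changelog, project_id)
print(result)
```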
>>>>>>>>>>>
>>>>>>>>>>> Benny Chow <[email protected]> wrote (on Fri, Oct 24, 2025 at
>>>>>>>>>>> 7:44):
>>>>>>>>>>>
>>>>>>>>>>>> Hi Peter
>>>>>>>>>>>>
>>>>>>>>>>>> I think the current proposal would support your example.  In
>>>>>>>>>>>> most situations, replace table operations after a view is
>>>>>>>>>>>> materialized wouldn’t invalidate the materialization.  However, if
>>>>>>>>>>>> the view includes metadata columns, then the replace operations
>>>>>>>>>>>> should invalidate the materialization.
>>>>>>>>>>>>
>>>>>>>>>>>> This also brings up another important point: engines will
>>>>>>>>>>>> differ on which views can be materialized.  For example, maybe
>>>>>>>>>>>> metadata columns are not allowed, similar to non-deterministic
>>>>>>>>>>>> functions like random.  But some engines like Dremio may allow
>>>>>>>>>>>> views that use current date functions.  It should be possible for
>>>>>>>>>>>> one engine to materialize a view and another engine to look at the
>>>>>>>>>>>> query tree, decide it’s not a view it supports materializations
>>>>>>>>>>>> on, and choose not to use that materialization.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Benny
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 23, 2025, at 8:44 AM, Péter Váry <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> I’ve been catching up on the discussion and wanted to share an
>>>>>>>>>>>> observation. One aspect that stands out to me in the proposed
>>>>>>>>>>>> staleness evaluation logic is that snapshots which don’t modify
>>>>>>>>>>>> data can still affect the view’s contents if the view includes
>>>>>>>>>>>> metadata columns.
>>>>>>>>>>>>
>>>>>>>>>>>> I was considering using a materialized view as an index for a
>>>>>>>>>>>> given table to accelerate the conversion of equality deletes to
>>>>>>>>>>>> position deletes. For example, the query might look like:
>>>>>>>>>>>>
>>>>>>>>>>>> *SELECT _POS, _FILE, id FROM target_table*
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> During compaction, the materialized view would need to be
>>>>>>>>>>>> refreshed to ensure it reflects the correct data.
>>>>>>>>>>>>
>>>>>>>>>>>> Does this seem like a valid use case? Or should we explicitly
>>>>>>>>>>>> exclude scenarios like this?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Peter
>>>>>>>>>>>>
>>>>>>>>>>>> Steven Wu <[email protected]> wrote (on Mon, Oct 20, 2025 at
>>>>>>>>>>>> 17:30):
>>>>>>>>>>>>
>>>>>>>>>>>>> Walaa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> > while Option 2 is described in your summary as "giving
>>>>>>>>>>>>> engines *flexibility* to determine freshness recursively
>>>>>>>>>>>>> beyond a source MV", that *isn’t achievable* under the MV
>>>>>>>>>>>>> evaluation model itself.
>>>>>>>>>>>>> Because each MV treats upstream MVs as physical tables,
>>>>>>>>>>>>> recursion stops at the first materialized boundary; *deeper
>>>>>>>>>>>>> staleness cannot be discovered without switching to a logical-view
>>>>>>>>>>>>> evaluation model, i.e., stepping outside the MV model altogether
>>>>>>>>>>>>> (note that in Option 3 we can determine recursive staleness while
>>>>>>>>>>>>> still inside the MV model).*
>>>>>>>>>>>>>
>>>>>>>>>>>>> In option 2, when determining the freshness of mv_3, engines
>>>>>>>>>>>>> can choose to recursively evaluate the freshness of mv_1 and mv_2
>>>>>>>>>>>>> since they are also MVs. But engines can also choose not to.
>>>>>>>>>>>>>
>>>>>>>>>>>>> > This means that there seems to be an implicit “Option 3”.
>>>>>>>>>>>>> This option treats MVs as logical views, i.e., storing only view
>>>>>>>>>>>>> versions + base table snapshot IDs (no MV storage snapshot IDs, no
>>>>>>>>>>>>> per-path lineage).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the new option 3 you described, how could the engine update
>>>>>>>>>>>>> mv3's refresh state for base table_a and table_b? Unless all
>>>>>>>>>>>>> connected MVs are refreshed and committed in one single
>>>>>>>>>>>>> transaction, one entry per base table doesn't seem feasible.
>>>>>>>>>>>>> That's the main reason for option 1 to require the lineage path
>>>>>>>>>>>>> information in refresh state for base tables.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It also seems that option 3 can only interpret freshness
>>>>>>>>>>>>> recursively, while today there are engines that support MVs
>>>>>>>>>>>>> without recursively evaluating source MVs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 20, 2025 at 1:44 AM Walaa Eldin Moustafa <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Steven,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for organizing the series and summarizing the outcome.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After re-reading the Option 1/2 proposal, initially I
>>>>>>>>>>>>>> interpreted Option 1 as simply expanding MVs like regular logical
>>>>>>>>>>>>>> views. On closer look, it is actually more complex. It also
>>>>>>>>>>>>>> preserves per-path lineage state (e.g., multiple entries for the
>>>>>>>>>>>>>> same base table via different parents), which increases
>>>>>>>>>>>>>> expressiveness but significantly increases metadata complexity.
>>>>>>>>>>>>>> So I agree it is not a practical option.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This means that there seems to be an implicit “Option 3”.
>>>>>>>>>>>>>> This option treats MVs as logical views, i.e., storing only view
>>>>>>>>>>>>>> versions + base table snapshot IDs (no MV storage snapshot IDs,
>>>>>>>>>>>>>> no per-path lineage).
>>>>>>>>>>>>>> Under this model, mv_3’s metadata might look like:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Type   Name     Tracked State
>>>>>>>>>>>>>> -----  -------  -----------------------
>>>>>>>>>>>>>> view   mv_1     view_version_id
>>>>>>>>>>>>>> view   mv_2     view_version_id
>>>>>>>>>>>>>> table  table_a  table_snapshot_id
>>>>>>>>>>>>>> table  table_b  table_snapshot_id
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This preserves logical semantics and aligns MV behavior with
>>>>>>>>>>>>>> pure views.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *If we choose Option 2 (treat source MV as a materialized
>>>>>>>>>>>>>> table), we may have to consider these constraints:*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * Staleness only degrades up the chain. mv_1 and mv_2 may
>>>>>>>>>>>>>> already be stale relative to the base tables, but if mv_3 is
>>>>>>>>>>>>>> refreshed using their storage snapshots, then mv_3 will be marked
>>>>>>>>>>>>>> as fresh under Option 2, even though all three MVs are stale
>>>>>>>>>>>>>> relative to the base tables.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * Engines can no longer discover staleness beyond mv_1. Once
>>>>>>>>>>>>>> mv_3 sees mv_1 (or mv_2) as fresh based only on their storage
>>>>>>>>>>>>>> snapshots, it will not expand into mv_1 or mv_2 to check whether
>>>>>>>>>>>>>> they are stale relative to the base tables.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * If mv_2 and mv_3 were purely logical views instead of MVs,
>>>>>>>>>>>>>> they would evaluate directly against base tables and return newer
>>>>>>>>>>>>>> data. Under Option 2, the same definitions but materialized
>>>>>>>>>>>>>> upstream produce different data, not just different metadata.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Therefore, while Option 2 is described in your summary as
>>>>>>>>>>>>>> "giving engines *flexibility* to determine freshness
>>>>>>>>>>>>>> recursively beyond a source MV", that *isn’t achievable*
>>>>>>>>>>>>>> under the MV evaluation model itself.
>>>>>>>>>>>>>> Because each MV treats upstream MVs as physical tables,
>>>>>>>>>>>>>> recursion stops at the first materialized boundary; *deeper
>>>>>>>>>>>>>> staleness cannot be discovered without switching to a
>>>>>>>>>>>>>> logical-view evaluation model, i.e., stepping outside the MV
>>>>>>>>>>>>>> model altogether (note that in Option 3 we can determine
>>>>>>>>>>>>>> recursive staleness while still inside the MV model).*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know your thoughts. I slightly prefer Option 3. I’m
>>>>>>>>>>>>>> also fine with Option 2, but I don’t think the flexibility to
>>>>>>>>>>>>>> recursively determine freshness actually exists under its
>>>>>>>>>>>>>> evaluation model. Not sure if this changes anyone’s view, but I
>>>>>>>>>>>>>> wanted to clarify how I’m reading it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 11:11 PM Benny Chow <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I just listened to the recording.  I'm the tech lead for MVs
>>>>>>>>>>>>>>> at Dremio and responsible for both refresh management and query
>>>>>>>>>>>>>>> rewrites with MVs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's great that we seem to agree that the Iceberg MV spec won't
>>>>>>>>>>>>>>> require that MVs always be up to date in order to be usable for
>>>>>>>>>>>>>>> query rewrites.  There can be many data consistency issues (as
>>>>>>>>>>>>>>> Dan pointed out) but that is the state of affairs today.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It sounds like we are converging on the following scenarios
>>>>>>>>>>>>>>> for an engine to validate the MV freshness:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.  Use storage table without any validation.  This might be
>>>>>>>>>>>>>>> the extreme "async MV" example.
>>>>>>>>>>>>>>> 2.  Ignore storage table even if one exists because SQL
>>>>>>>>>>>>>>> command or use case requires that.
>>>>>>>>>>>>>>> 3.  Use storage table only if data is not more than x hours
>>>>>>>>>>>>>>> old.  This can be achieved with the proposed
>>>>>>>>>>>>>>> refresh-start-timestamp-ms which is currently in the proposed
>>>>>>>>>>>>>>> spec.  For this to work with MVs built on MVs, we should
>>>>>>>>>>>>>>> probably state in the spec that if a MV is built on another MV,
>>>>>>>>>>>>>>> then it needs to inherit the refresh-start-timestamp-ms of the
>>>>>>>>>>>>>>> child MV.  In Steven's example, when building mv3,
>>>>>>>>>>>>>>> refresh-start-timestamp-ms needs to be set to the minimum of
>>>>>>>>>>>>>>> mv1's and mv2's refresh-start-timestamp-ms.  If this property
>>>>>>>>>>>>>>> name is confusing, we can rename it to
>>>>>>>>>>>>>>> "refresh-earliest-table-timestamp-ms".  I originally proposed
>>>>>>>>>>>>>>> this property and also listed out other benefits here:
>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1779797796
>>>>>>>>>>>>>>> Also, at the time, MVs built on MVs weren't being considered.
>>>>>>>>>>>>>>> Now that they are, I would recommend we have both
>>>>>>>>>>>>>>> "refresh-start-timestamp-ms" (when the refresh was started on
>>>>>>>>>>>>>>> the storage table) and "refresh-earliest-table-timestamp-ms"
>>>>>>>>>>>>>>> (used for freshness validation).
>>>>>>>>>>>>>>> 4.  Don't use the storage table if it is older than X
>>>>>>>>>>>>>>> hours.  This is what I had originally proposed for the
>>>>>>>>>>>>>>> *materialization.max-staleness-ms* view property here:
>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1744837644
>>>>>>>>>>>>>>> It wasn't meant to validate the freshness but more to prevent
>>>>>>>>>>>>>>> use of a materialization after some criteria.
>>>>>>>>>>>>>>> 5.  Use storage table if recursive validation passes, i.e.
>>>>>>>>>>>>>>> refresh-state matches the current expanded query tree state.
>>>>>>>>>>>>>>> This is what I think Steven is calling the "synchronous MV".
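
The timestamp inheritance in scenario 3 can be sketched in a few lines of Python. The helper names and the millisecond values are hypothetical; only the property name "refresh-earliest-table-timestamp-ms" and the take-the-minimum rule come from the message above.

```python
def earliest_table_timestamp_ms(own_table_timestamps_ms, child_mv_values_ms):
    """Value for the proposed "refresh-earliest-table-timestamp-ms" property.

    own_table_timestamps_ms: timestamps of the tables this MV reads directly.
    child_mv_values_ms: the same property as recorded by each MV it is built on.
    At least one of the two lists must be non-empty.
    """
    return min(list(own_table_timestamps_ms) + list(child_mv_values_ms))


def usable(earliest_ms, now_ms, max_age_ms):
    """Scenario 3: use the storage table only if its data is not older than max_age_ms."""
    return now_ms - earliest_ms <= max_age_ms


mv1 = earliest_table_timestamp_ms([1_000], [])
mv2 = earliest_table_timestamp_ms([4_000], [])
mv3 = earliest_table_timestamp_ms([], [mv1, mv2])  # built only on mv1 and mv2
print(mv3, usable(mv3, now_ms=5_000, max_age_ms=3_600))
```

Because mv3 inherits the older of the two child values, a consumer can apply the age check without expanding mv1 or mv2.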
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For scenarios 1-4, it would support the nice use case of an
>>>>>>>>>>>>>>> Iceberg client using a view's data through the storage table
>>>>>>>>>>>>>>> without needing to know how to parse/validate/expand any view
>>>>>>>>>>>>>>> SQL.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In Dremio's planner, we primarily use scenarios 1 and 4
>>>>>>>>>>>>>>> together to determine MV validity for query rewrite.  Scenarios
>>>>>>>>>>>>>>> 2 and 5 also apply in certain situations.  For scenario 3,
>>>>>>>>>>>>>>> Dremio only exposes the "refresh-earliest-table-timestamp-ms" as
>>>>>>>>>>>>>>> an FYI to the user but it would be interesting to allow the user
>>>>>>>>>>>>>>> to set this time so that they could run queries and be 100%
>>>>>>>>>>>>>>> certain that they were not seeing data older than x hours.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Benny
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 3:37 PM Steven Wu <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> correction for a typo.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Prashanth brought up another scenario of compaction/rewrite
>>>>>>>>>>>>>>>> where a new snapshot was added *with* actual data change
>>>>>>>>>>>>>>>> -->
>>>>>>>>>>>>>>>> Prashanth brought up another scenario of compaction/rewrite
>>>>>>>>>>>>>>>> where a new snapshot was added *without* actual data change
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 2:12 PM Steven Wu <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks everyone for joining the MV discussion meeting. We
>>>>>>>>>>>>>>>>> will continue to have the recurring sync meeting on Wednesday
>>>>>>>>>>>>>>>>> 9 am (Pacific) every 3 weeks until we get to the finish line
>>>>>>>>>>>>>>>>> where Jan's MV spec PR [1] is merged. I have scheduled our
>>>>>>>>>>>>>>>>> next meeting on Oct 29 in the Iceberg dev events calendar.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here is the video recording for today's meeting.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://drive.google.com/file/d/1-nfhBPDWLoAFDu5cKP0rwLd_30HB6byR/view?usp=sharing
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We mostly discussed freshness evaluation. Here is the
>>>>>>>>>>>>>>>>> meeting summary.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    1. For tracking the refresh state for the source MV
>>>>>>>>>>>>>>>>>    [2], the consensus is option 2 (treating source MV as a
>>>>>>>>>>>>>>>>>    materialized table) which would give engines the
>>>>>>>>>>>>>>>>>    flexibility on freshness determination (recursive beyond
>>>>>>>>>>>>>>>>>    source MV or not).
>>>>>>>>>>>>>>>>>    2. Earlier design doc [3] discussed max staleness
>>>>>>>>>>>>>>>>>    config. But it wasn't reflected in the spec PR. The general
>>>>>>>>>>>>>>>>>    opinion is to add the config to the spec PR. The open
>>>>>>>>>>>>>>>>>    question is whether the `materialization.max-staleness-ms`
>>>>>>>>>>>>>>>>>    config should be added to the view metadata or the storage
>>>>>>>>>>>>>>>>>    table metadata. Either can work. We just need to decide
>>>>>>>>>>>>>>>>>    which is the slightly better fit.
>>>>>>>>>>>>>>>>>    3. Prashanth brought up schema change with default
>>>>>>>>>>>>>>>>>    value and how it may affect the MV refresh state (for SQL
>>>>>>>>>>>>>>>>>    representation with select *). Jan mentioned that a
>>>>>>>>>>>>>>>>>    snapshot contains the schema id in effect when the snapshot
>>>>>>>>>>>>>>>>>    was created. The engine can compare the snapshot schema id
>>>>>>>>>>>>>>>>>    to the source table schema id during freshness evaluation.
>>>>>>>>>>>>>>>>>    There is no need for additional schema info in
>>>>>>>>>>>>>>>>>    refresh-state tracking in the storage table.
>>>>>>>>>>>>>>>>>    4. Prashanth brought up another scenario of
>>>>>>>>>>>>>>>>>    compaction/rewrite where a new snapshot was added with
>>>>>>>>>>>>>>>>>    actual data change. The general take is that the engine can
>>>>>>>>>>>>>>>>>    optimize and decide that MV is fresh as the new snapshot
>>>>>>>>>>>>>>>>>    doesn't have any data change.
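
Items 3 and 4 can be combined into one check, sketched here in Python. Snapshots are modeled as flat dicts with hypothetical "schema-id" and "operation" keys (in real table metadata the operation sits inside the snapshot summary), and `needs_refresh` is an illustrative name. As raised elsewhere in this thread, a view that selects metadata columns would need stricter handling than this.

```python
def needs_refresh(refreshed_snapshot, newer_snapshots, current_schema_id):
    """Decide whether one source table makes the MV stale.

    refreshed_snapshot: the source snapshot recorded at the last MV refresh.
    newer_snapshots: source snapshots committed after that one.
    current_schema_id: the source table's current schema id.
    """
    if refreshed_snapshot["schema-id"] != current_schema_id:
        return True  # schema changed since the refresh, e.g. a new default column
    # "replace" is the operation Iceberg records for compaction/rewrite commits,
    # which rearrange files without changing the table's data.
    return any(s["operation"] != "replace" for s in newer_snapshots)


refreshed = {"schema-id": 1, "operation": "append"}
compaction_only = [{"schema-id": 1, "operation": "replace"}]
with_append = compaction_only + [{"schema-id": 1, "operation": "append"}]

print(needs_refresh(refreshed, compaction_only, current_schema_id=1))
print(needs_refresh(refreshed, with_append, current_schema_id=1))
```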
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We can add some clarifications in the spec PR for
>>>>>>>>>>>>>>>>> freshness evaluation based on the above discussions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/11041
>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1_StBW5hCQhumhIvgbdsHjyW0ED3dWMkjtNzyPp9Sfr8/edit?tab=t.0
>>>>>>>>>>>>>>>>> [3]
>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?tab=t.0#heading=h.3wigecex0zls
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Sep 25, 2025 at 9:27 AM Steven Wu <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Iceberg materialized view has been discussed in the
>>>>>>>>>>>>>>>>>> community for a long time. Thanks Jan Kaul for driving the
>>>>>>>>>>>>>>>>>> discussion and the spec PR. It has been stalled for a long
>>>>>>>>>>>>>>>>>> time due to lack of consensus on 1 or 2 topics. In Wed's
>>>>>>>>>>>>>>>>>> Iceberg community sync meeting, Talat brought up the question
>>>>>>>>>>>>>>>>>> on how to move forward and if we can have a dedicated meeting
>>>>>>>>>>>>>>>>>> for MV.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have set up a meeting on *Oct 8 (9-10 am Pacific)*. If
>>>>>>>>>>>>>>>>>> you subscribe to the "Iceberg Dev Events" calendar, you
>>>>>>>>>>>>>>>>>> should be able to see it. If not, here is the link:
>>>>>>>>>>>>>>>>>> https://meet.google.com/nfe-guyq-pqf
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We are going to discuss
>>>>>>>>>>>>>>>>>> * remaining open questions
>>>>>>>>>>>>>>>>>> * unresolved concerns
>>>>>>>>>>>>>>>>>> * the next step and hopefully some consensus on moving
>>>>>>>>>>>>>>>>>> forward
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> MV spec PR is up to date. Jan has incorporated recent
>>>>>>>>>>>>>>>>>> feedback. This should be the base of the discussion.
>>>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Dev discussion thread (a long-running thread started by
>>>>>>>>>>>>>>>>>> Jan).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The mail archive has broken lineage and doesn't show all
>>>>>>>>>>>>>>>>>> replies. Email subject is "*[DISCUSS] Iceberg
>>>>>>>>>>>>>>>>>> Materialized Views*".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
