Re: [DISCUSS] Write-path gap for field-id-bound policy during schema evolution

Prashant Singh Tue, 16 Jun 2026 12:39:46 -0700

Hey Sung,
I understand that people define and attach policies by name, but I don't
think engines / metastores keep names as metadata (at least not the engines /
catalogs I have worked with). Column names are resolved to field IDs
before the policy-to-table mapping is persisted, which means the column
must exist at attachment time.
It is much like creating an Iceberg table with column names: each column is
assigned a field ID that is largely opaque to the user, and from then on the
field ID is the source of truth for all operations, which is what lets the
binding survive things like a rename.
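
To make that concrete, here is a minimal sketch of the resolve-at-attach
behavior, under stated assumptions: it is not any particular catalog's
implementation, and `Table`, `PolicyStore`, and all of their methods are
hypothetical names made up for illustration. The policy is stored against
the field ID, and the name is only looked up again when read restrictions
are produced.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy schema: maps current column names to field IDs (hypothetical)."""

    columns: dict[str, int] = field(default_factory=dict)
    next_id: int = 1  # field IDs only grow and are never reused

    def add_column(self, name: str) -> int:
        if name in self.columns:
            raise ValueError(f"column already exists: {name}")
        field_id = self.next_id
        self.next_id += 1
        self.columns[name] = field_id
        return field_id

    def rename_column(self, old: str, new: str) -> None:
        if new in self.columns:
            raise ValueError(f"column already exists: {new}")
        # The field ID travels with the column; only the name changes.
        self.columns[new] = self.columns.pop(old)

    def drop_column(self, name: str) -> None:
        del self.columns[name]

    def name_of(self, field_id: int) -> str | None:
        for name, fid in self.columns.items():
            if fid == field_id:
                return name
        return None


@dataclass
class PolicyStore:
    """Toy catalog-side store: persists field ID -> policy, never names."""

    bindings: dict[int, str] = field(default_factory=dict)

    def attach(self, table: Table, column: str, policy: str) -> None:
        # The name is resolved here, so the column must already exist.
        if column not in table.columns:
            raise KeyError(f"cannot attach policy, no such column: {column}")
        self.bindings[table.columns[column]] = policy

    def read_restrictions(self, table: Table) -> dict[str, str]:
        # Map field IDs back to current names; dropped fields fall away.
        restrictions = {}
        for field_id, policy in self.bindings.items():
            name = table.name_of(field_id)
            if name is not None:
                restrictions[name] = policy
        return restrictions
```

In this sketch, a policy attached to `ssn` follows the column after it is
renamed to `tax_id`, while dropping the column and re-adding one with the
same name yields a new field ID that the old policy no longer covers.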


We discussed this exact scenario when we were modeling ReadRestrictions as
well (09/10/25): https://www.youtube.com/watch?v=orAXA5e9pmU&t=2867s
We made ReadRestrictions return names and left the metadata representation
to the catalog's choice, since the ideal would be to just rely on field IDs
in the first place.
So it's intentionally left that way: defining and attaching policies are
entirely catalog concerns, and how a catalog comes up with read
restrictions is entirely the catalog implementer's choice. I personally
don't see a gap here.

That said, this doesn't mean we can't rediscuss it. Thank you for
putting it on the agenda for the sync. Looking forward to it.

Best,
Prashant Singh

On Tue, Jun 16, 2026 at 8:14 AM Sung Yun <[email protected]> wrote:

> Hi Andrei, that's a sharp framing, and I think you've identified something
> broader that spans multiple constructs being discussed today. I agree that
> it's worth discussing the meta pattern as its own topic.
>
> On the data governance side: before we settle what a shared write-side
> shape should look like, I think it would help to first establish whether
> the specific problems are ones the community agrees are worth solving. The
> Drift and Creation problems I raised in the Google Doc have security
> consequences for FGAC policies, and whether they merit a first-class
> construct in the IRC write path is a data-governance question that I think
> is worth putting to the community on its own terms.
>
> I'll bring the policy case to the IRC sync, and I'm glad to dig into the
> meta pattern with you and Sebastian and the rest of the community as it
> sharpens.
>
> Sung
>
> On 2026/06/16 12:13:12 Andrei Tserakhau via dev wrote:
> > Hi Sung,
> >
> > Thanks for raising this. The overlap with the labels write-side
> > work I've been drafting with @Sebastian Baunsgaard
> > <[email protected]>  is structural --
> > same lifecycle, field-id binding, co-commit and concurrency questions,
> > different payload.
> >
> > But what stands out more than the specific overlap is that this is
> > the first sighting of a pattern the spec doesn't yet have a
> > framework for: catalog-authored, lifecycle-managed write APIs that
> > reach deep into catalog-owned space. Read Restrictions already
> > does this on the read side — policies are very much catalog
> > territory. Your authoring proposal extends it to write. Labels
> > CRUD lands in the same neighborhood from a different direction.
> >
> > Kevin asked something close to this at the May 28 labels sync:
> > what's the pattern for introducing new first-class concepts in the
> > REST spec? Ryan's answer pointed at the shape (CRUD verb +
> > transactional path), but the deeper question hasn't been worked
> > through — should the spec standardize the write side of
> > catalog-owned territory at all, or is this best left ad-hoc per
> > proposal with capability negotiation governing client expectations?
> >
> > I lean toward Capabilities being the right frame here. Catalogs
> > opt in, clients discover what's supported, the spec doesn't force
> > standardization deep into catalog territory. A unified write-side
> > surface has real value for clients — engines, custom tools, one
> > shape to learn — but real cost too: catalog innovation space
> > shrinks to differentiators inside a spec-prescribed envelope.
> >
> > So before aligning on specific conventions, worth asking the
> > meta-question: shall we go this direction at all? And if yes —
> > ad-hoc per proposal, or a deliberate meta-framework?
> >
> > This is broader than labels alone, so probably worth raising at
> > one of the upcoming catalog community syncs as a meta-topic
> > rather than the labels-specific sync. Labels sync can pick up
> > labels-side implications afterward, once the broader direction
> > is clearer.
> >
> > Best,
> > Andrei
> >
> > On Mon, Jun 15, 2026 at 9:10 AM Sung Yun <[email protected]> wrote:
> >
> > > Hi Dan,
> > >
> > > Apologies for the confusion. "Write" was a poor word choice on my part.
> > > I didn't mean enforcing policy on writers and you're right that a writer
> > > holds the highest level of access, and there's little to restrict there.
> > > By "write" I meant to refer to the lifecycle of the policies themselves:
> > > creating, updating, and deleting them. Enforcement stays on the read side
> > > (#13879). This is the complementary authoring path discussion on policies.
> > >
> > > The problem I'm looking at is that a policy bound to a column by name can
> > > detach or retarget when that column is renamed or dropped. A policy also
> > > can't land in the same commit as the column it protects, so the column can
> > > exist before its protection does. I've written up the analysis and a
> > > direction that could close it [1], and I'd appreciate your review.
> > >
> > > Christian, thanks. I agree with your pointers. Drop+re-add is a good
> > > example of the general case and it faces the same exposure as any schema
> > > change when policy is managed separately from the schema change that
> > > introduces it, which is exactly the problem the doc works through. I'd
> > > value your review on the shared doc.
> > >
> > > Sung
> > >
> > > [1] https://docs.google.com/document/d/1yL2Yv70hJ569dpLdW_upTzzK8Zb3fAFEKEH4JRdosjU/edit?tab=t.0
> > >
> > > On 2026/06/15 15:19:12 Daniel Weeks wrote:
> > > > Hey Sung,
> > > >
> > > > I'm not sure I fully understand the use case here.  Generally, readers
> > > > can have different policies when they consume data (what's
> > > > restricted/hidden/obfuscated).  However, on the write path, I'm not aware
> > > > of scenarios where similar policies would be applied.  A writer typically
> > > > has the highest level of access because they need to read (metadata at
> > > > minimum) and write (both metadata and data).
> > > >
> > > > What use cases are you envisioning for write side policy enforcement?
> > > >
> > > > Thanks,
> > > > Dan
> > > >
> > > > On Sun, Jun 14, 2026 at 11:43 PM Christian Thiel <[email protected]> wrote:
> > > >
> > > > > Hello Sung,
> > > > >
> > > > > thanks for sharing this!
> > > > >
> > > > > I'd definitely be interested in seeing your ideas for the proposal.
> > > > > Especially your point about field-id binding had me thinking — since
> > > > > admins author against names and never see field-ids today, it'd be worth
> > > > > spelling out where and when that name→field-id binding happens, and how
> > > > > it handles drop+re-add.
> > > > >
> > > > > I think a number of interesting points are worth discussing such as
> > > > > coexistence with external policy engines and separation of duties on
> > > > > commit, while still keeping the field-id binding intact where it applies.
> > > > >
> > > > > Looking forward to it!
> > > > >
> > > > > Best,
> > > > > Christian
> > > > >
> > > > > On Fri, 5 Jun 2026 at 22:59, Sung Yun <[email protected]> wrote:
> > > > >
> > > > >> Hi folks,
> > > > >>
> > > > > >> The FGAC / Read Restriction proposal [1] is introducing a read-side
> > > > > >> path to standardize how we describe row filters and masks, and to do it
> > > > > >> safely across schema evolution by binding them to field-ids. We don't
> > > > > >> yet have anything matching on the write path.
> > > > >>
> > > > > >> Today, policies are administered entirely outside the REST protocol,
> > > > > >> so external systems reference columns by name, as they're not part of
> > > > > >> the commit and never see field-ids. And two things break once schema and
> > > > > >> policy have to change together:
> > > > > >> - a policy bound to a column name silently re-targets when the column
> > > > > >>   is renamed
> > > > > >> - a policy commits separately from the schema change it depends on, so
> > > > > >>   a column can exist before its protection does
> > > > >>
> > > > > >> So far, policy administration has been left out of scope [2], and now
> > > > > >> that the Read Restrictions Proposal is finding consensus, I believe it
> > > > > >> is a good time to start thinking about it on the write path.
> > > > > >> I have a rough direction in mind, of enabling co-committing policy and
> > > > > >> binding it to field-ids on the server-side. So I wanted to gauge:
> > > > > >> 1. whether people see this as a gap worth closing in the IRC protocol
> > > > > >> 2. whether there are concerns or considerations that should be taken
> > > > > >>    into account
> > > > >>
> > > > > >> If there's interest, I'm happy to put together a detailed proposal and
> > > > > >> share it here for discussion.
> > > > >>
> > > > >> Sung
> > > > >>
> > > > >> [1]
> > > > >>
> > >
> https://docs.google.com/document/d/108Y0E8XsZi91x-UY0_aHLEbmXDNmxmS5BnDjunEKvTM/edit?tab=t.7l861fq8jo38
> > > > >> [2]
> https://lists.apache.org/thread/2jx33fn7lq37oxxm7sd6rjy0dnvbm4t6
> > > > >>
> > > > >
> > > >
> > >
> >
>
