Hi everyone,

Bumping the thread! We will be discussing this again in the Read Restrictions Sync tomorrow: meet.google.com/gwy-jxos-jif (9-10 am PST).
Please do join if you are interested!

Best,
Prashant Singh

On Tue, Feb 24, 2026 at 9:48 AM Prashant Singh <[email protected]> wrote:

> *Hi everyone,*
>
> As we progress with *Read Restrictions [1]*, we need to reach a community
> consensus on two key items: the *list* of predefined masks to include in
> the spec, and the *representation* of those masks.
>
> Regarding representation, the current proposal uses an *Action* model. As
> Ryan rightly puts it, this is essentially *syntactic sugar* for these
> predefined common masking operations.
>
> *Here is how the current "Action" proposal compares to a full "Transform"
> approach for a standard mask:*
>
> - *Transform Approach (define new transforms):*
>
>   {"field-id": 1, "expr": {"type": "alias", "name": "col-name", "child":
>   {"type": "apply", "func-name": "mask_alphanum", "child": {"type":
>   "reference", "field-id": 1}}}}
>
> - *Action Approach (Current Proposal):*
>
>   {"field-id": 1, "action": "mask_alphanum"}
>
> In the "Action" model, the REST spec defines what the action means, and
> the caller simply ensures that it is understood and enforced. This mirrors
> how many existing policy stores handle masking:
>
> - *Apache Ranger [2]:* Uses maskType (e.g., "maskType":
>   "MASK_SHOW_LAST_4").
> - *Google BigQuery [3]:* Uses predefinedExpression (e.g.,
>   "predefinedExpression": "SHA256").
>
> *I would love to get your feedback on the following:*
>
> - *Representation:* Does the community agree with using this *Action*
>   (syntactic sugar) approach for standard masks, or should we strictly use
>   the explicit *Transform* approach?
> - *The List:* Based on the research into Ranger, BQ, PG, and MS SQL
>   [2,3,4,5], what is the "minimal must-have" list of masks we should define
>   in the spec (I have some defined already)?
>
> Please feel free to comment on the *Spec PR #13879
> <https://github.com/apache/iceberg/pull/13879>* or reply here.
>
> *Best regards,*
>
> Prashant Singh
> ------------------------------
>
> *References:*
>
> - *[1] Proposal:* https://github.com/apache/iceberg/pull/13879
> - *[2] Ranger:* Column masking in Hive
>   <https://docs.cloudera.com/runtime/7.3.1/security-ranger-authorization/topics/security-ranger-resource-based-column-masking-in-hive-with-ranger-policies.html>
> - *[3] Google BQ:* BigQuery predefined expressions
>   <https://docs.cloud.google.com/bigquery/docs/reference/bigquerydatapolicy/rest/v1/projects.locations.dataPolicies#PredefinedExpression>
> - *[4] PostgreSQL Extension:* Masking functions
>   <https://postgresql-anonymizer.readthedocs.io/en/stable/masking_functions/>
> - *[5] MS SQL Server:* Dynamic Data Masking
>   <https://learn.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking?view=sql-server-ver17#define-a-dynamic-data-mask>
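
For anyone comparing the two representations in the quoted message, here is a minimal sketch of what "syntactic sugar" could mean in practice: an Action entry is expanded mechanically into the Transform entry. The two JSON shapes are taken from the message; everything else is an assumption for illustration only and is not part of the spec PR. In particular, the `expand_action` helper, its `column_name` parameter, and the `KNOWN_ACTIONS` set are hypothetical, and the set contains only the one action named in the thread because the final list is still open.

```python
import json

# Hypothetical registry. Only "mask_alphanum" is named in the thread;
# the real list of predefined masks is still under discussion.
KNOWN_ACTIONS = {"mask_alphanum"}


def expand_action(entry: dict, column_name: str) -> dict:
    """Expand an Action-style entry into the equivalent Transform-style entry.

    The Action form does not carry the column name, so this sketch assumes
    the caller supplies it (for example, from the table schema).
    """
    action = entry["action"]
    if action not in KNOWN_ACTIONS:
        # A reader that does not understand an action cannot enforce it.
        raise ValueError(f"unknown action: {action!r}")
    field_id = entry["field-id"]
    return {
        "field-id": field_id,
        "expr": {
            "type": "alias",
            "name": column_name,
            "child": {
                "type": "apply",
                "func-name": action,
                "child": {"type": "reference", "field-id": field_id},
            },
        },
    }


if __name__ == "__main__":
    action_entry = {"field-id": 1, "action": "mask_alphanum"}
    print(json.dumps(expand_action(action_entry, "col-name"), indent=2))
```

If the expansion is always this mechanical, the Action form loses no information relative to the Transform form; the open question is whether the spec or the caller owns the meaning of each action name.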
