Hi,
I also observed that through the REST API, Apache Ranger allows specifying
multiple values for catalog, schema, table, and column resources in data
masking policies.
{
  "service": "trino",
  "name": "mask 5",
  "policyType": 1,
  "policyPriority": 0,
  "isAuditEnabled": true,
  "resources": {
    "catalog": {
      "values": [
        "tpch",
        "tpch2"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "schema": {
      "values": [
        "sf1",
        "sf100"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "table": {
      "values": [
        "region",
        "nation"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "column": {
      "values": [
        "name",
        "comment",
        "regionkey"
      ],
      "isExcludes": false,
      "isRecursive": false
    }
  },
  "dataMaskPolicyItems": [
    {
      "accesses": [
        {
          "type": "select",
          "isAllowed": true
        }
      ],
      "users": [
        "admin"
      ],
      "groups": [
        "public"
      ],
      "delegateAdmin": false,
      "dataMaskInfo": {
        "dataMaskType": "MASK_SHOW_LAST_4"
      }
    }
  ],
  "serviceType": "trino",
  "isDenyAllElse": false
}
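
To make the scope of such a policy concrete, here is a minimal sketch in plain Python (not Ranger code). It assumes each resource level is matched independently, so the policy covers every catalog/schema/table/column combination; whether the plugin actually evaluates it this way is exactly what needs confirming. The function name expand_columns is hypothetical.

```python
from itertools import product

# Resource values copied from the policy JSON above.
resources = {
    "catalog": ["tpch", "tpch2"],
    "schema": ["sf1", "sf100"],
    "table": ["region", "nation"],
    "column": ["name", "comment", "regionkey"],
}


def expand_columns(resources):
    """Return every catalog.schema.table.column combination as a dotted name.

    Assumption: the levels combine as a plain cross product of literal
    values (no wildcards, no excludes).
    """
    levels = ("catalog", "schema", "table", "column")
    combos = product(*(resources[level] for level in levels))
    return [".".join(parts) for parts in combos]


covered = expand_columns(resources)
print(len(covered))  # 2 * 2 * 2 * 3 = 24 combinations
print(covered[0])    # tpch.sf1.region.name
```

Under that assumption, this one policy would mask 24 fully qualified columns, including ones that may not exist in every table.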

API:

curl -u admin:admin -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic YWRtaW46QWRtaW4xMjM=" \
  -d @policy1.json \
  http://localhost:33032/service/public/v2/api/policy

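
Since Ranger accepted overlapping policies in the earlier test (see the quoted message below), a pre-check before POSTing could flag them. The following is only a rough sketch under stated assumptions: the policy dicts are a simplified, hypothetical shape (just table and column value lists, not Ranger's policy model), and values are treated as literals with no wildcard or exclude handling.

```python
from itertools import product


def covered_columns(policy):
    """Expand a simplified policy into a set of (table, column) pairs."""
    return set(product(policy["table"], policy["column"]))


def overlap(policy_a, policy_b):
    """Return the (table, column) pairs that both policies would mask."""
    return covered_columns(policy_a) & covered_columns(policy_b)


# The two policies from the overlapping-policy example in the quoted message.
policy_1 = {"table": ["region"], "column": ["name"]}
policy_2 = {"table": ["region"], "column": ["name", "key"]}

print(sorted(overlap(policy_1, policy_2)))  # [('region', 'name')]
```

A non-empty result means two masking policies target the same column, which is the ambiguous case described below.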
Thanks,
Vikash
On Thu, 4 Sept 2025 at 13:19, Vikash Kumar <[email protected]> wrote:
> Hi,
>
> I wanted to check on the compatibility and recommended usage of
> multi-value resources in Ranger Data Mask policies, specifically for the
> table and column definitions.
>
> Context (Trino)
>
> - In the Ranger Admin UI, when creating a masking policy, we can only
>   select one catalog, schema, table, and column at a time (see
>   https://github.com/apache/ranger/blob/82082f1ac6abe9a1f5d3c6974ce57110d888ab72/agents-common/src/main/resources/service-defs/ranger-servicedef-trino.json#L503
>   ).
> - This creates a challenge: if we need to mask 10–20 fields across
>   multiple tables, we end up having to create a very large number of
>   policies.
> - If both table and column are configured as multiValue in the service
>   definition, a single policy can cover multiple tables and multiple
>   columns.
> - This means the Ranger plugin will effectively expand all table–column
>   combinations (e.g., if tables = [orders, customers] and columns =
>   [email, phone], masking applies to orders.email, orders.phone,
>   customers.email, customers.phone).
>
> We are exploring making table and column resources support multi-value in
> the service definition.
>
> - *Option 1:* Allow multi-value for columns only (table remains
>   single-value).
>   - Simpler, fewer conflicts, easier auditing.
>   - Still many policies if masking the same column across multiple
>     tables.
> - *Option 2:* Allow multi-value for both tables and columns.
>   - One policy can cover multiple tables and multiple columns.
>   - Possible backward compatibility issues, overlapping policy
>     conflicts, harder audit/debugging, and slight performance overhead in
>     Trino.
>
> Observation from Testing
>
> I tried modifying the service definition to allow columns as multi-value.
> While testing, I noticed that Ranger did not throw any exception in the
> case of overlapping policies.
>
> - Example:
>   - Policy 1 → region.name
>   - Policy 2 → region.name, region.key
> - Even though region.name is repeated, Ranger allowed both policies to
>   be created.
>
> This could potentially lead to conflicts or ambiguous behavior in Trino
> when deciding which mask is applied.
>
> Request
>
> Could you confirm:
>
> - Do we have full compatibility in Ranger and Trino when configuring
>   both table and column as multiValue?
> - Are there any known limitations, best practices, or risks
>   (particularly around policy evaluation and query execution in Trino)?
> - What is your recommendation for making columns and tables
>   multiValue?
> - Would you recommend using multiValue for tables in addition to
>   columns, or should tables remain singleValue?
>
> Thanks,
> Vikash
>