Hello Vikash,

A single value for catalog/schema/table/column in Ranger data mask policies is 
a UI-only constraint. As you noticed, multiple values are allowed in REST APIs; 
the policy engine also recognizes multiple values.

Of the various mask types supported, some apply to columns of any 
data type, for example MASK_NULL and MASK_NONE. Other mask types work correctly 
only on specific data types; for example, MASK_SHOW_LAST_4 does not work on 
numeric or date columns. Allowing multiple values for 
catalog/schema/table/column in data mask policies can therefore result in 
policies that apply an incompatible mask type to some columns, unless the 
policy author ensures that the mask type specified is applicable to all 
columns covered by the policy.
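
To illustrate the kind of check a policy author would need to make, here is a 
minimal Python sketch. It is not part of Ranger. The compatibility table and 
the type names (varchar, char, bigint) are assumptions for illustration; only 
the MASK_NULL, MASK_NONE and MASK_SHOW_LAST_4 behaviour comes from the 
description above. It handles exact resource values only, not wildcards.

```python
from itertools import product

# Hypothetical compatibility table (an assumption, not Ranger's own data).
# None means "works on any data type".
ANY_TYPE = None
MASK_COMPATIBILITY = {
    "MASK_NULL": ANY_TYPE,
    "MASK_NONE": ANY_TYPE,
    "MASK_SHOW_LAST_4": {"varchar", "char"},  # illustrative type names
}

LEVELS = ("catalog", "schema", "table", "column")


def incompatible_columns(policy, column_types):
    """List columns covered by the policy whose type the mask cannot handle.

    policy       -- a data mask policy dict shaped like the REST payload
    column_types -- maps (catalog, schema, table, column) to a type name;
                    combinations missing from the map are skipped
    Raises KeyError for a mask type absent from MASK_COMPATIBILITY.
    """
    resources = policy["resources"]
    problems = []
    for item in policy["dataMaskPolicyItems"]:
        mask_type = item["dataMaskInfo"]["dataMaskType"]
        allowed = MASK_COMPATIBILITY[mask_type]
        if allowed is ANY_TYPE:
            continue
        # A multi-value policy covers every combination of its values.
        for key in product(*(resources[level]["values"] for level in LEVELS)):
            data_type = column_types.get(key)
            if data_type is not None and data_type not in allowed:
                problems.append((".".join(key), data_type, mask_type))
    return problems
```

Running this against a policy before submitting it would flag, for instance, a 
numeric key column that was swept into a MASK_SHOW_LAST_4 policy.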

That said, the policy UI can be configured to allow multiple values in data 
mask policies by updating the Trino service-def, which you seem to have 
already done.
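
For anyone else following the thread, a rough Python sketch of such a 
service-def change is below. It assumes the single-value restriction is 
expressed as a "singleValue" flag inside the JSON-encoded "uiHint" string of 
each resource under "dataMaskDef"; please verify this against your own 
service-def before relying on it. The function name is hypothetical.

```python
import copy
import json


def allow_multi_value(service_def, resource_names):
    """Return a copy of service_def with the single-value UI hint removed.

    Assumption: each entry in dataMaskDef.resources may carry a "uiHint"
    string holding JSON such as '{ "singleValue":true }'.
    Only resources whose name is in resource_names are changed.
    """
    updated = copy.deepcopy(service_def)
    for resource in updated.get("dataMaskDef", {}).get("resources", []):
        if resource.get("name") in resource_names:
            hint = json.loads(resource.get("uiHint") or "{}")
            hint.pop("singleValue", None)  # drop the UI-only restriction
            resource["uiHint"] = json.dumps(hint)
    return updated
```

The modified definition would then be uploaded through whichever service-def 
update mechanism you already use.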

> While testing, I noticed that Ranger did not throw any exception in the case 
> of overlapping policies.

When multiple masking policies match a column, Ranger picks the first 
matching policy, with policies sorted by policy name.
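
As a simplified model of that rule (not the actual policy engine code), the 
selection can be sketched in Python as follows. It assumes exact-value 
matching and plain string ordering of names, and ignores wildcards, excludes, 
users/groups and any other engine behaviour not mentioned in this reply.

```python
LEVELS = ("catalog", "schema", "table", "column")


def matches(policy, catalog, schema, table, column):
    """True if every resource level of the policy lists the given value."""
    wanted = dict(zip(LEVELS, (catalog, schema, table, column)))
    resources = policy["resources"]
    return all(wanted[level] in resources[level]["values"] for level in LEVELS)


def pick_mask_policy(policies, catalog, schema, table, column):
    """Return the matching policy that sorts first by name, or None.

    Assumption: names are compared as plain strings, so "mask 10"
    sorts before "mask 5".
    """
    candidates = [p for p in policies if matches(p, catalog, schema, table, column)]
    return min(candidates, key=lambda p: p["name"], default=None)
```

With the two overlapping policies from your test, both match region.name and 
the one whose name sorts first supplies the mask, so the outcome is 
deterministic even though no error is raised at creation time.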

Hope this helps.

Madhan


On 9/4/25, 5:04 AM, "Vikash Kumar" <[email protected]> wrote:


Hi,


I also observed that through the REST API, Apache Ranger allows specifying
multiple values for catalog, schema, table, and column resources in data
masking policies.


{
  "service": "trino",
  "name": "mask 5",
  "policyType": 1,
  "policyPriority": 0,
  "isAuditEnabled": true,
  "resources": {
    "catalog": {
      "values": [
        "tpch",
        "tpch2"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "schema": {
      "values": [
        "sf1",
        "sf100"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "table": {
      "values": [
        "region",
        "nation"
      ],
      "isExcludes": false,
      "isRecursive": false
    },
    "column": {
      "values": [
        "name",
        "comment",
        "regionkey"
      ],
      "isExcludes": false,
      "isRecursive": false
    }
  },
  "dataMaskPolicyItems": [
    {
      "accesses": [
        {
          "type": "select",
          "isAllowed": true
        }
      ],
      "users": [
        "admin"
      ],
      "groups": [
        "public"
      ],
      "delegateAdmin": false,
      "dataMaskInfo": {
        "dataMaskType": "MASK_SHOW_LAST_4"
      }
    }
  ],
  "serviceType": "trino",
  "isDenyAllElse": false
}


API:


curl -u admin:admin -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Basic YWRtaW46QWRtaW4xMjM=" \
-d @policy1.json \
http://localhost:33032/service/public/v2/api/policy


Thanks,
Vikash


On Thu, 4 Sept 2025 at 13:19, Vikash Kumar <[email protected]> wrote:


> Hi,
>
> I wanted to check on the compatibility and recommended usage of
> multi-value resources in Ranger Data Mask policies, specifically for the
> table and column definitions.
>
> Context (Trino)
>
> - In the Ranger Admin UI, when creating a masking policy, we can only
>   select one catalog, schema, table, and column at a time (
>   https://github.com/apache/ranger/blob/82082f1ac6abe9a1f5d3c6974ce57110d888ab72/agents-common/src/main/resources/service-defs/ranger-servicedef-trino.json#L503
>   ).
> - This creates a challenge: if we need to mask 10–20 fields across
>   multiple tables, we end up having to create a very large number of
>   policies.
> - If both table and column are configured as multiValue in the service
>   definition, a single policy can cover multiple tables and multiple
>   columns.
> - This means the Ranger plugin will effectively expand all table–column
>   combinations (e.g., if tables = [orders, customers] and columns =
>   [email, phone], masking applies to orders.email, orders.phone,
>   customers.email, customers.phone).
>
> We are exploring making table and column resources support multi-value in
> the service definition.
>
> - *Option 1:* Allow multi-value for columns only (table remains
>   single-value).
>   - Simpler, fewer conflicts, easier auditing.
>   - Still many policies if masking the same column across multiple
>     tables.
> - *Option 2:* Allow multi-value for both tables and columns.
>   - One policy can cover multiple tables and multiple columns.
>   - Possible backward compatibility issues, overlapping policy
>     conflicts, harder audit/debugging, and slight performance overhead
>     in Trino.
>
> Observation from Testing
>
> I tried modifying the service definition to allow columns as multi-value.
> While testing, I noticed that Ranger did not throw any exception in the
> case of overlapping policies.
>
> - Example:
>   - Policy 1 → region.name
>   - Policy 2 → region.name, region.key
> - Even though region.name is repeated, Ranger allowed both policies to
>   be created.
>
> This could potentially lead to conflicts or ambiguous behavior in Trino
> when deciding which mask is applied.
>
> Request
>
> Could you confirm:
>
> - Do we have full compatibility in Ranger and Trino when configuring
>   both table and column as multiValue?
> - Are there any known limitations, best practices, or risks
>   (particularly around policy evaluation and query execution in Trino)?
> - What is your recommendation for making columns and tables
>   multiValue?
> - Would you recommend using multiValue for tables in addition to
>   columns, or should tables remain singleValue?
>
> Thanks,
> Vikash
>



