bett-it opened a new issue, #17610:
URL: https://github.com/apache/druid/issues/17610

   ### **Motivation** 
   
   Currently, Apache Druid doesn’t support restricting access to lookup
tables. This is problematic when Druid is used in a multitenant
environment: any user who can query may read every lookup, which can expose
sensitive information and violate companies' internal policies. Our motivation
is to enable Druid users to set permissions for lookup tables.
   
    
   
   ### **Current findings & proposals**
   
   The main entry point for query handling is the `QueryLifecycle` class. 
Authorization is handled in the method `public Access 
authorize(HttpServletRequest req)` where permissions are modeled as a set of 
`ResourceAction` objects. 
   
   The method `authorize()` generates the resource actions for all tables that 
a query refers to in the following lines: 
   
   ```
   Iterables.transform( 
     baseQuery.getDataSource().getTableNames(), 
     AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR
   )
   ```
   
   Finally, the `DataSource` interface specifies what qualifies as a table
name. The Javadoc on `getTableNames()` clearly states that lookups are not
included in the list of table names that a query generates: "Returns the names
of all table datasources involved in this query. Does not include names for
non-tables, like lookups or inline datasources."
   
   However, in the `@JsonSubTypes` declarations, a `LookupDataSource` is
listed. When we checked the `LookupDataSource` class, which would be
instantiated for queries like `SELECT * FROM lookup.mylookup`, we found that it
returns an empty set of table names:
   
   ```
   public Set<String> getTableNames() { 
     return Collections.emptySet();
   }
   ```
   
   So currently, neither inline use of `LOOKUP()` calls nor querying the lookup 
tables directly can be secured in Druid. 
   
   Would modifying the `LookupDataSource` class to return the injected table 
name via `getTableNames()` be sufficient to enforce restrictions on lookup 
tables and treat them as queryable data sources?
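
   To make the question concrete, here is a minimal, self-contained sketch of the before and after behavior. It does not use Druid's real classes: the `DataSource` interface below is a simplified stand-in, and `CurrentLookupDataSource`, `ProposedLookupDataSource`, and `readActionsFor` are hypothetical names that only mimic how `authorize()` turns table names into READ resource actions.

   ```java
   import java.util.Collections;
   import java.util.LinkedHashSet;
   import java.util.Set;

   public class LookupAuthSketch {
     // Simplified stand-in for Druid's DataSource (assumption: only this method matters here).
     interface DataSource {
       Set<String> getTableNames();
     }

     // Mirrors the behavior quoted above: a lookup reports no table names.
     static final class CurrentLookupDataSource implements DataSource {
       private final String lookupName;

       CurrentLookupDataSource(String lookupName) {
         this.lookupName = lookupName;
       }

       @Override
       public Set<String> getTableNames() {
         return Collections.emptySet();
       }
     }

     // Hypothetical variant of the proposal: the lookup reports its own name.
     static final class ProposedLookupDataSource implements DataSource {
       private final String lookupName;

       ProposedLookupDataSource(String lookupName) {
         this.lookupName = lookupName;
       }

       @Override
       public Set<String> getTableNames() {
         return Collections.singleton(lookupName);
       }
     }

     // Hypothetical stand-in for the Iterables.transform(...) step in authorize():
     // one READ resource action per reported table name.
     static Set<String> readActionsFor(DataSource dataSource) {
       Set<String> actions = new LinkedHashSet<>();
       for (String name : dataSource.getTableNames()) {
         actions.add("DATASOURCE:" + name + ":READ");
       }
       return actions;
     }

     public static void main(String[] args) {
       // No names reported, so there is nothing for the authorizer to check.
       System.out.println(readActionsFor(new CurrentLookupDataSource("mylookup")));
       // The lookup name now produces a resource action that can be denied.
       System.out.println(readActionsFor(new ProposedLookupDataSource("mylookup")));
     }
   }
   ```

   In this sketch the lookup `mylookup` produces the same resource action as a table named `mylookup` would, so one consequence worth weighing is whether lookups and tables should share a permission namespace or whether lookups need their own resource type.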
   
   
   ### **To be discussed** 
   
   Is there a compelling reason for still excluding lookups from the access
checks? Wouldn’t it be easy to cover all of them with a single rule that
permits access, since all lookups live in the same schema (`lookup.*`)?
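
   As an illustration of the single-rule idea, the sketch below assumes that permissions are matched by a regular expression over the resource name and that lookups would be reported with a common `lookup.` prefix. Both are assumptions made for this example, and the `Permission` class is hypothetical, not Druid's API.

   ```java
   import java.util.regex.Pattern;

   public class LookupRuleSketch {
     // Hypothetical permission: a regex over resource names plus a single action.
     static final class Permission {
       private final Pattern namePattern;
       private final String action;

       Permission(String nameRegex, String action) {
         this.namePattern = Pattern.compile(nameRegex);
         this.action = action;
       }

       // Allows the request only if the action matches and the whole name matches the regex.
       boolean allows(String resourceName, String requestedAction) {
         return action.equals(requestedAction) && namePattern.matcher(resourceName).matches();
       }
     }

     public static void main(String[] args) {
       // One rule that covers every lookup, assuming the hypothetical "lookup." prefix.
       Permission allLookups = new Permission("lookup\\..*", "READ");

       System.out.println(allLookups.allows("lookup.mylookup", "READ"));  // true
       System.out.println(allLookups.allows("wikipedia", "READ"));        // false
       System.out.println(allLookups.allows("lookup.mylookup", "WRITE")); // false
     }
   }
   ```

   With a rule of this shape, existing deployments could keep today's open behavior by granting the catch-all permission, while multitenant deployments could grant narrower patterns per tenant.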


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
