Can you comment on what isn't working with MapR in this scenario? I'm familiar 
with impersonation tickets and constrained impersonation.

That said, I do agree that a general purpose feature in Drill that allows one 
to constrain who can issue queries seems useful.
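
As a rough illustration only, the kind of connect-time check discussed in this thread could look like the sketch below. It is written in Python for brevity; `ConnectAcl` and `permits` are hypothetical names, not existing Drill classes or options, and the sketch assumes the check runs after authentication has already succeeded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectAcl:
    """Hypothetical allow-list of users and groups permitted to open a session."""
    allowed_users: frozenset = field(default_factory=frozenset)
    allowed_groups: frozenset = field(default_factory=frozenset)

    def permits(self, user: str, groups: set) -> bool:
        # Allow the session if the user is listed by name
        # or belongs to at least one listed group.
        return user in self.allowed_users or bool(self.allowed_groups & set(groups))


# Assumed example: cluster A is launched by user X, so only X and
# members of the (made-up) group "tenant_a" may connect to it.
acl_a = ConnectAcl(frozenset({"X"}), frozenset({"tenant_a"}))
```

With this, user X is admitted by name, a user in group "tenant_a" is admitted by group, and an authenticated user Y from another tenant is rejected before any query runs.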

Keys
_______________________________
Keys Botzum 
MapR Technologies 
http://www.mapr.com

> On Aug 21, 2018, at 3:47 AM, Joel Pfaff <[email protected]> wrote:
> 
> Hello,
> 
> "Unfortunately I have not used the setup described above but from
> the explanation it looks like the impersonation tickets will be used by the
> Drillbits on Tenant A to restrict MapR platform access to a limited set of
> Drillbit-authenticated users. With this, a user in Tenant B will not be
> able to execute queries on Tenant A even though they can be authenticated
> successfully by the Drillbit in Tenant A. This way the authorization check
> is done at the data layer."
> 
> Unfortunately, the tests we have done so far do not confirm this expected
> behavior.
> That's why Alex opened a ticket for an authorization framework:
> https://issues.apache.org/jira/browse/DRILL-6699
> 
> We have also opened a ticket with MapR to clarify the expected behavior of
> impersonation tickets with group restrictions.
> 
> Regards, Joel
> 
> On Sun, Aug 19, 2018 at 9:21 PM Oleksandr Kalinin <[email protected]>
> wrote:
> 
>> Hi Sorabh,
>> 
>> In the case of Hive, the user connects to the Hive server. Launching a query
>> launches a YARN application: each query is a YARN application. To make sure
>> that the query uses only YARN cluster resources the launching user is
>> authorized to use, YARN authorization kicks in (e.g. YARN queue ACLs), a
>> mechanism a bit similar to the one proposed in this thread. Once the
>> application is running, impersonation and data (FS) level authorization do
>> the rest of the job, as you say; that is indeed the key.
>> 
>> We use the same authorization model for Spark - to run Spark job, user must
>> launch it as YARN application on specific YARN resource protected by YARN
>> authorization, with impersonation and FS level authorization following once
>> the job is running.
>> 
>> In the case of Drill on YARN, the user connects to a Drill cluster which is
>> *already* running as a YARN application. Thus, by exposing that Drill
>> cluster to any user in the entire YARN cluster, we expose YARN resources
>> that users might not be authorized to use. That is the main issue we are
>> trying to solve.
>> 
>> Hope this makes it clearer.
>> 
>> Best Regards,
>> Alex
>> 
>> 
>> On Fri, Aug 17, 2018 at 11:57 PM, Sorabh Hamirwasia <[email protected]>
>> wrote:
>> 
>>> Hi Joel/Alex,
>>> Thanks for explaining the use case with multi tenant cluster.
>>> 
>>> @Joel
>>> Unfortunately I have not used the setup described above but from the
>>> explanation it looks like the impersonation tickets will be used by the
>>> Drillbits on Tenant A to restrict MapR platform access to a limited set of
>>> Drillbit-authenticated users. With this, a user in Tenant B will not be
>>> able to execute queries on Tenant A even though they can be authenticated
>>> successfully by the Drillbit in Tenant A. This way the authorization check
>>> is done at the data layer.
>>> 
>>> @Alex,
>>> Adding an authorization check for a valid authenticated cluster user
>>> shouldn't be a big change. Based on a configured set of users/groups, a
>>> subset of cluster users can be allowed to connect. But can you please point
>>> to how other services do these authorization checks when running in a
>>> multi-tenant environment? Based on my understanding, all these
>>> authorization checks in Hadoop systems are done at the data layer, or they
>>> have a separate security service which does these checks along with other
>>> security checks for authentication, etc.
>>> 
>>> Also please feel free to open a JIRA ticket with details.
>>> 
>>> Thanks,
>>> Sorabh
>>> 
>>> 
>>> 
>>> On Fri, Aug 17, 2018 at 11:21 AM, Oleksandr Kalinin <[email protected]>
>>> wrote:
>>> 
>>>> Hi Sorabh,
>>>> 
>>>> Thanks for your comments. Joel described in detail our current thinking on
>>>> how to overcome the issue. We are not yet 100% sure if it will actually
>>>> work though.
>>>> 
>>>> Actually I raised this topic in this mailing list because I think it's not
>>>> only specific to our setup. It's more about having a nice "Drill on YARN"
>>>> feature with very limited (frankly, no) access control, which almost makes
>>>> the feature unusable in environments where it is attractive: multi-tenant
>>>> secure clusters. Supported security mechanisms are good for authentication,
>>>> but using them for authorization seems suboptimal. Typically, YARN clusters
>>>> run in a single Kerberos realm, and the need to introduce multiple realms
>>>> and separate identities for the Drill service is not at all convenient (I
>>>> am pretty sure that in many environments like ours it is a no-go). And how
>>>> about use cases with no Kerberos setup? If we can work around access
>>>> control with MapR-specific security tickets as described by Joel, good for
>>>> us, but what about other environments?
>>>> 
>>>> So the question is more whether it makes sense to consider introducing a
>>>> user authorization feature. This thread refers only to session
>>>> authorization to complement the YARN feature, but it could be extended of
>>>> course, e.g. in a similar way to how Drill already supports multiple
>>>> authentication mechanisms.
>>>> 
>>>> Thanks & Best Regards,
>>>> Alex
>>>> 
>>>> On Wed, Aug 15, 2018 at 11:30 PM, Sorabh Hamirwasia <[email protected]>
>>>> wrote:
>>>> 
>>>>> Hi Oleksandr,
>>>>> Drill doesn't do any user management itself; instead it relies on the
>>>>> corresponding security mechanisms in use to do it. It uses the SASL
>>>>> framework to allow different pluggable security mechanisms. So it should
>>>>> be up to the security mechanism in use to do the authorization-level
>>>>> checks. For example, in your use case, if you want to allow only certain
>>>>> sets of users to connect to a cluster, then you can choose to use Kerberos
>>>>> with each cluster running in a different realm. This will ensure that
>>>>> client users in a given realm can only connect to the cluster running in
>>>>> that realm.
>>>>> 
>>>>> For the impersonation issue, I think it's a configuration issue, and the
>>>>> behavior where all queries, whether from user X or Y, are executed as
>>>>> admin users is expected.
>>>>> 
>>>>> Thanks,
>>>>> Sorabh
>>>>> 
>>>>> On Mon, Aug 13, 2018 at 9:02 AM, Oleksandr Kalinin <[email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Hello Drill community,
>>>>>> 
>>>>>> In multi-tenant YARN clusters, running multiple Drill-on-YARN clusters
>>>>>> seems an attractive feature, as it enables leveraging YARN mechanisms of
>>>>>> resource management and isolation. However, there seems to be a simple
>>>>>> access restriction issue. Assume:
>>>>>> 
>>>>>> - Cluster A launched by user X
>>>>>> - Cluster B launched by user Y
>>>>>> 
>>>>>> Both users X and Y will be able to connect and run queries against
>>>>>> clusters A and B (in fact, that applies to any positively authenticated
>>>>>> user, not only X and Y), whereas we obviously would like to ensure
>>>>>> exclusive usage of clusters by their owners, who are the owners of the
>>>>>> respective YARN resources. If users X and Y are non-privileged DFS users
>>>>>> and impersonation is not enabled, then user X has access to data on
>>>>>> behalf of user Y and vice versa, which is an additional potential
>>>>>> security issue.
>>>>>> 
>>>>>> I was looking for possibilities to control connect authorization, but
>>>>>> couldn't find anything related yet. Am I missing something? Are there
>>>>>> any other considerations, or was this point perhaps already discussed
>>>>>> before?
>>>>>> 
>>>>>> It could be possible to tweak the PAM setup to trigger authentication
>>>>>> failure for "undesired" users, but that looks like overkill in terms of
>>>>>> complexity.
>>>>>> 
>>>>>> From a user perspective, a basic ACL configuration with users and groups
>>>>>> authorized to connect to a Drillbit would already be sufficient IMO. Or a
>>>>>> configuration switch to ensure that only the owner user is authorized to
>>>>>> connect.
>>>>>> 
>>>>>> Best Regards,
>>>>>> Alex
>>>>>> 
