Hi Keys,

Assume we want to:
- Run a Drill cluster on YARN as user 'foo' (UID = N)
- Authorize all users in group 'bar' (GID = K) to run Drill queries on
that cluster with impersonation enabled
- Allow all other users to connect to the cluster, but have their
queries fail with an impersonation failure

We expected (wrongly?) that launching the Drill cluster on YARN with the
following MapR ticket would be suitable:

$ maprlogin generateticket -type servicewithimpersonation -user foo \
    -out foo.ticket -duration x:0:0 -impersonateduids N -impersonatedgids K

However, we seem to have two issues:

1. When accessing the Drill cluster launched on YARN with the above ticket,
impersonation seems to work for users outside of the 'bar' group, even
though 'foo' is a non-privileged user(!)
- We are currently puzzled by this behavior and continue to dig into the
issue, hoping that something is wrong with our test.

2. When using the above ticket with another impersonating service, the
loopback NFS client, we observe that the service does not perform the
expected impersonation. It only works for user 'foo'. Any other user using
the service gets an FS permission denied error. This is the issue I have
already raised with MapR.
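
As a minimal sketch of the behavior we expected from the ticket (not of what
MapR actually implements), the check could be written down as below. The
User class, the impersonation_allowed function and the concrete values
standing in for N and K are all hypothetical and only illustrate the
intended semantics: impersonation succeeds for UID N or for any member of
group K, and fails for everyone else.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """A cluster user; all values used below are hypothetical."""
    name: str
    uid: int
    gids: frozenset = field(default_factory=frozenset)


def impersonation_allowed(user, allowed_uids, allowed_gids):
    """Return True if the ticket holder should be allowed to impersonate `user`.

    Models the *expected* semantics only: a user qualifies by UID or by
    membership in at least one of the listed groups.
    """
    return user.uid in allowed_uids or bool(user.gids & allowed_gids)


# Hypothetical stand-ins for N (UID of 'foo') and K (GID of 'bar').
N, K = 1000, 2000
foo = User("foo", N, frozenset({3000}))
alice = User("alice", 1001, frozenset({K}))   # member of 'bar'
bob = User("bob", 1002, frozenset({3000}))    # not a member of 'bar'

for u in (foo, alice, bob):
    print(u.name, impersonation_allowed(u, {N}, {K}))
```

Under this model 'bob' would be rejected, which is the opposite of what we
observe in issue 1, and 'alice' would be accepted, which is the opposite of
what we observe in issue 2.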

Thanks,
Best Regards,
Alex

On Tue, Aug 21, 2018 at 6:24 PM Keys Botzum <[email protected]> wrote:

> Can you comment on what isn't working with MapR in this scenario? I'm
> familiar with impersonation tickets and constrained impersonation.
>
> That said, I do agree that a general purpose feature in Drill that allows
> one to constrain who can issue queries seems useful.
>
> Keys
> _______________________________
> Keys Botzum
> MapR Technologies
> http://www.mapr.com
>
> > On Aug 21, 2018, at 3:47 AM, Joel Pfaff <[email protected]> wrote:
> >
> > Hello,
> >
> > "Unfortunately I have not used the setup described above but from
> > explanation looks like the impersonation tickets will be used by
> > Drillbit's on Tenant A to restrict the MapR platform access by a limited
> > set of Drillbit authenticated user. Using this any user in Tenant B will
> > not be able to execute query on Tenant A even though it can be
> > authenticated successfully by the Drillbit in Tenant A. This way
> > authorization check is done at data layer."
> >
> > Unfortunately, the tests we have done so far do not confirm this expected
> > behavior.
> > That's why Alex opened a ticket for an Authorization framework :
> >
> > https://issues.apache.org/jira/browse/DRILL-6699
> >
> > We have also opened a ticket with MapR to clarify the expected behavior
> > of impersonation tickets with group restrictions.
> >
> > Regards, Joel
> >
> > On Sun, Aug 19, 2018 at 9:21 PM Oleksandr Kalinin <[email protected]>
> > wrote:
> >
> >> Hi Sorabh,
> >>
> >> In the case of Hive, the user connects to the Hive server. Launching a
> >> query launches a YARN application: each query is a YARN application. To
> >> make sure that the query uses only YARN cluster resources that the
> >> launching user is authorized to use, YARN authorization kicks in (e.g.
> >> YARN queue ACLs), a mechanism somewhat similar to the one proposed in
> >> this thread. Once the application is running, impersonation and data
> >> (FS) level authorization do the rest of the job, as you say. That is
> >> indeed the key.
> >>
> >> We use the same authorization model for Spark: to run a Spark job, the
> >> user must launch it as a YARN application on a specific YARN resource
> >> protected by YARN authorization, with impersonation and FS level
> >> authorization following once the job is running.
> >>
> >> In the case of Drill on YARN, the user connects to a Drill cluster which
> >> is *already* running as a YARN application. Thus, by exposing that Drill
> >> cluster to any user in the entire YARN cluster, we expose YARN resources
> >> that users might not be authorized to use. That is the main issue we are
> >> trying to solve.
> >>
> >> Hope this makes it clearer.
> >>
> >> Best Regards,
> >> Alex
> >>
> >>
> >> On Fri, Aug 17, 2018 at 11:57 PM, Sorabh Hamirwasia <[email protected]>
> >> wrote:
> >>
> >>> Hi Joel/Alex,
> >>> Thanks for explaining the use case with multi tenant cluster.
> >>>
> >>> @Joel
> >>> Unfortunately I have not used the setup described above but from
> >>> explanation looks like the impersonation tickets will be used by
> >>> Drillbit's on Tenant A to restrict the MapR platform access by a limited
> >>> set of Drillbit authenticated user. Using this any user in Tenant B will
> >>> not be able to execute query on Tenant A even though it can be
> >>> authenticated successfully by the Drillbit in Tenant A. This way
> >>> authorization check is done at data layer.
> >>>
> >>> @Alex,
> >>> Adding an authorization check for a valid authenticated cluster user
> >>> shouldn't be a big change. Based on a configured set of users/groups, a
> >>> subset of cluster users can be allowed to connect. But can you please
> >>> point to how other services do these authorization checks when running
> >>> in a multi-tenant environment? Based on my understanding, all these
> >>> authorization checks in a Hadoop system are done at the data layer, or
> >>> there is a separate security service which does these checks along with
> >>> other security checks for authentication, etc.
> >>>
> >>> Also please feel free to open a JIRA ticket with details.
> >>>
> >>> Thanks,
> >>> Sorabh
> >>>
> >>>
> >>>
> >>> On Fri, Aug 17, 2018 at 11:21 AM, Oleksandr Kalinin <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi Sorabh,
> >>>>
> >>>> Thanks for your comments. Joel described in detail our current thinking
> >>>> on how to overcome the issue. We are not yet 100% sure if it will
> >>>> actually work though.
> >>>>
> >>>> Actually I raised this topic on this mailing list because I think it's
> >>>> not specific to our setup only. It's more about having a nice "Drill on
> >>>> YARN" feature with very limited (frankly, no) access control, which
> >>>> almost makes the feature unusable in environments where it is
> >>>> attractive: multi-tenant secure clusters. Supported security mechanisms
> >>>> are good for authentication, but using them for authorization seems
> >>>> suboptimal. Typically, YARN clusters run in a single Kerberos realm, and
> >>>> the need to introduce multiple realms and separate identities for the
> >>>> Drill service is not at all convenient (I am pretty sure that in many
> >>>> environments like ours it is a no-go). And how about use cases with no
> >>>> Kerberos setup? If we can work around access control with MapR-specific
> >>>> security tickets as described by Joel, good for us, but what about other
> >>>> environments?
> >>>>
> >>>> So the question is more whether it makes sense to consider introducing
> >>>> a user authorization feature. This thread refers only to session
> >>>> authorization to complement the YARN feature, but it could of course be
> >>>> extensible, e.g. in a similar way to how Drill already supports multiple
> >>>> authentication mechanisms.
> >>>>
> >>>> Thanks & Best Regards,
> >>>> Alex
> >>>>
> >>>> On Wed, Aug 15, 2018 at 11:30 PM, Sorabh Hamirwasia <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi Oleksandr,
> >>>>> Drill doesn't do any user management itself; instead it relies on the
> >>>>> corresponding security mechanisms in use to do it. It uses the SASL
> >>>>> framework to allow using different pluggable security mechanisms. So it
> >>>>> should be up to the security mechanism in use to do the authorization
> >>>>> level checks. For example, in your use case, if you want to allow only
> >>>>> certain sets of users to connect to a cluster, then you can choose to
> >>>>> use Kerberos with each cluster running in a different realm. This will
> >>>>> ensure that client users in a given realm can only connect to the
> >>>>> cluster running in that realm.
> >>>>>
> >>>>> For the impersonation issue, I think it's a configuration issue and the
> >>>>> behavior is expected, where all queries, whether from user A or B, are
> >>>>> executed as the admin user.
> >>>>>
> >>>>> Thanks,
> >>>>> Sorabh
> >>>>>
> >>>>> On Mon, Aug 13, 2018 at 9:02 AM, Oleksandr Kalinin <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello Drill community,
> >>>>>>
> >>>>>> In multi-tenant YARN clusters, running multiple Drill-on-YARN clusters
> >>>>>> seems an attractive feature, as it enables leveraging YARN mechanisms
> >>>>>> of resource management and isolation. However, there seems to be a
> >>>>>> simple access restriction issue. Assume:
> >>>>>>
> >>>>>> - Cluster A launched by user X
> >>>>>> - Cluster B launched by user Y
> >>>>>>
> >>>>>> Both users X and Y will be able to connect and run queries against
> >>>>>> clusters A and B (in fact, that applies to any positively
> >>>>>> authenticated user, not only X and Y). However, we obviously would
> >>>>>> like to ensure exclusive usage of clusters by their owners, who are
> >>>>>> the owners of the respective YARN resources. If users X and Y are
> >>>>>> non-privileged DFS users and impersonation is not enabled, then user X
> >>>>>> has access to data on behalf of user Y and vice versa, which is an
> >>>>>> additional potential security issue.
> >>>>>>
> >>>>>> I was looking for possibilities to control connect authorization, but
> >>>>>> couldn't find anything related yet. Am I missing something? Are there
> >>>>>> any other considerations, or was this point perhaps already discussed
> >>>>>> before?
> >>>>>>
> >>>>>> It could be possible to tweak the PAM setup to trigger an
> >>>>>> authentication failure for "undesired" users, but that looks like
> >>>>>> overkill in terms of complexity.
> >>>>>>
> >>>>>> From the user perspective, a basic ACL configuration with users and
> >>>>>> groups authorized to connect to a Drillbit would already be sufficient
> >>>>>> IMO. Or a configuration switch to ensure that only the owner user is
> >>>>>> authorized to connect.
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Alex
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
