Hello,

"Unfortunately I have not used the setup described above but from
the explanation it looks like the impersonation tickets will be used by the
Drillbits on Tenant A to restrict MapR platform access to a limited set of
Drillbit-authenticated users. With this, a user in Tenant B will not be
able to execute queries on Tenant A even though they can be authenticated
successfully by the Drillbit in Tenant A. This way the authorization check
is done at the data layer."

Unfortunately, the tests we have done so far do not confirm this expected
behavior.
That's why Alex opened a ticket for an authorization framework:
https://issues.apache.org/jira/browse/DRILL-6699
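
To make the request concrete, the connect-time check discussed in this
thread (an allow-list of users and groups consulted after authentication
succeeds) could look roughly like the sketch below. It is only an
illustration in Python: `ConnectAcl` and `open_session` are hypothetical
names, not Drill APIs or options, and the sketch does not describe what the
ticket will implement.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectAcl:
    """Hypothetical allow-list of users and groups permitted to open a session."""

    users: frozenset = field(default_factory=frozenset)
    groups: frozenset = field(default_factory=frozenset)

    def is_authorized(self, user, user_groups):
        # A user passes if named directly or if any of their groups is listed.
        return user in self.users or not self.groups.isdisjoint(user_groups)


def open_session(acl, user, user_groups, authenticated):
    """Authentication is assumed to be done elsewhere; this adds the authorization step."""
    if not authenticated:
        raise PermissionError("authentication failed")
    if not acl.is_authorized(user, user_groups):
        raise PermissionError(f"user {user!r} is not authorized to connect")
    # Placeholder return value standing in for a real session object.
    return f"session:{user}"
```

With an ACL listing only the cluster owner (and optionally a tenant group),
an authenticated user from another tenant is rejected before any query runs,
independently of impersonation or data-layer checks.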

We have also opened a ticket with MapR to clarify the expected behavior of
impersonation tickets with group restrictions.

Regards, Joel

On Sun, Aug 19, 2018 at 9:21 PM Oleksandr Kalinin <alexk...@gmail.com>
wrote:

> Hi Sorabh,
>
> In the case of Hive, the user connects to the Hive server. Launching a query
> launches a YARN application - each query is a YARN application. To make sure
> that the query only uses YARN cluster resources the launching user is
> authorized to use, YARN authorization kicks in - e.g. YARN queue ACLs, a
> mechanism somewhat similar to the one proposed in this thread. Once the
> application is running, impersonation and data (FS) level authorization do
> the rest of the job, as you say - that is indeed the key.
>
> We use the same authorization model for Spark - to run a Spark job, the user
> must launch it as a YARN application on a specific YARN resource protected by
> YARN authorization, with impersonation and FS level authorization following
> once the job is running.
>
> In the case of Drill on YARN, the user connects to a Drill cluster which is
> *already* running as a YARN application. Thus, by exposing that Drill cluster
> to any user in the entire YARN cluster, we expose YARN resources that users
> might not be authorized to use. That is the main issue we are trying to solve.
>
> Hope this makes it clearer.
>
> Best Regards,
> Alex
>
>
> On Fri, Aug 17, 2018 at 11:57 PM, Sorabh Hamirwasia <shamirwa...@mapr.com>
> wrote:
>
> > Hi Joel/Alex,
> > Thanks for explaining the use case with a multi-tenant cluster.
> >
> > @Joel
> > Unfortunately I have not used the setup described above, but from the
> > explanation it looks like the impersonation tickets will be used by the
> > Drillbits on Tenant A to restrict MapR platform access to a limited set of
> > Drillbit-authenticated users. With this, a user in Tenant B will not be
> > able to execute queries on Tenant A even though they can be authenticated
> > successfully by the Drillbit in Tenant A. This way the authorization check
> > is done at the data layer.
> >
> > @Alex,
> > Adding an authorization check for a valid authenticated cluster user
> > shouldn't be a big change. Based on a configured set of users/groups, a
> > subset of cluster users can be allowed to connect. But can you please
> > point to how other services do these authorization checks when running in
> > a multi-tenant environment? Based on my understanding, all these
> > authorization checks in Hadoop systems are done at the data layer, or they
> > have a separate security service which does these checks along with other
> > security checks for authentication, etc.
> >
> > Also please feel free to open a JIRA ticket with details.
> >
> > Thanks,
> > Sorabh
> >
> >
> >
> > On Fri, Aug 17, 2018 at 11:21 AM, Oleksandr Kalinin <alexk...@gmail.com>
> > wrote:
> >
> > > Hi Sorabh,
> > >
> > > Thanks for your comments. Joel described in detail our current thinking
> > > on how to overcome the issue. We are not yet 100% sure if it will
> > > actually work though.
> > >
> > > Actually I raised this topic on this mailing list because I think it's
> > > not only specific to our setup. It's more about having a nice "Drill on
> > > YARN" feature with very limited (frankly, no) access control, which
> > > almost makes the feature unusable in the environments where it is
> > > attractive - multi-tenant secure clusters. The supported security
> > > mechanisms are good for authentication, but using them for authorization
> > > seems suboptimal. Typically, YARN clusters run in a single Kerberos
> > > realm, and the need to introduce multiple realms and separate identities
> > > for the Drill service is not at all convenient (I am pretty sure that in
> > > many environments like ours it is a no-go). And how about use cases with
> > > no Kerberos setup? If we can work around access control with
> > > MapR-specific security tickets as described by Joel - good for us, but
> > > what about other environments?
> > >
> > > So the question is more whether it makes sense to consider introducing
> > > a user authorization feature. This thread refers only to session
> > > authorization to complement the YARN feature, but it could of course be
> > > extensible, e.g. in a similar way to how Drill already supports multiple
> > > authentication mechanisms.
> > >
> > > Thanks & Best Regards,
> > > Alex
> > >
> > > On Wed, Aug 15, 2018 at 11:30 PM, Sorabh Hamirwasia <shamirwa...@mapr.com>
> > > wrote:
> > >
> > > > Hi Oleksandr,
> > > > Drill doesn't do any user management itself; instead it relies on the
> > > > corresponding security mechanisms in use to do it. It uses the SASL
> > > > framework to allow different pluggable security mechanisms. So it
> > > > should be up to the security mechanism in use to do the
> > > > authorization-level checks. For example, in your use case, if you want
> > > > to allow only certain sets of users to connect to a cluster, you can
> > > > choose to use Kerberos with each cluster running in a different realm.
> > > > This will ensure that client users in a given realm can only connect
> > > > to the cluster running in that realm.
> > > >
> > > > For the impersonation issue, I think it's a configuration issue, and
> > > > the behavior is expected where all queries, whether from user A or B,
> > > > are executed as admin users.
> > > >
> > > > Thanks,
> > > > Sorabh
> > > >
> > > > On Mon, Aug 13, 2018 at 9:02 AM, Oleksandr Kalinin <alexk...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Drill community,
> > > > >
> > > > > In multi-tenant YARN clusters, running multiple Drill-on-YARN
> > > > > clusters seems an attractive feature, as it enables leveraging YARN
> > > > > mechanisms of resource management and isolation. However, there
> > > > > seems to be a simple access restriction issue. Assume:
> > > > >
> > > > > - Cluster A launched by user X
> > > > > - Cluster B launched by user Y
> > > > >
> > > > > Both users X and Y will be able to connect and run queries against
> > > > > clusters A and B (in fact, that applies to any positively
> > > > > authenticated user, not only X and Y), whereas we obviously would
> > > > > like to ensure exclusive usage of clusters by their owners, who are
> > > > > the owners of the respective YARN resources. In case users X and Y
> > > > > are non-privileged DFS users and impersonation is not enabled, user
> > > > > X has access to data on behalf of user Y and vice versa, which is an
> > > > > additional potential security issue.
> > > > >
> > > > > I was looking for possibilities to control connect authorization,
> > > > > but couldn't find anything related yet. Am I missing something? Are
> > > > > there any other considerations? Perhaps this point was already
> > > > > discussed before?
> > > > >
> > > > > It could be possible to tweak the PAM setup to trigger an
> > > > > authentication failure for "undesired" users, but that looks like
> > > > > overkill in terms of complexity.
> > > > >
> > > > > From a user perspective, a basic ACL configuration with the users
> > > > > and groups authorized to connect to a Drillbit would already be
> > > > > sufficient IMO. Or a configuration switch to ensure that only the
> > > > > owner user is authorized to connect.
> > > > >
> > > > > Best Regards,
> > > > > Alex
> > > > >
> > > >
> > >
> >
>
