Can you comment on what isn't working with MapR in this scenario? I'm familiar with impersonation tickets and constrained impersonation.
That said, I do agree that a general purpose feature in Drill that allows one to constrain who can issue queries seems useful.

Keys
_______________________________
Keys Botzum
MapR Technologies
http://www.mapr.com

> On Aug 21, 2018, at 3:47 AM, Joel Pfaff <[email protected]> wrote:
>
> Hello,
>
> "Unfortunately I have not used the setup described above but from
> explanation looks like the impersonation tickets will be used by Drillbit's
> on Tenant A to restrict the MapR platform access by a limited set of
> Drillbit authenticated user. Using this any user in Tenant B will not be
> able to execute query on Tenant A even though it can be authenticated
> successfully by the Drillbit in Tenant A. This way authorization check is
> done at data layer."
>
> Unfortunately, the tests we have done so far do not confirm this expected
> behavior.
> That's why Alex opened a ticket for an Authorization framework :
> https://issues.apache.org/jira/browse/DRILL-6699
>
> We have also opened a ticket to MapR to clarify the expected behavior of
> impersonation tickets with group restrictions.
>
> Regards, Joel
>
> On Sun, Aug 19, 2018 at 9:21 PM Oleksandr Kalinin <[email protected]>
> wrote:
>
>> Hi Sorabh,
>>
>> In case of Hive, user connects to Hive server. Launching the query launches
>> YARN application - each query is YARN application. To make sure that query
>> uses YARN cluster resources launching user is authorized to use, YARN
>> authorization kicks in - e.g. YARN queue ACLs - mechanism a bit similar to
>> the one proposed in this thread. Once application is running, impersonation
>> and data (FS) level authorization do the rest of the job like you say -
>> that is indeed the key.
>>
>> We use the same authorization model for Spark - to run Spark job, user must
>> launch it as YARN application on specific YARN resource protected by YARN
>> authorization, with impersonation and FS level authorization following once
>> the job is running.
>>
>> In case of Drill on YARN, user connects to Drill cluster which is *already*
>> running as YARN application. Thus exposing that Drill cluster to any user
>> in the entire YARN cluster we expose YARN resources users might be not
>> authorized to use. That is main issue we are trying to solve.
>>
>> Hope this makes it clearer.
>>
>> Best Regards,
>> Alex
>>
>> On Fri, Aug 17, 2018 at 11:57 PM, Sorabh Hamirwasia <[email protected]>
>> wrote:
>>
>>> Hi Joel/Alex,
>>> Thanks for explaining the use case with multi tenant cluster.
>>>
>>> @Joel
>>> Unfortunately I have not used the setup described above but from
>>> explanation looks like the impersonation tickets will be used by Drillbit's
>>> on Tenant A to restrict the MapR platform access by a limited set of
>>> Drillbit authenticated user. Using this any user in Tenant B will not be
>>> able to execute query on Tenant A even though it can be authenticated
>>> successfully by the Drillbit in Tenant A. This way authorization check is
>>> done at data layer.
>>>
>>> @Alex,
>>> Adding an authorization check for a valid authenticated cluster user
>>> shouldn't be a big change. Based on a configured set's of users/group a
>>> subset of cluster user can be allowed to connect. But can you please point
>>> to how other services do these authorization checks when running in multi
>>> tenant environment ? Based on my understanding all these authorization
>>> check in Hadoop system are done at data layer or they have a separate
>>> security service which does these checks along with other security checks
>>> for authentication, etc.
>>>
>>> Also please feel free to open a JIRA ticket with details.
>>>
>>> Thanks,
>>> Sorabh
>>>
>>> On Fri, Aug 17, 2018 at 11:21 AM, Oleksandr Kalinin <[email protected]>
>>> wrote:
>>>
>>>> Hi Sorabh,
>>>>
>>>> Thanks for you comments. Joel described in detail our current thinking on
>>>> how to overcome the issue. We are not yet 100% sure if it will actually
>>>> work though.
>>>>
>>>> Actually I raised this topic in this mailing list because I think it's not
>>>> only specific to our setup. It's more about having nice "Drill on YARN"
>>>> feature with very limited (frankly, no) access control which almost makes
>>>> the feature unusable in environments where it is attractive - multi tenant
>>>> secure clusters. Supported security mechanisms are good for authentication,
>>>> but using them for authorization seems suboptimal. Typically, YARN clusters
>>>> run in single Kerberos realm and the need to introduce multiple realms and
>>>> separate identities for Drill service is not at all convenient (I am pretty
>>>> sure that in many environments like ours it is a no go). And how about use
>>>> cases with no Kerberos setup? If we can workaround access control by
>>>> MapR-specific security tickets like described by Joel - good for us, but
>>>> what about other environments?
>>>>
>>>> So the question is more whether it make sense to consider introducing user
>>>> authorization feature. This thread refers only to session authorization to
>>>> complement YARN feature, but it could be extendable of course, e.g. in
>>>> similar ways like Drill already supports multiple authentication
>>>> mechanisms.
>>>>
>>>> Thanks & Best Regards,
>>>> Alex
>>>>
>>>> On Wed, Aug 15, 2018 at 11:30 PM, Sorabh Hamirwasia <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Oleksandr,
>>>>> Drill doesn't do any user management in itself, instead relies on the
>>>>> corresponding security mechanisms in use to do it. It uses SASL framework
>>>>> to allow using different pluggable security mechanisms. So it should be
>>>>> upon the security mechanism in use to do the authorization level checks.
>>>>> For example in your use case if you want to allow only certain set's of
>>>>> users to connect to a cluster then you can choose to use Kerberos with each
>>>>> cluster running in different realms. This will ensure client users running
>>>>> in corresponding realm can only connect to cluster running in that realm.
>>>>>
>>>>> For the impersonation issue I think it's a configuration issue and the
>>>>> behavior is expected where all queries whether from user A or B are
>>>>> executed as admin users.
>>>>>
>>>>> Thanks,
>>>>> Sorabh
>>>>>
>>>>> On Mon, Aug 13, 2018 at 9:02 AM, Oleksandr Kalinin <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello Drill community,
>>>>>>
>>>>>> In multi-tenant YARN clusters, running multiple Drill-on-YARN clusters
>>>>>> seems as attractive feature as it enables leveraging on YARN mechanisms of
>>>>>> resource management and isolation. However, there seems to be simple access
>>>>>> restriction issue. Assume :
>>>>>>
>>>>>> - Cluster A launched by user X
>>>>>> - Cluster B launched by user Y
>>>>>>
>>>>>> Both users X and Y will be able to connect and run queries against clusters
>>>>>> A and B (in fact, that applies to any positively authenticated user, not
>>>>>> only X and Y). Whereas we obviously would like to ensure exclusive usage of
>>>>>> clusters by their owners - who are owners of respective YARN resources. In
>>>>>> case users X and Y are non-privileged DFS users and impersonation is not
>>>>>> enabled, then user A has access to data on behalf of user B and vice versa
>>>>>> which is additional potential security issue.
>>>>>>
>>>>>> I was looking for possibilities to control connect authorization, but
>>>>>> couldn't find anything related yet. Do I miss something maybe? Are there
>>>>>> any other considerations, perhaps this point was already discussed before?
>>>>>>
>>>>>> It could be possible to tweak PAM setup to trigger authentication failure
>>>>>> for "undesired" users but that looks like an overkill in terms of
>>>>>> complexity.
>>>>>>
>>>>>> From user perspective, basic ACL configuration with users and groups
>>>>>> authorized to connect to Drillbit would already be sufficient IMO. Or
>>>>>> configuration switch to ensure that only owner user is authorized to
>>>>>> connect.
>>>>>>
>>>>>> Best Regards,
>>>>>> Alex
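
To make the proposal in the original message concrete, here is a minimal sketch of a connect-time check: a list of users and groups allowed to connect, plus a switch that restricts access to the cluster owner. It assumes the user has already been authenticated and that group membership is resolved elsewhere. The names `ConnectAcl`, `is_authorized`, `allowed_users`, `allowed_groups` and `owner_only` are hypothetical and do not correspond to any existing Drill option or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectAcl:
    """Hypothetical connect-time ACL; an illustration, not a Drill API."""

    owner: str                                   # user who launched the cluster
    allowed_users: frozenset = field(default_factory=frozenset)
    allowed_groups: frozenset = field(default_factory=frozenset)
    owner_only: bool = False                     # the "only owner" switch

    def is_authorized(self, user, groups):
        """Return True if an already-authenticated user may open a session."""
        if user == self.owner:
            return True
        if self.owner_only:
            return False
        if user in self.allowed_users:
            return True
        # Any overlap between the user's groups and the allowed groups is enough.
        return not self.allowed_groups.isdisjoint(groups)


# Cluster A launched by user X, open to members of an assumed group "tenant_a".
acl_a = ConnectAcl(owner="X", allowed_groups=frozenset({"tenant_a"}))
```

With this setup, `acl_a.is_authorized("Y", ["tenant_b"])` is false even though Y authenticates successfully, which is the separation between authentication and session authorization that the thread asks for.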
