Colm,

Glad I could help. Do you know which configuration caused Hive not to
collect the accessed columns? Is it because
SessionState.get().isAuthorizationModeV2() returns false?
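
To make the question concrete, here is a minimal sketch of the two checks in the SemanticAnalyzer snippet quoted below, reduced to plain booleans. The class and method names are hypothetical and nothing here calls Hive. It assumes, as a guess rather than a confirmed fact, that isAuthorizationModeV2() returns false with the "v1" binding.

```java
// Hypothetical, self-contained model of the two "if" checks in the quoted
// SemanticAnalyzer snippet. It does not use any Hive class or API.
final class ColumnInfoGateSketch {

    // Mirrors the first check: column access is analyzed when V2
    // authorization is active and enabled, or when
    // HIVE_STATS_COLLECT_SCANCOLS is true.
    static boolean analyzesColumnAccess(boolean authV2, boolean authEnabled,
                                        boolean collectScanCols) {
        boolean isColumnInfoNeedForAuth = authV2 && authEnabled;
        return isColumnInfoNeedForAuth || collectScanCols;
    }

    // Mirrors the second check: accessed columns are put on the read
    // entities only when HIVE_STATS_COLLECT_SCANCOLS is true.
    static boolean putsAccessedColumnsOnInputs(boolean collectScanCols) {
        return collectScanCols;
    }

    public static void main(String[] args) {
        // Assumed "v1" case: authV2 == false, authorization enabled.
        System.out.println("without scancols: analyzed="
            + analyzesColumnAccess(false, true, false)
            + ", on inputs=" + putsAccessedColumnsOnInputs(false));
        System.out.println("with scancols:    analyzed="
            + analyzesColumnAccess(false, true, true)
            + ", on inputs=" + putsAccessedColumnsOnInputs(true));
    }
}
```

If that assumption holds, the only way for the columns to reach the inputs that Sentry reads is the HIVE_STATS_COLLECT_SCANCOLS option, which would match what fixed your test.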

Thanks,

Lina

On Fri, Jan 5, 2018 at 6:12 AM, Colm O hEigeartaigh <cohei...@apache.org>
wrote:

> Hi Lina,
>
> Thanks a lot for your help on this! I was able to get the test to work by
> adding the following config option:
>
> conf.set(HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS.varname, "true");
>
> Colm.
>
> On Thu, Jan 4, 2018 at 10:06 PM, Na Li <lina...@cloudera.com> wrote:
>
> > Colm,
> >
> > The following code shows where Hive sets the column info. You can debug
> > into hive code and see why AccessedColumns is not set.
> >
> > The related code is in org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> >
> >         boolean isColumnInfoNeedForAuth =
> >             SessionState.get().isAuthorizationModeV2()
> >             && HiveConf.getBoolVar(this.conf, ConfVars.HIVE_AUTHORIZATION_ENABLED);
> >         if (isColumnInfoNeedForAuth
> >             || HiveConf.getBoolVar(this.conf, ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
> >           ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
> >           this.setColumnAccessInfo(
> >               columnAccessAnalyzer.analyzeColumnAccess(this.getColumnAccessInfo()));
> >         }
> >
> >         this.LOG.info("Completed plan generation");
> >         if (HiveConf.getBoolVar(this.conf, ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
> >           this.putAccessedColumnsToReadEntity(this.inputs, this.columnAccessInfo);
> >         }
> >
> >
> > On Wed, Jan 3, 2018 at 11:28 PM, Na Li <lina...@cloudera.com> wrote:
> >
> >> Colm,
> >>
> >> I tried to reproduce your issue using sentry 2.0 (master branch) with
> >> Hive 2.3.2.
> >>
> >> The test code is
> >>
> >>   @Test
> >>   public void testPositiveOnAll() throws Exception {
> >>     Connection connection = context.createConnection(ADMIN1);
> >>     Statement statement = context.createStatement(connection);
> >>     statement.execute("CREATE database " + DB1);
> >>     statement.execute("use " + DB1);
> >>     statement.execute("CREATE TABLE t1 (c1 string, c2 string)");
> >>     statement.execute("CREATE ROLE user_role1");
> >>     statement.execute("GRANT SELECT ON TABLE t1 TO ROLE user_role1");
> >>     statement.execute("GRANT ROLE user_role1 TO GROUP " + USERGROUP1);
> >>     statement.close();
> >>     connection.close();
> >>
> >>     connection = context.createConnection(USER1_1);
> >>     statement = context.createStatement(connection);
> >>     statement.execute("use " + DB1);
> >>     statement.execute("SELECT * FROM t1");
> >>
> >>     statement.close();
> >>     connection.close();
> >>   }
> >>
> >>
> >> required privileges:
> >>
> >>    - Server=server1->Db=db_1->Table=t1->*Column=c1*->action=select
> >>    - Server=server1->Db=db_1->Table=t1->*Column=c2*->action=select
> >>
> >>
> >> cached privilege:
> >>
> >>    - server=server1->db=db_1->table=t1->action=select
> >>
> >> So the authorization works.
> >>
> >> Note
> >>
> >>    - For me, "SELECT * FROM t1" causes the required privileges to
> >>    contain each column explicitly. For you, however, the "privilege"
> >>    to check looks like:
> >>    Server=server1->Db=authz->Table=words->action=select; the columns
> >>    are not explicitly listed. Hive controls whether the columns are
> >>    included in the required privilege. At
> >>    org.apache.sentry.binding.hive.authz.HiveAuthzBindingHookBase.authorizeWithHiveBindings ->
> >>    getInputHierarchyFromInputs -> addColumnHierarchy, Sentry uses
> >>    accessedColumns from the Hive input to add a colHierarchy for each
> >>    column. You can check whether accessedColumns is empty or null for
> >>    the Hive version you are using.
> >>    - For me, the cached privilege does not include a column part. For
> >>    you, the cached privilege is
> >>    "Server=server1->Db=authz->Table=words->Column=*->action=select".
> >>    *Can you share your test code*, so I can see how you grant the
> >>    privilege and why the cached privilege therefore contains a column?
> >>       - I tried to use "GRANT SELECT(*) ON TABLE t1 TO ROLE
> >>       user_role1", and got the following error:
> >>       - 2018-01-03 23:23:50,459 (HiveServer2-Handler-Pool: Thread-212)
> >>       [WARN - org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:539)]
> >>       Error executing statement:
> >>       - org.apache.hive.service.cli.HiveSQLException: Error while
> >>       compiling statement: FAILED: ParseException line 1:6 cannot
> >>       recognize input near 'GRANT' 'SELECT' '(' in ddl statement
> >>       - at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
> >>       - at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
> >>       - at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290)
> >>       - at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
> >>       - at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
> >>
> >> Thanks,
> >>
> >> Lina
> >>
> >> On Mon, Dec 18, 2017 at 10:14 AM, Colm O hEigeartaigh <
> >> cohei...@apache.org> wrote:
> >>
> >>> Thanks Kalyan! I was thinking that if the cached privilege part does
> >>> not appear in the requested "part", and if it is "all", then we should
> >>> skip that part and continue on to the next one. But maybe there is a
> >>> better solution.
> >>>
> >>> Colm.
> >>>
> >>> On Mon, Dec 18, 2017 at 4:06 PM, Kalyan Kumar Kalvagadda <
> >>> kkal...@cloudera.com> wrote:
> >>>
> >>> > Colm,
> >>> >
> >>> > I will look closer into this today and see if I can help you out.
> >>> >
> >>> > -Kalyan
> >>> >
> >>> > On Mon, Dec 18, 2017 at 4:52 AM, Colm O hEigeartaigh <cohei...@apache.org> wrote:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> I've done some further analysis of the problem, and I think it is
> >>> >> not directly related to SENTRY-1291. The problem manifests in
> >>> >> CommonPrivilege.implies(privilege, model). My (cached) privilege
> >>> >> looks like:
> >>> >>
> >>> >> Server=server1->Db=authz->Table=words->Column=*->action=select
> >>> >>
> >>> >> The "privilege" I want to check looks like:
> >>> >>
> >>> >> Server=server1->Db=authz->Table=words->action=select;
> >>> >>
> >>> >> The problem is in the "for" loop in CommonPrivilege.implies. It
> >>> >> loops on the parts of the second privilege, and matches up to
> >>> >> "action=select". Here it tries to compare to "Column=*" of the
> >>> >> cached privilege and fails on this line:
> >>> >>
> >>> >> https://github.com/apache/sentry/blob/a4924edc79b26f937e3e5ea3584f0b4307dd4135/sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/CommonPrivilege.java#L86
> >>> >>
> >>> >> It's clear there's a bug here somewhere, but I'm not sure where.
> >>> >> Can someone please advise?
> >>> >>
> >>> >> Thanks,
> >>> >>
> >>> >> Colm.
> >>> >>
> >>> >> On Wed, Dec 13, 2017 at 8:28 PM, Na Li <lina...@cloudera.com> wrote:
> >>> >>
> >>> >> > Sasha,
> >>> >> >
> >>> >> > SENTRY-1291 helps with the problem that Sentry privilege checks
> >>> >> > take too long when there are many explicit grants, which is useful
> >>> >> > for big customers. Another approach that can improve performance is
> >>> >> > to organize the privileges according to the authorization hierarchy
> >>> >> > in a tree structure, so that finding a match in
> >>> >> > ResourceAuthorizationProvider.doHasAccess() is on the order of
> >>> >> > log(N) rather than linear in N, where N is the number of privileges.
> >>> >> >
> >>> >> > We can wait for Colm to confirm whether his issue is caused by
> >>> >> > SENTRY-1291. If so, it may be fixed by selecting privileges by
> >>> >> > checking whether the requesting authorization object is a prefix of
> >>> >> > the cached privileges, instead of requiring an exact match.
> >>> >> >
> >>> >> > in SimplePrivilegeCache
> >>> >> >
> >>> >> > public Set<String> listPrivileges(Set<String> groups, Set<String> users,
> >>> >> >       ActiveRoleSet roleSet, Authorizable... authorizationHierarchy) {
> >>> >> >     Set<String> privileges = new HashSet<>();
> >>> >> >     Set<StringBuilder> authzKeys = getAuthzKeys(authorizationHierarchy);
> >>> >> >     for (StringBuilder authzKey : authzKeys) {
> >>> >> >       // Instead of exact matching, add an extension function to check if
> >>> >> >       // authzKey.toString() is a prefix of the key of the entries in
> >>> >> >       // cachedAuthzPrivileges.
> >>> >> >       if (cachedAuthzPrivileges.get(authzKey.toString()) != null) {
> >>> >> >         privileges.addAll(cachedAuthzPrivileges.get(authzKey.toString()));
> >>> >> >       }
> >>> >> >     }
> >>> >> >
> >>> >> >     return privileges;
> >>> >> >   }
> >>> >> >
> >>> >> > Thanks,
> >>> >> >
> >>> >> > Lina
> >>> >> >
> >>> >> > On Wed, Dec 13, 2017 at 1:08 PM, Alexander Kolbasov <ak...@cloudera.com> wrote:
> >>> >> >
> >>> >> > > I think that SENTRY-1291 should just be reverted: there are
> >>> >> > > multiple issues with it and no one is actually using the fix.
> >>> >> > > Does anyone want to do it?
> >>> >> > >
> >>> >> > > - Alex
> >>> >> > >
> >>> >> > > On Wed, Dec 13, 2017 at 4:44 AM, Na Li <lina...@cloudera.com> wrote:
> >>> >> > >
> >>> >> > > > Colm,
> >>> >> > > >
> >>> >> > > > Glad you found the cause!
> >>> >> > > >
> >>> >> > > > You can revert SENTRY-1291 and see if it works. If so, the issue
> >>> >> > > > is in finding the cached privileges.
> >>> >> > > >
> >>> >> > > > Cheers,
> >>> >> > > >
> >>> >> > > > Lina
> >>> >> > > >
> >>> >> > > > Sent from my iPhone
> >>> >> > > >
> >>> >> > > > > On Dec 13, 2017, at 4:58 AM, Colm O hEigeartaigh <cohei...@apache.org> wrote:
> >>> >> > > > >
> >>> >> > > > > Hi,
> >>> >> > > > >
> >>> >> > > > > I can see what the problem is (that the authorization hierarchy
> >>> >> > > > > does not contain the column, and hence doesn't match against the
> >>> >> > > > > cached privilege), but I'm not sure about the best way to solve
> >>> >> > > > > it. Either the way we are creating the authorization hierarchy is
> >>> >> > > > > incorrect (e.g. in HiveAuthzBindingHookBase) or else the way we
> >>> >> > > > > are parsing the cached privilege is incorrect (e.g. in
> >>> >> > > > > SimplePrivilegeCache/CommonPrivilege).
> >>> >> > > > >
> >>> >> > > > > Colm.
> >>> >> > > > >
> >>> >> > > > >> On Wed, Dec 13, 2017 at 5:57 AM, Na Li <lina...@cloudera.com> wrote:
> >>> >> > > > >>
> >>> >> > > > >> Colm,
> >>> >> > > > >>
> >>> >> > > > >> I did not get a chance to look into this issue today. Sorry
> >>> >> > > > >> about that.
> >>> >> > > > >>
> >>> >> > > > >> You can add an e2e test case and set a breakpoint where the
> >>> >> > > > >> authorization object hierarchy is converted to a list of
> >>> >> > > > >> authorization objects, which is used to do an exact match
> >>> >> > > > >> against the cache.
> >>> >> > > > >>
> >>> >> > > > >> Sent from my iPhone
> >>> >> > > > >>
> >>> >> > > > >>> On Dec 12, 2017, at 11:27 AM, Colm O hEigeartaigh <cohei...@apache.org> wrote:
> >>> >> > > > >>>
> >>> >> > > > >>> That would be great, thanks!
> >>> >> > > > >>>
> >>> >> > > > >>> Colm.
> >>> >> > > > >>>
> >>> >> > > > >>>> On Tue, Dec 12, 2017 at 4:36 PM, Na Li <lina...@cloudera.com> wrote:
> >>> >> > > > >>>>
> >>> >> > > > >>>> Colm,
> >>> >> > > > >>>>
> >>> >> > > > >>>> I suspect it is a bug in SENTRY-1291. I can take a look later
> >>> >> > > > >>>> today.
> >>> >> > > > >>>>
> >>> >> > > > >>>> Thanks,
> >>> >> > > > >>>>
> >>> >> > > > >>>> Lina
> >>> >> > > > >>>>
> >>> >> > > > >>>> On Tue, Dec 12, 2017 at 4:32 AM, Colm O hEigeartaigh <cohei...@apache.org> wrote:
> >>> >> > > > >>>>
> >>> >> > > > >>>>> Hi all,
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> I've updated some local testcases to work with Sentry 2.0.0 and
> >>> >> > > > >>>>> the "v1" Hive binding (previously working fine using 1.8.0 and
> >>> >> > > > >>>>> the "v2" binding).
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> I have a simple table called "words" (word STRING, count INT). I
> >>> >> > > > >>>>> am making an SQL call as the user "bob", e.g. "SELECT * FROM
> >>> >> > > > >>>>> words where count == '100'".
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> "bob" is in the "manager" group, which has the following role:
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> select_all_role =
> >>> >> > > > >>>>> Server=server1->Db=authz->Table=words->Column=*->action=select
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> Essentially, authorization is denied even though the policy is
> >>> >> > > > >>>>> correct. If I look at the SimplePrivilegeCache, the cached
> >>> >> > > > >>>>> privilege is:
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> server=server1->db=authz->table=words->column=*=[Server=server1->Db=authz->Table=words->Column=*->action=select]
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> However, when "listPrivileges" is called, the authorizable
> >>> >> > > > >>>>> hierarchy looks like:
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> Server [name=server1]
> >>> >> > > > >>>>> Database [name=authz]
> >>> >> > > > >>>>> Table [name=words]
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> There is no "column" here, and a match is not made against the
> >>> >> > > > >>>>> cached privilege as a result. Is this a bug or am I missing some
> >>> >> > > > >>>>> configuration switch?
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> Colm.
> >>> >> > > > >>>>>
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> --
> >>> >> > > > >>>>> Colm O hEigeartaigh
> >>> >> > > > >>>>>
> >>> >> > > > >>>>> Talend Community Coder
> >>> >> > > > >>>>> http://coders.talend.com
> >>> >> > > > >>>>>

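For reference, the prefix-matching idea suggested earlier in this thread for SimplePrivilegeCache.listPrivileges could be sketched roughly as below. This is only an illustration under stated assumptions: the class, method, and map layout are hypothetical stand-ins rather than Sentry code, and the keys are assumed to be "->"-joined strings like the cached key shown in the original report.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the cache lookup; not the Sentry implementation.
final class PrefixPrivilegeLookupSketch {

    // Collects privileges from every cache entry whose key equals the
    // requested key or extends it by at least one more "->" component.
    static Set<String> listPrivileges(Map<String, Set<String>> cache,
                                      String authzKey) {
        Set<String> privileges = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : cache.entrySet()) {
            String key = entry.getKey();
            if (key.equals(authzKey) || key.startsWith(authzKey + "->")) {
                privileges.addAll(entry.getValue());
            }
        }
        return privileges;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cache = new HashMap<>();
        Set<String> cached = new HashSet<>();
        cached.add("Server=server1->Db=authz->Table=words->Column=*->action=select");
        cache.put("server=server1->db=authz->table=words->column=*", cached);

        // The request stops at table level, as in the reported hierarchy.
        System.out.println(
            listPrivileges(cache, "server=server1->db=authz->table=words"));
    }
}
```

Note that this scan is linear in the number of cache entries, so it gives up the lookup speed that the exact-match cache was meant to provide; whether prefix matching is the right semantics at all was not settled in the thread.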