[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Andrew Wong (Code Review)
Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 
0.015s sys 0.001s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s  
user 0.040s sys 0.013s

This above worst case is actually less performant than without this
patch. A follow-up patch will make this more performant by caching
privileges from requests at the database scope.

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s  user 
0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s  
 user 0.031s sys 0.013s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Reviewed-on: http://gerrit.cloudera.org:8080/13549
Reviewed-by: Adar Dembo 
Reviewed-by: Hao Hao 
Tested-by: Andrew Wong 
---
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
2 files changed, 123 insertions(+), 2 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, but someone else must approve
  Hao Hao: Looks good to me, approved
  Andrew Wong: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Andrew Wong (Code Review)
Andrew Wong has removed a vote on this change.

Change subject: sentry: avoid authorizing every table in ListTables
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 8: Verified+1

DEBUG: unrelated MultiThreadedTabletTest/0.DeleteAndReinsert


--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 20:05:05 +
Gerrit-HasComments: No


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:33:56 +
Gerrit-HasComments: No


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 8: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 18:16:20 +
Gerrit-HasComments: No


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196
PS6, Line 196: *checked_table_names = false;
> multimap::equal_range() gives you that grouped lookup functionality.
Ah neat, thanks for the pointer.

Really the functionality I want is grouped iteration, which IIUC isn't as 
straight-forward with multimaps. At least, I started refactoring to use 
multimap, but found it easier to reason about a map of containers.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 17:51:21 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196
PS6, Line 196: *checked_table_names = false;
> Not sure about that; while multimaps allow for multiple values for a given
multimap::equal_range() gives you that grouped lookup functionality.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 16:22:03 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-09 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196
PS6, Line 196: *checked_table_names = false;
> Might find an unordered_multimap easier to work with.
Not sure about that; while multimaps allow for multiple values for a given key, 
the point of using a map> is to have a grouped vector 
so they can be considered authorized together, not just multiple associations 
with a given db.


http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@201
PS6, Line 201:   for (auto table_name : *table_names) {
 : Slice db_slice;
 : S
> This is interesting: you sure we don't want to pretend a table with a non-c
If it's non-conformant, we can't authorize it via Sentry. In other catalog 
manager methods, this is just treated as an authorization error.

Though thanks for pointing this out; as implemented, this wouldn't uphold 
existing behavior for trusted users (which should see everything). Updated and 
added a test.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 05:48:11 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-09 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13549

to look at the new patch set (#7).

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 
0.015s sys 0.001s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s  
user 0.040s sys 0.013s

This above worst case is actually less performant than without this
patch. A follow-up patch will make this more performant by caching
privileges from requests at the database scope.

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s  user 
0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s  
 user 0.031s sys 0.013s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
2 files changed, 123 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/7
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196
PS6, Line 196:   unordered_map> tables_by_db;
Might find an unordered_multimap easier to work with.


http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@201
PS6, Line 201: if (!s.ok()) {
 :   continue;
 : }
This is interesting: you sure we don't want to pretend a table with a 
non-conformant name has been authorized? How does such a table behave during 
other catalog manager operations?



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 10 Jun 2019 04:24:10 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-08 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13549

to look at the new patch set (#6).

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 
0.015s sys 0.001s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s  
user 0.040s sys 0.013s

This above worst case is actually less performant than without this
patch. A follow-up patch will make this more performant by caching
privileges from requests at the database scope.

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s  user 
0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s  
 user 0.031s sys 0.013s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
2 files changed, 79 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/6
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-08 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h
File src/kudu/master/authz_provider.h:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h@28
PS3, Line 28: #include "kudu/master/catalog_manager.h"
> Why do we need this? Can't we forward declare TableInfo?
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/default_authz_provider.h
File src/kudu/master/default_authz_provider.h:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/default_authz_provider.h@32
PS3, Line 32: class scoped_refptr;
> warning: invalid case style for class 'scoped_refptr' [readability-identifi
Ack


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc
File src/kudu/master/sentry_authz_provider-test.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@400
PS3, Line 400: .
> Nit: extra period here.
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@401
PS3, Line 401: const string dummy_name = Substitute("$0_$1", "foo", d);
 : if (FLAGS_has_db_privileges) {
 :   
ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(db_name, "METADATA")));
 : } else {
 :   
ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(dummy_name, 
"METADATA")));
 : }
> Rewrite:
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@409
PS3, Line 409: ,d
> Nit: , d
Refactored away


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@204
PS3, Line 204:   map>> tables_by_db;
> Could be unordered_map, unless you want alphanumeric ordering for the METAD
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210
PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ);
> I am onboard with this, though I hope we can continue to separate catalog m
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@233
PS3, Line 233: const string& first_table_name_in_db = tables_in_db[0].first;
> Does this need to be deterministic? Like, should we sort tables_in_db or so
No, added a comment.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@234
PS3, Line 234: if (user) {
> Is this check correct? Previously we would include all tables in the respon
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@235
PS3, Line 235:   scoped_refptr table_info;
> Unused.
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@239
PS3, Line 239: first_table_name_in_db
> I think I know why you're providing the table name and not database name he
Done


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@253
PS3, Line 253:   ListTablesResponsePB::TableInfo* table = 
pb->add_tables();
 :   table->set_name(name_and_id.first);
 :   table->set_id(name_and_id.second);
> May be able to save a few LOC with a lambda.
Refactored away



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 08 Jun 2019 08:30:24 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-08 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13549

to look at the new patch set (#4).

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 
0.015s sys 0.001s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s  
user 0.040s sys 0.013s

This above worst case is actually less performant than without this
patch. A follow-up patch will make this more performant by caching
privileges from requests at the database scope.

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s  user 
0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s  
 user 0.031s sys 0.013s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
2 files changed, 74 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/4
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210
PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ);
> I discussed this a bit with Hao and she agrees that authorization wrt the l
I am onboard with this, though I hope we can continue to separate catalog 
manager concerns from the authz provider by performing steps 1 and 3 in the 
catalog manager and step 2 in the provider.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 07 Jun 2019 20:58:20 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210
PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ);
> Another thing to consider: there's an invariant that we follow in the catal
I discussed this a bit with Hao and she agrees that authorization wrt the lock 
is important. We may be able to get around it by doing something along the 
lines of:
1. under lock, put together a map from {table_name => TableInfo} and 
{table_name => table ID}
2. authorize in bulk, keeping track of the authorized table names
3. iterate through the authorized table names and check, under lock, whether 
the table ID found in Step 1 for matches the current TableInfo ID. If so, there 
has been no rename and the authorization policy is upheld. If not, we can make 
no such guarantee and should not list the table.

This has the user-facing behavior of not showing up tables that are renamed 
concurrently with a ListTables call. While not ideal UX, this is conservative 
wrt security.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 07 Jun 2019 19:59:59 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210
PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ);
> It feels really sketchy to see this abstraction leak out of CatalogManager.
Another thing to consider: there's an invariant that we follow in the catalog 
manager where we only authorize tables while they're locked. That's a brutal 
proposition for any sort of grouping in ListTables though. Even if we get the 
plumbing right, it'll just be bad for concurrency to lock a bunch of tables 
while authorizing their database.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 07 Jun 2019 06:11:43 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h
File src/kudu/master/authz_provider.h:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h@28
PS3, Line 28: #include "kudu/master/catalog_manager.h"
Why do we need this? Can't we forward declare TableInfo?


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc
File src/kudu/master/sentry_authz_provider-test.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@400
PS3, Line 400: .
Nit: extra period here.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@401
PS3, Line 401: const string dummy_name = Substitute("$0_$1", "foo", d);
 : if (FLAGS_has_db_privileges) {
 :   
ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(db_name, "METADATA")));
 : } else {
 :   
ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(dummy_name, 
"METADATA")));
 : }
Rewrite:

  
ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(FLAGS_has_db_privileges 
? db_name : Substitute("foo_$0", d), "METADATA")));


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@409
PS3, Line 409: ,d
Nit: , d


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@204
PS3, Line 204:   map>> tables_by_db;
Could be unordered_map, unless you want alphanumeric ordering for the METADATA 
ON DATABASE authz checks below.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210
PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ);
It feels really sketchy to see this abstraction leak out of CatalogManager. 
Could we just pass a vector of pair into this function 
(after having locked and filtered the tables one by one), then parse and group 
by database here? Why do we need to have a table locked during the parse?

Alternatively, we can pass a pre-filled ListTablesResponsePB into this function 
and make the contract something like "the authz provider gets to filter out 
entries". Then there's less additional marshalling/unmarshalling of data, and 
you don't need the duplicated code in default_authz_provider.cc.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@233
PS3, Line 233: const string& first_table_name_in_db = tables_in_db[0].first;
Does this need to be deterministic? Like, should we sort tables_in_db or 
something?


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@234
PS3, Line 234: if (user) {
Is this check correct? Previously we would include all tables in the response 
if user was unset.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@235
PS3, Line 235:   scoped_refptr table_info;
Unused.


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@239
PS3, Line 239: first_table_name_in_db
I think I know why you're providing the table name and not database name here 
(to avoid getting privileges for non-Kudu tables sharing the database), but 
will it be obvious to anyone reading this?


http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@253
PS3, Line 253:   ListTablesResponsePB::TableInfo* table = 
pb->add_tables();
 :   table->set_name(name_and_id.first);
 :   table->set_id(name_and_id.second);
May be able to save a few LOC with a lambda.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 07 Jun 2019 06:05:28 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG@32
PS1, Line 32: patch. A follow-up patch will make this more performant by caching
: privileges from requests at the database scope.
:
: Iterating through one database and no tables since the user has
> I'm still digging into why this is so close to the third case, since I foun
I was expecting this to be lower than the third case since it's only doing a 
lookup on the database. I hadn't taken into account that when running the test 
and granting database-level privileges, there'd be more stuff in Sentry, so 
authorization is slightly slower. I've updated this in the new revision.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 07 Jun 2019 00:36:30 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13549

to look at the new patch set (#3).

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 21.184s  
user 0.023s sys 0.003s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 44.448s  
user 0.037s sys 0.018s

This above worst case is actually less performant than without this
patch. A follow-up patch will make this more performant by caching
privileges from requests at the database scope.

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 0.313s   
user 0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 43.180s  
user 0.031s sys 0.011s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/CMakeLists.txt
M src/kudu/master/authz_provider.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/default_authz_provider.cc
M src/kudu/master/default_authz_provider.h
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
M src/kudu/master/sentry_authz_provider.h
9 files changed, 237 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/3
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Andrew Wong (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13549

to look at the new patch set (#2).

Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

Iterating through many databases:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=true
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 21.184s  
user 0.023s sys 0.003s

Iterating through many databases and many tables. This is the worst case
since listing a given table will require two lookups in Sentry -- one
for the database and one for the table:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=300 --num_tables_per_db=1 --has-db-privileges=false
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 44.448s  
user 0.037s sys 0.018s

Iterating through one database and no tables since the user has
database-level privileges:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=true
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 0.313s   
user 0.000s sys 0.000s

Iterating through one database and many tables:
 $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* 
--num_databases=1 --num_tables_per_db=600 --has-db-privileges=false
 sentry_authz_provider-test.cc:420] Time spent Listing tables: real 43.180s  
user 0.031s sys 0.011s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/CMakeLists.txt
M src/kudu/master/authz_provider.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/default_authz_provider.cc
M src/kudu/master/default_authz_provider.h
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
M src/kudu/master/sentry_authz_provider.h
9 files changed, 218 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/2
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13549 )

Change subject: sentry: avoid authorizing every table in ListTables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG@32
PS1, Line 32:
: With 300 database and 1 table in each database, with privileges 
on the
: databases (warrants lookups on every database):
: sentry_authz_provider-test.cc:414] Time spent Listing tables: 
real 22.363s  user 0.021s sys 0.004s
I'm still digging into why this is so close to the third case, since I found 
this surprising.



--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 06 Jun 2019 22:45:26 +
Gerrit-HasComments: Yes


[kudu-CR] sentry: avoid authorizing every table in ListTables

2019-06-06 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13549


Change subject: sentry: avoid authorizing every table in ListTables
..

sentry: avoid authorizing every table in ListTables

Currently ListTables will call into Sentry for every table in Kudu's
catalog, checking whether the user has metadata privileges on the table,
and adding it to the ListTablesResponsePB if so. This is expensive,
particularly when there are thousands of tables in Kudu.

This patch updates the authorization behavior to check whether the user
has METADATA privileges on the table's database for each table. If no
such privilege exists for the database, each table within the database
is checked.

A benchmark is added to gauge the performance in various scenarios:

With 1 database and 300 tables in the database, with privileges on the
database (warrants only a lookup on the database):
sentry_authz_provider-test.cc:414] Time spent Listing tables: real 0.253s   
user 0.000s sys 0.000s

With 1 database and 300 tables in the database, without privileges on
the database (warrants lookups on the database and each table):
sentry_authz_provider-test.cc:414] Time spent Listing tables: real 11.707s  
user 0.016s sys 0.005s

With 300 database and 1 table in each database, without privileges on
the databases (warrants lookups on every database and every table):
sentry_authz_provider-test.cc:414] Time spent Listing tables: real 22.982s  
user 0.043s sys 0.022s

With 300 database and 1 table in each database, with privileges on the
databases (warrants lookups on every database):
sentry_authz_provider-test.cc:414] Time spent Listing tables: real 22.363s  
user 0.021s sys 0.004s

Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
---
M src/kudu/master/CMakeLists.txt
M src/kudu/master/authz_provider.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/default_authz_provider.cc
M src/kudu/master/default_authz_provider.h
M src/kudu/master/sentry_authz_provider-test.cc
M src/kudu/master/sentry_authz_provider.cc
M src/kudu/master/sentry_authz_provider.h
9 files changed, 212 insertions(+), 32 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/1
--
To view, visit http://gerrit.cloudera.org:8080/13549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527
Gerrit-Change-Number: 13549
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong