[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 0.015s sys 0.001s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s user 0.040s sys 0.013s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s user 0.031s sys 0.013s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Reviewed-on: http://gerrit.cloudera.org:8080/13549 Reviewed-by: Adar Dembo Reviewed-by: Hao Hao Tested-by: Andrew Wong --- M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc 2 files changed, 123 insertions(+), 2 deletions(-) Approvals: Adar Dembo: Looks good to me, but someone else must approve Hao Hao: Looks good to me, approved Andrew Wong: Verified -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 9 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has removed a vote on this change. Change subject: sentry: avoid authorizing every table in ListTables .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 8: Verified+1 DEBUG: unrelated MultiThreadedTabletTest/0.DeleteAndReinsert -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 20:05:05 + Gerrit-HasComments: No
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 19:33:56 + Gerrit-HasComments: No
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 18:16:20 + Gerrit-HasComments: No
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196 PS6, Line 196: *checked_table_names = false; > multimap::equal_range() gives you that grouped lookup functionality. Ah neat, thanks for the pointer. Really the functionality I want is grouped iteration, which IIUC isn't as straight-forward with multimaps. At least, I started refactoring to use multimap, but found it easier to reason about a map of containers. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 17:51:21 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196 PS6, Line 196: *checked_table_names = false; > Not sure about that; while multimaps allow for multiple values for a given multimap::equal_range() gives you that grouped lookup functionality. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 16:22:03 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196 PS6, Line 196: *checked_table_names = false; > Might find an unordered_multimap easier to work with. Not sure about that; while multimaps allow for multiple values for a given key, the point of using a map> is to have a grouped vector so they can be considered authorized together, not just multiple associations with a given db. http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@201 PS6, Line 201: for (auto table_name : *table_names) { : Slice db_slice; : S > This is interesting: you sure we don't want to pretend a table with a non-c If it's non-conformant, we can't authorize it via Sentry. In other catalog manager methods, this is just treated as an authorization error. Though thanks for pointing this out; as implemented, this wouldn't uphold existing behavior for trusted users (which should see everything). Updated and added a test. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 05:48:11 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#7). Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 0.015s sys 0.001s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s user 0.040s sys 0.013s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s user 0.031s sys 0.013s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc 2 files changed, 123 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/7 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@196 PS6, Line 196: unordered_map> tables_by_db; Might find an unordered_multimap easier to work with. http://gerrit.cloudera.org:8080/#/c/13549/6/src/kudu/master/sentry_authz_provider.cc@201 PS6, Line 201: if (!s.ok()) { : continue; : } This is interesting: you sure we don't want to pretend a table with a non-conformant name has been authorized? How does such a table behave during other catalog manager operations? -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 10 Jun 2019 04:24:10 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#6). Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 0.015s sys 0.001s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s user 0.040s sys 0.013s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s user 0.031s sys 0.013s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc 2 files changed, 79 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/6 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (12 comments) http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h File src/kudu/master/authz_provider.h: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h@28 PS3, Line 28: #include "kudu/master/catalog_manager.h" > Why do we need this? Can't we forward declare TableInfo? Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/default_authz_provider.h File src/kudu/master/default_authz_provider.h: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/default_authz_provider.h@32 PS3, Line 32: class scoped_refptr; > warning: invalid case style for class 'scoped_refptr' [readability-identifi Ack http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc File src/kudu/master/sentry_authz_provider-test.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@400 PS3, Line 400: . > Nit: extra period here. Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@401 PS3, Line 401: const string dummy_name = Substitute("$0_$1", "foo", d); : if (FLAGS_has_db_privileges) { : ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(db_name, "METADATA"))); : } else { : ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(dummy_name, "METADATA"))); : } > Rewrite: Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@409 PS3, Line 409: ,d > Nit: , d Refactored away http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@204 PS3, Line 204: map>> tables_by_db; > Could be unordered_map, unless you want alphanumeric ordering for the METAD Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210 PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ); > I am onboard with this, though I hope we can continue to separate catalog m Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@233 PS3, Line 233: const string& first_table_name_in_db = tables_in_db[0].first; > Does this need to be deterministic? Like, should we sort tables_in_db or so No, added a comment. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@234 PS3, Line 234: if (user) { > Is this check correct? Previously we would include all tables in the respon Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@235 PS3, Line 235: scoped_refptr table_info; > Unused. Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@239 PS3, Line 239: first_table_name_in_db > I think I know why you're providing the table name and not database name he Done http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@253 PS3, Line 253: ListTablesResponsePB::TableInfo* table = pb->add_tables(); : table->set_name(name_and_id.first); : table->set_id(name_and_id.second); > May be able to save a few LOC with a lambda. Refactored away -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 08 Jun 2019 08:30:24 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#4). Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 0.015s sys 0.001s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s user 0.040s sys 0.013s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s user 0.031s sys 0.013s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc 2 files changed, 74 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/4 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210 PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ); > I discussed this a bit with Hao and she agrees that authorization wrt the l I am onboard with this, though I hope we can continue to separate catalog manager concerns from the authz provider by performing steps 1 and 3 in the catalog manager and step 2 in the provider. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 07 Jun 2019 20:58:20 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210 PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ); > Another thing to consider: there's an invariant that we follow in the catal I discussed this a bit with Hao and she agrees that authorization wrt the lock is important. We may be able to get around it by doing something along the lines of: 1. under lock, put together a map from {table_name => TableInfo} and {table_name => table ID} 2. authorize in bulk, keeping track of the authorized table names 3. iterate through the authorized table names and check, under lock, whether the table ID found in Step 1 for matches the current TableInfo ID. If so, there has been no rename and the authorization policy is upheld. If not, we can make no such guarantee and should not list the table. This has the user-facing behavior of not showing up tables that are renamed concurrently with a ListTables call. While not ideal UX, this is conservative wrt security. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 07 Jun 2019 19:59:59 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210 PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ); > It feels really sketchy to see this abstraction leak out of CatalogManager. Another thing to consider: there's an invariant that we follow in the catalog manager where we only authorize tables while they're locked. That's a brutal proposition for any sort of grouping in ListTables though. Even if we get the plumbing right, it'll just be bad for concurrency to lock a bunch of tables while authorizing their database. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 07 Jun 2019 06:11:43 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h File src/kudu/master/authz_provider.h: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/authz_provider.h@28 PS3, Line 28: #include "kudu/master/catalog_manager.h" Why do we need this? Can't we forward declare TableInfo? http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc File src/kudu/master/sentry_authz_provider-test.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@400 PS3, Line 400: . Nit: extra period here. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@401 PS3, Line 401: const string dummy_name = Substitute("$0_$1", "foo", d); : if (FLAGS_has_db_privileges) { : ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(db_name, "METADATA"))); : } else { : ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(dummy_name, "METADATA"))); : } Rewrite: ASSERT_OK(AlterRoleGrantPrivilege(GetDatabasePrivilege(FLAGS_has_db_privileges ? db_name : Substitute("foo_$0", d), "METADATA"))); http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider-test.cc@409 PS3, Line 409: ,d Nit: , d http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@204 PS3, Line 204: map>> tables_by_db; Could be unordered_map, unless you want alphanumeric ordering for the METADATA ON DATABASE authz checks below. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@210 PS3, Line 210: TableMetadataLock ltm(table_info.get(), LockMode::READ); It feels really sketchy to see this abstraction leak out of CatalogManager. Could we just pass a vector of pair into this function (after having locked and filtered the tables one by one), then parse and group by database here? Why do we need to have a table locked during the parse? Alternatively, we can pass a pre-filled ListTablesResponsePB into this function and make the contract something like "the authz provider gets to filter out entries". Then there's less additional marshalling/unmarshalling of data, and you don't need the duplicated code in default_authz_provider.cc. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@233 PS3, Line 233: const string& first_table_name_in_db = tables_in_db[0].first; Does this need to be deterministic? Like, should we sort tables_in_db or something? http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@234 PS3, Line 234: if (user) { Is this check correct? Previously we would include all tables in the response if user was unset. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@235 PS3, Line 235: scoped_refptr table_info; Unused. http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@239 PS3, Line 239: first_table_name_in_db I think I know why you're providing the table name and not database name here (to avoid getting privileges for non-Kudu tables sharing the database), but will it be obvious to anyone reading this? http://gerrit.cloudera.org:8080/#/c/13549/3/src/kudu/master/sentry_authz_provider.cc@253 PS3, Line 253: ListTablesResponsePB::TableInfo* table = pb->add_tables(); : table->set_name(name_and_id.first); : table->set_id(name_and_id.second); May be able to save a few LOC with a lambda. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 07 Jun 2019 06:05:28 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG@32 PS1, Line 32: patch. A follow-up patch will make this more performant by caching : privileges from requests at the database scope. : : Iterating through one database and no tables since the user has > I'm still digging into why this is so close to the third case, since I foun I was expecting this to be lower than the third case since it's only doing a lookup on the database. I hadn't taken into account that when running the test and granting database-level privileges, there'd be more stuff in Sentry, so authorization is slightly slower. I've updated this in the new revision. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 07 Jun 2019 00:36:30 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#3). Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 21.184s user 0.023s sys 0.003s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 44.448s user 0.037s sys 0.018s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 0.313s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 43.180s user 0.031s sys 0.011s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/CMakeLists.txt M src/kudu/master/authz_provider.h M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/default_authz_provider.cc M src/kudu/master/default_authz_provider.h M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc M src/kudu/master/sentry_authz_provider.h 9 files changed, 237 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/3 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Hello Alexey Serbin, Kudu Jenkins, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#2). Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 21.184s user 0.023s sys 0.003s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 44.448s user 0.037s sys 0.018s Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 0.313s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 43.180s user 0.031s sys 0.011s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/CMakeLists.txt M src/kudu/master/authz_provider.h M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/default_authz_provider.cc M src/kudu/master/default_authz_provider.h M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc M src/kudu/master/sentry_authz_provider.h 9 files changed, 218 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/2 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13549 ) Change subject: sentry: avoid authorizing every table in ListTables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13549/1//COMMIT_MSG@32 PS1, Line 32: : With 300 database and 1 table in each database, with privileges on the : databases (warrants lookups on every database): : sentry_authz_provider-test.cc:414] Time spent Listing tables: real 22.363s user 0.021s sys 0.004s I'm still digging into why this is so close to the third case, since I found this surprising. -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Thu, 06 Jun 2019 22:45:26 + Gerrit-HasComments: Yes
[kudu-CR] sentry: avoid authorizing every table in ListTables
Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13549 Change subject: sentry: avoid authorizing every table in ListTables .. sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: With 1 database and 300 tables in the database, with privileges on the database (warrants only a lookup on the database): sentry_authz_provider-test.cc:414] Time spent Listing tables: real 0.253s user 0.000s sys 0.000s With 1 database and 300 tables in the database, without privileges on the database (warrants lookups on the database and each table): sentry_authz_provider-test.cc:414] Time spent Listing tables: real 11.707s user 0.016s sys 0.005s With 300 database and 1 table in each database, without privileges on the databases (warrants lookups on every database and every table): sentry_authz_provider-test.cc:414] Time spent Listing tables: real 22.982s user 0.043s sys 0.022s With 300 database and 1 table in each database, with privileges on the databases (warrants lookups on every database): sentry_authz_provider-test.cc:414] Time spent Listing tables: real 22.363s user 0.021s sys 0.004s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/CMakeLists.txt M src/kudu/master/authz_provider.h M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/default_authz_provider.cc M src/kudu/master/default_authz_provider.h M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc M src/kudu/master/sentry_authz_provider.h 9 files changed, 212 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/1 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong